1. Overview
When we run a Java application, the JVM undertakes an intricate series of steps before our code executes. In this tutorial, we’ll look at what happens from the moment we run the java command until our application starts up.
Using a simple HelloWorld program as an example, we’ll break down each stage of the process. Understanding these internal mechanisms can significantly enhance the effectiveness of debugging and performance tuning.
2. From java Command to JVM Launch
Before the JVM can execute any code, it must first launch, validate inputs, and configure its environment. Here, we walk through the early startup sequence, from invoking the java command to initializing the JVM runtime.
2.1. The java Command and Initial Invocation
When we run the java command, the launcher starts the JVM by invoking the JNI function JNI_CreateJavaVM(). This function carries out several essential initialization tasks that prepare the environment for executing Java applications. The Java Native Interface (JNI) acts as a bridge between the JVM and the native system libraries, enabling two-way communication between Java code and platform-specific functionality.
Throughout this article, we'll use trace-level unified logging, written to a file, to observe how the JVM operates internally:
java -Xlog:all=trace:file=helloworld.log HelloWorld
2.2. Validating User Input
First, the JVM validates the arguments we pass:
[0.006s][info][arguments] VM Arguments:
[arguments] jvm_args: -Xlog:all=trace:file=helloworld.log
[arguments] java_command: HelloWorld
[arguments] java_class_path (initial): .
[arguments] Launcher Type: SUN_STANDARD
The JVM verifies the target artifact, classpath, and any JVM arguments to make sure they’re valid before moving forward. This validation step helps catch many common configuration errors early in the startup process, preventing them from causing issues later on.
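To see the same launch information from inside a running program, we can query the runtime management bean. The following is a minimal sketch rather than a view into the validation step itself. The class name LaunchInfo is hypothetical, and the sun.java.command property is assumed to be HotSpot-specific:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class LaunchInfo {

    /** JVM options such as -Xlog or -Xmx; arguments to main() are not included. */
    static List<String> jvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** The class path the JVM was started with. */
    static String classPath() {
        return System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        System.out.println("jvm_args:        " + jvmArgs());
        System.out.println("java_class_path: " + classPath());
        // Assumption: this property is set by HotSpot launchers and may be null on other JVMs.
        System.out.println("java_command:    " + System.getProperty("sun.java.command"));
    }
}
```

The printed values should correspond to the jvm_args, java_class_path, and java_command entries in the log above.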
2.3. Detecting System Resources
Next, the JVM identifies the system resources available to it, such as the number of processors, the amount of memory, and key system services:
[0.007s][debug][os ] Process is running in a job with 20 active processors.
[os ] Initial active processor count set to 20
[os ] Process is running in a job with 20 active processors.
[gc,heap ] Maximum heap size 4197875712
[gc,heap ] Initial heap size 262367232
[gc,heap ] Minimum heap size 6815736
[os ] Host Windows OS automatically schedules threads across all processor groups.
[os ] 20 logical processors found.
This information guides several internal decisions the JVM makes, such as which garbage collector to select by default. The number of available CPUs and the total memory directly influence the JVM's heuristics, although most of these settings can be overridden with explicit JVM arguments. During this stage, the JVM also checks whether Native Memory Tracking is available and verifies access to various operating system utilities it might depend on.
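We can check what the JVM detected from within our own code. This small sketch uses the standard Runtime API, and the class name SystemResources is only illustrative:

```java
class SystemResources {

    /** Number of processors the JVM considers available to it. */
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Upper bound the heap may grow to, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap memory currently committed to the JVM, in bytes. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        System.out.println("Active processors: " + processors());
        System.out.println("Maximum heap:      " + maxHeapBytes() + " bytes");
        System.out.println("Committed heap:    " + committedHeapBytes() + " bytes");
    }
}
```

Running it with options such as -Xmx256m or -XX:ActiveProcessorCount=2 should change the reported values, which illustrates how explicit arguments override the detected defaults.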
2.4. Preparing the Environment
The JVM then prepares the runtime environment by generating HotSpot performance data. This data is used by profiling tools like JConsole and VisualVM:
[perf,datacreation] name = sun.rt._sync_Inflations, dtype = 11, variability = 2, units = 4, dsize = 8, vlen = 0, pad_length = 4, size = 56, on_c_heap = FALSE, address = 0x000001f3085f0020, data address = 0x000001f3085f0050
This performance data is typically stored in the system's temporary directory (for example, /tmp/hsperfdata_<username> on Linux) and continues to be generated throughout the startup process, concurrently with other initialization tasks.
3. Loading, Linking, and Initialization
Once the JVM environment is ready, it begins preparing our program for execution.
3.1. Choosing the Garbage Collector
One key step during JVM startup is selecting the garbage collector. As of JDK 23, the JVM selects G1 by default unless the system has less than 1792 MB of memory or only a single processor, in which case it falls back to the Serial GC:
[gc ] Using G1
[gc,heap,coops ] Trying to allocate at address 0x0000000705c00000 heap of size 0xfa400000
[os ] VirtualAlloc(0x0000000705c00000, 4198498304, 2000, 4) returned 0x0000000705c00000.
[os,map ] Reserved [0x0000000705c00000 - 0x0000000800000000), (4198498304 bytes)
[gc,heap,coops ] Heap address: 0x0000000705c00000, size: 4004 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
[pagesize ] Heap: min=8M max=4004M base=0x0000000705c00000 size=4004M page_size=4K
We can choose another garbage collector, such as Parallel GC (-XX:+UseParallelGC) or ZGC (-XX:+UseZGC). The available options depend on the specific JDK version and distribution.
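To confirm which collector is active without reading the logs, we can list the garbage collector management beans. This is a hedged sketch: the class name ActiveCollectors is hypothetical, and the bean names are implementation-specific, so we shouldn't rely on exact strings:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {

    /** Names of the collector beans registered by the running JVM. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The output varies with the selected collector and the JDK version.
        names().forEach(name -> System.out.println("Collector: " + name));
    }
}
```

With G1 selected, the names usually start with "G1". Starting the program with -XX:+UseSerialGC or -XX:+UseParallelGC should produce a different list.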
3.2. Class Data Sharing (CDS) Loading
At this point, the JVM starts looking for optimizations. Class Data Sharing (CDS) provides an archive of classes that have already been pre-processed, which improves the JVM's startup performance:
[cds] trying to map [Java home]/lib/server/classes.jsa
[cds] Opened archive [Java home]/lib/server/classes.jsa
However, CDS is being superseded by the ahead-of-time (AOT) cache as part of Project Leyden, which we'll discuss later.
3.3. Creating the Method Area
The JVM then creates the method area, a special off-heap memory location that stores class data. HotSpot JVM implementations call this area metaspace. Class data stored here is eligible for removal once the associated class loader is no longer reachable:
[metaspace,map ] Trying anywhere...
[metaspace,map ] Mapped at 0x000001f32b000000
While the method area is not located in the JVM’s heap, the GC still manages it.
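We can observe the off-heap areas through the memory pool management beans. The sketch below lists every non-heap pool with its current usage. The class name NonHeapPools is hypothetical, and the pool names (such as Metaspace on HotSpot) are implementation details that may differ between JVMs:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.LinkedHashMap;
import java.util.Map;

class NonHeapPools {

    /** Maps each valid non-heap memory pool name to the bytes it currently uses. */
    static Map<String, Long> usedBytes() {
        Map<String, Long> pools = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getUsage() returns null for invalid pools, so we skip those.
            if (pool.getType() == MemoryType.NON_HEAP && pool.isValid()) {
                pools.put(pool.getName(), pool.getUsage().getUsed());
            }
        }
        return pools;
    }

    public static void main(String[] args) {
        usedBytes().forEach((name, used) ->
                System.out.println(name + ": " + used + " bytes used"));
    }
}
```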
3.4. Class Loading
Class Loading is a three-step process – locating the binary representation, deriving the class from it, and loading it into the method area. It’s the ability to dynamically load classes that allows frameworks like Spring and Mockito to load classes generated on demand during the JVM’s runtime.
There are two ways we can load classes: either with the bootstrap class loader or with a custom class loader.
Now, with the help of the HelloWorld class, let’s understand what the JVM must first do:
public class HelloWorld extends Object {
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}
The JVM will load java.lang.Object and all its dependencies first. When classes are initially loaded, they exist in a largely hidden state, allowing important validation and housekeeping steps to occur.
Let's look at some of java.lang.Object's methods:
public class Object {
    // Abridged: only signatures are shown, method bodies are omitted
    public final native Class<?> getClass();
    public String toString();
    public boolean equals(Object obj);
}
These methods reference java.lang.Class and java.lang.String, which must also be loaded. In general, the JVM follows a lazy loading strategy, only loading classes when they're actively referenced. However, the classes we discussed above are eagerly loaded, as they're fundamental to the JVM's operation. The bootstrap class loader, created during JNI_CreateJavaVM(), handles the loading of these core JDK classes, while HelloWorld itself is loaded from the classpath by the application class loader.
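We can check which loader defined a class by calling getClassLoader(), which returns null for classes defined by the bootstrap class loader. Here's a minimal sketch, where LoaderDemo is a hypothetical class name:

```java
class LoaderDemo {

    /** Describes the class loader that defined the given class. */
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null result means the class was defined by the bootstrap class loader.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("java.lang.Object -> " + loaderOf(Object.class));
        System.out.println("java.lang.String -> " + loaderOf(String.class));
        System.out.println("LoaderDemo       -> " + loaderOf(LoaderDemo.class));
    }
}
```

The first two lines report the bootstrap loader. The last one reports whichever loader defined our own class, typically the built-in application class loader.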
3.5. Class Linking
Class Linking involves three sub-processes – Verification, Preparation, and Resolution. These steps don't necessarily occur in that order: Resolution can happen at any point from before Verification to after class initialization. Verification ensures that the class structure is correct:
[class,init] Start class verification for: HelloWorld
[verification] Verifying class HelloWorld with new format
[verification] Verifying method HelloWorld.<init>()V
Classes inside the CDS archive have already been verified, so they skip this step, improving the startup performance. This is one of the key benefits CDS provides. During Preparation, the JVM creates the class's static fields and sets them to their default values (0, false, or null). Explicit initializers don't run at this point; they execute later, during class initialization.
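We can make the effect of Preparation visible by reading a static field before its initializer has run. In this small sketch, PreparationDemo and its members are hypothetical names chosen for illustration:

```java
class PreparationDemo {

    // Static initializers run in textual order, so peek() executes
    // before the initializer of `answer` below has been applied.
    static int seenBeforeInit = peek();

    static int answer = 42;

    /** Returns the current value of `answer`, whatever state it is in. */
    static int peek() {
        return answer;
    }

    public static void main(String[] args) {
        System.out.println("Value seen before initializer ran: " + seenBeforeInit); // prints 0
        System.out.println("Value after class initialization:  " + answer);         // prints 42
    }
}
```

The field already exists with its default value of 0 when peek() reads it, and it only receives 42 once class initialization reaches its initializer.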
At Resolution, the JVM resolves symbolic references in the Constant Pool. The Constant Pool stores all symbolic references for a class, and the JVM must resolve them before it can execute the corresponding instructions.
We can examine this using javap:
javap -verbose HelloWorld
This shows us the Constant Pool:
Constant pool:
#1 = Methodref #2.#3 // java/lang/Object."<init>":()V
#2 = Class #4 // java/lang/Object
#3 = NameAndType #5:#6 // "<init>":()V
#7 = Fieldref #8.#9 // java/lang/System.out:Ljava/io/PrintStream;
#13 = String #14 // Hello World
The constructor’s bytecode doesn’t directly contain addresses. It references symbolic entries in the Constant Pool (like #1), which describe methods and fields. During resolution, the JVM turns those symbolic entries into real memory references that can be executed:
public HelloWorld();
  descriptor: ()V
  flags: (0x0001) ACC_PUBLIC
  Code:
    stack=1, locals=1, args_size=1
       0: aload_0
       1: invokespecial #1 // Method java/lang/Object."<init>":()V
       4: return
    LineNumberTable:
      line 2: 0
      line 4: 4
The invokespecial instruction at bytecode offset 1 references Constant Pool entry #1, which provides the information needed to link to java.lang.Object's constructor. The name <init> denotes the special instance initialization method that javac generates for each constructor. The JVM performs resolution lazily, triggering it only when it tries to execute an instruction in a class. Not all loaded classes will have their instructions executed.
3.6. Class Initialization
Class Initialization assigns values to static fields and executes static initializers. This is distinct from instance initialization, which occurs when we call a constructor. The special <clinit> method, which javac generates automatically for classes that have static initializers, handles class initialization.
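To illustrate when class initialization happens, we can record events around the first use of a class. This is a minimal sketch with hypothetical names (InitOrderDemo, Lazy, EVENTS), showing that the static initializers of Lazy run only on its first active use, in textual order:

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {

    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        // Both of these are compiled into the class initialization method of Lazy.
        static int value = compute();

        static {
            EVENTS.add("Lazy static block");
        }

        static int compute() {
            EVENTS.add("Lazy.value initializer");
            return 42;
        }
    }

    /** Touches Lazy and returns a snapshot of all events recorded so far. */
    static List<String> run() {
        EVENTS.add("before first use");
        int v = Lazy.value; // the first active use triggers initialization of Lazy
        EVENTS.add("read value " + v);
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call, the events appear in this order: "before first use", "Lazy.value initializer", "Lazy static block", and "read value 42". A class is initialized only once, so later calls to run() don't record the two initializer events again.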
4. Startup Performance and Future Improvements
Even though the JVM's startup is efficient, there's room for improvement. Let's see some pointers.
4.1. Impact of Class Loading
To measure the total time taken for the JVM to start, load classes, link them, and execute our simple program, we can use the system’s time utility:
time java HelloWorld
This measures the wall-clock time from the moment the JVM process starts to when it exits. It includes class loading, linking, JIT warm-up, and program execution – not just the user code. For HelloWorld, the JVM loads approximately 400-450 classes during startup. On modern hardware, this entire process completes in around 60 ms, even with verbose logging enabled.
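For a view from inside the process, we can read the loaded class count and the JVM uptime from the standard management beans. This sketch uses the hypothetical class name StartupStats. Note that using the management API itself loads additional classes, so the count will be higher than for a bare HelloWorld:

```java
import java.lang.management.ManagementFactory;

class StartupStats {

    /** Number of classes currently loaded in the running JVM. */
    static int loadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    /** Milliseconds elapsed since the JVM started. */
    static long uptimeMillis() {
        return ManagementFactory.getRuntimeMXBean().getUptime();
    }

    public static void main(String[] args) {
        System.out.println("Loaded classes: " + loadedClasses());
        System.out.println("JVM uptime:     " + uptimeMillis() + " ms");
    }
}
```

Unlike the time utility, these numbers exclude process creation and JVM shutdown, so the two measurements complement each other rather than match.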
4.2. Project Leyden
Project Leyden aims to improve startup time, time to peak performance, and memory footprint. JDK 24 introduced JEP 483: Ahead-of-Time Class Loading and Linking, which performs these operations ahead of time rather than at startup.
This feature records the JVM's behavior during a training run, stores the result in a cache, and loads from that cache on subsequent startups. Over time, the AOT terminology will supersede CDS, since it better encompasses the new capabilities.
4.3. JVM Flags and Tuning
While there might be opportunities to optimize startup performance through the use of static fields and initializers, we should approach this carefully. Spending time refactoring classes to push behavior into class loading phases might not yield measurable results, given that most executed code comes from dependencies rather than our application code.
5. Conclusion
In this article, we explored the complex process the JVM goes through during startup, from validating user input to detecting the system resources to loading, linking, and initializing classes. We saw how the JVM prepares an entire runtime environment before executing our code, loading hundreds of classes, even for a simple HelloWorld application.
With upcoming improvements like Project Leyden’s AOT features, the startup performance will continue to improve.