Method Handles: faster reflection (sometimes)


How does Java support dynamic calls? From slow reflection to the optimized MethodHandle and invokedynamic—let's explore the evolution of dynamism on the JVM and dive into how MethodHandle works under the hood, along with the roles played by CallSite and invokedynamic.

Why does Java need dynamic calls?

Let's start with a brief retrospective: just a year after Java 1.0 was released, the next version brought the first iteration of the reflection mechanism. Java code gained the ability to inspect class data (fields, methods, and constructors) and to manipulate objects at runtime, even if their types were unknown at compile time. Then JPython, one of the first well-known dynamically typed languages on the JVM, appeared and started to lean heavily on reflection.

However, why does Java need these calls? Just imagine that:

  • You need to analyze annotations in a class.
  • You're developing a framework that must work with classes unknown at compile time.
  • You need to invoke a method whose name is specified in a configuration file.

Reflection comes to the rescue! It provides the flexibility needed for such tasks. However, this convenience isn't free: it comes with a performance cost.
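Here's a minimal sketch of that last scenario, invoking a method whose name comes from configuration (the class and method names are chosen just for illustration):

import java.lang.reflect.Method;

public class ReflectiveCall {
    public static void main(String[] args) throws Exception {
        // Pretend these strings were read from a configuration file.
        String className = "java.lang.Integer";
        String methodName = "compare";

        Class<?> target = Class.forName(className);
        Method method = target.getMethod(methodName, int.class, int.class);

        // The compiler only sees Object here; type mistakes surface at runtime.
        Object result = method.invoke(null, 1, 2);
        System.out.println(result); // prints -1
    }
}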

Overhead and limitations of reflection

When developers used reflection to implement dynamic method calls on the JVM, they ran into several key issues that remain relevant today.

First, all reflective operations handle the Object type, which prevents compile-time type checking and delays all potential errors to runtime.

Second, each reflective call requires access and type verification that significantly degrade performance.

Third, the JIT compiler can't optimize these calls because they can't be inlined, compiled to machine code, or analyzed during profiling.

Inlining is an optimization that replaces a method call with its body to eliminate overheads and boost performance. Once inlined, the JIT compiler can optimize further, for example, by eliminating dead code. This practice works best with small methods.
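As a rough illustration (the JIT does this to machine code, not to source, so this is only a conceptual sketch):

// Before inlining: a call to a small method.
static int square(int x) { return x * x; }
int y = square(5);

// After inlining, conceptually: the call is replaced with its body,
// which the compiler can then fold into a constant.
int y = 5 * 5; // effectively: int y = 25;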

Unfortunately, reflection can't escape these problems, because dynamic type handling at runtime is its very core. Deferred errors, degraded performance, and missed optimizations aren't mere compromises or drawbacks. They're the inevitable price of the flexibility it offers.

What MethodHandle solves, and how

JSR 292, which introduced the MethodHandle API to Java, addressed many of reflection's issues—though it didn't fully replace it. Unlike reflection, which focuses on introspecting metadata, the MethodHandle API encapsulates method calls in a way that allows the JVM to perform optimizations via inlining and speculative compilation.

Speculative compilation (or speculation) is a JIT optimization where code is compiled specifically for the execution paths suggested by statistical assumptions (about types, values, etc.) to minimize call delays. If an assumption turns out to be wrong, the optimization is rolled back, which is called deoptimization.

To create a MethodHandle, use MethodHandles.Lookup, a factory with access control.
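For instance, here's a small sketch (not from the benchmark later in this article) that looks up the virtual method String#length:

// The lookup object captures the access rights of the class that creates it.
MethodHandles.Lookup lookup = MethodHandles.lookup();

// For a virtual method, the receiver isn't part of the MethodType;
// it becomes the first argument of the resulting handle.
MethodHandle String_length = lookup.findVirtual(
        String.class, "length", MethodType.methodType(int.class));

int len = (int) String_length.invokeExact("hello"); // 5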

The main methods for invoking a MethodHandle are #invokeExact and #invoke.

Here are their declarations:

@IntrinsicCandidate
public final native @PolymorphicSignature Object invokeExact(Object... args)
    throws Throwable;

@IntrinsicCandidate
public final native @PolymorphicSignature Object invoke(Object... args)
    throws Throwable;

Although the methods are declared as accepting and returning Object, the @PolymorphicSignature annotation makes the compiler ignore this declaration. Instead, the actual argument and return types are recorded at compile time, based on the static types of the arguments passed and the type the result is cast to.

This means each call works with the specific types written at the call site, so developers can't simply pass and receive everything as Object.

Let's look at an example: we create a MethodHandle that compares numbers using Integer#compare:

MethodHandle Integer_compare = MethodHandles.lookup().findStatic(
        Integer.class, "compare",
        MethodType.methodType(int.class, int.class, int.class)
);

We can't call #invokeExact without the result being typed as int; doing so leads to a crash at runtime:

Object result = Integer_compare.invokeExact(1, 2); // WrongMethodTypeException!

The code only works with an explicit cast to int:

int result = (int) Integer_compare.invokeExact(1, 2);

Calls to #invokeExact and #invoke aren't dispatched like ordinary method calls. They're intrinsics: the virtual machine replaces them with optimized code.

An intrinsic is a method that the JVM treats in a special way, replacing it with highly optimized native code or machine instructions at runtime.

#invokeExact requires exact signature matching, while #invoke allows type conversions as in a normal method call. However, if the call signature is unknown, or developers need more flexibility, there's another way to work with MethodHandle:

public Object invokeWithArguments(Object... arguments) throws Throwable {
    ....
}

Unlike the other methods, #invokeWithArguments really does use Object and checks the argument types dynamically at runtime. That's versatile, sure, but it hurts performance.
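Using the Integer_compare handle from above, a call might look like this (a small sketch; both the arguments and the result pass through Object):

// Arguments are boxed to Integer and type-checked at runtime.
Object result = Integer_compare.invokeWithArguments(1, 2); // -1

// An equivalent overload takes a List of arguments.
Object result2 = Integer_compare.invokeWithArguments(java.util.List.of(1, 2));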

Even though #invokeExact and #invoke are intrinsics, the JVM can't simply substitute them with direct method calls. It first rewrites the call into a LambdaForm, an internal intermediate representation that breaks the call down into more primitive operations.

LambdaForm within MethodHandle

Before exploring LambdaForm, it's important to note that it's a non-public implementation detail specific to OpenJDK. In other JVM implementations, LambdaForm may differ significantly or might not exist at all, as is the case with ART.

So, a LambdaForm represents a sequence of low-level operations, and it can be reshaped by the adapter methods in the MethodHandles class.

Let's look at MethodHandle in the following code snippet:

MethodHandle Integer_compare = MethodHandles.lookup().findStatic(
        Integer.class, "compare",
        MethodType.methodType(int.class, int.class, int.class)
);

The simplified form of its LambdaForm looks like this:

(a0:int, a1:int) => {
    return Integer.compare(a0, a1);
}

Then we apply the MethodHandles#permuteArguments adapter, which swaps the arguments:

Integer_compare = MethodHandles.permuteArguments(
        Integer_compare,
        MethodType.methodType(int.class, int.class, int.class),
        1, 0
);

Now the simplified version of LambdaForm looks like this:

(a0:int, a1:int) => {
    swap_a0 = a1
    swap_a1 = a0
    return Integer.compare(swap_a0, swap_a1);
}

The MethodHandles adapters create lightweight LambdaForm chains that help the JIT compiler inline call logic, specialize machine code, and remove intermediate steps. That's why #invokeExact and #invoke outperform reflection after warm-up: they operate on primitives that are transparent to JVM optimizations, unlike reflection, which remains a black box to the compiler.
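To make that concrete, here's a small sketch of chaining adapters on the Integer_compare handle from above (the handle names are just for illustration); each adapter adds one more step to the LambdaForm pipeline:

// Bind the second argument to 10: (int, int) -> int becomes (int) -> int.
MethodHandle compareWith10 = MethodHandles.insertArguments(Integer_compare, 1, 10);

// Pass the remaining argument through Math#abs before comparing.
MethodHandle Math_abs = MethodHandles.lookup().findStatic(
        Math.class, "abs", MethodType.methodType(int.class, int.class));
MethodHandle compareAbsWith10 = MethodHandles.filterArguments(compareWith10, 0, Math_abs);

int result = (int) compareAbsWith10.invokeExact(-3); // Integer.compare(3, 10) == -1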

JVM warm-up is the runtime period during which the JVM adapts to the actual load: it collects profiling data, applies JIT optimizations, and adjusts garbage collection.

CallSite on top of MethodHandle

Imagine that MethodHandle is a multi-tool like a screwdriver with multiple bits. At first glance, it seems perfect for all kinds of jobs. But there's a catch: each adapter, like a bit for a particular slot, is fixed once, at creation time. It's as if we soldered the bit to the handle. That's convenient, but only if the task never changes.

Why is this a disaster for dynamic calls?

Let's dive even deeper into the abstract world: we're working on a conveyor belt (a metaphor for a script engine), where:

  • Parts (the obj object) are constantly changing: today they're square, tomorrow they're round.
  • The required operations (the doSomething method) differ for each part.
  • The processing tools (the x and y arguments) differ as well.

What happens to MethodHandle in this chaos?

Our screwdriver with a soldered bit works perfectly, but only for a single, specific part and operation. The moment a different part shows up (a new type of obj) or a new task is required (a different doSomething signature), the screwdriver becomes useless. We have to toss it out, make a new one, solder on a different bit, and put it into every spot on the conveyor belt where it's needed; in other words, we create a new MethodHandle with the right adapters.

That's where CallSite comes in: a smart screwdriver holder with a quick-swap mechanism. We mount it onto the conveyor and leave it be. When the system launches for the first time, we call a bootstrap method, like a service engineer: it inspects the incoming part and the required operation, then inserts the appropriate screwdriver into the holder. If a differently shaped part arrives, the conveyor halts and an emergency engineer (a rebinder we've prepared in advance) quickly reconfigures the tool. The conveyor resumes without missing a beat.

Wait, why do we solder the bit on at all?

In most cases, the installed tool is already perfectly suited, and the JVM optimistically assumes it won't need to be changed, which lets the JIT compiler optimize it heavily and execute repeated operations at blazing speed.

Like any good tool, CallSite comes in different modifications:

  • ConstantCallSite is the simplest option: a fixed holder soldered in place. Once set, it can't be changed. It's perfect when you know the parts and operations will never vary.
  • MutableCallSite is like a quick-swap holder with #setTarget. But there's a catch: the system doesn't guarantee synchronization, so multiple engineers (threads) might try changing the tool at the same time (see the sketch after this list for a workaround).
  • VolatileCallSite is an improved version of the previous one. When engineers change a screwdriver, the change is published with volatile semantics, which ensures all workers (threads) see the new tool instead of an old one lingering in memory (cache). You can read more about volatile semantics in another one of our articles.
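If we do use MutableCallSite across threads, the class offers a static #syncAll method that forces other threads to notice a new target. A minimal sketch, reusing the Integer_compare handle from the earlier examples:

MutableCallSite site = new MutableCallSite(Integer_compare);

// Retarget the call site from a "maintenance" thread...
site.setTarget(MethodHandles.permuteArguments(
        Integer_compare,
        MethodType.methodType(int.class, int.class, int.class),
        1, 0));

// ...and force all worker threads to observe the new target promptly.
MutableCallSite.syncAll(new MutableCallSite[] { site });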

Now, imagine our conveyor belt is comparing numbers.

We mount the holder with the initial screwdriver (a standard comparison):

CallSite callSite = new MutableCallSite(Integer_compare);

Next, we get a "control panel" for whatever screwdriver is currently in the holder, a dynamic invoker handle:

MethodHandle dynamic_Integer_compare = callSite.dynamicInvoker();

Then we process a part:

result = (int) dynamic_Integer_compare.invokeExact(1, 2);
// result = -1

Suddenly, an order comes down from above: numbers must now be compared in reverse order. Without stopping the conveyor belt, we change the comparison logic on the fly:

callSite.setTarget(
    MethodHandles.permuteArguments(
        Integer_compare,
        MethodType.methodType(int.class, int.class, int.class),
        1, 0
    )
);

The next part is processed using the new logic:

result = (int) dynamic_Integer_compare.invokeExact(1, 2);
// result = 1

The main entry point for this mechanism is the invokedynamic bytecode instruction. The first time it's executed, it calls a bootstrap method to obtain a CallSite; every subsequent execution goes straight through that CallSite's current target.
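Here's a minimal sketch of what such a bootstrap method could look like (the class and method names are illustrative; invokedynamic instructions that point to it are normally emitted by a compiler or bytecode generator, not written by hand):

import java.lang.invoke.*;

public class CompareBootstrap {
    // The JVM calls this once, the first time the invokedynamic instruction
    // referencing it is executed.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name,
                                     MethodType type) throws Exception {
        MethodHandle target = lookup.findStatic(Integer.class, "compare",
                MethodType.methodType(int.class, int.class, int.class));
        // Link the call site permanently; every later execution jumps
        // straight to this target.
        return new ConstantCallSite(target.asType(type));
    }
}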

MethodHandle is faster (and it isn't)

Although the MethodHandle API is promoted as a high-performance alternative to reflection, its speed greatly depends on the usage context.

On the first call, MethodHandle performs similarly to reflection—or sometimes even slower—due to the overhead of compiling LambdaForm and initializing internal JVM structures. However, after basic JIT optimizations, MethodHandle gains advantages due to inlining and specialized machine code generation.

To prove it, let's use a JMH benchmark on Oracle OpenJDK 23 to compare direct and reflective calls, #invoke, #invokeExact, and #invokeWithArguments of the Integer#compare(int, int) method.
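The exact harness isn't shown here, but a sketch of such a benchmark could look roughly like this (class and method names are illustrative):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
public class CompareBenchmark {
    MethodHandle mh;
    Method reflective;

    @Setup
    public void setup() throws Exception {
        mh = MethodHandles.lookup().findStatic(Integer.class, "compare",
                MethodType.methodType(int.class, int.class, int.class));
        reflective = Integer.class.getMethod("compare", int.class, int.class);
    }

    @Benchmark
    public int direct() {
        return Integer.compare(1, 2);
    }

    @Benchmark
    public int invokeExact() throws Throwable {
        return (int) mh.invokeExact(1, 2);
    }

    @Benchmark
    public Object reflection() throws Exception {
        return reflective.invoke(null, 1, 2);
    }

    @Benchmark
    public Object invokeWithArguments() throws Throwable {
        return mh.invokeWithArguments(1, 2);
    }
}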

Here are the results of the "hot" run, after JIT compiler optimizations:

Method                           | Operations per µs | Average execution time (ns) | Relative speed
Direct call                      | 3123.7 ± 64.8     | 1                           | 1.0x
MethodHandle#invokeExact         | 424.6 ± 7.7       | 2 ± 1                       | 7.4x slower
MethodHandle#invoke              | 422.0 ± 12.1      | 2 ± 1                       | 7.4x slower
Method#invoke                    | 195.9 ± 11.1      | 5 ± 1                       | 15.9x slower
MethodHandle#invokeWithArguments | 11.1 ± 0.4        | 88 ± 2                      | 281.4x slower

The key points:

#invokeExact and #invoke are about 7.4 times slower than a direct call due to type checks, passage through a native adapter, and handling of the polymorphic signature.

#invokeWithArguments is slower still: it's 17.6 times slower than reflection, because each call requires type checks, argument boxing, and signature analysis, while reflection caches some of its checks after the first call.

Below are the results of the percentile analysis, measured in microseconds per operation.

A percentile is the value below which a certain percentage of observations fall. For example, a 90th percentile (p0.90) of 10 ms means that 90% of measurements were less than or equal to 10 ms.

Metric   | Direct call | MH#invokeExact | MH#invoke | Method#invoke | MH#invokeWithArguments
p0.50    | ≈ 0         | ≈ 0            | ≈ 0       | ≈ 0           | ≈ 0
p0.90    | 0.1         | 0.1            | 0.1       | 0.1           | 0.1
p0.99    | 0.1         | 0.1            | 0.1       | 0.1           | 0.2
p0.9999  | 12.0        | 9.7            | 12.5      | 7.4           | 15.8
p1.0     | 83.584      | 87.3           | 85.7      | 848.9         | 1,943.6

(All values are in µs per operation; MH stands for MethodHandle.)

Even though the typical performance of the methods is similar up to the 99th percentile, the tail latencies differ radically. Reflection shows spikes reaching 848 microseconds, about 10 times slower than invokeExact. Meanwhile, invokeWithArguments is disastrous: its maximum execution time is 1,943 microseconds, 22 times slower than invokeExact! That makes it a poor choice for high-load systems.

Tail latency refers to latency in the worst-case scenarios of system operation, measured by high percentiles (p0.99–p0.9999). It shows how slow even the slowest operations can get.

How Java uses MethodHandle: lambdas, concatenation, and VarHandles

One of the key applications of the MethodHandle API in Java has been the implementation of lambda expressions.

Let's take the following expression:

UnaryOperator<String> quote = str -> '"' + str + '"';

In the bytecode, it looks like this:

invokedynamic #7,  0
    // InvokeDynamic #0:apply:()Ljava/util/function/UnaryOperator;

....

private static java.lang.String lambda$main$0(java.lang.String);

....

BootstrapMethods:
  0: #36 REF_invokeStatic java/lang/invoke/LambdaMetafactory
             .metafactory:(....)Ljava/lang/invoke/CallSite;
    Method arguments:
      #28 (Ljava/lang/Object;)Ljava/lang/Object;
      #30 REF_invokeStatic MethodHandleSandbox
              .lambda$main$0:(Ljava/lang/String;)Ljava/lang/String;
      #33 (Ljava/lang/String;)Ljava/lang/String;

The entry point is the invokedynamic call, which refers to the bootstrap method listed below it. LambdaMetafactory#metafactory creates a lambda factory that calls the generated #lambda$main$0 method, where the concatenation happens.

Dynamic calls are also used in string concatenation: since Java 9, it too works via invokedynamic, with bootstrap methods located in java.lang.invoke.StringConcatFactory. These calls replace cumbersome StringBuilder chains and allow the JIT compiler to optimize concatenation for the specific data involved.
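For illustration, a plain concatenation such as str + "!" typically compiles to something like the following (simplified javap output; the exact constant pool indexes will differ):

invokedynamic #9,  0
    // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;

....

BootstrapMethods:
  0: REF_invokeStatic java/lang/invoke/StringConcatFactory
         .makeConcatWithConstants:(....)Ljava/lang/invoke/CallSite;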

Another important application of the MethodHandle API is VarHandle, which addresses the fundamental issues of sun.misc.Unsafe. With Unsafe, developers had to calculate memory offsets manually, risking memory corruption or JVM crashes. VarHandle offers a type-safe and flexible way to access fields, array elements, or off-heap memory. It uses MethodHandle to generate specialized operation adapters like #get, #set, #compareAndSet, and so on. This lets the JVM optimize access patterns while ensuring type and boundary safety.

Here's an example of an atomic Compare-And-Set operation using Unsafe:

class MyClass {
    int value;
}

// Get the Unsafe singleton via reflection; it isn't meant to be obtained directly.
Field Unsafe_theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
Unsafe_theUnsafe.setAccessible(true);
Unsafe unsafe = (Unsafe) Unsafe_theUnsafe.get(null);

// Manually compute the raw memory offset of the "value" field.
long MyClass_value_offset =
    unsafe.objectFieldOffset(MyClass.class.getDeclaredField("value"));

MyClass instance = new MyClass();
// Atomically change value from 0 to 1 at that offset.
boolean result = unsafe.compareAndSwapInt(instance, MyClass_value_offset, 0, 1);

Here is the same case with VarHandle:

class MyClass {
    int value;
}

// Obtain a lookup with private access to MyClass,
// then resolve a typed handle to its "value" field.
VarHandle MyClass_value = MethodHandles.privateLookupIn(
                MyClass.class, MethodHandles.lookup()
        )
        .findVarHandle(MyClass.class, "value", int.class);

MyClass instance = new MyClass();
// Atomically change value from 0 to 1; argument types are checked against the field type.
boolean result = MyClass_value.compareAndSet(instance, 0, 1);

First, VarHandle eliminates the dangerous manual offset calculations of Unsafe, which can corrupt memory or cause other critical errors.

Second, strict type safety guarantees that operations match the declared field type. For example, #compareAndSet on an int field rejects long arguments, whereas Unsafe performs no such checks.
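A quick sketch of that check in action, reusing MyClass_value from above (by default, VarHandle access modes apply invoke-style conversions, and narrowing long to int isn't one of them):

// Compiles because #compareAndSet has a polymorphic signature,
// but fails at runtime with WrongMethodTypeException: the field is an int, not a long.
boolean ok = MyClass_value.compareAndSet(instance, 0L, 1L);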

Third, code written against VarHandle remains portable across JVM implementations, whereas Unsafe has always been platform-dependent.

When should we use MethodHandle?

Dynamic calls in Java have traditionally relied on reflection. Its key weakness is performance, because the JIT compiler can't optimize such opaque calls.

Addressing reflection's issues, the MethodHandle API became a solid alternative that optimizes dynamic operations. Its main advantage is its integration with the JVM via invokedynamic, which unlocks JIT optimizations. After warm-up, #invokeExact and #invoke run roughly twice as fast as reflection, although they remain about seven times slower than direct calls.

The perfect scenario for using MethodHandle is the dynamic execution of predefined operations in hot code. For example, we can implement specialized proxies using Java's built-in Proxy mechanism. In cases where the call signature may also change, we can use CallSite to switch implementations and maintain performance.

Despite all the pros, the MethodHandle API does not allow us to introspect classes and their members, and it may perform worse than reflection when call signatures change frequently. So, whether to use reflection or the MethodHandle API depends on the specific task, and there's no one-size-fits-all answer.

This evolution, from reflection to MethodHandle and CallSite, illustrates one of the key Java trends: the desire to combine the flexibility of dynamic languages, type safety, and the performance of compiled code. Thanks to this trio, Java remains a strictly typed language that can adapt to a wide range of execution scenarios.
