Performance Considerations
Java has significantly improved its performance in recent years and now provides APIs that allow developers to greatly enhance execution speed.
These include:
- virtual threads
- VarHandle
- Arena, MemorySegment, MemoryLayout
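A major performance bottleneck was often the garbage collector (GC). The GC needs time to trace and reclaim objects, and with millions of small objects this overhead (GC pressure) can slow down the whole system. VarHandle and the Arena/MemorySegment/MemoryLayout APIs address this issue by reducing the number of objects the GC has to manage.
Even with optimized records, the GC often cannot keep up with the application. The solution is off-heap memory, which resides outside the Java heap (though still inside the JVM process) and is therefore not scanned by the GC.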
In my SNN (spiking neural network), the graph contains many nodes such as dendrites, soma, axons, and synapses. To access these structures efficiently, while still maintaining an object-oriented view, I analyzed the following approaches.
1) VarHandle
A typical, always-identical structure in an SNN is the plasticity block used for STP and LTP. It consists of 9 `double` values that are modified by various methods.
With 1 million neurons, we have:
- 2 plasticity blocks × 9 values × 1,000,000 neurons → 18 million `double` attributes
- plus 1 million lock attributes
Approach
All 18 million `double` values are stored in a single array, and a view is placed on top of this data.
public static final double[] GLOBAL_DATA = new double[NUM_PLASTICITY * (PARAMS + 1)];
private static final VarHandle VH = MethodHandles.arrayElementVarHandle(double[].class);
These values are no longer managed individually by the GC. For the OO perspective, an instance provides a view into the array:
public PlasticityView(int baseIndex) { this.baseIndex = baseIndex; }
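Because several threads may modify the same block concurrently, operations on this data must be guarded by a lightweight lock (the extra slot per block in the array above).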
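The view-plus-lock idea can be sketched as follows. This is a minimal illustration, not the original implementation: the class name PlasticityViewSketch, the choice of slot 0 of each block as the lock flag (0.0 = free, 1.0 = held), and the spin-wait strategy are all assumptions. It relies on the fact that array-element VarHandles for `double[]` support compareAndSet (compared bitwise).

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch only: class name, slot layout and lock encoding are assumptions.
final class PlasticityViewSketch {
    static final int PARAMS = 9;           // doubles per plasticity block
    static final int STRIDE = PARAMS + 1;  // one extra slot used as lock flag
    static final VarHandle VH =
            MethodHandles.arrayElementVarHandle(double[].class);

    private final double[] data;
    private final int baseIndex;           // index of this block's lock slot

    PlasticityViewSketch(double[] data, int block) {
        this.data = data;
        this.baseIndex = block * STRIDE;
    }

    // Try to take the lock: CAS the lock slot from 0.0 (free) to 1.0 (held).
    boolean tryLock() {
        return VH.compareAndSet(data, baseIndex, 0.0, 1.0);
    }

    // Release the lock with release semantics so prior writes become visible.
    void unlock() {
        VH.setRelease(data, baseIndex, 0.0);
    }

    // Unlocked volatile read of one parameter; callers that need a consistent
    // snapshot of several parameters should hold the lock instead.
    double get(int param) {
        return (double) VH.getVolatile(data, baseIndex + 1 + param);
    }

    // Add delta to one parameter while holding the block's lock (spins until free).
    void add(int param, double delta) {
        while (!tryLock()) {
            Thread.onSpinWait();
        }
        try {
            int i = baseIndex + 1 + param;
            VH.set(data, i, (double) VH.get(data, i) + delta);
        } finally {
            unlock();
        }
    }
}
```

A view is cheap to create on demand, for example `new PlasticityViewSketch(GLOBAL_DATA, 42).add(3, 0.1)`, so the object-oriented perspective costs only a short-lived wrapper rather than one heap object per attribute.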
Performance
A benchmark with 2000 virtual threads executed approximately 490 million computations within 10 seconds. → roughly 49 million computations per second, or about 20 ns per computation in aggregate across all threads (rough estimate on an i9, no warmup).
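A measurement of this kind can be sketched as below. This is not the benchmark that produced the numbers above; the class VirtualThreadThroughput and its methods are hypothetical, and it assumes Java 21 or later. Note that a CPU-bound loop on a virtual thread occupies its carrier thread until it finishes, so with far more virtual threads than cores, late starters may find the deadline already passed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical throughput harness; requires Java 21+ for virtual threads.
final class VirtualThreadThroughput {

    // Runs `threads` virtual threads that repeat `op` until roughly `millis`
    // milliseconds have passed; returns the total number of completed operations.
    static long run(int threads, long millis, Runnable op) {
        LongAdder done = new LongAdder();
        long deadline = System.nanoTime() + millis * 1_000_000L;
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int t = 0; t < threads; t++) {
                pool.execute(() -> {
                    while (deadline - System.nanoTime() > 0) {
                        op.run();
                        done.increment();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.sum();
    }

    // Aggregate wall-clock time per operation in nanoseconds.
    static double nanosPerOp(long ops, long millis) {
        return millis * 1_000_000.0 / ops;
    }
}
```

For the figures quoted above, `nanosPerOp(490_000_000L, 10_000)` gives about 20.4 ns. Without a warmup phase the JIT compiler is still optimizing during the measurement, so such a number is only a rough indication.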
2) MemoryLayout
An alternative is to use Arena + MemoryLayout:
- More flexible structure definitions
- Structures can easily be written to files
- Structures can be embedded into other structures
- Off‑heap → reduced GC pressure
Example layout
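// JAVA_BOOLEAN and JAVA_DOUBLE are static imports from java.lang.foreign.ValueLayout
public static final GroupLayout ELEMENT_LAYOUT = MemoryLayout.structLayout(
    JAVA_BOOLEAN.withName("lock"),
    MemoryLayout.paddingLayout(7),   // pad the 1-byte lock so the doubles are 8-byte aligned
    JAVA_DOUBLE.withName("potential"),
    JAVA_DOUBLE.withName("rate")
    // ... more fields
);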
Sequence for 1 million elements
public static final SequenceLayout ARRAY_LAYOUT =
    MemoryLayout.sequenceLayout(1_000_000, ELEMENT_LAYOUT);
This results in 9 million doubles + 1 million lock values in memory.
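The memory footprint can be checked directly from the layout. The sketch below is an assumption-laden illustration: since the remaining field names are not listed here, it represents the 9 doubles as an embedded sequence named "params" (a hypothetical name), which also shows one structure nested inside another. It assumes Java 22 or later, where `paddingLayout` takes a size in bytes.

```java
import java.lang.foreign.GroupLayout;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.SequenceLayout;
import static java.lang.foreign.ValueLayout.JAVA_BOOLEAN;
import static java.lang.foreign.ValueLayout.JAVA_DOUBLE;

// Hypothetical stand-in for the full element layout (field names are assumed).
final class LayoutFootprint {
    // 1 lock byte + 7 padding bytes + 9 doubles (72 bytes)
    static final GroupLayout ELEMENT = MemoryLayout.structLayout(
            JAVA_BOOLEAN.withName("lock"),
            MemoryLayout.paddingLayout(7),
            MemoryLayout.sequenceLayout(9, JAVA_DOUBLE).withName("params"));

    static final SequenceLayout ARRAY =
            MemoryLayout.sequenceLayout(1_000_000, ELEMENT);

    public static void main(String[] args) {
        System.out.println(ELEMENT.byteSize()); // 80
        System.out.println(ARRAY.byteSize());   // 80000000
    }
}
```

Under these assumptions each element occupies 80 bytes, so 1 million elements need 80,000,000 bytes (about 76 MiB) of off-heap memory, including 7 padding bytes per element.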
Access via VarHandle
// sequenceElement() and groupElement() are static imports from MemoryLayout.PathElement
public static final VarHandle VH_LOCK =
    ARRAY_LAYOUT.varHandle(sequenceElement(), groupElement("lock"));
public static final VarHandle VH_POTENTIAL =
    ARRAY_LAYOUT.varHandle(sequenceElement(), groupElement("potential"));
The string names are resolved once, when the VarHandle is created, so they do not affect access performance but improve readability.
Allocation
private static final MemorySegment GLOBAL_SEGMENT =
    Arena.ofShared().allocate(PlasticityLayout.ARRAY_LAYOUT);
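Putting layout, VarHandles and allocation together, a locked update might look like the sketch below. It is a reduced illustration under several assumptions: Java 22 or later (where layout-derived VarHandles take an extra base-offset coordinate, passed as 0L), a hypothetical class OffHeapPlasticitySketch, only 1,000 elements, and an `int` lock instead of the boolean shown above. The last point matters because memory-segment VarHandles support compareAndSet only for int, long, float, double and MemorySegment, so a JAVA_BOOLEAN field cannot be acquired with a CAS.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.GroupLayout;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SequenceLayout;
import java.lang.invoke.VarHandle;
import static java.lang.foreign.ValueLayout.JAVA_DOUBLE;
import static java.lang.foreign.ValueLayout.JAVA_INT;

// Sketch only: reduced field set, int lock and element count are assumptions.
final class OffHeapPlasticitySketch {
    static final GroupLayout ELEMENT = MemoryLayout.structLayout(
            JAVA_INT.withName("lock"),       // int, so that compareAndSet is supported
            MemoryLayout.paddingLayout(4),   // keep the doubles 8-byte aligned
            JAVA_DOUBLE.withName("potential"),
            JAVA_DOUBLE.withName("rate"));
    static final SequenceLayout ARRAY = MemoryLayout.sequenceLayout(1_000, ELEMENT);

    static final VarHandle VH_LOCK = ARRAY.varHandle(
            PathElement.sequenceElement(), PathElement.groupElement("lock"));
    static final VarHandle VH_POTENTIAL = ARRAY.varHandle(
            PathElement.sequenceElement(), PathElement.groupElement("potential"));

    // Adds delta to element[index].potential while holding that element's lock.
    static void addPotential(MemorySegment seg, long index, double delta) {
        while (!VH_LOCK.compareAndSet(seg, 0L, index, 0, 1)) {
            Thread.onSpinWait();
        }
        try {
            double old = (double) VH_POTENTIAL.get(seg, 0L, index);
            VH_POTENTIAL.set(seg, 0L, index, old + delta);
        } finally {
            VH_LOCK.setRelease(seg, 0L, index, 0);
        }
    }

    // Allocates the array off-heap, updates one element twice, returns its potential.
    static double demo() {
        try (Arena arena = Arena.ofShared()) {
            MemorySegment seg = arena.allocate(ARRAY); // arena memory is zero-initialized
            addPotential(seg, 42, 0.5);
            addPotential(seg, 42, 0.25);
            return (double) VH_POTENTIAL.get(seg, 0L, 42L);
        } // closing the arena frees the off-heap memory
    }
}
```

The try-with-resources block is used here only to keep the example self-contained. A segment held in a static field for the lifetime of the application, as in the allocation above, simply stays allocated until the JVM exits.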