Java JMH Benchmark Tutorial

By: Roy.Liu. Last updated: 2019-08-11

Benchmark                        (N)  Mode  Cnt   Score   Error  Units
BenchmarkLoop.loopFor       10000000  avgt   10  61.673 ± 1.251  ms/op
BenchmarkLoop.loopForEach   10000000  avgt   10  67.582 ± 1.034  ms/op
BenchmarkLoop.loopIterator  10000000  avgt   10  66.087 ± 1.534  ms/op
BenchmarkLoop.loopWhile     10000000  avgt   10  60.660 ± 0.279  ms/op

In Java, we can use the JMH (Java Microbenchmark Harness) framework to measure the performance of a function.

Tested with

JMH 1.21
Java 10
Maven 3.6
CPU i7-7700

In this tutorial, we will show you how to use JMH to measure the performance of different looping methods – for, while, iterator and foreach.

1. JMH

To use JMH, we need to declare the jmh-core and jmh-generator-annprocess (JMH annotation processor) dependencies.

pom.xml

    <properties>
        <jmh.version>1.21</jmh.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
        </dependency>
    </dependencies>

2. JMH – Mode.AverageTime

2.1 A JMH Mode.AverageTime example that measures the performance of different ways to loop over a List containing 10 million Strings.

BenchmarkLoop.java

package com.mkyong.benchmark;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
//@Warmup(iterations = 3)
//@Measurement(iterations = 8)
public class BenchmarkLoop {

    @Param({"10000000"})
    private int N;

    private List<String> DATA_FOR_TESTING;

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(BenchmarkLoop.class.getSimpleName())
                .forks(1) // when launched via main(), this overrides @Fork(value = 2)
                .build();
        new Runner(opt).run();
    }

    // Builds the test data once, before the benchmark methods run
    @Setup
    public void setup() {
        DATA_FOR_TESTING = createData();
    }

    @Benchmark
    public void loopFor(Blackhole bh) {
        for (int i = 0; i < DATA_FOR_TESTING.size(); i++) {
            String s = DATA_FOR_TESTING.get(i); // read the element, then consume it, to stay comparable with foreach
            bh.consume(s);
        }
    }

    @Benchmark
    public void loopWhile(Blackhole bh) {
        int i = 0;
        while (i < DATA_FOR_TESTING.size()) {
            String s = DATA_FOR_TESTING.get(i);
            bh.consume(s);
            i++;
        }
    }

    @Benchmark
    public void loopForEach(Blackhole bh) {
        for (String s : DATA_FOR_TESTING) {
            bh.consume(s);
        }
    }

    @Benchmark
    public void loopIterator(Blackhole bh) {
        Iterator<String> iterator = DATA_FOR_TESTING.iterator();
        while (iterator.hasNext()) {
            String s = iterator.next();
            bh.consume(s);
        }
    }

    private List<String> createData() {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < N; i++) {
            data.add("Number : " + i);
        }
        return data;
    }
}

2.2 With the @Fork(value = 2) annotation above, JMH creates 2 forks. Each fork runs 5 warmup iterations (to warm up the JVM; the results are discarded) and 5 measurement iterations (used for the calculation). For example:

# Run progress: 0.00% complete, ETA 00:13:20
# Fork: 1 of 2
# Warmup Iteration   1: 60.920 ms/op
# Warmup Iteration   2: 60.745 ms/op
# Warmup Iteration   3: 60.818 ms/op
# Warmup Iteration   4: 60.659 ms/op
# Warmup Iteration   5: 60.765 ms/op
Iteration   1: 63.579 ms/op
Iteration   2: 61.622 ms/op
Iteration   3: 61.869 ms/op
Iteration   4: 61.730 ms/op
Iteration   5: 62.207 ms/op
# Run progress: 12.50% complete, ETA 00:11:50
# Fork: 2 of 2
# Warmup Iteration   1: 60.915 ms/op
# Warmup Iteration   2: 61.527 ms/op
# Warmup Iteration   3: 62.329 ms/op
# Warmup Iteration   4: 62.729 ms/op
# Warmup Iteration   5: 61.693 ms/op
Iteration   1: 60.822 ms/op
Iteration   2: 61.220 ms/op
Iteration   3: 61.216 ms/op
Iteration   4: 60.652 ms/op
Iteration   5: 61.818 ms/op
Result "com.mkyong.benchmark.BenchmarkLoop.loopFor":
  61.673 ±(99.9%) 1.251 ms/op [Average]
  (min, avg, max) = (60.652, 61.673, 63.579), stdev = 0.828
  CI (99.9%): [60.422, 62.925] (assumes normal distribution)
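
The Result block above can be reproduced by hand from the 10 measurement iterations. The following is a minimal sketch, not JMH's own code: it assumes the error is the half-width of a Student's t confidence interval, and the critical value 4.781 (two-sided 99.9%, 9 degrees of freedom) is taken from a standard t-table. The class and method names are made up for illustration.

```java
import java.util.Arrays;

public class ScoreError {
    // The 10 measurement iterations of loopFor from the log above
    // (2 forks x 5 iterations), in ms/op.
    static final double[] SAMPLES = {
        63.579, 61.622, 61.869, 61.730, 62.207,
        60.822, 61.220, 61.216, 60.652, 61.818
    };

    // Assumption: two-sided 99.9% Student's t critical value for
    // 9 degrees of freedom, from a standard t-table (not read from JMH).
    static final double T_999_DF9 = 4.781;

    // Arithmetic mean of the samples; this is the Score column.
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double stdev(double[] xs) {
        double m = mean(xs);
        double sumSq = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // Half-width of the confidence interval: t * stdev / sqrt(n).
    static double error(double[] xs, double t) {
        return t * stdev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        System.out.printf("score = %.3f, stdev = %.3f, error = %.3f%n",
                mean(SAMPLES), stdev(SAMPLES), error(SAMPLES, T_999_DF9));
    }
}
```

Because the log prints each iteration rounded to three decimals, the computed values should match the reported 61.673, 0.828 and 1.251 only to within rounding.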

2.3 The number of warmup and measurement iterations is configurable:

@Warmup(iterations = 3) 		// Warmup Iteration = 3
@Measurement(iterations = 8) 	// Iteration = 8

2.4 We can even warm up entire forks before starting the real forks used for measurement.

@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"}, warmups = 2)
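
These settings directly determine how long a run takes and how many samples each benchmark produces. The sketch below is a rough estimate only. It uses the figures from this run (4 benchmark methods, 2 forks, 5 + 5 iterations, and the "10 s each" iteration time that JMH 1.21 prints in its log header) and assumes that a warmup fork runs the same iterations as a normal fork, with its results discarded. The class and method names are hypothetical.

```java
public class RunPlan {
    // Samples per benchmark (the Cnt column): measurement forks x measurement iterations.
    static int sampleCount(int forks, int measurementIterations) {
        return forks * measurementIterations;
    }

    // Rough run time in seconds. Assumption: each warmup fork costs as much
    // as a measurement fork; JVM startup and @Setup time are ignored.
    static long estimatedSeconds(int benchmarks, int forks, int warmupForks,
                                 int warmupIterations, int measurementIterations,
                                 int secondsPerIteration) {
        long iterationsPerFork = warmupIterations + measurementIterations;
        return (long) benchmarks * (forks + warmupForks) * iterationsPerFork * secondsPerIteration;
    }

    // Formats seconds as hh:mm:ss, the same shape as the ETA in the JMH log.
    static String asClock(long seconds) {
        return String.format("%02d:%02d:%02d",
                seconds / 3600, (seconds % 3600) / 60, seconds % 60);
    }

    public static void main(String[] args) {
        // 4 benchmarks, 2 forks, no warmup forks, 5 + 5 iterations of 10 s each.
        System.out.println(asClock(estimatedSeconds(4, 2, 0, 5, 5, 10))); // prints 00:13:20
        // Same plan with @Fork(warmups = 2).
        System.out.println(asClock(estimatedSeconds(4, 2, 2, 5, 5, 10))); // prints 00:26:40
        // Samples per benchmark with 2 forks and 5 measurement iterations.
        System.out.println(sampleCount(2, 5));                            // prints 10
    }
}
```

The first estimate matches the ETA of 00:13:20 printed at the start of the log in 2.2. Real runs take slightly longer because of JVM startup and data setup.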

3. How to run JMH - #1 Maven

There are two ways to run a JMH benchmark: use Maven, or run it via a JMH Runner class directly.

3.1 Maven: package the benchmark as a JAR and run it via the org.openjdk.jmh.Main class.

pom.xml

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-shade-plugin</artifactId>
			<version>3.2.0</version>
			<executions>
				<execution>
					<phase>package</phase>
					<goals>
						<goal>shade</goal>
					</goals>
					<configuration>
						<finalName>benchmarks</finalName>
						<transformers>
							<transformer
									implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
								<mainClass>org.openjdk.jmh.Main</mainClass>
							</transformer>
						</transformers>
					</configuration>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

3.2 Run mvn package to generate benchmarks.jar, then start the JAR normally.

Terminal

$ mvn package 
$ java -jar target\benchmarks.jar BenchmarkLoop

4. How to run JMH - #2 JMH Runner

You can run the benchmark via a JMH Runner class directly.

BenchmarkLoop.java

package com.mkyong.benchmark;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
public class BenchmarkLoop {

    private static final int N = 10_000_000;

    private static List<String> DATA_FOR_TESTING = createData();

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(BenchmarkLoop.class.getSimpleName())
                .forks(1) // overrides @Fork(value = 2) for this launch
                .build();
        new Runner(opt).run();
    }

    // Benchmark code: the @Benchmark methods and createData() from 2.1
    // (createData() must be static here, because a static field calls it)
}

5. Result

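5.1 Review the results: to loop over a List containing 10 million String objects, the classic while loop has the lowest average time. However, the differences are small. The slowest loop (foreach) is about 11% slower than the fastest (while), and the error ranges of the while and for loops overlap.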

Benchmark                        (N)  Mode  Cnt   Score   Error  Units
BenchmarkLoop.loopFor       10000000  avgt   10  61.673 ± 1.251  ms/op
BenchmarkLoop.loopForEach   10000000  avgt   10  67.582 ± 1.034  ms/op
BenchmarkLoop.loopIterator  10000000  avgt   10  66.087 ± 1.534  ms/op
BenchmarkLoop.loopWhile     10000000  avgt   10  60.660 ± 0.279  ms/op
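
One quick sanity check on a table like this is to see whether two Score ± Error ranges overlap. If they do, the data gives little basis for calling one loop faster. The sketch below applies that check to the numbers above. It is a rough heuristic and not a formal significance test, and the class and method names are made up for illustration.

```java
public class IntervalOverlap {
    // Returns true when [scoreA - errorA, scoreA + errorA] and
    // [scoreB - errorB, scoreB + errorB] share at least one point.
    static boolean overlaps(double scoreA, double errorA, double scoreB, double errorB) {
        double lowA = scoreA - errorA, highA = scoreA + errorA;
        double lowB = scoreB - errorB, highB = scoreB + errorB;
        return lowA <= highB && lowB <= highA;
    }

    public static void main(String[] args) {
        // Score and Error columns from the table above, in ms/op.
        System.out.println(overlaps(60.660, 0.279, 61.673, 1.251)); // while vs for: true
        System.out.println(overlaps(60.660, 0.279, 67.582, 1.034)); // while vs foreach: false
        System.out.println(overlaps(66.087, 1.534, 67.582, 1.034)); // iterator vs foreach: true
    }
}
```

By this check the index-based loops (for, while) are hard to tell apart, as are the iterator-based loops (iterator, foreach), while the two groups are separated by a few milliseconds per 10 million elements.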

5.2 The full output, for reference.

$ java -jar target\benchmarks.jar BenchmarkLoop
# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopFor
# Parameters: (N = 10000000)
# Run progress: 0.00% complete, ETA 00:13:20
# Fork: 1 of 2
# Warmup Iteration   1: 60.920 ms/op
# Warmup Iteration   2: 60.745 ms/op
# Warmup Iteration   3: 60.818 ms/op
# Warmup Iteration   4: 60.659 ms/op
# Warmup Iteration   5: 60.765 ms/op
Iteration   1: 63.579 ms/op
Iteration   2: 61.622 ms/op
Iteration   3: 61.869 ms/op
Iteration   4: 61.730 ms/op
Iteration   5: 62.207 ms/op
# Run progress: 12.50% complete, ETA 00:11:50
# Fork: 2 of 2
# Warmup Iteration   1: 60.915 ms/op
# Warmup Iteration   2: 61.527 ms/op
# Warmup Iteration   3: 62.329 ms/op
# Warmup Iteration   4: 62.729 ms/op
# Warmup Iteration   5: 61.693 ms/op
Iteration   1: 60.822 ms/op
Iteration   2: 61.220 ms/op
Iteration   3: 61.216 ms/op
Iteration   4: 60.652 ms/op
Iteration   5: 61.818 ms/op
Result "com.mkyong.benchmark.BenchmarkLoop.loopFor":
  61.673 ±(99.9%) 1.251 ms/op [Average]
  (min, avg, max) = (60.652, 61.673, 63.579), stdev = 0.828
  CI (99.9%): [60.422, 62.925] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopForEach
# Parameters: (N = 10000000)
# Run progress: 25.00% complete, ETA 00:10:08
# Fork: 1 of 2
# Warmup Iteration   1: 67.938 ms/op
# Warmup Iteration   2: 67.921 ms/op
# Warmup Iteration   3: 68.064 ms/op
# Warmup Iteration   4: 68.172 ms/op
# Warmup Iteration   5: 68.181 ms/op
Iteration   1: 68.378 ms/op
Iteration   2: 68.069 ms/op
Iteration   3: 68.487 ms/op
Iteration   4: 68.300 ms/op
Iteration   5: 67.635 ms/op
# Run progress: 37.50% complete, ETA 00:08:27
# Fork: 2 of 2
# Warmup Iteration   1: 67.303 ms/op
# Warmup Iteration   2: 67.062 ms/op
# Warmup Iteration   3: 66.516 ms/op
# Warmup Iteration   4: 66.973 ms/op
# Warmup Iteration   5: 66.843 ms/op
Iteration   1: 67.157 ms/op
Iteration   2: 66.763 ms/op
Iteration   3: 67.237 ms/op
Iteration   4: 67.116 ms/op
Iteration   5: 66.679 ms/op
Result "com.mkyong.benchmark.BenchmarkLoop.loopForEach":
  67.582 ±(99.9%) 1.034 ms/op [Average]
  (min, avg, max) = (66.679, 67.582, 68.487), stdev = 0.684
  CI (99.9%): [66.548, 68.616] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopIterator
# Parameters: (N = 10000000)
# Run progress: 50.00% complete, ETA 00:06:46
# Fork: 1 of 2
# Warmup Iteration   1: 67.336 ms/op
# Warmup Iteration   2: 73.008 ms/op
# Warmup Iteration   3: 66.646 ms/op
# Warmup Iteration   4: 70.157 ms/op
# Warmup Iteration   5: 68.373 ms/op
Iteration   1: 66.385 ms/op
Iteration   2: 66.309 ms/op
Iteration   3: 66.474 ms/op
Iteration   4: 68.529 ms/op
Iteration   5: 66.447 ms/op
# Run progress: 62.50% complete, ETA 00:05:04
# Fork: 2 of 2
# Warmup Iteration   1: 65.499 ms/op
# Warmup Iteration   2: 65.540 ms/op
# Warmup Iteration   3: 67.328 ms/op
# Warmup Iteration   4: 65.926 ms/op
# Warmup Iteration   5: 65.790 ms/op
Iteration   1: 65.350 ms/op
Iteration   2: 65.634 ms/op
Iteration   3: 65.353 ms/op
Iteration   4: 65.164 ms/op
Iteration   5: 65.225 ms/op
Result "com.mkyong.benchmark.BenchmarkLoop.loopIterator":
  66.087 ±(99.9%) 1.534 ms/op [Average]
  (min, avg, max) = (65.164, 66.087, 68.529), stdev = 1.015
  CI (99.9%): [64.553, 67.621] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopWhile
# Parameters: (N = 10000000)
# Run progress: 75.00% complete, ETA 00:03:22
# Fork: 1 of 2
# Warmup Iteration   1: 60.290 ms/op
# Warmup Iteration   2: 60.161 ms/op
# Warmup Iteration   3: 60.245 ms/op
# Warmup Iteration   4: 60.613 ms/op
# Warmup Iteration   5: 60.697 ms/op
Iteration   1: 60.842 ms/op
Iteration   2: 61.062 ms/op
Iteration   3: 60.417 ms/op
Iteration   4: 60.650 ms/op
Iteration   5: 60.514 ms/op
# Run progress: 87.50% complete, ETA 00:01:41
# Fork: 2 of 2
# Warmup Iteration   1: 60.845 ms/op
# Warmup Iteration   2: 60.927 ms/op
# Warmup Iteration   3: 60.832 ms/op
# Warmup Iteration   4: 60.817 ms/op
# Warmup Iteration   5: 61.078 ms/op
Iteration   1: 60.612 ms/op
Iteration   2: 60.516 ms/op
Iteration   3: 60.647 ms/op
Iteration   4: 60.607 ms/op
Iteration   5: 60.733 ms/op
Result "com.mkyong.benchmark.BenchmarkLoop.loopWhile":
  60.660 ±(99.9%) 0.279 ms/op [Average]
  (min, avg, max) = (60.417, 60.660, 61.062), stdev = 0.184
  CI (99.9%): [60.381, 60.939] (assumes normal distribution)
# Run complete. Total time: 00:13:31
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark                        (N)  Mode  Cnt   Score   Error  Units
BenchmarkLoop.loopFor       10000000  avgt   10  61.673 ± 1.251  ms/op
BenchmarkLoop.loopForEach   10000000  avgt   10  67.582 ± 1.034  ms/op
BenchmarkLoop.loopIterator  10000000  avgt   10  66.087 ± 1.534  ms/op
BenchmarkLoop.loopWhile     10000000  avgt   10  60.660 ± 0.279  ms/op

Note
Hopefully this tutorial gives you a quick start with JMH benchmarks. For more advanced JMH examples, see the official JMH samples.

Note
How about a forward loop vs a reverse loop? Which one is faster? See the JMH test "JMH Java Forward loop vs Reverse loop".

From: 一号门
