Java 21 made virtual threads a permanent feature of the platform, ending a two-release preview cycle and giving the JVM its first real answer to high-concurrency workloads since the original thread model from 1995. The pitch is compelling: write blocking, synchronous code that looks exactly like you’ve always written it, and let the runtime multiplex tens of thousands of those tasks onto a small pool of OS threads. The reality, after a year of teams adopting them in production, is that virtual threads are a strict win in the cases they were designed for and a real footgun in a couple of cases where the documentation doesn’t warn you loudly enough. This article walks through both, with the specific code patterns that cause problems and how to spot them.

What virtual threads actually are

The mental model that gets you 90% of the way there: a virtual thread is a Java Thread that doesn’t correspond to an OS thread. When the virtual thread blocks on I/O, the JVM unmounts it from its underlying carrier thread and schedules another virtual thread on the same carrier. When the I/O completes, the original virtual thread is remounted onto whatever carrier is available next and continues from where it left off. The application code never knows any of this happened — the calling code looks like a normal blocking call, the API surface is identical to java.lang.Thread.

The technical change underneath is that virtual threads use continuations, a low-level JVM primitive that lets the runtime save and restore the execution stack of a method mid-call. When the virtual thread blocks, the continuation is captured to the heap; when it resumes, the stack is rebuilt and execution continues. The continuation infrastructure isn’t directly exposed to user code (it’s marked as internal in the JEP) but it’s the reason virtual threads work without requiring you to rewrite anything as async/await callbacks.
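Because the API surface really is just java.lang.Thread, starting a virtual thread looks like starting any other thread. A minimal sketch (class and thread names here are illustrative, not from the article):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class VirtualThreadDemo {
    // Start a virtual thread through the standard java.lang.Thread API.
    // Thread.ofVirtual() returns a builder; the resulting object is a
    // plain Thread, and Thread.isVirtual() reports which kind it is.
    public static boolean runsOnVirtualThread() throws InterruptedException {
        var isVirtual = new AtomicBoolean(false);
        Thread vt = Thread.ofVirtual().name("demo-vt").start(
                () -> isVirtual.set(Thread.currentThread().isVirtual()));
        vt.join();
        return isVirtual.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("virtual: " + runsOnVirtualThread()); // prints "virtual: true"
    }
}
```

Nothing in the body of the task changes; the only difference from a platform thread is the builder that created it.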

JEP 444 specification page for Virtual Threads
JEP 444 made virtual threads a permanent feature in Java 21 after years as a preview API. The spec is short, the implications are not.

Where they’re a strict win

Virtual threads are unambiguously the right call for any workload where most of your threads are blocked on I/O most of the time. The canonical example is a request handler in a web server or RPC service: it accepts a request, calls a database, calls another service, calls a cache, formats a response, and returns. Each of those calls is a blocking I/O operation that takes 1-50 milliseconds, and during that time the thread is doing nothing — it’s waiting for bytes to arrive on a socket.

In the platform-thread model, you’d typically pool maybe 200-500 threads and accept that you can serve at most that many concurrent requests. Beyond that, requests queue up and latency spikes. With virtual threads, the same handler code can serve tens of thousands of concurrent requests because the carrier thread pool only needs enough threads to do the actual CPU work while one request is between I/O calls. Tomcat 11, Jetty 12, and Helidon Nima all support virtual-thread executors out of the box, and the migration is usually a one-line config change.
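The thread-per-request shape can be sketched with the JDK's own per-task executor; the sleep below stands in for a blocking database or RPC call, and the class name is illustrative:

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class HandlerPool {
    // Simulate an I/O-bound handler: each task "blocks" for 10 ms,
    // standing in for a database or downstream-service call.
    public static int handleAll(int requests) throws InterruptedException {
        var completed = new AtomicInteger();
        // One virtual thread per task: no pool sizing, no queue tuning.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(10)); // the "I/O" wait
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(handleAll(10_000)); // prints 10000
    }
}
```

Ten thousand concurrent sleeping tasks would exhaust a platform-thread pool; here they share a handful of carriers because each task unmounts while it waits.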

The other clear win is anywhere you’re already using a CompletableFuture chain to express what is fundamentally a sequential set of I/O calls. The CompletableFuture style is hard to read, harder to debug, and produces stack traces that are useless for tracking down where an exception came from. Replacing the chain with a virtual thread that calls each step sequentially restores readable code without losing any concurrency — the virtual thread doesn’t block the carrier during the I/O waits, so the runtime cost is the same.
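Side by side, the two styles compute the same thing; the stub methods below are hypothetical stand-ins for real blocking I/O calls:

```java
import java.util.concurrent.CompletableFuture;

public class SequentialVsChain {
    // Stub "I/O" steps; real code would block on a database, an RPC, a cache.
    static String fetchUser(int id)        { return "user-" + id; }
    static String fetchOrders(String user) { return user + ":orders"; }
    static String render(String orders)    { return "<" + orders + ">"; }

    // The CompletableFuture style: callback hops, opaque stack traces.
    static String chained(int id) {
        return CompletableFuture.supplyAsync(() -> fetchUser(id))
                .thenApply(SequentialVsChain::fetchOrders)
                .thenApply(SequentialVsChain::render)
                .join();
    }

    // The same logic as plain sequential code, meant to run on a virtual
    // thread: each call blocks the virtual thread, not its carrier.
    static String sequential(int id) {
        return render(fetchOrders(fetchUser(id)));
    }

    public static void main(String[] args) {
        System.out.println(chained(42));    // prints "<user-42:orders>"
        System.out.println(sequential(42)); // prints "<user-42:orders>"
    }
}
```

The sequential version throws exceptions with a stack trace that points at the failing step, which is exactly what the chained version loses.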

The pinning problem

The first place virtual threads bite back is what the JVM calls “pinning”. When a virtual thread is inside a synchronized block, the JVM cannot unmount it from its carrier — the carrier is pinned to that virtual thread until the synchronized block exits. If the code inside the synchronized block does I/O, the carrier thread is blocked too, and you’ve effectively lost one of your carriers for the duration of the I/O.

The same applies to native code (JNI calls) and to a few internal JDK constructs that still use synchronized monitors. The JEP describes all of these as “limitations”, and JEP 491 (targeted at Java 24) removes the synchronized restriction, but as of Java 21 and 22 it’s a real source of performance regressions.

The worst case I’ve seen in production: an HTTP client wrapped in a synchronized method to ensure only one request runs at a time per connection. Under platform threads, that’s a small contention point. Under virtual threads, every request through that client pinned its carrier thread for the entire duration of the HTTP call, and a service that should have scaled to thousands of concurrent virtual threads serialized down to the size of the carrier pool. Symptoms: carrier threads sitting blocked on I/O (idle, but holding the pin), low CPU utilization alongside low throughput, and JFR events showing virtual threads in the PINNED state.

The fix is mechanical: replace synchronized with java.util.concurrent.locks.ReentrantLock. Lock acquisition through that API does not pin the carrier. The change is two or three lines per affected class but you have to find every affected class first, and “synchronized that contains I/O” is not a static-analysis-friendly query — you have to grep for synchronized methods and audit each one.
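The shape of the fix, sketched against a hypothetical one-request-at-a-time client (the class and method names are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

public class ConnectionGuard {
    private final ReentrantLock lock = new ReentrantLock();

    // Before: synchronized String send(String request) { return blockingCall(request); }
    // On Java 21/22 that pins the carrier for the whole I/O call.
    public String send(String request) {
        lock.lock(); // blocking here lets the virtual thread unmount; no pinning
        try {
            return blockingCall(request); // still serialized, one request at a time
        } finally {
            lock.unlock();
        }
    }

    // Stand-in for the real HTTP call.
    private String blockingCall(String request) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "ok:" + request;
    }

    public static void main(String[] args) {
        System.out.println(new ConnectionGuard().send("GET /")); // prints "ok:GET /"
    }
}
```

The lock preserves the mutual exclusion the synchronized keyword provided; only the pinning behavior changes.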

Detecting pinning in real applications

JFR (Java Flight Recorder) emits a jdk.VirtualThreadPinned event every time a virtual thread is pinned to its carrier. Enable JFR in your application:

java -XX:StartFlightRecording=duration=60s,filename=trace.jfr,settings=profile MyApp

Then open the resulting file in JDK Mission Control and look for the Virtual Thread Pinned events under the Threads tab. Each event includes a stack trace showing exactly where the pinning happened. In a healthy application this list is short or empty. In an application with hidden synchronized-on-I/O patterns, the list will be long and the same stack frames will repeat.
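If you'd rather stay on the command line, the jfr tool that ships with the JDK can query the same recording directly (assuming the trace.jfr file produced by the command above):

```shell
# Print every pinning event, with stack traces, from the recording.
jfr print --events jdk.VirtualThreadPinned trace.jfr
```

Repeated stack frames in that output are your audit list of synchronized-on-I/O sites.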

For lower-friction monitoring, the system property -Djdk.tracePinnedThreads=full prints a stack trace to stdout every time a virtual thread pins its carrier. Don’t run this in production — the output volume is huge — but it’s the fastest way to find pinning sites during local testing.

Oracle documentation page for Java virtual threads
Oracle’s docs cover the API surface but the failure modes — pinning, monopolization, debugging — are mostly learned in production.

Where virtual threads make things worse

Beyond pinning, there are two specific patterns where virtual threads are actually slower or harder to work with than platform threads:

  • CPU-bound workloads. A virtual thread that’s doing pure computation never unmounts from its carrier, so the cheap blocking that justifies virtual threads never comes into play; you’ve added a layer of scheduling indirection with no benefit. For workloads where every thread is computing rather than waiting, a fixed pool of platform threads sized to your CPU count is faster and simpler. The recommendation in the JEP is explicit: don’t use virtual threads for CPU-bound work.
  • ThreadLocal-heavy code. ThreadLocal still works with virtual threads but the cost model is different. With platform threads you have hundreds of threads, each with its own ThreadLocal storage. With virtual threads you might have hundreds of thousands, and if each one allocates ThreadLocal state at startup, the heap footprint balloons. Frameworks that use ThreadLocal for context propagation (Spring’s request context, MDC for logging, OpenTelemetry’s current-span tracking) all need updating to use the newer ScopedValue API instead.

The ThreadLocal issue is the more insidious one because it doesn’t show up as a runtime error — it shows up as silently inflated memory usage and degraded GC performance. A service that worked fine on platform threads with 200 threads and 50MB of ThreadLocal state can chew through 25GB if you switch to virtual threads with 100,000 concurrent tasks and the same ThreadLocal-per-task pattern.
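The ScopedValue replacement looks like this in miniature. Note that ScopedValue is a preview API in Java 21 (compile and run with --enable-preview there); the class and field names below are illustrative:

```java
import java.util.concurrent.atomic.AtomicReference;

public class RequestContext {
    // One immutable binding per dynamic scope, instead of one mutable
    // slot per thread. The binding exists only while the scope runs,
    // so a million short-lived virtual threads don't accumulate state.
    static final ScopedValue<String> REQUEST_ID = ScopedValue.newInstance();

    // Anything called inside the binding's scope sees the value.
    static String handle() {
        return "handling " + REQUEST_ID.get();
    }

    static String withRequestId(String id) {
        var result = new AtomicReference<String>();
        // where(...) binds the value; run(...) executes within that scope.
        ScopedValue.where(REQUEST_ID, id).run(() -> result.set(handle()));
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(withRequestId("req-42")); // prints "handling req-42"
    }
}
```

Unlike a ThreadLocal, there is no set() to forget to clean up and no per-thread storage to inflate the heap; the binding is released when the scope exits.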

Migration without breaking your observability

Most real Java services run on top of an instrumentation stack — Micrometer, OpenTelemetry, Datadog or New Relic agents — that injects bytecode at startup to record per-request metrics. The instrumentation typically uses ThreadLocals or a thread-name lookup to associate metrics with the request that triggered them. When you switch to virtual threads, three things happen:

  1. Thread names become opaque (virtual threads are unnamed by default, so thread dumps and logs show only an identifier like “VirtualThread[#42]” rather than the descriptive names your platform-thread pool used).
  2. ThreadLocal-based context propagation works mechanically but allocates more state.
  3. JFR events from virtual threads use different event types than equivalent platform thread events, so dashboards and alerts that watched platform thread metrics will see no data.
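The naming problem, at least, has a cheap mitigation: build your executor from a thread factory that assigns a prefix and counter. A sketch (the prefix and class name are illustrative):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicReference;

public class NamedVirtualThreads {
    // A factory whose virtual threads are named "request-handler-0",
    // "request-handler-1", ... so dumps and logs are attributable again.
    public static String nameOfFirstTask() throws Exception {
        ThreadFactory factory = Thread.ofVirtual().name("request-handler-", 0).factory();
        var name = new AtomicReference<String>();
        try (var executor = Executors.newThreadPerTaskExecutor(factory)) {
            executor.submit(() -> name.set(Thread.currentThread().getName())).get();
        }
        return name.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nameOfFirstTask()); // prints "request-handler-0"
    }
}
```

Executors.newThreadPerTaskExecutor accepts any ThreadFactory, so this slots into the same one-line-config migration path the servers above offer.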

The fix for observability is to upgrade your instrumentation libraries to versions that explicitly support virtual threads. Micrometer 1.13+, OpenTelemetry SDK 1.36+, and the Datadog agent 1.30+ all have virtual-thread-aware adapters. If you’re on older versions, the migration is straightforward but mandatory before going to production.

Virtual threads are the right default for any new Java service that handles I/O-bound concurrency, and they’re a great fit for migrating existing services where the request handlers are already synchronous and the contention points are I/O rather than CPU. The footguns are pinning under synchronized blocks (fixable with ReentrantLock), CPU-bound work (don’t use virtual threads at all), and ThreadLocal explosion under heavy concurrency (use ScopedValue or per-request lazy storage instead). Plan a migration in three steps: upgrade to JDK 21 or later, audit synchronized blocks for hidden I/O, and update your observability stack to virtual-thread-aware versions. Done in that order, the migration is a clean win.