Manual tracing in Spring Boot with OpenTelemetry: @WithSpan, hand-built spans, and the traps in between
I spent an evening wiring manual tracing into a fresh Spring Boot service with the OpenTelemetry Java SDK — the annotation way (@WithSpan) and the hand-built way (Tracer.spanBuilder()). It's a small amount of code, but the interesting part is everything around it: a Docker gotcha that crashed three containers, a dependency that Spring Boot 4 quietly removed, and one mental model (context inheritance) that makes the whole thing click. Here are the field notes.
The setup: app → Collector → Tempo
Stack is a stock Spring Boot 4.1 app on Java 25, talking OTLP to an OpenTelemetry Collector, which fans out to Tempo (traces), Mimir (metrics) and Loki (logs), all viewed in Grafana. The app never talks to Tempo directly — it ships to the Collector, and the Collector owns batching, retries and routing. Swap a backend by editing the Collector, not by redeploying the app.
Two pieces do the heavy lifting: the OpenTelemetry Spring Boot starter, and the instrumentation BOM that pins its version (which is why the dependency below carries no version of its own).
<dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
</dependency>
Trap #1: @WithSpan that silently does nothing
I added the annotation, ran the app, saw my log line — and got zero spans. No error, no warning. The reason: @WithSpan is implemented as a Spring AOP aspect, and the starter only registers that aspect when AspectJ is on the classpath. No AspectJ, no aspect, and the annotation degrades to a silent no-op.
The fix would normally be spring-boot-starter-aop — except Spring Boot 4 dropped that starter (it stops at 3.5.x). The actual requirement is narrower anyway: spring-aop is already transitive via spring-context, so you only need the weaver.
<dependency>
    <groupId>org.aspectj</groupId>
    <artifactId>aspectjweaver</artifactId>
</dependency>
Lesson: when an annotation "does nothing", check whether something needs to weave it before you suspect your own code.
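A quick way to act on that lesson is a startup probe that tells you whether the weaver is visible at all. The sketch below is a plain-JDK diagnostic, not part of the starter. ClasspathProbe is a hypothetical helper, and probing org.aspectj.lang.annotation.Aspect is my assumption about a representative AspectJ class; I have not checked which class the starter's condition actually tests for.

```java
// Diagnostic sketch: report whether a class can be loaded, without initializing it.
final class ClasspathProbe {

    /** Returns true if the named class is visible to this class's loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: resolve the class only, do not run its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this annotation type is a reasonable marker for "AspectJ is here".
        String marker = "org.aspectj.lang.annotation.Aspect";
        System.out.println(marker + " present: " + isPresent(marker));
    }
}
```

If this prints false in the same runtime where @WithSpan produces no spans, the missing weaver is the first thing to rule out.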
Trap #2: http/protobuf is not gRPC
Pointing the exporter at the Collector is two lines, but the protocol name trips people up:
otel.exporter.otlp.endpoint=http://localhost:4318
otel.exporter.otlp.protocol=http/protobuf
http/protobuf means protobuf bytes over plain HTTP, not gRPC. Both encode protobuf; the difference is the pipe: plain HTTP (typically HTTP/1.1) on port 4318 vs gRPC over HTTP/2 on 4317. The endpoint is also a base URL: the SDK appends /v1/traces itself. Hardcode the full path and you get /v1/traces/v1/traces and a 404.
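The doubled path is easier to see as plain string arithmetic. This is a toy model of the rule just described (base endpoint plus /v1/ plus the signal name), not the SDK's own code. OtlpEndpointModel and signalUrl are hypothetical names.

```java
// Toy model of "base endpoint + per-signal path"; not the SDK implementation.
final class OtlpEndpointModel {

    /** Joins a base endpoint and a signal name as base + "/v1/" + signal. */
    static String signalUrl(String baseEndpoint, String signal) {
        // Drop one trailing slash so the join never produces "//v1/..."
        String base = baseEndpoint.endsWith("/")
                ? baseEndpoint.substring(0, baseEndpoint.length() - 1)
                : baseEndpoint;
        return base + "/v1/" + signal;
    }

    public static void main(String[] args) {
        // Base URL, as intended:
        System.out.println(signalUrl("http://localhost:4318", "traces"));
        // prints http://localhost:4318/v1/traces

        // Full path pasted into the base setting by mistake:
        System.out.println(signalUrl("http://localhost:4318/v1/traces", "traces"));
        // prints http://localhost:4318/v1/traces/v1/traces
    }
}
```

If you do want to supply the full URL, the per-signal setting (otel.exporter.otlp.traces.endpoint) is used as given rather than having a path appended.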
@WithSpan vs. the manual API
The annotation is the 95% case. It needs no Tracer — the aspect starts the span, makes it current for the method body, and ends it on return:
@WithSpan(value = "myStuffYouknow", kind = SpanKind.CONSUMER)
public void doTheFirstThing() {
    log.info("doing the first thing");
    doTheManualChild("zakaria");
}
You reach for the manual API when you need a sub-method span, custom attributes/events, or explicit error status. This is the one place you actually inject a Tracer (from the auto-configured OpenTelemetry bean):
Span span = tracer.spanBuilder("manualChildSpan")
        .setAttribute("app.who", who)
        .startSpan();
try (Scope scope = span.makeCurrent()) {
    span.addEvent("about-to-do-work");
    // ...work...
} catch (RuntimeException e) {
    span.recordException(e);
    span.setStatus(StatusCode.ERROR, "manual child failed");
    throw e;
} finally {
    span.end(); // always end, on success or failure
}
Two disciplines matter: makeCurrent() returns a Scope you must close (try-with-resources) or context leaks into the next call on that thread; and span.end() belongs in finally so the clock stops even on an exception.
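To make the "context leaks into the next call" failure concrete, here is a small stand-alone model of the scope discipline. It uses a bare ThreadLocal<String> in place of the real Context, and ScopeLeakModel and MiniScope are hypothetical names, so treat it as an illustration of the mechanism rather than the SDK's behavior in every detail.

```java
// Toy model: a per-thread "current span" slot plus a scope that restores it on close.
final class ScopeLeakModel {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String current() {
        return CURRENT.get();
    }

    /** Makes spanName current and returns a handle that restores the previous value. */
    static MiniScope makeCurrent(String spanName) {
        String previous = CURRENT.get();
        CURRENT.set(spanName);
        return () -> CURRENT.set(previous);
    }

    /** Disciplined: try-with-resources closes the scope, so nothing is left behind. */
    static String disciplined() {
        try (MiniScope ignored = makeCurrent("span-A")) {
            // work runs here with span-A current
        }
        return current();
    }

    /** Leaky: the scope is dropped without close(), so span-B stays current. */
    static String leaky() {
        makeCurrent("span-B");
        return current();
    }

    public static void main(String[] args) {
        System.out.println("after disciplined: " + disciplined()); // after disciplined: null
        System.out.println("after leaky: " + leaky());             // after leaky: span-B
    }
}

// AutoCloseable narrowed so close() throws no checked exception.
interface MiniScope extends AutoCloseable {
    @Override
    void close();
}
```

On a pooled request thread, that leftover span-B is what the next unrelated request would inherit as its parent.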
The mental model: context is a ThreadLocal
Notice I never told the manual span who its parent was. It inherited it automatically. That's the key idea: an active span is stored as the current context on the thread (a ThreadLocal). Any span built while it's current defaults to it as parent. So a manual span created inside a @WithSpan method nests under it with zero wiring — and the exact same mechanism carries a trace across services, where HTTP instrumentation extracts the parent from a traceparent header before your handler runs.
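The inheritance rule fits in a few lines of plain Java. The following is a toy model, assuming nothing beyond what the paragraph above says: a per-thread slot holds the current span, and a new span copies its parent from that slot at creation time. ContextModel and MiniSpan are hypothetical names, and the IDs are simple counters rather than real trace or span IDs.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Toy model of implicit parenting through a per-thread "current span" slot.
final class ContextModel {
    private static final ThreadLocal<MiniSpan> CURRENT = new ThreadLocal<>();
    private static final AtomicLong IDS = new AtomicLong();

    /** Builds a span whose parent is whatever is current on this thread, if anything. */
    static MiniSpan start(String name) {
        MiniSpan parent = CURRENT.get();
        String spanId = "span-" + IDS.incrementAndGet();
        if (parent == null) {
            // Nothing current: this span is a root and opens a new trace.
            return new MiniSpan(name, "trace-" + IDS.incrementAndGet(), spanId, null);
        }
        // Something current: join its trace and point at it as parent.
        return new MiniSpan(name, parent.traceId(), spanId, parent.spanId());
    }

    /** Runs body with span current, then restores the previous value. */
    static <T> T withCurrent(MiniSpan span, Supplier<T> body) {
        MiniSpan previous = CURRENT.get();
        CURRENT.set(span);
        try {
            return body.get();
        } finally {
            CURRENT.set(previous);
        }
    }

    public static void main(String[] args) {
        MiniSpan outer = start("myStuffYouknow");
        // The inner span is never told about outer; it finds it in the slot.
        MiniSpan inner = withCurrent(outer, () -> start("manualChildSpan"));
        System.out.println(outer);
        System.out.println(inner);
    }
}

// Just the identifiers that define the parent/child link; parentSpanId is null for a root.
record MiniSpan(String name, String traceId, String spanId, String parentSpanId) {}
```

Note that start() only reads the slot. As in the real API, creating a span and making it current are separate steps.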
The annotation exposes a knob for this: @WithSpan(inheritContext = false) cuts the link upward — it starts a brand-new root trace — but still parents anything created inside it. Handy for background jobs or queue consumers you don't want glued to the triggering request's trace.
The trap on the other side: context lives on the thread, so the moment you hop to a new Thread or a plain executor, the current span isn't there anymore and your "child" silently becomes a second root. Carry it across explicitly: capture Context.current() on the submitting thread and call makeCurrent() on it inside the task (closing the returned Scope when the task finishes), or wrap the executor with Context.taskWrapping().
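Here is the thread hop in miniature, using only the JDK. A bare ThreadLocal<String> stands in for the current context, and the hypothetical wrap() helper does by hand what I understand Context.current().wrap(...) and Context.taskWrapping(...) to do for you: capture on the submitting thread, re-attach on the worker, restore afterwards. It is a sketch of the idea, not the OpenTelemetry implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model: a thread-bound value is lost on an executor unless the task carries it.
final class ThreadHopModel {
    private static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    /** Captures the caller's current value now and re-attaches it around the task later. */
    static <T> Callable<T> wrap(Callable<T> task) {
        String captured = CURRENT_SPAN.get(); // read on the submitting thread
        return () -> {
            String previous = CURRENT_SPAN.get(); // whatever the worker had before
            CURRENT_SPAN.set(captured);
            try {
                return task.call();
            } finally {
                CURRENT_SPAN.set(previous); // do not leak into the worker's next task
            }
        };
    }

    /** Returns what a plain task and a wrapped task each see as the current span. */
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> readCurrent = CURRENT_SPAN::get;
        CURRENT_SPAN.set("myStuffYouknow");
        try {
            String plain = pool.submit(readCurrent).get();
            String wrapped = pool.submit(wrap(readCurrent)).get();
            return new String[] {plain, wrapped};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_SPAN.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("plain task sees:   " + seen[0]); // plain task sees:   null
        System.out.println("wrapped task sees: " + seen[1]); // wrapped task sees: myStuffYouknow
    }
}
```

The plain task's null is the "second root" case: with nothing current on the worker, any span started there has no parent to inherit.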
Proof, from the Collector's own logs
Flip the Collector's debug exporter to verbosity: detailed and you can read the relationship straight off the wire — no UI needed:
Span #0 myStuffYouknow
Trace ID : b38a2059...b18ac9
Parent ID: (root)
Span ID : 6719568e06867292
Span #1 manualChildSpan
Trace ID : b38a2059...b18ac9 <- same trace
Parent ID: 6719568e06867292 <- == parent's Span ID
Same Trace ID, and the child's Parent ID equals the parent's Span ID. That's the entire parent/child contract — no foreign keys, the relationship rides inline on every span, and Tempo needs nothing more to draw the waterfall. Bonus: the two spans report different instrumentation scopes (spring-boot-autoconfigure vs my own otelzak-manual), which is how you tell auto-instrumented spans from hand-rolled ones in a real trace.
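Written as code, the contract is a two-clause predicate. The sketch below is only an illustration. SpanRecord and isChildOf are hypothetical names, the IDs are copied from the log excerpt above (the trace ID stays truncated exactly as printed), and the child's own Span ID is a placeholder because the excerpt does not show it.

```java
// The parent/child contract as a predicate over two flat span records.
final class ParentChildContract {

    /** True when both spans share a trace and the child's parent ID is the parent's span ID. */
    static boolean isChildOf(SpanRecord child, SpanRecord parent) {
        return child.traceId().equals(parent.traceId())
                && parent.spanId().equals(child.parentSpanId());
    }

    public static void main(String[] args) {
        // Values copied from the Collector output; the trace ID is truncated there and left as-is.
        SpanRecord root = new SpanRecord(
                "myStuffYouknow", "b38a2059...b18ac9", "6719568e06867292", null);
        // Placeholder span ID: the child's own ID is not part of the excerpt.
        SpanRecord child = new SpanRecord(
                "manualChildSpan", "b38a2059...b18ac9", "<child-span-id>", "6719568e06867292");

        System.out.println(isChildOf(child, root)); // true
        System.out.println(isChildOf(root, child)); // false
    }
}

// parentSpanId is null for a root span.
record SpanRecord(String name, String traceId, String spanId, String parentSpanId) {}
```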
Bonus trap: Docker's "is a directory"
Before any of this worked, three Collector-stack containers crash-looped with read /etc/tempo/config.yaml: is a directory. Cause: the compose file mounted config.yaml but the files on disk were config.yml. When a bind-mount's host path doesn't exist, Docker doesn't error — it helpfully creates an empty directory there, and the container then opens a directory where it expected a file. One character of extension mismatch, three dead containers. Worth remembering.
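A pre-flight check would have caught this before the first docker compose up. The following is a hypothetical helper, not anything Docker or Compose provides. It classifies each host path you intend to bind-mount as a file, a directory, or missing, and the demo() method reproduces the .yml/.yaml mismatch in a temp directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Pre-flight sketch: classify host paths before handing them to a bind mount.
final class BindMountPreflight {
    enum Status { FILE, DIRECTORY, MISSING }

    static Status check(Path hostPath) {
        if (Files.isRegularFile(hostPath)) return Status.FILE;
        if (Files.isDirectory(hostPath)) return Status.DIRECTORY;
        return Status.MISSING;
    }

    /** Recreates the mismatch: config.yml exists on disk, config.yaml is what gets mounted. */
    static List<Status> demo() {
        try {
            Path dir = Files.createTempDirectory("mounts");
            Path onDisk = Files.createFile(dir.resolve("config.yml"));
            Path mounted = dir.resolve("config.yaml"); // never created
            return List.of(check(onDisk), check(mounted), check(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println(demo()); // [FILE, MISSING, DIRECTORY]
        }
        for (String arg : args) {
            System.out.println(arg + " -> " + check(Path.of(arg)));
        }
    }
}
```

MISSING is the state before Docker has run. DIRECTORY on a path that should be a config file is the state after, once the empty directory has been created on the host, and that directory needs deleting before the corrected mount will work.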
Takeaways
- @WithSpan needs no Tracer — but it does need AspectJ on the classpath to weave.
- Context is a ThreadLocal — nesting is implicit; thread hops break it.
- Same Trace ID, plus a Parent ID that equals the parent's Span ID: that is the whole parent/child contract.
- Reach for the annotation by default; drop to Tracer.spanBuilder() when you need finer control or explicit error status.
Small surface area, a few sharp edges. Once the context model clicks, manual tracing stops feeling like magic and starts feeling like plumbing — which is exactly what you want when you're debugging a production trace at 2am.