ZB Field Notes

Aggregate Roots in Spring Data JDBC: The Delete-Boundary Rule

I spent an evening in a throwaway Spring Boot 4 project poking at Spring Data JDBC, and it crystallised something I keep re-explaining to people coming from JPA: Spring Data JDBC takes the DDD notion of an aggregate literally. Not as a suggestion, not as a naming convention — as an enforced boundary that shapes every query it generates. Once that clicked, a lot of its “limitations” turned out to be the whole point.

This is the distilled version: what an aggregate root actually is here, a one-question rule for drawing the boundary, and how to avoid the N+1 that the boundary seems to invite. Every SQL snippet below is what the framework really emitted, not what I wish it emitted.

No cache, no lazy loading — on purpose

First, the mental model. Unlike JPA/Hibernate, Spring Data JDBC has no persistence context: no first-level cache, no identity map, no dirty checking, no lazy loading. Call a repository method and SQL runs. That's it. Your entities are plain objects (immutable records, in my project) with no proxies hiding behind them.

This isn't a missing feature — it's what makes the aggregate model coherent. An aggregate is loaded and saved as a whole, so there's no piecemeal loading for a cache or lazy proxy to optimise. You give that up and get a dead-simple guarantee in return: if I call the repository, I know exactly what SQL happens.

An aggregate is a cluster; the root is the only door

An aggregate is a set of objects that must be saved, loaded and deleted as a single unit. The aggregate root is the one object that owns the cluster and is the only way in or out. The classic example is an order and its line items — a line item has no independent life; you never load one on its own, you load the order and its items come along.

Two rules fall straight out of this in Spring Data JDBC:

  • One repository per aggregate root — nothing else. A child entity inside an aggregate never gets its own repository; you reach it through the root.
  • The whole aggregate moves together. Loading the root loads the children; saving the root replaces the children; deleting the root deletes the children.

Case 1: a child that lives inside the aggregate

Say a user has roles that only mean something in the context of that user. I embed them — a Set<Role> right on the User record, no @Id on Role, no RoleRepository:

@Table("users")
public record User(@Id long id,
                   String username,
                   String email,
                   @MappedCollection(idColumn = "user_id") Set<Role> roles) {
}

@Table("roles")
public record Role(String name) { }

A single save() call for a user with two roles fires:

INSERT INTO "users" ("email", "username") VALUES (?, ?)
INSERT INTO "roles" ("name", "user_id") VALUES (?, ?)   -- once per role

Note the user_id back-reference I never set by hand. Loading it back is eager, in two queries assembled into one object:

SELECT ... FROM "users" WHERE "users"."id" = ?
SELECT "roles"."name" FROM "roles" WHERE "roles"."user_id" = ?
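
Since there is no RoleRepository, changing a user's roles means building a new User and saving that. Here is a minimal sketch of the idea, with the Spring mapping annotations left out so it runs standalone. The withRole helper is hypothetical, something I would add myself, not part of Spring Data JDBC.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class RolesThroughRoot {

    // Same shape as the records above; mapping annotations omitted so this compiles without Spring.
    record Role(String name) { }

    record User(long id, String username, String email, Set<Role> roles) {

        User {
            roles = Set.copyOf(roles); // defensive, unmodifiable copy
        }

        // Hypothetical helper: returns a copy of this user with one more role.
        User withRole(Role role) {
            Set<Role> updated = new LinkedHashSet<>(roles);
            updated.add(role);
            return new User(id, username, email, updated);
        }
    }

    public static void main(String[] args) {
        User before = new User(1L, "zb", "zb@example.com", Set.of(new Role("USER")));
        User after = before.withRole(new Role("ADMIN"));
        System.out.println(before.roles().size()); // 1
        System.out.println(after.roles().size());  // 2
        // In the real project the next step would be userRepository.save(after).
    }
}
```

The role never gets touched on its own: the only thing that can be saved is the whole user.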

Case 2: a thing that stands on its own

An order is different. It's a real business fact with its own lifecycle — it outlives the user's session, it belongs on the books. So it's a separate aggregate root: its own @Table, its own OrderRepository, and it does not embed a User. It references the user by id via AggregateReference:

@Table("orders")
public record Order(@Id long id,
                    @Column("user_id") AggregateReference<User, Long> user,
                    String product) {
}

That reference is just a typed wrapper around the foreign-key value. Saving an order touches nothing in users or roles — the boundary holds.
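
To make "typed wrapper around the foreign-key value" concrete, here is a stand-in written in plain Java. Ref is hypothetical and only mimics the idea. In the real project the type is Spring Data JDBC's AggregateReference, created with AggregateReference.to(id) and read with getId().

```java
final class TypedReferenceSketch {

    record User(long id, String username) { }

    // Hypothetical stand-in for AggregateReference<T, ID>: it holds only the id.
    // The type parameter T just records which aggregate the id points at.
    record Ref<T, ID>(ID id) {
        static <T, ID> Ref<T, ID> to(ID id) {
            return new Ref<>(id);
        }
    }

    // The order knows the user's id, never the User object itself.
    record Order(long id, Ref<User, Long> user, String product) { }

    public static void main(String[] args) {
        Order order = new Order(0L, Ref.to(42L), "keyboard");
        System.out.println(order.user().id()); // 42, the raw foreign-key value
    }
}
```

There is no User to reach through the reference, only the id, which is why saving an order cannot spill into another aggregate.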

The delete-boundary rule

Here's the whole thing in one question. When you're unsure whether a related object belongs inside an aggregate or should be its own aggregate, ask:

“If I delete the root, should this child die with it?” Yes → it's inside the aggregate (embed it). No → it's a separate aggregate (reference by id).

Watch it play out in the generated SQL. A single delete() call on the user root issued the whole cascade in the correct FK order, with no ON DELETE CASCADE in the database:

SELECT "users"."id" FROM "users" WHERE "users"."id" = ? FOR UPDATE OF "users"
DELETE FROM "roles" WHERE "roles"."user_id" = ?   -- children first, automatically
DELETE FROM "users" WHERE "users"."id" = ?        -- then the root

The order got none of that. It's a separate root, so I had to delete it myself — and the user_id FK would have actively blocked me from deleting the user first. That's the rule made physical: roles answered “die with it” and got the automatic cascade; the order answered “I outlive you” and kept its independent lifecycle.
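
The ordering constraint is easy to see in a toy model. This is an in-memory sketch, not Spring code: two maps stand in for the users and orders tables, and every name in it is hypothetical. It only mimics the foreign-key behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;

final class DeleteBoundarySketch {

    // In-memory stand-ins for the two tables; for illustration only.
    private final Map<Long, String> users = new HashMap<>();    // user id -> username
    private final Map<Long, Long> orderOwner = new HashMap<>(); // order id -> user id

    void addUser(long id, String username) { users.put(id, username); }
    void addOrder(long orderId, long userId) { orderOwner.put(orderId, userId); }
    boolean hasUser(long id) { return users.containsKey(id); }
    int orderCount() { return orderOwner.size(); }

    // Mimics the user_id foreign key: a user that orders still point at cannot be removed.
    void deleteUser(long userId) {
        if (orderOwner.containsValue(userId)) {
            throw new IllegalStateException("orders still reference user " + userId);
        }
        users.remove(userId);
    }

    // Orders are their own aggregate, so the caller removes them explicitly.
    void deleteOrdersOf(long userId) {
        orderOwner.values().removeIf(owner -> owner == userId);
    }

    public static void main(String[] args) {
        DeleteBoundarySketch db = new DeleteBoundarySketch();
        db.addUser(1L, "zb");
        db.addOrder(10L, 1L);
        try {
            db.deleteUser(1L); // blocked: the order still references the user
        } catch (IllegalStateException e) {
            System.out.println("blocked: " + e.getMessage());
        }
        db.deleteOrdersOf(1L); // deliberate, separate step
        db.deleteUser(1L);     // now succeeds
        System.out.println(db.hasUser(1L)); // false
    }
}
```

In a real service I would expect both deletes to sit inside one transaction, but that is a design choice on top of the rule, not something the framework does for you.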

The deeper point: the question isn't “is there a relationship?” — User↔Role and User↔Order are both relationships. It's about ownership. A tell I trust: if you ever want a repository to query the child without going through its parent, the child actually wants to be its own aggregate.

Then how do you avoid N+1?

The obvious objection: if orders are a separate aggregate, loading orders for a page of users is 1 query for the users plus N for their orders. The instinct from JPA is to fetch-join across the relationship. Spring Data JDBC won't let you — there's no graph to traverse across a boundary, on purpose.

The fix isn't to dissolve the boundary. It's to batch-load by id and stitch in memory. One IN query, grouped in code:

// In OrderRepository: one IN query covers every user on the page
@Query("SELECT * FROM orders WHERE user_id IN (:userIds)")
List<Order> findByUserIdIn(Collection<Long> userIds);

// At the call site: collect the ids, fetch once, group in memory
List<Long> ids = users.stream().map(User::id).toList();
Map<Long, List<Order>> ordersByUser = orderRepository.findByUserIdIn(ids).stream()
        .collect(Collectors.groupingBy(order -> order.user().getId()));

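That's a flat 1 + 1 = 2 queries for the users-plus-orders read, regardless of how many users are on the page. (Each user's embedded roles still load with that user, as in Case 1.) The SQL that actually ran confirms it: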

SELECT * FROM orders WHERE user_id IN (?, ?)
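
One detail the stitch needs to handle: users with no orders are simply absent from the grouped map. Below is a self-contained sketch of the full stitch with plain records. It assumes the order carries a bare userId instead of an AggregateReference so it runs without Spring, and UserWithOrders is a hypothetical holder type.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class BatchStitchSketch {

    record User(long id, String username) { }
    record Order(long id, long userId, String product) { }

    // Hypothetical holder for the stitched result.
    record UserWithOrders(User user, List<Order> orders) { }

    // Pairs each user with its orders; users without orders get an empty list.
    static List<UserWithOrders> stitch(List<User> users, List<Order> orders) {
        Map<Long, List<Order>> byUser = orders.stream()
                .collect(Collectors.groupingBy(Order::userId));
        return users.stream()
                .map(u -> new UserWithOrders(u, byUser.getOrDefault(u.id(), List.of())))
                .toList();
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User(1L, "ann"), new User(2L, "bob"));
        // Stands in for the result of the single IN query.
        List<Order> orders = List.of(
                new Order(10L, 1L, "keyboard"),
                new Order(11L, 1L, "mouse"));
        for (UserWithOrders row : stitch(users, orders)) {
            System.out.println(row.user().username() + " -> " + row.orders().size());
        }
        // ann -> 2
        // bob -> 0
    }
}
```

Using getOrDefault keeps the empty case out of the calling code, which otherwise has to null-check every lookup.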

When you specifically want a single round-trip or a bespoke shape, the other move is a dedicated read model — a projection/DTO backed by a custom JOIN query, or a database view mapped read-only. That's CQRS in miniature: your aggregates are the write model tuned for consistency, and your reads use purpose-built query models. Spring Data JDBC won't auto-nest two aggregates from a join, so you group the flat rows yourself or register a ResultSetExtractor.
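
The "group the flat rows yourself" step can be sketched in plain Java. This assumes a hypothetical LEFT JOIN of users to orders that yields one Row per user/order pair, with null order columns for users who have no orders. Row, OrderLine and UserOrdersView are illustrative names, not Spring types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReadModelSketch {

    // One row of the assumed join result; order columns are null when the user has no orders.
    record Row(long userId, String username, Long orderId, String product) { }

    record OrderLine(long orderId, String product) { }
    record UserOrdersView(long userId, String username, List<OrderLine> orders) { }

    // Folds flat join rows into one view per user, keeping first-seen user order.
    static List<UserOrdersView> group(List<Row> rows) {
        Map<Long, UserOrdersView> byUser = new LinkedHashMap<>();
        for (Row row : rows) {
            UserOrdersView view = byUser.computeIfAbsent(row.userId(),
                    id -> new UserOrdersView(id, row.username(), new ArrayList<>()));
            if (row.orderId() != null) { // skip the null order side of a LEFT JOIN
                view.orders().add(new OrderLine(row.orderId(), row.product()));
            }
        }
        return List.copyOf(byUser.values());
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1L, "ann", 10L, "keyboard"),
                new Row(1L, "ann", 11L, "mouse"),
                new Row(2L, "bob", null, null));
        for (UserOrdersView view : group(rows)) {
            System.out.println(view.username() + " -> " + view.orders().size());
        }
        // ann -> 2
        // bob -> 0
    }
}
```

The view types are read-only shapes for a screen or an API response. They never go back through a repository, which is what keeps them separate from the write-side aggregates.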

What I took away

Spring Data JDBC's refusal to traverse aggregate boundaries felt like a limitation for about ten minutes, then flipped into the thing I like most about it. In JPA, N+1 hides behind a lazy proxy until it bites you in production. Here it can't hide — the boundary forces every cross-aggregate read to be a deliberate choice: a batch query, or a read model. You keep crisp write-side aggregates and efficient reads, and the delete-boundary question keeps the whole thing honest. That's a trade I'll take.