
JB Rainsberger on Testing, Design, and Working in Real Teams

Summary

Key Takeaways

You can get most of your confidence from very small, fast tests, plus a few carefully placed integration checks at boundaries.

Bad tests and bad process are not destiny: delete tests that block you, design around contracts and collaboration, and learn to live sanely with impatient customers and real-world pressure.

Micro Tests, Integrated Tests, and End-to-End Tests

Most confusion in testing comes from people using the same words to mean different things, so it helps to define them by intent and scope, not dogma.

End-to-end tests exercise the system from an external entry point (often a UI or HTTP endpoint) all the way through to external effects (database, APIs, UI) and back; they are broad, slow, and fragile, but good for verifying that "the whole thing basically hangs together."

Micro tests (sometimes called very small unit tests) focus on a tiny piece of logic—often a single function or a small cluster—and run entirely in memory, with no network, database, or framework bootstrapping.

"Integrated test" is best understood as "a test that runs a lot more of the system than you actually care about for this check"—for example, wiring up the full HTTP stack, Spring context, and database just to see if one controller branch chooses the right view.

JB intentionally prefers "integrated tests" over "integration tests" to emphasize the problem: running big slabs of the system when you only need to check a small interaction, which makes failures hard to diagnose and tests slow and brittle.
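
As a minimal sketch of the difference, here is that controller-branch check written as a micro test; all names (ProductController, Catalog, the view names) are hypothetical, and the point is only that the question "does this branch choose the right view?" needs no HTTP stack, Spring context, or database.

```java
// Hypothetical names throughout; JUnit 5 assumed.
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class ControllerBranchMicroTest {

    // The collaborator the real controller would use to look things up.
    interface Catalog {
        String findProductName(String code); // null means "not in the catalog"
    }

    // A stripped-down stand-in for the controller branch under discussion.
    static class ProductController {
        private final Catalog catalog;

        ProductController(Catalog catalog) {
            this.catalog = catalog;
        }

        // Returns the logical view name the framework would render.
        String show(String code) {
            return catalog.findProductName(code) == null
                    ? "product/not-found"
                    : "product/details";
        }
    }

    @Test
    void choosesNotFoundViewWhenProductIsMissing() {
        // No web server, no Spring context, no database: the catalog is a lambda.
        ProductController controller = new ProductController(code -> null);

        assertEquals("product/not-found", controller.show("no-such-code"));
    }
}
```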

How to Decide Test Scope: Start from "What Are You Trying to Check?"

Instead of obsessing over where the "unit" boundary is, start from the question: "What exactly am I trying to gain confidence about?"

If you just want to know whether a controller chooses the right view or calls the right collaborator, you don't need a browser, a web server, and a real database; those details are irrelevant noise for that question.

Any time you catch yourself thinking, "Do I really have to click through steps 1–3 of this wizard to test step 4?" that's a signal the test is too high-level—replace the UI and plumbing with direct calls to the core logic ("testing under the skin").
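
A small sketch of that idea under assumed names: rather than scripting a UI through steps 1-3, the test builds the state those steps would have produced and calls the step-4 logic directly.

```java
// Hypothetical wizard example; JUnit 5 assumed.
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class CheckoutStepFourTest {

    // The domain state that wizard steps 1-3 would have built up.
    record CheckoutDraft(String shippingAddress, String paymentMethod) {}

    // The core decision behind step 4, reachable without any UI.
    static class CheckoutPolicy {
        boolean readyToConfirm(CheckoutDraft draft) {
            return draft.shippingAddress() != null && draft.paymentMethod() != null;
        }
    }

    @Test
    void completedDraftIsReadyToConfirm() {
        // "Testing under the skin": hand the logic its inputs directly.
        CheckoutDraft draft = new CheckoutDraft("221B Baker Street", "card");

        assertTrue(new CheckoutPolicy().readyToConfirm(draft));
    }
}
```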

A unit can be anything independently inspectable, big or small; the real smell is not "this isn't a unit test" but "I had to run far more of the system than necessary just to check this one behavior."

The practical heuristic: remove irrelevant details from tests whenever you can, to get faster feedback and cleaner, more focused checks.

Tests as Positive Pressure on Design

When writing a small, focused test feels awkward, overly verbose, or slow, treat that as a design signal rather than a testing problem.

If it's hard to test a piece of code in isolation, that usually means it's too tightly coupled: it reaches directly into global state, plumbs through the framework, or mixes concerns (e.g., HTTP parsing + business rules + persistence).

Using tests as design feedback is a core idea of test-driven development: "this test is annoying" should translate into "there's a design problem here," leading you to refactor production code to make the test simpler and clearer.

When you refactor to improve testability—e.g., extracting a pure domain service, introducing an interface, or separating side effects—you nearly always improve the design in ways that help maintainability, reuse, and flexibility, not just testing.
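
A before/after sketch under assumed names: imagine a discount rule that used to sit inside a controller method alongside request parsing and database writes; once extracted into a pure class, the rule is trivially checkable in memory and reusable elsewhere.

```java
// Hypothetical pricing rule; JUnit 5 assumed.
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.math.BigDecimal;
import org.junit.jupiter.api.Test;

// After the extraction: pure domain logic, no I/O, no framework, no shared state.
final class DiscountPolicy {

    BigDecimal discountedPrice(BigDecimal listPrice, int quantity) {
        BigDecimal discountRate = quantity >= 10
                ? new BigDecimal("0.10")   // assumed bulk-discount rule
                : BigDecimal.ZERO;
        return listPrice.subtract(listPrice.multiply(discountRate));
    }
}

class DiscountPolicyTest {

    @Test
    void appliesTheBulkDiscountAtTenItems() {
        BigDecimal price = new DiscountPolicy()
                .discountedPrice(new BigDecimal("100.00"), 10);

        // compareTo ignores BigDecimal scale differences (90 vs 90.0000).
        assertEquals(0, new BigDecimal("90").compareTo(price));
    }
}
```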

The shift from "tests are paperwork" to "tests shape my design" is what distinguishes test-first programming (just defect reduction) from test-driven development (tests actively driving architecture).

"Too Many Tests Break My Refactoring": Deletion and Trade-offs

Developers often complain that a huge suite of small tests makes refactoring painful because every interface tweak breaks dozens of tests.

It's true that when you change contracts—method signatures, data structures, or collaboration patterns—related tests must change too; this is not a bug, it's a reflection of reality.

The alternative, however, is to rely mostly on larger integrated or end-to-end tests, which are slower, fewer, and harder to run continuously, giving you weaker and delayed feedback.

You don't have to accept test pain as permanent debt: deleting tests is always allowed; if a test is blocking an important refactor and you don't understand or trust it, throw it away and write better tests around the new design.

A key mental shift is to treat tests as disposable scaffolding rather than sacred artifacts; you can always rebuild them if the underlying code has become clearer and better structured.

Overdoing Testing to Learn "Enough"

You can't learn "how much testing is enough" from books; you learn it by overshooting, feeling the pain, and then trimming back.

A useful heuristic is "test until fear turns into boredom": when you still feel nervous, you probably need more or better tests; when tests feel repetitive and tedious, you're likely beyond the point of diminishing returns.

Beginners should be encouraged to overspecify and overtest for a while, just to experience what "too much" feels like and learn to recognize wasteful patterns.

As you gain experience, you tune your sensitivity: you notice where missing tests made bugs slip through, and where extra tests didn't buy meaningful confidence.

The goal isn't perfect coverage; it's a practiced, calibrated sense of confidence versus cost, driven by your own anxiety level and past experience.

Collaboration Tests and Contract Tests: A Better "Integration" Strategy

Instead of large integrated tests that exercise blocks of the system end-to-end, JB proposes focusing on two precise test types at boundaries: collaboration tests and contract tests.

A collaboration test asks, "Do I talk to you correctly?"—it checks that a component calls its collaborators with the right methods and arguments and handles their responses correctly; for example, a controller calling findPrice(barcode) and then displayPrice(price), or some fallback path when null is returned.

A contract test asks, "Do you behave as I'm expecting?"—it checks that an implementation of a dependency fulfills an agreed set of behaviors (semantics), not just the method signature (syntax).

Syntax of a contract is the shape of the interface (parameters and return types); semantics are the behavioral rules: for instance, "if there is a product with this barcode, return its price; if not, return null (not throw)."
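
A tiny sketch of that distinction, with a hypothetical ProductRepository: the method signature is the syntax; the documented rules are the semantics that contract tests turn into executable checks.

```java
import java.math.BigDecimal;

// Hypothetical interface used in the examples below.
public interface ProductRepository {

    /**
     * Syntax: String in, BigDecimal out.
     * Semantics (the contract every implementation must honor):
     * if a product with this barcode exists, return its price;
     * if it does not exist, return null rather than throwing.
     */
    BigDecimal findPrice(String barcode);
}
```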

If the client has collaboration tests and each implementation of the dependency passes the shared contract tests, you need far fewer integrated tests, because you already know the pieces fit (syntax) and behave as expected (semantics) when composed.

Example: Controller and Repository with Contracts

Imagine a point-of-sale system with a controller that handles "sell one item" and a product repository that looks up prices by barcode.

The controller's collaboration tests might say: "When findPrice(barcode) returns a price, the controller must show that price on the display," and "When findPrice(barcode) returns null, the controller must signal 'product not found' (e.g., beep or error message)."
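
A sketch of those two collaboration tests, assuming the hypothetical ProductRepository above plus an equally hypothetical Display collaborator, with Mockito standing in for the real implementations.

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.math.BigDecimal;
import org.junit.jupiter.api.Test;

class SellOneItemControllerCollaborationTest {

    // Hypothetical output-side collaborator.
    interface Display {
        void displayPrice(BigDecimal price);
        void displayProductNotFound(String barcode);
    }

    // Stripped-down controller: just the collaboration being specified.
    static class SellOneItemController {
        private final ProductRepository repository;
        private final Display display;

        SellOneItemController(ProductRepository repository, Display display) {
            this.repository = repository;
            this.display = display;
        }

        void onBarcode(String barcode) {
            BigDecimal price = repository.findPrice(barcode);
            if (price == null) {
                display.displayProductNotFound(barcode);
            } else {
                display.displayPrice(price);
            }
        }
    }

    @Test
    void displaysThePriceWhenTheRepositoryFindsOne() {
        ProductRepository repository = mock(ProductRepository.class);
        Display display = mock(Display.class);
        when(repository.findPrice("12345")).thenReturn(new BigDecimal("7.95"));

        new SellOneItemController(repository, display).onBarcode("12345");

        verify(display).displayPrice(new BigDecimal("7.95"));
    }

    @Test
    void signalsProductNotFoundWhenTheRepositoryReturnsNull() {
        ProductRepository repository = mock(ProductRepository.class);
        Display display = mock(Display.class);
        when(repository.findPrice("99999")).thenReturn(null);

        new SellOneItemController(repository, display).onBarcode("99999");

        verify(display).displayProductNotFound("99999");
    }
}
```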

The product repository contract tests might say: "Given stored data with a known barcode-price pair, findPrice(barcode) returns that price," and "Given a barcode not in the catalog, findPrice(barcode) returns null (not an exception, not a default)."

Any implementation of the repository—SQL database, in-memory map, CSV file—must pass the same contract tests to be considered valid; the controller does not care how it's implemented, only that the contract holds.
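
A sketch of the shared contract tests, again using the hypothetical ProductRepository: each implementation subclasses the abstract test and supplies itself, so SQL, in-memory, and CSV versions all face exactly the same checks; an in-memory implementation is shown as one example.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNull;

import java.math.BigDecimal;
import java.util.Map;
import org.junit.jupiter.api.Test;

abstract class ProductRepositoryContractTest {

    // Each implementation provides a repository containing exactly this
    // known barcode/price pair.
    protected abstract ProductRepository withProduct(String barcode, BigDecimal price);

    @Test
    void returnsThePriceOfAKnownBarcode() {
        ProductRepository repository = withProduct("12345", new BigDecimal("7.95"));

        assertEquals(new BigDecimal("7.95"), repository.findPrice("12345"));
    }

    @Test
    void returnsNullForAnUnknownBarcode() {
        ProductRepository repository = withProduct("12345", new BigDecimal("7.95"));

        assertNull(repository.findPrice("99999"));
    }
}

// One implementation proving that it honors the contract; a SQL- or
// CSV-backed version would subclass the same abstract test.
class InMemoryProductRepositoryContractTest extends ProductRepositoryContractTest {

    @Override
    protected ProductRepository withProduct(String barcode, BigDecimal price) {
        Map<String, BigDecimal> catalog = Map.of(barcode, price);
        return catalog::get; // Map.get already returns null for unknown keys
    }
}
```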

Because clients and suppliers are both anchored by these tests, wiring the real controller to a real repository is much less risky, and large, slow integrated tests become "nice-to-have regression smoke checks" rather than the main safety net.

From Modular Monoliths to "Microservices in a Single Process"

Many teams struggle with microservices not because of distribution itself, but because their components are not really modular: services secretly depend on each other's internals and timing, leading to "distributed monoliths."

If you rigorously design your in-process code as decoupled components communicating via clear contracts, tested with collaboration and contract tests, you effectively get "microservices in a single process."

In this style, each component has a well-defined API and contract, and is tested as if it could be deployed separately, even though everything currently runs in one JVM or process.
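
A sketch of such a seam under assumed names: the inventory component exposes only a small, contract-worthy API, so the rest of the code never learns whether it runs in the same JVM or behind HTTP.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical component API: this is all that collaborators may depend on.
interface InventoryService {

    /** Contract: never negative; an unknown SKU reports zero stock. */
    int unitsInStock(String sku);
}

// Today's adapter: plain in-process implementation, one JVM, no network.
final class InProcessInventoryService implements InventoryService {

    private final Map<String, Integer> stock = new ConcurrentHashMap<>();

    void receive(String sku, int units) {
        stock.merge(sku, units, Integer::sum);
    }

    @Override
    public int unitsInStock(String sku) {
        return stock.getOrDefault(sku, 0);
    }
}

// If the component is ever split out, an HTTP-client adapter would
// implement the same interface and pass the same contract tests, and
// no collaborator would need to change.
```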

Once you have these clean seams, splitting some components into separate processes (true microservices) becomes a largely mechanical and far less risky step, instead of a terrifying rewrite.

Conversely, if you try to split a tightly coupled monolith into distributed services without contracts and collaboration tests, you just move your integration hell over the network.

Working with Manual Testers and QA

Manual testers are most valuable when they use their creativity and domain knowledge to discover new failure modes, confusing behaviors, and mismatches with user expectations.

When they are forced to follow the same regression scripts release after release, they're doing low-value, boring work that is better handled by automation.

Whenever manual testers notice they are executing the same scenario for the third time, that's a good moment to automate it and free them to explore new edge cases and stress scenarios.

Programmers and testers should collaborate: programmers automate the predictable checks; testers push into uncharted territory, acting as agents of the user and discovering gaps that drive new automated tests.

The myth that "programmers shouldn't test their own code" is harmful; programmers should test their code to avoid wasting testers' time, but no one person should be the only line of defense.

Dealing with Deadlines, Impatient Customers, and Corporate Pressure

The "impatient customer" (real customer, PM, VP, etc.) is not a problem to eliminate but a defining feature of a business; without someone asking for more than you can easily deliver, you don't have a business, you have a hobby.

Deadlines are often empty threats: teams frequently discover that missing a date is annoying but not catastrophic; learning this early helps you stop treating every date as existential.

Programmers tend to internalize external pressure and feel responsible for everything, even decisions far outside their control; this leads to guilt and burnout rather than better outcomes.

A powerful phrase for junior developers under pressure is: "I'm sorry, I don't know how to do that," delivered honestly and calmly; it disarms unrealistic demands and opens a discussion about trade-offs and alternatives.

Real psychological safety comes not from the absence of conflict, but from knowing you can admit limits, make mistakes, or miss a deadline without being destroyed for it—and that both "business" and "tech" sides see themselves as on the same team, constrained by the system they share.

AI and Testing: Using LLMs Without Losing Control

LLMs can be excellent at bootstrapping small applications or unfamiliar stacks: they generate initial code, wiring, and examples so you can start experimenting sooner.

They often behave like over-eager junior developers: they produce more code than asked for, sometimes helpful, sometimes off-target, and you must still read and understand what they've done.

To stay safe, combine LLMs with small, incremental steps and tests: describe a tiny behavior, generate code, run tests, adjust, and repeat, rather than letting the model dump an entire architecture in one go.

Tests become a self-defense mechanism: they help you validate AI-generated code, constrain its changes, and prevent silent regressions as prompts evolve.
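
A minimal sketch of that loop with an invented example: the human writes a tiny executable expectation first, the model proposes an implementation, and the test stays in place as a guardrail while prompts and code keep changing.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class SlugGeneratorTest {

    @Test
    void lowercasesAndHyphenatesTitles() {
        // Written by the human before any code is generated; it decides
        // whether each generated revision is acceptable.
        assertEquals("hello-world", SlugGenerator.slugify("Hello World"));
    }
}

// A candidate implementation (the part an LLM might produce); if a later
// prompt quietly breaks the behavior, the test above catches it.
final class SlugGenerator {

    static String slugify(String title) {
        return title.trim()
                .toLowerCase()
                .replaceAll("[^a-z0-9]+", "-")
                .replaceAll("(^-|-$)", "");
    }
}
```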

One real risk is social: if developers spend most of their time talking to machines instead of humans, their collaboration skills atrophy, making team-level design and conflict resolution harder just when systems are becoming more complex.

Insights

Aim for a testing strategy where 95–99% of your checks are small, fast, in-memory tests (including collaboration and contract tests), with a slim layer of true end-to-end tests as a smoke net on top.

When tests feel painful, treat that as a design smell, not a reason to abandon testing: refactor production code to make tests simpler, or delete and rewrite tests that no longer reflect how the system works.

If you work with multiple teams or microservices, invest early in clear contracts—with shared, executable contract tests—so you can parallelize work safely instead of relying on giant "integration phases" at the end.

In your own practice, deliberately overtest for a while, then use your boredom and frustration to learn where you're overshooting; over time, you'll develop a reliable gut feel for "enough testing" that's better than any rigid rule.

Finally, remember that testing is not just about code quality; it's about trust—trust in your design, in your teammates, in your tools, and even in yourself under pressure—and better tests, better contracts, and more honest conversations all serve that same goal.

Source and Reference:

Tags: testing, microservices, contract tests, software design

This note was written for summary, critique, and study purposes. For copyright inquiries, please let us know.