Integration tests vs unit tests: what each one actually catches

High coverage, confident deploy — and the bug still shows up in production. Learn what separates unit from integration tests and when to reach for each.

The user repository was covered. Ninety-one percent coverage, all tests green, deploy out the door. Then the team noticed that signup was saving users without sending the confirmation email. The bug was in the integration between the user service and the email service — two individually perfect components that had never been tested together.

That's the gap the "unit test vs integration test" debate needs to address: not which one is better in the abstract, but which one catches which category of problem.


What each type of test actually sees

A unit test isolates a unit of code — function, method, class — and verifies its behavior with controlled inputs. Everything outside that unit is replaced with mocks or stubs. The test is fast, deterministic, and surgical.

An integration test verifies that two or more components work correctly when coupled together. It could be the repository layer talking to a real database, an HTTP handler calling a service that calls a repository, or any combination of parts that in production need to communicate.

The difference isn't granularity — it's what you're trying to guarantee. Unit tests guarantee that the internal logic of a function is correct. Integration tests guarantee that the function behaves correctly inside the system.

The email bug above slipped past every unit test the team had, and adding more of the same kind would not have helped. The email service had its mocks, the user service had its own. Nobody tested the contact point: the real interface between them.
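One way a gap like this can open up, as a minimal sketch. The event bus, the topic names, and both service classes are hypothetical: the story above doesn't say how the real services were wired. What it shows is that each unit test passes against its own mock, while the two real components never actually connect.

```python
from unittest.mock import Mock


class EventBus:
    """Tiny in-memory publish/subscribe bus (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        for handler in self._handlers.get(topic, []):
            handler(payload)


class UserService:
    def __init__(self, bus):
        self.bus = bus
        self.users = []

    def signup(self, email):
        self.users.append(email)
        self.bus.publish("user.created", {"email": email})


class EmailService:
    def __init__(self, bus, outbox):
        self.outbox = outbox
        # Deliberate bug: this topic name differs from the one UserService publishes.
        bus.subscribe("user_created", self.on_user_created)

    def on_user_created(self, payload):
        self.outbox.append(f"confirm:{payload['email']}")


def unit_test_user_service():
    # Passes: the mock accepts any topic name.
    bus = Mock()
    UserService(bus).signup("a@example.com")
    bus.publish.assert_called_once_with("user.created", {"email": "a@example.com"})


def unit_test_email_service():
    # Passes: the handler is called directly, bypassing the subscription.
    outbox = []
    EmailService(Mock(), outbox).on_user_created({"email": "a@example.com"})
    assert outbox == ["confirm:a@example.com"]


def integration_signup_outbox():
    # Wires the real pieces together and returns what was actually "sent".
    bus, outbox = EventBus(), []
    EmailService(bus, outbox)
    UserService(bus).signup("a@example.com")
    return outbox
```

Both unit functions run clean. `integration_signup_outbox()` returns an empty list, and an assertion on that list is the only check here that would have gone red before deploy.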

Cost and speed: the numbers that matter

When you're deciding where to invest testing effort, two parameters matter: how much it costs to write and how much it costs to run.

Unit tests:

  • Writing: fast, no infrastructure needed
  • Execution: milliseconds per test, thousands run in seconds
  • Maintenance: proportional to the number of mocks — the more you mock, the more fragile it becomes when implementation changes

Integration tests:

  • Writing: slower, requires setup (database, containers, seeds)
  • Execution: seconds to tens of seconds per test, large suites take minutes
  • Maintenance: more stable than unit tests when refactors don't change external behavior

The math seems to favor unit tests. But there's a hidden variable: the cost of an integration bug in production is orders of magnitude higher than the cost of the integration test that would have caught it.

Teams that only write unit tests tend to have high coverage and still deploy changes that break real flows. Teams that only write integration tests have slow suites and slow development feedback loops. The right proportion depends on the system, not on a universal rule.

The test pyramid — and why the model got more complicated

Mike Cohn's classic test pyramid says: many unit tests at the base, fewer integration tests in the middle, few E2E tests at the top (Cohn's own labels for the three layers were unit, service, and UI). The logic was cost-based: unit tests are cheap, E2E tests are expensive.

The problem is that the pyramid was designed with rich, complex business logic in mind. In data-oriented systems — CRUD APIs, processing pipelines, microservices that primarily transform and persist data — most of the real behavior lives in the integrations, not in the internal logic of each piece.

For those systems, Kent C. Dodds' testing trophy model makes more sense: on top of a base of static checks, the largest layer is integration tests, with unit tests for complex domain logic and a few E2E tests for critical flows. It's not a revolution; it's calibration for the type of system.

The question that helps decide: "where do our production bugs come from?" If most come from business logic with many branches, unit tests pay off more. If most come from integrations between components, database, queues, external APIs — integration tests pay off more.

When to use unit tests

Domain logic with multiple paths is the natural candidate. If you have a discount calculation function with eight combinations of customer type and purchase value, you want to test all eight in milliseconds — not spinning up a database every time.
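A minimal sketch of what that looks like as a table-driven test. The discount rules, customer types, and threshold are invented for illustration; only the shape matters: a pure function, one row per combination, no infrastructure.

```python
# Hypothetical business rules, made up for this example.
BASE_PERCENT = {"guest": 0, "regular": 5, "premium": 10, "employee": 20}
VOLUME_THRESHOLD = 100  # purchases at or above this amount earn a bonus


def discount_percent(customer_type: str, amount: float) -> int:
    """Return the discount percentage for a customer type and purchase value."""
    base = BASE_PERCENT[customer_type]
    if customer_type == "guest":
        return base  # guests never get the volume bonus
    bonus = 5 if amount >= VOLUME_THRESHOLD else 0
    return base + bonus


# Eight combinations: four customer types x two purchase tiers.
CASES = [
    ("guest", 50, 0),
    ("guest", 150, 0),
    ("regular", 50, 5),
    ("regular", 150, 10),
    ("premium", 50, 10),
    ("premium", 150, 15),
    ("employee", 50, 20),
    ("employee", 150, 25),
]


def run_cases():
    """Return the rows whose actual result differs from the expected one."""
    return [
        (customer_type, amount, expected, discount_percent(customer_type, amount))
        for customer_type, amount, expected in CASES
        if discount_percent(customer_type, amount) != expected
    ]
```

In a pytest suite the same table would typically feed `pytest.mark.parametrize`, so each row reports as its own test. Boundary values (an amount of exactly 100 here) deserve their own rows.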

The post What to test in unit tests covers that territory in detail: when the test actually protects real behavior vs when it becomes coverage theater. The conclusion there is that unit tests deliver maximum value when the function is pure, has multiple code paths, and the behavior is stable enough that the test won't break on every refactor.

Other good candidates for unit tests:

  • Parsers and data transformations with edge cases
  • Validations with complex rules
  • Formatting functions (dates, currency, documents) with non-obvious inputs
  • Authorization rules with combinations of permissions

What's not worth unit testing: anything whose only "logic" is calling something else. A controller that delegates to a service, a repository that calls the ORM, an adapter that just converts formats — no logic, nothing to test at the unit level.

When to use integration tests

The repository is the most obvious case. Mocking the database in a repository test defeats the purpose: you need to know if the query is correct, if the indexes work, if the mapping is right. A repository test that doesn't hit the database is, literally, a mock test.

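from unittest.mock import MagicMock

# UserRepository and User are the application's own classes; MockResult is a
# hand-written stand-in for the result object the session returns.

# This doesn't test the repository: it tests whether you know how to configure a mock
def test_find_user_mock():
    fake_user = User(email="test@example.com", name="Test")
    mock_session = MagicMock()
    mock_session.execute.return_value = MockResult(fake_user)
    repo = UserRepository(session=mock_session)
    result = repo.find_by_email("test@example.com")
    assert result.email == "test@example.com"

# This tests the repository: db_session is a fixture bound to a real database
def test_find_user_database(db_session):
    repo = UserRepository(session=db_session)
    db_session.add(User(email="test@example.com", name="Test"))
    db_session.commit()
    result = repo.find_by_email("test@example.com")
    assert result.email == "test@example.com"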

The second test catches real problems: wrong query, field mapped with a different name, violated constraint, migration not applied. The first catches none of that.

Other cases where integration is the right choice:

  • API handlers: verify that the endpoint returns the correct status with the correct payload, given a known database state. Don't mock the internal service — test the complete flow.
  • Queue processing: verify that a published message is consumed and processed correctly end-to-end.
  • External service integrations: when viable, use the real API in a sandbox environment. When not, use a service mock (WireMock, Prism) that behaves like the real service — not an in-code object mock.
  • Transformation pipelines: when data passes through multiple stages, testing the full pipeline is more useful than testing each step in isolation.
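To make the API handler case concrete, here is a minimal sketch of a handler test that runs through the real service and repository against a seeded database. Every name in it is hypothetical, and the handler is a plain function rather than a real web framework route. SQLite in memory is used only so the sketch runs without infrastructure; for a Postgres-backed system the same test should run against Postgres.

```python
import sqlite3


class UserRepository:
    def __init__(self, conn):
        self.conn = conn

    def find_by_email(self, email):
        row = self.conn.execute(
            "SELECT email, name FROM users WHERE email = ?", (email,)
        ).fetchone()
        return None if row is None else {"email": row[0], "name": row[1]}


class UserService:
    def __init__(self, repo):
        self.repo = repo

    def get_profile(self, email):
        return self.repo.find_by_email(email)


def get_user_handler(service, email):
    """Return (status, payload), the way an HTTP handler would."""
    profile = service.get_profile(email)
    if profile is None:
        return 404, {"error": "user not found"}
    return 200, profile


def make_seeded_db():
    """Create the schema and put the database into a known state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO users VALUES (?, ?)", ("test@example.com", "Test"))
    conn.commit()
    return conn


def test_get_user_handler():
    # Nothing is mocked: handler -> service -> repository -> database.
    service = UserService(UserRepository(make_seeded_db()))
    assert get_user_handler(service, "test@example.com") == (
        200,
        {"email": "test@example.com", "name": "Test"},
    )
    assert get_user_handler(service, "missing@example.com") == (
        404,
        {"error": "user not found"},
    )
```

The test asserts only on status and payload, so the service and repository can be refactored freely as long as the endpoint's behavior stays the same.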

The most expensive mistake: mocking what should be integrated

There's a pattern that looks prudent but is a trap: mocking the database in repository tests, mocking the email service in notification tests, mocking the external API in payment gateway tests — and never having a test that verifies the real integration.

The result is a suite with 90%+ coverage where each component is "tested" but the system as a whole has never been exercised. It's coverage theater: the numbers are good, the confidence is false.

The heuristic I use: if two components are separated in code but coupled in production behavior, the relevant test is an integration test. Layer boundaries, callbacks, events — those are the points where bugs are born, and they're exactly the points unit tests skip by design.

Infrastructure for integration tests

The biggest argument against integration tests is setup cost. And it's real — but solvable.

Testcontainers solved the database problem for local and CI testing: you spin up a Postgres (or MySQL, Redis, whatever) container at the start of the suite, use it in tests, and discard it at the end. No shared state between runs, no dependency on a locally configured database.

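# With testcontainers-python
import pytest
from testcontainers.postgres import PostgresContainer

@pytest.fixture(scope="session")
def postgres_container():
    # One container for the whole test session; it is removed when the block exits
    with PostgresContainer("postgres:16") as pg:
        yield pg.get_connection_url()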

In Go, the same approach works with testcontainers-go; in Java, with the original Testcontainers library. The startup cost is real (10–30 seconds to bring up the container), but amortized across the entire suite it's acceptable.

For external APIs, I use the contract testing pattern: define the API contract (what it receives and returns), test against a mock that implements that contract, and have a separate test (less frequent) that validates the contract against the real API. That way you don't depend on connectivity on every run but you're also not testing completely in the dark.
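A minimal sketch of that pattern, under stated assumptions: the payment gateway, its `charge` method, the token values, and the response fields are all hypothetical, not any real provider's API. The point is that the contract lives in one function and the same function runs against both the fake and, less often, a client for the real sandbox.

```python
from typing import Protocol


class PaymentGateway(Protocol):
    """Shape that both the fake and the real client are expected to have."""

    def charge(self, amount_cents: int, token: str) -> dict: ...


def assert_charge_contract(gateway: PaymentGateway) -> None:
    """The contract: what charge() receives and what it must return."""
    ok = gateway.charge(1000, "tok_valid")
    assert {"id", "status", "amount_cents"} <= set(ok)
    assert ok["status"] == "succeeded"
    assert ok["amount_cents"] == 1000

    declined = gateway.charge(1000, "tok_declined")
    assert declined["status"] == "declined"


class FakePaymentGateway:
    """In-process fake used on every test run."""

    def __init__(self):
        self._next_id = 0

    def charge(self, amount_cents: int, token: str) -> dict:
        self._next_id += 1
        status = "declined" if token == "tok_declined" else "succeeded"
        return {
            "id": f"ch_{self._next_id}",
            "status": status,
            "amount_cents": amount_cents,
        }


def test_fake_honors_contract():
    # Runs on every commit: no network needed.
    assert_charge_contract(FakePaymentGateway())
```

The less frequent test would call `assert_charge_contract` with a client pointed at the provider's sandbox, for example on a nightly CI job. If the real API drifts, that run fails and the fake gets updated to match.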

Frequently asked questions

What's the practical difference between integration tests and E2E tests?

Integration tests verify the integration between system components — usually without a user interface, through code. E2E (end-to-end) tests drive the complete system through the interface — usually a browser with Playwright or Cypress. Integration tests are faster and easier to debug when they fail; E2E tests are closer to real user behavior but much slower and more fragile. The practical rule: use E2E only for critical flows that can't be tested any other way.

Can I use an in-memory database (SQLite) instead of the production database in integration tests?

Technically yes, but with caveats. SQLite has different semantics from Postgres in several areas: data types, transaction behavior, JSON support, upserts. A test that passes on SQLite may fail on Postgres due to dialect differences. If your production database is Postgres, test with Postgres — Testcontainers makes that simple enough to remove any excuse.
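One concrete instance of the data type difference, as a small sketch using Python's built-in `sqlite3` module (the table and column names are made up). In an ordinary, non-STRICT SQLite table, a column declared INTEGER will store a non-numeric string without complaint:

```python
import sqlite3


def sqlite_accepts_text_in_integer_column():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (quantity INTEGER)")
    # SQLite uses type affinity: a string that can't be converted is stored as text.
    conn.execute("INSERT INTO orders (quantity) VALUES ('abc')")
    return conn.execute("SELECT quantity, typeof(quantity) FROM orders").fetchone()
```

The call returns `('abc', 'text')`. Postgres rejects the equivalent INSERT with an invalid-input error for type integer, so a test that quietly passes here says nothing about how the same code behaves in production.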

What ratio of unit to integration tests should I have?

There's no universal ratio. The useful question is: where is the risk in my system? In complex domain logic (calculations, business rules, validations), unit tests pay off more. In data-oriented systems (APIs, pipelines, microservices), integration tests pay off more. In my experience, data-oriented projects often end up with more integration than unit tests, not for philosophical reasons, but because most of their real bugs happen at the boundaries between components.

Are mocks bad in integration tests?

Mocks have a place — for external services you don't control (payment API, SMS service, transactional email). The problem is mocking internal components you could test for real. If you're mocking your own service inside an integration test, you're probably testing the wrong layer.

Unit and integration tests don't compete — they complement each other

The tension between the two types isn't real. Unit tests handle internal logic problems well; integration tests handle coupling and contract problems between components well. A healthy system uses both where each has an advantage.

What doesn't work is using one as a substitute for the other out of convenience — mocking everything for fast tests, or only writing E2E because "it's more realistic." The most expensive production bugs I've seen were all at integration points that were covered by mocks. Code coverage is not coverage of real behavior.

Author
Rafael Duarte
Backend developer with a background in fintech and B2B SaaS who has worked on teams that scaled APIs from zero to millions of requests. He carries enough production scars to hold strong opinions about tools, patterns, and architecture decisions. Not an academic: he read the UUID RFC when he had to choose between v4 and v7 for a write-heavy table.