
Unit Testing: What to Test and What to Skip

What separates tests that protect real behavior from tests that just inflate coverage and become a burden when you need to refactor.

You spend hours writing tests for getters and setters. Coverage hits 95%. Feels productive. Then a requirement changes, you refactor three lines of business logic, and the entire suite breaks — not because behavior changed, but because you were testing implementation, not behavior. Congratulations, you just created maintenance debt disguised as coverage.

This post is about the decision that separates tests worth the cost from tests that just become a burden.


What a unit test actually protects

A unit test doesn't protect code. It protects behavior. That distinction sounds semantic, but it completely changes what you write.

When you test code — "this method returns the value of property x" — you're testing something any IDE already checks. When you test behavior — "given this order with accumulated discounts over $500, the final amount should include free shipping" — you're documenting a business rule that no compiler will ever verify for you.

The right question before writing any test: "if I delete this test, what loses its guarantee?" If the answer is "nothing, the type checker already handles that," the test probably isn't worth writing.

What's worth testing

Domain logic with multiple paths

Any function with conditionals, edge cases, boundaries, or combinations of state is a candidate. Discount calculations, form validations, data transformations, authorization rules — these are where production bugs are born.

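# Customer is assumed to be defined elsewhere, with a `type` attribute.
def calculate_discount(amount: float, customer: Customer) -> float:
    # Premium customers get 15% on orders of 500 or more...
    if customer.type == "premium" and amount >= 500:
        return amount * 0.15
    # ...and 10% on anything smaller.
    if customer.type == "premium":
        return amount * 0.10
    # Everyone else gets 5% once the order reaches 1000.
    if amount >= 1000:
        return amount * 0.05
    return 0.0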

This function deserves tests. Not one, but several: one for each combination of type and amount that produces a different result, plus the boundaries at 500 and 1000. If you only write the happy path, the other three branches and both boundaries go unguarded.
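As a minimal sketch of what "one test per outcome" could look like, the table below pins each branch and both boundaries. The Customer dataclass is a hypothetical stand-in (the function only relies on its type attribute), the "regular" type name is an assumption, and unittest is just one possible runner.

```python
import unittest
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical stand-in: only the `type` attribute is used below.
    type: str


def calculate_discount(amount: float, customer: Customer) -> float:
    if customer.type == "premium" and amount >= 500:
        return amount * 0.15
    if customer.type == "premium":
        return amount * 0.10
    if amount >= 1000:
        return amount * 0.05
    return 0.0


class CalculateDiscountTest(unittest.TestCase):
    # (customer type, amount, expected discount): one row per outcome or boundary.
    CASES = [
        ("premium", 500.0, 75.0),     # premium boundary: 15% starts at exactly 500
        ("premium", 499.99, 49.999),  # just below it: falls back to 10%
        ("premium", 1000.0, 150.0),   # premium rule wins over the 5% volume rule
        ("regular", 1000.0, 50.0),    # volume boundary: 5% starts at exactly 1000
        ("regular", 999.99, 0.0),     # just below it: no discount
        ("regular", 500.0, 0.0),      # 500 means nothing for non-premium customers
    ]

    def test_each_rule_and_boundary(self):
        for customer_type, amount, expected in self.CASES:
            with self.subTest(customer_type=customer_type, amount=amount):
                actual = calculate_discount(amount, Customer(type=customer_type))
                # Compare floats with a tolerance rather than ==.
                self.assertAlmostEqual(actual, expected, places=6)
```

Saved to a file, this runs with python -m unittest. The table format keeps the cost of adding the next "what about...?" case to a single line.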

Pure functions with non-obvious inputs

Pure functions are the easiest to test and the most valuable to cover. No side effects, no mocks needed, no complex setup — you pass input, verify output.

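import re

def format_phone(phone: str) -> str:
    # Keep only the digits, so any input formatting is accepted.
    digits = re.sub(r'\D', '', phone)
    if len(digits) not in (10, 11):
        raise ValueError(f"Invalid phone number: {phone}")
    # 11 digits: two-digit area code plus a nine-digit number.
    if len(digits) == 11:
        return f"({digits[:2]}) {digits[2:7]}-{digits[7:]}"
    # 10 digits: two-digit area code plus an eight-digit number.
    return f"({digits[:2]}) {digits[2:6]}-{digits[6:]}"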

Test with: formatted input, unformatted, with spaces, with mixed characters, with 9 digits, with 12. Each variant is a real case someone will pass to this function in production.
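Those variants translate almost directly into a test table. The sketch below copies the function so it runs on its own; the sample numbers are made up for illustration, and unittest is an assumed choice of runner.

```python
import re
import unittest


def format_phone(phone: str) -> str:
    digits = re.sub(r'\D', '', phone)
    if len(digits) not in (10, 11):
        raise ValueError(f"Invalid phone number: {phone}")
    if len(digits) == 11:
        return f"({digits[:2]}) {digits[2:7]}-{digits[7:]}"
    return f"({digits[:2]}) {digits[2:6]}-{digits[6:]}"


class FormatPhoneTest(unittest.TestCase):
    def test_accepts_any_input_formatting(self):
        cases = [
            ("11987654321", "(11) 98765-4321"),      # unformatted, 11 digits
            ("(11) 98765-4321", "(11) 98765-4321"),  # already formatted
            ("11 3456 7890", "(11) 3456-7890"),      # spaces, 10 digits
            ("11.98765/4321", "(11) 98765-4321"),    # mixed separators
        ]
        for raw, expected in cases:
            with self.subTest(raw=raw):
                self.assertEqual(format_phone(raw), expected)

    def test_rejects_wrong_digit_counts(self):
        # 9 digits, 12 digits, and no digits at all.
        for raw in ("123456789", "123456789012", ""):
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    format_phone(raw)
```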

Explicit edge cases

Numeric boundaries, empty strings, empty lists, null/None values, month-end dates, negative values where they're unexpected. These rarely appear in happy-path tests, but they're where a large share of production bugs live.

The rule is simple: if you asked "what happens if...?" during implementation, that's a test case.
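Month-end dates make a good illustration. The helper below, add_one_month, is hypothetical and exists only for this sketch; the point is that each "what happens if...?" question (January 31st? a leap year? December?) becomes one named test.

```python
import calendar
import unittest
from datetime import date


def add_one_month(d: date) -> date:
    # Hypothetical helper: same day next month, clamped to that month's last day.
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


class AddOneMonthTest(unittest.TestCase):
    def test_ordinary_day_keeps_its_day_number(self):
        self.assertEqual(add_one_month(date(2023, 3, 15)), date(2023, 4, 15))

    def test_month_end_clamps_to_shorter_month(self):
        self.assertEqual(add_one_month(date(2023, 1, 31)), date(2023, 2, 28))

    def test_leap_year_february_has_29_days(self):
        self.assertEqual(add_one_month(date(2024, 1, 31)), date(2024, 2, 29))

    def test_december_rolls_into_next_year(self):
        self.assertEqual(add_one_month(date(2023, 12, 31)), date(2024, 1, 31))
```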

Documented regressions

When a bug reaches production, the first step before fixing it is writing the test that reproduces it. Then fix it. That test guarantees the bug never comes back unnoticed — and it documents the expected behavior for anyone who comes after you.
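Here is a sketch of that workflow built around an invented bug: suppose sign-ups with a trailing space in the email address were creating duplicate accounts. Both normalize_email and the scenario are hypothetical. The first test is the one you would write before touching the code, watch fail, and then make pass with the fix.

```python
import unittest


def normalize_email(raw: str) -> str:
    # Fixed version. The hypothetical buggy one only called raw.lower().
    return raw.strip().lower()


class NormalizeEmailRegressionTest(unittest.TestCase):
    def test_surrounding_whitespace_is_ignored(self):
        # Reproduces the reported input; failed against the buggy version.
        self.assertEqual(
            normalize_email(" User@Example.com \n"), "user@example.com"
        )

    def test_already_clean_address_is_unchanged(self):
        # Guards against the fix changing the case that already worked.
        self.assertEqual(normalize_email("user@example.com"), "user@example.com")
```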

What's not worth testing

Trivial getters, setters, and properties

class Product:
    def __init__(self, name: str, price: float):
        self._name = name
        self._price = price

    @property
    def name(self) -> str:
        return self._name  # testing this is a waste

There's no logic here. The type checker verifies the type, the runtime verifies access. A getter test will only break when you rename the property — and at that point, the test gives you zero information about what actually broke.

Framework and library code

If you're testing whether your ORM saves correctly to the database, you're testing the ORM, not your code. Trust that SQLAlchemy, Django ORM, Prisma — whatever you're using — works. Your tests should cover what you wrote on top of those tools.

The same applies to simple serialization: if you have a name field and you're testing that the serialized JSON contains "name": "value", you're testing the serialization library.

Excessive mocking

When a test has more mock setup than logic being tested, that's a sign you're testing implementation rather than behavior. If you need to mock five dependencies to test one function, ask whether the function is well-structured, and don't assume that a test this heavy on mocks is telling you anything useful.

Mocks have their place: network calls, disk I/O, external services. But mocking a discount service to test another service that uses it can mean you're not testing the real flow — and the bug will be exactly in the integration you abstracted away.
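One way to read that advice: mock only the boundary and let the real logic run. In the sketch below, checkout and the gateway's charge method are hypothetical names; the gateway is replaced with a Mock because it would be a network call, while the discount calculation is the real function.

```python
import unittest
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass
class Customer:
    # Hypothetical stand-in: only the `type` attribute is used.
    type: str


def calculate_discount(amount: float, customer: Customer) -> float:
    if customer.type == "premium" and amount >= 500:
        return amount * 0.15
    if customer.type == "premium":
        return amount * 0.10
    if amount >= 1000:
        return amount * 0.05
    return 0.0


def checkout(amount: float, customer: Customer, gateway) -> float:
    # Hypothetical flow: apply the real discount, then charge through the gateway.
    total = amount - calculate_discount(amount, customer)
    gateway.charge(total)
    return total


class CheckoutTest(unittest.TestCase):
    def test_charges_the_discounted_total(self):
        gateway = Mock()  # stands in for the external service only
        total = checkout(1000.0, Customer(type="premium"), gateway)
        # The discount logic is real, so a bug in it would fail this test.
        self.assertAlmostEqual(total, 850.0, places=6)
        gateway.charge.assert_called_once_with(total)
```

Had calculate_discount been mocked too, this test would keep passing even if the 15% rule were broken.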

Code that only exists as boilerplate

Controllers that just forward requests to services, DTOs without validation, adapters that only convert formats. No logic, nothing to test. Forcing coverage here is coverage theater.

The coverage metric is a well-intentioned lie

100% coverage doesn't mean the suite is good. It means every line was executed at least once. You can have full coverage and test no edge cases, no error paths, no non-trivial state combinations.
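A small illustration, using a hypothetical one-line function: the first test alone already executes every line, so a line-coverage report shows 100%, yet the zero-orders case was never exercised until the second test was added.

```python
import unittest


def average_order_value(revenue: float, order_count: int) -> float:
    # Hypothetical function: a single line, so any one call covers it fully.
    return revenue / order_count


class AverageOrderValueTest(unittest.TestCase):
    def test_happy_path(self):
        # On its own, this test yields 100% line coverage.
        self.assertEqual(average_order_value(100.0, 4), 25.0)

    def test_no_orders(self):
        # Coverage did not move when this was added, but confidence did:
        # it documents that zero orders currently raises.
        with self.assertRaises(ZeroDivisionError):
            average_order_value(0.0, 0)
```

Whether raising is the right behavior for zero orders is a separate decision. The point is that the coverage number could not tell you the question existed.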

Coverage as an absolute number is useful for spotting code that no test ever runs, which sometimes turns out to be dead code, and other obvious gaps. As a quality target, it leads teams to write empty tests just to hit the percentage.

The number that matters isn't coverage — it's confidence. Can you refactor without fear? Can you swap your validation library and trust the tests will catch any behavioral regression? If yes, the suite is doing its job.

Integration tests vs unit tests: it's not a competition

Many teams burn energy on the wrong debate. The point isn't which type is better — it's using each where it has an advantage.

Unit tests are fast and precise: great for domain logic with many paths, where you want to test every combination in milliseconds. Integration tests verify that parts work together: great for end-to-end flows, repositories talking to databases, API handlers.

The mistake is using unit tests where integration would be more appropriate (mocking the database when the repository test is what matters) or integration tests where unit would be faster (spinning up the full application context to test a formatting function).
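For the repository case, a sketch of what "test it against a database" might look like, using an in-memory SQLite connection from the standard library. OrderRepository and its table are invented for the example, and SQLite is an assumption here: if production runs on a different engine, the same test would ideally target that engine.

```python
import sqlite3
import unittest


class OrderRepository:
    # Hypothetical repository; table and column names are invented.
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS orders ("
            "id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
        )

    def add(self, amount: float) -> int:
        cursor = self.conn.execute(
            "INSERT INTO orders (amount) VALUES (?)", (amount,)
        )
        self.conn.commit()
        return cursor.lastrowid

    def total(self) -> float:
        row = self.conn.execute(
            "SELECT COALESCE(SUM(amount), 0.0) FROM orders"
        ).fetchone()
        return row[0]


class OrderRepositoryIntegrationTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test, so the SQL itself is exercised.
        self.conn = sqlite3.connect(":memory:")
        self.repo = OrderRepository(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_total_is_zero_without_orders(self):
        self.assertEqual(self.repo.total(), 0.0)

    def test_total_sums_persisted_orders(self):
        self.repo.add(100.0)
        self.repo.add(250.5)
        self.assertEqual(self.repo.total(), 350.5)
```

A mocked connection would have let a typo in the SQL pass unnoticed; here the query has to actually run.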

Frequently asked questions

How much coverage should I have?

Depends on the type of code. Domain logic: high — 80–90% makes sense. Infrastructure and adapters: lower — code that only delegates doesn't need aggressive coverage. Project-wide coverage as a single number is an average that hides where you're doing well and where you're not.

Should I write tests before or after the code?

TDD has real value for complex domain code — writing the test first forces you to think about the interface and edge cases before implementation. But it's not a law. Writing tests after understanding the problem is also valid. What's not valid is writing no tests at all because "there's no time right now."

What do I do with legacy code without tests?

Don't try to add coverage retrospectively everywhere. Prioritize: when you're about to change a function, write the test that documents current behavior first. That's the safety net that matters. It's not worth spending hours testing code that won't be touched.
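A test that documents current behavior is sometimes called a characterization test. The sketch below uses a hypothetical legacy function: the assertions record what the code does today, quirks included, so that any change you make afterwards shows up as a deliberate diff rather than a surprise.

```python
import unittest


def legacy_shipping_fee(weight_kg: float) -> float:
    # Hypothetical legacy function, kept exactly as "found".
    if weight_kg <= 0:
        return 0.0
    fee = 5.0 + 2.0 * int(weight_kg)  # int() drops fractional kilos
    return min(fee, 50.0)


class LegacyShippingFeeCharacterizationTest(unittest.TestCase):
    def test_current_behavior_is_pinned(self):
        # (weight in kg, fee the code returns today)
        cases = [
            (-1.0, 0.0),   # non-positive weight ships for free
            (0.5, 5.0),    # under 1 kg: base fee only
            (1.9, 7.0),    # fractional kilos are truncated, not rounded
            (30.0, 50.0),  # the fee is capped at 50
        ]
        for weight, expected in cases:
            with self.subTest(weight=weight):
                self.assertEqual(legacy_shipping_fee(weight), expected)
```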

When is a failing test good news?

Any time it breaks because of an unintentional behavioral change. The test did its job. When it breaks because of an implementation change that didn't alter behavior — renamed an internal variable, refactored structure — that's a sign the test was testing implementation details, not the contract.
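To make the difference concrete, here are two tests for the same hypothetical PriceCalculator. Both pass today. The first is coupled to a private helper and would fail if that helper were renamed or inlined; the second only checks the result, so it fails only when the behavior changes.

```python
import unittest
from unittest.mock import patch


class PriceCalculator:
    # Hypothetical class; the 20% rate is an arbitrary example value.
    TAX_RATE = 0.2

    def total(self, net: float) -> float:
        return self._add_tax(net)

    def _add_tax(self, net: float) -> float:
        return round(net * (1 + self.TAX_RATE), 2)


class PriceCalculatorTest(unittest.TestCase):
    def test_brittle_checks_how_the_result_is_produced(self):
        calc = PriceCalculator()
        with patch.object(calc, "_add_tax", return_value=120.0) as helper:
            calc.total(100.0)
        # Breaks on a pure refactor (rename or inline _add_tax).
        helper.assert_called_once_with(100.0)

    def test_contract_checks_what_the_result_is(self):
        # Survives any refactor that keeps the output the same.
        self.assertEqual(PriceCalculator().total(100.0), 120.0)
```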

Good tests are the ones you want to write

The clearest sign of a healthy suite isn't the number of tests or coverage — it's how much friction the team feels about writing more. If tests are seen as bureaucracy, something is wrong: either you're testing the wrong things, the setup is too painful, or tests break on every trivial refactor.

Before writing a test that depends on a regular expression in a validator, I use the Regex Tester to confirm the pattern actually matches what I expect — faster than running the full test suite to find out the regex was wrong to begin with.

Good tests are cheap to write, fast to run, and only break when behavior changes. If yours aren't, the suite isn't protecting you — it's slowing you down.

Author
Rafael Duarte
Backend developer with a background in fintech and B2B SaaS who has worked on teams that scaled APIs from zero to millions of requests. Carries enough production scars to have strong opinions about tools, patterns, and architecture decisions. Not an academic: he read the UUID RFC when he needed to choose between v4 and v7 for a write-heavy table.