Todos os artigos
37 artigos · atualizado semanalmente Veja nossas Ferramentas
Todos os artigos
Comparisons

Relational vs NoSQL: How to Actually Choose

An honest comparison of relational and NoSQL databases — real tradeoffs on consistency, schema, and scale, with a clear default recommendation.

COVER · Comparisons

The architecture meeting ran two hours. At the end, someone said "let's use MongoDB because it's more flexible" and everyone agreed without discussing what "flexible" actually means in that context. Six months later, the team was doing manual joins in application code for a report that would have taken two seconds in SQL. I've seen this movie before.

The choice between relational and NoSQL isn't about which technology is more modern — it's about understanding what your data is and how you're going to access it.

When to use a relational database (SQL)

A relational database is the right choice for most business systems. If you have data with a known structure, relationships between entities, and need transactional consistency, a relational database handles this better than any alternative.

The classic cases where SQL shines:

  • Financial data: transfers, invoices, ledgers. You need ACID — atomicity, consistency, isolation, and durability. A bank transfer that debits one account without crediting the other is a disaster. That's the problem relational transactions have been solving since 1970.
  • Systems with many relationships: users have orders, orders have line items, items have products, products have categories. SQL was designed for this. JOINs are efficient when indexes are in place.
  • Reporting and analytics: GROUP BY, window functions, CTEs — SQL for data analysis is a mature language with decades of optimization behind it. Trying to replicate this in application code is reinventing the wheel, poorly.
  • Data whose schema will evolve in a controlled way: migrations with ALTER TABLE are predictable and auditable. Knowing that a status field can only hold four possible values because there's a constraint in the database is a guarantee that a schema-free JSON document will never give you.

PostgreSQL in particular sits in a category of its own. It supports JSONB with GIN indexing (you store JSON documents and still run efficient queries on them), custom types, full-text search, PostGIS for geospatial data, and first-class arrays. It's hard to argue you need a NoSQL database for "flexibility" when Postgres already has that built in.

-- PostgreSQL with JSONB: relational and document storage in one place
CREATE TABLE events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  type VARCHAR(50) NOT NULL,
  payload JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Index on a field inside the JSON
CREATE INDEX ON events USING GIN (payload);

-- Query inside JSON with decent performance
SELECT * FROM events WHERE payload->>'user_id' = '123';

When to use NoSQL

NoSQL isn't a single category — it's an umbrella covering quite different technologies with distinct use cases.

Document stores (MongoDB, Firestore, CouchDB)

Make sense when the data is inherently variable and you'll almost always access it as a whole document, without needing to cross-reference other documents.

A product catalog is the canonical example: a sneaker has fields that a laptop doesn't (size, sole, cushioning technology), and vice versa. Fixed schema here is genuinely inconvenient, not theoretically. A document store handles this well.

What it doesn't handle well: anything that needs joins. When you start embedding a list of orders inside a user document to avoid joins, you're pushing database complexity into application code. That works until the document size grows, until you need to access orders without user context, or until you need a report of orders by time period.

Key-value stores (Redis, DynamoDB, Memcached)

Redis is probably the database you should have in production but don't yet. Not as your primary database — as cache, queues, sessions, rate limiting, and pub/sub. The data model is simple by design: one key, one value. What makes Redis absurdly fast (sub-millisecond operations) is exactly what makes it inadequate for complex queries.

DynamoDB follows the same principle but with guaranteed persistence and horizontally managed scaling by AWS. The cost is a more rigid access model — you need to know your access patterns before modeling the table, not after.

Wide-column stores (Cassandra, HBase)

Cassandra exists for cases where you need writes at absurd volume with distributed geographic availability. Think IoT metrics, clickstream events, usage logs at billions-of-records scale. Cassandra accepts writes across multiple datacenters simultaneously and reconciles afterward — the model is AP in the CAP theorem, not CP.

The cost: no joins, no multi-row transactions, and you model the schema based on the queries you'll run, not the structure of the data. If the queries change, the schema changes. It's a fundamentally different way of thinking compared to relational.

Graph databases (Neo4j, Amazon Neptune)

For data where the relationship between entities is the most important information — social networks, connection-based recommendation systems, dependency graphs, fraud detection. SQL can model graphs with adjacency tables, but "friends of friends of friends" queries in SQL degrade progressively. A graph database solves graph traversal by design.

The tradeoffs nobody mentions in the sales pitch

Eventual consistency: most distributed NoSQL databases sacrifice strong consistency for availability and partition tolerance. You can write to one node and read a stale value from another for milliseconds or seconds. For an Instagram like, that's acceptable. For a bank balance, it's not.

No schema = responsibility in the application: schema-free doesn't mean schema-less — it means the schema lives in the application code, distributed across the places that write and read data. When the format changes, you have old documents with format A and new ones with format B coexisting in the same collection. The application needs to handle both. Controlled migrations in a relational database exist to prevent exactly this.

Atomic operations across documents: many document stores only guarantee atomicity within a single document. Moving an item between two collections atomically may be impossible or require multi-document transactions with worse performance than SQL.

Horizontal scaling is an argument for very few: most business systems will never need to distribute data across multiple shards. PostgreSQL scales vertically into tens of terabytes with read replicas and connection pooling. If you're not facing a scale problem that Postgres can't handle, adding a distributed database is complexity with no real gain.

The rule I follow

My personal heuristic: start with PostgreSQL. If a use case emerges where a specialized database makes sense — Redis for caching, Elasticsearch for full-text search, Cassandra for time-series data at industrial volume — add that database for that specific case. Don't replace the relational database: complement it.

Most business applications, even ones that scale well, work better with a well-modeled relational database than a poorly-modeled NoSQL database. The problem is almost never the database — it's the modeling and the indexes.

For writing and validating SQL queries during the modeling process, I use the SQL Formatter to keep formatting consistent when queries start getting long — especially CTEs and subqueries that need to be readable for the next developer.

Frequently asked questions

Is MongoDB faster than PostgreSQL?

It depends on the workload. MongoDB can be faster for reading whole documents without joins. PostgreSQL is faster for analytical queries, joins with correct indexes, and transactional writes. Generic benchmarks mean nothing — what matters is your specific access pattern. The myth that NoSQL is "faster" comes from benchmarks that favor document retrieval and ignore relational use cases.

Can I use PostgreSQL as a document database?

Yes, and it's often a good decision. PostgreSQL's JSONB type stores documents with GIN indexing and support for queries inside the JSON. You get flexible schema where you need it and rigid schema where you need it, in the same database, with transactions. For most cases people cite as reasons to use MongoDB, PostgreSQL with JSONB handles it — without swapping your primary database.

DynamoDB or RDS for a new AWS project?

For most new projects, RDS (PostgreSQL or Aurora) is the safer choice. DynamoDB makes sense when you know your access patterns well upfront, need managed horizontal scaling, and are willing to accept data model limitations. If you're still discovering how the data will be accessed — which is true for most projects early on — DynamoDB will lock you into modeling decisions that are hard to reverse.

When does mixing SQL and NoSQL in the same system make sense?

When the needs genuinely diverge. An e-commerce platform can use PostgreSQL for orders, payments, and inventory (ACID required) and Redis for sessions and cart (speed, automatic TTL, no need for durable persistence). The problem is when the mixing happens out of trend-chasing — two databases have operational cost: two monitoring setups, two backup strategies, two areas of expertise. Only add if the gain is real and measurable.

SQL as default, NoSQL when it earns its place

The database decision is irreversible in the short term. Migrating from MongoDB to PostgreSQL with data in production is a multi-month project. Starting with PostgreSQL and realizing you need Redis for caching is a sprint. Conservatism here is an advantage — you add complexity when the real problem shows up, not in the architecture meeting where everyone is excited about the new technology.

NoSQL solves real problems. Cassandra at IoT scale, Redis for sessions, Elasticsearch for full-text search — these are the right tools for specific cases. The mistake is using NoSQL as the default because it sounds more modern, and discovering later that your business relationships were a better fit for a relational model all along.

RD
Autor
Rafael Duarte
Desenvolvedor backend com passagem por fintech e SaaS B2B — trabalhou em times que escalaram APIs de zero a milhões de requisições. Carrega cicatrizes de produção suficientes para ter opiniões fortes sobre ferramentas, padrões e decisões de arquitetura. Não é acadêmico: leu a RFC do UUID quando precisou escolher entre v4 e v7 para uma tabela de alta escrita.
Ver perfil