CAP Theorem & BASE Semantics
CAP Theorem & BASE Semantics
In distributed systems, the CAP Theorem defines the fundamental limits of what we can expect from a database. While traditional RDBMS emphasize ACID (Atomicity, Consistency, Isolation, Durability), most NoSQL systems favor BASE (Basically Available, Soft state, Eventual consistency).
🏗️ 1. The CAP Theorem
Proposed by Eric Brewer, the CAP Theorem states that a distributed data store can only simultaneously provide at most two out of the following three guarantees:
| Guarantee | Description |
|---|---|
| Consistency (C) | Every read receives the most recent write or an error. |
| Availability (A) | Every request receives a (non-error) response, without the guarantee that it contains the most recent write. |
| Partition Tolerance (P) | The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes. |
Why P is Mandatory
In a modern distributed environment, network partitions will happen. Therefore, we must always choose P, leaving us with a choice between:
- CP (Consistency/Partition Tolerance): MongoDB (by default) is a CP system. It ensures all nodes have the same data, but if a partition occurs, some nodes might become unavailable to preserve consistency.
- AP (Availability/Partition Tolerance): Systems like Cassandra or DynamoDB prioritize staying online, even if they return stale data during a partition.
🚀 2. BASE vs. ACID
To achieve horizontal scale, NoSQL systems often relax the strict ACID requirements.
ACID (The Relational Standard)
- Atomicity: All-or-nothing transactions.
- Consistency: Transactions must transition the database from one valid state to another.
- Isolation: Concurrent transactions don’t interfere with each other.
- Durability: Committed data is permanent.
BASE (The NoSQL Alternative)
- Basically Available: The system guarantees availability according to the CAP theorem.
- Soft state: The state of the system may change over time, even without input (due to eventual consistency).
- Eventual consistency: The system will eventually become consistent, provided the system does not receive new inputs for a period.
🧪 3. MongoDB’s Approach
While MongoDB is traditionally CP, modern versions (4.0+) have introduced Multi-document ACID Transactions while still allowing for Tunable Consistency through Read and Write Concerns.
Example: Tunable Consistency with Pymongo
from pymongo import MongoClient, WriteConcern, ReadConcern
client = MongoClient("mongodb://localhost:27017")
db = client.test_db
# Example: High Consistency (Wait for majority acknowledgment)
collection_strict = db.get_collection(
"orders",
write_concern=WriteConcern(w="majority"),
read_concern=ReadConcern(level="majority")
)
# Example: High Availability (Don't wait for replication)
collection_relaxed = db.get_collection(
"logs",
write_concern=WriteConcern(w=1),
read_concern=ReadConcern(level="local")
)
# Insert a document with strict consistency
collection_strict.insert_one({"order_id": 101, "status": "Pending"})Summary
The CAP theorem reminds us that there is no “perfect” database. Every choice—whether it’s MongoDB, PostgreSQL, or Cassandra—is a trade-off between consistency, availability, and how the system handles network failures.