Balance — The game of CAP

Ani
5 min read · Apr 15, 2022

“If you are capable, but not available, nature will raise a person with lesser ability to replace you soon.”
Israelmore Ayivor, Become a Better You

Not all are Cucumbers

No distributed system is safe from network failures, so network partitioning generally has to be tolerated. When a partition occurs, two options remain: consistency or availability. A system that chooses consistency over availability returns an error or a timeout whenever it cannot guarantee that a piece of information is up to date. A system that chooses availability over consistency always processes the query and tries to return the most recent version of the information it has, even if it cannot guarantee that this version is up to date.
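To make the two options concrete, here is a minimal sketch of a single replica answering a read during a partition. The names `Replica`, `Unavailable`, `try_read`, and the `"CP"`/`"AP"` mode labels are made up for this illustration and do not come from any real database API.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a consistency-first replica that cannot confirm its data is current."""


@dataclass
class Replica:
    value: str         # last value this replica has seen
    partitioned: bool  # True when the replica cannot reach its peers
    mode: str          # "CP" or "AP" (hypothetical labels for this sketch)

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk a stale answer.
            raise Unavailable("cannot confirm latest value during partition")
        # Availability first, or no partition at all: answer with what we have.
        return self.value


def try_read(replica: Replica) -> str:
    """Return the value, or an error string when the replica refuses to answer."""
    try:
        return replica.read()
    except Unavailable as exc:
        return f"error: {exc}"
```

With this sketch, `try_read(Replica("v1", True, "AP"))` returns `"v1"` (possibly stale), while the same call with mode `"CP"` returns an error string. When `partitioned` is `False`, both modes return the value, which is the point of the next paragraph.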

CAP is often misread as a permanent choice of which one of the three guarantees to abandon. In fact, the choice between consistency and availability arises only when a network partition or failure happens. When there is no network failure, both availability and consistency can be satisfied. Distributed SQL databases such as YugabyteDB, CockroachDB, LeanXcale, NuoDB, and Google Spanner are counter-examples to this misreading.

You can pick any two

CAP has been used by many NoSQL database vendors as a justification for not providing transactional ACID consistency, claiming that the CAP theorem “proves” that it is impossible to provide scalability and ACID consistency at the same time. However, a closer look at the CAP theorem and, in particular, the formalisation by Gilbert & Lynch, reveals that the CAP theorem does not refer at all to scalability, but only availability (the A in CAP).

You can’t have both

Database systems designed around traditional ACID guarantees, such as relational databases (RDBMS), choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement, choose availability over consistency.

Consistency versus availability

A database’s consistency refers to whether every reader sees the same, up-to-date data. A consistent system is one in which reads return the value of the last write, and reads at a given point in time return the same value regardless of where they were initiated.

NoSQL databases support a range of consistency models, such as the following:

  • Strong consistency: A strongly consistent system ensures that updates to a given key are ordered and that reads reflect the latest update the system has accepted.
  • Timeline consistency: A timeline-consistent system ensures that updates to a given key are applied in the same order on all replicas, but a read at a given replica might be stale and may not reflect the latest update the system has accepted.
  • Eventual consistency: An eventually consistent system makes no guarantee that updates are applied in the same order on all replicas, nor does it guarantee when a read will reflect a prior update the system has accepted.
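The difference between a strong read and a timeline read can be sketched with a toy store that has one primary and one replica. This is an illustration under simplifying assumptions (a single process, no real network), and `ToyStore` and its method names are hypothetical.

```python
class ToyStore:
    """One primary plus one replica that lags behind until replicate() runs."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.log = []  # ordered updates not yet applied to the replica

    def write(self, key, value):
        # Every write is accepted by the primary and queued for the replica.
        self.primary[key] = value
        self.log.append((key, value))

    def replicate(self, max_updates=None):
        """Apply pending updates to the replica in the order they were written."""
        n = len(self.log) if max_updates is None else max_updates
        for key, value in self.log[:n]:
            self.replica[key] = value
        del self.log[:n]

    def read_strong(self, key):
        # Strong read: always reflects the latest accepted update.
        return self.primary.get(key)

    def read_timeline(self, key):
        # Timeline read: ordered history, but possibly stale.
        return self.replica.get(key)
```

After `write("x", 1)` and `write("x", 2)`, `read_strong("x")` gives `2` while `read_timeline("x")` gives `None`. After `replicate(1)` the timeline read gives `1`, and after a full `replicate()` it gives `2`: the replica walks through the same history, only later. An eventually consistent store would additionally be allowed to apply the log entries out of order.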

A database’s availability refers to the system’s ability to complete a certain operation. Like consistency, availability is a spectrum. A system can be unavailable for writes while being available for reads. A system can be unavailable for admin operations while being available for data operations.

How can you ensure consistency in NoSQL?

  1. ACID consistency (ACID stands for Atomicity, Consistency, Isolation, Durability): once data is written, every subsequent read reflects it.
  2. Eventual consistency (BASE stands for Basically Available, Soft State, Eventual Consistency): once data is written, it will eventually appear for reading, though not necessarily right away.

How do you maintain consistency in a database?

In the ACID sense, consistency refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. Note that this is a different property from the C in CAP, which is about whether all nodes see the same data.
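A small example of a rule being enforced, using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the "balance may not go negative" rule are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the rule to protect
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if committed, False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit is undone too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling `transfer(conn, 1, 2, 30)` succeeds and leaves balances of 70 and 30. Calling `transfer(conn, 1, 2, 500)` would push account 1 below zero, so the whole transaction is rolled back and the balances stay at 70 and 30.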

Does NoSQL support consistency?

Eventual consistency is offered by many NoSQL databases. Cassandra is one of them, and it provides availability and partition tolerance at a level that does not compromise the usability of some of the most heavily visited websites in the world that run on it.
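Eventual consistency is not all-or-nothing. Replicated stores in this family commonly let you choose how many replicas must acknowledge a write (W) and how many must answer a read (R) out of N copies. The usual rule of thumb is that reads are guaranteed to see the latest acknowledged write when R + W > N, because every read set then overlaps every write set. The helper below is only a sketch of that arithmetic; `quorum_overlaps` is a hypothetical name, not a call from any driver.

```python
def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """True when every read quorum must share a replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

With three replicas, reading from two and writing to two overlaps (`quorum_overlaps(3, 2, 2)` is `True`), while reading from one and writing to one does not (`quorum_overlaps(3, 1, 1)` is `False`), which is the setting where stale reads become possible.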

Why is NoSQL eventually consistent?

The reason why so many NoSQL systems have eventual consistency is that virtually all of them are designed to be distributed, and with fully distributed systems there is super-linear overhead to maintaining strict consistency.

What is the alternative to ACID properties in NoSQL?

The alternative is known by the amusing backronym “BASE”, or “Basically Available, Soft State, Eventual Consistency”. While these properties are an alternative to ACID, the words “available” and “consistency” refer to the same properties as in the CAP theorem, which signals that these guarantees apply specifically to distributed databases.


Ani

Big Data Architect — Passionate about designing robust distributed systems