NoSQL — A tale of four sisters

Ani
Jan 14, 2022

The name NoSQL was first used in 1998 by Carlo Strozzi for his lightweight, open-source “relational” database that did not use SQL. The name came up again in 2009 when Eric Evans and Johan Oskarsson used it to describe non-relational databases. Relational databases are often referred to as SQL systems. The term NoSQL can mean either “No SQL systems” or, in the more commonly accepted interpretation, “Not only SQL,” which emphasizes that some of these systems support SQL-like query languages.

The rise of NoSQL is an important event in computer science and in application development because SQL has been so dominant for so long. Many other forms of database technology have come and gone, but few have had the wide adoption of NoSQL.

Companies are finding that they can apply NoSQL technology to a growing list of use cases, often at a lower cost than operating a relational database. NoSQL databases typically offer a flexible schema model and are designed to scale horizontally across many servers, which makes them appealing for large data volumes or application loads that exceed the capacity of a single server.

The popularity of NoSQL has been driven by the following reasons:

  • The pace of development with NoSQL databases can be much faster than with a SQL database.
  • The structure of many different forms of data is more easily handled and evolved with a NoSQL database.
  • The amount of data in many applications is hard to serve affordably with a SQL database.
  • The scale of traffic and the need for near-zero downtime are difficult to handle with a SQL database running on a single server.
  • New application paradigms can be more easily supported.

The bloodline

There are various ways to classify NoSQL databases, with different categories and subcategories, some of which overlap. What follows is a non-exhaustive classification by data model, with examples.

The most successful sisters

There are a lot of family members out there, but four of them have established themselves as the most successful of all time.

  • Wide-Column Store
  • Document Store
  • Key-Value Data Store
  • Graph Store

Wide-Column Store

A wide-column store (or extensible record store) is a type of NoSQL database that uses tables, rows, and columns, but unlike a relational database, the names and format of the columns can vary from row to row in the same table. A wide-column store can be interpreted as a two-dimensional key–value store: a row key selects a row, and a column name selects a value within that row. Google’s Bigtable and Apache Cassandra are two of the best-known examples of a wide-column store.
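
The “two-dimensional key–value” idea can be sketched in a few lines of Python. This is only an in-memory illustration, not how Bigtable or Cassandra are implemented; the names table, put and get and the sensor data are made up for the example.

```python
from collections import defaultdict

# Hypothetical in-memory model: row key -> {column name -> value}.
# Each row keeps its own set of columns, so rows need not match.
table = defaultdict(dict)

def put(row_key, column, value):
    """Store a value under the (row key, column name) pair."""
    table[row_key][column] = value

def get(row_key, column, default=None):
    """Look up a value by row key first, then by column name."""
    return table.get(row_key, {}).get(column, default)

# Two rows in the same table with different columns.
put("sensor-1", "temperature", 21.5)
put("sensor-1", "humidity", 40)
put("sensor-2", "temperature", 19.0)
put("sensor-2", "battery", 0.87)

print(sorted(table["sensor-1"]))    # columns present in the first row
print(sorted(table["sensor-2"]))    # a different set of columns
print(get("sensor-2", "humidity"))  # missing column falls back to the default
```

A real wide-column store also partitions rows across nodes and persists them to disk, which this sketch leaves out.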

Wide-Column Store — Where to use?

Wide-column databases are ideal for use cases that require a large dataset that can be distributed across multiple database nodes, especially when the columns are not always the same for every row. Examples include:

  • Log data
  • IoT (Internet of Things) sensor data
  • Time-series data, such as temperature monitoring or financial trading data
  • Attribute-based data, such as user preferences or equipment features
  • Real-time analytics

Document Store aka Document DB

A document-oriented database is a specific kind of database that works on the principle of dealing with “documents” rather than strictly defined tables of information.

The central concept of a document store is that of a “document”. While the details of this definition differ among document-oriented databases, they all assume that documents encapsulate and encode data (or information) in some standard format or encoding. Encodings in use include XML, YAML, and JSON, as well as binary forms like BSON. Each document is addressed in the database via a unique key. Another defining characteristic of a document-oriented database is an API or query language that retrieves documents based on their contents.
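
A rough Python sketch of those two ideas, lookup by unique key and retrieval by content, could look like this. It is a toy in-memory model and not the API of MongoDB or any other product; collection, insert and find and the product data are hypothetical.

```python
import json

# Hypothetical in-memory collection: unique key -> JSON-encoded document.
collection = {}

def insert(key, document):
    """Encode the document as JSON and store it under its unique key."""
    collection[key] = json.dumps(document)

def find(**criteria):
    """Return the keys of documents whose fields match every criterion."""
    matches = []
    for key, raw in collection.items():
        doc = json.loads(raw)
        if all(field in doc and doc[field] == wanted
               for field, wanted in criteria.items()):
            matches.append(key)
    return matches

# Documents in the same collection do not need the same fields.
insert("p1", {"name": "laptop", "category": "electronics", "ram_gb": 16})
insert("p2", {"name": "novel", "category": "books", "author": "A. Writer"})
insert("p3", {"name": "phone", "category": "electronics"})

print(json.loads(collection["p2"]))   # lookup by key
print(find(category="electronics"))   # query by content
```

The scan inside find is the simplest thing that works; real document databases build indexes on fields so that content queries do not have to read every document.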

There are very few examples better than our very own favourite, MongoDB.

Different implementations offer different ways of organizing and/or grouping documents:

  • Collections
  • Tags
  • Non-visible metadata
  • Directory hierarchies

Compared to relational databases, collections could be considered analogous to tables and documents analogous to records. But they are different: every record in a table has the same sequence of fields, while documents in a collection may have fields that are completely different.

https://www.mongodb.com

Document DB use cases

Document databases are widely used by finance and e-commerce companies for storing product information and details. You can even store the product catalogue of your brand in one. They can also be used to store and model machine-generated data, continuous stream ingestion, and so on.

Key-Value Data Store

A key-value store, or key-value database is a simple database that uses an associative array (think of a map or dictionary) as the fundamental data model where each key is associated with one and only one value in a collection. This relationship is referred to as a key-value pair.

In each key-value pair the key is represented by an arbitrary string such as a filename, URI or hash. The value can be any kind of data like an image, user preference file or document. The value is stored as a blob requiring no upfront data modeling or schema definition.

The storage of the value as a blob removes the need to index the data to improve performance. However, you cannot filter or control what’s returned from a request based on the value because the value is opaque.
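
To make the “opaque value” point concrete, here is a small Python sketch. It is an assumption-laden toy, not the API of any real key-value product: store, kv_put and kv_get are hypothetical names, and JSON is just one possible way for the client to encode its values.

```python
import json

# Hypothetical in-memory store: string key -> value held as raw bytes.
# The store never looks inside the bytes; only the client knows the format.
store = {}

def kv_put(key, value):
    """Client-side encoding: serialize the value, then store the blob."""
    store[key] = json.dumps(value).encode("utf-8")

def kv_get(key):
    """Fetch the blob by key and decode it on the client side."""
    blob = store.get(key)
    return None if blob is None else json.loads(blob.decode("utf-8"))

kv_put("session:42", {"user": "ani", "theme": "dark"})
kv_put("session:43", {"user": "bob", "theme": "light"})

print(type(store["session:42"]))  # the store only sees bytes
print(kv_get("session:42"))       # lookup by key is a single step

# Filtering by value is the client's job: fetch, decode, then check.
dark_sessions = [k for k in store if kv_get(k)["theme"] == "dark"]
print(dark_sessions)
```

The last lines show the trade-off from the paragraph above: reads by key are direct, while any question about the values means decoding them outside the store.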

Two widely used managed services that offer a key-value model are Amazon DynamoDB and Azure Cosmos DB, although both also support other data models.

Use Case of Key-Value Data Store

Several popular use cases for key-value databases are:

  • Web applications may store user session details and preferences in a key-value store. All the information is accessible via the user key, and key-value stores lend themselves to fast reads and writes.
  • Real-time recommendations and advertising are often powered by key-value stores because the stores can quickly access and present new recommendations or ads as a web visitor moves throughout a site.
  • On the technical side, key-value stores are commonly used for in-memory data caching to speed up applications by minimizing reads and writes to slower disk-based systems. Hazelcast is an example of a technology that provides an in-memory key-value store for fast data retrieval.

Graph Store

Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, and most of the value of graph databases is derived from these relationships. Graph databases use nodes to store data entities, and edges to store relationships between entities. An edge always has a start node, end node, type, and direction, and an edge can describe parent-child relationships, actions, ownership, and the like. There is no limit to the number and kind of relationships a node can have.

A graph in a graph database can be traversed along specific edge types or across the entire graph. In graph databases, traversing relationships (the equivalent of joins) is very fast because the relationships between nodes are not calculated at query time but are persisted in the database. Graph databases have advantages for use cases such as social networking, recommendation engines, and fraud detection, where you need to create relationships between data and quickly query these relationships.
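
A minimal Python sketch of stored, typed, directed edges and a traversal restricted to one edge type is shown below. It is an illustration only, not how Neo4j or any other graph database stores data; edges, add_edge and reachable and the sample nodes are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical adjacency list: start node -> list of (edge type, end node).
# Edges are stored up front, so a traversal only follows existing links.
edges = defaultdict(list)

def add_edge(start, edge_type, end):
    """Persist a directed, typed edge from start to end."""
    edges[start].append((edge_type, end))

def reachable(start, edge_type):
    """Breadth-first traversal that follows only edges of one type."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for etype, target in edges.get(node, []):
            if etype == edge_type and target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order

add_edge("alice", "FOLLOWS", "bob")
add_edge("bob", "FOLLOWS", "carol")
add_edge("carol", "FOLLOWS", "alice")
add_edge("alice", "OWNS", "laptop")

print(reachable("alice", "FOLLOWS"))  # people reachable via FOLLOWS edges
print(reachable("alice", "OWNS"))     # things reachable via OWNS edges
```

The seen set keeps the traversal from looping forever on cycles such as alice, bob, carol and back to alice.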

Neo4j is one of the most popular graph databases in the world at the moment.

Graph Store Use Cases

Graph stores are heavily used in social networking sites. They also have solid usage in fraud detection, threat identification, vigilance, recommendation engines, supply chain mapping, and similar areas.

For any kind of help with career counselling, resume building, or discussing designs, or to know more about the latest data engineering trends and technologies, reach out to me at anigos.

P.S.: I don’t charge money.

Ani

Big Data Architect — Passionate about designing robust distributed systems