SAKSHI CHHABRA
4 min read · Mar 10, 2022

Document-Oriented Database / Document Store

In this blog and the next three, we are going to discuss the main types of NoSQL database: document databases, key-value stores, wide-column stores and graph databases.

What are NoSQL databases and their properties?

NoSQL databases are non-tabular databases that store data differently from relational tables. They emerged to store semi-structured data and to save time by avoiding complex joins: all the data related to an object is stored in one place.

BASE is often used to describe the properties of NoSQL databases. In contrast to the ACID guarantees of relational databases, BASE favors availability over immediate consistency.

  • Basically available: the system guarantees availability.
  • Soft state: the state of the system may change over time, even without new input, as updates propagate between nodes.
  • Eventual consistency: the system becomes consistent over time, provided that it receives no new input during that time.
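
The eventual-consistency property can be illustrated with a small simulation. This is only a sketch under simple assumptions: one primary and one secondary node, with replication triggered by hand. The class and method names (`Replica`, `EventuallyConsistentStore`, `replicate`) are made up for this example and do not belong to any real database.

```python
from collections import deque


class Replica:
    """A single node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: writes hit the primary at once and reach the secondary later."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = deque()  # writes not yet applied to the secondary

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        # May return stale data (or None) until replication catches up.
        return self.secondary.data.get(key)

    def replicate(self):
        # Apply queued writes in order; afterwards both nodes agree.
        while self.pending:
            key, value = self.pending.popleft()
            self.secondary.data[key] = value


store = EventuallyConsistentStore()
store.write("cart:42", ["book"])
stale = store.read_from_secondary("cart:42")  # secondary has not seen the write yet
store.replicate()
fresh = store.read_from_secondary("cart:42")  # secondary now matches the primary
```

Between `write` and `replicate` the two nodes disagree (soft state), yet the store keeps answering reads (basically available). Once no new writes arrive and replication runs, both nodes hold the same data.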

What is a document store?

A document store is the first category of NoSQL database. A document-oriented database is a data storage system used for storing, managing and retrieving documents (XML, JSON, etc.). Document databases are used to store semi-structured data.

A document is a record stored in the database; it typically holds information about one object and its related metadata. Documents store data as key-value pairs.

Understanding Normalization in Document Databases:

Document databases are usually not normalized, i.e. the same data can be repeated in multiple documents.

For example, suppose a customer makes a purchase at an online store and all the information related to the purchase is stored in a document database.

One purchase might look like:

[Figure: document object storing a single transaction]

After some time, the customer makes another purchase, which will look something like:

[Figure: document object storing a second transaction]

Notice the redundancy between the two documents: the shopper information and the shopperId appear in both. That is acceptable because storage is inexpensive, and each record contains a single transaction that can be retrieved quickly with a simple key-value query, without any joins.
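
As a rough illustration of the two purchases described above, here is a sketch using plain Python dictionaries serialized as JSON. Every field name and value except `shopperId` is an assumption made up for this example.

```python
import json

# Hypothetical purchase documents; every field name except shopperId is assumed.
purchase_1 = {
    "_id": "txn-1001",
    "shopperId": "s-77",
    "shopper": {"name": "Asha Rao", "email": "asha@example.com"},
    "items": [{"sku": "BK-12", "name": "Notebook", "price": 4.50, "qty": 2}],
    "total": 9.00,
}
purchase_2 = {
    "_id": "txn-1002",
    "shopperId": "s-77",
    "shopper": {"name": "Asha Rao", "email": "asha@example.com"},
    "items": [{"sku": "PN-03", "name": "Pen", "price": 1.25, "qty": 4}],
    "total": 5.00,
}

# Fields whose values are repeated verbatim in both documents.
repeated_fields = [
    key for key in purchase_1
    if key != "_id" and purchase_1[key] == purchase_2.get(key)
]

# Each transaction is read back by its key alone, with no join.
purchases_by_id = {p["_id"]: p for p in (purchase_1, purchase_2)}
print(json.dumps(purchases_by_id["txn-1002"], indent=2))
```

Here `repeated_fields` contains the shopper identifier and the embedded shopper details, which is exactly the duplication that a normalized relational design would move into a separate table.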

Collections:

A collection is a group of documents with similar content. However, the documents do not all need to have the same fields, because a document database offers a flexible schema.

Document databases have a query language that lets developers perform CRUD operations:

Create: Insert a document into the database; every document has a unique identifier.

Read: Query for documents by unique identifier or by field value. Indexes can be used for faster data retrieval.

Update: Update an existing document in the database.

Delete: Delete an existing document from the database.
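
To make the flexible schema concrete, here is a small sketch of a collection as a list of Python dictionaries. The `customers` data and the helper names `find` and `fields_in_use` are hypothetical and only serve this example.

```python
# A hypothetical "customers" collection: related documents, different fields.
customers = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ben", "phone": "555-0100", "newsletter": True},
    {"_id": 3, "name": "Chen", "email": "chen@example.com", "address": {"city": "Pune"}},
]


def find(collection, field, value):
    """Return documents whose field equals value; documents lacking it are skipped."""
    return [doc for doc in collection if field in doc and doc[field] == value]


def fields_in_use(collection):
    """Collect every field name that appears in at least one document."""
    return sorted({field for doc in collection for field in doc})


with_email = [doc["name"] for doc in customers if "email" in doc]
subscribed = find(customers, "newsletter", True)
```

No document carries empty placeholders for fields it does not use, so code that reads the collection has to check whether a field is present before relying on it.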
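
The four operations above can be sketched with a toy in-memory collection. This is not the API of any real document database: the `DocumentCollection` class and its method names are assumptions for illustration, and the `find` method scans every document where a real database would use an index.

```python
import uuid


class DocumentCollection:
    """Toy in-memory collection; not a real database client."""

    def __init__(self):
        self._docs = {}  # maps unique _id -> document

    def create(self, document):
        # Every stored document gets a unique identifier.
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = {**document, "_id": doc_id}
        return doc_id

    def read(self, doc_id):
        # Lookup by unique identifier; returns None if absent.
        return self._docs.get(doc_id)

    def find(self, **criteria):
        # Lookup by field value; a linear scan stands in for an index.
        return [
            doc for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]

    def update(self, doc_id, changes):
        if doc_id not in self._docs:
            return False
        # The identifier itself is never overwritten.
        self._docs[doc_id].update({k: v for k, v in changes.items() if k != "_id"})
        return True

    def delete(self, doc_id):
        return self._docs.pop(doc_id, None) is not None


books = DocumentCollection()
book_id = books.create({"title": "Dune", "year": 1965})
books.update(book_id, {"pages": 412})
matches = books.find(title="Dune")
```

After these calls, `matches` holds the single stored document, including the `pages` field that was added by the update without any schema change.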

Key Features of Document Databases:

  1. Query language/query through API: Document databases have an API or query language that allows developers to execute CRUD operations on the database. Documents can be queried based on their contents.
  2. Distributed: Document stores are distributed, which allows data to be spread across nodes and the system to scale horizontally (by adding nodes) as well as vertically (by adding resources to a node).
  3. Flexible schema: Documents in the database don't need to have the same fields. Some databases also offer schema validation, so the schema can be locked down when needed.
  4. Resilient: Document databases achieve resilience through replication.
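
The distribution and replication features can be sketched as follows. This is a simplified model under stated assumptions: a fixed number of shards, a fixed replication factor, and hash-modulo routing. Real document databases use their own partitioning and replication schemes, and all names here (`shard_for`, `put`, `get`, `failed_replica`) are hypothetical.

```python
import hashlib

NUM_SHARDS = 3          # assumed cluster size
REPLICAS_PER_SHARD = 2  # assumed replication factor


def shard_for(doc_id: str) -> int:
    """Map a document id to a shard with a stable hash."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


# shards[s][r] is replica r of shard s; each replica is a plain dict.
shards = [[{} for _ in range(REPLICAS_PER_SHARD)] for _ in range(NUM_SHARDS)]


def put(doc_id, document):
    # Write the document to every replica of the shard that owns it.
    for replica in shards[shard_for(doc_id)]:
        replica[doc_id] = document


def get(doc_id, failed_replica=None):
    # Read from the first replica that is not marked as failed.
    for index, replica in enumerate(shards[shard_for(doc_id)]):
        if index != failed_replica:
            return replica.get(doc_id)
    return None


put("order-1", {"total": 9.0})
normal_read = get("order-1")
read_after_failure = get("order-1", failed_replica=0)
```

Each document lives on exactly one shard, so adding shards spreads the load, while the second replica keeps the document readable when the first one is unavailable. `hashlib` is used instead of the built-in `hash()` because string hashes from `hash()` differ between Python processes.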

Differences between document databases and relational databases:

  • Intuitive data model: Since documents map to objects in code, there is no need to decompose data or perform expensive joins. Data that is accessed together is stored together in a document database, so developers write less code and end users get better performance.
  • Flexible schema: A document's schema is dynamic, so developers don't need to pre-define it. Fields can vary from document to document, so developers can modify the structure at any time and avoid disruptive schema migrations.
  • Omnipresent JSON documents: JSON has become an established standard for data sharing and storage because JSON documents are lightweight, human-readable and language-independent. Developers can structure data according to the application's needs, whereas a relational database has fixed rows and columns.

Developers often find working with data in documents easier than working with data in a relational database. They don't have to worry about splitting data across multiple tables when storing it, or joining it back together when retrieving it.
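
The difference can be seen by storing the same order both ways. The sketch below uses Python's built-in `sqlite3` module for the relational side and a plain dictionary of JSON strings as a stand-in for a document store. The table names, column names and sample data are assumptions for this example.

```python
import json
import sqlite3

# Relational version: the order is split across two tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (order_id INTEGER, product TEXT, qty INTEGER);
    INSERT INTO orders VALUES (1, 'Asha');
    INSERT INTO order_items VALUES (1, 'Notebook', 2), (1, 'Pen', 4);
    """
)
rows = conn.execute(
    """
    SELECT o.customer, i.product, i.qty
    FROM orders AS o JOIN order_items AS i ON i.order_id = o.id
    WHERE o.id = ?
    ORDER BY i.product
    """,
    (1,),
).fetchall()
conn.close()

# The application has to reassemble the object from the joined rows.
order_from_sql = {
    "customer": rows[0][0],
    "items": [{"product": product, "qty": qty} for _, product, qty in rows],
}

# Document version: the same order stored and fetched as one JSON document.
documents = {
    1: json.dumps({
        "customer": "Asha",
        "items": [{"product": "Notebook", "qty": 2}, {"product": "Pen", "qty": 4}],
    })
}
order_from_doc = json.loads(documents[1])
```

Both paths end with the same object, but the relational path needs two tables, a join and reassembly code, while the document path is a single lookup by key.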

Strengths of Document Databases:

  • No joins needed: A simple key-value query can fetch the required data without a join.
  • Schema-less: There are no restrictions on the format and structure of stored data, which avoids repeated database changes when the data changes frequently.
  • Low maintenance: Little maintenance is required once a document is created.
  • No foreign keys: Because there are no enforced relationships between documents, documents are independent of each other.

Disadvantages of document stores:

  • Lack of familiarity: There isn't much documentation available for document databases, so it can sometimes be hard to find specific information without a deep dive.
  • Security: Web applications can leak sensitive data, so owners of NoSQL databases need to pay careful attention to web application vulnerabilities.

Use Cases of Document Databases:

  • Real-time big data: Historically, extracting operational data was difficult because operational and analytical databases were maintained in separate environments. Being able to extract operational data in real time is critical in a highly competitive business environment. With a document database, operational data from any source can be stored and managed and, at the same time, fed to a BI engine for analysis, which removes the need for two environments.
  • Customer data management and personalization: Since document databases have a flexible schema, they can store documents with different attributes. In online profiles, different users provide different types of information. With a document database, you can store user information efficiently by keeping only the attributes a user actually provides, rather than leaving many fields null.
  • Catalog: Catalogs can have thousands of attributes, and a document database offers faster reads because all attributes related to a single product are stored in a single document.

Commonly used document databases include CouchDB, MongoDB, Elasticsearch and DynamoDB.

The next blog covers the second kind of NoSQL database, the key-value store. Link: https://sakshi8699.medium.com/key-value-store-nosql-database-608e78bd76f3


SAKSHI CHHABRA

Master's student in Computer Science at the University of Florida. I love to write and help others, so here I am.