Deployment of MongoDB Sharding

Written by Anandkumar Jadhav | Aug 24, 2015 11:53:11 AM

A recent project I worked on required us to store millions of records while keeping memory and processor usage low. The system performed well for the first one million records. Beyond that, it became slow at inserting records, creating data files, and searching the database. To solve this problem, we decided to implement MongoDB sharding.

Sharding is the process of distributing data/records across multiple machines. It is MongoDB's approach to horizontal scaling: by adding more machines to the cluster, we can store more data and serve more read/write operations without degrading performance.

A sharded cluster consists of shards, config servers, and mongos instances. The shards store the data, i.e. the records that MongoDB calls documents, and each shard's data set can also be replicated. The query router, or mongos, forwards client requests to the appropriate shards and returns the query results to the client. A cluster can have multiple query routers. The config servers store the cluster metadata, including the mapping of data to shards.
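
To make this division of labour concrete, here is a minimal JavaScript sketch of range-based routing, assuming a numeric shard key. The chunk boundaries, the shard names, and the routeToShard helper are hypothetical and exist only for illustration. A real mongos also deals with hashed and compound keys, chunk splitting, and balancing, none of which is modelled here.

```javascript
// Simplified model of the metadata a config server holds: each chunk covers
// a half-open range [min, max) of shard key values and lives on one shard.
// The shard names and range boundaries are made up for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0000" },
  { min: 1000, max: 2000, shard: "shard0001" },
  { min: 2000, max: Infinity, shard: "shard0002" },
];

// What a query router does conceptually: find the chunk that contains the
// shard key value, then send the operation to the shard that owns it.
function routeToShard(chunkMap, shardKeyValue) {
  const chunk = chunkMap.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) {
    throw new Error(`no chunk covers shard key value ${shardKeyValue}`);
  }
  return chunk.shard;
}

console.log(routeToShard(chunks, 42));   // shard0000
console.log(routeToShard(chunks, 1000)); // shard0001
console.log(routeToShard(chunks, 5000)); // shard0002
```

Because the lookup only needs the chunk map, the router itself stores no data, which is why several query routers can sit in front of the same shards.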

The best practice is to have at least three shards, three config servers, and two mongos instances.

Let us go ahead and set up a small sharded cluster on a single Linux machine. We are going to create:

  1. Three shards ('edb0.cluster.local', 'edb1.cluster.local', 'edb2.cluster.local') on ports 27021, 27022, and 27023 respectively. (Since we are creating the shards on a single machine, each one needs a different port. On separate machines, all shards could use the default port, i.e. 27017.)
  2. One config server on port 27024
  3. One mongos/query router on the default port 27017

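Let's add host entries so that the hostnames used in this walkthrough (the three shard names above and edb0.queryserver.local for the config server and mongos) resolve to this machine. Since everything runs on one host, each name can point to the machine's own IP address, for example 127.0.0.1.

# vim /etc/hosts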

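Create the shards using the following command. Each data directory must exist before mongod starts, and each mongod runs in the foreground, so use a separate terminal per process (or add --fork together with --logpath).

# mongod --dbpath <location of shard database> --port <port number>

e.g.

# mongod --dbpath /volmdb/data/db/shard1 --port 27021

# mongod --dbpath /volmdb/data/db/shard2 --port 27022

# mongod --dbpath /volmdb/data/db/shard3 --port 27023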

Create Config Server

# mongod --configsvr --dbpath <location of config database> --port <port number>

e.g.

# mongod --configsvr --dbpath /volmdb/data/configdb --port 27024

Create Mongos/Query Router

# mongos --port 27017 --configdb <config server host/ip>:<port>

e.g.

# mongos --port 27017 --configdb edb0.queryserver.local:27024

Note: MongoDB 3.4 and later require the config servers to run as a replica set, in which case --configdb takes the form <replica set name>/<host>:<port>.

Connect to mongos using the mongo shell:

# mongo --port 27017

You can now start adding the shards using the sh.addShard() command.

e.g.

mongos> sh.addShard("edb0.cluster.local:27021")

{ "shardAdded" : "shard0000", "ok" : 1 }

mongos> sh.addShard("edb1.cluster.local:27022")

{ "shardAdded" : "shard0001", "ok" : 1 }

mongos> sh.addShard("edb2.cluster.local:27023")

{ "shardAdded" : "shard0002", "ok" : 1 }
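
Before moving on, it can be worth confirming that all three shards are registered. The following is a small sketch for the mongo shell, assuming it is connected to mongos. It relies on the listShards admin command; the listShardHosts helper name is made up for this example.

```javascript
// Return the host string of every shard registered with the cluster.
// `adminDb` is expected to be the admin database handle of a mongos
// connection, e.g. db.getSiblingDB("admin") in the mongo shell.
function listShardHosts(adminDb) {
  var result = adminDb.runCommand({ listShards: 1 });
  if (!result.ok) {
    throw new Error("listShards failed: " + JSON.stringify(result));
  }
  return result.shards.map(function (shard) {
    return shard.host;
  });
}

// Only runs inside the mongo shell, where `db` and `print` are globals.
if (typeof db !== "undefined") {
  listShardHosts(db.getSiblingDB("admin")).forEach(function (host) {
    print(host);
  });
}
```

With the setup above, the printed list should contain the three host:port pairs that were passed to sh.addShard().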

Enable sharding for the new database:

mongos> sh.enableSharding("<database>")

Then add an index on the shard key (required if the collection already contains data) and shard the collection:

mongos> sh.shardCollection("<database>.<collection>", <shardkey>)

This enables sharding on the collection and lets MongoDB distribute its data across the three shards. The application can now connect to the database through mongos, i.e. edb0.queryserver.local:27017.

A production environment should ideally run each shard, mongos, and config server on a separate machine. This improves read/write performance and spreads storage across machines, and running each shard as a replica set adds redundancy, among other benefits.
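
The enable, index, and shard steps can also be collected into one mongo shell script. This is a sketch under stated assumptions: the database name "mydb", the collection name "records", the shard key { userId: 1 }, and the shardNewCollection helper are all placeholders, not names from the project described here.

```javascript
// Enable sharding on a database, index the shard key, and shard a collection.
// `sh` and `db` are the helper objects the mongo shell provides when it is
// connected to mongos. Every name passed in is a placeholder.
function shardNewCollection(sh, db, dbName, collName, shardKey) {
  // 1. Allow collections in this database to be sharded.
  sh.enableSharding(dbName);
  // 2. The shard key must be indexed. Creating the index explicitly also
  //    covers the case where the collection already holds data.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(shardKey);
  // 3. Shard the collection. The namespace is "<database>.<collection>".
  return sh.shardCollection(dbName + "." + collName, shardKey);
}

// Only runs inside the mongo shell, where `sh` and `db` are globals.
if (typeof sh !== "undefined" && typeof db !== "undefined") {
  shardNewCollection(sh, db, "mydb", "records", { userId: 1 });
  sh.status(); // prints the shards and how the collection is distributed
}
```

The shard key deserves some thought before running this, since it decides how documents are spread across the shards; a key with few distinct values tends to concentrate data on one shard.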

References:

https://docs.mongodb.org/manual/sharding/