Distributed Systems — Sharding

When I interview candidates about NoSQL and distributed systems, one of the things I ask about in detail is sharding. Many people just blurt out that dividing data is sharding. That is technically OK, but I can also divide data across multiple RDBMS tables. Does that make my RDBMS database a NoSQL database now?
Some use fancy terms like horizontal scaling, and if I ask why not scale vertically, they throw out terms like backups and master/slave, which again can be done on a vertically scaled-up server. So let's discuss ‘sharding’ in very basic and simple terms, using databases and servers as examples.

Primary Table

Suppose we split the primary table, whose keys span the range 0000 0000 0000 to 9999 9999 9999, into N tables, where N can be computed by studying a variety of numbers.
For the sake of simplicity, let's say we divide it into 5 tables, each covering an equal slice of the range.
Now if a query comes in for Aadhaar number 1234 4321 1224, we know just by looking at the number “123443211224” that it will be in the first table, because the first table holds 0000 0000 0000 to 1999 9999 9999. That alone cuts the data we have to search to 1/5 of the total. There are other things like indexes, but we are considering only very basic sharding here.
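
The lookup described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names KEY_SPACE, NUM_TABLES and table_for are made up for this example, the tables are assumed to cover equal-sized ranges, and a real system would also have to deal with uneven data distribution.

```python
# Hypothetical sketch of range-based sharding for 12-digit IDs.
# All names here are illustrative assumptions, not part of any real system.

KEY_SPACE = 10**12  # IDs run from 0000 0000 0000 to 9999 9999 9999
NUM_TABLES = 5      # assumed table count, as in the example above


def table_for(aadhaar: str, num_tables: int = NUM_TABLES) -> int:
    """Return the 0-based index of the table whose range contains this ID."""
    digits = aadhaar.replace(" ", "")
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a 12-digit number")
    # Ceiling division, so the last table still covers the top of the range
    # when the key space does not divide evenly by num_tables.
    range_size = -(-KEY_SPACE // num_tables)
    return int(digits) // range_size


if __name__ == "__main__":
    # Index 0 is the first table.
    print(table_for("1234 4321 1224"))
```

The point of the design is that the table is picked by arithmetic on the key alone, so no table has to be scanned just to find out where a record lives.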


