Distributed Systems — Replication

  1. Real-Time Sync: The replicas have to be in sync with the master. What does in-sync mean? Let's take a database as an example. In most databases, a write is first recorded in the WAL (write-ahead log) before it is applied to the data itself. When the database storage is distributed over several storage servers, the data also needs to be replicated for availability and scalability, and as a fallback in case a server crashes or fails. But for the fallback server to take over and start serving API requests, it needs to be in sync with the master.
    For example, consider an application serving hotel bookings to customers. As soon as a write occurs on the portion of the database held by the master server, the replica needs to receive it, because if the master crashes before the write is replicated, that booking is lost and the room might get booked again by someone else.
    So how do we keep replicas in sync in near real time? It is one of the tough problems in computer science.
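To make the hotel booking scenario concrete, here is a minimal sketch of synchronous WAL shipping. It assumes a single replica and uses in-process method calls in place of a network. The names `Replica`, `Master`, `book`, and `apply` are hypothetical, chosen for illustration, and are not taken from any real database. The idea shown is only that the master acknowledges a booking after the replica has confirmed the same log record.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical follower that replays WAL records sent by the master."""
    wal: list = field(default_factory=list)
    bookings: dict = field(default_factory=dict)

    def apply(self, lsn, record):
        # Accept only the next record in sequence so the log has no gaps.
        if lsn != len(self.wal):
            return False
        self.wal.append(record)
        room, guest = record
        self.bookings[room] = guest
        return True


@dataclass
class Master:
    """Hypothetical leader that logs, replicates, then applies each write."""
    replica: Replica
    wal: list = field(default_factory=list)
    bookings: dict = field(default_factory=dict)

    def book(self, room, guest):
        if room in self.bookings:
            return False  # room is already taken
        record = (room, guest)
        lsn = len(self.wal)  # log sequence number of the new record
        self.wal.append(record)  # 1. write-ahead: log before applying
        # 2. synchronous replication: wait for the replica to confirm
        if not self.replica.apply(lsn, record):
            self.wal.pop()  # replica did not take it, so undo and reject
            return False
        self.bookings[room] = guest  # 3. apply locally, then acknowledge
        return True


replica = Replica()
master = Master(replica)
first = master.book("101", "alice")
second = master.book("101", "bob")
print(first, second, replica.bookings)
```

Because the booking is confirmed only after the replica holds it, a master crash right after the acknowledgement still leaves room 101 recorded on the replica, so it cannot be handed to someone else after failover. The cost of this design choice is that every write waits on the replica, which is why near-real-time replication is a trade-off between latency and safety.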




Data Engineering

Harry Singh
