A Distributed Key-Value Store in Go

distribKV is a distributed key-value store written in Go. It supports sharding, replication, client-side routing, LSM-based storage (BadgerDB), and more.

Note

This project is for educational purposes — to understand how a key-value store works under the hood along with the theoretical concepts of sharding, replication, LSM trees, consistency models, bloom filters, and distributed systems.

📦 Features

Sharding & Replication with configurable leaders and replicas
Client-side Routing with HTTP redirection
LSM-Tree Storage using BadgerDB
Benchmarking Tools for performance testing
Concurrency via Goroutines
Persistence with Write-Ahead Logging and Compaction
Built with Go 1.24 and designed for extensibility

System Design & Architecture

This distributed key-value store is built with modern distributed systems principles, ensuring scalability, fault tolerance, and high availability.

Distributed Systems Fundamentals

Sharding(static)
The distribKV project utilizes a static sharding technique. In static sharding, the dataset is partitioned across predefined shards based on a fixed configuration. Each shard is responsible for a specific subset of the data, and the mapping between data keys and shards is established at the time of system configuration. In distribKV, this configuration is specified in the sharding.toml file.
Replication
Maintains multiple copies (replicas) of data across nodes for fault tolerance and high availability. Uses a leader–follower model or single leader replication, where the leader node handles all writes and followers synchronize data.

Consistency Models

Eventual Consistency
After a write, data is eventually replicated to all nodes, ensuring convergence over time.
Leader-based Replication
A single leader node handles all writes. Follower nodes replicate updates to maintain consistency.

Client-side Routing & Redirection

Key-based Routing
Clients use consistent hashing to determine the shard responsible for a key.
Redirection
If a request hits the wrong shard or non-leader replica, it is redirected using HTTP 307 or 308 status codes to the correct node.

Persistence and Storage

BadgerDB (LSM Tree) and SSTables
Uses BadgerDB, an embeddable key-value store inspired by RocksDB and LevelDB. It implements a Log-Structured Merge Tree (LSM) architecture for high write throughput and efficient storage management.
- LSM Tree: Writes are first stored in memory (MemTable) and periodically flushed to disk as immutable Sorted String Tables (SSTables).
- Write-Ahead Log (WAL): Ensures durability by logging changes before applying them.
- Compaction: Periodically merges SSTables to optimize read performance and reclaim storage space.
- Bloom Filters:Bloom filters are built for each SSTable to help quickly reject non-existent keys, avoiding unnecessary disk reads.

Scaling Types

Horizontal Scaling (Scale-Out)
The system supports adding more nodes (shards or replicas) to distribute the load. Keys are partitioned across shards using consistent hashing, allowing efficient horizontal growth.
Vertical Scaling (Scale-Up)
Each node can independently handle increased load by using efficient concurrency with Goroutines and optimizing storage with BadgerDB’s low-overhead design.
Shard-Based Partitioning
Data is divided across shards by hashing keys. Each shard can be independently scaled with its own replicas and leaders.
Replication-Based Scaling
Read operations are offloaded to replicas, reducing the load on leader nodes and improving throughput.

Concurrency & Benchmarking

Goroutines
Efficient parallelism using lightweight Go goroutines for concurrent GET/SET requests.
Benchmarking
Tools for measuring:
- Throughput (requests/sec)
- Latency (response times)
- Load performance

Fault Tolerance & Resilience

Replication
Replicas allow the system to survive leader failures without data loss.
Data Synchronization
Periodic syncing ensures eventual consistency between leader and replicas.

CAP Theorem Trade-offs

“You can only choose two: Consistency, Availability, Partition Tolerance.”

distribKV favors:

✅ Availability
✅ Partition Tolerance
🚫 (Relaxed) Consistency

Strong consistency can be added with protocols like Raft or Paxos in the future (LOL).

References

This project draws inspiration from:

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
badgerdb		badgerdb
cmd/bench		cmd/bench
config		config
db		db
img		img
replication		replication
server		server
web		web
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
badger_db.go		badger_db.go
bench.sh		bench.sh
go.mod		go.mod
go.sum		go.sum
launch.sh		launch.sh
main.go		main.go
populate.sh		populate.sh
sharding.toml		sharding.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A Distributed Key-Value Store in Go

📦 Features

System Design & Architecture

Distributed Systems Fundamentals

Consistency Models

Client-side Routing & Redirection

Persistence and Storage

Scaling Types

Concurrency & Benchmarking

Fault Tolerance & Resilience

CAP Theorem Trade-offs

References

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

A Distributed Key-Value Store in Go

📦 Features

System Design & Architecture

Distributed Systems Fundamentals

Consistency Models

Client-side Routing & Redirection

Persistence and Storage

Scaling Types

Concurrency & Benchmarking

Fault Tolerance & Resilience

CAP Theorem Trade-offs

References

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages