This project implements the PDF specification as a centralized distributed file system in Go with three executables:
- master: the master tracker node with the lookup table and replication monitor.
- keeper: a data keeper node that stores .mp4 files, sends heartbeats, and participates in replication.
- client: a CLI client for upload and download operations.
- Master tracker and data keeper communication uses gRPC.
- File transfers use raw TCP.
- Every data keeper sends a heartbeat every second.
- Upload flow is: client asks master, master assigns a keeper, client uploads over TCP, keeper notifies master, master records the file.
- Replication runs every 10 seconds and tries to keep every file on at least 3 alive data keepers.
- Download flow is: client asks master for candidate machines, then downloads from one selected keeper.
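The heartbeat and replication rules above can be sketched as a pure planning step on the master. This is a minimal illustration, not the project's actual code: the names `aliveKeepers` and `planReplication` are hypothetical, and the 3-second heartbeat timeout is an assumption, since the text only states the 1-second heartbeat interval and the target of 3 alive replicas.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

const (
	replicaTarget    = 3               // from the text: at least 3 alive keepers per file
	heartbeatTimeout = 3 * time.Second // assumption: the timeout is not stated in this document
)

// aliveKeepers returns the sorted IDs of keepers whose last heartbeat
// arrived within heartbeatTimeout of now.
func aliveKeepers(lastSeen map[string]time.Time, now time.Time) []string {
	var alive []string
	for id, t := range lastSeen {
		if now.Sub(t) <= heartbeatTimeout {
			alive = append(alive, id)
		}
	}
	sort.Strings(alive)
	return alive
}

// planReplication returns, for each under-replicated file, the alive keepers
// that should receive a new copy. Files with no alive holder are skipped
// because there is no source to copy from.
func planReplication(files map[string][]string, alive []string) map[string][]string {
	isAlive := make(map[string]bool, len(alive))
	for _, id := range alive {
		isAlive[id] = true
	}
	plan := make(map[string][]string)
	for name, holders := range files {
		has := make(map[string]bool)
		for _, id := range holders {
			if isAlive[id] {
				has[id] = true
			}
		}
		if len(has) == 0 {
			continue
		}
		missing := replicaTarget - len(has)
		for _, id := range alive {
			if missing <= 0 {
				break
			}
			if !has[id] {
				plan[name] = append(plan[name], id)
				missing--
			}
		}
	}
	return plan
}

func main() {
	now := time.Now()
	lastSeen := map[string]time.Time{
		"keeper-1": now,
		"keeper-2": now.Add(-500 * time.Millisecond),
		"keeper-3": now.Add(-time.Second),
		"keeper-4": now.Add(-10 * time.Second), // missed heartbeats, treated as dead
	}
	files := map[string][]string{"sample.mp4": {"keeper-1", "keeper-4"}}
	fmt.Println(planReplication(files, aliveKeepers(lastSeen, now)))
	// prints: map[sample.mp4:[keeper-2 keeper-3]]
}
```

Keeping the plan separate from the gRPC calls that trigger the copies makes the 10-second replication pass easy to test without a running cluster.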
cmd/
    master/
    keeper/
    client/
internal/
    client/
    keeper/
    master/
    rpc/
    transfer/
scripts/
- Go 1.22+
gRPC APIs are defined in proto/dfs.proto and generated into internal/rpc/dfs.pb.go and internal/rpc/dfs_grpc.pb.go.
Open separate terminals and start the cluster.
go run ./cmd/master -grpc-addr 127.0.0.1:50051
go run ./cmd/keeper -id keeper-1 -grpc-addr 127.0.0.1:6001 -tcp-addr 127.0.0.1:7001 -master-addr 127.0.0.1:50051 -data-dir ./data/keeper-1
go run ./cmd/keeper -id keeper-2 -grpc-addr 127.0.0.1:6002 -tcp-addr 127.0.0.1:7002 -master-addr 127.0.0.1:50051 -data-dir ./data/keeper-2
go run ./cmd/keeper -id keeper-3 -grpc-addr 127.0.0.1:6003 -tcp-addr 127.0.0.1:7003 -master-addr 127.0.0.1:50051 -data-dir ./data/keeper-3

go run ./cmd/client upload -master 127.0.0.1:50051 -file ./sample.mp4
go run ./cmd/client download -master 127.0.0.1:50051 -file sample.mp4 -out ./downloads

- Only .mp4 files are accepted.
- Download selection is round-robin per client process to satisfy the uniform request requirement over repeated downloads.
- Replication is eventually consistent: the master triggers a copy and the destination keeper notifies the master after the replica is stored.
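The round-robin download selection described in the notes above could look roughly like the following. This is a sketch under assumptions: the `roundRobin` type and its `pick` method are hypothetical names, and the candidate addresses simply reuse the keeper TCP addresses from the startup commands.

```go
package main

import "fmt"

// roundRobin cycles through candidate keepers. The counter lives in memory,
// so the rotation is per client process, as the notes describe.
type roundRobin struct {
	next int
}

// pick returns the next candidate address, or false if the list is empty.
func (r *roundRobin) pick(candidates []string) (string, bool) {
	if len(candidates) == 0 {
		return "", false
	}
	addr := candidates[r.next%len(candidates)]
	r.next++
	return addr, true
}

func main() {
	// Candidate list as the master might return it for one file (illustrative).
	candidates := []string{"127.0.0.1:7001", "127.0.0.1:7002", "127.0.0.1:7003"}
	var rr roundRobin
	for i := 0; i < 4; i++ {
		addr, _ := rr.pick(candidates)
		fmt.Println(addr)
	}
	// prints 127.0.0.1:7001, 127.0.0.1:7002, 127.0.0.1:7003, then 127.0.0.1:7001 again
}
```

Because the counter is not persisted, requests are spread evenly only across repeated downloads within the same process; a fresh client invocation starts from the first candidate again.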