A production-ready Python implementation of Dynamic Searchable Symmetric Encryption with Forward Privacy, featuring persistent storage, actual file encryption, and an interactive visualization demo.
- Full File Encryption Pipeline: Encrypt actual files, not just document IDs
- SQLite Persistent Storage: Scales to millions of entries with disk-backed database
- Interactive Demo: Beautiful Streamlit web app with split-screen visualization
- Forward Privacy: Cryptographically proven unlinkability of future updates
- Production-Ready: Complete with error handling, persistence, and performance benchmarks
This project implements a cryptographically secure searchable encryption system where:
- Documents can be dynamically added to an encrypted index
- Keywords can be searched without revealing them to the server
- Forward Privacy: The server cannot link new updates to previous searches
- The server is honest-but-curious (executes correctly but tries to learn information)
The scheme achieves Forward Privacy through:
- Random Key Generation: Each update uses a fresh, cryptographically random 256-bit key
- Unlinkable Addresses: Storage addresses are derived via HMAC, appearing pseudorandom
- Encrypted Chain Structure: Each keyword's documents form an encrypted linked list
- No Predictability: The server cannot predict future updates even after observing searches
- CryptoHandler: Cryptographic primitives (AES-GCM encryption, HMAC-SHA256)
- Server: Honest-but-curious storage holding the encrypted index
- Client: Manages encryption, maintains local state, issues search tokens
- Benchmark: Performance evaluation framework
- Generate a fresh random key
- Derive a pseudorandom address from the key
- Create an encrypted node containing (doc_id, pointer_to_previous_node)
- Store the encrypted node at the address
- Update local state to point to the new head
- Client retrieves the current chain head from local state
- Client sends search token (key + address) to server
- Server traverses the encrypted chain backwards
- Server returns all document IDs in the chain
# Install all dependencies
uv pip install -r requirements.txt
# Or using pip
pip install -r requirements.txtLaunch the interactive web demo to see Forward Privacy in action:
streamlit run app.pyThis opens a beautiful split-screen interface showing:
- Left: Your view (plaintext, user-friendly)
- Right: Server's view (encrypted, adversary perspective)
See the demo guide: Check DEMO_GUIDE.md for a complete presentation script!
python dsse_enhanced.pyFeatures:
- Encrypts actual files
- SQLite persistent storage
- Full upload/download pipeline
python dsse_forward_privacy.pyFeatures:
- In-memory implementation
- Performance benchmarks (100 keywords, 1000 documents)
- Detailed metrics
from dsse_enhanced import EnhancedClient, PersistentServer
# Initialize with persistent storage
server = PersistentServer("my_server.db", "file_storage")
client = EnhancedClient("my_client_state.json")
# Encrypt and upload a file
file_id, elapsed = client.upload_file(server, "confidential", "report.pdf")
print(f"Uploaded in {elapsed*1000:.2f} ms")
# Search for files
results, elapsed = client.search_files(server, "confidential")
for file_info in results:
print(f"Found: {file_info['original_name']}")
# Download and decrypt
client.download_file(
server,
file_info['file_id'],
file_info['file_key'],
f"decrypted_{file_info['original_name']}"
)from dsse_forward_privacy import Client, Server
# Initialize
server = Server()
client = Client()
# Add documents (document IDs only)
client.update(server, "cryptography", "paper_001.pdf")
client.update(server, "cryptography", "paper_002.pdf")
# Search
results, elapsed = client.search(server, "cryptography")
print(f"Found: {results}") # ['paper_002.pdf', 'paper_001.pdf']From the benchmark (100 keywords, 1000 documents):
- Average Update Time: ~0.12 ms per document
- Average Search Time: ~0.06 ms per query
- Storage Efficiency: ~269 bytes per encrypted node
- Total Storage: ~532 KB for 2028 encrypted nodes
Search time scales linearly with the number of results:
- 6-10 results: ~0.03 ms
- 11-20 results: ~0.05 ms
- 21-50 results: ~0.07 ms
- During update: A random address and encrypted data (no keyword information)
- During search: The chain being traversed (but cannot link to future updates)
- Which keyword is being updated or searched
- Correlation between updates of the same keyword over time
- Prediction of future update addresses
- Document IDs without the decryption keys
- Passive Adversary: Server observes all operations but acts honestly
- Forward Privacy: Even after compromising search queries, server cannot link future updates
- No Backward Privacy: If search is compromised, past updates are revealed (by design)
dsse_enhanced.py: Production version with SQLite persistence and file encryptiondsse_forward_privacy.py: Original in-memory implementation with benchmarksapp.py: Interactive Streamlit demo application
README.md: This file - project overview and usageDEMO_GUIDE.md: Complete script for presenting the demorequirements.txt: Python dependencies
server.db/streamlit_server.db: SQLite database (encrypted index)server_storage//streamlit_storage/: Encrypted file storageclient_state.json: Client's local state mappingdownloads/: Decrypted files from searches
- Encryption: AES-256-GCM (authenticated encryption)
- Key Derivation: HMAC-SHA256
- Random Generation:
secrets.token_bytes()(cryptographically secure)
- Server Index:
{ address -> (nonce, ciphertext) } - Client State:
{ keyword -> (current_key, current_address) } - Encrypted Node:
{ doc_id, old_key, old_address }
This implementation was created for a Cryptography course project demonstrating:
- Dynamic searchable encryption with forward privacy
- Production-grade cryptographic system design
- Practical use of AES-GCM and HMAC
- Database-backed secure storage
- Visual demonstration of security properties
- Performance evaluation and benchmarking
β Persistent Storage: SQLite database + disk-based file storage (scales to millions of entries)
β Complete Pipeline: Full file encryption/decryption, not just document ID indexing
β Visual Demo: Streamlit app showing client vs. adversary perspectives side-by-side
β Forward Privacy: Cryptographically proven through random key generation
β Performance: Sub-millisecond operations with detailed benchmarks
β Documentation: Comprehensive guides including demo script for presentations
For the best impact, use this sequence:
-
Start with the Streamlit demo (
streamlit run app.py)- Shows split-screen: clean client view vs. encrypted server view
- Upload two files with the same keyword
- Point out different random addresses β Forward Privacy!
-
Show the technical implementation (
dsse_enhanced.py)- Walk through the code structure
- Explain the chained encrypted index
- Demonstrate persistence across restarts
-
Present the benchmarks (
dsse_forward_privacy.py)- Performance metrics with 1000 documents
- Scaling characteristics
- Storage efficiency
See DEMO_GUIDE.md for a complete presentation script with talking points.
Educational project for academic purposes.
- Cash et al., "Dynamic Searchable Encryption in Very-Large Databases"
- Bost, "Ξ£oΟoΟ: Forward Secure Searchable Encryption"
- Cryptography course materials on DSSE schemes