Rust Rewritten PostgreSQL
awaiting funding · by qsliu dev · 1 upvote · raised $0.00 · spent $0.00 · pool $0.00
Rewrite PostgreSQL in Rust as a faithful, open-source re-implementation (same SQL behavior, same wire protocol, same on-disk format) so that PostgreSQL's proven design lives on in a language the next generation of systems developers can actually maintain.

# Motivation

We love PostgreSQL. Its design (MVCC, WAL, the extensible type system, catalog-driven everything) is among the best in databases. The problem is the implementation language: aging C with manual memory discipline (MemoryContext), longjmp-based error handling, process-per-connection concurrency, and pervasive global state. These make it increasingly hard to maintain and contribute to. The goal is to preserve PostgreSQL's semantics while moving its implementation onto Rust's guarantees.

# Approach (three phases, strictly in order)

1. **Translate.** Translate the entire PostgreSQL source tree (a single frozen release tag) from C to Rust, 1:1 and mechanical: "C written in Rust syntax," unsafe-heavy, idiom-for-idiom, following a written translation dictionary so the output is uniform. Pure Rust only: no hybrid C/Rust build, no FFI into PostgreSQL's own code. During this phase, do NOT attempt to compile or pass tests; optimize for throughput, uniformity, and per-line traceability back to the C source. Translation is embarrassingly parallel.
2. **Converge.** Make the translated tree compile, link, boot, run initdb, and pass PostgreSQL's full existing test suite (`make check-world`: pg_regress, isolation tests, TAP tests). The suite is the specification. Every fix restores 1:1 fidelity to the C: no improvements, no redesigns, bug-for-bug compatibility. Use differential testing against stock PostgreSQL (results, WAL, on-disk pages are byte-comparable because nothing was redesigned).
3. **Refactor.** With the suite green and trusted, refactor toward idiomatic safe Rust in small semantics-preserving steps, re-running the full suite after every step, with CI-enforced ratchets (unsafe count, static-mut count may only decrease). Target modernizations, in dependency order:
   - MemoryContext → Rust allocators, arenas with real lifetimes, and RAII (`Drop`); use-after-free of query memory becomes a compile error.
   - Eliminate global state, then replace one-process-per-connection with tokio-based async tasks in a single multi-threaded server: tens of thousands of cheap connections, built-in pooling.
   - Reduce defensive tuple copying toward zero-copy: buffer-pool pins as guard objects, tuples as borrows tied to pins, copy-vs-borrow made explicit and enforced by the ownership system, ideally zero copies from heap page to client socket.

# Constraints

- Existing drivers (libpq, JDBC, psycopg) must work unmodified; pg_regress pass rate is the headline metric.
- Behavior changes are out of scope until all three phases complete; if C has a bug, replicate it.
- License-compatible with PostgreSQL (PostgreSQL License); original project name.

Preferred implementation: Rust for everything; keep linking external C libraries (zlib, ICU, OpenSSL) via bindings or swap in Rust equivalents where the test suite can't tell the difference.
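
To make the "buffer-pool pins as guard objects, tuples as borrows tied to pins" idea from the Refactor phase concrete, here is a minimal single-threaded sketch. The names `Buffer`, `PinGuard`, `pin`, `tuple`, and `pin_count` are hypothetical and are not PostgreSQL APIs. A real buffer manager would use atomic pin counts, locking, and actual page layout; this only illustrates how the borrow checker could tie a tuple's lifetime to its pin.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for one shared buffer: raw page bytes plus a pin count.
/// (Assumption: single-threaded; a real server would need an atomic counter.)
pub struct Buffer {
    page: Vec<u8>,
    pins: Cell<u32>,
}

impl Buffer {
    pub fn new(page: Vec<u8>) -> Self {
        Buffer { page, pins: Cell::new(0) }
    }

    /// Pin the buffer. The pin is released automatically when the guard drops.
    pub fn pin(&self) -> PinGuard<'_> {
        self.pins.set(self.pins.get() + 1);
        PinGuard { buf: self }
    }

    pub fn pin_count(&self) -> u32 {
        self.pins.get()
    }
}

/// RAII guard representing one held pin.
pub struct PinGuard<'b> {
    buf: &'b Buffer,
}

impl<'b> PinGuard<'b> {
    /// Borrow a tuple's bytes without copying. The returned slice borrows
    /// from the guard, so it cannot outlive the pin.
    pub fn tuple(&self, offset: usize, len: usize) -> Option<&[u8]> {
        let end = offset.checked_add(len)?;
        self.buf.page.get(offset..end)
    }
}

impl Drop for PinGuard<'_> {
    fn drop(&mut self) {
        // Release the pin taken in `Buffer::pin`.
        self.buf.pins.set(self.buf.pins.get() - 1);
    }
}
```

With this shape, code that keeps a tuple slice after its `PinGuard` has been dropped is rejected at compile time, and a caller that needs the data for longer must copy it explicitly (for example with `to_vec()`), which is the "copy-vs-borrow made explicit" property described above.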
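
The CI-enforced ratchet from the Refactor phase ("unsafe count ... may only decrease") could be checked along the following lines. This is a rough sketch under stated assumptions: `count_keyword`, `Ratchet`, and `check_ratchet` are hypothetical names, and the token count is deliberately crude (it also counts `unsafe` inside comments and string literals). A real check would more likely walk the parsed syntax tree.

```rust
use std::cmp::Ordering;

/// Count whole-token occurrences of `keyword` in Rust source text.
/// Crude by design: comments and string literals are not excluded.
pub fn count_keyword(source: &str, keyword: &str) -> usize {
    source
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|tok| *tok == keyword)
        .count()
}

/// Outcome of comparing the current count against the stored baseline.
#[derive(Debug, PartialEq)]
pub enum Ratchet {
    /// Count went down: CI passes and the baseline should be lowered.
    Tightened { new_baseline: usize },
    /// Count is unchanged: CI passes.
    Unchanged,
    /// Count went up: CI fails.
    Regressed { baseline: usize, current: usize },
}

/// The ratchet rule: the count may stay the same or decrease, never increase.
pub fn check_ratchet(baseline: usize, current: usize) -> Ratchet {
    match current.cmp(&baseline) {
        Ordering::Less => Ratchet::Tightened { new_baseline: current },
        Ordering::Equal => Ratchet::Unchanged,
        Ordering::Greater => Ratchet::Regressed { baseline, current },
    }
}
```

A CI job would sum `count_keyword(src, "unsafe")` over every source file, compare it with a baseline committed to the repository, and fail the build on `Regressed`. The static-mut count could be handled the same way with a second counter and baseline.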
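
The Converge phase relies on on-disk pages being byte-comparable with stock PostgreSQL. One possible building block for that differential test is a function that reports where two relation files first diverge, expressed as a block number and an offset within the block. The sketch assumes PostgreSQL's default 8192-byte block size. `Divergence` and `first_divergence` are hypothetical names, and a real harness would also have to normalize or account for any bytes that legitimately differ between two runs.

```rust
/// PostgreSQL's default block size in bytes (assumes a default build).
pub const BLCKSZ: usize = 8192;

/// Location of the first differing byte, in page terms.
#[derive(Debug, PartialEq)]
pub struct Divergence {
    pub block: usize,
    pub offset: usize,
}

/// Compare a file written by stock PostgreSQL (`reference`) with the one
/// written by the Rust translation (`candidate`). Returns `None` when the
/// two are byte-identical.
pub fn first_divergence(reference: &[u8], candidate: &[u8]) -> Option<Divergence> {
    let common = reference.len().min(candidate.len());

    // First mismatching byte within the common prefix, if any.
    let mismatch = reference
        .iter()
        .zip(candidate)
        .position(|(a, b)| a != b);

    // If the prefix matches but the lengths differ, the divergence
    // starts where the shorter file ends.
    let pos = match mismatch {
        Some(p) => p,
        None if reference.len() != candidate.len() => common,
        None => return None,
    };

    Some(Divergence { block: pos / BLCKSZ, offset: pos % BLCKSZ })
}
```

Reporting a block and offset rather than a flat byte position makes it easier to map a failure back to a specific page and to the page header or tuple area within it.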