Hunting a 16-year-old SQLite WAL bug with TLA+(ubuntu.com)
220 points by peterparker204 4 days ago | 27 comments
tl;dr: Canonical's dqlite team used TLA+ to model a 16-year-old SQLite WAL checkpoint bug that could corrupt databases via a data race: a second checkpoint could miss that the WAL was reset by a concurrent writer, causing it to incorrectly mark frames as backfilled and later skip transactions. The model quickly reproduced the bug in ~20 states, matching SQLite's own description. Modeling dqlite showed it's unaffected since it acquires the write lock during checkpoints, preventing the race. SQLite's fix adds a salt comparison to detect WAL resets before proceeding.
HN Discussion:
  • Background context on TLA+ and its creator Leslie Lamport
  • Author is available to answer questions about the post
  • Title is misleading since article is really about proving dqlite safe, not hunting the SQLite bug
  • Interest in porting TLA+ concepts to Lean with tactics
  • Speculation about using TLA+ models as verification for LLM-generated code