There is a quiet problem that surfaces when you ship a long-running service backed by a file-based database. The database file sits on disk, immutable to the running process, until it is not. Someone drops a new snapshot. A backup restores an older state. A replication consumer writes new data. And your application, blissfully unaware, continues querying stale bytes.
The question is simple: how do you notice? And once you notice, how do you switch without breaking the queries already in flight?
What Happens Mid-Request
A request comes in. Your handler opens a read transaction, scans a few rows, starts building a response. Between the first SELECT and sending the response, an external process replaces app.db with a fresh snapshot.
If the new file was written directly over the old one—bytes overwriting bytes—you get corruption. Your cursor is walking a B-tree that no longer exists. Internal consistency checks fail. SQLite returns SQLITE_CORRUPT and your request dies.
If the swap used rename(new.db, app.db)—atomic on POSIX—something subtler happens. Your file descriptor still points to the old inode. The kernel didn't move your handle. Your transaction completes and returns correct data, but stale data. The user sees a version of reality that no longer exists on disk.
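The stale-handle behavior is easy to demonstrate outside any database. A minimal Python sketch (the file names are illustrative):

```python
import os
import tempfile

# Show that an atomic rename does not move an already-open file
# descriptor: the descriptor keeps referring to the old inode.
d = tempfile.mkdtemp()
live = os.path.join(d, "app.db")
incoming = os.path.join(d, "new.db")

with open(live, "w") as f:
    f.write("old snapshot")
with open(incoming, "w") as f:
    f.write("new snapshot")

fd = os.open(live, os.O_RDONLY)  # handle acquired before the swap
os.rename(incoming, live)        # atomic replacement on POSIX

# The path now resolves to the new inode, but fd still sees the old bytes.
inodes_differ = os.fstat(fd).st_ino != os.stat(live).st_ino
stale_read = os.read(fd, 64)
os.close(fd)
```

`inodes_differ` comes back true and `stale_read` returns the old contents: correct for the file the process opened, wrong for the file on disk.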
The first case crashes. The second lies. I'm not sure which is worse for debugging at 3am.
Why This Breaks Things
Your process opened app.db. It memory-mapped regions, cached pages, built internal structures. SQLite assumes those bytes are stable for the duration of a connection.
When the underlying file changes: open transactions can't see consistent state. Prepared statements hold page offsets that now point somewhere else. If you're using WAL mode and the WAL file gets swapped too, readers and writers end up disagreeing about what history looks like. Connection pools become a liability because every handle in the pool is now pointing at a ghost.
The database trusts the filesystem. That trust is now misplaced.
Requests in Three States
When you detect a change and want to reload, your request queue has three kinds of requests sitting in it.
Completed ones already sent a response. Nothing to do there. In-flight ones are mid-transaction, holding read locks or cursors. Queued ones are waiting for a connection and haven't touched the old file yet.
The hard question is what to do with the in-flight ones.
You can abort them. Kill the transaction, return 503. If your clients have retry logic, they'll figure it out. This is fine for internal services, less fine for user-facing ones.
You can let them finish. They see stale data, but consistent stale data—their transaction was correct for the version they started with. This is eventual consistency doing its thing, and depending on your domain it might be completely acceptable.
Or you can fence them. Mark the old version as "draining." New requests go to the new file. Old requests finish against the old one. When the last draining request completes, close that file handle and reclaim the memory. This is the most work but it's the only option that doesn't drop requests or serve stale data to new ones.
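The fencing bookkeeping is mostly a reference count. A minimal sketch, assuming some `close_fn` that stands in for whatever closes the real handle (e.g. a connection's close method); the class name is illustrative:

```python
import threading

class FencedHandle:
    """Wrap a database handle; refuse new acquisitions once draining,
    and close the handle only after the last in-flight request that
    started against it has released it."""

    def __init__(self, handle, close_fn):
        self.handle = handle
        self._close_fn = close_fn
        self._refs = 0
        self._draining = False
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self._draining:
                raise RuntimeError("handle is draining; use the new version")
            self._refs += 1
            return self.handle

    def release(self):
        with self._lock:
            self._refs -= 1
            close_now = self._draining and self._refs == 0
        if close_now:
            self._close_fn(self.handle)

    def drain(self):
        with self._lock:
            self._draining = True
            close_now = self._refs == 0
        if close_now:
            self._close_fn(self.handle)
```

When a new snapshot is detected, you open a fresh handle for new requests and call `drain()` on the old one; it closes itself when the last straggler releases.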
Knowing When to Swap
Before you can fence, you need to notice. The filesystem provides tools: inotify on Linux, kqueue on BSD, FSEvents on macOS. You can poll the modification timestamp or file size. You can watch for a sentinel file that the writer creates when the new snapshot is ready.
But detection alone isn't enough. A half-written WAL isn't a valid database. A VACUUM in progress isn't complete. The writer has to signal completion somehow—a renamed file, an atomically written version number, a lock released. You need the writer and reader to agree on when a transition is safe.
If you control the writer, this is straightforward. If the writer is "whatever cronjob runs the S3 sync," you have to get creative.
Coordination Patterns
Double-buffering keeps two file handles, one active and one standby. When a new snapshot arrives, you open it on the standby, run some queries to warm the cache, atomically swap which handle is active, drain the old one, then close it. Memory doubles, but requests never see a mid-swap state.
Version-tagged connections assign each request a version number at arrival. The connection pool returns a handle for that version. New requests after a swap get version N+1. Old requests finish with N. When no one's holding version N anymore, clean it up. This is more bookkeeping but uses less memory than keeping two full copies.
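A sketch of that bookkeeping, with `open_fn` and `close_fn` as stand-ins for real connection calls (e.g. sqlite3.connect and a connection's close); the class and method names are illustrative:

```python
import threading

class VersionedPool:
    """Pin each request to the database version current at arrival;
    close a superseded version's handle once its last request checks in."""

    def __init__(self, open_fn, close_fn, path):
        self._open_fn, self._close_fn = open_fn, close_fn
        self._lock = threading.Lock()
        self._current = 0
        self._handles = {0: open_fn(path)}
        self._refs = {0: 0}

    def checkout(self):
        with self._lock:
            v = self._current
            self._refs[v] += 1
            return v, self._handles[v]

    def checkin(self, version):
        to_close = None
        with self._lock:
            self._refs[version] -= 1
            if version != self._current and self._refs[version] == 0:
                to_close = self._handles.pop(version)
                del self._refs[version]
        if to_close is not None:
            self._close_fn(to_close)

    def swap(self, path):
        to_close = None
        with self._lock:
            old = self._current
            self._current = old + 1
            self._handles[self._current] = self._open_fn(path)
            self._refs[self._current] = 0
            if self._refs[old] == 0:   # no one was holding the old version
                to_close = self._handles.pop(old)
                del self._refs[old]
        if to_close is not None:
            self._close_fn(to_close)
```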
Grace periods accept that you'll have stragglers. After detecting a change, start a timer. New requests go to the new version. Old requests have T seconds to finish. After T, force-close anything still running. This bounds tail latency but you'll occasionally abort something.
Locks Don't Save You Here
SQLite has advisory locking, and you can inspect lock state through sqlite3_file_control(). DuckDB manages its own WAL and checkpointing internally.
But these assume a cooperative writer—one that knows to check locks before swapping files. If the external process is just dropping files from a restore script or S3 sync, it doesn't know you exist. The coordination has to happen outside the database: a lock file, a process signal, something in Redis. The database's internal locks are for its own consistency, not for your deployment choreography.
The Simple Version
If you don't need zero downtime, this works:
1. Watch the file.
2. When it changes, stop accepting requests.
3. Close all connections.
4. Reopen the database.
5. Resume.
Downtime is steps 2 through 5. For a small SQLite file that's milliseconds. For a 10GB DuckDB file with a cold cache, it's seconds. Sometimes this is fine.
Trade-offs
| Strategy | Request Loss | Memory | Complexity | Consistency |
|---|---|---|---|---|
| Stop-the-world | During reload | 1x | Low | Strong |
| Double-buffering | None | 2x | Medium | Eventual |
| Version-tagged | None | 1-2x | High | Strong |
| Grace period | Tail aborts | 1x | Medium | Strong |
Closing
The filesystem is not a message bus. The inode is not a version number. When you open a file, you're trusting it to stay put, and that trust is implicit. When someone violates it, the fallout is yours to handle.
You can build coordination. You can fence and drain and double-buffer. But you have to build it—no one hands it to you.