
SQLite Durable Workflows — What the HN Crowd Got Right, and What They Skipped

A 473-point Hacker News post argues SQLite is all you need for durable workflows. We break down the architecture, where it actually works, and where it crumbles for Indian SMB workloads that need shared state.

29 May 2026 · 8 min read · Ankur

A blog post by the Obelisk team hit 473 points on Hacker News this week. The thesis: SQLite is all you need for durable workflows. The argument builds on DBOS's earlier claim that Postgres replaces your queue and orchestration tier — but Obelisk pushes it further. Why run a network database when a local file, wrapped in transactions, gives you the same guarantees?

We agree with 80% of it. The remaining 20% is where Indian SMB workloads break the model. Here's our breakdown.

The Architecture They're Proposing

The pattern is clean. Each worker process owns a SQLite file. Workflow state is written transactionally into that file. Litestream streams WAL changes to S3-compatible object storage asynchronously. An observer process pulls databases for inspection and debugging.
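
A minimal sketch of the "workflow state written transactionally into a local file" idea, using Python's built-in sqlite3 module. The table layout and the names open_workflow_db and run_step are our own illustrative assumptions, not Obelisk's actual schema or API.

```python
import sqlite3


def open_workflow_db(path):
    """Open (or create) the worker's local workflow database."""
    # isolation_level=None means autocommit; transactions are explicit below.
    conn = sqlite3.connect(path, isolation_level=None)
    # WAL mode: the write-ahead log is what gets streamed to object storage.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS steps ("
        " workflow_id TEXT NOT NULL,"
        " step TEXT NOT NULL,"
        " result TEXT NOT NULL,"
        " PRIMARY KEY (workflow_id, step))"
    )
    return conn


def run_step(conn, workflow_id, step, fn):
    """Run fn at most once per recorded (workflow_id, step); replay afterwards."""
    row = conn.execute(
        "SELECT result FROM steps WHERE workflow_id = ? AND step = ?",
        (workflow_id, step),
    ).fetchone()
    if row is not None:
        return row[0]  # step already completed in an earlier run

    result = fn()
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "INSERT INTO steps (workflow_id, step, result) VALUES (?, ?, ?)",
            (workflow_id, step, result),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return result
```

If the process dies after fn() runs but before the commit, the step runs again on restart, so any side effects inside fn should be idempotent.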

💡 Key Insight: The durable part is the workflow state, not the infrastructure. Compute stays cheap and disposable. SQLite gives you ACID without a separate database service. No network hop, no extra control plane, no new operational surface.

Obelisk's argument is that AI agent workloads are especially well-suited here. Agents are bursty and experimental. Each agent or tenant benefits from a self-contained unit of state. A fleet of tiny containers, each with its own SQLite file, is cheaper and simpler than a shared Postgres cluster — and gives better fault isolation.

Where This Actually Works

We've used this pattern internally for two things:

Single-user workflow engines. When you're running an n8n instance for one department of a textile SMB in Ludhiana, the workflow history doesn't need to be shared. SQLite is the right call.
Agent debugging. Dumping an agent's execution log into a SQLite file, backing it to S3, and pulling it for inspection beats tailing structured logs from a central service. You get the full state, not just what someone remembered to emit.

The Litestream caveat matters. Replication is asynchronous. If the SQLite volume disappears before the latest writes are copied, you lose them. For AI experimentation and single-tenant workflows, this is acceptable. For a production order management system that can't lose a single transaction, it isn't.

What the Post Skips: Shared-Mutable State

SQLite + Litestream	Postgres + Queue
Single-writer only. WAL mode allows concurrent readers but only one writer.	Multi-writer. Row-level locking handles concurrent mutation.
No built-in notification. You poll or use external signaling.	LISTEN/NOTIFY for real-time event delivery.
Backup is async. RPO measured in seconds to minutes.	Synchronous replication available. RPO = 0.
Operational simplicity — a file. No separate process.	Operational overhead — a database server to manage.
Best for: single-tenant agents, local tooling, embedded workflows.	Best for: multi-tenant SaaS, shared queues, zero-data-loss requirements.
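
The "no built-in notification" row usually means a polling loop in practice. A rough sketch, assuming a hypothetical jobs table and function names of our own choosing (none of this comes from the Obelisk post):

```python
import sqlite3
import time


def create_jobs_table(conn):
    """Assumed schema for this sketch: a minimal job queue table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def claim_next_job(conn):
    """Claim the oldest pending job in one transaction; None if queue is empty."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def poll(conn, handle, interval=0.5, max_idle_polls=None):
    """Process jobs as they appear; sleep between polls when the queue is empty."""
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        job = claim_next_job(conn)
        if job is None:
            idle += 1
            time.sleep(interval)
            continue
        idle = 0
        job_id, payload = job
        handle(payload)
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

The connection is expected to be opened with isolation_level=None so the explicit BEGIN and COMMIT statements control the transaction. The poll interval is the trade-off: shorter means lower latency but more wake-ups, which is the gap LISTEN/NOTIFY closes on the Postgres side.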

Here's the problem for Indian SMB SaaS. Most of our vertical products — Paraslace for textile ERP, for instance — have multi-tenant architectures. Multiple garment units share infrastructure. Workflow state spans tenants. A SQLite-per-tenant model means 400 SQLite files for 400 manufacturers. That's manageable with Litestream, but the moment two tenants need to share a workflow — like a dyeing unit and a stitching unit coordinating on the same order — the single-writer constraint bites.
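
To make the single-writer constraint concrete, here is a small sketch with two connections to the same file, standing in for the dyeing unit and the stitching unit. The table and function names are hypothetical. timeout=0 is set so the second writer fails immediately instead of waiting.

```python
import sqlite3


def demo_single_writer(path):
    """Return (rows visible to the second connection, whether its write was blocked)."""
    dyeing = sqlite3.connect(path, isolation_level=None, timeout=0)
    stitching = sqlite3.connect(path, isolation_level=None, timeout=0)
    dyeing.execute("PRAGMA journal_mode=WAL")
    dyeing.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, stage TEXT)"
    )

    # The dyeing unit opens a write transaction and holds the write lock.
    dyeing.execute("BEGIN IMMEDIATE")
    dyeing.execute("INSERT INTO orders (stage) VALUES ('dyeing')")

    # Reads from the other connection still work in WAL mode, but they only
    # see committed data, so the in-flight insert is not visible yet.
    visible_rows = stitching.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

    # A second write transaction cannot start while the first is open.
    try:
        stitching.execute("BEGIN IMMEDIATE")
        stitching.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:  # "database is locked"
        blocked = True

    dyeing.execute("COMMIT")
    dyeing.close()
    stitching.close()
    return visible_rows, blocked
```

Raising the timeout makes the second writer wait rather than fail, but writes are still serialized through one lock on the whole file. That is fine for one tenant per file and painful when several tenants mutate the same order.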

The real insight isn't "SQLite replaces Postgres." It's "match your durability mechanism to your concurrency model." Most systems over-provision infrastructure on day one. The Obelisk team is right that many workflows don't need a distributed queue. But they also don't need to start with SQLite if they know they'll need shared-mutable state within six months.

When We'd Reach for SQLite (and When We Wouldn't)

Use SQLite workflows when:

Single-tenant or agent-per-user architecture
Bursty, experimental workloads
Workflow state is self-contained
You already run SQLite for the application
You want to avoid a separate queue service

Use Postgres + queue when:

Multi-tenant with shared workflows
Zero-data-loss requirement (RPO = 0)
You need LISTEN/NOTIFY for real-time triggers
Multiple services need to read/write the same state
You're already running Postgres for the app

The Obelisk team acknowledges this: they support Postgres as a backend too. As they put it, "Many workflow systems do not need that on day one and should not start with more infrastructure than their state actually demands." That's the line we'd underline.

The Indian SMB Angle

Most Indian SMBs running SaaS platforms don't have dedicated DevOps. They're on a ₹1,500/month VPS from Hostinger or DigitalOcean, running PM2 and nginx. Adding Redis for a queue, or Kafka for event streaming, is not just a cost; it's cognitive load. If SQLite gets them 90% of the way for fewer than 100 users, they should use it.

But the moment they cross 100 tenants and workflows start spanning them, the migration from SQLite to Postgres isn't trivial. SQL dialect and type-system differences, connection pooling, and the mental model shift from "a file I can copy" to "a server I must manage" all hit at once.

Our recommendation: start with Postgres if you expect multi-tenancy within 12 months. The overhead is lower than a migration. If you're building single-tenant agents or internal tools, SQLite + Litestream is the correct default.

The Obelisk post is good engineering advice. It's just incomplete for the kind of shared-state systems Indian B2B SaaS tends to build.
