lbutler 34 minutes ago

I’m building a local-first web app, and SQLite works well for my case since a single project can be contained in one database file, just like users are used to with existing desktop applications.

What I’d really like is an easy way to sync the SQLite database state to a cloud service. Most existing options expect you to query against a remotely hosted database and charge per read/write.

Since the database will have around 100,000 rows and you're typically working with all the data at once, streaming parts of it doesn’t make sense for my use case.

The closest I’ve found is Turso, which has offline writes in private beta, and SQLite Cloud, which lists local-first and offline sync as "coming soon."

The simplest approach might be letting users push to S3 storage with versioning. Ideally, it would also support point-in-time restores, tracking incremental updates alongside full snapshots.
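That push-to-S3 idea can be sketched in a few lines. This is a hedged illustration, not a finished tool: SQLite's online backup API (`Connection.backup`) is real, but the bucket/key names are placeholders and the upload step assumes boto3 plus AWS credentials and a bucket with versioning already enabled.

```python
import os
import sqlite3
import tempfile

def snapshot(db_path: str) -> str:
    """Copy a live SQLite database to a consistent snapshot file using
    SQLite's online backup API (safe even while other writers are active)."""
    fd, dst_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(dst_path)
    src.backup(dst)
    src.close()
    dst.close()
    return dst_path

def push_snapshot_to_s3(db_path: str, bucket: str, key: str) -> None:
    """Upload a snapshot to a versioned S3 bucket, so each push becomes a
    restorable version. bucket/key are placeholders; assumes boto3 is
    installed and credentials are configured."""
    import boto3
    boto3.client("s3").upload_file(snapshot(db_path), bucket, key)
```

With bucket versioning on, every upload to the same key is kept as a separate object version, which gives a crude form of point-in-time restore for free.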

Even better would be managing minimal server-side infrastructure and just pulling the SQLite database from a service that handles syncing and management.

  • 0cf8612b2e1e 25 minutes ago

    Maybe I am misunderstanding which part you want in the cloud, but that sounds like Litestream. It lets you transparently back up a live SQLite database to a remote destination.

    https://litestream.io/

  • galaxyLogic 15 minutes ago

    How about: have 1 + N separate SQLite database files.

    Each user would have their own database file containing only that user's information, plus one shared database file with info needed by all users.

    Users would update their own data, which lives in a small database file that can be easily uploaded. They would not need to update the shared data.

    Not knowing your app, I don't know what the shared data would contain, but presumably something. Perhaps the shared data file would be updated on the server based on the individual user data the users upload.
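    The 1 + N layout can be sketched with SQLite's ATTACH, which lets one connection join across a user's private file and the shared file. A minimal sketch; the file and table names are hypothetical:

```python
import os
import sqlite3
import tempfile

# Hypothetical 1 + N layout: one small DB file per user plus one shared file.
d = tempfile.mkdtemp()
user_db = os.path.join(d, "user_42.db")
shared_db = os.path.join(d, "shared.db")

# The shared file, maintained on the server.
shared = sqlite3.connect(shared_db)
shared.execute("CREATE TABLE settings(key TEXT PRIMARY KEY, value TEXT)")
shared.execute("INSERT INTO settings VALUES ('theme', 'dark')")
shared.commit()
shared.close()

# The user's own file: small, and easy to upload on its own.
conn = sqlite3.connect(user_db)
conn.execute("CREATE TABLE notes(title TEXT)")
conn.execute("INSERT INTO notes VALUES ('hello')")
conn.execute("ATTACH DATABASE ? AS shared", (shared_db,))

# One query can join the user's private data with the shared data:
rows = conn.execute(
    "SELECT n.title, s.value FROM notes n JOIN shared.settings s ON s.key = 'theme'"
).fetchall()
```

    Only the per-user file needs to travel during sync; the shared file stays server-managed.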

    • anovick 3 minutes ago

      In this multi-tenant arrangement, you run into synchronization problems.

      Developers should expect users to connect to the service using multiple devices (clients).

      AFAIK bare SQLite doesn't offer synchronization mechanisms between multiple SQLite instances.

      I believe Turso offers some solution of this kind, but I'm not sure whether it's open source.

bob1029 42 minutes ago

> Benefits of SQLite-Per-Partition

The obvious caveat here is any situation where you need global tables. I've never worked on a product where 100% of the schema could be partitioned like this. I've done a ton of SQLite-per-<entity type>, but there's always been a singleton SQLite database above this tracking the metadata.

Looking up a user session via token is not an analytics task. Neither is checking if an email address is already registered. These are very pedestrian queries that demand global visibility, ideally with hard transactional guarantees.
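The singleton-above-the-partitions pattern can be sketched with one authoritative metadata DB. This is an illustrative sketch, not anyone's actual schema: the table and function names are hypothetical, and the hard guarantee comes from the PRIMARY KEY constraint plus a transaction.

```python
import sqlite3

# Hypothetical singleton "metadata" DB sitting above the per-entity databases.
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE users(email TEXT PRIMARY KEY)")
meta.execute("CREATE TABLE sessions(token TEXT PRIMARY KEY, email TEXT)")

def register(email: str) -> bool:
    """Atomically claim an email address; the PRIMARY KEY constraint and
    the transaction give the hard global guarantee."""
    try:
        with meta:  # runs as a single transaction
            meta.execute("INSERT INTO users VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False  # already registered

def session_email(token: str):
    """Pedestrian global lookup: token -> email, against one authoritative DB."""
    row = meta.execute(
        "SELECT email FROM sessions WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None
```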

  • manmal 26 minutes ago

    Is it even realistic to depend on transactional guarantees, with hundreds of services hammering the DB(s) more or less concurrently? Don’t they need to coordinate themselves outside of transactions?

    • bob1029 23 minutes ago

      > Is it even realistic to depend on transactional guarantees, with hundreds of services hammering the DB(s) more or less concurrently?

      If you have hundreds of services hammering the datastore concurrently, then the ability to offer transactional guarantees seems absolutely mandatory to me.

      > Don’t they need to coordinate themselves outside of transactions?

      I think we're back at the beginning of our journey here.

      • manmal 7 minutes ago

        Do you mean via distributed transactions? I'm clueless in that area, and don't understand how hundreds of self-sufficient services would partake in transactions spanning different sets of those services without entangling them badly.

  • slt2021 20 minutes ago

    Would Redis or an in-memory DB be better suited for this task? I don't think a regular RDBMS will meet this requirement at scale.

akavi 15 minutes ago

Hmm, this article is a little confusing. I'm not familiar with Vitess or Citus, but I am familiar with "manually" sharded Postgres/MySQL, and I'm not sure whether there are any "interaction effects" between the decision to shard and the choice between MySQL/Postgres and Sqlite.

Like, the article's three sections are:

1. The challenges of sharding

2. The benefits of these new sharded Sqlite solutions over conventional Sqlite

3. A list conflating the benefits of SQL databases generally with the benefits of Sqlite

None of which answer the question of "Why should I use sharded Sqlite instead of, say, sharded Postgres, for hyperscale?".

  • koakuma-chan 8 minutes ago

    If you think this is a good fit for your case, you should embed SQLite in your application and shard the application itself. Embedded SQLite is faster and uses less memory than PostgreSQL running as a separate process, possibly on a different machine.

byoung2 an hour ago

Looks interesting. How would you approach a single writer with multiple read replicas in this setup? My holy grail would be letting users read from a read replica, but also write to it, with the replica forwarding those writes to the primary write DB.

  • hinkley 37 minutes ago

    One of the classic solutions to this was to put the data behind a cluster and route by verb. All the GET requests go to the replicas and everything else goes to the writeable instance.
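    The verb-based routing idea reduces to a few lines of dispatch logic. A minimal sketch; the backend URLs are placeholders, and GET/HEAD are assumed side-effect free:

```python
import itertools

# Hypothetical topology: one writable primary, N read replicas.
PRIMARY = "http://db-primary:5000"
REPLICAS = ["http://db-replica-1:5000", "http://db-replica-2:5000"]
_round_robin = itertools.cycle(REPLICAS)

def backend_for(method: str) -> str:
    """Route reads to replicas (round-robin); everything that may write
    goes to the primary."""
    if method.upper() in ("GET", "HEAD"):
        return next(_round_robin)
    return PRIMARY
```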

    • datadrivenangel 31 minutes ago

      Suffers from stale reads: write to the primary, get confirmation of write success, reload the page, and the user sees stale data from the read replica and thinks you lost their data.
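      A common mitigation is read-your-writes routing: pin a user's reads to the primary for a short window after any write they make. A hedged sketch; the lag window is an invented number, and real systems would track replication lag properly:

```python
import time

# Hypothetical read-your-writes routing. After a user writes, their reads
# go to the primary until the replicas have had time to catch up.
REPLICATION_LAG_S = 2.0
_last_write: dict = {}

def note_write(user_id: str) -> None:
    """Record that this user just wrote to the primary."""
    _last_write[user_id] = time.monotonic()

def read_backend(user_id: str) -> str:
    """Primary for recent writers (fresh data guaranteed), replica otherwise."""
    if time.monotonic() - _last_write.get(user_id, float("-inf")) < REPLICATION_LAG_S:
        return "primary"
    return "replica"
```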

  • slt2021 18 minutes ago

    There is no need for a read replica, because every micro SQLite partition has a very low workload (single-digit queries per second).

NathanFlurry an hour ago

Author here, happy to answer questions!

  • robertlagrant an hour ago

    I love the idea of Durable Objects, and have thoughts about applying it in the health space (although having a single provider of them is probably a bad thing), but cross-cutting queries like this seem quite difficult to achieve.

    From the chat logs example in your article: how do you cope with a requirement such as "I as a user want to see all my latest thread activity in one place, across all my chat rooms?"

    • jmull 43 minutes ago

      That's an easy case: write to two places (one for the chat and one for the user's activity).

      In this case I think you can let them become inconsistent in the face of, e.g., write errors.
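      The write-to-two-places idea, sketched with two SQLite connections. Table and function names are hypothetical; the second write is deliberately allowed to fail, matching the "let them become inconsistent" stance:

```python
import sqlite3

def post_message(room_db, user_db, room_id, user_id, body):
    """Write the message into the chat room's DB, then mirror it into the
    user's activity DB. The mirror write may fail and drift."""
    with room_db:  # transaction on the authoritative copy
        room_db.execute(
            "INSERT INTO messages(room_id, user_id, body) VALUES (?, ?, ?)",
            (room_id, user_id, body))
    try:
        with user_db:  # best-effort denormalized copy for the activity view
            user_db.execute(
                "INSERT INTO activity(room_id, body) VALUES (?, ?)",
                (room_id, body))
    except sqlite3.Error:
        pass  # tolerate inconsistency on write errors
```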

    • korkybuchek an hour ago

      > From the chat logs example in your article: how do you cope with a requirement such as "I as a user want to see all my latest thread activity in one place, across all my chat rooms?"

      create a copy, for example

  • abdullin an hour ago

    Interesting read, thank you!

    Do you use any special tools to manage all these separate databases, track performance and debug problems?