
Why I Moved My SaaS from Cloud-Dependent to Local-First (And What Happened Next)


Series: Local-First
Type: Case Study
Meta Description: A real-world migration story: moving a SaaS product from cloud-dependent to local-first architecture, what broke, what improved, and the unexpected business impact.
Keywords: local-first migration, SaaS architecture, offline-first, cloud to local, sync strategy
Word Count Target: 2200
Published: Draft — NOT for publication


In March 2025, I made a decision that terrified my co-founder. I proposed ripping out the central database dependency from our SaaS product and replacing it with a local-first architecture. We had 3,200 paying customers, a Laravel monolith on AWS RDS, and a frontend that made an API call for literally everything. Click a tab? API call. Expand a row? API call. Sort a column? Three API calls.

The app felt fast enough on a fiber connection in our office in Berlin. But our users were not all on fiber in Berlin. They were project managers on construction sites using iPads on cellular. They were consultants in hotel lobbies on airport Wi-Fi. They were sales teams in rural offices where "broadband" meant 12 Mbps down on a good day.

Our support inbox told the story: "The app keeps freezing." "I lost my changes." "It takes 30 seconds to load my dashboard." We optimized queries, added caching layers, deployed CDN edge nodes. Each optimization helped a little. None solved the fundamental problem: our app required a network round trip for every interaction.

This is the story of migrating to local-first: the architecture we chose, what broke during the transition, what surprised us about the results, and what I would do differently next time.

The Product

Our SaaS, FieldTrack, is a field operations management tool for construction and engineering companies. Users manage project tasks, assign crews, log hours, and file daily reports. The typical user is a site supervisor with an iPad Pro on a noisy construction site, wearing gloves, quickly checking off completed tasks between conversations with subcontractors.

The app has about 120 database tables, 40 API endpoints, and a Vue.js frontend. Peak usage is 8-10 AM when supervisors arrive on site and review the day's plan, and 4-6 PM when they file end-of-day reports. These are also the times when cellular connectivity on construction sites is at its worst because hundreds of workers arrive simultaneously and saturate the local cell tower.

The Original Architecture

Before the migration, FieldTrack was a textbook Laravel SaaS:

  • Laravel 10 on EC2, behind an ALB
  • PostgreSQL on RDS (db.r6g.large)
  • Redis for sessions and cache
  • Vue.js SPA that called the API for everything
  • File uploads stored on S3
  • Real-time notifications via Pusher

Every user action went: Vue component -> Axios request -> ALB -> EC2 -> Redis/PostgreSQL -> Response -> Vue re-render. The median API response time was 180ms. The 95th percentile was 2.1 seconds. The 99th percentile was 8.4 seconds. Those outliers were almost entirely network-related: users on poor connections experiencing TCP retransmissions and TLS handshake delays.

We spent four months optimizing the server side. We added database read replicas, moved heavy queries to dedicated reporting instances, implemented eager loading across every controller, and used Laravel's Cache::remember() where possible. Server-side response time dropped from 180ms to 45ms. But the user experience barely improved, because most of the latency was network round-trip time, not server processing time.

The realization hit hard: we could not engineer away the speed of light. A user on a 200ms latency connection will wait 200ms minimum per request, regardless of how fast our server responds.

The Migration Plan

We decided on a phased approach rather than a big bang rewrite.

Phase 1: Local read cache. Keep the existing API but add a local SQLite database on the client that caches responses. Reads come from SQLite. Writes go to the API and update SQLite on success. This gave us instant reads but did not solve offline writes.

Phase 2: Offline write queue. Add a changelog layer. Writes go to SQLite immediately and are queued for sync. When connectivity returns, the queue flushes to the server. This gave us full offline functionality but no conflict resolution beyond last-write-wins.

Phase 3: Conflict resolution. Implement version tracking and a merge strategy for concurrent edits. This was the hardest phase and took the most iteration.

Phase 4: Background sync and real-time updates. Add WebSocket-based push for changes from other users and background sync intervals.

We budgeted three months. It took five.

Phase 1: What We Built

On the client side, we wrapped every API call in a cache layer backed by sql.js (SQLite compiled to WebAssembly). The first time a user loaded their project list, it came from the API and was cached in SQLite. Every subsequent load came from SQLite instantly.

// Simplified version of our caching layer
class LocalCache {
    constructor() {
        this.db = null;
    }

    async init() {
        const SQL = await initSqlJs({ locateFile: f => `/wasm/${f}` });
        // loadFromStorage / persist (not shown) serialize the database
        // image to durable browser storage such as IndexedDB
        const saved = await this.loadFromStorage();
        this.db = saved ? new SQL.Database(saved) : new SQL.Database();

        this.db.run(`CREATE TABLE IF NOT EXISTS cache (
            key TEXT PRIMARY KEY,
            data TEXT NOT NULL,
            etag TEXT,
            cached_at TEXT DEFAULT (datetime('now'))
        )`);
    }

    async get(endpoint) {
        const result = this.db.exec(
            'SELECT data, etag FROM cache WHERE key = ?', [endpoint]
        );
        return result.length > 0
            ? { data: JSON.parse(result[0].values[0][0]), etag: result[0].values[0][1] }
            : null;
    }

    async set(endpoint, data, etag = null) {
        this.db.run(
            'INSERT OR REPLACE INTO cache (key, data, etag) VALUES (?, ?, ?)',
            [endpoint, JSON.stringify(data), etag]
        );
        await this.persist();
    }
}

The impact was immediate. Dashboard load time dropped from 1.8 seconds to 40ms for cached users. Tab switching went from a visible loading state to instant. Users noticed and sent positive feedback before we even announced the change.

The problem: writes still required network. If a supervisor checked off a task underground in a parking garage, the check-off failed silently. We had not solved the core problem yet.

Phase 2: Offline Writes

We added a write queue that persisted to IndexedDB. Every mutation was recorded locally first, applied to the cached data immediately (optimistic update), and queued for sync.
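The shape of that queue can be sketched in a few lines. This is a simplified illustration, not FieldTrack's actual code: the `store` and `api` interfaces and the class name are assumptions, and persistence to IndexedDB is omitted.

```javascript
// Sketch of an offline write queue with optimistic local application.
// "store" stands in for the local SQLite cache; "api" for the network
// layer, whose send() rejects while the device is offline.
class WriteQueue {
    constructor(store, api) {
        this.store = store;
        this.api = api;
        this.pending = [];   // mutations waiting to be synced
    }

    // Record the mutation, apply it locally right away, queue it for sync.
    enqueue(mutation) {
        this.pending.push(mutation);
        this.store.apply(mutation);   // optimistic update: UI sees it now
    }

    // Flush queued mutations in order when connectivity returns.
    async flush() {
        while (this.pending.length > 0) {
            try {
                await this.api.send(this.pending[0]);
                this.pending.shift();   // confirmed by the server
            } catch {
                break;                  // still offline; retry later
            }
        }
        return this.pending.length;     // remaining unsynced count
    }
}
```

The important property is that `enqueue` never touches the network: the user's action completes instantly, and syncing becomes a background concern.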

This is where things got tricky. Our API was designed around request-response semantics, not eventual consistency. Endpoints like "reorder tasks" expected the server to validate the order and return the canonical state. With offline writes, the client had to maintain its own canonical state and resolve differences later.

We had to rethink several features:

  • Task reordering changed from server-assigned positions to client-generated fractional indices.
  • Auto-numbering (task IDs like PROJ-042) could no longer be assigned synchronously. We switched to UUIDs internally and generated display numbers as a post-sync step.
  • Validation moved client-side. The server became a verification layer, not the authority.
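The fractional-index idea from the first bullet is simple to illustrate. This is a minimal numeric sketch, not the scheme we shipped; production implementations typically use arbitrary-precision strings so repeated inserts never exhaust float precision.

```javascript
// Client-generated fractional indices: each task stores a "position",
// and inserting between two neighbors never renumbers anything else,
// so two offline clients can reorder concurrently without coordination.
function positionBetween(prev, next) {
    if (prev == null && next == null) return 1;   // first item in the list
    if (prev == null) return next / 2;            // insert at the top
    if (next == null) return prev + 1;            // append at the bottom
    return (prev + next) / 2;                     // insert between neighbors
}
```

For example, inserting between positions 1 and 2 yields 1.5, and a later insert between 1 and 1.5 yields 1.25, with no server round trip.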

This phase took six weeks instead of the three we planned. The extra time came from edge cases we had not anticipated. For example, a user might create a task offline, then try to assign it to a crew member. The assignment endpoint needed the task to exist on the server first. We had to redesign the assignment flow to handle "task does not exist yet" as a temporary state.

Phase 3: The Conflict Minefield

This is where we lost sleep. FieldTrack has several data types that multiple users edit simultaneously:

  • Task statuses (multiple supervisors updating the same project)
  • Crew assignments (dispatchers and supervisors both making changes)
  • Daily log entries (usually one per person, but sometimes a foreman edits a subordinate's entry)
  • Project settings (only managers, rare but high impact)

We chose last-write-wins for task statuses and daily logs. A status change is idempotent in practice — if two people mark a task as "done," the result is the same regardless of who wins. Last-write-wins is not elegant, but it works.
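A last-write-wins merge fits in a few lines. The article does not show its merge code, so this is an assumed sketch: each write carries a timestamp, and ties break on a stable device id so every replica picks the same winner.

```javascript
// Minimal last-write-wins merge for a single value.
// Both arguments are writes of the form { value, ts, deviceId }.
function lwwMerge(a, b) {
    if (a.ts !== b.ts) return a.ts > b.ts ? a : b;   // newer write wins
    return a.deviceId > b.deviceId ? a : b;          // deterministic tie-break
}
```

The tie-break matters: without it, two devices with identical timestamps could each keep their own write and never converge.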

Crew assignments were harder. If Supervisor A assigns Crew 3 to Task X and Supervisor B assigns Crew 3 to Task Y simultaneously, we need a strategy. We chose "latest assignment wins" with a notification: the loser gets a push notification saying "Crew 3 was reassigned to Task Y by [name]." This turned out to be the right UX because the notification gave supervisors a way to coordinate.

Project settings got the full OR-Set treatment. Changes to project settings are rare and important, so we implemented a proper observed-remove set that prevents accidental overwrites.
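To make the OR-Set idea concrete, here is a bare-bones sketch under the usual textbook definition; it is an illustration, not FieldTrack's implementation, and callers are assumed to supply globally unique tags.

```javascript
// Observed-remove set (OR-Set): every add carries a unique tag, and a
// remove only deletes the tags it has observed. A concurrent re-add on
// another device carries a fresh tag, so it survives the remove.
class ORSet {
    constructor() {
        this.adds = new Map();      // value -> Set of unique add-tags
        this.removed = new Set();   // tags that have been removed
    }
    add(value, tag) {
        if (!this.adds.has(value)) this.adds.set(value, new Set());
        this.adds.get(value).add(tag);
    }
    remove(value) {
        // Remove only the tags this replica has observed so far.
        for (const tag of this.adds.get(value) ?? []) this.removed.add(tag);
    }
    has(value) {
        for (const tag of this.adds.get(value) ?? [])
            if (!this.removed.has(tag)) return true;
        return false;
    }
    merge(other) {
        for (const [v, tags] of other.adds)
            for (const t of tags) this.add(v, t);
        for (const t of other.removed) this.removed.add(t);
    }
}
```

This is why it "prevents accidental overwrites": a manager who re-enables a setting while a colleague concurrently removes it does not silently lose the change.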

The biggest surprise was daily logs. We assumed these were single-user and used simple version tracking. Then we discovered that foremen regularly edit their crew members' logs to correct hours or add notes. This caused more conflicts in the first week of Phase 3 than everything else combined. We ended up switching daily logs to a CRDT-backed text field using Y.js for the notes section and field-level merge for structured data.
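The field-level merge for the structured part of a log can be sketched as follows. The record shape is an assumption (each field carrying its own timestamp); real code would also need a deterministic tie-break for equal timestamps, omitted here for brevity.

```javascript
// Field-level merge for structured records: each field carries its own
// timestamp, so a foreman correcting "hours" and a crew member editing
// "notes" both win. Only same-field concurrent edits actually conflict.
// Records look like: { hours: { value: 8, ts: 100 }, ... }
function mergeFields(local, remote) {
    const merged = { ...local };
    for (const [field, entry] of Object.entries(remote)) {
        const mine = merged[field];
        // Take the remote field if we have no value or theirs is newer.
        if (!mine || entry.ts > mine.ts) merged[field] = entry;
    }
    return merged;
}
```

The free-form notes text is the one place this is too coarse, which is why that field went to a Y.js CRDT instead.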

Phase 4: Real-Time Sync

With offline writes working and conflict resolution in place, we added real-time push using Laravel Reverb. When the sync server processes changes from one device, it broadcasts the update to all other connected devices for that project.

This created a subtle bug. Devices were receiving the same change twice: once via real-time push and once via the periodic sync pull. We solved this by adding a change ID to every mutation. The client tracks which change IDs it has applied and deduplicates.

// Server side (Laravel): sync response includes change IDs
return response()->json([
    'synced_at' => now()->toIso8601String(),
    'changes' => $serverChanges->map(fn($c) => [
        'change_id' => $c->uuid . '-' . $c->version,
        // ... rest of the change data
    ]),
]);
// Client side: deduplicate changes received via push and pull
const appliedChangeIds = new Set(loadFromStorage('applied_changes') || []);

function applyRemoteChange(change) {
    if (appliedChangeIds.has(change.change_id)) return;
    appliedChangeIds.add(change.change_id);
    // Apply the change to local SQLite
    applyChange(change);
}

The Results After Six Months

We completed the migration in August 2025. By February 2026, the numbers told a clear story.

User experience. Median interaction latency dropped from 180ms to under 10ms for reads, and writes became effectively instant because they commit locally before syncing. The support inbox volume dropped 40%. "App keeps freezing" complaints dropped 78%.

Infrastructure costs. Our RDS instance dropped from db.r6g.large ($280/month) to db.r6g.medium ($140/month). We eliminated one of our two read replicas ($90/month saved). Redis memory usage dropped 60% because the cache was no longer handling read-through traffic. Total infrastructure savings: roughly $400/month, not life-changing for a business with 3,200 customers, but a meaningful reduction in variable costs.

User retention. This was the surprising one. Our 90-day retention improved from 72% to 81%. Customer acquisition cost did not change, so the improvement came entirely from reduced churn. Exit survey data pointed to two factors: reliability ("it just works even on bad connections") and speed ("it feels like a native app now").

Offline usage. We added analytics to track offline sessions. In the first month of tracking, 34% of daily active users had at least one offline session per day. 12% had more than 30 minutes of offline usage per day. These were not edge cases. This was a third of our users operating in conditions where the old app would have been unusable.

Support conversations changed. Instead of "the app is slow" or "I lost my data," we started getting feature requests. That shift told us we had moved reliability from a liability to a non-issue.

What I Would Do Differently

The migration was successful, but it was harder than it needed to be. Here is what I would change:

Design for local-first from day one. Retrofitting local-first onto a request-response API is painful. If we had started with a sync-based architecture, we would have avoided months of refactoring. The key difference: in a request-response API, the server validates before accepting. In a local-first architecture, the client validates optimistically and the server reconciles. Designing around this distinction from the start eliminates entire categories of integration bugs.

Use an existing sync engine instead of building one. We wrote our own sync layer. It works, but it is 4,000 lines of code we now maintain. Products like PowerSync, ElectricSQL, and Triplit handle the sync plumbing so you can focus on your app logic. We evaluated them late in the process. By then, switching costs were too high.

Invest in conflict resolution testing earlier. Our conflict resolution bugs were the hardest to diagnose because they only appeared in specific timing windows with specific user behavior combinations. We eventually built a test harness that simulates concurrent edits by interleaving operations in different orders and verifying convergence. I wish we had built that harness in Phase 2, not Phase 3.
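The core of such a harness is small. This sketch assumes a reducer-style merge function; it is not our actual harness, and real suites would sample interleavings rather than enumerate all permutations once the operation count grows.

```javascript
// Convergence harness: apply every ordering of a set of operations to a
// fresh replica and check that all orderings reach the same final state.
function permutations(ops) {
    if (ops.length <= 1) return [ops];
    return ops.flatMap((op, i) =>
        permutations([...ops.slice(0, i), ...ops.slice(i + 1)])
            .map(rest => [op, ...rest]));
}

// "applyOp" is the merge function under test; a correct CRDT-style
// merge must be insensitive to the order operations arrive in.
function converges(ops, applyOp, initial) {
    const states = permutations(ops).map(order =>
        order.reduce(applyOp, initial));
    const first = JSON.stringify(states[0]);
    return states.every(s => JSON.stringify(s) === first);
}
```

Run against a last-write-wins reducer with a tie-break, every ordering agrees; run against a naive "last applied wins" reducer, the harness catches the divergence immediately.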

Communicate the offline state clearly in the UI. Early in the migration, we did not show users when they were offline. This led to confusion: "I made changes but my colleague doesn't see them." We added a persistent sync status indicator that shows "All changes saved," "Syncing... (3 pending)," or "Offline — changes saved locally." This small UI change reduced sync-related support tickets by 65%.
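Deriving the label is trivial once the sync layer exposes its state, which is part of why this fix was cheap. A hypothetical sketch, assuming the layer tracks connectivity and a pending-mutation count:

```javascript
// Map sync-layer state to the three user-facing status labels.
function syncStatusLabel(online, pendingCount) {
    if (!online) return 'Offline — changes saved locally';
    if (pendingCount > 0) return `Syncing... (${pendingCount} pending)`;
    return 'All changes saved';
}
```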

The Hidden Benefit: Developer Experience

An unexpected outcome was how the local-first architecture changed our development workflow. Because the frontend now works against a local database, frontend developers can build and test features without a backend server running. They seed their local SQLite with test data and iterate without network latency.

Our frontend test suite runs entirely against SQLite in memory. Tests that used to take 8 minutes (waiting on API responses, database seeding) now run in 45 seconds. This speed improvement changed how developers write tests — they write more of them because the feedback loop is tight.

Lessons for Anyone Considering This Migration

Local-first is not a silver bullet. It adds complexity to your data layer. You must think about conflict resolution, merge strategies, and eventual consistency from the start. Your database schema needs version and sync metadata columns. Your API becomes a sync endpoint instead of CRUD endpoints.

But for any application where users interact with data on unreliable connections — and that describes most mobile and field-use apps — local-first is a transformation. Not just for performance, but for user trust. When an app works without an internet connection, users trust it more. It feels reliable in a way that no loading spinner or progress bar can replicate.

The migration took five months for a team of three developers on a 120-table Laravel app. If you are building something new, start local-first from day one. If you are migrating an existing app, phase the transition: cache first, then offline writes, then conflict resolution. Measure the impact at each phase. You will see the benefits compound.

Masud Rana


I am a highly skilled full-stack software engineer specializing in Laravel, PHP, JavaScript, React, Vue, Inertia.js, and Shopify, with strong experience in Filament frontends and prompt engineering.