How to move an MVP to production when AI wrote the first version

The MVP did its job. It validated the assumptions you wanted to test. Real users showed up. The business model worked. Now the question changes. The MVP architecture that was fine for validation isn't necessarily fine for production traffic, growth, and durability. You have to move the system from "good enough to learn from" to "good enough to depend on."

The instinct is often to rewrite. The MVP feels rushed in retrospect; surely the production version should be more deliberate. The rewrite usually isn't the right answer. Most AI-built MVPs can be moved to production through deliberate hardening rather than starting over, and the hardening path is faster, cheaper, and less risky than the rewrite path.

I want to walk through what actually changes between MVP and production, the five-step transition pattern that gets you from one to the other, and a real case study in which I made exactly this transition on this site's central /log mesh component.

What changes between MVP and production

The visible code might not change at all. What changes is the conditions the code runs under and the expectations against which it's measured.

Volume changes. MVP traffic was your team plus a small set of early users. Production traffic is the actual user base, growing over time, with peaks and patterns you didn't see during validation.

Reliability expectations change. MVP downtime was acceptable because users were testing along with you. Production downtime affects real users who depend on the system. The acceptable failure rate is much lower.

Edge case exposure changes. MVP saw the cases you tested. Production sees the cases real users present, which is a much larger and weirder distribution. Edge cases the MVP didn't have to handle become regular events.

Operational requirements change. MVP could be monitored manually. Production needs to monitor itself. MVP could be deployed when you remembered. Production needs scheduled deployment with rollback capability. MVP credentials could be on a development machine. Production credentials need rotation, vaulting, and access control.

Data durability requirements change. MVP data loss was acceptable because data was fake or temporary. Production data loss affects real users. Backups, recovery, retention, and deletion become required.

Each of these changes might require code changes, infrastructure additions, or operational discipline. None of them necessarily requires a rewrite. The transition is about adding what production needs that the MVP didn't have, not rebuilding what already works.

The five-step transition pattern

Step one: audit what's in the MVP that production needs. Walk through the system. For each piece, ask: does this need to change for production, or just need additional surrounding work? Most pieces just need the surrounding work. A small set might need actual changes.

The output of this step is a list. Items that are production-ready as-is. Items that need additional surrounding work (monitoring, error handling, edge case coverage). Items that need actual code changes. The list is the transition plan.
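One way to keep that list honest is to record it as data rather than prose. The sketch below is only an illustration: the component names, notes, and the `AuditItem` and `Status` types are hypothetical stand-ins for the three buckets described above, not something taken from a real audit.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    READY = "production-ready as-is"
    SURROUNDING_WORK = "needs surrounding work"
    CODE_CHANGE = "needs code changes"


@dataclass(frozen=True)
class AuditItem:
    component: str  # hypothetical component name
    status: Status
    note: str = ""


def build_plan(items):
    """Group audit items by status so the plan reads as three buckets."""
    plan = defaultdict(list)
    for item in items:
        plan[item.status].append(item)
    return dict(plan)


# Hypothetical audit of an imaginary MVP.
audit = [
    AuditItem("signup form", Status.READY),
    AuditItem("payment webhook", Status.SURROUNDING_WORK, "add error tracking and retries"),
    AuditItem("search query", Status.CODE_CHANGE, "full table scan will not scale"),
    AuditItem("email sender", Status.SURROUNDING_WORK, "add alerting on failures"),
]

plan = build_plan(audit)
for status in Status:
    names = [item.component for item in plan.get(status, [])]
    print(f"{status.value}: {names}")
```

If the audit is done well, the "needs code changes" bucket should be the shortest of the three, which is the argument for hardening over rewriting.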

Step two: add the operational surface. Before changing code, add what makes the system operable in production. Structured logging if it's not there. Error tracking. Health endpoints. Monitoring dashboards. Alerting. The operational surface lets you see what production is doing, which is required before you can confidently change anything in production.

In my experience, this step accounts for roughly 80% of what a successful transition needs. An MVP with good observability can survive in production through patches; an MVP with no observability tends to fail the first time something unexpected happens.
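To make "operational surface" concrete, here is a minimal sketch of two of its pieces using only the Python standard library: one-JSON-object-per-line logging and the function a health endpoint might call. The logger name, the `context` field, and the check names are assumptions for illustration; a real system would more likely wire this into its web framework and an error tracker.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "ts": round(record.created, 3),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} land on the record.
        payload.update(getattr(record, "context", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(name="mvp"):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on re-import
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


def health(checks):
    """Run named dependency checks and summarize them for a health endpoint."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # a failing dependency must not crash the probe
            results[name] = f"fail: {exc}"
    healthy = all(value == "ok" for value in results.values())
    return {"status": "ok" if healthy else "degraded", "checks": results}


def failing_check():
    """Hypothetical dependency check that simulates an outage."""
    raise ConnectionError("cache unreachable")


log = make_logger()
log.info("order created", extra={"context": {"order_id": 42, "duration_ms": 17}})
report = health({"database": lambda: None, "cache": failing_check})
log.warning("health check", extra={"context": report})
```

The point of the JSON shape is that dashboards and alerts can filter on fields such as `level` or `order_id` instead of parsing free text.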

Step three: harden the error paths. Walk through the code looking specifically for places where the MVP assumed inputs would be well-formed. Add validation. Add appropriate error responses. Add retry logic where it makes sense. Add graceful degradation where it doesn't.

The goal isn't to make the system handle every possible error perfectly. The goal is to handle the common error modes (malformed input, network failures, vendor rate limits, timeouts) so they don't cascade into system-wide failures.
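A minimal sketch of those three moves (validate input, retry transient failures, degrade gracefully) might look like the following. Everything named here is hypothetical: the `quantity` field, its 1 to 1000 range, the retry counts, and the `recommendations` feature are placeholders for whatever the MVP actually handles.

```python
import random
import time


class ValidationError(ValueError):
    """Raised when a request payload is malformed."""


def parse_quantity(payload):
    """Validate untrusted input instead of assuming it is well-formed."""
    if not isinstance(payload, dict) or "quantity" not in payload:
        raise ValidationError("missing field: quantity")
    try:
        quantity = int(payload["quantity"])
    except (TypeError, ValueError):
        raise ValidationError("quantity must be an integer") from None
    if not 1 <= quantity <= 1000:
        raise ValidationError("quantity must be between 1 and 1000")
    return quantity


def with_retry(call, attempts=3, base_delay=0.05, sleep=time.sleep):
    """Retry transient failures with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return call()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide how to degrade
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


def recommendations(fetch, fallback=()):
    """Degrade gracefully: a failed non-critical call returns a safe default."""
    try:
        return with_retry(fetch)
    except (ConnectionError, TimeoutError):
        return list(fallback)


# Hypothetical vendor call that times out twice before succeeding.
attempts_seen = {"count": 0}


def flaky_vendor():
    attempts_seen["count"] += 1
    if attempts_seen["count"] < 3:
        raise TimeoutError("vendor timed out")
    return ["post-a", "post-b"]


print(parse_quantity({"quantity": "5"}))
print(with_retry(flaky_vendor, sleep=lambda _seconds: None))
```

Note that only errors likely to be transient are retried. A validation failure is returned to the caller immediately, because retrying malformed input just repeats the failure.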

Step four: address data durability. Verify backups exist for the data the system stores. Verify recovery procedures actually work (most "we have backups" claims fail at restore time). Add retention policies. Add data export. Add account deletion if you'll have users. Each is small individually; together they're what makes production data trustworthy.
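The "verify recovery actually works" part is the one most often skipped, so here is a small sketch of it. It assumes a SQLite database purely because that keeps the example self-contained; the `users` table, the file names, and the row-count check are illustrative, and a real system would run the equivalent restore drill against its own datastore.

```python
import os
import sqlite3
import tempfile


def backup_database(source_path, backup_path):
    """Copy a live SQLite database using the online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


def verify_restore(backup_path, expected_counts):
    """Open the backup as if restoring it and check integrity and row counts."""
    restored = sqlite3.connect(backup_path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        if integrity != "ok":
            return False
        for table, expected in expected_counts.items():
            # Table names come from our own config here, never from user input.
            actual = restored.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            if actual != expected:
                return False
        return True
    finally:
        restored.close()


# Build a throwaway "live" database so the sketch can run anywhere.
workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "live.db")
backup_path = os.path.join(workdir, "backup.db")

live = sqlite3.connect(live_path)
live.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
live.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
live.commit()
live.close()

backup_database(live_path, backup_path)
restore_ok = verify_restore(backup_path, {"users": 2})
print("restore verified:", restore_ok)
```

Running a check like this on a schedule turns "we have backups" into something that has been tested rather than assumed.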

Step five: load test against realistic traffic. Simulate production load against the system. Watch for what breaks. Each break is something to fix before real production load arrives. Common failure modes: database queries that don't scale, API rate limits that get hit faster than expected, race conditions under concurrency, memory leaks under sustained traffic.

The load test is the safety net. If the system survives realistic load in testing, it is far more likely to survive real load in production. If it doesn't survive, you find out before users do, while the fix is cheap.
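As a rough illustration of what a load test measures, the sketch below fires concurrent calls at a handler and reports error count and latency percentiles. It is deliberately in-process: `fake_handler`, the request counts, and the simulated failure rate are all made up, and a real test would point a dedicated load-testing tool at a staging environment instead.

```python
import statistics
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(handler, total_requests=200, concurrency=20):
    """Call `handler` concurrently; return latencies (seconds) and error count."""
    latencies = []
    errors = 0
    lock = threading.Lock()

    def one_request(i):
        nonlocal errors
        start = time.perf_counter()
        try:
            handler(i)
            ok = True
        except Exception:
            ok = False
        elapsed = time.perf_counter() - start
        with lock:  # shared counters need protection under concurrency
            if ok:
                latencies.append(elapsed)
            else:
                errors += 1

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(one_request, range(total_requests)))
    return latencies, errors


def summarize(latencies, errors):
    cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return {
        "requests": len(latencies) + errors,
        "errors": errors,
        "p50_ms": round(cuts[49] * 1000, 2),
        "p95_ms": round(cuts[94] * 1000, 2),
    }


def fake_handler(i):
    """Stand-in for a real request; every 50th call fails."""
    time.sleep(0.001)
    if i % 50 == 0:
        raise RuntimeError("simulated failure")


latencies, errors = run_load(fake_handler)
summary = summarize(latencies, errors)
print(summary)
```

The numbers worth watching are the tail latency (p95 and above) and the error count as concurrency rises, since that is where slow queries, rate limits, and race conditions tend to show up first.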

The /log mesh polish case study

The central visualization on this site is the /log mesh: a 3D interactive node graph of all the posts. It was built MVP-quality: rendered, responded to clicks, looked roughly right. Shipping the early version was fine for validation.

Then real users started arriving and I saw the actual experience cold. Several MVP-quality issues surfaced:

A hydration flash on initial load that lasted long enough to feel broken. It was visible for maybe 800ms, but that was enough to register as "this isn't loading right."

Node identification was inconsistent. Hover worked sometimes and not others, depending on whether the underlying R3F (React Three Fiber) state had stabilized.

The mobile fallback (a 2D radial layout) didn't gracefully degrade from the desktop 3D version. Visitors on mobile got a different visual treatment that didn't quite match the brand the desktop version established.

None of these required rewriting the mesh. They required hardening the existing implementation. The /log mesh polish leg ran through the five steps above: I audited what needed work (the three issues above plus a few smaller ones), added observability to debug R3F state issues, hardened the hydration and hover code paths, addressed how the mobile fallback rendered, and tested under realistic loading conditions.

The transition took a focused, multi-day effort. The /log mesh now renders cleanly across devices, doesn't flash on hydration, identifies nodes consistently, and gracefully serves mobile and desktop with appropriate experiences. No rewrite required. The MVP architecture absorbed the production-quality requirements through hardening.

That's the pattern. The five steps applied to my own central component. The same pattern works for most AI-built MVPs.

When you should actually rewrite

The five-step transition pattern works for most MVPs. There are cases where rewriting is the right answer:

The MVP's architecture genuinely can't support production. Not "would be better redesigned". Actually can't. Examples: a schema that is fundamentally wrong for the queries production needs, a framework that has been deprecated, or dependencies with no migration path.

The MVP validated something different from what it was built for. The product pivoted during validation. The architecture supports the old direction, not the new one.

The MVP is so opaque that hardening it is more expensive than rebuilding it. The codebase has no documentation, the tests are minimal, and no one understands the original decisions. Sometimes the rebuild is genuinely cheaper than reverse-engineering.

These cases are real but rarer than the "I want to rewrite because the MVP feels rushed" cases. The instinct to rewrite is usually wrong. Hardening in place is usually correct.

What "production" actually means

Production isn't a switch you flip. It's a continuous state of being responsive to real-world conditions. A system is in production when:

Real users depend on it, and their dependence has consequences if the system fails.

The system has observability sufficient to know when it's healthy or not.

There are operational procedures for handling failures, deploying changes, rotating credentials, and recovering data.

The team running it can extend it without breaking existing behavior.

The five-step transition above gets you to that state. It's not glamorous work. It's the work that turns "we have an MVP that works" into "we have a system the business runs on." Both descriptions can be true of the same codebase; what changes is the discipline around it.

Got an MVP that needs to move to production and you're trying to figure out whether to harden or rewrite? Send the current state, the user traffic projection, and the operational requirements. VibeKoded can scope the prototype, build the MVP, or hand off the production app.