Why your dashboard works in demo and breaks in production

The dashboard demo went well. The charts rendered cleanly. The numbers told the right story. The interactions felt smooth. Stakeholders approved. The dashboard shipped to real users with real data, and within a week things started going wrong in ways the demo never showed.

This pattern is especially common for AI-built dashboards because the demo and production environments differ in ways that hit dashboards hardest. The demo runs against test data the team controlled. Production runs against real data with properties the test data didn't have. The gap between the two environments is consistent, predictable, and addressable, but only if you know to look for it.

I want to walk through five things that go wrong only in production, specifically for dashboards, and then the discipline that catches each one before a successful demo is taken as proof that the dashboard is ready.

Break one: real data distribution looks different

Test data is usually constructed by the team. The team picks reasonable values, balanced distributions, clean shapes. Real data is messy. It has outliers. It has nulls. It has values from edge cases nobody thought about. It has historical anomalies from when the data system worked differently.

The dashboard renders fine on test data because the test data was shaped to render fine. The same dashboard renders strangely on real data: charts squashed by outliers, percentages that don't add up because of nulls, summary statistics dominated by historical anomalies that aren't currently relevant.

The diagnostic: take a snapshot of real production data. Render the dashboard against it (in a development environment, not live). Note every visual that doesn't read the way the demo version did. Each is a real-data failure that needs to be addressed.

The fix is usually a mix of data-handling improvements (filter nulls explicitly, cap outliers visually, document distribution assumptions) and design adjustments (use log scales where appropriate, choose color scales that handle the actual data range, format numbers in ways real values support).
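As a rough illustration of the data-handling side, here is a minimal sketch that counts nulls explicitly and computes a visual cap for a chart axis. It assumes the values arrive as a list in which missing entries are `None`. The function name `summarize_for_chart` and the 99th-percentile cap are hypothetical choices for the example, not a recommendation for every dataset.

```python
def summarize_for_chart(values, cap_percent=99):
    """Separate nulls from numbers and pick an axis maximum that ignores extreme outliers."""
    present = sorted(v for v in values if v is not None)
    null_count = len(values) - len(present)
    if not present:
        # Nothing to plot: report the nulls and let the caller show an empty state.
        return {"null_count": null_count, "axis_max": None, "clipped": 0}
    # Nearest-rank percentile using integer arithmetic (ceiling division).
    rank = max(1, (cap_percent * len(present) + 99) // 100)
    axis_max = present[rank - 1]
    # Values above the cap should be drawn as clipped, and the count shown to the user.
    clipped = sum(1 for v in present if v > axis_max)
    return {"null_count": null_count, "axis_max": axis_max, "clipped": clipped}


summary = summarize_for_chart(list(range(1, 100)) + [10_000, None, None])
print(summary)
```

Reporting `null_count` and `clipped` alongside the chart keeps the cap honest: the outlier is hidden from the axis, not from the reader.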

Break two: scale of records

The demo had 50 records. Production has 50,000 or 5 million. Queries that returned instantly against 50 take seconds against 50,000 and might time out against 5 million. The dashboard that felt snappy in demo feels broken in production because every interaction triggers a slow query.

The diagnostic: query the production data store directly with the queries your dashboard runs. Measure response time. Anything over a second per query is going to feel slow in the dashboard. Anything over five seconds is going to feel broken.
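The measurement can be sketched as a small timing helper. The one-second and five-second thresholds come from the paragraph above; the helper names and the idea of passing the query as a zero-argument callable are assumptions made for the example.

```python
import time

SLOW_SECONDS = 1.0    # above this, the dashboard will feel slow
BROKEN_SECONDS = 5.0  # above this, the dashboard will feel broken


def classify_latency(seconds):
    """Map a measured query time onto the thresholds above."""
    if seconds > BROKEN_SECONDS:
        return "broken"
    if seconds > SLOW_SECONDS:
        return "slow"
    return "ok"


def time_query(run_query):
    """Run a zero-argument callable once and return (elapsed_seconds, label)."""
    start = time.perf_counter()
    run_query()
    elapsed = time.perf_counter() - start
    return elapsed, classify_latency(elapsed)
```

In practice `run_query` would wrap whatever call your dashboard makes against the production store, and you would run it several times, since a single measurement can be skewed by a cold cache.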

The fix is usually database-shaped: add indexes to support the queries the dashboard runs, denormalize where appropriate, cache aggregations that don't need to be real-time, paginate or virtualize lists that show thousands of items. None of this is exotic, but all of it requires deliberate work that demos don't surface the need for.
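To make the indexing point concrete, here is a small sketch using SQLite from Python's standard library. The `orders` table, its columns, and the index name are hypothetical; your production store and its query planner will differ, so treat this as a way to check whether a query uses an index, not as a tuning guide.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (the 'detail' column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# A typical dashboard aggregation: total amount for one customer.
sql = "SELECT SUM(amount) FROM orders WHERE customer_id = ?"

plan_before = query_plan(conn, sql, (7,))  # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (7,))   # the plan should now mention the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, which is why the check is on whether the index name appears rather than on the full text.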

Break three: refresh patterns

The demo dashboard probably refreshed on page load. Real production dashboards need to show current data, which means deciding when to refresh and how.

There are four choices:

Refresh continuously. This is expensive at scale and often unnecessary.

Refresh on user request. This puts the burden on the user and often results in stale data being acted on.

Refresh on a schedule. The data is slightly stale most of the time and sometimes very stale.

Use a push pattern where the data source notifies the dashboard of changes. This is the most architecturally complex option and gives the best user experience.

The diagnostic: ask how stale the dashboard's data is allowed to be. Match the refresh pattern to that requirement. If real users are making decisions on the dashboard's data, the refresh latency matters; if it's mostly read-only reference, slower refresh is fine.

The fix is to design the refresh pattern deliberately rather than accepting the AI's default (which is usually refresh on page load and otherwise never).
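One way to make the staleness requirement explicit is to encode it as a maximum age and have each panel report how old its data is. The sketch below is an assumption-laden example: `CachedPanel`, its `fetch` callable, and the injectable `clock` are hypothetical names, and it covers only the pull-based patterns, not push.

```python
from datetime import datetime, timedelta, timezone


class CachedPanel:
    """Holds one panel's data and refetches only when it exceeds the staleness budget."""

    def __init__(self, fetch, max_age, clock=lambda: datetime.now(timezone.utc)):
        self.fetch = fetch        # zero-argument callable that loads fresh data
        self.max_age = max_age    # how stale the data is allowed to be
        self.clock = clock        # injectable so the behavior can be tested
        self._value = None
        self._loaded_at = None

    def get(self):
        """Return (value, age). The age lets the UI show when the data was last updated."""
        now = self.clock()
        if self._loaded_at is None or now - self._loaded_at > self.max_age:
            self._value = self.fetch()
            self._loaded_at = now
        return self._value, now - self._loaded_at
```

Returning the age together with the value means the dashboard can display it, so a user acting on the numbers can see how current they are.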

Break four: missing edge cases that production exposes

Production data exposes edge cases the team didn't think to test. A user with zero records. A user with a billion records. A user whose data shape differs slightly from the majority. Time zones, currencies, languages, units of measurement that the demo didn't include. Data from before a schema change that the dashboard doesn't know how to interpret. Data from after a system change that the dashboard doesn't know how to handle.

Each edge case can produce visible failures (the dashboard renders wrong) or invisible ones (the dashboard renders something that looks fine but shows incorrect numbers).

The diagnostic: enumerate the edge cases that production data could have. Either find real examples of each in your production data and test the dashboard against them, or construct synthetic test cases that cover the edge cases.

The fix is to handle each edge case explicitly in the dashboard logic. Empty states for users with no data. Reasonable behavior for users with extreme data. Currency and unit normalization. Time zone handling. Schema-migration awareness.
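Two of these cases, empty states and time zone handling, can be sketched in a few lines. The function names and the "No data yet" wording are illustrative assumptions; the point is that the zero-record and unknown-time-zone cases are handled explicitly instead of being left to fail at render time.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts):
    """Normalize a timezone-aware datetime to UTC; reject naive ones instead of guessing."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamp: time zone unknown")
    return ts.astimezone(timezone.utc)


def completion_rate(done, total):
    """Return a percentage, or None when there are no records to divide by."""
    if total == 0:
        return None
    return 100.0 * done / total


def render_rate(done, total):
    """Format the rate for display, with an explicit empty state."""
    rate = completion_rate(done, total)
    return "No data yet" if rate is None else f"{rate:.1f}%"


print(render_rate(0, 0))
print(render_rate(1, 4))
```

Keeping `None` distinct from `0.0` matters here: a user with no records and a user with a 0% rate are different situations and should not render the same way.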

Break five: user permissions and access

The demo dashboard showed everything to one user (the demoer). Production has multiple users with different permission levels, and the dashboard needs to show each user only the data they're authorized to see.

The failure: the dashboard renders fine for the developer (who has full access) and renders broken or empty for users with restricted access (because the dashboard wasn't designed to handle the restriction gracefully).

The diagnostic: log in as users with different permission levels. Verify the dashboard renders appropriately for each. The user with limited access should see a useful dashboard limited to what they can access, not a broken dashboard.

The fix is to design the dashboard to be permission-aware from the start. The data layer filters based on the user's access. The presentation layer handles "no data available" gracefully. Permission boundaries are enforced consistently rather than as afterthoughts.
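A minimal sketch of that split, assuming a hypothetical permission model in which each user may see a set of regions. The `User` class, the row shape, and the panel's return format are all invented for the example; a real system would normally enforce the filter in the query or the database itself, not in application memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    regions: frozenset  # regions this user is authorized to see (hypothetical model)


def visible_rows(rows, user):
    """Data layer: return only the rows the user is allowed to see."""
    return [row for row in rows if row["region"] in user.regions]


def revenue_panel(rows, user):
    """Presentation layer: a useful total, or an explicit empty state."""
    allowed = visible_rows(rows, user)
    if not allowed:
        return {"state": "empty", "message": "No data available for your access level"}
    return {"state": "ok", "total": sum(row["revenue"] for row in allowed)}


rows = [
    {"region": "emea", "revenue": 120},
    {"region": "apac", "revenue": 80},
    {"region": "emea", "revenue": 50},
]
print(revenue_panel(rows, User("admin", frozenset({"emea", "apac"}))))
print(revenue_panel(rows, User("analyst", frozenset({"emea"}))))
print(revenue_panel(rows, User("new-hire", frozenset())))
```

Because every panel goes through `visible_rows`, the restricted user gets a smaller but working dashboard, and the user with no access gets a clear message instead of a broken chart.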

Why dashboards specifically fail this way

Dashboards are especially prone to the demo-versus-production gap because they have several properties that amplify it:

They visualize data, so the demo necessarily uses test data and the gap to real data is large.

They aggregate across records, so scale issues that aren't visible in single-record views become visible in dashboards.

They serve multiple users with different permissions, so the demo (one user) doesn't represent production (many users).

They need to stay current, so refresh patterns matter in ways one-time-render apps don't have to deal with.

The combination means dashboards have more places where demo and production can diverge than most other app types. The discipline to catch the divergences upfront is correspondingly more important.

The discipline that catches it pre-production

What prevents dashboard demo-to-production failure is a pre-launch audit focused on the dashboard-specific concerns above:

Test against real production data (or a sanitized snapshot of it) before the demo, not after. Note every place the dashboard renders differently than against test data. Fix each.

Measure query performance against production-scale data, not test-scale. Optimize the queries that are slow before the user notices.

Design the refresh pattern deliberately. Match it to the staleness tolerance of the actual users.

Enumerate edge cases and handle them explicitly. Empty states, extreme values, missing fields, schema variations.

Test with restricted-permission users. Verify the dashboard works for them, not just for the full-access developer.

These steps add hours to days to the dashboard build. They can save weeks of post-launch firefighting. The trade is consistent: for dashboards, deliberate pre-production work is almost always cheaper than reactive post-production fixes.

If your AI-built dashboard is currently failing in production in any of these five ways, the diagnostics above tell you specifically what to fix. The fixes are bounded; the cumulative effect is a dashboard that works under the production conditions it will actually face.
