The fastest way to make an AI design tool dangerous is to let it write straight into the product repo and call the result progress.
Frontend generation is useful. It can explore layouts, tighten copy, find better hierarchy, and produce a cleaner first pass than a blank screen. But the useful version has a hard boundary: generation creates artifacts; operators choose what applies.
The boundary
The working rule is simple: default to no-apply. A design run may inspect the repo and produce variants, screenshots, scorecards, diffs, and a winner bundle. It may not silently patch the live site.
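The no-apply boundary can be enforced mechanically rather than by convention. A minimal sketch, assuming a hypothetical layout where every artifact a run produces lands under a dedicated output directory (the names `RunArtifacts` and `OUTPUT_ROOT` are illustrative, not a real API):

```python
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical artifact root: runs write here, never into the product repo.
OUTPUT_ROOT = Path("design-runs")

@dataclass
class RunArtifacts:
    run_id: str
    files: list = field(default_factory=list)

    def write(self, relpath: str, content: str) -> Path:
        """Write an artifact inside this run's output dir, refusing escapes."""
        root = (OUTPUT_ROOT / self.run_id).resolve()
        dest = (root / relpath).resolve()
        if root not in dest.parents and dest != root:
            # Path traversal (e.g. "../../src/App.tsx") is the silent-patch
            # failure mode; block it at the write boundary.
            raise PermissionError(f"refusing to write outside {root}")
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(content)
        self.files.append(dest)
        return dest
```

The point of the guard is that "may not patch the live site" becomes a raised error, not a norm the generator is trusted to follow.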
What the workflow needs
- A repo-grounded context file so each run understands the design system and constraints.
- Static-site verification before anything is considered a candidate.
- Scorecards that separate visual quality from operational safety.
- A winner bundle that can be manually applied, reviewed, and reverted.
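The scorecard split can be made concrete. A sketch, under the assumption that visual quality ranks candidates while operational safety gates them; the field names here are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass
class Scorecard:
    variant: str
    # Visual quality: subjective scores, used to rank candidates.
    hierarchy: float        # 0..1
    copy_fit: float         # 0..1
    # Operational safety: objective checks, used to gate candidates.
    links_ok: bool
    mobile_fits: bool
    diff_reviewable: bool   # small enough to read in one sitting

    def is_candidate(self) -> bool:
        # Safety gates are binary; a high visual score never rescues
        # a variant that fails one of them.
        return self.links_ok and self.mobile_fits and self.diff_reviewable

    def visual_score(self) -> float:
        return (self.hierarchy + self.copy_fit) / 2
```

Keeping the two groups in separate fields, rather than blending them into one number, is what stops a beautiful-but-broken variant from outranking a plain-but-safe one.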
Why this matters
Most AI-generated redesigns look confident and break small things. Navigation labels drift. Mobile spacing gets theatrical. Copy becomes generic. A route works in the screenshot but not in the shipped artifact. None of that is fatal if the output is treated as a candidate.
The operator job is to preserve the boring guarantees: links work, mobile does not overflow, the page still says something true, and the diff is small enough to understand.
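One of those boring guarantees, "links work," can be checked against the built artifact rather than the screenshot. A stdlib-only sketch that walks a static output directory and reports internal links whose targets do not exist; the `out/`-style directory layout is an assumption, not a requirement:

```python
from html.parser import HTMLParser
from pathlib import Path

class LinkCollector(HTMLParser):
    """Collect internal hrefs from anchor tags in one HTML page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # Skip external, fragment, and mailto links.
                if name == "href" and value and not value.startswith(("http", "#", "mailto:")):
                    self.hrefs.append(value)

def broken_links(site_root: Path) -> list:
    """Return (page, href) pairs whose internal target file is missing."""
    broken = []
    for page in site_root.rglob("*.html"):
        parser = LinkCollector()
        parser.feed(page.read_text())
        for href in parser.hrefs:
            if href.startswith("/"):
                target = site_root / href.lstrip("/")
            else:
                target = page.parent / href
            if not target.exists():
                broken.append((page, href))
    return broken
```

A check like this runs against the shipped files, so a route that only worked in the screenshot fails here before the variant is ever a candidate.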
The useful shape
The tool should behave more like a disciplined junior designer than an autonomous deploy bot. It brings options. It labels tradeoffs. It packages evidence. Then it waits.
That is slower than magic. It is also how things stay running.