TL;DR: We built 1 production product, 1 semi-production tool, and 3 POCs in 2.5 months using vibe coding. Here’s how we did it and what we learned about the gap between AI-generated prototypes and real production systems.
Lessons Learned
- POCs shine, production still fights back.
- Keep agents on a tight leash – specific prompts + diff review.
- Bigger scope ⇒ harder everything, trim aggressively.
- Fewer lines cost less to own than more.
- One-page spec before the first prompt saves hours (grab the template).

Vibe Coding Is Everywhere
The term “vibe coding” – popularized by Andrej Karpathy – perfectly captures what’s happening in software development right now. Rick Rubin, the famous music producer, has even written a book about the art of vibe coding.

Definition: Vibe coding = an LLM-first, rapid-iteration workflow where you prompt, get code, run it, and iterate based on the “feel” rather than a full up-front spec.
LLMs are exceptionally good at writing code. Just look at the benchmarks: SWE-bench, Aider leaderboards, and countless others show AI crushing coding tasks.

And developers are using it. A lot. According to Anthropic’s Economic Index, computer-related jobs are a massive outlier in LLM usage compared to every other profession.

Market Response
The market has responded enthusiastically, as shown in the Infrared report. We now have:

Sometimes the market reacts badly – like the Internet of Bugs YouTube channel debunking Devin’s Upwork demo, or Answer.AI’s “Thoughts On A Month With Devin”.

But usually, we get good products.

| ✏️ Editors | 🤖 Agents | 🎨 UI Builders |
| --- | --- | --- |
| Cursor: AI-first code editor | Cursor Agents: Autonomous coding agents | Lovable.dev: Build apps with AI |
| Windsurf: The IDE for AI agents | OpenAI Codex: Powers GitHub Copilot | Bolt.new: Full-stack web dev in browser |
| VS Code Copilot: GitHub’s AI pair programmer | Jules by Google: AI coding companion | V0.dev: UI generation by Vercel |
| Zed: High-performance multiplayer editor | Claude Code: Anthropic’s coding assistant | Gemini Canvas: Google’s AI workspace |
| Kiro: Agentic IDE for production code | Gemini CLI: Google’s AI in terminal | Spark: Dream it. See it. Ship it. |
| | Devin: AI software engineer | |

Editors → Agents ← UI Builders → Editors
Lines are blurring: editors now ship built-in agents, agents offer editor plug-ins, and UI builders bundle both – progress in any layer instantly upgrades the whole stack.
Want to see how easy it is? Here are three apps I built with Bolt in one shot:
Mind the Gap

Here’s the crucial distinction: “Good at coding” ≠ “Good Software Engineer”
Good software engineers solve problems with code. Sometimes the best code is the code you never write.
There’s an interesting benchmark here: if AI is so good at coding, where are the open source contributions? Projects like NumPy, PyTorch, Hugging Face Transformers, and PostgreSQL have existed for years, some of them for decades. Why aren’t there major AI-authored contributions to these frameworks?

Reference: https://pivot-to-ai.com/2025/05/13/if-ai-is-so-good-at-coding-where-are-the-open-source-contributions/
The Real Cost of Software
As Jeff Atwood says in “The Best Code is No Code At All”, the real cost of software isn’t writing it – it’s owning it:
- Infrastructure
- Support & maintenance
- Security updates
- Monitoring & observability
- Upgrades & migrations
Writing code: ~20% | Maintaining it: ~80%
I think of vibe coding as a new abstraction layer. We went from machine code → assembly → C → high-level languages → and now AI-assisted programming. Back in the 90s, if you wrote in Python, people would say you’re not a real software engineer!
This is a higher abstraction with the flexibility of programming languages, but non-deterministic. As Martin Fowler writes about LLMs and abstraction, we’re not just moving UP in abstraction, but SIDEWAYS into non-determinism.

AWS’s Kiro is attempting to bridge this gap with spec-driven development, similar to how project management works at Big Tech.
Case Study
8-Person Team, 10-Week Sprint
Context first: I joined a Toronto venture studio focused on HCI (Human-Computer Interaction) to stand up an ML arm from scratch.
Goal: build prototypes for portfolio companies and ship at least one production product – all inside one quarter.
The Team
- 4 computer science interns
- 1 full stack engineer
- 2 designers
- 1 ML advisor (me)
Every interview included rigorous checks for AI tool usage. We went all-in on the Cursor Team Plan.
Our Workflow
Team → Cursor (All LLMs, UI, Editor, Agents) → GitHub → Railway

The Numbers
Looking at our Cursor analytics:
- ~500 lines of code accepted per user daily.
- ~1,500 lines generated.
- Claude Sonnet 4 was the most popular model.
- Agent requests dominated overall usage.
GitHub stats over 90 days:
- 335 Pull Requests
- 811 Commits
- 26.1 PRs/week velocity
- 10.2 hours average merge time
- 90% CI/CD success rate
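A quick sanity check on these figures, as a minimal Python sketch. The variable names are illustrative, and the acceptance rate assumes the ~1,500 generated lines are counted per user per day like the ~500 accepted lines, which the analytics summary above does not state explicitly.

```python
# Figures quoted in the post; names are illustrative, not from any tool's API.
accepted_lines_per_user_day = 500
generated_lines_per_user_day = 1500  # assumption: same per-user, per-day basis
pull_requests = 335
commits = 811
window_days = 90

# Share of generated lines that engineers actually kept.
acceptance_rate = accepted_lines_per_user_day / generated_lines_per_user_day

# 90 days is about 12.9 weeks, so divide PRs by that to get weekly velocity.
prs_per_week = pull_requests / (window_days / 7)

# Average number of commits behind each pull request.
commits_per_pr = commits / pull_requests

print(f"Acceptance rate: {acceptance_rate:.0%}")
print(f"PRs per week: {prs_per_week:.1f}")
print(f"Commits per PR: {commits_per_pr:.1f}")
```

Under that assumption, roughly one in three generated lines was accepted, and the weekly PR figure reproduces the 26.1 reported above.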
What We Built in 2.5 Months
✅ 1 production product – used by 100 companies so far.
🚀 1 semi-production product – Newsletter with human in the loop.
🔬 3 POCs – Customer interviews & testing.
🔍 1 internal LLM evaluation framework based on Langfuse.
Patterns for Non-Technical Folks
We encouraged everyone – even non-engineers – to use vibe coding. Three patterns emerged:
Pattern 1: Slack
Create a dedicated Slack channel (e.g., #ml-team-vibe-coding) and message:
@Cursor repo=<repo-name> "Write what you want to do"
Wait for the green checkmark ✅, then ask an engineer to review.
Pattern 2: Web
Go to cursor.com/agents, select your repo, and chat away.
Pattern 3: Mobile
Same as web, but from your phone. Code from the beach! 🏖️
What’s Missing
There’s no “Accelerate” for AI coding yet. The fundamental research from that book on what makes engineering organizations effective – Continuous Delivery, Architecture, Product and Process, Lean Management and Monitoring, Culture – needs a refresh for the AI era.

The key insight comes from Karpathy’s talk on keeping agents on the leash:
Instead of giving vague prompts like “Write me a Python rate limiter that limits users to n requests per minute”, be specific: “Implement a token-based rate limiter in Python with the following requirements…” The more constraints you provide, the better the output.

If your verification is small, easy, and fast, you’re in a good position. Otherwise, you’re not.
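To make this concrete, here is a minimal sketch of what a tightly constrained prompt might produce, and why it is quick to verify. It reads "token-based" as a token-bucket limiter, which is an assumption; the requirements in the prompt above are elided, so `capacity`, `period`, and the injectable `clock` are hypothetical constraints chosen for illustration, not the author's spec.

```python
import time
from typing import Callable


class TokenBucket:
    """Token-bucket limiter: allows bursts up to `capacity`, refilled over `period` seconds."""

    def __init__(self, capacity: int, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity <= 0 or period <= 0:
            raise ValueError("capacity and period must be positive")
        self.capacity = capacity
        self.refill_rate = capacity / period  # tokens added per second
        self.clock = clock                    # injectable so tests need no sleeping
        self.tokens = float(capacity)         # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        """Return True and consume one token if a request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FakeClock:
    """Hypothetical test helper: a clock that only moves when told to."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
bucket = TokenBucket(capacity=3, period=60.0, clock=clock)

burst = [bucket.allow() for _ in range(4)]  # fourth request exceeds the burst
clock.advance(30.0)                         # 30 s at 3 tokens per 60 s refills 1.5 tokens
after_wait = bucket.allow()

print(burst)       # [True, True, True, False]
print(after_wait)  # True
```

Because the clock is injected, the whole check runs in microseconds and a reviewer can read the diff and the verification together. A per-user limiter would keep one such bucket per user ID, which is left out here to keep the sketch short.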

Summary
Vibe coding is here. It’s not replacing engineers – it’s giving us better abstractions. The gap between prototype and production is still wide, but it’s narrowing.
Key takeaways:
- Use best practices from Accelerate – they still apply.
- Keep agents on the leash as per Karpathy’s framework.
- Remember that better abstractions include non-determinism.
- The real cost is ownership, not initial development.
- Less code, more vibe – Rick would approve.
If you’re not using these tools yet, start with the basics: give every engineer Cursor Pro and API keys. Run an onboarding session. You’ll get 80% of the value for 20% of the effort.
What’s your experience with vibe coding? What worked? What didn’t? What surprised you?