Build in Public Statistics 2026: The Sourced Data Roundup

Most build-in-public statistics circulating online are unsourced. This roundup covers the figures that can be traced: YC W25's 95% figure, Lovable's $200M ARR, and Karpathy's February 2, 2025 tweet, each with its source.

TL;DR

  • Most "build in public statistics" content online circulates without a cited source. This is the sourced version.
  • Five canonical data points: Karpathy's tweet (Feb 2, 2025), YC W25's 95% figure (TechCrunch, March 6, 2025), Lovable's trajectory of $100M ARR, $200M ARR, and a $330M raise (ARR figures per TechCrunch, July and Nov 2025; raise in Dec 2025), Collins Word of the Year (Nov 6, 2025), and Belogubov's threshold rule (Feb 6, 2025).
  • Plus, with citations: Pieter Levels' 3-hour flight sim, Jon Yongfook's 10x stress quote, and the dev.to 10%/90% framing.

The most-cited build-in-public statistics circulate widely, usually without a traceable source. This roundup pairs each canonical 2026 figure with its source so you can cite it accurately in your own content. It is part of our build in public pillar.

The five canonical 2026 data points

1. Karpathy's "vibe coding" tweet

  • Date: February 2, 2025, 6:17 PM
  • Source: @karpathy on X, status/1886192184808149383
  • Verbatim: "There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists."

This is the cleanest one-line definition of vibe coding. Cite the tweet directly when invoking the term in your content. The exact timestamp (6:17 PM) is the kind of specificity that distinguishes well-sourced content from generic citations, but X displays times in the viewer's local timezone, so state the timezone if you quote it.

2. YC W25's 95%-AI-codebase figure

  • Date reported: March 6, 2025
  • Source: TechCrunch, reporting Jared Friedman's YouTube remarks
  • Verbatim (per TechCrunch): "It's not like we funded a bunch of non-technical founders…A year ago, they would have built their product from scratch — but now 95% of it is built by an AI."
  • The claim: 25% of YC W25 startups had codebases that were 95% AI-generated

Important caveat: this is Friedman's video remarks reported by TechCrunch, not a formal YC publication. Cite as "per TechCrunch reporting Friedman's remarks" rather than as institutional YC data.

3. Lovable's trajectory

  • $100M ARR in 8 months: per TechCrunch, July 23, 2025
  • $200M ARR four months later: per TechCrunch, November 19, 2025
  • $330M raise at $6.6B valuation: December 2025
  • Anton Osika at Slush 2025: "faster than OpenAI, Cursor, Wiz, and every other software company in history."

Important caveat: these figures are TechCrunch reporting and Osika's own claims at Slush, not audited financials. They are robust enough to cite, but state whose claim each figure is.

4. Collins Dictionary Word of the Year

  • Date: November 6, 2025
  • Source: Collins Dictionary announcement
  • Word: "vibe coding"
  • Collins managing director Alex Beecroft, per CNN: "It signals a major shift in software development, where AI is making coding more accessible."

This is the cultural-moment marker. Use this citation when you need to establish that by late 2025 vibe coding was no longer a niche developer term but a mainstream concept.

5. Belogubov's threshold rule

  • Date: February 6, 2025
  • Source: @alexbelogubov on X
  • The framing: Stop sharing MRR once you cross ~$10K/month. Drop product mention from bio once you cross ~$30K/month.

This is the canonical late-stage build-in-public guidance. We use it in our posts on when to go ghost mode as a founder and on build in public revenue sharing.

Supporting data points

Pieter Levels' flight simulator

  • Source: @levelsio, status/1893385114496766155
  • Claim: Cursor-built flight simulator took "I'd say 3 hours" to build
  • Important caveat: Founder-claimed via Levels' own X account. Robust enough to quote, but cite it as his claim, not as independently verified.

Jon Yongfook on stress amplification

  • Source: Publicly shared quote (X / interviews, multiple attestations)
  • Verbatim: "When your numbers are live for the world to see, the level of stress and dread is amplified 10x."

The canonical citation for the cost of public-revenue posting. Used as the framing for build in public burnout.

The dev.to 10%/90% framing

  • Source: Imran Hassan on dev.to, also cited across multiple Indie Hackers threads
  • Verbatim: "Building the app is 10% of the work. Marketing it is the other 90%."

The cleanest one-line articulation of the post-vibe-coding distribution problem. Use as the framing setup for any distribution-focused content.

"Voice echoes into the void"

  • Source: Wisp CMS founder guide, plus multiple Indie Hackers threads with near-verbatim phrasing
  • Verbatim: "Your voice echoes into the void, unanswered."

The canonical articulation of the void-shipping pain. Used as the framing for shipping into the void.

How to use these statistics in your content

Specific patterns that work:

  • Cite with the date. "Per TechCrunch's March 6, 2025 reporting..." outperforms "Per recent reports..."
  • Quote verbatim where the source supports it. The actual quote (Karpathy's "vibe coding" definition, Yongfook's "10x stress") carries more weight than paraphrase.
  • Say whose claim it is. "Per Anton Osika's Slush 2025 remarks" is more honest than "Lovable reported $100M ARR" because it distinguishes a founder's claim from an audited disclosure.
  • Use the citation as a framing setup. Start the content with the data point; build the argument from it.
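
The patterns above can be sketched in code. The following is a minimal Python illustration, not an existing tool: the `Citation` class, its field names, and the `attribution` helper are hypothetical names chosen for this example. It shows one way to keep the date and the basis of a claim attached to each statistic so the attribution phrase comes out the same way every time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    """One sourced data point (hypothetical structure, for illustration only)."""
    source: str      # who published or said it, e.g. "TechCrunch"
    published: date  # date of the report, post, or remarks
    kind: str        # how it was published: "reporting", "remarks", "post"
    basis: str       # whose claim it is: "press report", "founder claim", ...
    quote: str = ""  # verbatim quote, only where the source supports one


def attribution(c: Citation, with_basis: bool = False) -> str:
    """Build a dated attribution phrase, optionally labelled with the claim basis."""
    d = c.published
    when = f"{d:%B} {d.day}, {d.year}"  # d.day avoids a zero-padded day number
    phrase = f"Per {c.source}'s {when} {c.kind}"
    if with_basis:
        phrase += f" ({c.basis})"
    return phrase


# Values taken from the YC W25 entry in this roundup.
yc = Citation("TechCrunch", date(2025, 3, 6), "reporting", "press report")
print(attribution(yc))
print(attribution(yc, with_basis=True))
```

Keeping `basis` as a required field is a deliberate choice in this sketch: a data point cannot be stored without recording whether it is a press report or a founder's own claim.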

What this list deliberately excludes

Numbers that circulate but lack solid sourcing:

  • Generic "75% of indie hackers fail" statistics without source
  • Specific MRR claims about founders without their public attestation
  • "Build in public increases conversion by [X]%" without methodology
  • AI usage statistics from vendor PR rather than independent surveys

Including unsourced statistics produces content that survives a first read but collapses under scrutiny. The five canonical data points above are robust enough to survive a reader's fact-check.

FAQ

Are these the only sourced data points worth citing?

No, but they are the most canonical ones. Several others (specific Cursor user counts, Claude Code adoption metrics, GitHub Copilot statistics) exist but with weaker sourcing. The five above are the strongest to cite; others can be added selectively with appropriate source disclosure.

Should I link to the primary sources in my posts?

Yes, when possible. The X status link (status/1886192184808149383 for Karpathy, status/1893385114496766155 for Levels) is the cleanest citation format. TechCrunch articles can be linked directly. References to the Wisp CMS founder guide should specify the article title.
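
A handle and a status ID are enough to reconstruct a link. The sketch below assumes the public URL pattern `https://x.com/<handle>/status/<id>`; the function name `x_status_url` is hypothetical, and the handles and IDs are the ones quoted in this roundup.

```python
def x_status_url(handle: str, status_id: str) -> str:
    """Return a link to a single X post from a handle and a numeric status ID.

    Assumes the https://x.com/<handle>/status/<id> URL pattern.
    """
    handle = handle.lstrip("@")  # accept "@karpathy" or "karpathy"
    if not handle:
        raise ValueError("handle is empty")
    if not (status_id.isascii() and status_id.isdigit()):
        raise ValueError(f"status ID must be numeric, got {status_id!r}")
    return f"https://x.com/{handle}/status/{status_id}"


print(x_status_url("@karpathy", "1886192184808149383"))
print(x_status_url("@levelsio", "1893385114496766155"))
```

Passing the ID as a string rather than an integer keeps it intact if the value later moves through tools that lose precision on large numbers.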

Will these citations age out?

The 2025-2026 vintage of these citations will eventually become "historical references" as the field evolves. By 2028, expect to update with newer Lovable / OpenAI / Anthropic data points and possibly retire the Karpathy tweet as the foundational citation if a more current one emerges.

Can I use these in commercial content?

The citations themselves (date, source, and verbatim quote where applicable) are factual references that can be used in commercial content with attribution. Reposting full TechCrunch articles is different and requires permission. Verbatim quotes with citation are standard journalistic practice.

What if a citation becomes contested?

Update or remove it. The "95% AI-generated codebase" figure has already received some skeptical commentary; if a more rigorous source emerges contradicting it, update your posts. The discipline is honest-citation-or-no-citation, not citation-no-matter-what.


Building is no longer the bottleneck. Visibility is. buildinpublic.so is narrative infrastructure that runs inside your building workflow — for content discipline: Loudy drafts posts that pull from your AI Brain memory of sourced citations, Dev Cards supplies the workflow-content baseline that does not depend on macro statistics, and Vibey schedules the strategic content that does use the canonical data points.