Vendor landscape · 5 min read

Bright Data Hit $300mn ARR. The Press Missed It.

Bright Data crossed $300mn ARR in 2025, growing 50% year-on-year. That is roughly three times the combined disclosed funding of the YC-backed AI-scraper cohort (Browserbase, Browser Use, Firecrawl, Reworkd). The biggest scraping business in the world is the one nobody covers.

By Signal Census Editorial · April 24, 2026

Bright Data · vendor scale

The original residential-proxy network is now a $300mn ARR data utility powering 14 of the top 20 LLM labs. Almost nobody writes about it.

Bright Data ARR vs total disclosed funding of competitors (2025)
Company	Context	Figure
Bright Data	50% YoY growth	$300mn ARR
Browserbase	$300mn valuation	$67.5mn raised
Apify	Cerebral Valley interview	~$25mn ARR
Browser Use	Seed, March 2025	$17mn raised
Firecrawl	Series A, August 2025	$14.5mn raised
Reworkd	Shut down February 2025	$2.75mn raised
Bright Data’s ARR exceeds the combined disclosed funding of the entire YC AI-scraper class (about $102mn raised in total) and sits an order of magnitude above Apify on reported ARR.

Bright Data crossed $300 million in annual recurring revenue in 2025, growing 50% year-on-year, with a target of $400mn by mid-2026. The number surfaced in a Calcalist item that never broke out of Israeli tech press.

In the same year, the four most-covered scraping startups were Browserbase ($40mn Series B at a $300mn valuation), Browser Use ($17mn seed), Firecrawl ($14.5mn Series A), and the now-defunct Reworkd ($2.75mn seed, shut down February 2025). The rounds listed here total roughly $74mn, or about $102mn counting all of Browserbase’s disclosed funding, and the cohort generated more headlines than any other slice of the market.

The size gap is the story.
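The arithmetic behind that gap can be checked in a few lines. The sketch below uses only the figures quoted in this article, in $mn. The variable names are illustrative, and the comparison mixes recurring revenue with capital raised, so treat the ratios as a rough sense of scale rather than a like-for-like measure.

```python
# Figures in $mn, copied from the chart and text of this article.
BRIGHT_DATA_ARR = 300.0
APIFY_ARR = 25.0

# Total disclosed funding per startup, as listed in the chart.
total_raised = {
    "Browserbase": 67.5,
    "Browser Use": 17.0,
    "Firecrawl": 14.5,
    "Reworkd": 2.75,
}

# Rounds cited in the body text: same figures, except that only
# Browserbase's $40mn Series B is counted.
cited_rounds = dict(total_raised, Browserbase=40.0)

cohort_total = sum(total_raised.values())    # 101.75
cohort_rounds = sum(cited_rounds.values())   # 74.25

arr_vs_total = BRIGHT_DATA_ARR / cohort_total
arr_vs_rounds = BRIGHT_DATA_ARR / cohort_rounds
arr_vs_apify = BRIGHT_DATA_ARR / APIFY_ARR   # 12.0

print(f"Cohort funding, all disclosed: ${cohort_total}mn")
print(f"Cohort funding, cited rounds:  ${cohort_rounds}mn")
print(f"Bright Data ARR vs all disclosed funding: {arr_vs_total:.1f}x")
print(f"Bright Data ARR vs cited rounds:          {arr_vs_rounds:.1f}x")
print(f"Bright Data ARR vs Apify ARR:             {arr_vs_apify:.1f}x")
```

On these inputs the ARR-to-funding multiple lands between roughly three and four, while the ARR-to-ARR comparison with Apify is the one that reaches an order of magnitude.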

The order-of-magnitude problem

Bright Data sits an order of magnitude above the entire AI-agent scraper cohort by revenue. Apify, the nearest comparable Store-and-platform business, runs at roughly $25mn ARR per the company’s Cerebral Valley interview. Decodo (formerly Smartproxy) and Oxylabs do not disclose, but neither has raised public capital, and industry estimates place both meaningfully below Bright Data.

If those numbers are roughly correct, Bright Data alone exceeds the next three or four scraping-infrastructure providers combined. By a wide margin, it is larger than every venture-backed scraping company on Earth.

This is not a sleeping-giant story. It is a story about a giant whose customers do not generate press releases.

Who actually pays

The buyer profile explains the silence. Bright Data’s marketing claims the company powers 14 of the top 20 global LLM labs and seven of the top 10 AI-first companies, serving more than 100 million AI-agent interactions per day. The numbers are unaudited, but the customer mix is consistent with the product line: structured datasets (LinkedIn, Amazon, Walmart, Google Maps), unblocking infrastructure (Web Unlocker at roughly $3 per 1,000 successful requests), and a $100mn Browser.ai / Deep Lookup suite aimed at agent builders.
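To put the Web Unlocker figure in budget terms, here is a minimal sketch that assumes a flat rate of $3 per 1,000 successful requests, as quoted above. The function name and the flat-rate assumption are ours, not Bright Data’s published price list, which may include tiers or minimums.

```python
def unlocker_cost_usd(successful_requests: int,
                      usd_per_thousand: float = 3.0) -> float:
    """Estimate spend when only successful requests are billed.

    The 3.0 default is the approximate rate quoted in the article;
    a flat rate with no volume tiers is an assumption.
    """
    if successful_requests < 0:
        raise ValueError("successful_requests must be non-negative")
    return successful_requests / 1000 * usd_per_thousand


# One million successful requests at the quoted rate.
print(unlocker_cost_usd(1_000_000))  # 3000.0
```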

Those buyers do not post on Hacker News about which scraper they use. They are foundation labs, hedge funds, and price-intelligence vendors with NDA-grade procurement. They evaluate quietly, sign multi-year contracts, and never appear in case studies. That cohort is invisible to the journalists who cover the AI-scraper class — so the journalists end up covering the YC startups instead.

The legal moat

Bright Data also won the most consequential scraping lawsuit of the decade. In Meta v. Bright Data (N.D. Cal., January 23, 2024), Judge Edward Chen granted summary judgment in Bright Data’s favor, holding that Meta’s Terms of Service bind only logged-in users and therefore cannot restrain logged-out scraping or resale of public Facebook and Instagram data. Meta dropped the residual tortious-interference claim a month later and waived appeal, so the ruling stood.

For a company selling LinkedIn and Instagram datasets to Fortune 500 enterprises, that ruling is not a footnote. It is the legal predicate that lets a procurement team sign the contract.

The AI-agent startups have no equivalent moat. Browserbase and Firecrawl render customer-supplied URLs in their cloud Chromium fleets — a model that has not been litigated at the district level, and that exposes the platform vendor to a direct CFAA argument the moment a target site complains. Bright Data has been through that hearing and won.

Three different markets, one shared word

The architectural overlap between Bright Data and the venture-backed class is more limited than the press suggests.

Browserbase and Browser Use sell agent infrastructure — cloud-hosted Chromium with an LLM-driven control plane — to developers building consumer AI agents that need to act inside a browser. The buyer is the AI engineer shipping a feature that books flights or fills out forms. Pricing reflects it: per-session, per-minute, with throughput ceilings.

Firecrawl sells page-to-markdown for RAG. The buyer is the LLM application developer who needs clean text fed into a retrieval pipeline. Pricing is per-page, with a 5× credit multiplier on structured extraction. Crawl4AI undercuts the entire category by being self-hostable Apache-2.0.
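A per-page model with a multiplier is easy to misjudge when budgeting, so a short worked sketch may help. It assumes a baseline of one credit per page, which is our placeholder and not a figure from Firecrawl’s pricing, and applies the 5× multiplier mentioned above to pages that use structured extraction. The function and parameter names are hypothetical.

```python
def credits_needed(pages: int,
                   structured_pages: int = 0,
                   base_credits_per_page: int = 1,
                   structured_multiplier: int = 5) -> int:
    """Credits consumed by a crawl under a per-page model.

    base_credits_per_page is an assumed placeholder; the 5x
    multiplier for structured extraction comes from the article.
    """
    if not 0 <= structured_pages <= pages:
        raise ValueError("structured_pages must be between 0 and pages")
    plain_pages = pages - structured_pages
    plain_cost = plain_pages * base_credits_per_page
    structured_cost = (structured_pages * base_credits_per_page
                       * structured_multiplier)
    return plain_cost + structured_cost


# 10,000 pages, 2,000 of them with structured extraction:
# 8,000 plain credits + 2,000 * 5 structured credits.
print(credits_needed(10_000, structured_pages=2_000))  # 18000
```

Under these assumptions, running structured extraction on a fifth of the pages nearly doubles the credit bill.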

Bright Data sells finished data and unblocking primitives to enterprise data buyers. The buyer is the data scientist at a hedge fund or the procurement team at a foundation lab. Pricing is per-record on datasets and per-GB on residential bandwidth.

Three different markets that happen to share the word “scraping” in marketing copy. The AI-agent class is selling to a venture-software TAM. Bright Data is selling to a B2B data TAM. The latter is several times larger, and Bright Data faces effectively no public competition inside it.

What the ARR ratio says about Apify

For publishers in the Apify Store, Bright Data’s quiet scale is a calibration check. The ARR ratio implies the entire Apify Store — six thousand-plus actors, forty thousand active developers, the whole long tail — generates roughly the same revenue as a single mid-tier customer at Bright Data. That is not a value judgment about Apify. It is a statement about where enterprise data spend actually goes.

The open question is whether Apify’s $1mn Challenge — which produced 3,329 actor submissions across 704 developers — is a path toward winning enterprise spend, or a path toward winning long-tail consumer spend. The two markets are not the same, and the playbooks are not the same.

Bright Data did not win the enterprise data market by running developer challenges. It won by spending 15 years on residential-proxy infrastructure, layering datasets and unblocking APIs on top, and then winning the Meta lawsuit. That stack is hard to replicate from a marketplace flywheel — and it is the stack the press is not covering.
