Scraping Pricing Fragmentation: 47 Ways to Charge for a Page
20+ scraping vendors run 47+ distinct pricing models in 2026: $/GB, $/request, $/result, $/credit, $/session, $/CPU-hour. Fragmentation is intentional — prevents apples-to-apples comparison and protects vendor margins. Benchmarks are the only useful reference.
A buyer evaluating scraping infrastructure in 2026 confronts a pricing landscape that is structurally impossible to compare. Across the 20+ commercial scraping vendors with meaningful market presence, at least 47 distinct pricing models are in active use — and most vendors run two or three of them simultaneously across different products.
The fragmentation is not accidental. Pricing-model proliferation prevents cross-vendor comparison, which protects vendor margins on the dimension that buyers cannot quickly equalize. The benchmarks that normalize across pricing models — Q1 2026 cost-per-1000-pages, AIMultiple’s task-success benchmark, the Apify Store’s pay-per-event distribution — are now the only practical way to compare offerings without weeks of contract math.
The 47-model catalog
The pricing-axis taxonomy across observed vendor pricing pages:
| Axis | Used by | Variations |
|---|---|---|
| $/GB bandwidth | Bright Data, Smartproxy/Decodo, Oxylabs, NetNut, Soax, IPRoyal | Residential, datacenter, mobile, ISP tiers |
| $/successful request | ZenRows, Scrapfly, ScrapingBee, ScraperAPI | Geo-tiered, anti-bot-tiered, JS-render-tiered |
| $/result returned | Apify (PPE), Datafiniti, Bright Data datasets | Per-record, per-page, per-extraction |
| $/credit (composite unit) | Apify (platform compute), Browserbase, Firecrawl | Credits map to compute, bandwidth, time differently per vendor |
| $/proxy-port | Smartproxy, NetNut, IPRoyal | Sticky-session, rotating, geo-locked |
| $/session-hour | Browserbase, Hyperbrowser, Steel, Anchor | Browser-instance time |
| $/CPU-hour | Apify (platform), Crawlbase | Compute resource consumption |
| $/month flat subscription | ZenRows mid-tiers, ScrapingBee freelancer, ParseHub | Capacity-tiered |
| $/seat (analytics tools) | Bright Data Web Scraper IDE, Octoparse | Per-user pricing |
| $/dataset subscription | Bright Data datasets, Datafiniti | Refresh-cadence-tiered |
| $/API call (general) | Diffbot, Crawlbase | Tiered by content type |
| $/MCP tool call | Apify MCP (PPE pass-through), Bright Data MCP | New surface, pricing inheriting from underlying product |
Counting the cross-product of these axes with the standard tiering dimensions (geo, success-rate guarantee, content type, JS render, anti-bot tier, refresh cadence), the unique observable pricing models cross 47 in mid-2026. The exact number depends on how granularly the buckets are drawn, but the order of magnitude is robust — pricing-model variety has grown faster than vendor count over the past three years.
Why fragmentation increased
Three forces drove the proliferation.
Vendor differentiation pressure. When two vendors price the same way, the comparison collapses to the per-unit number. When two vendors price differently, comparison requires reformatting math that most buyers do not perform. Vendors that introduced novel pricing axes ($/credit, $/session-hour, $/result) protected themselves from direct head-to-head comparison with vendors that stayed on $/GB or $/request.
Product-line expansion. Vendors that originally sold one thing (proxies, scrapers, datasets) now sell three or four things. Each line gets its own pricing axis that matches the unit-economics of that line. Bright Data is the clearest case — separate pricing for residential proxies ($/GB), Web Unblocker ($/successful request), datasets ($/record), Browser API ($/session-hour), MCP server (pass-through). Five products, five different pricing axes.
Buyer-segment specialization. Enterprise buyers tolerate complex pricing if it matches their procurement workflows. Self-serve buyers want simple flat-rate or per-call pricing. Vendors targeting both segments end up running two pricing models in parallel for the same underlying capacity.
What the fragmentation costs buyers
Three concrete costs land on the buyer side.
Pre-purchase comparison work. Evaluating “do I use ZenRows at $69/month for 10k results or Bright Data Web Unblocker at $3/1k successful requests” requires translating both to a common unit (cost-per-1000-successful-pages-with-comparable-success-rate). The translation is non-trivial because the success-rate definitions differ between vendors. The buyer either spends days on the math or accepts that they cannot truly compare.
Lock-in via pricing unit. A buyer that integrates against Apify’s per-event PPE model has rewritten cost accounting around per-event units. Switching to Bright Data Web Unblocker requires re-doing the cost accounting in successful-request units. The pricing-model lock-in is real even when the underlying capability is roughly substitutable.
Hidden price increases. When a vendor restructures its pricing (e.g., Apify retiring Rental in favor of PPE) the buyer-side experience is often a price increase that is hard to detect because the unit changed. Comparing “I was paying $100/month under Rental” to “I’m now paying $0.005 per event under PPE” requires forecasting event volume — and the volume often exceeds the buyer’s pre-migration estimate.
The benchmark response
Three independent benchmark efforts have emerged in 2025-2026 specifically to normalize across the pricing fragmentation.
Signal Census’s cost-per-1000-pages benchmark. Normalizes the 20+ vendors to a common per-1000-successful-pages metric across geo, anti-bot tier, and content type. The first attempt at a public cross-vendor pricing reference.
AIMultiple’s scraping-vendor benchmark. Reports task-success-rate and average completion time on a standardized agent benchmark, with implicit cost normalization (per-task cost across vendors).
Apify Store’s PPE distribution as reference. The Apify Store now hosts thousands of pay-per-event actors with public per-record prices. The distribution of these prices (typically $0.0015-$0.005 per record) is a useful floor for what scraping-as-output costs at the marketplace tier.
None of the three is comprehensive. All three reduce the fragmentation problem but do not eliminate it. The buyer’s practical workflow in 2026 is to consult the benchmarks for a starting position, then run vendor-specific cost modeling for the chosen short-list. The days of selecting a scraping vendor on the basis of headline pricing are over.
What changes the fragmentation
Two forces could compress the 47-model landscape over the next 2-3 years.
Standardization pressure from MCP-era buyers. An LLM agent picking tools at runtime needs to evaluate cost in real time. The agent cannot run vendor-specific cost models. The pricing-unit that wins agent adoption is the unit that is most directly comparable across vendors — likely per-tool-call with a transparent dollar amount per call. The MCP ecosystem’s pricing standardization, when it arrives, will compress the fragmentation toward a smaller set of canonical models.
Vendor consolidation. If 5-10 of the 20+ scraping vendors are acquired or shut down over 2026-2028 (a reasonable expectation given the vendor productivity ranking and the Apify-scale economics that constrain who can sustainably operate), the pricing-model variety will compress proportionally. Survivors typically inherit pricing approaches from acquirers, which standardizes the model count.
For Apify Store publishers and other buyers managing scraping infrastructure across multiple vendors, the practical advice is to maintain a normalized cost-tracking layer in-house. The vendors will not standardize for the buyer’s convenience. The buyer who tracks “cost per successful structured-record across vendor X, Y, Z” in a common framework retains comparison power that the vendor pricing models actively try to erase.
The 47-model count is a snapshot. By 2027, the count will likely be higher, not lower. Pricing fragmentation in scraping infrastructure is a feature of the market, not a bug — and one of the few defenses vendor pricing teams have against the commoditization pressure that the underlying capability is otherwise under.
Sources
- Public pricing pages across 20+ scraping vendors (Apify, Bright Data, Oxylabs, ZenRows, Scrapfly, ScrapingBee, ScraperAPI, Smartproxy/Decodo, NetNut, Soax, IPRoyal, Datafiniti, Crawlbase, Browserbase, Hyperbrowser, Steel, Anchor, Diffbot, Octoparse, Firecrawl)
- Signal Census: Cost per 1,000 Pages Q1 2026 Benchmark
- Signal Census: Scraping Vendor ARR-per-Employee — operating-model context
- Signal Census: $0.0025 Per Product — per-record pricing analysis