Crunchbase, PitchBook, CB Insights: VC Data Tier 2026
Crunchbase ($49/mo to thousands/yr), PitchBook ($20-100k/yr), and CB Insights ($60-100k/yr) sell wholesale VC-investment data. All three depend on scraping and curation, and pricing opacity rises with deal size. Apify Crunchbase scraper alternatives undercut the floor by 10×.
The VC-data wholesale market in 2026 has three dominant platforms — Crunchbase, PitchBook, and CB Insights — and a layer of alternative scraper-based access underneath. The three majors sell substantially the same product (private-company deal data, funding rounds, valuations, investor relationships) at substantially different prices, with no public per-record pricing on any of them.
The pricing-page transparency drops sharply as the deal size rises. Crunchbase publishes its Pro tier at $49/month. Crunchbase Enterprise pricing is “contact us”. PitchBook’s published page lists no numeric pricing at all. CB Insights publishes a “Free” tier and routes everything else through sales. The opacity is the segment’s defining feature — the buyers are sophisticated enough that public pricing would constrain negotiation more than it would enable discovery.
The pricing tiers
The publicly observable pricing landscape across the three majors and the alternative layer:
| Vendor | Entry tier | Mid tier | Enterprise |
|---|---|---|---|
| Crunchbase | Pro $49/mo (individual) | Crunchbase for Business ~$500/mo | Enterprise — $50k-200k/yr typical |
| PitchBook | Not published | Not published | $20k-100k/yr typical via sales |
| CB Insights | Free (limited) | Not published | $60k-100k/yr typical via sales |
| Alternative scraper layer | $0.001-0.01 per record | Custom contracts | Custom |
The headline number that matters is what the enterprise tier actually charges per record. Across the three majors, the back-of-envelope math runs roughly as follows: a $50k/yr seat with access to ~30 million company records and ~500k deal records works out to about $0.0016 per record available, a sub-cent figure. A buyer who touches only a fraction of those records pays correspondingly more per record actually used. Either way, the pricing is bundle-based, not per-record. The bundle includes the data, the analytics layer, the export tooling, the API, and the human research team that maintains the data quality.
The alternative-scraper layer prices the data only: no analytics, no curation, no API SLA. That layer (Apify Crunchbase scrapers, third-party PitchBook scrapers where they exist, custom-built CB Insights ingestion) prices in the $0.001-0.01 per record band. The 10-100× cost gap between the alternative scrapers and the major vendors reflects what the market pays for the bundle’s analytics-and-curation layer.
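The bundle math earlier in this section can be sketched in a few lines. This is an illustrative calculation, not any vendor's pricing model: the seat price and record counts are the rough figures quoted above, while the two usage levels (50k and 500k records touched per year) are hypothetical values chosen only to show how much the per-record figure depends on the denominator.

```python
# Back-of-envelope per-record pricing for a bundled enterprise seat.
# Seat price and record counts are the article's rough figures.
SEAT_PRICE_USD = 50_000        # enterprise seat, per year
COMPANY_RECORDS = 30_000_000   # ~30 million company records
DEAL_RECORDS = 500_000         # ~500k deal records


def price_per_record(annual_price_usd: float, records: int) -> float:
    """Spread an annual bundle price evenly over a number of records."""
    return annual_price_usd / records


# Price per record *available* in the database: about $0.0016.
per_available = price_per_record(SEAT_PRICE_USD, COMPANY_RECORDS + DEAL_RECORDS)

# Price per record *actually used*, for two hypothetical usage levels.
# These usage levels are assumptions for illustration, not measured data.
per_touched = {
    touched: price_per_record(SEAT_PRICE_USD, touched)
    for touched in (50_000, 500_000)
}

print(f"per record available: ${per_available:.4f}")
for touched, price in per_touched.items():
    print(f"per record touched ({touched:,}/yr): ${price:.2f}")
```

Dividing by every record in the database gives a sub-cent figure, while dividing by the records a buyer actually touches gives a far higher one. Any per-record comparison between a bundle and a pay-per-record scraper should state which denominator it uses.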
Where the scraping actually sits
All three majors depend on scraping plus human curation as the upstream data pipeline. The mix varies.
Crunchbase has the largest crowd-sourced component — companies and investors submit data directly through claimed-profile workflows. The scraping infrastructure backfills around the crowd-sourcing, indexing press releases, news mentions, and public funding announcements. The data-quality variance is highest at Crunchbase because the crowd-sourcing layer is uneven across geographies and stages.
PitchBook is closer to pure-curation — a large research team manually verifies deals and maintains the database. The scraping infrastructure feeds the research team but does not directly populate the customer-facing data. The result is higher data accuracy and the pricing power to charge enterprise rates.
CB Insights sits between the two. The platform combines proprietary scraping, NLP-based news ingestion, and a research team. The differentiator is the analytics layer (deal-prediction models, market sizing, technology-mapping reports) rather than the raw data itself.
For an Apify Store publisher building a VC-data scraping actor, the practical observation is that the data behind the three majors is mostly assembled from public sources — press releases, SEC filings, regulatory disclosures, company About pages, LinkedIn profiles. The work that separates a usable VC dataset from a raw scrape is in entity resolution (matching company-name variants), deal deduplication (the same round shows up in multiple sources with different details), and ongoing maintenance of the cross-references.
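A minimal sketch of the two curation steps named above, entity resolution and deal deduplication, might look like the following. Everything here is an assumption for illustration: the record fields (`company`, `round`, `announced`, `amount_usd`, `source`), the suffix list, and the rule that two reports of the same round type for the same company in the same calendar month are one deal. A production pipeline would need fuzzier matching and a more careful date window.

```python
import re

# Hypothetical list of legal suffixes to ignore when matching company names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh",
                  "limited", "corporation", "incorporated"}


def normalize_company(name: str) -> str:
    """Reduce name variants like 'Acme, Inc.' and 'ACME Inc' to one key."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def dedupe_deals(deals: list[dict]) -> list[dict]:
    """Merge reports of the same round that arrive from different sources."""
    merged: dict[tuple, dict] = {}
    for deal in deals:
        # Assumed match key: company, round type, and announcement month.
        key = (normalize_company(deal["company"]),
               deal["round"].strip().lower(),
               deal["announced"][:7])
        if key not in merged:
            merged[key] = {
                "company": deal["company"],
                "round": deal["round"],
                "announced": deal["announced"],
                "amount_usd": deal.get("amount_usd"),
                "sources": [deal["source"]],
            }
            continue
        kept = merged[key]
        kept["sources"].append(deal["source"])
        # Fill a missing amount from a later source; never overwrite one.
        if kept["amount_usd"] is None:
            kept["amount_usd"] = deal.get("amount_usd")
    return list(merged.values())


# Made-up sample records, not real deals.
deals = [
    {"company": "Acme, Inc.", "round": "Series A", "announced": "2026-01-14",
     "amount_usd": None, "source": "press_release"},
    {"company": "ACME Inc", "round": "series a", "announced": "2026-01-15",
     "amount_usd": 12_000_000, "source": "sec_filing"},
    {"company": "Bolt Labs LLC", "round": "Seed", "announced": "2026-01-20",
     "amount_usd": 2_000_000, "source": "news"},
]
unique_deals = dedupe_deals(deals)
```

The month-bucket key is deliberately crude: a round announced on January 31 and re-reported on February 1 would not merge. That kind of edge case is where the ongoing maintenance cost sits.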
The buyer-segment split
Three buyer cohorts dominate spending in the segment, and they map cleanly onto the three majors:
Operators and founders. Spend $49/month on Crunchbase Pro to research competitors and benchmark salaries. Often supplement with the free CB Insights tier and ad-hoc scraping for specific company lookups. Total VC-data spend: hundreds to low thousands per year.
VC and PE firms. Spend mid-five-figures per year on PitchBook for diligence workflow, plus CB Insights for market reports. Crunchbase is supplementary. Total VC-data spend: $50-200k per year per firm.
Corporate strategy teams. Spend on CB Insights for market intelligence and technology mapping, often with PitchBook for transaction data. Total VC-data spend: $100-500k per year for large corporates.
The unbundling pressure on the three majors comes from the bottom and the top. From the bottom, the alternative scraper layer (Apify actors, custom-built ingestion) erodes the operators-and-founders tier — buyers at that level discover that they can replicate 70% of their Crunchbase use case for $50/month in Apify pay-per-event credits. From the top, the corporate-strategy buyers increasingly demand custom datasets that the standard PitchBook/CB Insights bundles do not deliver — and turn to dataset-vendors or in-house teams that build on scraping infrastructure.
What it means for the alternative layer
The Apify Store hosts a long tail of Crunchbase, PitchBook, and CB Insights scrapers. Their economics work because:
- The major vendors have set the per-record price at $0.10-1.00 (implied from bundle math)
- The alternative scrapers can deliver the same record at $0.001-0.01
- The 100× cost difference creates a viable buyer cohort for the alternative tier
The constraint is what gets lost when buyers move down-stack: entity resolution quality, deduplication, ongoing maintenance, and access SLA. A Crunchbase scraper that returns raw HTML-parsed records is not the same product as Crunchbase Pro. The buyer who chooses the alternative is implicitly choosing to perform the curation in-house.
The publishers who win in this segment are the ones who layer curation on top of the scraping: entity-resolution actors, deduplication actors, and cross-reference actors that combine multiple raw sources into a cleaner output. This is the same pattern that the labor-intel wholesale vendors demonstrate: collection is commodity, and the value is in the analytics layer above it.
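As a rough illustration of what a cross-reference step could do, the sketch below merges several raw profiles of one company into a single record, taking each field from the most trusted source that has a value and recording where it came from. The source names, the trust order, and the field names are all hypothetical. Real pipelines would tune the ordering per field.

```python
from typing import Any

# Hypothetical trust order: earlier sources win when fields conflict.
SOURCE_PRIORITY = ["sec_filing", "press_release", "company_site", "news"]
_RANK = {source: i for i, source in enumerate(SOURCE_PRIORITY)}


def merge_profiles(profiles: list[dict[str, Any]]) -> dict[str, Any]:
    """Combine raw profiles of one company, field by field, by source trust."""
    # Unknown sources sort last; sorted() is stable, so ties keep input order.
    ranked = sorted(profiles,
                    key=lambda p: _RANK.get(p["source"], len(SOURCE_PRIORITY)))
    merged: dict[str, Any] = {}
    provenance: dict[str, str] = {}
    for profile in ranked:
        for field, value in profile.items():
            if field == "source" or value is None or value == "":
                continue  # skip bookkeeping and empty fields
            if field not in merged:
                merged[field] = value
                provenance[field] = profile["source"]
    merged["provenance"] = provenance
    return merged


# Made-up sample profiles for a single company.
profiles = [
    {"source": "news", "name": "Acme", "hq": "Berlin", "employees": 120},
    {"source": "sec_filing", "name": "Acme, Inc.", "hq": None, "employees": None},
    {"source": "company_site", "name": "Acme", "hq": "Munich", "employees": None},
]
clean = merge_profiles(profiles)
```

Keeping a per-field provenance map is a design choice worth the extra bytes: it lets a downstream buyer audit why a value was chosen, which is part of what the curated bundles sell.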
For the three major vendors, the medium-term threat is not the alternative scraping layer directly. It is the combination of alternative scraping + LLM-based entity resolution + LLM-based deduplication that collapses the 100× cost gap by automating the curation work that justifies the bundle pricing. By 2027-2028, the bundle math will have to be defended on analytics value alone, not on data-access value. Crunchbase, PitchBook, and CB Insights all know this, which is why all three have shipped their own LLM-powered analytics layers in the last 12 months.
Sources
- Crunchbase pricing
- PitchBook contact-for-pricing page
- CB Insights pricing
- Apify Store — Crunchbase scraper segment
- Signal Census: Labor-Intel Wholesale Stack — adjacent enterprise-data segment
- Signal Census: Clay’s Orchestration Layer — scraping-plus-orchestration buyer pattern