Vendor landscape · 4 min read

Apify's Pareto-of-Pareto: 0.67% Take Half of Demand

Of 3,155 tracked Apify actors, the top 0.67% (21 actors) capture 50% of measured demand. The top 3.87% take 80%, and the top three actors alone hold 18.5%. The distribution is sharper than 80/20: closer to 80/4, or 50/0.67. The median actor serves 7 users/month.

By Signal Census Editorial

The Pareto principle says 80% of the outputs come from 20% of the inputs. On the Apify Store, the principle holds but the ratio is significantly more extreme. Of 3,155 actors with continuous data history, the top 0.67% (21 actors) capture 50% of all measured demand. The top 3.87% capture 80%. The top three actors alone capture 18.5%.

That is not 80/20. That is 80/4, and inside the 80/4 result there is a 50/0.67 result, and inside that there is an 18.5/0.1 result. The distribution is recursively concentrated: what looks like a power law at one level of aggregation reveals another, sharper power law inside it.
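One way to make the nesting concrete is to divide each demand share by the catalog share that produces it, which gives the multiple by which a tier outperforms an even split. A minimal sketch using only the figures quoted above (the variable names are illustrative):

```python
# (demand share %, catalog share %) pairs quoted in the text
tiers = {
    "80/3.87": (80.0, 3.87),
    "50/0.67": (50.0, 0.67),
    "18.5/0.1": (18.5, 0.1),
}

# A multiple of 1.0 would mean demand is spread evenly across actors.
multiples = {name: demand / catalog for name, (demand, catalog) in tiers.items()}

for name, multiple in multiples.items():
    print(f"{name}: {multiple:.1f}x an even split")
```

For comparison, a textbook 80/20 split gives a multiple of 4. The three tiers here come out at roughly 21x, 75x, and 185x, and the multiple grows as the slice narrows.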

[Figure: Demand concentration. A tiny actor tier captures most Apify demand. Y-axis: demand share, 0% to 100%. X-axis: top actor percentile, 0.1% to 50%. Marked point: 80% of demand = top 3.87%.]

The top 1% already holds 58.2% of observed demand; by the top 5% the share reaches 82.7%, and the curve flattens from there.

The full cumulative curve

Across the tracked subset of 3,155 actors:

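| Top X% of actors | Actor count | Cumulative demand share |
|---|---|---|
| Top 0.1% | 3 | 18.5% |
| Top 0.5% | 15 | 44.0% |
| Top 1.0% | 31 | 58.2% |
| Top 2.0% | 63 | 71.2% |
| Top 5.0% | 157 | 82.7% |
| Top 10% | 315 | 89.4% |
| Top 20% | 631 | 94.9% |
| Top 50% | 1,577 | 99.2% |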
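The cumulative rows can also be read as rank bands, which shows how much demand each successive slice of the catalog adds. The sketch below uses only the table's figures. Rounding the actor counts down is an assumption; it happens to reproduce every count in the table.

```python
import math

TRACKED = 3155  # tracked actors, from the text

# (top X% of actors, cumulative demand share %) rows from the table
curve = [(0.1, 18.5), (0.5, 44.0), (1.0, 58.2), (2.0, 71.2),
         (5.0, 82.7), (10.0, 89.4), (20.0, 94.9), (50.0, 99.2)]

bands = []  # (first rank, last rank, demand share added by this band in % points)
prev_count, prev_share = 0, 0.0
for pct, share in curve:
    count = math.floor(TRACKED * pct / 100)  # assumption: counts are rounded down
    bands.append((prev_count + 1, count, round(share - prev_share, 1)))
    prev_count, prev_share = count, share

for first, last, added in bands:
    print(f"ranks {first}-{last}: +{added} points")
```

Ranks 4 through 15 add 25.5 points of demand share, more than the 18.5 points held by the top three, while ranks 632 through 1,577 add only 4.3 points between them.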

Read the other way:

| Demand threshold | Actors needed to reach it | Share of catalog |
|---|---|---|
| 50% of all demand | 21 actors | 0.67% |
| 80% of all demand | 122 actors | 3.87% |
| 90% of all demand | 338 actors | 10.71% |
| 95% of all demand | 642 actors | 20.35% |
| 99% of all demand | 1,475 actors | 46.75% |
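A threshold table like this one can be derived from per-actor demand by sorting actors from largest to smallest and counting how many are needed before the running total crosses each threshold. The sketch below is one plausible way to do that, not necessarily the method behind these numbers. The function name `actors_needed` and the ten-actor dataset are hypothetical.

```python
def actors_needed(demand, threshold):
    """Smallest number of top actors whose combined demand reaches
    `threshold` (a fraction between 0 and 1) of total demand."""
    total = sum(demand)
    running = 0
    for rank, value in enumerate(sorted(demand, reverse=True), start=1):
        running += value
        if running >= threshold * total:
            return rank
    return len(demand)

# Hypothetical monthly user counts for a ten-actor catalog (not Apify data)
toy = [520, 210, 90, 70, 45, 30, 15, 12, 5, 3]

for threshold in (0.5, 0.8, 0.9, 0.95, 0.99):
    n = actors_needed(toy, threshold)
    print(f"{threshold:.0%} of demand: {n} actors ({n / len(toy):.0%} of catalog)")
```

Dividing the result by the catalog size gives the "share of catalog" column. For the real table, 21 / 3,155 is 0.67% and 122 / 3,155 is 3.87%.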

The shape implies that the “long tail” framing is wrong for Apify. The tail is not long. The tail is most of the catalog. The head is functionally three actors, sitting on top of a body of 100-1,000 mid-tier actors, sitting on top of a graveyard of 2,000+ near-zero actors.

Why the distribution is this sharp

Three structural mechanisms compound to produce the 50/0.67 shape.

Discoverability winner-take-most. A buyer searching for “Instagram scraper” lands on the top-ranked result first. The top-ranked result is whichever actor is currently the most-used Instagram scraper. The most-used Instagram scraper is the one whose current user count rises fastest, which is partly a function of being the most-used yesterday. The positive feedback loop produces sharp concentration on the actor that briefly led the segment and then locked in distribution.
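The feedback loop can be illustrated with a toy simulation. This is a deliberately simplified model, not a description of how Apify Store ranking works: it assumes each arriving user picks the current most-used actor with a fixed probability `p_top` and otherwise picks at random. All names and parameter values are hypothetical.

```python
import random

def simulate_ranked_discovery(n_actors=50, n_users=10_000, p_top=0.6, seed=0):
    """Toy model: each arriving user picks the current most-used actor with
    probability p_top, otherwise a uniformly random actor.
    Returns per-actor user counts sorted from largest to smallest."""
    rng = random.Random(seed)
    users = [0] * n_actors
    for _ in range(n_users):
        if rng.random() < p_top:
            # The top-ranked result is whichever actor leads right now.
            choice = max(range(n_actors), key=users.__getitem__)
        else:
            choice = rng.randrange(n_actors)
        users[choice] += 1
    return sorted(users, reverse=True)

counts = simulate_ranked_discovery()
leader_share = counts[0] / sum(counts)
print(f"leader share: {leader_share:.1%} across {len(counts)} actors")
```

In this model, whichever actor takes an early lead keeps receiving the ranked traffic, so it ends up with a majority of users even though all 50 actors start out identical.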

The 25,787-actor catalog is mostly invisible. Only about 3,155 of those have enough continuous data history to be tracked at all. The remaining roughly 22,600 are below the discovery threshold: newly published, abandoned, or serving fewer users than the measurement tool can reliably count. The “catalog” in the buyer’s effective experience is closer to 3,155 actors, and the head of that subset is what gets seen.

Multi-actor publisher overlap. The tracked top decile contains repeat publishers: Apify the company runs several of the top-ranked actors, and a handful of multi-actor publishers operate two or three actors that each appear in the top 1%. The aggregate concentration is therefore even sharper at the publisher level than at the actor level.
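The publisher-level effect follows from simple aggregation: when one publisher owns several high-demand actors, summing by publisher pushes the top share up. A minimal sketch with hypothetical rows (the publisher names, actor names, and user counts are invented for illustration):

```python
from collections import defaultdict

# Hypothetical (publisher, actor, monthly users) rows; not Apify data
rows = [
    ("pub_a", "actor_1", 400),
    ("pub_a", "actor_2", 150),
    ("pub_b", "actor_3", 200),
    ("pub_c", "actor_4", 120),
    ("pub_b", "actor_5", 80),
    ("pub_d", "actor_6", 50),
]

def top_share(values, k):
    """Share of the total held by the k largest values."""
    ordered = sorted(values, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

# Sum each publisher's actors into a single demand figure.
by_publisher = defaultdict(int)
for publisher, _actor, users in rows:
    by_publisher[publisher] += users

actor_top1 = top_share([users for _, _, users in rows], 1)
publisher_top1 = top_share(by_publisher.values(), 1)
print(f"top actor: {actor_top1:.0%}, top publisher: {publisher_top1:.0%}")
```

Merging actors into publishers can only hold or raise the top-k share, because the top k actors belong to at most k publishers. In this toy data the top actor holds 40% while the top publisher holds 55%.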

What 50/0.67 means for the marketplace shape

The standard mental model for a two-sided marketplace assumes a fat middle — many publishers, each serving meaningful demand, with the platform aggregating demand-side discovery. Apify’s distribution does not match that model. The middle is thin. The platform’s aggregation does produce visibility, but the visibility flows to a tiny head cohort.

Three implications:

The “marketplace” is structurally a small set of canonical actors plus optionality. A buyer’s typical interaction is to find the canonical actor for their target (Instagram Scraper, LinkedIn Profile Scraper, Crunchbase Scraper) and use it. The optionality — alternative actors for the same target — exists in the catalog but is rarely used. The 50/0.67 shape is the data-side confirmation that this is how buyers actually behave.

Publisher entry into the head is exceptionally rare. A new actor entering the 0.67% requires displacing one of the 21 actors already there. Those 21 actors have entrenched discovery, accumulated review-count, and pricing positions that benefit from network effects. The entry path for a new publisher is not “compete in the top tier” — it is to find a niche target that the existing canonical actors do not serve and build a new canonical position there.

The platform’s revenue depends on the head. Apify’s pricing model (PPE plus platform-take) means the platform’s revenue is roughly proportional to the demand. With 50% of demand in 21 actors and 80% in 122, Apify’s revenue depends asymmetrically on whether those 122 actors stay healthy. The platform-side investment in those publishers — featured placement, support, payment-rail integration — reflects that dependency.

What this means for new publishers

For a creator weighing whether to launch a new actor in a competitive category, the 50/0.67 data is honest about the entry problem. The expected outcome of “launch a new Instagram scraper” is to land somewhere among the roughly 1,680 tracked actors (3,155 minus the 1,475 that account for 99% of demand) that collectively capture 1% of demand. The expected revenue from that position is close to zero.
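The gap between the head and the tail can be put in per-actor terms using the threshold table's figures. The sketch assumes demand is spread evenly within each group, which it is not, so the results are averages only.

```python
TRACKED = 3155                      # tracked actors, from the text

head_actors, head_share = 21, 0.50  # 21 actors capture 50% of demand
tail_actors = TRACKED - 1475        # actors outside the 1,475 that reach 99%
tail_share = 0.01                   # the remaining 1% of demand

avg_head = head_share / head_actors  # average share of demand per head actor
avg_tail = tail_share / tail_actors  # average share of demand per tail actor
ratio = avg_head / avg_tail

print(f"tail actors: {tail_actors}")
print(f"average head actor: {avg_head:.2%} of demand")
print(f"average tail actor: {avg_tail:.5%} of demand")
print(f"head-to-tail ratio: about {ratio:,.0f}x")
```

On these averages, a head actor holds about 2.4% of all measured demand and a tail actor about 0.0006%, a ratio of roughly 4,000 to 1.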

The viable entry paths run two ways:

Find an underserved target. A category-target combination that the existing canonical actors do not cover. The MCP_SERVERS category in 2025-2026 was an example — new category, no canonical incumbents, opportunity to land a top-ranked position by being early. Most categories no longer have this opening.

Build the canonical actor for a new use case. An actor that defines a new category of extraction (lead enrichment, AI-citation tracking, agent-driven workflow scraping) rather than competing in an existing one. This is harder but produces a much better outcome distribution if it works.

The Apify Store entry math is not encouraging if the publisher’s plan is “competent execution in an existing category.” The 50/0.67 distribution rewards either pre-existing dominance or genuine category creation. The middle path — better execution on a known target — is the one the data says does not work for new entrants.

The marketplace’s three-level concentration (actor long-tail, publisher one-hit, category zombie-rate from earlier in the week) is the same shape from three different angles. The Pareto-of-Pareto distribution is the underlying numerical signature of a marketplace where discovery, network effects, and platform mechanics all reinforce concentration at the head and starve the tail of demand.

