Vendor landscape · 4 min read

Apify Title Keywords: 'Scraper' in 77%, LinkedIn Dominant

Of 3,155 tracked Apify titles, 'scraper' appears in 76.7%. 'LinkedIn' tops target keywords at 27.6%; jobs/profile/search/company follow. Naming default is target+scraper. Recruiting and B2B targets dominate the title corpus far beyond their share of the catalog.

By Signal Census Editorial · May 31, 2026 · Apify Title Keyword Density

Apify · marketplace signal

Tokenizing the titles of all 3,155 Apify actors with continuous data history produces a striking result: the word “scraper” appears in 2,419 titles — 76.7% of the tracked catalog. The naming convention is functionally locked at “[Target] Scraper” or “[Target] [Type] Scraper”. A publisher who launches an actor without “scraper” in the title is in the minority across the entire marketplace.

The target-keyword distribution that follows the “scraper” baseline tells a sharper story about what Apify actually serves. LinkedIn appears in 870 titles (27.6%). Jobs / profile / search / company follow in the 7-13% range. The recruiting and B2B-prospecting use case dominates the title corpus far more aggressively than it dominates the underlying catalog by demand share or category density.

The keyword frequency

Top 20 title keywords across 3,155 tracked actors, excluding stop-words and pure numbers:

Keyword	Title mentions	Share
scraper	2,419	76.7%
linkedin	870	27.6%
jobs	412	13.1%
profile	283	9.0%
job	266	8.4%
search	241	7.6%
company	233	7.4%
reviews	184	5.8%
email	150	4.8%
instagram	147	4.7%
google	134	4.2%
posts	128	4.1%
api	126	4.0%
post	125	4.0%
trustpilot	120	3.8%
cookies	103	3.3%
tiktok	103	3.3%
data	99	3.1%
facebook	91	2.9%
fast	90	2.9%

The top tier (scraper + linkedin) is so dominant that the next-most-frequent keyword (jobs) appears at less than half the LinkedIn rate. The 2,419 “scraper” mentions are a publisher reflex — the word is the assumed identifier of the actor’s category to the buyer.
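A minimal sketch of how a keyword table like the one above could be produced. The stop-word list, the tokenizer regex, and the sample titles are all illustrative assumptions; the census does not publish its exact tokenization rules. Each title is counted at most once per keyword, which matches the "appears in N titles" framing.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not the census's actual list).
STOP_WORDS = {"a", "an", "and", "the", "for", "of", "to", "with", "in", "by"}


def tokenize(title: str) -> set[str]:
    """Lowercase a title and return its distinct keyword tokens."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    # Drop stop-words and pure numbers, as the table description specifies.
    return {t for t in tokens if t not in STOP_WORDS and not t.isdigit()}


def keyword_shares(titles: list[str]) -> list[tuple[str, int, float]]:
    """Return (keyword, title mentions, share in percent), most frequent first."""
    counts: Counter[str] = Counter()
    for title in titles:
        counts.update(tokenize(title))  # a set, so one count per title per keyword
    total = len(titles)
    return [(kw, n, round(100 * n / total, 1)) for kw, n in counts.most_common()]


# Hypothetical titles, not taken from the Apify Store.
sample = [
    "LinkedIn Jobs Scraper",
    "LinkedIn Profile Scraper",
    "Instagram Posts Scraper",
    "Trustpilot Reviews Scraper 2026",
    "LeadFinder Pro",
]

for keyword, mentions, share in keyword_shares(sample)[:2]:
    print(f"{keyword}\t{mentions}\t{share}%")
```

Counting a set of tokens per title, rather than every occurrence, keeps a title like "Scraper Scraper Pro" from inflating the share.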

The “cookies” keyword at 3.3% deserves an aside. The presence of “cookies” in 103 titles is a search-discovery move: actors that explicitly market themselves as “no cookies required” or “without login” appear in this band. The convention is documented in the Q1 2026 lead-extractors census as the canonical positioning move in the LinkedIn-scraping segment, where the leading actor used “No Cookies” to clear the segment density floor.

What the title-keyword corpus reveals

Three patterns sit in the data.

The naming convention is doing discovery work. A buyer searching the Apify Store with “linkedin scraper” can match up to 870 titles, the number that contain “linkedin”. A buyer searching with “instagram data” matches a much smaller subset, because “instagram” appears in only 147 titles and “data” in only 99, so at most 99 titles can contain both. Publishers who chose target+scraper naming are visible to keyword search; publishers who chose creative naming are not. The discovery surface punishes deviation from the convention.
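The ceiling on a multi-word query can be sketched as follows. This is a toy model that assumes the search requires every query word to appear in the title; the Apify Store's real ranking and matching logic is not documented in this article and may differ. The function names are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a query or title."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def title_matches(query: str, titles: list[str]) -> list[str]:
    """Titles that contain every query token (toy AND-match model)."""
    wanted = tokens(query)
    return [t for t in titles if wanted <= tokens(t)]


def match_ceiling(query: str, title_counts: dict[str, int]) -> int:
    """Upper bound on AND-matches: the title count of the rarest query token."""
    return min(title_counts.get(tok, 0) for tok in tokens(query))


# Title counts taken from the keyword table.
counts = {"scraper": 2419, "linkedin": 870, "instagram": 147, "data": 99}

print(match_ceiling("linkedin scraper", counts))  # 870
print(match_ceiling("instagram data", counts))    # 99

# Hypothetical titles: the creatively named actor never matches.
print(title_matches("linkedin scraper", ["LinkedIn Jobs Scraper", "LeadFinder Pro"]))
```

The ceiling is only a bound. The actual number of matches depends on how often the query words co-occur, which the published table does not report.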

The recruiting/B2B cluster dominates the surface. Combining jobs (412), profile (283), job (266), company (233), email (150), people-related keywords, and linkedin (870), the recruiting-and-B2B-prospecting use case accounts for well over half of the corpus by keyword mentions, although a title that carries several of these words is counted once for each. The actual demand for these targets is large but does not represent half the marketplace. The over-representation in titles reflects publisher behavior: an easy target, lots of demand, and a low barrier to launching another LinkedIn or jobs scraper. The result is a saturated naming corpus that further compresses the visible differentiation between actors.
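Because one title can carry several cluster keywords ("LinkedIn Jobs Scraper" carries two), the per-keyword counts bound the number of distinct titles rather than fix it. A short sketch of that arithmetic, using only the six counts itemized in the table; the unspecified "people-related keywords" are left out, which is an assumption of this example.

```python
TOTAL_TITLES = 3155

# Title counts from the keyword table for the itemized cluster keywords.
cluster = {
    "linkedin": 870,
    "jobs": 412,
    "profile": 283,
    "job": 266,
    "company": 233,
    "email": 150,
}

# Sum of mentions: a title is counted once per cluster keyword it carries.
mention_sum = sum(cluster.values())

# Upper bound on distinct titles: reached only if no title carries two cluster keywords.
upper = min(mention_sum, TOTAL_TITLES)

# Lower bound: reached only if every other keyword always co-occurs with the largest one.
lower = max(cluster.values())

print(f"mentions: {mention_sum}")
print(f"distinct titles: {lower} ({lower / TOTAL_TITLES:.1%}) to {upper} ({upper / TOTAL_TITLES:.1%})")
# mentions: 2214
# distinct titles: 870 (27.6%) to 2214 (70.2%)
```

Pinning down where the cluster falls inside that range would require the per-title token sets, not just the per-keyword totals.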

Platform names dominate target keywords. LinkedIn, Instagram, Google, Trustpilot, TikTok, Facebook, Twitter, Glassdoor, Indeed, YouTube — the top target keywords are all named consumer-internet or B2B platforms. The actors targeting smaller or less-well-known sites are invisible at the title-keyword level because their target name does not have search volume. Publishers building scrapers for niche sites face a structural discovery problem that the title corpus makes worse.

The naming-strategy implication

For a publisher considering an actor launch, the title-keyword data points at a tradeoff between discoverability and differentiation.

Conform to the convention. Use target+scraper naming. Get keyword-search discovery. Compete head-on with the 870 LinkedIn-titled actors or the 412 jobs-titled actors already listed. Win or lose on actor quality, pricing, and rank rather than on naming.

Defy the convention. Use creative naming (“LeadFinder Pro”, “DataHarvest”, “ScrapeFlow”). Get marketing differentiation. Lose keyword-search discovery — buyers searching for “linkedin scraper” never see the actor. Win only if the publisher has external traffic (own website, content marketing, paid acquisition) that bypasses the Store’s discovery surface.

The 76.7% adoption of “scraper” in titles is the publisher consensus that discoverability beats differentiation on the Apify Store. The 23.3% who defy the convention are either pre-existing brands with their own distribution or are accepting reduced discoverability in exchange for some other advantage.
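The conform-or-defy split can be checked mechanically. The sketch below assumes the convention test is simply "does the word scraper appear as a standalone token", which is one plausible reading of how the 76.7% was counted rather than a confirmed rule. The first two titles are hypothetical; the last three are the creative names used as examples above.

```python
import re


def follows_convention(title: str) -> bool:
    """True when 'scraper' appears as a standalone word in the title."""
    return "scraper" in re.findall(r"[a-z0-9]+", title.lower())


titles = [
    "LinkedIn Profile Scraper",  # hypothetical conforming title
    "Indeed Jobs Scraper",       # hypothetical conforming title
    "LeadFinder Pro",
    "DataHarvest",
    "ScrapeFlow",                # contains "scrape" but not the token "scraper"
]
for title in titles:
    label = "conforms" if follows_convention(title) else "defies"
    print(f"{title}: {label}")

# Catalog-level split from the article's counts.
total, with_scraper = 3155, 2419
without = total - with_scraper
print(without, f"{without / total:.1%}")  # 736 23.3%
```

Under this token-level test, a brand name such as "ScrapeFlow" lands in the 23.3% even though it hints at scraping, which is the discoverability cost the paragraph describes.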

The MCP-era buyer dynamic — agents picking actors from typed tool lists — changes the math slightly. An LLM agent reading an actor’s input schema and description does not key on the title-keyword convention the same way a human browsing the Store does. For agent-driven discovery, the differentiation move (clearer description, better-typed schema, explicit success-rate claims) may be more valuable than the keyword-conformity move.
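To make the agent-facing surface concrete, here is a sketch of a tool definition in the general MCP shape (a name, a description, and a JSON Schema under inputSchema), together with a small check for the gaps an agent would stumble on. The actor name, its fields, and the schema_gaps helper are all hypothetical, and how Apify actually exposes actors to agents is not covered by this article.

```python
# Hypothetical tool definition; only the three top-level field names follow the MCP tool shape.
tool = {
    "name": "example-linkedin-jobs",
    "description": "Returns public job postings for a search query as structured records.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Job title or keywords to search for."},
            "location": {"type": "string", "description": "City or country filter."},
            "maxResults": {"type": "integer", "minimum": 1, "description": "Upper limit on returned postings."},
        },
        "required": ["query"],
    },
}


def schema_gaps(tool: dict) -> list[str]:
    """List the omissions that leave an agent guessing about how to call the tool."""
    gaps = []
    if not tool.get("description", "").strip():
        gaps.append("tool has no description")
    schema = tool.get("inputSchema", {})
    props = schema.get("properties", {})
    for name, spec in props.items():
        if "type" not in spec:
            gaps.append(f"{name}: missing type")
        if not spec.get("description", "").strip():
            gaps.append(f"{name}: missing description")
    for name in schema.get("required", []):
        if name not in props:
            gaps.append(f"{name}: required but not defined")
    return gaps


print(schema_gaps(tool))  # []
```

Nothing in this check looks at the title. That is the point of the contrast: for an agent, the description and the typed inputs carry the information that the word "scraper" carries for a human scanning search results.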

But agent-driven discovery is still a minority of Apify Store traffic in 2026. Until the share crosses 30-40% (likely 2027-2028), the convention holds. The publishers who optimize for the human buyer use “scraper” + target name. The publishers who are already preparing for the agent buyer use clearer descriptions and richer schemas. Most actors today are written for the human; the publishers writing for the agent are the early movers in a discovery transition that has not fully arrived.

What it means for the catalog shape

The structural reading is that the Apify Store is, by its title-keyword corpus, a marketplace of scrapers for named consumer-internet platforms. The variety in the catalog (data tools, automation utilities, AI agents, MCP servers) exists but is far less visible in titles than the recruiting/social-media/B2B-prospecting cluster.

The title corpus also bears on the platform’s positioning. Apify markets itself as a “web scraping and automation platform”, the full breadth. The title corpus says the part of the platform publishers actually build for is much narrower: scrape LinkedIn, scrape Indeed, scrape Trustpilot, scrape Instagram, scrape Google. The broader use cases exist in the long tail but do not dominate the visible surface.

For competitors trying to enter the same space, the title-keyword data is a useful signal about where buyer attention concentrates. A new entrant who builds for the same target set is competing in the dense band of the corpus. A new entrant who builds for targets that do not appear in the top 20 keywords is competing in a less-saturated band but also a lower-demand one. The math says the dense band has more demand per target than the sparse band, but also dramatically more supply.

Sources

Apify Store — full catalog
Signal Census pulse data — 3,155 actors with continuous data history, title corpus tokenized 2026-05-16
Signal Census: 3-Tag Default Backfires — companion analysis on category-tag spray
Signal Census: Pareto-of-Pareto — demand-side concentration
Q1 2026 Lead Extractors Census — segment-specific positioning analysis