After Bright Data v. Meta: The Doctrine Holds, For Now

Judge Chen's January 2024 summary judgment for Bright Data, which held that Meta's Terms of Service bind only logged-in users, survived 2025 without challenge. Meta dismissed its residual claim and waived appeal. The bigger 2025 docket is NYT v. OpenAI, which advanced past the motion-to-dismiss stage.

By Signal Census Editorial · May 1, 2026

The 2024 summary judgment Bright Data won against Meta is the load-bearing US precedent for logged-out scraping of public data. Two years on, the doctrine still holds.

On January 23, 2024, Judge Edward Chen of the Northern District of California granted summary judgment to Bright Data in the case Meta brought against the company over scraping and resale of public Facebook and Instagram data. The ruling held that Meta’s Terms of Service “do not bar logged-off scraping of public data” — meaning the ToS contractually bind only logged-in users, and a scraper accessing the public-facing surface without a session is not bound by them.

Meta dismissed its residual tortious-interference claim a month later and waived appeal. The ruling stands.

No follow-on case has materially narrowed the doctrine. The conclusion enterprise procurement teams reached in 2024, that resale of logged-out scraped public data is legally defensible in the US, has held through 2025 and into 2026.

The bigger 2025 legal story for the data market was a different docket entirely.

The doctrine’s actual reach

Other district courts in the Ninth Circuit have followed the Chen ruling in cases with similar fact patterns: public data, logged-out access, resale to third parties. No published Ninth Circuit appellate ruling has overturned or narrowed it. Meta has not refiled.

The doctrine’s reach is, however, narrower than it is sometimes characterized. It applies specifically to:

Logged-out access only. The moment a scraper authenticates with a session token, the ToS attach and the protections of the Bright Data ruling do not apply.
Public-facing surfaces only. Data that requires login to view is not covered. Profile data behind a “view full profile” wall on LinkedIn is not the same as profile data shown to logged-out visitors.
Contract-breach theory only. The ruling arose on resale-side facts and resolves the contract-breach theory specifically. It does not address every CFAA argument, every state-law tort, or every copyright theory that might be advanced against a scraper.
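
The logged-out condition is the easiest of the three to check mechanically. A minimal sketch, assuming a scraper's outgoing headers are available as a plain mapping: the function name and the header set below are illustrative assumptions, not a legal test and not part of any vendor's API. It is deliberately conservative and treats any non-empty cookie as a possible session.

```python
from typing import Mapping

# Headers that commonly carry an authenticated session.
# Assumption: this short set is illustrative, not exhaustive.
SESSION_HEADERS = {"cookie", "authorization"}


def carries_session_credentials(headers: Mapping[str, str]) -> bool:
    """Return True if any session-bearing header is present and non-empty."""
    return any(
        name.lower() in SESSION_HEADERS and bool(value.strip())
        for name, value in headers.items()
    )


if __name__ == "__main__":
    logged_out = {"User-Agent": "example-bot/1.0", "Accept": "text/html"}
    logged_in = {"User-Agent": "example-bot/1.0", "Cookie": "sessionid=abc123"}
    print(carries_session_credentials(logged_out))
    print(carries_session_credentials(logged_in))
```

A check like this only covers the first condition. Whether the target surface is public, and which legal theory a plaintiff might advance, cannot be read off the request headers.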

For practical purposes, that is enough to support the major commercial scraping operations targeting public web data, which is why the Apify Store ecosystem and larger commercial vendors like Bright Data operate without an existential legal threat in 2026. But it does not protect every operation. Scrapers that authenticate, scrapers that access non-public surfaces, and scrapers that touch copyrighted content (rather than facts) all face residual exposure that the Chen ruling does not cover.

Why nothing has materially narrowed it

The expected challenges did not arrive. Several plausible scenarios for narrowing the ruling were available to Meta and to other platform operators:

A new ToS that explicitly addresses logged-out scraping (Meta could update the FB/IG terms to attempt to bind anonymous visitors via header-injected acceptance flows). No such update has been published.
An en banc Ninth Circuit appeal of a related case (REX v. Zillow / NAR was the most plausible vehicle). REX lost at the district level on the antitrust theory, narrowing the appellate path; no related case has produced a vehicle large enough to relitigate the contract-formation question Chen decided.
A new state-law theory advanced in California or another permissive jurisdiction. None has surfaced at the level required to threaten the federal-law conclusion.

The most likely explanation for the silence is that Meta and the other platform operators have concluded the litigation cost is not worth the marginal benefit. Bright Data has demonstrated that aggressive defense plus willingness to litigate to summary judgment produces favorable rulings. Other platforms have moved enforcement to non-litigation channels: API restrictions, IP blocks, ToS enforcement against authenticated users only, and structural moves like Zillow’s Listing Access Standards.

None of this amounts to a ruling that the scraping side definitively won. It is a tactical retreat by the platform operators from a legal channel that has become unfavorable.

The NYT v. OpenAI docket and what it means

While the scraping-side doctrine held, the AI-training-side doctrine became the more active legal frontier. NYT v. OpenAI advanced past the motion-to-dismiss stage in March 2025. The court denied OpenAI’s motion to dismiss the main copyright claims, narrowed but preserved the DMCA §1202 claims, and ordered discovery that includes approximately 20 million ChatGPT user logs.

The NYT case is not a scraping case in the narrow sense. The NYT does not allege OpenAI scraped its content directly; it alleges OpenAI trained models on its content (sourced through Common Crawl and other intermediaries) and that the trained models reproduce protected content in user-facing output. The legal theory is downstream of the scraping question.

But the NYT case matters for the scraping market for one specific reason: it tests whether the buyer of scraped data has independent legal exposure for using that data, even if the scraping itself was lawful under the Bright Data doctrine. If the answer is yes, and AI labs face copyright liability for training on scraped public content, then demand from the largest buyers of scraped data shrinks sharply. That is the same dynamic visible in the Reddit licensing era: buyers preferring contracted access over scraped access for liability insulation.

The NYT case has not yet produced a fair-use ruling at the appellate level. Two district-court rulings in the broader AI-training docket, Bartz v. Anthropic and Kadrey v. Meta (partial), went favorably for AI labs on fair use, but as district-court decisions neither binds other courts. The 2026 question is whether NYT v. OpenAI produces a clear answer one way or the other.

The legal frame for actor strategy

For Apify Store publishers, the legal landscape in 2026 is more favorable than it has been in five years for scraping public web data, and progressively less favorable for selling that data into AI training pipelines.

The implication for actor strategy:

Lead generation, market intelligence, price comparison, real-time competitive monitoring: legally well-protected use cases. The Bright Data doctrine holds. The buyer demand is healthy.

AI training data sales: legally exposed downstream of the scraping itself. The Apify-class actor publisher may not have direct exposure if they are selling raw output, but their buyers increasingly do. That dynamic is suppressing demand for scraped data sold for training.

Authenticated-session scraping: continues to be the highest-risk segment. The Bright Data doctrine does not cover it. Actors that require user-supplied cookies to operate sit outside the legal protections that cover the larger logged-out segment of the market.
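
The three segments above can be read as a simple triage rule. The sketch below is an illustration only: `ActorProfile`, its two fields, and `legal_segment` are hypothetical names invented for this example, not Apify API types, and the ordering (authentication checked first) is an assumption drawn from the article's description of authenticated scraping as the highest-risk segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActorProfile:
    """Hypothetical description of an actor; not an Apify API type."""
    requires_user_cookies: bool   # actor needs a user-supplied session to run
    sold_for_ai_training: bool    # output is marketed as model-training data


def legal_segment(actor: ActorProfile) -> str:
    """Map an actor to one of the three segments described in the text."""
    # Authenticated access falls outside the Bright Data doctrine entirely,
    # so it is checked before anything else.
    if actor.requires_user_cookies:
        return "authenticated-session: highest risk"
    # Logged-out scraping may be protected, but buyers carry downstream exposure.
    if actor.sold_for_ai_training:
        return "ai-training sales: downstream exposure"
    return "logged-out public data: well protected"


if __name__ == "__main__":
    print(legal_segment(ActorProfile(requires_user_cookies=False,
                                     sold_for_ai_training=False)))
```

Real actors rarely reduce to two booleans; the point of the sketch is only that the authentication question dominates the other two.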

The Q1 2026 lead and contact extractors census showed that the dominant positioning phrase in the segment is “No Cookies” — actors that operate without authenticated sessions. That phrase is not just a technical preference. It is the positioning that lines up most cleanly with the legal protections established by the Bright Data ruling. The leaders in the segment have been pricing for legal defensibility as well as technical capability, and that pricing is correct given the case-law evolution of 2024–2026.

The doctrine holds. The buyer mix is shifting. The publishers who position for the durable buyer demand, rather than the shrinking AI-training buyer demand, will be the ones who continue to ship through the next legal cycle.
