Verticals & buyers · 5 min read

Innodata vs Appen: AI Data's Split

Two listed AI-data suppliers, opposite outcomes: Innodata posted $90.1mn Q1 revenue at 47% gross margin while Appen booked another loss on flat sales. Contract concentration was the dividing line.

By Signal Census Editorial · AI Data Supply

Two publicly listed companies built the same business, supplying labelled and collected data to the firms training AI models, and arrived at opposite ends of the market. Innodata reported first-quarter 2026 revenue of $90.1mn, up 54% year-on-year, at a 47% adjusted gross margin; Appen closed its 2025 financial year with revenue of $232.67mn, down 1%, and remained loss-making. The divergence is not about who had the better idea, because both had the same one. It turned on who owned the customer relationship and who was a line item another company could delete.

The same thesis, two income statements

Innodata, listed on the Nasdaq as INOD, sells data engineering and model-training services to large technology firms building generative AI. Its full-year 2025 revenue reached $251.7mn, up 48%, with net income of $32.2mn. The first quarter of 2026 accelerated that: $90.1mn in revenue, adjusted EBITDA of $25mn at a 28% margin, and operating cash flow of $37.3mn that lifted cash on hand to $117.4mn. Management raised full-year 2026 guidance to revenue growth of roughly 40% or more.

Appen, listed in Sydney as APX, sells the same category, human-labelled training data and data collection, to a similar roster of AI labs and big-tech buyers. Its 2025 revenue of $232.67mn was within about 8% of Innodata’s $251.7mn. The difference is direction and quality: Appen’s top line slipped 1%, it booked a net loss of $21.82mn, and its guidance for 2026 calls for US$270mn to US$300mn in revenue at a 5% to 10% EBITDA margin. That is a recovery target, not a growth story.

How one contract broke Appen

The fork traces to a single contract. Google notified Appen in January 2024 that it was terminating their global services agreement, with services ending on 19 March 2024. That contract had generated US$82.8mn in 2023, about 30% of the $273.79mn Appen reported that year. Appen’s shares fell roughly 40% on the news. The loss landed on top of a post-pandemic slide that had already taken revenue from a 2021 level of $447.26mn down through $388.31mn, $273.79mn and $235.22mn in the three years that followed.
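
The size of that exposure can be checked directly from the figures quoted above. The following is a minimal Python sketch, not Appen's own disclosure: it assumes the three unlabelled revenue figures map to 2022, 2023 and 2024, and that the contract value and the revenue figures are in the same currency. The variable and function names are illustrative.

```python
# Appen revenue in millions, as quoted in the article.
# Assumption: the three unlabelled figures correspond to 2022, 2023, 2024.
appen_revenue = {2021: 447.26, 2022: 388.31, 2023: 273.79, 2024: 235.22}

# Revenue from the terminated Google contract in 2023, in millions.
# Assumption: same currency as the revenue series above.
google_2023 = 82.8


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Year-on-year change for each year that has a prior year in the series.
yoy_change = {
    year: round(pct_change(appen_revenue[year - 1], appen_revenue[year]), 1)
    for year in sorted(appen_revenue)
    if year - 1 in appen_revenue
}

# Share of 2023 revenue that came from the single Google contract.
google_share = round(google_2023 / appen_revenue[2023] * 100, 1)

# Cumulative decline from the 2021 level to 2024.
total_decline = round(pct_change(appen_revenue[2021], appen_revenue[2024]), 1)

print(yoy_change)      # {2022: -13.2, 2023: -29.5, 2024: -14.1}
print(google_share)    # 30.2
print(total_decline)   # -47.4
```

On these inputs, the single contract is worth roughly three-tenths of a revenue base that had already lost close to half its 2021 size by 2024.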

That is the disintermediation risk at the centre of the AI-data-supply trade. Appen concentrated its revenue in a handful of hyperscaler contracts, and the value it added — coordinating large crowds of human annotators — was the kind of input a sophisticated buyer can bring in-house or re-tender. Google’s exit did not just remove a customer; it exposed how little switching cost Appen had built.

What Innodata captured instead

Innodata’s 2026 results show the opposite dynamic — and, on closer reading, the same concentration risk wearing better clothes. A big-tech customer that generated no revenue a year earlier is now on track to become Innodata’s second-largest account, and management flagged a new engagement with a leading big-tech buyer it expects could contribute about $51mn in 2026. Revenue from its other big-tech customers grew 453% year-on-year. That is spectacular growth, but it is growth from a small number of very large buyers — the same structural exposure that hurt Appen, currently pointed the right way.

What separates the two today is margin and momentum, not contract diversity. A 47% gross margin and 28% EBITDA margin suggest Innodata sells something closer to engineered output — evaluation, observability, complex data pipelines — than commoditised annotation hours. Appen’s 5% to 10% EBITDA target tells you it still competes largely on labour arbitrage, where margins are thin and buyers hold the power.
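
To put the margin gap in absolute terms, the sketch below recomputes Innodata’s first-quarter EBITDA margin and turns Appen’s 2026 guidance into an implied EBITDA range. It is an illustration under stated assumptions: pairing the low end of the revenue range with the low end of the margin range (and high with high) is a simplification, and the names are hypothetical.

```python
# Figures in millions, as quoted in the article.
innodata_q1_revenue = 90.1
innodata_q1_adj_ebitda = 25.0

# Appen's 2026 guidance: revenue range and EBITDA margin range.
appen_revenue_guidance = (270.0, 300.0)
appen_margin_guidance = (0.05, 0.10)


def margin_pct(profit: float, revenue: float) -> float:
    """Profit as a percentage of revenue."""
    return profit / revenue * 100


innodata_margin = round(margin_pct(innodata_q1_adj_ebitda, innodata_q1_revenue), 1)

# Assumption: low revenue pairs with low margin, high with high,
# which gives the widest implied full-year EBITDA range.
appen_ebitda_low = round(appen_revenue_guidance[0] * appen_margin_guidance[0], 1)
appen_ebitda_high = round(appen_revenue_guidance[1] * appen_margin_guidance[1], 1)

print(innodata_margin)                       # 27.7
print(appen_ebitda_low, appen_ebitda_high)   # 13.5 30.0
```

Read this way, Innodata’s single quarter of adjusted EBITDA sits inside the range Appen is guiding to for a full year.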

The read-through for the web-data economy

The lesson generalises well beyond these two tickers. The web-data and scraping industry sells the same primitive — data, at scale, for someone else’s model or product — and faces the same question: does the supplier own a durable position, or is it one procurement decision away from a 40% drawdown? Infrastructure vendors that own proxy networks, anti-bot evasion and managed datasets occupy a sturdier position than firms reselling undifferentiated collection. Bright Data and Apify have built recurring, self-serve revenue across thousands of customers precisely to avoid the single-contract cliff that broke Appen; you can see the breadth of that long-tail demand in the public catalogue at Apify Store.

Innodata and Appen are the same bet priced by the market at opposite extremes. Innodata’s market capitalisation sits near $3.4bn after a roughly 90% single-day jump on its Q1 print: the stock closed at $45.64 on 7 May 2026 and $86.84 the next day. Appen’s market value has fallen to about A$310mn, a small fraction of what the company was worth when its shares traded at A$41.49 in 2020. The data-supply thesis was right. Owning the customer, not just the data, is what decided who got paid for it.
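
The single-day move follows from the two closing prices quoted above. A short Python sketch of the arithmetic, with illustrative variable names; the second calculation, the fall that would take the stock back to its prior close, is an added illustration and not a forecast.

```python
# Innodata closing prices quoted in the article, in dollars.
close_7_may = 45.64
close_8_may = 86.84

# Single-day gain as a percentage of the earlier close.
jump_pct = round((close_8_may - close_7_may) / close_7_may * 100, 1)

# Decline from the later close that would return to the earlier close.
# Gains and losses are not symmetric in percentage terms.
reversal_pct = round((close_8_may - close_7_may) / close_8_may * 100, 1)

print(jump_pct)       # 90.3
print(reversal_pct)   # 47.4
```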

What to watch next

Innodata’s premium now rests on retaining big-tech accounts that are themselves building internal data capability. The next two quarters of customer-concentration disclosure will show whether its margin lead is a moat or a head start.