A data business does not negotiate its share of the value. It inherits it. The rake – the portion of the economics that stays with the asset owner versus the portion that flows to whoever turns the asset into something a buyer will pay for – is fixed long before anyone opens a term sheet. It is fixed by where the enrichment work sits.
This is the decision most operators make without realising they have made it. They sign a partnership, hand over raw or lightly processed data, and discover eighteen months later that the partner is keeping seventy percent of the economics on an asset the partner did not own. The operator complains about commercial terms. The terms were never the problem. The architecture was.
Enrichment is the work that converts a data asset from something a buyer cannot use into something a buyer will pay for. Cleaning, structuring, modelling, contextualising, attaching identifiers, building the inferential layer that makes the signal commercially legible. There is a narrow class of assets where this does not hold – exclusive, real-time, regulated feeds where the substrate itself is the product and enrichment is trivial. Exchange market data sits there. Almost nothing else does. In the database era enrichment was largely mechanical and the rake question rarely surfaced – the value sat in the join, and the join sat with whoever held the identity spine. In the AI era enrichment is where the model-relevant signal is constructed, and the construction itself is most of the value. Whoever does it keeps it.
Three configurations exist. Each fixes the rake at a different level, and each is chosen – or drifted into – at the moment the commercial architecture is designed, not at the moment a deal is signed. The conventional view holds in one narrow case: when the asset is genuinely unique and the buyer has no substitute. There, the substrate alone commands a rake. Everywhere else, the work commands it.
The first configuration is enrichment inside the asset owner. The data business invests in its own enrichability – its ability to make the asset model-ready before it ever touches a partner or a buyer. The owner ships a product, not a feed. The rake here is high, structurally, because the buyer is paying for something the buyer cannot reconstruct. This is the configuration a retailer reaches when it stops licensing transaction logs and starts licensing audience inferences built on those logs. dunnhumby reached it by turning Tesco’s basket data into a media and insights product Tesco alone could not have sold. The investment is upfront and significant. The economics that follow are not negotiated downward by a partner who did the hard part.
The second configuration is enrichment inside a partner. The asset owner contributes raw or lightly processed data into a joint venture, a clean room, a platform operator, or a specialist that performs the enrichment and faces the buyer. The rake is low and the asset owner rarely understands why until the renewal conversation. The partner is not extracting value through unfair terms. The partner is extracting value because the partner is doing the work that makes the asset commercial. Every dollar the buyer pays is a dollar paid for enrichment the partner performed, on top of an asset the owner could not monetise alone. The split reflects the work. It always will. This is the configuration most retailers, most banks, and most telcos sit inside, and most of them frame it as a partnership problem rather than an architecture problem.
The third configuration is enrichment inside the buyer. The buyer ingests the asset and performs the enrichment itself, internally, against its own models and its own use cases. The rake collapses to commodity pricing on the raw feed. The asset owner has no leverage because the asset, in the form it ships, is interchangeable with any other source of similar signal. This is the configuration foundation model training contracts have made suddenly visible. Reddit’s licensing deal with Google clears around sixty million dollars a year for a corpus Google then enriches into something worth multiples of that inside its own models. The buyer is doing the enrichment. The buyer captures almost all of the economics. The owner is paid for the substrate, not the product.
The operator does not get to renegotiate which configuration they are in once the architecture is set. The leverage to negotiate a higher rake exists only when the enrichment work has already been done, and done by you. After the deal is signed, the location of the work is the location of the value, and no commercial clause rewrites that.
This is why enrichability investment is one of the eight strategic decisions, and why it sits upstream of mode selection rather than downstream of it. An operator who decides to Sell without first deciding where enrichment will sit is not deciding to sell. They are deciding to be a supplier to whoever decides to enrich. An operator who decides to Wrap without enrichment internalised is wrapping somebody else’s product. The Improve mode is the only one where the rake question is internal, and even there it returns the moment the improvement is exposed to a partner.
There is a second-order consequence most boards miss. Enrichment moves over time. An asset enriched today by a partner can be enriched tomorrow by a buyer, the moment the buyer’s internal capability crosses the threshold where the partner stops being necessary. This is the trajectory every data licensing business is on, and the only defence is to keep moving the enrichment frontier – to keep doing work the buyer cannot yet do internally. The rake is not a state. It is a position on a curve, and the curve moves toward the buyer unless the owner keeps investing.
The diagnostic is uncomfortable and fast. Take any data revenue line and ask one question: if the partner or buyer stopped doing the enrichment tomorrow, would the buyer still pay for the asset at the same price? If yes, the rake is yours and the architecture is sound. If no, the rake belongs to whoever is doing the enrichment, and the contract is a lagging indicator of a decision made years earlier. Most operators discover, running this test, that they are not in the data business they think they are in. They are in the substrate-supply business, and the data business sits one layer downstream, owned by someone else.
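For operators who want to run the test across a whole portfolio, the one-question diagnostic can be written down as a rule. What follows is a minimal sketch, not a prescribed method: the names `RevenueLine`, `RakeHolder` and `diagnose` are hypothetical, and the yes/no answer is assumed to come from the operator's own judgment rather than from any data.

```python
from dataclasses import dataclass
from enum import Enum


class RakeHolder(Enum):
    OWNER = "owner"        # the rake is yours; the architecture is sound
    ENRICHER = "enricher"  # the rake belongs to whoever does the enrichment


@dataclass
class RevenueLine:
    """One data revenue line (hypothetical structure, for illustration)."""
    name: str
    # Answer to the single diagnostic question: if the partner or buyer
    # stopped enriching tomorrow, would the buyer still pay the same price?
    same_price_without_external_enrichment: bool


def diagnose(line: RevenueLine) -> RakeHolder:
    """Apply the one-question test to a single revenue line."""
    if line.same_price_without_external_enrichment:
        return RakeHolder.OWNER
    return RakeHolder.ENRICHER


def substrate_supply_lines(lines: list[RevenueLine]) -> list[str]:
    """Names of the lines where the operator is only supplying substrate."""
    return [l.name for l in lines if diagnose(l) is RakeHolder.ENRICHER]
```

The rule is deliberately binary, as the test is. Anything that needs a more nuanced answer than yes or no is, in practice, a no.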
The correction is not commercial. It is architectural. The operator either internalises the enrichment, builds the enrichability into the asset itself, or accepts the rake the current configuration produces and stops being surprised by it. There is no fourth option, and there is no negotiation that substitutes for the design decision.
The rake is decided before the negotiation. The negotiation only discovers it.
*Gianluca Carrera*