How Uber Eats Ranks Restaurant Ads (And What It Means for Your Marketplace)

Most marketplace ad systems are still fairly blunt. A restaurant pays for placement, gets pushed higher in the feed, and the ranking model mixes in a few engagement signals and relevance features.

Uber recently published a detailed breakdown of how it rebuilt the ads ranking system inside Uber Eats. The technical architecture is worth understanding, but the more important point is the problem Uber was solving: static user profiles are weak proxies for intent, and weak intent modeling eventually hurts advertiser ROI.

That problem exists in almost every marketplace ad product.

The core lesson: sponsored ranking is not just a monetization system. It is a matching system. If it understands buyer intent poorly, it weakens trust on both sides of the marketplace.


The Problem With Aggregate User Features

Before this rebuild, Uber Eats was using the same type of aggregate behavioral features most marketplaces still rely on:

  • clicks in the last 30 days
  • order history by category
  • impression counts
  • cuisine preferences

The issue is that aggregate features flatten behavior into averages.

A user who consistently orders sushi but occasionally tries Thai looks very different from someone who primarily orders Thai and clicked sushi twice last month. Aggregate counts can make those users appear similar even though the underlying intent is different.

Uber identified this as a core limitation. Its model was reading snapshots of behavior instead of sequences of behavior.
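
A minimal sketch of that limitation, using made-up click logs: two hypothetical users whose 30-day cuisine counts are identical but whose recent behavior points in opposite directions. The data and the helper names (`aggregate_counts`, `last_n`) are illustrative assumptions, not Uber's features.

```python
from collections import Counter

# Hypothetical click logs, oldest event first. Both users have the same totals.
user_a = ["thai", "thai", "sushi", "sushi", "sushi", "sushi"]  # recently sushi
user_b = ["sushi", "sushi", "sushi", "sushi", "thai", "thai"]  # recently Thai


def aggregate_counts(events):
    """Order-insensitive summary: what a count-based feature sees."""
    return Counter(events)


def last_n(events, n=2):
    """Order-sensitive view: the n most recent events."""
    return events[-n:]


# The snapshot cannot tell the two users apart...
print(aggregate_counts(user_a) == aggregate_counts(user_b))  # True
# ...but the tail of the sequence can.
print(last_n(user_a), last_n(user_b))
```

The counts match exactly, so any model fed only those counts has to treat the two users the same way.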


Sequential Modeling Changes the Ranking Problem

The new system models user behavior as a chronological sequence of events instead of a fixed profile.

What each event captures

  • restaurant UUID
  • cuisine type
  • hour and day of week
  • engagement type, such as click, add-to-cart, or completed order
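
As a rough illustration, an event with those fields could be represented like this. The class names, field names, and the `timestamp` field used for ordering are assumptions made for the sketch; the source lists the fields but not a schema.

```python
from dataclasses import dataclass
from enum import Enum


class Engagement(Enum):
    CLICK = "click"
    ADD_TO_CART = "add_to_cart"
    ORDER = "order"


@dataclass(frozen=True)
class Event:
    restaurant_uuid: str
    cuisine: str
    hour: int            # 0-23, local time (assumed)
    day_of_week: int     # 0 = Monday (assumed convention)
    engagement: Engagement
    timestamp: float     # assumed: only used to put events in order


def to_sequence(events):
    """Return the events in chronological order, oldest first."""
    return sorted(events, key=lambda e: e.timestamp)


history = to_sequence([
    Event("r-2", "thai", 19, 4, Engagement.ORDER, 200.0),
    Event("r-1", "sushi", 12, 1, Engagement.CLICK, 100.0),
])
print([e.cuisine for e in history])
```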

What changes

The model does not ask, “What does this user generally like?” It asks, “Given this specific restaurant ad, which parts of this user’s history matter right now?”

That sequence is fed into a transformer-based encoder. The key design decision is that the candidate ad itself acts as the query inside the attention mechanism.

That changes the ranking problem materially. The user representation becomes conditional on the ad being evaluated rather than static across all candidates.
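
The idea of using the candidate ad as the query can be sketched with single-head scaled dot-product attention. This is a toy version under stated assumptions: the 2-dimensional embeddings are hand-picked, there are no learned projections, and nothing here reflects Uber's actual encoder.

```python
import math


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_attention(candidate, history):
    """Attend over history events using the candidate ad embedding as the query.

    Returns (attention weights, weighted-average user vector).
    """
    dim = len(candidate)
    scores = [
        sum(q * k for q, k in zip(candidate, event)) / math.sqrt(dim)
        for event in history
    ]
    weights = softmax(scores)
    user_vector = [
        sum(w * event[i] for w, event in zip(weights, history))
        for i in range(dim)
    ]
    return weights, user_vector


# Toy embeddings: axis 0 loosely means "sushi", axis 1 loosely means "Thai".
history = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sushi_ad = [2.0, 0.0]
thai_ad = [0.0, 2.0]

w_sushi, u_sushi = target_attention(sushi_ad, history)
w_thai, u_thai = target_attention(thai_ad, history)
print(w_sushi, u_sushi)
print(w_thai, u_thai)
```

The same history produces two different user vectors, one per candidate, which is the sense in which the representation is conditional on the ad.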

For marketplaces, that matters because intent changes constantly depending on timing, session context, urgency, and budget.

Why the Architecture Matters

To handle the computational cost of long sequences, Uber used Multi-Head Latent Attention (MLA), a technique introduced in the DeepSeek-V2 technical report. Standard self-attention scales quadratically with sequence length, so long user histories get expensive quickly. MLA compresses event-level tokens into a smaller set of latent representations, making long user-sequence modeling more tractable at inference time.
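
To make the cost argument concrete, here is some back-of-the-envelope arithmetic. It assumes the compression behaves like a latent bottleneck, where a fixed number of latent vectors attend over the events and then over each other. That reading is an assumption: the source does not spell out the mechanism, and the original DeepSeek-V2 formulation compresses keys and values into a low-rank latent vector to shrink the KV cache. The function names and the choice of 16 latents are hypothetical.

```python
def self_attention_scores(n_events):
    """Pairwise scores for full self-attention over n_events tokens."""
    return n_events * n_events


def latent_bottleneck_scores(n_events, n_latents):
    """Scores when n_latents vectors attend to the events, then to each other."""
    return n_latents * n_events + n_latents * n_latents


for n in (64, 256, 1024):
    print(n, self_attention_scores(n), latent_bottleneck_scores(n, n_latents=16))
```

Under that assumption the cost grows linearly with history length instead of quadratically, which is the property that matters at inference time.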

The important takeaway is not that every marketplace should build transformer rankers. It is that sequence-aware ranking is becoming the standard for relevance-heavy platforms.


Hetero-MMoE: Separating Click Prediction From Order Prediction

Uber Eats needed to optimize for two different outcomes simultaneously:

  • click-through rate, or CTR
  • click-to-order rate, or CTO

Those are related but different optimization problems. A user clicking an ad is one signal. A user placing an order after clicking is much more valuable because it maps directly to advertiser ROI.

Many marketplace ad systems quietly over-optimize CTR because clicks are easier to generate and easier to train on. The result is sponsored inventory that drives traffic but not revenue.
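
One common way to keep conversion quality in the objective is to train both heads jointly and compute the CTO loss only on impressions that were clicked, since an order label is undefined without a click. The sketch below assumes that setup; the row layout, `cto_weight`, and function names are illustrative, not Uber's training recipe.

```python
import math


def log_loss(label, prob, eps=1e-12):
    """Binary cross-entropy for one example, with clipping for stability."""
    prob = min(max(prob, eps), 1 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def joint_loss(rows, cto_weight=1.0):
    """rows: (clicked, ordered, p_ctr, p_cto) tuples, one per impression."""
    ctr_terms = [log_loss(clicked, p_ctr) for clicked, _, p_ctr, _ in rows]
    # CTO is only defined after a click, so unclicked impressions are masked out.
    cto_terms = [
        log_loss(ordered, p_cto)
        for clicked, ordered, _, p_cto in rows
        if clicked == 1
    ]
    ctr = sum(ctr_terms) / len(ctr_terms)
    cto = sum(cto_terms) / len(cto_terms) if cto_terms else 0.0
    return ctr + cto_weight * cto


rows = [(1, 1, 0.8, 0.7), (1, 0, 0.6, 0.2), (0, 0, 0.1, 0.5)]
print(joint_loss(rows))
```

Setting `cto_weight` to zero recovers a pure CTR objective, which is the failure mode described above.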

Uber addressed this using a Heterogeneous Multi-gate Mixture of Experts architecture, or Hetero-MMoE. Instead of relying entirely on identical MLP experts, the system blends different expert types:

MLP experts

Capture deep nonlinear representation learning.

DCN-V2 experts

Capture explicit low- to mid-order feature interactions through cross layers, paired with a deep network that learns implicit interactions.

CIN experts

Capture explicit higher-order, vector-wise feature interactions using the Compressed Interaction Network (CIN) from the xDeepFM architecture.

Task-specific gates

Decide how much each expert should influence CTR prediction versus CTO prediction.
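
The gating idea can be sketched in a few dozen lines. This is a simplified illustration, not Uber's model: it uses random untrained weights, only two expert types (a one-layer MLP and a single DCN-V2 style cross layer of the form x0 * (W x + b) + x), leaves out the CIN expert, and every class name and dimension is hypothetical.

```python
import math
import random

random.seed(0)
DIM = 4  # hypothetical feature and expert width


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MLPExpert:
    """Implicit interactions: a single ReLU layer."""

    def __init__(self):
        self.w = rand_matrix(DIM, DIM)

    def __call__(self, x):
        return [max(0.0, h) for h in matvec(self.w, x)]


class CrossExpert:
    """Explicit interactions: one cross layer, x0 * (W x + b) + x."""

    def __init__(self):
        self.w = rand_matrix(DIM, DIM)
        self.b = [0.0] * DIM

    def __call__(self, x):
        wx = matvec(self.w, x)
        # With a single layer, x0 and x are the same input vector.
        return [x0 * (h + b) + xi for x0, h, b, xi in zip(x, wx, self.b, x)]


class HeteroMMoE:
    def __init__(self, tasks=("ctr", "cto")):
        self.experts = [MLPExpert(), CrossExpert()]
        # One gate and one linear tower per task; experts are shared.
        self.gates = {t: rand_matrix(len(self.experts), DIM) for t in tasks}
        self.towers = {t: [random.uniform(-0.5, 0.5) for _ in range(DIM)] for t in tasks}

    def forward(self, x):
        expert_outputs = [expert(x) for expert in self.experts]
        predictions, gate_weights = {}, {}
        for task, gate in self.gates.items():
            weights = softmax(matvec(gate, x))
            mixed = [
                sum(w * out[i] for w, out in zip(weights, expert_outputs))
                for i in range(DIM)
            ]
            logit = sum(t * m for t, m in zip(self.towers[task], mixed))
            predictions[task] = sigmoid(logit)
            gate_weights[task] = weights
        return predictions, gate_weights


model = HeteroMMoE()
preds, gates = model.forward([0.2, -0.1, 0.7, 0.4])
print(preds)
print(gates)
```

Each task sees the same expert outputs but mixes them with its own gate, so the CTR head and the CTO head can lean on different kinds of feature interactions.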

The broader principle is straightforward: marketplace ad ranking is a multi-objective optimization problem.

Buyer relevance, seller ROI, monetization efficiency, and long-term trust in sponsored inventory are competing constraints. Architectures that treat them as a single task will make bad trade-offs.


Why Small Ranking Gains Matter

Uber reported the following improvements in online experiments:

  • +0.93% AUC on predicted click-through rate
  • +0.66% AUC on predicted click-to-order rate
  • Lower log loss across both prediction tasks

Those numbers look small. They are not.

Ranking improvements compound quickly at marketplace scale because they affect impression allocation, conversion efficiency, seller bidding behavior, and total ad inventory yield.

The CTO improvement is especially meaningful. Predicting completed transactions is materially harder than predicting clicks, and much closer to what advertisers actually pay for. A model that improves on that metric is improving the right thing.
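
For readers less familiar with the two metrics in that list, here is a small sketch of how AUC and log loss are computed from labels and predicted probabilities. The labels and scores are invented toy data, not Uber's results, and the pairwise AUC loop is written for clarity rather than speed.

```python
import math


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_log_loss(labels, scores, eps=1e-12):
    """Average binary cross-entropy; lower means better-calibrated probabilities."""
    total = 0.0
    for y, s in zip(labels, scores):
        s = min(max(s, eps), 1 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1 - s)
    return total / len(labels)


labels = [1, 0, 1, 0, 0]
baseline = [0.6, 0.5, 0.4, 0.3, 0.2]   # one positive ranked below a negative
improved = [0.6, 0.3, 0.5, 0.4, 0.2]   # every positive ranked above every negative

print(auc(labels, baseline), mean_log_loss(labels, baseline))
print(auc(labels, improved), mean_log_loss(labels, improved))
```

AUC measures ranking quality and log loss measures how well the predicted probabilities match outcomes, so a model can improve one without the other. Reporting both gives a fuller picture.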

Uber also noted that the architecture will likely extend into search ads and the organic home feed. That signals this is being treated as foundational ranking infrastructure, not a standalone ads experiment.


What Marketplace Operators Should Take From This

Most growth-stage marketplaces do not need transformer-based ads systems. But many are already dealing with the same structural problem Uber was solving: the ads system understands user intent worse than the organic recommendation system.

That gap creates compounding problems. Buyers stop trusting sponsored placements. Advertisers lose confidence that paid inventory drives incremental outcomes. Monetization becomes harder to sustain.

Practical Takeaways

  • Engagement sequences beat engagement summaries. Even simple recency-weighted features or last-N interaction models improve relevance over pure aggregates. You do not need a transformer to act on this.
  • CTR alone is a weak optimization target. If sponsored ranking generates clicks without conversion quality, advertiser trust deteriorates over time. Downstream conversion signals need to be in your model.
  • Organic and sponsored ranking systems should not evolve independently. The more disconnected they become, the harder it is to maintain relevance consistency across the marketplace.
  • Marketplace monetization is a matching problem. The strongest ad systems improve allocation efficiency, not just visibility. Relevance is the mechanism. Revenue is the output.
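
The first takeaway can be acted on without any deep learning. Below is a minimal sketch of a recency-weighted feature, assuming exponential decay with a seven-day half-life; the half-life, the day-based timestamps, and the function name are illustrative choices rather than recommendations from the source.

```python
from collections import defaultdict


def recency_weighted_counts(events, now, half_life_days=7.0):
    """events: (cuisine, day) pairs. Each event counts 0.5 ** (age / half_life)."""
    scores = defaultdict(float)
    for cuisine, day in events:
        age = now - day
        scores[cuisine] += 0.5 ** (age / half_life_days)
    return dict(scores)


# Three older sushi interactions versus two Thai interactions in the last two days.
events = [("sushi", 0), ("sushi", 2), ("sushi", 5), ("thai", 28), ("thai", 29)]
scores = recency_weighted_counts(events, now=30)
print(scores)
```

A raw 30-day count would rank sushi first for this user. The decayed score ranks Thai first, which is closer to what the user appears to want right now.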

Sources

  • Uber Engineering Blog, “Transforming Ads Personalization with Sequential Modeling and Hetero-MMoE at Uber” (March 2026)
  • DeepSeek-V2 Technical Report, DeepSeek-AI (2024)
  • “DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems,” Wang et al., Google (2020)
  • “Deep & Cross Network for Ad Click Predictions,” Wang et al., Google (AdKDD 2017)
  • “xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems,” Lian et al., Microsoft Research (KDD 2018)
  • “DeepFM: A Factorization-Machine based Neural Network for CTR Prediction,” Guo et al. (IJCAI 2017)
