
Polymarket guide to AI models


There’s so much noise in the AI world right now. Every day, someone posts a new benchmark, a cherry-picked comparison, or a hot take that this model is finally better than that model.


That’s why I like to look at prediction markets like Polymarket. Unlike social media hype, these are places where people back up their opinions with real money. It’s not perfect — plenty of speculation and bandwagoning still happens — but it gives a more grounded snapshot of how insiders and the broader market see the AI landscape evolving.


Polymarket has been running a contract on which AI model will be considered the “best AI model” at the end of 2025. To resolve it, the market relies on the Arena Score, an Elo-style rating system that pits major AI models head-to-head across tasks like reasoning, language, and coding.
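
To make "Elo-style" concrete, here is a minimal sketch of the textbook Elo formula. It is an illustration only: the function names, the starting ratings, and the K-factor of 32 are assumptions for this example, and Arena's actual scoring method may differ in its details.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one matchup.

    score_a is 1.0 if A won, 0.5 for a tie, 0.0 if A lost.
    k (assumed here to be 32) controls how far one result moves a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two evenly rated (hypothetical) models; A wins one head-to-head comparison.
new_a, new_b = update_ratings(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)
```

The practical point is that a rating gap translates into a win probability: the bigger the gap, the more lopsided the expected result, and an upset moves the ratings more than an expected win does.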


It’s worth stressing that topping the Arena leaderboard doesn’t necessarily mean a model is the best for your use case, or that it makes the most money, or even that it’s the most widely used. It simply means it performed best on a controlled series of technical head-to-head tests at a point in time. Here's a link to their current scoreboard.



Who are the major players?

1. Google (Gemini) — 50% The clear frontrunner on Polymarket. Gemini (formerly Bard) is Google DeepMind’s flagship AI family, tightly integrated into Google Workspace tools like Docs, Gmail, and Sheets, and also available through an API. How they make money: Enterprise subscriptions for Workspace AI, pay-as-you-go API usage, and embedding Gemini in cloud projects.


2. OpenAI (ChatGPT) — 27% The brand most people know. ChatGPT is built on OpenAI’s GPT family of models, known for strong language, reasoning, and coding ability. How they make money: ChatGPT Plus at $20/month for individuals, plus hefty API fees charged to developers and corporations that integrate GPT into their products.


3. xAI (Grok) — 14% Elon Musk’s xAI and its “Grok” models are newer entrants, tied into X (Twitter). While the models are still maturing, the market is clearly watching Musk’s ambitions closely. Recently xAI created a lot of controversy when Grok generated antisemitic rants. On the other hand, Elon Musk has been making bold predictions about his newest model being a giant leap forward. How they make money: For now, mainly through X Premium subscriptions, but expected to expand into APIs and direct business integrations.


4. Meta (LLaMA) — 4% Meta’s open-weight LLaMA models power many smaller and self-hosted AI systems. They’re popular for teams that want to customize AI or keep data fully private. How they make money: Indirectly — Meta benefits by leading open AI research, attracting talent, and pushing adoption of its ecosystem.


5. Anthropic (Claude) — 3% Claude (now in its third generation) is known for safety-first design and careful language alignment. It’s popular with companies focused on minimizing “risky” outputs. How they make money: API usage priced by tokens, and lightweight cheaper models (like Claude 3 Haiku) for smaller tasks.


6. DeepSeek — 3% A rising Chinese AI developer that’s starting to break into international awareness. DeepSeek models are optimized for reasoning and multilingual applications.
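
One way to read the percentages above is as rough implied probabilities. The sketch below assumes each figure is the price, in cents, of a share that pays out $1 if that company's model wins. That reading is an assumption for illustration, and the `implied_probabilities` helper is hypothetical. Because the quoted figures add up to 101 rather than 100, the code rescales them so they sum to 1.

```python
# Prices as quoted in the list above, in cents per share (assumed to pay $1 on a win).
prices = {
    "Google (Gemini)": 50,
    "OpenAI (ChatGPT)": 27,
    "xAI (Grok)": 14,
    "Meta (LLaMA)": 4,
    "Anthropic (Claude)": 3,
    "DeepSeek": 3,
}


def implied_probabilities(prices: dict) -> dict:
    """Rescale raw market prices so they sum to 1."""
    total = sum(prices.values())
    return {name: price / total for name, price in prices.items()}


probs = implied_probabilities(prices)
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
```

The rescaling barely changes the picture here, but it is a useful habit: small gaps between the quoted prices and 100% usually reflect spreads and rounding rather than real information.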


Why does this matter for regular businesses?

For non-technical operators and CFOs, the key takeaway is simpler. These foundation models are like powerful engines. Whether you’re building a chatbot, an internal data analyst, or an automated legal assistant, you’re often plugging one of these models underneath.


Most of the time, pricing is still pay-as-you-go, based on how much text gets processed, measured in tokens (chunks of text roughly the size of a short word). That’s why you see pricing quoted as “cost per million tokens.” For companies with heavy usage, this can add up quickly, which is why some look to open-weight models like Mistral or LLaMA that they can run privately.
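
As a rough illustration of how per-token pricing adds up, here is a small sketch. The prices and volumes are made-up placeholders, not any vendor's real rates, and the split into separate input and output prices is an assumption about how a typical price sheet is laid out.

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Estimate a monthly bill in dollars from token volumes and per-million prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Hypothetical workload: 200M tokens read and 50M tokens written per month,
# at placeholder prices of $2 and $8 per million tokens.
cost = monthly_cost(200_000_000, 50_000_000, 2.00, 8.00)
print(f"Estimated monthly cost: ${cost:,.2f}")
```

Swapping in the actual rates from a provider's pricing page and your own usage estimates gives a first-pass budget figure.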


There’s also a shift to enterprise licensing, where you pay a flat fee to integrate a model deeply into your workflows without worrying about per-call costs.
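
A quick way to compare the two pricing styles is to find the usage level at which a flat fee becomes cheaper than paying per token. The sketch below uses invented numbers (a $5,000 monthly fee and a blended price of $4 per million tokens), and the helper name is hypothetical.

```python
def break_even_tokens(flat_monthly_fee: float, price_per_million_tokens: float) -> float:
    """Monthly token volume above which the flat fee is cheaper than pay-as-you-go."""
    if price_per_million_tokens <= 0:
        raise ValueError("price_per_million_tokens must be positive")
    return flat_monthly_fee / price_per_million_tokens * 1_000_000


# Placeholder figures for illustration only.
threshold = break_even_tokens(flat_monthly_fee=5_000.0, price_per_million_tokens=4.0)
print(f"Flat fee wins above {threshold:,.0f} tokens per month")
```

If expected usage sits well below the threshold, pay-as-you-go is likely the cheaper option. Well above it, a flat license starts to look attractive.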


Bottom line

Prediction markets like Polymarket give an interesting, if imperfect, pulse check on where people are putting their money. Right now, the bets lean toward Google’s Gemini, with OpenAI a distant second, driven by how the models rank on Arena.


But for your business, the real question isn’t who tops the Arena chart. It’s which model best fits your data, workflows, and cost structure. That’s what ultimately decides if AI becomes just another flashy experiment or a true competitive edge.

