# AI Probability Engine

3-model Bayesian consensus: AI + Heuristic + Microstructure

Predite's AI probability engine is the brain behind the scanner. It estimates the true probability of each event behind a prediction market contract, then compares that estimate to the market price to identify edge. This document explains how it works, what it's good at, and what it isn't.

## The 3-Model Consensus Approach

We don't rely on a single AI model. Instead, three independent estimation methods run and their outputs are weighted into a final probability:

Model 1: LLM Analysis (50% weight)

A large language model (currently Claude) analyzes:

  • Resolution criteria and underlying question
  • Current news context (we feed it relevant recent articles)
  • Historical patterns for similar event types
  • Geopolitical and structural factors
  • Multiple reasoning paths

Output: probability estimate with structured reasoning.

Model 2: Heuristic Engine (30% weight)

A rules-based system computes probability based on:

  • Time-decay adjustments (probabilities tend toward 0 or 1 near resolution)
  • Recent price action (momentum signal)
  • Volume-weighted average pricing
  • Spread asymmetry
  • Liquidity adjustments

Output: numerical estimate without natural language reasoning.

Model 3: Comparative Anchor (20% weight)

For event types that recur (elections, sports, economic indicators), we anchor to historical base rates and similar past events. This prevents the LLM from making confident-but-wrong predictions about unusual cases.
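The 50/30/20 weighting above can be read as a simple linear weighted average. The sketch below illustrates that reading only; the exact combination rule is not spelled out here. The function name `consensus_probability`, the model keys, and the renormalization when a model has no estimate (for example, the anchor on a one-off event) are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Weights taken from the model descriptions above.
WEIGHTS = {"llm": 0.50, "heuristic": 0.30, "anchor": 0.20}


def consensus_probability(estimates: Dict[str, Optional[float]]) -> float:
    """Linear weighted average of per-model probabilities.

    Assumption: a model that returns None is dropped and the remaining
    weights are renormalized so they still sum to 1.
    """
    available = {name: p for name, p in estimates.items() if p is not None}
    if not available:
        raise ValueError("no model produced an estimate")
    for name, p in available.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} estimate {p} is not a probability")

    total_weight = sum(WEIGHTS[name] for name in available)
    weighted_sum = sum(WEIGHTS[name] * p for name, p in available.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # 0.50 * 0.70 + 0.30 * 0.60 + 0.20 * 0.65 = 0.66
    print(consensus_probability({"llm": 0.70, "heuristic": 0.60, "anchor": 0.65}))
```

With all three models present, the result is pulled toward the LLM estimate but cannot stray far from the other two, which is the point of the consensus design.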

## Why Three Models, Not One

Each model has known failure modes:

  • LLM: can hallucinate confidently, may anchor on recent news too heavily
  • Heuristic: rigid, can't handle novel situations or contextual nuance
  • Anchor: useless for unique events without historical precedent

By combining them, errors tend to cancel out. The final estimate is more robust than any individual model.

## Calibration

A probability estimate of "70% likely" should mean that, across many such predictions, the event actually happens 70% of the time. This is called calibration.

We track calibration internally:

  • Each prediction is logged with its timestamp, estimate, and eventual outcome
  • Predictions are periodically grouped into 5pp buckets (60-65%, 65-70%, etc.)
  • Within each bucket, the predicted probability is compared to the actual outcome rate
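The bucketing procedure above can be sketched as follows. This is a minimal illustration, not the internal tracking code: `calibration_table`, its input format (pairs of estimate and 0/1 outcome), and the output fields are hypothetical.

```python
import math
from collections import defaultdict


def calibration_table(predictions, bucket_pct=5):
    """Group (estimate, outcome) pairs into fixed-width buckets.

    predictions: iterable of (estimate in [0, 1], outcome 0 or 1).
    Returns one row per non-empty bucket, sorted by bucket.
    """
    buckets = defaultdict(list)
    last_bucket = 100 // bucket_pct - 1
    for estimate, outcome in predictions:
        # Work in whole percentage points; the small epsilon guards against
        # float artifacts such as 0.29 * 100 == 28.999999999999996.
        idx = min(math.floor(estimate * 100 + 1e-9) // bucket_pct, last_bucket)
        buckets[idx].append((estimate, outcome))

    rows = []
    for idx in sorted(buckets):
        items = buckets[idx]
        n = len(items)
        mean_predicted = sum(e for e, _ in items) / n
        observed_rate = sum(o for _, o in items) / n
        rows.append({
            "bucket": (idx * bucket_pct, (idx + 1) * bucket_pct),
            "n": n,
            "mean_predicted": mean_predicted,
            "observed_rate": observed_rate,
            # Positive gap means the estimates were too optimistic.
            "gap_pp": (mean_predicted - observed_rate) * 100,
        })
    return rows


if __name__ == "__main__":
    sample = [(0.62, 1), (0.63, 1), (0.64, 0), (0.71, 1), (0.72, 0)]
    for row in calibration_table(sample):
        print(row)
```

A well-calibrated bucket has a `gap_pp` near zero. Buckets with few predictions are noisy, so the count `n` should be read alongside the gap.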

Current calibration (as of mid-2026):

  • Politics: well-calibrated within 3pp across most ranges
  • Sports: slight optimism bias (predicted 70%, actually 67%)
  • Crypto: higher variance, less consistent
  • Economic: well-calibrated on Kalshi-style indicators

Calibration results are published periodically on our blog. Calibration is the most honest measure of AI accuracy.

## What the AI Is Good At

  • Established event types: elections, sports, economic indicators. The AI has lots of training data and clear patterns.
  • Markets with public information: news-driven outcomes where the relevant data is freely available.
  • Quantitative questions: "Will X exceed Y by Z date" with clear thresholds.
  • Aggregated reasoning: combining multiple sources into a single estimate.

## What the AI Struggles With

  • Insider-driven markets: governance votes, M&A speculation, regulatory decisions. People with private information have an edge the AI can't replicate.
  • Cultural events: Oscars, virality predictions, fashion. The AI doesn't have good models of taste.
  • Very recent breaking news: training data has lag. Major events from the last 24h may not be reflected.
  • Long-tail outcomes: rare events that don't have historical precedent.
  • Manipulation-sensitive markets: low-volume markets where one trader can move the price.

For these categories, treat AI estimates with skepticism and rely more on your own domain knowledge.

## How News Integration Works

We continuously pull headlines and key facts from:

  • General news APIs (for political and economic context)
  • Sports data services (for game-specific markets)
  • Crypto-specific feeds (for token/protocol markets)
  • Resolution-relevant primary sources

These get fed into the AI's context window when analyzing related markets. So if there's a major news event affecting a market, the AI's analysis reflects it.

Caveat: news integration has limits. We can't cover every news source. We can't read paywalled content. We can't anticipate stories that haven't broken yet.

## How AI Confidence Is Computed

The confidence score (0-100%) reflects:

  • Convergence between the three models (more agreement = higher confidence)
  • Quality of available context (more relevant news = higher confidence)
  • Distance from 50% (estimates near 50% have lower confidence than estimates near 5% or 95%)
  • Historical accuracy on similar question types

A 90% confidence estimate is genuinely more reliable than a 50% confidence one. Trust the confidence metric — it's not marketing fluff.
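To make the four factors concrete, here is a purely illustrative sketch. The real formula and weighting are not documented here, so everything below is an assumption: the function name `confidence_score`, the use of the spread between model estimates as the agreement measure, the 0-1 inputs for context quality and historical accuracy, and the equal weighting of the four components.

```python
def confidence_score(model_estimates, consensus, context_quality, historical_accuracy):
    """Illustrative 0-100 confidence score (equal weights are an assumption).

    model_estimates: the per-model probabilities, each in [0, 1].
    consensus: the final combined probability in [0, 1].
    context_quality, historical_accuracy: hypothetical inputs in [0, 1].
    """
    # Agreement: 1.0 when all models match, lower as they spread apart.
    agreement = 1.0 - (max(model_estimates) - min(model_estimates))
    # Distance from 50%: 0.0 at a coin flip, 1.0 at 0% or 100%.
    distance_from_half = abs(consensus - 0.5) * 2.0

    components = [agreement, context_quality, distance_from_half, historical_accuracy]
    clipped = [min(1.0, max(0.0, c)) for c in components]
    return 100.0 * sum(clipped) / len(clipped)


if __name__ == "__main__":
    print(confidence_score([0.70, 0.60, 0.65], consensus=0.66,
                           context_quality=0.8, historical_accuracy=0.75))
```

Under these assumptions, tighter agreement between the models or a consensus further from 50% raises the score, matching the direction described in the list above.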

## Common Mistakes Users Make

  • Trusting AI estimates blindly: AI is wrong sometimes. Significant losses come from over-confident bet sizing on AI signals.
  • Ignoring confidence: a 10pp edge at 40% confidence is worse than a 5pp edge at 90% confidence. Account for both.
  • Trading every signal: AI shows hundreds of small edges. Most aren't worth the friction. Filter for >5pp edge + >70% confidence as a baseline.
  • Anchoring on AI's number: if AI says 70% and market is 60%, your job is to evaluate WHETHER the AI is more right than the market. Don't just copy the AI estimate as your view.
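The baseline filter and the edge-versus-confidence trade-off can be sketched like this. The `Signal` class, its field names, and the `filter_signals` helper are hypothetical and not part of any Predite API. Ranking by edge multiplied by confidence is one simple way to account for both, and it is consistent with the example above (10pp × 0.40 = 4.0 versus 5pp × 0.90 = 4.5), but it is an assumption rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    market: str
    ai_probability: float  # AI estimate, 0-1
    market_price: float    # current market-implied probability, 0-1
    confidence: float      # confidence score expressed as 0-1

    @property
    def edge_pp(self) -> float:
        """Absolute gap between AI estimate and market price, in points."""
        return abs(self.ai_probability - self.market_price) * 100

    @property
    def weighted_edge_pp(self) -> float:
        """Edge discounted by confidence (assumed ranking metric)."""
        return self.edge_pp * self.confidence


def filter_signals(signals, min_edge_pp=5.0, min_confidence=0.70):
    """Keep signals above both thresholds, best weighted edge first."""
    kept = [
        s for s in signals
        if s.edge_pp > min_edge_pp and s.confidence > min_confidence
    ]
    return sorted(kept, key=lambda s: s.weighted_edge_pp, reverse=True)


if __name__ == "__main__":
    signals = [
        Signal("big-edge-low-conf", 0.70, 0.60, 0.40),
        Signal("solid", 0.66, 0.60, 0.90),
        Signal("tiny-edge", 0.62, 0.60, 0.95),
    ]
    for s in filter_signals(signals):
        print(s.market, round(s.edge_pp, 1), s.confidence)
```

Passing the filter only means a signal is worth a closer look. Deciding whether the AI is more right than the market still falls to the trader.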

## Improving the AI

We iterate continuously:

  • Updating the LLM's prompts and context to cover new event types
  • Tuning heuristic parameters based on backtest performance
  • Adjusting weight ratios based on observed calibration
  • Adding new news sources as we find them

Major changes are announced in our changelog. If you notice systematic AI errors in specific market categories, report them via the in-app feedback (it actually goes to a human and we read it).

## Use Cases by Plan

  • Starter plan: AI estimates on Polymarket scanner. Read-only: see signals, learn the platform.
  • Pro plan: AI estimates across Polymarket and Kalshi. Plus arbitrage signals (AI compares prices across platforms).
  • Bot plan: AI estimates feeding into automated bot strategies. The "AI Edge Follower" bot uses these directly.

## Honest Limitations

We don't sell AI as magic. Real performance:

  • Average edge identified: 2-5pp (lower than the raw edge the scanner displays)
  • Win rate on AI signals: 55-60% (slightly above random)
  • After execution costs: positive expected value, but not large
  • Variance is real: drawdowns happen, sometimes 20-30% before recovery
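To see why a few points of edge leave only a small expected value after costs, consider a binary contract that pays $1 if the event happens. The sketch below assumes a flat per-contract cost; the function name and the cost model are illustrative, not a description of any platform's actual fees.

```python
def expected_value_per_contract(true_prob: float, price: float, cost: float = 0.0) -> float:
    """Expected profit in dollars from buying one $1-payout YES contract.

    true_prob: probability the event happens, 0-1.
    price: amount paid for the contract, 0-1.
    cost: assumed flat execution cost per contract (fees, slippage).
    """
    win_profit = 1.0 - price  # profit if the event happens
    loss = price              # amount lost if it does not
    return true_prob * win_profit - (1.0 - true_prob) * loss - cost


if __name__ == "__main__":
    # A 4pp edge (64% true probability vs. a 60-cent price) with 1 cent of costs.
    print(expected_value_per_contract(0.64, 0.60, cost=0.01))
```

Before costs, the expected value simplifies to `true_prob - price`, so the edge in percentage points is the expected profit in cents per contract. Costs come straight out of that margin.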

This is consistent with the underlying truth: prediction markets are mostly efficient, and informational edge in the era of AI is real but modest. We're not promising 30% monthly returns. We're providing tools to extract modest edge consistently over years.

## Related Docs

  • [How the EV Scanner Works](/docs/ev-scanner)
  • [Reading Signals](/docs/reading-signals)
  • [Bot Strategies](/docs/bot-strategies)
  • [Paper Trading](/docs/paper-trading)