milánlora

Positive EV MLB Model

A machine learning model to find and exploit pricing inefficiencies in Major League Baseball moneyline markets.

Problem

Sports betting markets are competitive and noisy. Identifying consistent, profitable opportunities (positive expected value, or +EV) requires a systematic approach that can parse complex data signals, from player performance to market odds movement, without succumbing to common biases.
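To make "positive expected value" concrete, here is a minimal sketch of the per-bet calculation. The function names are illustrative and are not taken from the project's code. It assumes American moneyline odds and an estimated win probability that would come from a model.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American moneyline odds to decimal odds (total payout per unit staked)."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / -odds


def expected_value(win_prob: float, odds: int) -> float:
    """Expected profit per unit staked, given an estimated win probability.

    A win pays (decimal - 1) units; a loss costs the 1 unit staked.
    """
    decimal = american_to_decimal(odds)
    return win_prob * (decimal - 1) - (1 - win_prob)
```

A bet is +EV when `expected_value` is greater than zero, that is, when the estimated win probability exceeds the break-even probability implied by the odds.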

Approach

The model is built on a feature-rich dataset that combines multiple sources, including historical game data and odds. Key steps:

  • Feature Engineering: Created rolling averages, situational stats, and odds-implied metrics.
  • Labeling: Used a triple-barrier method to label each candidate bet as win, loss, or no bet according to whether the odds were beaten, giving a more robust target than a simple win/loss label.
  • Modeling: Trained an XGBoost model to predict the probability of beating the closing-line odds, used here as a proxy for profitable bets.
  • Policy: Implemented a fractional Kelly criterion policy for bankroll management, sizing each bet according to the model's estimated probability and the odds on offer.
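As a rough illustration of two of the steps above, the sketch below derives an odds-implied probability with the bookmaker margin removed and sizes a stake with fractional Kelly. It is not the project's implementation. The function names, the proportional de-vig method, and the default `fraction=0.25` are all assumptions chosen for the example.

```python
def implied_probability(odds: int) -> float:
    """Raw implied probability from American odds (still includes the bookmaker margin)."""
    if odds > 0:
        return 100 / (odds + 100)
    return -odds / (-odds + 100)


def no_vig_probabilities(home_odds: int, away_odds: int) -> tuple[float, float]:
    """Rescale both sides so they sum to 1 (proportional de-vig, an assumed method)."""
    home = implied_probability(home_odds)
    away = implied_probability(away_odds)
    total = home + away
    return home / total, away / total


def fractional_kelly_stake(
    win_prob: float, odds: int, bankroll: float, fraction: float = 0.25
) -> float:
    """Stake under fractional Kelly; returns 0 when the estimated edge is not positive.

    `fraction` is a hypothetical default, not a value stated in the write-up.
    """
    # Net odds b: profit per unit staked on a win.
    b = odds / 100 if odds > 0 else 100 / -odds
    full_kelly = (b * win_prob - (1 - win_prob)) / b
    return bankroll * fraction * max(full_kelly, 0.0)
```

For example, with an estimated win probability of 0.55 at even odds (+100), full Kelly is 10% of bankroll, so a quarter-Kelly policy on a 1,000-unit bankroll stakes about 25 units. Scaling Kelly down trades some expected growth for lower variance, which matters when the model's probabilities are themselves uncertain.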

Outcome

Backtesting showed a consistent positive return on investment over several seasons. A paper-trading phase on live markets produced results in line with the backtest, supporting the model's edge. The synthetic chart below illustrates the typical profit curve from a simulated season, with steady growth.

Simulated P/L Trend