HoopsJunkie

Methodology

Win probability estimates the likelihood that the home team will win at any point before or during a game. On Hoops Junkie, this is displayed as a live chart that updates with every play, showing how the balance of the game shifts over time.

Our win probability model combines two key inputs: a pre-game estimate of each team's strength, and real-time game state (the current score and time remaining). The result is a probability between 0% and 100% that updates as the game unfolds.

Elo ratings

Before a game tips off, we need a baseline estimate of which team is stronger. For this we use an Elo rating system. This method was originally developed for chess but has been widely adopted in sports analytics.

In an Elo system, every team starts with a base rating of 1500. After each game, the winning team gains rating points and the losing team loses them. The amount gained or lost depends on three factors:

1. Expected outcome

An upset produces a larger rating change than a result that was expected. If a 1700-rated team beats a 1300-rated team, the Elo shift is modest because this was the expected outcome. If the 1300-rated team wins in an upset, the shift is much larger.

2. Margin of victory

Winning by 20 points produces a larger Elo change than winning by 2. However, the margin is scaled logarithmically, so the difference between winning by 2 and winning by 10 matters more than the difference between winning by 20 and winning by 28.

3. Blowout dampening

Expected blowouts are dampened so that a strong team beating a weak team by a large margin doesn't cause an outsized Elo swing. This prevents ratings from becoming inflated simply because a team ran up the score against a weaker opponent.
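
The three factors above can be sketched in a few lines of Python. This is a minimal illustration rather than our production formula: the function names, the K-factor (k), the dampening strength (damp) and the exact shape of the margin and dampening terms are all assumptions chosen for readability.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that team A beats team B (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_shift(winner: float, loser: float, margin: int,
              k: float = 20.0, damp: float = 0.001) -> float:
    """Rating points gained by the winner (and lost by the loser).

    k and damp are illustrative placeholders, not tuned values.
    """
    # Factor 1: expected outcome. The less likely the win, the larger the surprise.
    surprise = 1.0 - expected_score(winner, loser)
    # Factor 2: margin of victory, scaled logarithmically.
    margin_mult = math.log(margin + 1)
    # Factor 3: dampening that grows with the rating gap between the teams.
    dampening = 1.0 / (1.0 + damp * abs(winner - loser))
    return k * surprise * margin_mult * dampening


print(round(elo_shift(1700, 1300, 10), 1))  # favourite wins: small shift
print(round(elo_shift(1300, 1700, 10), 1))  # upset: much larger shift
```

The sketch leaves out home court advantage, which would be added to the home team's rating before computing the expectation.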


Additional notes on Elo ratings

Home court advantage

Elo ratings also carry a built-in home court advantage, reflecting the well-documented edge that home teams have in the NBA.

Seasonal regression

At the start of each new season, ratings are regressed partially (25%) toward the 1500 baseline. This regressed rating is then blended with a market-implied Elo derived from pre-season betting over/under win totals, with the market signal weighted at 80% and our regressed Elo at 20%. This accounts for off-season roster changes (trades, free agency, draft picks, coaching hires) that our Elo system can't capture from prior season results alone.
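
As a rough sketch, the pre-season calculation looks like the following. The percentages come from the description above, while the function name is hypothetical and the market-implied Elo is taken as a given input (the conversion from over/under win totals is not shown).

```python
BASELINE = 1500.0


def preseason_rating(prev_rating: float, market_elo: float,
                     regress: float = 0.25, market_weight: float = 0.80) -> float:
    """Regress last season's rating toward 1500, then blend with the market."""
    # Move 25% of the way back toward the 1500 baseline.
    regressed = prev_rating + regress * (BASELINE - prev_rating)
    # Weight the market-implied Elo at 80% and the regressed Elo at 20%.
    return market_weight * market_elo + (1.0 - market_weight) * regressed


# A team that finished on 1700, with a market-implied Elo of 1600:
# regressed = 1650, blended = 0.8 * 1600 + 0.2 * 1650 = 1610
print(preseason_rating(1700.0, 1600.0))
```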

Expected vs unexpected blowouts

The margin of victory and blowout dampening factors apply equally regardless of which team wins, so a 25-point margin produces the same margin multiplier and the same dampening whether the favourite or the underdog wins by that amount. The factor that creates a meaningful difference in Elo shifts is the expected outcome.

When a strong team blows out a weak team by 25 points, the result was expected. The "surprise" component is small, so the overall Elo shift is modest despite the large margin.

When a weak team blows out a strong team by 25 points, the result was highly unexpected. The surprise component is large, and the margin of victory amplifies it further. The result is a dramatically larger Elo swing, which is exactly the behaviour we want: the rating system responds strongly to genuinely surprising results.

To illustrate, consider a 1750-rated team (home) playing a 1250-rated team. The strong team has roughly a 97% pure Elo pre-game win probability. Here's how the winner's Elo change varies across four scenarios:

Scenario                               Margin       Elo shift
Strong team wins (expected)            3 points     +1
Strong team wins (expected blowout)    25 points    +2
Weak team wins (close upset)           3 points     +21
Weak team wins (blowout upset)         25 points    +50

The expected result barely moves the needle (+1 or +2) regardless of the margin. A close upset shifts ratings significantly (+21), and a blowout upset produces a massive swing (+50), twenty-five times larger than the expected blowout.
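
The size of the gap between expected and unexpected results follows from the pre-game probability itself. The sketch below uses the standard Elo logistic curve and assumes a 100-point home court advantage. That value is an illustrative assumption, chosen because it lands close to the 97% quoted above. The function name is also hypothetical.

```python
def home_win_probability(home: float, away: float,
                         home_advantage: float = 100.0) -> float:
    """Standard Elo win expectation for the home team (400-point scale).

    home_advantage is an assumed value, not a published constant.
    """
    diff = home + home_advantage - away
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


p = home_win_probability(1750, 1250)
surprise_if_favourite_wins = 1.0 - p   # small: the win was expected
surprise_if_underdog_wins = p          # large: the win was an upset

print(round(p, 3))
print(round(surprise_if_underdog_wins / surprise_if_favourite_wins, 1))
```

Under these assumptions the surprise component is roughly 30 times larger when the underdog wins, which is why the same margin produces such different Elo shifts in the table.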

Of course, what the Elo model doesn't know here is whether the strong team was resting its starters in anticipation of a less competitive game against a weaker opponent, or whether the weaker team genuinely played above its head and took out a full strength elite team.

This limitation is one of the main reasons we supplement Elo with betting odds when they're available. The market is much better at pricing in these kinds of situational factors.

Incorporating betting odds

Elo ratings are excellent at capturing sustained team strength and momentum over a season. However, they can't account for short-term factors like injuries, rest days, or roster changes that the betting market prices in.

When available, we incorporate closing moneyline odds from the betting market as an additional signal. The odds-implied win probability is converted into an equivalent Elo value and blended with our own Elo ratings, with our Elo weighted at 60% and the odds-derived signal at 40%.

This blend gives us the best of both worlds: Elo's momentum-based assessment of team strength combined with the betting market's awareness of time-sensitive factors. For historical games where betting odds are not available, the model falls back to pure Elo ratings.
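
A minimal sketch of the blending step is shown below. The 60/40 weights and the fallback come from the description above. Everything else is an assumption made for illustration: the function names, the use of American-style moneyline odds, the normalisation used to remove the bookmaker's margin, and the choice to blend on the Elo difference between the two teams.

```python
import math
from typing import Optional


def implied_prob(american_odds: int) -> float:
    """Raw implied probability from American moneyline odds (margin included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def no_vig_home_prob(home_odds: int, away_odds: int) -> float:
    """Normalise the two raw probabilities so they sum to 1."""
    h, a = implied_prob(home_odds), implied_prob(away_odds)
    return h / (h + a)


def prob_to_elo_diff(p: float) -> float:
    """Invert the standard Elo curve: probability -> rating difference."""
    return 400.0 * math.log10(p / (1.0 - p))


def blended_elo_diff(elo_diff: float,
                     home_odds: Optional[int] = None,
                     away_odds: Optional[int] = None,
                     elo_weight: float = 0.60) -> float:
    """Blend our Elo difference (60%) with the odds-implied difference (40%)."""
    if home_odds is None or away_odds is None:
        return elo_diff  # no odds available: fall back to pure Elo
    market_diff = prob_to_elo_diff(no_vig_home_prob(home_odds, away_odds))
    return elo_weight * elo_diff + (1.0 - elo_weight) * market_diff


print(round(blended_elo_diff(120.0, home_odds=-150, away_odds=130), 1))
print(blended_elo_diff(120.0))  # 120.0, pure Elo fallback
```

Here elo_diff is assumed to be the home team's rating minus the away team's rating, with home court advantage already included.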

Back-to-back rest adjustments

Teams playing on a "back-to-back" (two games in two nights) have a well-documented performance disadvantage. We apply a temporary Elo adjustment to account for this, which shifts the pre-game win probability without permanently changing either team's rating.

The adjustment varies based on travel. A team playing the second night of a back-to-back in a different city receives a larger penalty than one playing in the same city. For example, a team playing at home after flying back from an away game the night before receives a moderate penalty, while a team playing on the road after travelling to a new city receives the largest penalty.

These penalty values were tuned against eight seasons of historical game data by measuring which adjustments best calibrated our predictions against actual outcomes for back-to-back games. The penalties are applied to both teams independently, so if both teams are on a back-to-back, both receive their respective adjustments.

Importantly, these are pre-game adjustments only. The post-game Elo update remains unchanged because the rest penalty is already reflected in the expected outcome. A team that wins on a road back-to-back naturally receives a larger Elo gain because the win was "more surprising" given the fatigue penalty.

Note that if both games of a back-to-back are played in the same city (e.g. playing New York then Brooklyn, or the Clippers then the Lakers), no travel penalty is applied.
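
The mechanics can be sketched as follows. The penalty values here are hypothetical placeholders that only preserve the ordering described above (same city, then home after travel, then road after travel). They are not the tuned values, and the category and function names are invented for the example.

```python
# Hypothetical Elo penalties; only their ordering reflects the text above.
B2B_PENALTY = {
    "none": 0.0,               # not on a back-to-back
    "same_city": 15.0,         # back-to-back with no travel component
    "home_after_travel": 25.0, # flew home after an away game the night before
    "road_after_travel": 40.0, # travelled to a new city for a road game
}


def adjusted_pregame_diff(home_elo: float, away_elo: float,
                          home_rest: str = "none",
                          away_rest: str = "none") -> float:
    """Home-minus-away Elo difference after temporary rest penalties.

    The stored ratings are never modified; the penalties only affect
    the difference used for the pre-game win probability.
    """
    home_effective = home_elo - B2B_PENALTY[home_rest]
    away_effective = away_elo - B2B_PENALTY[away_rest]
    return home_effective - away_effective


# Away team on a road back-to-back: the home side's edge grows by 40 points.
print(adjusted_pregame_diff(1550.0, 1550.0, away_rest="road_after_travel"))
# Both teams on a same-city back-to-back: the penalties cancel out.
print(adjusted_pregame_diff(1550.0, 1550.0, "same_city", "same_city"))
```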

Live win probability

Once a game is underway, the pre-game estimate is only part of the picture. It provides the starting point, but the model also factors in the current score margin and the time remaining. This produces a live win probability that updates on every play (though it moves more significantly on scoring plays).

The intuition is straightforward: a 10-point lead in the first quarter is relatively modest because there's plenty of time to recover. The same 10-point lead with two minutes remaining is much more difficult to overcome. The model captures this by weighting the score margin more heavily as the game clock winds down.

A long possession where a leading team grabs multiple offensive rebounds will reduce the opposing team's win probability simply because there's less time remaining for them to make up whatever margin they're trailing by. This effect is amplified the closer the game is to completion.

The model was trained on over 800,000 scoring events from six seasons of NBA play-by-play data. By analysing how games actually played out from every possible combination of score margin, time remaining, and pre-game team strength, the model learned the historical relationship between these factors and the eventual outcome.
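
To make the interaction between margin, time and pre-game strength concrete, here is a simple closed-form stand-in. It is not our trained model. It treats the remaining scoring margin as a random walk with normally distributed outcomes, and the function name and the standard deviation of 13 points per full game are illustrative assumptions.

```python
import math
from statistics import NormalDist

GAME_SECONDS = 48 * 60  # regulation length of an NBA game
_STD_NORMAL = NormalDist()


def live_home_win_prob(pregame_prob: float, home_margin: int,
                       seconds_left: float, sd_per_game: float = 13.0) -> float:
    """Home win probability from margin, time left and pre-game strength.

    pregame_prob must be strictly between 0 and 1.
    sd_per_game is an assumed spread of final margins, not a fitted value.
    """
    frac_left = max(seconds_left, 0.0) / GAME_SECONDS
    if frac_left == 0.0:
        # Game over (overtime is ignored in this sketch).
        if home_margin == 0:
            return 0.5
        return 1.0 if home_margin > 0 else 0.0
    # Expected full-game margin implied by the pre-game probability.
    expected_margin = sd_per_game * _STD_NORMAL.inv_cdf(pregame_prob)
    # Remaining drift and uncertainty both shrink as the clock runs down.
    mean_final = home_margin + expected_margin * frac_left
    sd_final = sd_per_game * math.sqrt(frac_left)
    return _STD_NORMAL.cdf(mean_final / sd_final)


# The same 10-point lead is worth far more late than early.
print(round(live_home_win_prob(0.5, 10, 36 * 60), 2))  # end of first quarter
print(round(live_home_win_prob(0.5, 10, 2 * 60), 2))   # two minutes left
```

With a full game remaining and a level score, the function returns the pre-game probability, and as the clock runs down the score margin increasingly dominates.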

Practical considerations

Win probability is a useful tool for understanding the flow of a game, but it comes with important caveats. It does not account for which players are currently on the court, foul trouble, momentum swings, or coaching decisions. It is a statistical estimate based on historical patterns, not a prediction engine.

The model also assumes that both teams will play at a roughly average level for the remainder of the game. In reality, a team trailing by 15 might be about to go on a 20-0 run, but the model has no way of knowing that. What it does know is that historically, teams in that situation come back a certain percentage of the time, and the win probability reflects that base rate.
