Most people model win/loss as binary even in settings where the outcome is decided by a continuous margin (e.g. in sports). But collapsing a win to a binary label throws away the information in the point differential. It's better to model the point differential directly and then map it to a win probability (which you can then threshold into win/loss if needed).

For example, you can model the remaining swing in the point differential as normally distributed and use the normal CDF to map the current score to a probability. Let $x$ be the home team score minus the away team score, and let the remaining swing be drawn from a normal distribution with mean 0 and standard deviation $\sigma$. The home team loses if the swing drops the differential to 0 or below, which by symmetry happens with probability $1-\Phi(x, 0, \sigma)$. So $\Phi(x, 0, \sigma)$ is the probability that the home team wins the game.
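As a minimal sketch of this mapping with SciPy (the 5-point lead and $\sigma = 10$ are arbitrary illustrative values; `norm.cdf` computes $\Phi$, just as the `win_prob` function in the full code further down does):

```python
from scipy.stats import norm

x = 5         # home team score minus away team score (illustrative)
sigma = 10.0  # illustrative standard deviation of the remaining swing

p_win = norm.cdf(x, 0, sigma)   # Phi(x, 0, sigma)
p_loss = 1 - p_win

print(round(p_win, 3), round(p_loss, 3))
```

A bigger lead or a smaller $\sigma$ pushes `p_win` toward 1.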

But what do we do about $\sigma$? One option is to fix it to a single value, but then the uncertainty about the score differential at the end of the game is treated the same as the uncertainty at the beginning. That doesn't match reality: a team leading with a minute left on the clock is far more certain to win than a team holding the same lead at the very beginning of the game.

In order to capture this we can allow $\sigma$ to take on larger values earlier on in the game and smaller values towards the end of the game. Intuitively this makes sense. We are more certain about what the win probability should be towards the end of a game than at the beginning.

To do this mathematically we need to model $\sigma$ as a function of time. For example, in an NBA regulation basketball game (which lasts 48 minutes) we could define $\sigma = (48-t)^{0.5}$, where $t$ is the number of minutes that have elapsed in the game.
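A minimal sketch of this schedule, using a hypothetical helper `sigma_at` (the 48-minute game length and square-root exponent are the choices above):

```python
def sigma_at(minutes_elapsed, game_length=48, exponent=0.5):
    """Uncertainty shrinks as the game clock runs down."""
    return (game_length - minutes_elapsed) ** exponent

print(sigma_at(0))    # tip-off: sqrt(48), about 6.93
print(sigma_at(47))   # one minute left: 1.0
```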

The choice of exponent depends on the application. Here we take the square root of the number of minutes left in a regulation game. Raising the remaining time to an exponent below 1 means $\sigma$ still grows with time remaining, but more slowly than time itself, which keeps $\sigma$ from becoming excessively large. At any point in the game, an exponent closer to 1 implies more uncertainty in the win probability, and an exponent closer to 0 implies more certainty.

The graphic below (Python code follows) illustrates this for a fixed score differential at varying points in the game and with varying exponent values (0.2, 0.5, and 0.8). The score differential is fixed at 3 (a 3-point lead for the home team). At any given time, a lower exponent value results in a relatively smaller standard deviation, which in turn results in a win probability closer to 1.

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

t = np.arange(0, 48, step=1)  # minutes elapsed (stops at 47, so sigma > 0)

def sd(time_elapsed, exponent):
	"""Standard deviation of the remaining score swing; shrinks as time runs out."""
	return (48 - time_elapsed)**exponent

def win_prob(pt_diff, sigma):
	"""Home team win probability: Phi(pt_diff, 0, sigma)."""
	return norm.cdf(pt_diff, 0, sigma)

a = sd(t, 0.2)
b = sd(t, 0.5)
c = sd(t, 0.8)
wp_a = win_prob(3, a)
wp_b = win_prob(3, b)
wp_c = win_prob(3, c)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.set_title('Transformed Standard Deviation')
ax1.set_xlabel('Minutes Elapsed')
ax1.set_ylabel('Standard Deviation')
ax1.plot(t, a, lw=2, color='#8fb9a8', label='0.2')
ax1.plot(t, b, lw=2, color='#fcd0ba', label='0.5')
ax1.plot(t, c, lw=2, color='#765d69', label='0.8')
ax1.legend(title='exponent', loc='upper right')

ax2.set_title('Win Probability')
ax2.set_xlabel('Minutes Elapsed')
ax2.set_ylabel('Win Probability')
ax2.plot(t, wp_a, lw=2, color='#8fb9a8', label='0.2')
ax2.plot(t, wp_b, lw=2, color='#fcd0ba', label='0.5')
ax2.plot(t, wp_c, lw=2, color='#765d69', label='0.8')
ax2.legend(title='exponent', loc='lower right')

fig.savefig('./win_prob.svg', format='svg')