Probability in MTG: What's in a Win Rate? (Part 1)


Win rate is a term that is often thrown around in gaming communities, and it is a primary metric by which many people decide how they play the game. While seemingly simple, a closer look reveals that win rate is quite a complex parameter to estimate, particularly in the context of Magic: The Gathering. A practical consequence is that care should be taken whenever an estimate of a win rate is used for decision making, such as choosing a deck for a tournament. In this 3-part series of articles, I discuss the concept and estimation of win rate in MTG across different sub-contexts. In particular, I include discussions on defining win rate (Part 1), estimating a deck’s true win rate (Part 2), and estimating your win rate as a measure of skill in the game (Part 3).

Part 1: Defining Win Rate

In my opinion, the difficulty in properly defining win rate is that the term itself seems so intuitive. If someone without any background in Mathematics or Statistics is asked to define a win rate, they would likely say something like “it’s how good a deck is” or “it’s how good a player is,” and they would be correct. However, this general definition does not advance our ability to estimate a win rate, nor our ability to evaluate how good a given estimate of a win rate is. Thus, we start with some basics.

A basic definition of win rate that serves our purpose is as follows: a win rate is a parameter of the form Pr(Deck A, Pilot B wins versus Deck C, Pilot D), which is short for the probability that Deck A piloted by Player B wins versus Deck C piloted by Player D. As a probability, a win rate can only take values from 0 to 1. As a parameter, it is a fixed but unknown number. Substituting different inputs for A, B, C, and D would result in different win rates. Some examples are as follows:

Pr(Example 1): Pr(UW Control, PVDDR wins versus Jeskai Fires, Marcio Carvalho)

Pr(Example 2): Pr(UW Control, any pilot wins versus Jeskai Fires, any pilot)

Pr(Example 3): Pr(Any deck, PVDDR wins versus any deck, Marcio Carvalho)

Pr(Example 4): Pr(UW Control, you win versus any deck, other players on the ranked ladder)

Pr(Example 5): Pr(Jeskai Fires, you win versus any deck, other players on the ranked ladder)

The first example specifies both players and decks, while the second example specifies only the decks and the third specifies only the players. As one can surmise, the second example may be useful when choosing a deck to pilot for a tournament or for climbing the ranked ladder, while the third example is useful for gauging the skill difference between two players. However, the most useful for a specific player would likely be Examples 4 and 5. Suppose you only have access to these two decks, and you know for a fact that Pr(Example 4) = 0.50 while Pr(Example 5) = 0.70. Then the only logical choice for you would be to play Jeskai Fires on the ladder. However, there’s the rub: you do not know what the values of those two parameters are. This is, in fact, one of the main goals of the entire study of Statistics: finding a way to make “good guesses” about the value of a parameter through data.

Before we can tackle making guesses on Examples 2, 3, 4, and 5, we must first consider the much simpler case of Example 1. Even in this case, getting a good estimate is more difficult than it may seem. For Example 1, we want to find the probability that PVDDR wins with UW Control when up against Jeskai Fires piloted by Marcio Carvalho. This is, of course, the finals match of the recently concluded Magic World Championship XXVI, where PVDDR won (which is awesome because I chose him as my Champion in the “Choose your Champion” Event). The best estimator for this parameter is simply the sample proportion X/N, where X is the number of matches that PVDDR won versus Carvalho in the matchup and N is the number of matches the two of them played. Among other things that this estimator has going for it, Borel’s Law of Large Numbers says that it will approach the true value of the parameter of interest as N goes to infinity (for more details on why this is the “best” estimator, see here). The problem with this estimator is that N needs to be sufficiently large for the estimate to be sufficiently precise. To avoid a lengthy discussion on precision and confidence intervals (which can be found here), I will demonstrate this by example.
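To see the Law of Large Numbers at work, here is a minimal simulation sketch. It assumes a made-up true win rate of 0.55 (a hypothetical value chosen only for illustration, not an estimate of the actual matchup) and treats every match as an independent coin flip with that probability. The names simulate_estimate and TRUE_WIN_RATE are likewise just illustrative.

```python
import random


def simulate_estimate(true_win_rate, n_matches, seed=0):
    """Simulate n_matches independent matches and return the sample proportion X/N."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    wins = sum(rng.random() < true_win_rate for _ in range(n_matches))
    return wins / n_matches


# Hypothetical "true" win rate, assumed only for this illustration.
TRUE_WIN_RATE = 0.55

if __name__ == "__main__":
    for n in (4, 20, 100, 1000, 100000):
        estimate = simulate_estimate(TRUE_WIN_RATE, n, seed=n)
        print(f"N = {n:>6}: X/N = {estimate:.3f}")
```

With only a handful of matches, X/N can land far from the assumed 0.55; as N grows, the estimates should settle closer and closer to it.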

In the Magic World Championship XXVI, this matchup was played between these two players four times: once before the finals, where PVDDR won, and three times in the finals, where Carvalho won two and PVDDR won one. Thus, if we assume that these matches are independent events (an issue I will discuss further), then the estimate for Pr(Example 1) would be X/N = 2/4 = 0.50. However, the 95% confidence interval for this would be roughly [0, 1] (calculator here). That is, we can be 95% confident that the true value of Pr(Example 1) lies between 0 and 1. This is, as you can deduce, a useless estimate (since this covers virtually all possible values of a probability!). On the other hand, had the two of them played 100 matches with the same estimate (i.e., with PVDDR winning 50 out of 100 matches), the interval would be [0.402, 0.598], which would be more useful as it allows us to infer that the matchup is actually somewhat even. Increasing the sample size even further would likewise produce narrower and narrower intervals until our precision on the target parameter is as desired.
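For readers who want to check the arithmetic, the sketch below computes a 95% interval using the normal-approximation (Wald) formula, X/N plus or minus 1.96 times the square root of p(1 - p)/N. This is an assumption on my part: the method behind the linked calculator is not stated, but this formula reproduces the [0.402, 0.598] interval for 50 wins in 100 matches. The function name wald_interval is illustrative.

```python
import math


def wald_interval(wins, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a win rate.

    z = 1.96 corresponds to a 95% confidence level. The bounds are
    clipped to [0, 1] because a probability cannot leave that range.
    """
    p_hat = wins / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


if __name__ == "__main__":
    for wins, n in [(2, 4), (50, 100)]:
        low, high = wald_interval(wins, n)
        print(f"{wins}/{n}: estimate = {wins / n:.2f}, "
              f"95% CI = [{low:.3f}, {high:.3f}]")
```

Under this formula, 2 wins in 4 matches gives an interval of about [0.01, 0.99], which is the whole range for all practical purposes. The Wald formula is also known to behave poorly for very small N, so the four-match figure should be read as a rough indication only.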

Of course, the obvious problem is that I don’t think either of them would be willing to actually play this matchup 100 times with the same level of enthusiasm and competitiveness each time. However, if they did, then the resulting data would certainly provide a good estimate of Pr(Example 1). Also, while they might not want to do it, you and your friend may have both the time and gusto to see such an experiment through. In that case, the parameter that we would be estimating is Pr(UW Control, you win versus Jeskai Fires, your friend). Having a good estimate of this parameter has little practical value, though. If after just 50 games you find that your estimated win rate is 0.70, with a 95% confidence interval of [0.573, 0.827], this does not mean that you are a very good player. It does mean that you can be at least 95% confident that you have a higher than 57% chance to defeat your friend when you are on UW Control and he is on Jeskai Fires. So, suppose you are paired up against him at your local FNM in this matchup the next day; you would actually have a statistical advantage (which, again, does not help you in any way except perhaps by boosting your confidence and/or impairing his).

In summary, a win rate is a hidden number that can be estimated. Every deck has a win rate, and so does every player. This article showed how to estimate a very specific win rate, where both decks and both players are defined. In the next part, I will discuss how to estimate win rates where one of the decks is defined but everything else is not, as well as how to evaluate estimates of such win rates that one can find online.

May the shuffler be with you.
