Performance rating (chess)
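Performance rating (abbreviated as Rp) in chess is the level at which a player performed in a tournament or match, based on the number of games played, their total score in those games, and the ratings of their opponents. In its strict form, it is the rating at which the player's expected score against those opponents equals the score they actually achieved, which has to be found numerically.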
Due to the difficulty of computing performance rating in this manner, however, the linear method and FIDE method for calculating performance rating are in much more widespread use. With these simpler methods, only the average rating of the opponents (abbreviated as Rc) factors into the calculation instead of the rating of each individual opponent. Regardless of the method, only the total score is used to determine performance rating instead of individual game results. FIDE performance ratings are also used to determine if a player has achieved a norm for FIDE titles such as Grandmaster (GM).
Definition
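A player's performance rating in a series of games is the rating at which their expected score against the same opponents would equal the score they actually achieved.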
Mathematical definition
Given a total score $s$ over a series of $n$ games and opponent ratings $r_1, \dots, r_n$, the perfect performance rating is the number $R_p$ where the expected score on the right equals the actual score on the left:

$$s = \sum_{i=1}^{n} \frac{1}{1 + 10^{(r_i - R_p)/400}}$$
Note that the two edge cases have unusual results:
- If someone loses all their games ($s = 0$), their performance rating is technically $-\infty$.
- If someone wins all their games ($s = n$), their performance rating is technically $+\infty$.
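The divergence is easiest to see in a simplified case. As an illustration only, assume that all $n$ opponents share a single rating $r$ (a special case, not the general definition), and write $s$ for the total score and $R_p$ for the performance rating. The defining equation can then be solved in closed form:

$$
s = \frac{n}{1 + 10^{(r - R_p)/400}} \quad\Longrightarrow\quad R_p = r + 400 \log_{10}\frac{s}{n - s}
$$

As $s \to 0$ the logarithm tends to $-\infty$, and as $s \to n$ it tends to $+\infty$, so no finite rating matches a zero or a perfect score.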
Calculation
Since the expected score is a monotonically increasing function of the player's own rating, we can find the performance rating $R_p$ by performing binary search over the domain. This means we set a lower and an upper bound for reasonable ratings (here 0 and 4000), then check how much someone rated at the midpoint (2000) should have scored. If the actual score is more, this means the performance was better than 2000, so we repeat the search on the upper half of the interval (between 2000 and 4000, midpoint at 3000). Otherwise, we repeat it on the lower half.
A sample implementation in Python follows:
    def expected_score(opponent_ratings: list[float], own_rating: float) -> float:
        """How many points we expect to score in a tourney with these opponents"""
        return sum(
            1 / (1 + 10**((opponent_rating - own_rating) / 400))
            for opponent_rating in opponent_ratings
        )

    def performance_rating(opponent_ratings: list[float], score: float) -> int:
        """Calculate mathematically perfect performance rating with binary search"""
        lo, hi = 0, 4000
        while hi - lo > 0.001:
            mid = (lo + hi) / 2
            # Expected score too low: the true performance lies above mid
            if expected_score(opponent_ratings, mid) < score:
                lo = mid
            else:
                hi = mid
        return round(mid)

    print(performance_rating([1851, 2457, 1989, 2379, 2407], 4))  # should be 2551
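With fixed bounds of 0 and 4000, the sample above runs into those bounds for zero and perfect scores instead of reporting the infinite values from the definition. A possible variant is sketched below under stated assumptions: the names `performance_rating_safe` and `tol` are hypothetical, the bracket is started 400 points outside the opponents' rating range and widened in 400-point steps (an arbitrary choice), and the result is returned unrounded.

```python
import math


def expected_score(opponent_ratings, own_rating):
    """Expected total score against these opponents (logistic Elo curve)."""
    return sum(1 / (1 + 10 ** ((r - own_rating) / 400)) for r in opponent_ratings)


def performance_rating_safe(opponent_ratings, score, tol=0.001):
    """Binary search for the performance rating, with explicit edge cases.

    Hypothetical helper: returns -inf / +inf for zero / perfect scores.
    """
    n = len(opponent_ratings)
    if n == 0 or not 0 <= score <= n:
        raise ValueError("score must lie between 0 and the number of games")
    if score == 0:
        return -math.inf
    if score == n:
        return math.inf

    # Start with a bracket around the opponents' ratings, then widen it
    # until the expected score at lo is below, and at hi above, the target.
    lo = min(opponent_ratings) - 400
    hi = max(opponent_ratings) + 400
    while expected_score(opponent_ratings, lo) > score:
        lo -= 400
    while expected_score(opponent_ratings, hi) < score:
        hi += 400

    # Standard bisection on a monotonically increasing function.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_score(opponent_ratings, mid) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    opponents = [1851, 2457, 1989, 2379, 2407]
    print(performance_rating_safe(opponents, 4))  # same input as the sample above
    print(performance_rating_safe(opponents, 5))  # perfect score
```

Widening the bracket on demand avoids hard-coding a rating range, at the cost of a few extra evaluations of the expected score.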
FIDE performance rating
FIDE calculates a player's performance rating as $R_p = R_c + d_p$, where $R_c$ is the average rating of the opponents and $d_p$ is an additional rating difference based on the player's total score divided by the number of rounds played. That fractional score is called $p$. There is no simple formula for $d_p$; it is read from a lookup table indexed by $p$.
Like the true definition, the FIDE method also does not depend on individual game results. Unlike the true definition, the FIDE method does not depend on individual opponent ratings.[3]
Rating difference examples
Note: Zero scores have $d_p = -800$, even scores have $d_p = 0$, and perfect scores have $d_p = +800$.
Rating differences over 8 rounds:

| Negative scores | | | | | | | |
|---|---|---|---|---|---|---|---|
| Score | ½ | 1 | 1½ | 2 | 2½ | 3 | 3½ |
| $p$ | 0.06 | 0.13 | 0.19 | 0.25 | 0.31 | 0.37 | 0.44 |
| $d_p$ | −444 | −322 | −251 | −193 | −141 | −95 | −43 |
| Positive scores | | | | | | | |
| Score | 4½ | 5 | 5½ | 6 | 6½ | 7 | 7½ |
| $p$ | 0.56 | 0.63 | 0.69 | 0.75 | 0.81 | 0.87 | 0.94 |
| $d_p$ | +43 | +95 | +141 | +193 | +251 | +322 | +444 |

Rating differences over 9 rounds:

| Negative scores | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Score | ½ | 1 | 1½ | 2 | 2½ | 3 | 3½ | 4 |
| $p$ | 0.06 | 0.11 | 0.17 | 0.22 | 0.28 | 0.33 | 0.39 | 0.44 |
| $d_p$ | −444 | −351 | −273 | −220 | −166 | −125 | −80 | −43 |
| Positive scores | | | | | | | | |
| Score | 5 | 5½ | 6 | 6½ | 7 | 7½ | 8 | 8½ |
| $p$ | 0.56 | 0.61 | 0.67 | 0.72 | 0.78 | 0.83 | 0.89 | 0.94 |
| $d_p$ | +43 | +80 | +125 | +166 | +220 | +273 | +351 | +444 |

Rating differences over 10 rounds:

| Negative scores | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Score | ½ | 1 | 1½ | 2 | 2½ | 3 | 3½ | 4 | 4½ |
| $p$ | 0.05 | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 |
| $d_p$ | −470 | −366 | −296 | −240 | −193 | −149 | −110 | −72 | −36 |
| Positive scores | | | | | | | | | |
| Score | 5½ | 6 | 6½ | 7 | 7½ | 8 | 8½ | 9 | 9½ |
| $p$ | 0.55 | 0.60 | 0.65 | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 |
| $d_p$ | +36 | +72 | +110 | +149 | +193 | +240 | +296 | +366 | +470 |

Rating differences over 11 rounds:

| Negative scores | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Score | ½ | 1 | 1½ | 2 | 2½ | 3 | 3½ | 4 | 4½ | 5 |
| $p$ | 0.05 | 0.09 | 0.14 | 0.18 | 0.23 | 0.27 | 0.32 | 0.36 | 0.41 | 0.45 |
| $d_p$ | −470 | −383 | −309 | −262 | −211 | −175 | −133 | −102 | −65 | −36 |
| Positive scores | | | | | | | | | | |
| Score | 6 | 6½ | 7 | 7½ | 8 | 8½ | 9 | 9½ | 10 | 10½ |
| $p$ | 0.55 | 0.59 | 0.64 | 0.68 | 0.73 | 0.77 | 0.82 | 0.86 | 0.91 | 0.95 |
| $d_p$ | +36 | +65 | +102 | +133 | +175 | +211 | +262 | +309 | +383 | +470 |
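
The lookup can be sketched in code. The example below is an illustration under stated assumptions, not FIDE's own procedure: it uses only the positive-side values quoted in the tables above, plus 0 for an even score and 800 for a perfect score, and relies on the symmetry visible in the tables (the rating difference for $1 - p$ is the negative of that for $p$). The names `DP_EXCERPT`, `fide_dp` and `fide_performance` are hypothetical, and rounding the fractional score to the nearest percentage point with Python's `round` is an assumption.

```python
# Excerpt of the rating difference table, keyed by the percentage score
# (100 * p). Only values quoted in this article are included; the full
# FIDE table has an entry for every percentage point.
DP_EXCERPT = {
    50: 0, 55: 36, 56: 43, 59: 65, 60: 72, 61: 80, 63: 95, 64: 102,
    65: 110, 67: 125, 68: 133, 69: 141, 70: 149, 72: 166, 73: 175,
    75: 193, 77: 211, 78: 220, 80: 240, 81: 251, 82: 262, 83: 273,
    85: 296, 86: 309, 87: 322, 89: 351, 90: 366, 91: 383, 94: 444,
    95: 470, 100: 800,
}


def fide_dp(score: float, rounds: int) -> int:
    """Rating difference for a score over a number of rounds.

    Raises KeyError when the percentage is missing from the excerpt.
    """
    percent = round(100 * score / rounds)  # assumed rounding convention
    if percent >= 50:
        return DP_EXCERPT[percent]
    # Scores below 50% mirror the positive side with the sign flipped.
    return -DP_EXCERPT[100 - percent]


def fide_performance(avg_opponent_rating: float, score: float, rounds: int) -> float:
    """Average opponent rating plus the table-based rating difference."""
    return avg_opponent_rating + fide_dp(score, rounds)


if __name__ == "__main__":
    print(fide_performance(2400, 7, 10))  # 7/10 is p = 0.70, so 2400 + 149
    print(fide_performance(2400, 3, 10))  # 3/10 is p = 0.30, so 2400 - 149
```

Integer percentage keys are used instead of floating-point values of $p$ so that the dictionary lookup does not depend on exact float equality.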
Use in norms
One of the requirements to earn a FIDE title in a standard manner is to achieve a certain number of norms. A norm in chess is awarded if a player has a performance rating in a tournament at or above a threshold rating. As an example, for the Grandmaster (GM) title, a player must achieve three GM norms, each corresponding to a performance rating of at least 2600 against opponents with an average rating of at least 2380, and must also have reached a peak live rating of at least 2500. These norms are calculated with the FIDE performance rating method.[4]
Linear performance rating
Because the FIDE method needs a lookup table to obtain the rating difference, another simpler method instead calculates the rating difference as $d_p = 800p - 400$, where $p$ is the percentage score expressed as a fraction between 0 and 1 (total score divided by the number of games). The overall performance rating is then calculated as $R_p = R_c + d_p$, the same as in the FIDE method.
An equivalent way to calculate this performance rating is by taking the average of
- the opponent's rating + 400 for each win
- the opponent's rating − 400 for each loss
- the opponent's rating, unchanged, for each draw.
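
The equivalence of the two formulations can be checked numerically. The sketch below is illustrative only; the function names and the `(opponent_rating, points)` game representation are assumptions made for this example.

```python
def linear_from_formula(opponent_ratings, score):
    """Average opponent rating plus 800 * p - 400, with p = score / games."""
    n = len(opponent_ratings)
    avg_rating = sum(opponent_ratings) / n
    p = score / n
    return avg_rating + 800 * p - 400


def linear_from_games(games):
    """Average of per-game values: rating + 400 (win), - 400 (loss), + 0 (draw).

    `games` is a list of (opponent_rating, points) pairs with points in
    {1.0, 0.5, 0.0}.
    """
    bonus = {1.0: 400, 0.5: 0, 0.0: -400}
    per_game = [rating + bonus[points] for rating, points in games]
    return sum(per_game) / len(per_game)


if __name__ == "__main__":
    # Hypothetical results: two wins and one draw.
    games = [(2400, 1.0), (2500, 1.0), (2600, 0.5)]
    ratings = [rating for rating, _ in games]
    score = sum(points for _, points in games)
    print(linear_from_formula(ratings, score))
    print(linear_from_games(games))
```

Both functions return the same value because the per-game bonuses sum to 400 × (wins − losses), which equals (800p − 400) times the number of games.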
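A disadvantage becomes obvious: an additional win against a low-rated player can actually lower a player's performance rating. This happens whenever the opponent is rated more than 400 points below the player's performance rating so far, because the value added to the average (the opponent's rating + 400) is then below that average.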
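
A small numerical illustration of this effect, with hypothetical ratings chosen for the example and a hypothetical helper name `linear_performance`:

```python
def linear_performance(opponent_ratings, score):
    """Linear method: average opponent rating + 800 * (score / games) - 400."""
    n = len(opponent_ratings)
    return sum(opponent_ratings) / n + 800 * score / n - 400


if __name__ == "__main__":
    # Two wins against 2400-rated opponents.
    before = linear_performance([2400, 2400], 2)
    # The same two wins plus a third win against an 1800-rated opponent.
    after = linear_performance([2400, 2400, 1800], 3)
    print(before, after)  # the extra win pulls the rating down
```

Here the perfect score stays perfect, but the extra opponent drags the average rating down by 200 points, so the performance rating falls from 2800 to 2600.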
This method is sometimes called the linear method due to the linear dependence on the percentage score $p$. Like the true definition, the linear method also does not depend on individual game results. Unlike the true definition, the linear method does not depend on individual opponent ratings.[5]
Comparison between methods
Different methods for calculating the performance rating generally give similar results. The only case in which all methods give exactly the same result is an even score against opponents whose ratings are not skewed away from their average, in which case the performance rating is the average of the opponents' ratings. Discrepancies grow as the score approaches zero or a perfect score, and as the variance in the individual ratings increases (in which case the individual ratings have a larger effect). The true definition of the performance rating gives −∞ for a zero score and +∞ for a perfect score, whereas the other methods yield finite values.[1]
As a specific example, if a player scores 2½/3 against three opponents rated 2400, 2500, and 2600, their performance ratings with the different methods are 2785 (true definition), 2773 (FIDE), and 2767 (linear).[1]
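
The FIDE and linear figures in this example can be checked by hand. The arithmetic below is only a sketch: it writes $R_c$ for the average opponent rating, $p$ for the fractional score and $d_p$ for the rating difference, takes $d_p = 273$ for $p = 0.83$ from the rating difference tables in this article, and uses the linear rule $d_p = 800p - 400$. The true-definition figure has to be found numerically and is not reproduced here.

$$
\begin{aligned}
R_c &= \frac{2400 + 2500 + 2600}{3} = 2500, \qquad p = \frac{2.5}{3} \approx 0.83 \\
\text{FIDE:}\quad R_p &= R_c + d_p(0.83) = 2500 + 273 = 2773 \\
\text{Linear:}\quad R_p &= R_c + 800p - 400 \approx 2500 + 666.7 - 400 \approx 2767
\end{aligned}
$$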
References
- [1] "Performance calculator". Kivij. Retrieved 22 October 2020.
- [2] "Elo Rating Performance Calculator". Paxmans. Retrieved 22 October 2020.
- [3] "B. Permanent Commissions / 02. FIDE Rating Regulations (Qualification Commission) / FIDE Rating Regulations effective from 1 July 2017". FIDE. Retrieved 22 October 2020.
- [4] "B. Permanent Commissions / 01. International Title Regulations (Qualification Commission) / FIDE Title Regulations effective from 1 July 2017". FIDE. Retrieved 22 October 2020.
- [5] "Performance calculator". Kivij. Retrieved 22 October 2020.