Sports rating system

A sports rating system is a system that analyzes the results of sports competitions to provide ratings for teams or players, as an alternative to traditional standings, which are based on win–loss–tie ratios.

In the United States, the biggest use of sports ratings systems is to rate NCAA college football teams in Division I FBS, choosing teams to play in the College Football Playoff. Sports ratings systems are also used to help determine the field for the NCAA men's and women's basketball tournaments, men's professional golf tournaments, professional tennis tournaments, and NASCAR. They are often mentioned in discussions about the teams that could or should receive invitations to participate in certain contests, despite not earning the most direct entrance path (such as a league championship).^[1]

Computer rating systems can tend toward objectivity. Ken Massey writes that an advantage of computer rating systems is that they can "objectively track all the teams" in college basketball (351 at the time), while human polls "have limited value".^[2]

Computer ratings are verifiable and repeatable, and are comprehensive, requiring assessment of all selected criteria. By comparison, rating systems relying on human polls include inherent human subjectivity; this may or may not be an attractive property depending on system needs.

History

Sports rating systems have been around for almost 80 years. Early ratings were calculated on paper rather than by computer, as most are today. Some older computer systems still in use today include:

Dunkel Index, which dates back to 1929.

Before the advent of the College Football Playoff, the Bowl Championship Series championship game participants were determined by a combination of expert polls and computer systems.

Theory

Sports rating systems use a variety of methods for rating teams, but the most prevalent method is called a power rating. The power rating of a team is a calculation of the team's strength relative to other teams in the same league or division. The basic idea is to maximize the number of transitive relations in a given data set of game outcomes. For example, if A defeats B and B defeats C, then one can safely say that A>B>C.

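There are obvious problems with basing a system solely on wins and losses. For example, if C defeats A, then an intransitive relation (A>B>C>A) is established, and no ordering of the three teams is consistent with every result. Rating systems therefore often consider other criteria, such as home field advantage. In most cases though, each team plays a sufficient number of other games during a given season, which lessens the overall effect of such violations.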
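Ranking violations of this kind can be counted directly. The following is a minimal sketch, not any published system's method: the function names are hypothetical, games are assumed to be simple (winner, loser) pairs, and the brute-force search is only practical for a handful of teams.

```python
from itertools import permutations


def count_violations(ranking, games):
    """Count games in which the lower-ranked team beat the higher-ranked one.

    ranking: list of team names, best team first.
    games:   list of (winner, loser) tuples.
    """
    position = {team: i for i, team in enumerate(ranking)}
    return sum(1 for winner, loser in games if position[winner] > position[loser])


def best_rankings(teams, games):
    """Brute-force search for the orderings with the fewest violations."""
    scored = [(count_violations(list(p), games), list(p)) for p in permutations(teams)]
    fewest = min(score for score, _ in scored)
    return fewest, [order for score, order in scored if score == fewest]


# A beats B and B beats C, but C beats A: an intransitive cycle.
games = [("A", "B"), ("B", "C"), ("C", "A")]
fewest, orders = best_rankings(["A", "B", "C"], games)
print(fewest)  # 1: every ordering contradicts at least one result
print(orders)  # three orderings tie for the fewest violations
```

With only these three games, no ordering is violation-free, which is why additional games or additional criteria are needed to separate the teams.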

From an academic perspective, linear algebra and statistics are popular tools among many of the systems' authors for determining their ratings. Some academic work is published in forums like the MIT Sloan Sports Analytics Conference, and other work appears in traditional statistics, mathematics, psychology, and computer science journals.

If there is not enough inter-divisional play, teams in an isolated division may be artificially propped up or down in the overall ratings, because too few games connect them to the other teams in the league. This phenomenon is evident in systems that analyze historical college football seasons.

Goals of some rating systems differ from one another. For example, some systems may be crafted to provide a perfect retrodictive analysis of the games played to date, while others are predictive and give more weight to future trends than to past results. This creates the potential for misinterpretation of rating system results by people unfamiliar with these goals; for example, a rating system designed to give accurate point spread predictions for gamblers might be ill-suited for use in selecting teams most deserving to play in a championship game or tournament.

Rating considerations

Home advantage

When two teams of equal quality play, the team at home tends to win more often. The size of the effect changes based on the era of play, game type, season length, sport, and even the number of time zones crossed. But across all conditions, "simply playing at home increases the chances of winning."^[3] A win away from home is therefore seen more favorably than a win at home, because it was more challenging. Home advantage (which, for sports played on a pitch, is almost always called "home field advantage") also depends on the qualities of the individual stadium and crowd; in the NFL, the difference between the stadium with the least advantage and the stadium with the most can be more than 4 points.^[4]

Strength of schedule

Strength of schedule refers to the quality of a team's opponents. A win against an inferior opponent is usually seen less favorably than a win against a superior opponent. Often teams in the same league, who are compared against each other for championship or playoff consideration, have not played the same opponents. Therefore, judging their relative win–loss records is complicated.

We looked beyond the record. The committee placed significant value on Oregon's quality of wins.
— College football playoff committee chairman Jeff Long, press conference, week 12 of the 2014 season, after ranking 9–1 Oregon above 9–0 Florida State^[5]

The college football playoff committee uses a limited strength-of-schedule algorithm that only considers opponents' records and opponents' opponents' records (similar to RPI).^[6]
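A strength-of-schedule measure built only from opponents' records and opponents' opponents' records can be sketched as follows. This is an illustrative RPI-style calculation, not the committee's actual procedure: the helper names are hypothetical, ties are ignored, and the 0.25/0.50/0.25 weights are the commonly cited basic RPI weights, used here as an assumption.

```python
def win_pct(team, games, exclude=None):
    """Winning percentage of `team`, optionally ignoring games against `exclude`."""
    wins = losses = 0
    for winner, loser in games:
        if exclude in (winner, loser):
            continue
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    played = wins + losses
    return wins / played if played else 0.0


def opponents(team, games):
    """One entry per game played, so repeat opponents count more than once."""
    return [l if w == team else w for w, l in games if team in (w, l)]


def opponents_record(team, games):
    """Average winning percentage of a team's opponents, excluding games against the team."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(win_pct(o, games, exclude=team) for o in opps) / len(opps)


def opponents_opponents_record(team, games):
    """Average of the opponents' own opponents' records."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(opponents_record(o, games) for o in opps) / len(opps)


def rpi(team, games, weights=(0.25, 0.50, 0.25)):
    """Weighted blend of own record, opponents' record, and opponents' opponents' record."""
    w_own, w_opp, w_opp_opp = weights  # assumed weights, see lead-in
    return (w_own * win_pct(team, games)
            + w_opp * opponents_record(team, games)
            + w_opp_opp * opponents_opponents_record(team, games))


# Each tuple is (winner, loser); the schedule is made up for illustration.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A")]
print(round(rpi("A", games), 4))
```

Excluding games against the team being rated keeps a team's own results from leaking into its opponents' records.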

Points versus wins

A key dichotomy among sports rating systems lies in the representation of game outcomes. Some systems store final scores as ternary discrete events: wins, draws, and losses. Other systems record the exact final game score, then judge teams based on margin of victory. Rating teams based on margin of victory is often criticized as creating an incentive for coaches to run up the score, an "unsportsmanlike" outcome.^[7]

Still other systems choose a middle ground, reducing the marginal value of additional points as the margin of victory increases. Sagarin chose to clamp the margin of victory to a predetermined amount.^[8] Other approaches include the use of a decay function, such as a logarithm or placement on a cumulative distribution function.
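The three middle-ground approaches can be sketched as simple transforms of the raw margin. The cap of 21 points and the scale of 14 points are hypothetical values chosen for illustration; they are not taken from Sagarin's or any other published system.

```python
import math


def clamp_margin(margin, cap=21):
    """Limit the margin of victory to +/- cap points (cap is an assumed value)."""
    return max(-cap, min(cap, margin))


def log_margin(margin):
    """Logarithmic decay: each additional point is worth less than the one before."""
    return math.copysign(math.log1p(abs(margin)), margin)


def cdf_margin(margin, scale=14.0):
    """Map the margin onto a normal cumulative distribution function.

    Returns a value between 0 and 1; `scale` (an assumed standard deviation,
    in points) controls how quickly large margins saturate.
    """
    return 0.5 * (1.0 + math.erf(margin / (scale * math.sqrt(2.0))))


for margin in (3, 14, 35):
    print(margin, clamp_margin(margin), round(log_margin(margin), 3), round(cdf_margin(margin), 3))
```

In each case a 35-point win is worth more than a 14-point win, but by much less than the raw difference of 21 points, which weakens the incentive to run up the score.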

In-game information

Beyond points or wins, some system designers choose to include more granular information about the game. Examples include time of possession of the ball, individual statistics, and lead changes. Data about weather, injuries, or "throw-away" games near season's end may affect game outcomes but are difficult to model. "Throw-away games" are games where teams have already earned playoff slots and have secured their playoff seeding before the end of the regular season, and want to rest/protect their starting players by benching them for remaining regular season games. This usually results in unpredictable outcomes and may skew the outcome of rating systems.

Team composition

Teams often shift their composition between and within games, and players routinely get injured. Rating a team is often about rating a specific collection of players. Some systems assume parity among all members of the league, such as each team being built from an equitable pool of players via a draft or free agency system as is done in many major league sports such as the NFL, MLB, NBA, and NHL. This is certainly not the case in collegiate leagues such as Division I-A football or men's and women's basketball.

Cold start

At the beginning of a season, there have been no games from which to judge teams' relative quality. Solutions to the cold start problem often include some measure of the previous season, perhaps weighted by what percent of the team is returning for the new season. ARGH Power Ratings is an example of a system that uses multiple previous years plus a percentage weight of returning players.
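One way to express such a preseason estimate is sketched below. This is a hypothetical blend, not the ARGH formula: the function name, the year weights, and the choice to shrink toward the league mean in proportion to roster turnover are all assumptions made for illustration.

```python
def preseason_rating(past_ratings, returning_fraction, league_mean=0.0,
                     year_weights=(0.6, 0.3, 0.1)):
    """Estimate a starting rating before any games have been played.

    past_ratings:       final ratings from previous seasons, most recent first.
    returning_fraction: share of the team returning this season, from 0 to 1.
    year_weights:       assumed weights for each past season, most recent first.
    """
    if not 0.0 <= returning_fraction <= 1.0:
        raise ValueError("returning_fraction must be between 0 and 1")
    pairs = list(zip(year_weights, past_ratings))
    if not pairs:
        return league_mean  # no history at all: start at the league average
    total_weight = sum(w for w, _ in pairs)
    history = sum(w * r for w, r in pairs) / total_weight
    # The less of the team that returns, the closer the estimate stays to the mean.
    return league_mean + returning_fraction * (history - league_mean)


# A team rated 10.0 and 4.0 in its last two seasons, with 70% of players returning.
print(round(preseason_rating([10.0, 4.0], 0.7), 2))
```

As the season progresses, a system would typically reduce the influence of this prior as real game results accumulate.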

Rating methods

Permutation of standings

Several methods offer some permutation of traditional standings. This search for the "real" win–loss record often involves using other data, such as point differential or identity of opponents, to alter a team's record in a way that is easily understandable. Sportswriter Gregg Easterbrook created a measure of Authentic Games, which only considers games played against opponents deemed to be of sufficiently high quality.^[9] The consensus is that not all wins are created equal.

I went through the first few weeks of games and redid everyone’s records, tagging each game as either a legitimate win or loss, an ass-kicking win or loss, or an either/or game. And if anything else happened in that game with gambling repercussions – a comeback win, a blown lead, major dysfunction, whatever — I tagged that, too.
— Bill Simmons, sportswriter, Grantland^[10]

Pythagorean

Pythagorean expectation, or Pythagorean projection, calculates a percentage based on the number of points a team has scored and allowed. Typically the numerator is the number of points scored, raised to some exponent, and the denominator is that same value plus the number of points allowed, raised to the same exponent. Football Outsiders has used an exponent of 2.37:^[11]

$$\text{Pythagorean wins} = \frac{\text{Points For}^{2.37}}{\text{Points For}^{2.37} + \text{Points Against}^{2.37}} \times \text{Games Played}$$

The resulting percentage is often compared to a team's true winning percentage, and a team is said to have "overachieved" or "underachieved" compared to the Pythagorean expectation. For example, Bill Barnwell calculated that before week 9 of the 2014 NFL season, the Arizona Cardinals had a Pythagorean record two wins lower than their real record.^[12] Bill Simmons cites Barnwell's work before week 10 of that season and adds that "any numbers nerd is waving a “REGRESSION!!!!!” flag right now."^[13] In this example, the Arizona Cardinals' regular season record was 8-1 going into the 10th week of the 2014 season. The Pythagorean win formula implied a winning percentage of 57.5%, based on 208 points scored and 183 points allowed. Multiplied by 9 games played, the Cardinals' Pythagorean expectation was 5.2 wins and 3.8 losses. The team had "overachieved" at that time by 2.8 wins, derived from their actual 8 wins less the expected 5.2 wins, an increase of 0.8 overachieved wins from just a week prior.
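The Cardinals figures above can be reproduced with a few lines of code. This is a small sketch with hypothetical function names; the exponent of 2.37 and the point totals come from the example in the text.

```python
def pythagorean_pct(points_for, points_against, exponent=2.37):
    """Expected winning percentage from points scored and points allowed."""
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


def pythagorean_wins(points_for, points_against, games_played, exponent=2.37):
    """Expected number of wins over the games played so far."""
    return pythagorean_pct(points_for, points_against, exponent) * games_played


# Arizona Cardinals before week 10 of the 2014 season: 8 wins in 9 games.
pct = pythagorean_pct(208, 183)
expected = pythagorean_wins(208, 183, 9)
print(f"{pct:.1%}")           # 57.5%
print(f"{expected:.1f}")      # 5.2
print(f"{8 - expected:.1f}")  # 2.8
```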

Trading "skill points"

The Elo rating system was originally designed by Arpad Elo as a method for ranking chess players, and several people have adapted it for team sports such as basketball, soccer and American football. For instance, Jeff Sagarin and FiveThirtyEight publish NFL football rankings using Elo methods.^[14] Elo ratings initially assign strength values to each team, and teams trade points based on the outcome of each game.
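The point-trading step can be sketched with the standard Elo update. The K-factor of 20 and the starting rating of 1500 are assumptions for illustration, and the sketch omits refinements such as home advantage or margin of victory that sports adaptations often add.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expectation that A beats B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=20.0):
    """Return both teams' new ratings after one game.

    score_a: 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k:       assumed K-factor controlling how many points change hands.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total number of points is conserved.
    return rating_a + delta, rating_b - delta


# Two evenly rated teams: the winner takes 10 points from the loser.
print(update(1500.0, 1500.0, 1.0))  # (1510.0, 1490.0)

# An upset moves more points than an expected result.
underdog, favorite = update(1400.0, 1600.0, 1.0)
print(round(underdog, 1), round(favorite, 1))
```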

Solving equations

Researchers like Matt Mills use Markov chains to model college football games, with team strength scores as outcomes.^[15] Algorithms like Google's PageRank have also been adapted to rank football teams.^[16]^[17]
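A PageRank-style ranking can be sketched as a discrete random walk in which each team passes its rating to the teams that beat it. This is an illustrative sketch under stated assumptions, not Mills's continuous-time model or any cited adaptation: the function name, the damping factor of 0.85, and the fixed iteration count are all hypothetical choices.

```python
def markov_rank(teams, games, damping=0.85, iterations=100):
    """Rate teams by the stationary distribution of a random walk over results.

    games: list of (winner, loser) tuples. A walker standing on a team moves
    to one of the teams that beat it, so rating flows from losers to winners.
    """
    n = len(teams)
    index = {team: i for i, team in enumerate(teams)}
    beaten_by = {team: [] for team in teams}
    for winner, loser in games:
        beaten_by[loser].append(winner)

    rating = [1.0 / n] * n
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps to a random team.
        new = [(1.0 - damping) / n] * n
        for team in teams:
            share = damping * rating[index[team]]
            winners = beaten_by[team]
            if winners:
                for winner in winners:
                    new[index[winner]] += share / len(winners)
            else:
                # Undefeated team: spread its share evenly across all teams.
                for j in range(n):
                    new[j] += share / n
        rating = new
    return dict(zip(teams, rating))


games = [("A", "B"), ("B", "C"), ("A", "C")]
ratings = markov_rank(["A", "B", "C"], games)
print(sorted(ratings, key=ratings.get, reverse=True))  # ['A', 'B', 'C']
```

Because a win over a highly rated team passes along more rating than a win over a weak one, this approach accounts for opponent quality without a separate strength-of-schedule step.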

List of sports rating systems

Advanced NFL Stats, United States of America National Football League

ARGH Power Ratings
ATP rankings, international tennis
Colley Matrix
Dickinson System, United States of America college football
Pomeroy College Basketball Ratings, United States of America college basketball
TrueSkill, a Bayesian ranking system inspired by the Glicko rating system^[18]

Bowl Championship Series computer rating systems

In collegiate American football, the following people's systems were used to choose teams to play in the national championship game.

Anderson & Hester / Seattle Times
Richard Billingsley
Wes Colley / Atlanta Journal-Constitution
Richard Dunkel
Kenneth Massey
Herman Matthews / Scripps Howard
New York Times
David Rothman
Jeff Sagarin / USA Today
Peter Wolfe

References

[1] Sporting News. Retrieved 2011-03-24. This is a look at 20 of the teams (in alphabetical order) residing on this year's big ol' bubble. We've included three statistical rankings. The RPI (ratings percentage index, taken from collegeRPI.com) is considered the standard and is provided to committee members during the selection process. The two other ranking indexes include margin of victory in their formulas—the Pomeroy ratings (at kenpom.com) and Sagarin ratings (via USA Today)—aren't new but have played an increased role in discussions about potential seeds during this college basketball season.

[2] Ken Massey [@masseyratings] (November 3, 2014). "@kenpomeroy human polls have limited value. Computer systems can objectively track all the teams. http://www.masseyratings.com/cb/compare.htm #all351" (Tweet). Retrieved 9 Nov 2014 – via Twitter.

[3] doi:10.1111/j.1559-1816.2010.00641.x. Retrieved 11 November 2014.

[4] Barnwell, Bill (December 20, 2013). "Safe at Home". Grantland. Retrieved November 11, 2014.

[5] Russo, Ralph D. (11 November 2014). "Oregon up to 2 in playoff rankings; TCU to 4th". Associated Press. Retrieved 12 November 2014.

[6] Stewart Mandel [@slmandel] (November 12, 2014). "Committee doesn't use an SOS ranking. It looks at opponents' record and opponents' opponents record" (Tweet). Retrieved 12 Nov 2014 – via Twitter.

[7] Richards, Darryl (2001). "BCS removes margin-of-victory element". Fox Sports. Retrieved 12 November 2014.

[8] Sagarin, Jeff (Fall 2014). "NCAAF Jeff Sagarin Ratings". USA Today. Retrieved 12 November 2014.

[9] Easterbrook, Gregg (18 November 2014). "More flags on D spins scoreboards". ESPN. Retrieved 19 November 2014.

[10] Simmons, Bill (24 October 2014). "Week 8 Picks: A Gambling Epiphany". Grantland. Retrieved 19 November 2014.

[11] ISBN 978-1-4662-4613-3.

[12] Barnwell, Bill (November 5, 2014). "NFL at the Half: Breaking Down the Numbers". Grantland. Retrieved January 7, 2015.

[13] Simmons, Bill (7 November 2014). "Revisiting the Y2K-Compliant Quarterbacks". Retrieved 10 November 2014.

[14] Silver, Nate (4 September 2014). "Introducing NFL Elo Ratings". FiveThirtyEight. Retrieved 10 November 2014.

[15] Mills, Matt (21 December 2014). "Using Continuous-Time Markov Chains to Rank College Football Teams". The Spread. Retrieved 21 December 2014.

[16] LinkedIN. 17 March 2016. Retrieved 17 March 2016.

[17] "Modifying Google's Page Ranking Algorithm to rank teams". Reddit. 21 December 2014. Retrieved 22 December 2014.

[18] Weng, Ruby C.; Lin, Chih-Jen (2011). "A Bayesian Approximation Method for Online Ranking" (PDF). Journal of Machine Learning Research. 12: 267–300.

[19] "Wayne Winston: Analytics in the World of Sports". Indiana University Bloomington - Kelley School of Business - Operations & Decisions Technologies. Nov 25, 2013. Retrieved 8 Nov 2014.

[20] Washington Times. April 13, 2004. Retrieved 8 Nov 2014.
