Glossary - Seamheads Negro Leagues Database

Glossary

Similarity Scores (Pitchers)

Intro

Similarity scores were created by Bill James and introduced in his 1986 Baseball Abstract. James describes the method as follows:

"One of the most common arguments for any Hall of Fame candidate is the argument that Joe is comparable to Jim and Jim is in the Hall of Fame, so Joe should be, too. Similarity scores are a way of assessing the objective elements of an If-A-then-B argument."


Formula

* Normally, similarity scores use career stats. For the Negro Leagues Database, career stats per 162 games are used.
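
As a rough illustration of the per-162-game adjustment, the sketch below scales a career counting stat to a 162-game basis. The function name `per_162` and the choice of denominator (`games_basis`, for example the number of games a pitcher's teams played) are assumptions made for illustration; the database's exact denominator is not stated here. Rate stats such as ERA and winning percentage would presumably be left unscaled.

```python
def per_162(career_total, games_basis):
    """Scale a career counting stat to a 162-game basis.

    games_basis is a hypothetical denominator (e.g. team games played);
    the database's actual choice is not documented in this glossary.
    """
    if games_basis <= 0:
        raise ValueError("games_basis must be positive")
    return career_total * 162 / games_basis


# Example: 40 wins over 324 games of playing time scales to 20 wins per 162.
print(per_162(40, 324))
```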

This description is taken from Bill James' book "Whatever Happened to the Hall of Fame?":

There are seventeen "penalties" in the system for pitchers. That system is:

1,000 points
minus 1 point for each difference of one win
minus 1 point for each difference of two losses
minus 1 point for each .002 difference in winning percentage, up to a maximum of 100 points
minus 1 point for each .02 of difference in ERA, up to a limit of 100 points
minus 1 point for each 10 difference in games
minus 1 point for each 20 difference in starts
minus 1 point for each 20 difference in complete games
minus 1 point for each 50 difference in innings pitched
minus 1 point for each 50 difference in hits allowed
minus 1 point for each 30 difference in strikeouts
minus 1 point for each 10 difference in walks
minus 1 point for each 5 difference in shutouts
minus 1 point for each 3 difference in saves

To these 13 penalties, we add the following caveats and adjustments:

Subtract 10 points if you are comparing a right-handed pitcher to a left-hander. If they're relief pitchers, make it 25 points.

The penalty for winning percentage cannot be larger than 1.5 times the sum of the penalties for wins and losses. (This prevents a 50-point penalty between a pitcher who is 2-2 and a pitcher who is 2-3. With a full career, this rule will rarely come into play.)

For relief pitchers, the penalty for winning percentage is one-half of what it otherwise would be. (For the two rules which require you to identify a reliever, the definition is that a relief pitcher is any pitcher who (a) makes more relief appearances than starts in his career, and (b) has a career average of less than 4.00 innings per game.)

Since the records of pitchers before 1920 have quite different meaning than the records of pitchers after 1920, there is a 25-point penalty if one of the pitchers compared was born before 1890 and the other was born after 1890, if the two of them were born at least ten years apart (in other words, no 25-point penalty if one was born in 1888 and the other in 1891).
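
The thirteen penalties and four adjustments above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the database's actual implementation. The class and field names are hypothetical, and penalties are treated as proportional (a difference of one loss costs half a point) rather than rounded. The reliever rules are applied only when both pitchers qualify as relievers, and the winning-percentage penalty is capped before it is halved. A birth year of exactly 1890 counts as neither "before" nor "after".

```python
from dataclasses import dataclass


@dataclass
class PitcherLine:
    """Career totals (or per-162-game figures); field names are hypothetical."""
    wins: float
    losses: float
    era: float
    games: float
    starts: float
    complete_games: float
    innings: float
    hits: float
    strikeouts: float
    walks: float
    shutouts: float
    saves: float
    throws: str       # "R" or "L"
    birth_year: int


def win_pct(p):
    decisions = p.wins + p.losses
    return p.wins / decisions if decisions else 0.0


def is_reliever(p):
    # (a) more relief appearances than starts, (b) under 4.00 innings per game
    relief_apps = p.games - p.starts
    return p.games > 0 and relief_apps > p.starts and p.innings / p.games < 4.0


def similarity_score(a, b):
    # Assumption: reliever adjustments apply only if both pitchers are relievers.
    both_relievers = is_reliever(a) and is_reliever(b)

    win_pen = abs(a.wins - b.wins)           # 1 point per win
    loss_pen = abs(a.losses - b.losses) / 2  # 1 point per two losses

    # 1 point per .002, at most 100, and at most 1.5x (win + loss penalties)
    pct_pen = min(abs(win_pct(a) - win_pct(b)) / 0.002,
                  100.0,
                  1.5 * (win_pen + loss_pen))
    if both_relievers:
        pct_pen /= 2  # assumption: halved after the caps are applied

    era_pen = min(abs(a.era - b.era) / 0.02, 100.0)

    # (field, size of difference that costs one point)
    steps = [("games", 10), ("starts", 20), ("complete_games", 20),
             ("innings", 50), ("hits", 50), ("strikeouts", 30),
             ("walks", 10), ("shutouts", 5), ("saves", 3)]
    counting_pen = sum(abs(getattr(a, f) - getattr(b, f)) / step
                       for f, step in steps)

    hand_pen = 0
    if a.throws != b.throws:
        hand_pen = 25 if both_relievers else 10

    # 25 points if the birth years straddle 1890 and are at least ten apart
    early, late = sorted((a.birth_year, b.birth_year))
    birth_pen = 25 if (early < 1890 < late and late - early >= 10) else 0

    return 1000 - (win_pen + loss_pen + pct_pen + era_pen
                   + counting_pen + hand_pen + birth_pen)


# The 2-2 versus 2-3 case from the text, with every other stat identical.
stats = dict(era=3.00, games=5, starts=5, complete_games=1, innings=30,
             hits=28, strikeouts=20, walks=10, shutouts=0, saves=0,
             throws="R", birth_year=1900)
p1 = PitcherLine(wins=2, losses=2, **stats)
p2 = PitcherLine(wins=2, losses=3, **stats)
print(similarity_score(p1, p2))  # 998.75
```

In the 2-2 versus 2-3 example, the raw winning-percentage penalty would be 50 points (.100 at 1 point per .002), but the cap of 1.5 times the win and loss penalties (1.5 × 0.5) limits it to 0.75, so the two records differ by only 1.25 points in total.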


If you have any questions regarding Negro Leagues statistical or biographical data, please contact gary@seamheads.com. For any other questions/comments/suggestions, please contact the web developer at BaseballGauge@gmail.com.

All biographical data, copyright 2011-2018 Gary Ashwill.

Playing statistics for 1887-1922 and 1926-1938, as well as all Cuban League games (1902-1928) and Negro League vs. Major League games (1887-1944), copyright 2011-2018 Gary Ashwill.

Playing statistics for 1923 (except Negro League vs. Major League games), copyright 2011-2018 Patrick Rock.

Playing statistics for 1933 and 1943, copyright 2013-2018 Scott Simkus.

Playing statistics for 1924-1925, 1939-1942, and 1944-1946 Negro Leagues (not including Cuban League and Negro League vs. Major League games), copyright 2011-2018 Larry Lester, Wayne Stivers, Gary Ashwill.


Defensive Regression Analysis data used here was obtained with permission from Michael Humphreys, author of Wizardry.

Win Shares are calculated using the formula in the book Win Shares by Bill James.