This purports to be a detailed explanation of the simbasev3 engine. It
may not be 100% up to date, as minor changes have been made to the engine
since inception.
PLAYER CARDS
------------
Players are rated in a lot of things. The following things have 100 averages:
-- ball, strike, foul, in-play
-- groundball, flyball, line-drive
-- hard (how hard the ball is hit) for hitters
The following things have 50 averages:
-- speed, arm, defense at each position
-- hard (how hard the ball is hit) for pitchers
-- hitter stamina
Pitcher stamina, which is approximately the number of pitches a pitcher can throw before needing to exit, is
based on role. The process by which pitcher stamina evolves is dependent on age, but basically it aims for 90
for starters, 45 for middle relievers, and 15 for relievers.
Literally all attributes in the game are designed to peak at age 25 (with the exception of pitcher-stamina
which is a little wonkier and harder to assess). The attributes have different raw variance, but are scaled to
contribute comparable amounts to the quality of a player -- however, there are synergies (speed is more
important for a groundball hitter, to name one).
BATTER VERSUS PITCHER CONFLICT
------------------------------
Each batter and pitcher has a rating for each of four pitch outcomes: ball, strike, foul, and in-play. The
probability of an outcome is proportional to the product of the batter and pitcher ratings for that outcome,
multiplied by a normalization constant: 1.7 for balls, 1.2 for in-play, and 1 for strikes and fouls (these were
picked empirically to yield walk and strikeout rates comparable to MLB in simulation). Pitches then proceed
until the at-bat is resolved (four balls, three strikes, or in-play). So, if we have the following cards:
           BA   ST   FO   IP
HITTER     90  110  100   80
PITCHER   100  130  120   90
we multiply the odds to get (dividing by 100 throughout):
OUTCOME    90  143  120   72
We can then multiply by the normalization to get:
OUTCOME    90  143  120   72
NORM      1.7    1    1  1.2
PROB      153  143  120   86
and the outcome of every pitch is a random variable with those odds, i.e. odds of a ball for this
batter-pitcher encounter are 153/(153+143+120+86), and so on.
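As a rough sketch (not the engine's actual code), the per-pitch odds and the pitch-by-pitch loop could look like this in Python. The function and key names are made up for illustration. The text does not say how fouls affect the count, so the usual baseball rule (a foul is a strike unless there are already two) is an assumption here.

```python
import random

# Normalization constants quoted in the text.
NORM = {"ball": 1.7, "strike": 1.0, "foul": 1.0, "in_play": 1.2}

def pitch_weights(hitter, pitcher):
    """Unnormalized odds: hitter rating * pitcher rating / 100 * constant."""
    return {k: hitter[k] * pitcher[k] / 100 * NORM[k] for k in NORM}

def pitch_probabilities(hitter, pitcher):
    """Scale the weights so they sum to 1."""
    weights = pitch_weights(hitter, pitcher)
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def simulate_at_bat(hitter, pitcher, rng=random):
    """Throw pitches until a walk, a strikeout, or a ball in play."""
    probs = pitch_probabilities(hitter, pitcher)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    balls = strikes = 0
    while True:
        pitch = rng.choices(outcomes, weights=weights)[0]
        if pitch == "in_play":
            return "in_play"
        if pitch == "ball":
            balls += 1
            if balls == 4:
                return "walk"
        elif pitch == "strike":
            strikes += 1
            if strikes == 3:
                return "strikeout"
        elif strikes < 2:
            # ASSUMPTION: a foul counts as a strike unless there are two already.
            strikes += 1
```

With the example cards above, pitch_weights gives 153, 143, 120 and 86.4 (the table rounds the last one to 86).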
BATTED BALL TYPE
----------------
When a ball is in-play, the first thing that happens is that batted ball type is determined. Again, we look at
the hitter and pitcher cards and take products. The normalization here comes from the actual MLB ratios of 45
GB : 36 FB : 19 LD. So, with the table condensed, we might have:
           GB   FB   LD
BATTER    110  140  120
PITCHER   130  100   90
OUTCOME   143  140  108
NORM       45   36   19
PROB       64   50   21
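The same product-and-normalize step for batted ball type, as a small Python sketch. The names are hypothetical; only the 45/36/19 ratios and the example ratings come from the text.

```python
import random

# MLB ratio of ground balls : fly balls : line drives, as quoted in the text.
NORM_BB = {"GB": 45, "FB": 36, "LD": 19}

def batted_ball_weights(batter, pitcher):
    """batter * pitcher / 100, then scaled by the MLB share (in percent)."""
    return {k: batter[k] * pitcher[k] / 100 * NORM_BB[k] / 100 for k in NORM_BB}

def batted_ball_type(batter, pitcher, rng=random):
    """Draw GB, FB or LD with odds proportional to the weights."""
    weights = batted_ball_weights(batter, pitcher)
    kinds = list(weights)
    return rng.choices(kinds, weights=[weights[k] for k in kinds])[0]
```

For the table above the weights come out to 64.35, 50.4 and 20.52, i.e. roughly a 48% / 37% / 15% split between grounders, flies and liners.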
The batted ball types differ a lot. In MLB, ground balls have a batting average of .270 or so, fly balls
.210 or so, and line drives about .770. I tried to sim this, but there's a good chance I missed by a bunch.
However, it's definitely true that line drives have a very high BA, and fly balls have the lowest BA but can
still be desirable because only fly balls can turn into home runs (and non-HR FB hits are usually for extra
bases, while grounders are usually singles with the occasional double.)
HARDNESS OF BATTED BALL
-----------------------
The first thing determined after the batted ball type is the speed with which the ball is hit.
Both pitcher and batter will have a HARD rating. Pitcher ratings will average 50, while batter ratings will
average 100. Each combatant will contribute a random variable between 0 and their rating to the ball speed, so
if for instance the pitcher rating is 55 and the batter rating is 97, a random real number between 0 and 55
and one between 0 and 97 will be generated and added to get the batted ball speed. Batter ratings have twice
the standard deviation of pitcher ratings, so the batter is responsible for the lion's share of the variance
here. Average speed is 75. In general, it is always better to hit the ball harder. This will be discussed for
each batted ball type below.
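In Python terms, the ball speed is just the sum of two uniform draws. This is a sketch with made-up names; it assumes the "random real number" is drawn uniformly.

```python
import random

def batted_ball_speed(batter_hard, pitcher_hard, rng=random):
    """Sum of a uniform draw on [0, batter_hard] and one on [0, pitcher_hard]."""
    return rng.uniform(0, batter_hard) + rng.uniform(0, pitcher_hard)

def average_speed(batter_hard, pitcher_hard):
    """Each uniform draw averages half its rating."""
    return (batter_hard + pitcher_hard) / 2
```

An average batter (100) against an average pitcher (50) gives average_speed(100, 50) = 75, matching the figure above; the 97-versus-55 example averages 76.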
GROUND BALLS
------------
A defensive player is chosen at random with relative weights 3B 4, SS 6, 2B 4, 1B 2. That player then tries to
make a roll that exceeds the ball speed: the roll is rand(110) plus their defense. Since defense averages 50, the
average roll will be 105. If the roll exceeds the ball speed, the defender fields the ball and has a chance
to make a play; if not, the ball goes into the outfield and will be a hit.
If the player does field the ball and there is a runner on first, a check is made for a double play on
the batter. Higher ball speed and lower batter-runner speed make a DP more likely. The DP check
(contingent on fielding the ball) depends only on those two variables; the defense of the fielder or pivot man,
any arm ratings, and the speed of the runner on first are all irrelevant. If the DP check fails but the ball
is hit fast enough (speed over 60), the result is a force out at second. Runners on second and third always advance.
The batter-runner can also beat out an infield single. The check for this involves their speed and the
defender's defense: specifically, if their speed is greater than the defender's defense plus a rand(100), they
beat it out. This is independent of the batted ball's speed. So, basically, if the defender's defense is
higher than the batter-runner's speed, the batter-runner will never get an infield single, and if the
batter-runner's speed is higher, they will reach first with probability equal to the excess in percent.
You will notice that for an infielder, the only attribute that matters defensively is the defense skill at
that position: speed and arm are entirely irrelevant.
If the ground ball gets through, it is fielded by the appropriate outfielder (CF for SS and 2B, LF for 3B, RF
for 1B).
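The ground ball checks can be sketched in Python as below. The names are hypothetical, rand(n) is assumed to be a uniform real on [0, n], and the double play check is left out because the text does not give its formula.

```python
import random

# Relative weights for which infielder a grounder is hit to.
INFIELD_WEIGHTS = {"3B": 4, "SS": 6, "2B": 4, "1B": 2}
# Outfielder who fields a grounder that gets through.
BACKUP = {"SS": "CF", "2B": "CF", "3B": "LF", "1B": "RF"}

def pick_infielder(rng=random):
    positions = list(INFIELD_WEIGHTS)
    return rng.choices(positions, weights=list(INFIELD_WEIGHTS.values()))[0]

def fields_grounder(ball_speed, defense, rng=random):
    """The fielding roll: rand(110) + defense must exceed the ball speed."""
    return rng.uniform(0, 110) + defense > ball_speed

def p_fielded(ball_speed, defense):
    """Chance the fielding roll succeeds, clamped to [0, 1]."""
    return min(max((110 + defense - ball_speed) / 110, 0.0), 1.0)

def p_infield_single(runner_speed, defense):
    """Chance that runner speed beats defense + rand(100)."""
    return min(max((runner_speed - defense) / 100, 0.0), 1.0)
```

So an average defender (50) fields an average-speed grounder (75) about 77% of the time, and a runner with speed 60 beats out a grounder against that defender 10% of the time.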
FLY BALLS
---------
If the fly ball speed is less than 40, it is a pop-up to an infielder. These are always caught with no runner
advancement. Otherwise, it is hit to a random field, with distribution LF 4, CF 6, RF 3. If the fly ball speed
is greater than 130 to center field, or 110 to left or right field, it is a home run. If the fly ball speed is
between 40 and 45, it is a bloop single: runners advance one base with fewer than two outs, and two bases with
two outs.
Otherwise, the outfielder in question has a chance to catch the ball, a very good chance in fact. Their roll
is their defense ability, plus rand(their speed), plus rand(130). Note that this averages a very high number:
140, although with lots of variance involved. If this roll exceeds the ball speed, they make the play. If not,
the ball drops for at least a double, and they field it.
Note that for outfielders, defense, speed, and arm (to be discussed later) all matter.
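A Python sketch of the fly ball logic, under some assumptions: the names are made up, rand(n) is taken as a uniform real on [0, n], and the exact handling of the boundary values (exactly 40, exactly 45, and so on) is a guess since the text does not specify it.

```python
import random

OUTFIELD_WEIGHTS = {"LF": 4, "CF": 6, "RF": 3}

def pick_outfielder(rng=random):
    fields = list(OUTFIELD_WEIGHTS)
    return rng.choices(fields, weights=list(OUTFIELD_WEIGHTS.values()))[0]

def resolve_fly_ball(ball_speed, field, defense, fielder_speed, rng=random):
    """Return the result of a fly ball hit to the given outfield position."""
    if ball_speed < 40:
        return "pop-up out"          # always caught, no runner advancement
    hr_threshold = 130 if field == "CF" else 110
    if ball_speed > hr_threshold:
        return "home run"
    if ball_speed <= 45:             # ASSUMPTION: 40 and 45 both count as bloops
        return "bloop single"
    # Catch roll: defense + rand(speed) + rand(130), average 140 for a 50/50 fielder.
    roll = defense + rng.uniform(0, fielder_speed) + rng.uniform(0, 130)
    return "fly out" if roll > ball_speed else "extra-base hit"
```

Note that a ball hit at 120 is a home run to left or right but still has to get past the center fielder.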
LINE DRIVES
-----------
A line drive is directed at a fielder with distribution given by the concatenation of the infield and outfield
numbers above. The defender tries to make a play on the ball by rolling rand(def skill) against ball speed.
Note that this play is very hard to make: average ball speed is 75, while average rand(def) is
25. However, because of the variance, line drives do get caught from time to time.
If the line drive is not caught, an outfield defender (either the one the line drive is hit at, or the one
"behind" the infielder the line drive is hit at) will attempt to cut the ball off. This is based on a check of
their defense against the ball speed, with a bonus of rand(50) if the line drive was hit at an infielder.
Note that this is still relatively unlikely, especially conditional on the ball being hit hard enough to not
be caught. If the ball is cut off, it is a single; if it is not, it is at least a double.
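Here is one way the line drive sequence might look in Python. This is a sketch with hypothetical names. In particular, the text only says the cut-off is "a check of their defense against the ball speed", so comparing the raw defense rating (plus the bonus) to the ball speed is an assumption.

```python
import random

# Infield and outfield weights concatenated, as described in the text.
LINE_DRIVE_WEIGHTS = {"3B": 4, "SS": 6, "2B": 4, "1B": 2, "LF": 4, "CF": 6, "RF": 3}
INFIELD = {"3B", "SS", "2B", "1B"}

def pick_target(rng=random):
    spots = list(LINE_DRIVE_WEIGHTS)
    return rng.choices(spots, weights=list(LINE_DRIVE_WEIGHTS.values()))[0]

def resolve_line_drive(ball_speed, target, target_def, outfielder_def, rng=random):
    """target_def: defense of the fielder the ball is hit at.
    outfielder_def: defense of the outfielder who tries to cut the ball off
    (the same player when the target is an outfielder)."""
    # Catch attempt: rand(def skill) against ball speed.
    if rng.uniform(0, target_def) > ball_speed:
        return "caught"
    # Cut-off attempt; rand(50) bonus if the ball was hit at an infielder.
    bonus = rng.uniform(0, 50) if target in INFIELD else 0.0
    # ASSUMPTION: raw defense plus bonus is compared directly to ball speed.
    if outfielder_def + bonus > ball_speed:
        return "single"
    return "extra-base hit"
```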
RUNNER ADVANCEMENT (AND STOLEN BASES)
-------------------------------------
Essentially, whenever an outfielder has the ball, each runner can try to take an extra base. This check will
be (only) dependent on the runner's speed and the outfielder's arm, with various adjustments for the batted
ball type and location of the outfielder. This process is simulated as follows: the runner "simulates" what
would happen if he goes (i.e. his chance of being safe versus getting caught). If his chance of being safe
exceeds the break-even point as given by the simbase win probability chart (tuned on the most recent 2-3
seasons), he tries for the extra base. Once he tries for it, the outfielder has a chance to make the play, and
either the runner gets the base or is out advancing.
Stolen bases actually work almost exactly the same way: after every ball or strike (not foul or in-play), the
runner simulates an attempt to steal. He goes for it if it is profitable, in which case the catcher may throw
him out. Catcher throws, however, are based on catcher defense as well as arm. In this equation, catcher arm
is worth as much as catcher defense, but catcher arm has a higher variance, so based on spring reports, arm
is more important. However, there are also wild pitches: for each ball or strike, if a catcher fails 7
checks on their defense, it is a wild pitch and all runners move up a base.
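The go/no-go decision for extra bases and steals can be sketched as a break-even comparison. The formula below is the standard break-even calculation from three win probabilities, not something lifted from the engine, and the numbers in the usage note are made up rather than taken from the simbase chart.

```python
def break_even(wp_stay, wp_safe, wp_out):
    """Success rate at which attempting is exactly as good as staying put.

    wp_stay: win probability if the runner holds
    wp_safe: win probability if he goes and is safe
    wp_out:  win probability if he goes and is thrown out
    """
    return (wp_stay - wp_out) / (wp_safe - wp_out)

def should_attempt(p_safe, wp_stay, wp_safe, wp_out):
    """Go only when the simulated chance of being safe beats break-even."""
    return p_safe > break_even(wp_stay, wp_safe, wp_out)
```

For example, with hypothetical win probabilities of 0.60 (hold), 0.65 (safe) and 0.45 (out), the break-even point is 75%, so a runner who figures to be safe 80% of the time goes and one at 70% stays.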
FATIGUE
-------
All hitters and pitchers fatigue at rates dependent on their stamina. Each pitch that a pitcher throws
has an equal chance of resulting in a fatigue point: 5/pstam, so that the expected amount of fatigue
when one reaches one's pstam is 5 (the game is designed around expected pulls at 5 fatigue, more or
less, which corresponds to being about 1.00 RA worse). In addition, when a pitcher is pulled, he loses
half his in-game fatigue, but he gains additional fatigue equal on average to one-fifth of the number of
pitches he has thrown. So, a starter throwing their stamina of 80 will end the day with about 18.5
fatigue (5/2 + 80/5 = 2.5 + 16). For each plate appearance during the game, a hitter has a chance of gaining
a fatigue point equal to 1-(hstam/100), where hstam is their stamina (average 50).
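The expected end-of-day fatigue works out as below. This is a small Python sketch with made-up names that deals only in expected values, not the actual random rolls.

```python
def expected_pitcher_fatigue(pitches, pstam):
    """Expected fatigue a pitcher carries out of a game.

    In-game fatigue accrues at 5/pstam per pitch and is halved when he is
    pulled; he then picks up pitches/5 on average.
    """
    in_game = pitches * 5 / pstam
    return in_game / 2 + pitches / 5

def hitter_fatigue_chance(hstam):
    """Per-plate-appearance chance that a hitter gains a fatigue point."""
    return 1 - hstam / 100
```

expected_pitcher_fatigue(80, 80) gives 18.5, and an average hitter (hstam 50) picks up a fatigue point on half his plate appearances.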
After the game, all unused hitters lose 1 fatigue automatically. Then every hitter gets a 50/50 roll to cure
each remaining fatigue point.
All pitchers lose 2 fatigue automatically, and then have a 50/50 chance of losing each fatigue point. So
if our starter had 18 fatigue, he will automatically go down to 16, then have coin flips for each of the
16 points, ending up with an average of 8. This exponential decrease in fatigue means that literally
every pitching plan can work steady-state. As a thought experiment, suppose that starters always pitch
to their limit of 80 (in fact, this is not true, since the auto-manager's pull subroutine is partly
dependent on how much fatigue the starter has, so already-fatigued starters will get quicker hooks, but
bear with me).
If you have only 1 starter with a pstam of 80 pitching 80 pitches a game, he will stabilize at 16.5
fatigue (acquiring 18.5 during the start and losing an average of 18.5 of his 35). This is basically
untenable, I'd imagine. With 2 starters, they will make each start with an average of about 4 (4 -> 22.5
-> 10 -> 4). With 3 starters, they will make each start with an average of about 1 (1 -> 19.5 -> 9 ->
3.5 -> 0.75). I think in practice, with a strict 3-starter rotation, most starts will probably be made at 0
fatigue.
With 4 starters, they will almost always be untired for each start. 3 starters, possibly with a swingman
making spot starts when the law of small numbers kicks in, is probably correct.
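The steady-state numbers in this thought experiment can be reproduced by iterating the expected values. This Python sketch uses made-up names and works with averages only; since real fatigue is random and cannot go below zero, the true averages will differ a bit.

```python
def rest_day(fatigue):
    """Expected fatigue after one post-game recovery:
    lose 2 automatically, then each remaining point is cured with chance 1/2."""
    return max(fatigue - 2, 0.0) / 2

def steady_state(rotation_size, gain=18.5, cycles=200):
    """Expected fatigue going into a start once the rotation has settled.

    gain: expected fatigue added per start (18.5 for 80 pitches at pstam 80).
    A starter gets one recovery per game day, including the day he pitches.
    """
    fatigue = 0.0
    for _ in range(cycles):
        fatigue += gain
        for _ in range(rotation_size):
            fatigue = rest_day(fatigue)
    return fatigue
```

This gives 16.5 for a one-man rotation, about 4.2 for two starters, about 0.6 for three, and 0 for four, in line with the rounded chains in the text.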
At any rate, fatigue affects two things: probability of a strike (on a pitch), and batted ball speed. All
other attributes are unchanged. This means that non-strikeout pitchers probably don't care as much about
fatigue. This is largely a thought experiment, as the auto-manager (for the moment, anyway) makes all
pitching-change decisions without considering anything other than the pitcher's number of pitches, fatigue,
and performance to date. There's a page floating around on the simulated value of fatigue.
AGING
-----
See aging page.