This purports to be a detailed explanation of the simbasev3 engine. It may not be 100% up to date, as minor changes have been made to the engine since its inception.

PLAYER CARDS
------------

Players are rated in a lot of things.

The following things have 100 averages:
-- ball, strike, foul, in-play
-- groundball, flyball, line-drive
-- hard (how hard the ball is hit) for hitters

The following things have 50 averages:
-- speed, arm, defense at each position
-- hard (how hard the ball is hit) for pitchers
-- hitter stamina

Pitcher stamina, which is approximately the number of pitches a pitcher can throw before needing to exit, is based on role. The process by which pitcher stamina evolves depends on age, but basically it aims for 90 for starters, 45 for middle relievers, and 15 for relievers.

Literally all attributes in the game are designed to peak at age 25 (with the exception of pitcher stamina, which is a little wonkier and harder to assess). The attributes have different raw variance, but are scaled to contribute comparable amounts to the quality of a player -- however, there are synergies (speed is more important for a groundball hitter, to name one).

BATTER VERSUS PITCHER CONFLICT
------------------------------

Each batter and pitcher has a rating in each of four pitch outcomes: ball, strike, foul, and in-play. The probability of an outcome is proportional to the product of the batter and pitcher ratings for that outcome, multiplied by a normalization constant: 1.7 for balls, 1.2 for in-play, and 1 for strikes and fouls (these were picked empirically to yield walk and strikeout rates comparable to MLB in simulation). Pitches then proceed until the at-bat is resolved (four balls, three strikes, or in-play).
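As a rough illustration of the pitch model just described, the Python sketch below turns two cards into per-pitch probabilities and plays out one at-bat. The function names and dictionary keys are hypothetical, not taken from the engine. The handling of fouls (a foul adds a strike only with fewer than two strikes, as in real baseball) is an assumption, since the text does not say how the engine counts them.

```python
import random

# Normalization constants quoted in the text.
NORM = {"ball": 1.7, "strike": 1.0, "foul": 1.0, "in_play": 1.2}


def pitch_weights(hitter, pitcher):
    """Unnormalized odds per outcome: hitter * pitcher / 100 * normalization."""
    return {k: hitter[k] * pitcher[k] / 100 * NORM[k] for k in NORM}


def pitch_probabilities(hitter, pitcher):
    """Scale the weights so they sum to 1."""
    weights = pitch_weights(hitter, pitcher)
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def simulate_at_bat(hitter, pitcher, rng=random):
    """Throw pitches until a walk, a strikeout, or a ball in play."""
    probs = pitch_probabilities(hitter, pitcher)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    balls = strikes = 0
    while True:
        pitch = rng.choices(outcomes, weights=weights)[0]
        if pitch == "ball":
            balls += 1
            if balls == 4:
                return "walk"
        elif pitch == "strike":
            strikes += 1
            if strikes == 3:
                return "strikeout"
        elif pitch == "foul":
            # ASSUMPTION: standard baseball rule, not confirmed by the text.
            if strikes < 2:
                strikes += 1
        else:
            return "in_play"
```

For a hitter rated 90/110/100/80 against a pitcher rated 100/130/120/90 (ball/strike/foul/in-play), `pitch_probabilities` gives a ball probability of 153/502.4, or about 0.30.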
So, if we have the following cards:

           BA   ST   FO   IP
HITTER     90  110  100   80
PITCHER   100  130  120   90

we multiply the ratings to get (dividing by 100 throughout):

OUTCOME    90  143  120   72

We can then multiply by the normalization to get:

OUTCOME    90  143  120   72
NORM      1.7    1    1  1.2
PROB      153  143  120   86

and the outcome of every pitch is a random variable with those odds, i.e. the probability of a ball for this batter-pitcher encounter is 153/(153+143+120+86), and so on.

BATTED BALL TYPE
----------------

When a ball is in-play, the first thing that happens is that the batted ball type is determined. Again, we look at the hitter and pitcher cards and take products. The normalization here comes from the actual MLB ratios of 45 GB : 36 FB : 19 LD. So, with the table condensed, we might have:

           GB   FB   LD
BATTER    110  140  120
PITCHER   130  100   90
OUTCOME   143  140  108
NORM       45   36   19
PROB       64   50   21

The batted ball types differ a lot. In MLB, ground balls have a batting average of .270 or so, fly balls .210 or so, and line drives about .770. I tried to sim this, but there's a good chance I missed by a bunch. However, it's definitely true that line drives have a very high BA, and that fly balls have the lowest BA but can still be desirable because only fly balls can turn into home runs (and non-HR fly ball hits are usually for extra bases, while grounders are usually singles with the occasional double).

HARDNESS OF BATTED BALL
-----------------------

Once the batted ball type is determined, the next thing decided is the speed with which the ball is hit. Both pitcher and batter have a HARD rating. Pitcher ratings average 50, while batter ratings average 100. Each combatant contributes a random variable between 0 and their rating to the ball speed, so if for instance the pitcher rating is 55 and the batter rating is 97, a random real number between 0 and 55 and one between 0 and 97 are generated and added to get the batted ball speed.
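The two steps above, choosing the batted ball type and rolling the ball speed, can be sketched in Python as follows. The names are hypothetical, and drawing each contribution from a continuous uniform distribution is an assumption based on the "random real number" wording.

```python
import random

# MLB batted-ball mix quoted in the text: 45 GB : 36 FB : 19 LD.
TYPE_NORM = {"GB": 45, "FB": 36, "LD": 19}


def batted_ball_weights(batter, pitcher):
    """Odds per type: batter * pitcher / 100, scaled by the MLB share in percent."""
    return {k: batter[k] * pitcher[k] / 100 * TYPE_NORM[k] / 100 for k in TYPE_NORM}


def pick_batted_ball_type(batter, pitcher, rng=random):
    """Draw GB, FB, or LD in proportion to the weights."""
    weights = batted_ball_weights(batter, pitcher)
    types = list(weights)
    return rng.choices(types, weights=[weights[t] for t in types])[0]


def ball_speed(batter_hard, pitcher_hard, rng=random):
    """Sum of two uniform rolls, one capped by each HARD rating."""
    # ASSUMPTION: continuous uniform rolls on [0, rating].
    return rng.uniform(0, batter_hard) + rng.uniform(0, pitcher_hard)
```

With the example cards, `batted_ball_weights` rounds to the 64 / 50 / 21 row of the table, and `ball_speed(100, 50)` averages 75 over many draws.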
Batter ratings have twice the standard deviation of pitcher ratings, so the batter is responsible for the lion's share of the variance here. Average speed is 75 (25 from the pitcher's roll plus 50 from the batter's).

In general, it is always better to hit the ball harder. This will be discussed for each batted ball type below.

GROUND BALLS
------------

A defensive player is chosen at random with distribution: 3B 4, SS 6, 2B 4, 1B 2. That player then tries to make a roll to exceed the ball speed: this is rand(110) plus their defense. Since defense averages 50, the average roll will be 105. If the roll exceeds the ball speed, the defender fields the ball and has a chance to make a play; if not, the ball goes into the outfield and is a hit.

If the player does field the ball and there is a runner on first, a check is made for a double play on the batter. Higher ball speed and lower batter-runner speed are more likely to produce DPs. The DP check (contingent on fielding the ball) depends only on those two variables: the defense of the fielder or pivot man, any arm ratings, and the speed of the runner on first are all irrelevant. If the DP check fails but the ball is hit fast enough (over 60), a force on second is produced. Runners on second and third always advance.

The batter-runner can also beat out an infield single. The check for this involves their speed and the defender's defense: specifically, if their speed is greater than the defender's defense plus rand(100), they beat it out. This is independent of the batted ball's speed. So, basically, if the defender's defense is higher than the batter-runner's speed, the batter-runner will never get an infield single, and if the batter-runner's speed is higher, they will reach first with probability equal to the excess in percent.

You will notice that for an infielder, the only attribute that matters defensively is the defense skill at that position: speed and arm are entirely irrelevant.
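The ground-ball checks above can be sketched in Python as separate helpers. The names are hypothetical, rand(n) is assumed to be a continuous uniform draw on [0, n], and the double-play check is left out because the text gives no formula for it.

```python
import random

# Relative chances that a ground ball is hit at each infielder.
INFIELD_WEIGHTS = {"3B": 4, "SS": 6, "2B": 4, "1B": 2}


def pick_infielder(rng=random):
    """Choose the infielder the ground ball is hit at."""
    positions = list(INFIELD_WEIGHTS)
    return rng.choices(positions, weights=[INFIELD_WEIGHTS[p] for p in positions])[0]


def fielded(ball_speed, defense, rng=random):
    """True if rand(110) + defense exceeds the ball speed."""
    # ASSUMPTION: rand(110) is a continuous uniform roll on [0, 110].
    return rng.uniform(0, 110) + defense > ball_speed


def infield_single(runner_speed, defense, rng=random):
    """True if the batter-runner's speed beats defense + rand(100)."""
    return runner_speed > defense + rng.uniform(0, 100)


def infield_single_chance(runner_speed, defense):
    """Closed form of the check above: the speed excess in percent, clamped to [0, 1]."""
    return min(max((runner_speed - defense) / 100, 0.0), 1.0)
```

For example, a runner with speed 60 facing a fielder with defense 50 has `infield_single_chance(60, 50)` of 0.1, matching the "excess in percent" rule.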
If the ground ball gets through, it is fielded by the appropriate outfielder (CF for SS and 2B, LF for 3B, RF for 1B).

FLY BALLS
---------

If the fly ball speed is less than 40, it is a pop-up to an infielder. These are always caught with no runner advancement. Otherwise, it is hit to a random field, with distribution LF 4, CF 6, RF 3. If the fly ball speed is greater than 130 to center field, or 110 to left or right field, it is a home run. If the fly ball speed is between 40 and 45, it is a bloop single: runners advance one base with fewer than two outs, and two bases with two outs.

Otherwise, the outfielder in question has a chance to catch the ball, a very good chance in fact. Their roll is their defense ability, plus rand(their speed), plus rand(130). Note that this averages a very high number: 140 (50 + 25 + 65 for an average outfielder), although with lots of variance involved. If this roll exceeds the ball speed, they make the play. If not, the ball drops for at least a double, and they field it.

Note that for outfielders, defense, speed, and arm (to be discussed later) all matter.

LINE DRIVES
-----------

A line drive is directed at a fielder with distribution given by the concatenation of the infield and outfield numbers above. The defender tries to make a play on the ball by rolling rand(def skill) against ball speed. Note that it is very hard to actually make this play: average ball speed is 75, while average rand(def) is 25. However, because of the variance, line drives do get caught from time to time.

If the line drive is not caught, an outfield defender (either the one the line drive is hit at, or the one "behind" the infielder the line drive is hit at) will attempt to cut the ball off. This is based on a check of their defense against the ball speed, with a bonus of rand(50) if the line drive was hit at the infielder. Note that a successful cutoff is still relatively unlikely, especially conditional on the ball being hit hard enough to not be caught.
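The fly-ball thresholds and the line-drive catch roll described above can be sketched in Python as follows. The names and result labels are hypothetical, rand(n) is assumed to be a continuous uniform draw on [0, n], and the cut-off check is omitted because the text does not spell out the exact roll.

```python
import random

# Relative chances for the field a fly ball is hit to, and home-run speed cutoffs.
OUTFIELD_WEIGHTS = {"LF": 4, "CF": 6, "RF": 3}
HR_THRESHOLD = {"LF": 110, "CF": 130, "RF": 110}


def fly_ball_result(speed, field, defense, fielder_speed, rng=random):
    """Resolve a fly ball hit at `speed` toward the outfielder playing `field`."""
    if speed < 40:
        return "popup_out"            # caught by an infielder, no advancement
    if speed > HR_THRESHOLD[field]:
        return "home_run"
    if speed <= 45:
        return "bloop_single"
    # ASSUMPTION: both rand() terms are continuous uniform rolls.
    roll = defense + rng.uniform(0, fielder_speed) + rng.uniform(0, 130)
    return "caught" if roll > speed else "extra_base_hit"


def line_drive_caught(speed, defense, rng=random):
    """True if rand(defense) exceeds the ball speed."""
    return rng.uniform(0, defense) > speed


def line_drive_catch_rate(defense=50, batter_hard=100, pitcher_hard=50,
                          trials=100_000, seed=0):
    """Monte Carlo estimate of how often a line drive is caught."""
    rng = random.Random(seed)
    caught = 0
    for _ in range(trials):
        speed = rng.uniform(0, batter_hard) + rng.uniform(0, pitcher_hard)
        caught += line_drive_caught(speed, defense, rng)
    return caught / trials
```

Under these assumptions, with average ratings all around, `line_drive_catch_rate()` comes out at roughly one catch in twelve line drives.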
If the ball is cut off, it is a single; if it is not, it is at least a double.

RUNNER ADVANCEMENT (AND STOLEN BASES)
-------------------------------------

Essentially, whenever an outfielder has the ball, each runner can try to take an extra base. This check depends (only) on the runner's speed and the outfielder's arm, with various adjustments for the batted ball type and the location of the outfielder.

This process is simulated as follows: the runner "simulates" what would happen if he goes (i.e. estimates his chance of getting caught). If the attempt beats the break-even point given by the simbase win probability chart (tuned on the most recent 2-3 seasons), he tries for the extra base. Once he tries for it, the outfielder has a chance to make the play, and the runner either gets the base or is out advancing.

Stolen bases actually work almost exactly the same way: after every ball or strike (not foul or in-play), the runner simulates an attempt to steal. He goes for it if it is profitable, in which case the catcher may throw him out. Catcher throws, however, are based on catcher defense as well as arm. In this equation, catcher arm is worth as much as catcher defense, but catcher arm has a higher variance, so based on spring reports, arm is more important.

However, there are also wild pitches: for each ball or strike, if a catcher fails 7 checks on their defense, it is a wild pitch and all runners move up a base.

FATIGUE
-------

All hitters and pitchers fatigue at rates dependent on their stamina.

Each pitch that a pitcher throws has an equal chance of resulting in a fatigue point: 5/pstam, so that the expected amount of fatigue when a pitcher reaches his pstam is 5 (the game is designed around expected pulls at 5 fatigue, more or less, which corresponds to being about 1.00 RA worse). In addition, when a pitcher is pulled, he loses half his in-game fatigue, but he gets additional fatigue equal on average to one-fifth of the number of pitches he has thrown.
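The pitcher fatigue arithmetic above can be written out as a short Python sketch. The function names are hypothetical. Only the expected value of the post-game bump is modeled, since the text gives its average (one-fifth of pitches thrown) but not its distribution.

```python
import random


def simulate_in_game_fatigue(pitches, pstam, rng=random):
    """Count fatigue points when each pitch adds one with chance 5/pstam."""
    return sum(rng.random() < 5 / pstam for _ in range(pitches))


def expected_in_game_fatigue(pitches, pstam):
    """Expected in-game fatigue: pitches * 5 / pstam."""
    return pitches * 5 / pstam


def expected_end_of_day_fatigue(pitches, pstam):
    """Keep half of the in-game fatigue on exit, then add pitches / 5 on average."""
    return expected_in_game_fatigue(pitches, pstam) / 2 + pitches / 5
```

For a pitcher with pstam 80 who throws 80 pitches, this gives 5 / 2 + 80 / 5 = 18.5.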
So, a starter throwing their stamina of 80 will end the day with about 18.5 fatigue.

For each plate appearance during the game, hitters immediately have a chance of getting a fatigue point equal to 1-(hstam/100), where hstam is their stamina (average 50).

After the game, all unused hitters lose 1 fatigue automatically. Then every hitter gets a 50/50 roll to cure each remaining fatigue point. All pitchers lose 2 fatigue automatically, and then have a 50/50 chance of losing each remaining fatigue point. So if our starter had 18 fatigue, he will automatically go down to 16, then have coin flips for each of the 16 points, ending up with an average of 8.

This exponential decrease in fatigue means that literally every pitching plan can work in steady state. As a thought experiment, suppose that starters always pitch to their limit of 80 (in fact, this is not true, since the auto-manager's pull subroutine is partly dependent on how much fatigue the starter has, so already-fatigued starters will get quicker hooks, but bear with me). If you have only 1 starter with a pstam of 80 pitching 80 pitches a game, he will stabilize at 16.5 fatigue (acquiring 18.5 during the start and losing an average of 18.5 of his 35). This is basically untenable, I'd imagine.

With 2 starters, they will make each start with an average of about 4 (4 -> 22.5 -> 10 -> 4).

With 3 starters, they will make each start with an average of about 1 (1 -> 19.5 -> 9 -> 3.5 -> 0.75). I think in practice, with a strict 3-starter rotation, most starts will probably be at 0.

With 4 starters, they will almost always be untired for each start. 3 starters, possibly with a swingman making spot starts when the law of small numbers kicks in, is probably correct.

At any rate, fatigue affects two things: probability of a strike (on a pitch), and batted ball speed. All other attributes are unchanged. This means that non-strikeout pitchers probably don't care as much about fatigue.
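The rotation arithmetic above can be checked with a small Python sketch that iterates expected fatigue until it settles. The names are hypothetical, and working with averages instead of integer coin flips is a simplifying assumption, so the numbers are approximations of what the engine would produce.

```python
def recover_one_day(fatigue):
    """Expected fatigue after one post-game recovery: lose 2, then halve the rest."""
    # ASSUMPTION: expected values only; real fatigue points are whole numbers.
    return max(fatigue - 2.0, 0.0) / 2.0


def steady_state_pre_start(n_starters, gain=18.5, cycles=200):
    """Expected fatigue carried into each start in a strict n-man rotation.

    `gain` is the expected fatigue added by one start (18.5 for 80 pitches
    at pstam 80). A starter gets one recovery per game played, so n
    recoveries between his own starts.
    """
    fatigue = 0.0
    for _ in range(cycles):
        fatigue += gain
        for _ in range(n_starters):
            fatigue = recover_one_day(fatigue)
    return fatigue
```

Under these assumptions the loop settles at 16.5 for one starter, a little over 4 for two, under 1 for three, and 0 for four, in line with the figures quoted above.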
All of this is largely a thought experiment, as the auto-manager (for the moment, anyway) makes all pitching-change decisions without considering anything other than the pitcher's number of pitches, fatigue, and performance to date. There's a page floating around on the simulated value of fatigue.

AGING
-----

See aging page.