A project that examines the relationship between MLB teams' payroll and their success on the field.

patkennedy11/MSDS692_PracticumProject--Play_Ball_off_of_Payroll

Play Ball off of Payroll

MSDS 692 - Practicum Project

Author: Pat Kennedy

Research Question

What effect does an MLB team’s payroll have on its success and output on the field?

  • Are there correlations with:
    • Winning percentage?
    • Team statistics?
      • Hitting
      • Pitching
    • Postseason?

About the Data

Two datasets:

Examining Relationships

(From Historical Dataset)

  • Key:
    • Light Tan = Strong Positive correlation
    • Dark Purple = Strong Negative correlation

PayrollRelationships

Findings: There are strong relationships between Pitching Stats (WHIP / ERA) & Postseason
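A heatmap like the one above is built from pairwise correlation coefficients. The following is a minimal sketch of that calculation, assuming Pearson correlation (the README does not say which coefficient was used). The `pearson` helper and all numbers are hypothetical illustrations, not values from the project's dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, NOT taken from the project's dataset.
payroll_millions = [250, 210, 180, 140, 95, 60]
team_era = [3.40, 3.65, 3.90, 4.10, 4.45, 4.80]
win_pct = [0.610, 0.580, 0.540, 0.500, 0.450, 0.410]

r_era = pearson(payroll_millions, team_era)
r_win = pearson(payroll_millions, win_pct)
print(f"payroll vs ERA:  {r_era:+.2f}")
print(f"payroll vs Win%: {r_win:+.2f}")
```

In this toy data, a coefficient near -1 (payroll vs. ERA) would map to the dark purple end of the key and a coefficient near +1 (payroll vs. Win%) to the light tan end.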

More Relationships

(From Historical Dataset) Examining the relationship between Payroll Rank and Winning Percentage

  • Key:
    • Red = Team made Postseason
    • Blue = Team did not make Postseason

Rank:Win%

Findings:

  • Teams that ranked in the top 5 in payroll made the playoffs 14 out of 20 times (70%).
  • There have been only 3 instances of a top-5 payroll team finishing with a winning percentage below .500.
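Counts like these can be reproduced with a simple filter over team-season records. Below is a minimal sketch; the function `summarize_top_payroll`, the field names, and the rows are hypothetical and do not come from the project's dataset.

```python
def summarize_top_payroll(team_seasons, top_n=5):
    """Summarize postseason results for teams ranked in the top_n by payroll."""
    top = [row for row in team_seasons if row["payroll_rank"] <= top_n]
    made = sum(1 for row in top if row["made_postseason"])
    losing = sum(1 for row in top if row["win_pct"] < 0.500)
    rate = made / len(top) if top else 0.0
    return {
        "team_seasons": len(top),
        "postseason": made,
        "postseason_rate": rate,
        "below_500": losing,
    }


# Hypothetical rows for a single season, for illustration only.
rows = [
    {"payroll_rank": 1, "win_pct": 0.605, "made_postseason": True},
    {"payroll_rank": 2, "win_pct": 0.574, "made_postseason": True},
    {"payroll_rank": 3, "win_pct": 0.481, "made_postseason": False},
    {"payroll_rank": 4, "win_pct": 0.556, "made_postseason": True},
    {"payroll_rank": 5, "win_pct": 0.525, "made_postseason": False},
    {"payroll_rank": 6, "win_pct": 0.543, "made_postseason": True},
    {"payroll_rank": 12, "win_pct": 0.463, "made_postseason": False},
]

summary = summarize_top_payroll(rows)
print(summary)
```

Applied to the full historical dataset, the same filter would yield the 14-of-20 figure reported above, since 14 / 20 = 0.70.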

Predictive Model

Since the goal of the project is to predict multiple outputs for the future dataset based on trends from the historical dataset, I decided to use a Random Forest Regressor wrapped in a Multi-Output Regressor.

From Historic Dataset:

  • Features: Financial Data columns
  • Targets: Performance Data columns

I scaled the historical dataset and applied the same scaling to the future dataset.
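The setup described above can be sketched with scikit-learn as follows. This is not the project's actual code: the data is synthetic, the column meanings are invented, and the use of `StandardScaler` on the features only is an assumption (the README does not name the scaler or say whether targets were scaled).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 2 financial features, 2 performance targets.
X_hist = rng.uniform(50, 300, size=(120, 2))
y_hist = np.column_stack([
    0.40 + 0.0005 * X_hist[:, 0] + rng.normal(0, 0.02, 120),  # win-pct-like target
    5.0 - 0.004 * X_hist[:, 0] + rng.normal(0, 0.2, 120),     # ERA-like target
])
X_future = rng.uniform(50, 300, size=(30, 2))

# Fit the scaler on historical features only, then reuse it for the future data.
scaler = StandardScaler().fit(X_hist)
X_hist_scaled = scaler.transform(X_hist)
X_future_scaled = scaler.transform(X_future)

# The wrapper fits one independent random forest per target column.
model = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=50, random_state=0)
)
model.fit(X_hist_scaled, y_hist)

predictions = model.predict(X_future_scaled)
print(predictions.shape)  # one row per future team, one column per target
```

Fitting the scaler once on the historical data and only calling `transform` on the future data keeps both datasets on the same scale, which is what "the same scaling" requires.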

Findings

In the MLB, 12 Teams make the Postseason in a given season (6 from each League).

For each League, the model I built selected the 6 teams (out of 15) with the highest predicted chance of making the postseason. They are listed here with their chance of making the postseason and their projected postseason position:

American League

  1. Kansas City Royals (94%) - Central Division Winner
  2. Seattle Mariners (93%) - West Division Winner
  3. Chicago White Sox (89%) - Wild Card Team #1
  4. Houston Astros (75%) - East Division Winner
  5. Toronto Blue Jays (71%) - Wild Card Team #2
  6. Texas Rangers (66%) - Wild Card Team #3

National League

  1. Los Angeles Dodgers (84%) - West Division Winner
  2. Philadelphia Phillies (77%) - Central Division Winner
  3. Atlanta Braves (75%) - East Division Winner
  4. St. Louis Cardinals (68%) - Wild Card Team #1
  5. New York Mets (58%) - Wild Card Team #2
  6. Pittsburgh Pirates (54%) - Wild Card Team #3
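One way to turn per-team probabilities into a labeled six-team field is sketched below. It assumes the MLB format of three division winners plus three wild cards per league, and assumes the division winner is simply the team with the highest probability in its division. The function `pick_postseason_field` and the team names are hypothetical; the README does not describe how the positions above were assigned.

```python
def pick_postseason_field(teams, wild_cards=3):
    """Label a league's postseason field from (name, division, probability) tuples.

    The highest-probability team in each division is labeled the division
    winner; the best remaining teams fill the wild card spots.
    """
    ranked = sorted(teams, key=lambda t: t[2], reverse=True)

    # The first team seen per division is that division's highest-probability team.
    winners = {}
    for name, division, _prob in ranked:
        winners.setdefault(division, name)
    winner_names = set(winners.values())

    # The best non-winners take the wild card spots, numbered in order.
    rest = [t for t in ranked if t[0] not in winner_names][:wild_cards]

    labels = {name: f"{division} Division Winner" for division, name in winners.items()}
    for i, (name, _division, _prob) in enumerate(rest, start=1):
        labels[name] = f"Wild Card Team #{i}"

    # Return the field ordered by probability, highest first.
    return [(name, prob, labels[name]) for name, _division, prob in ranked if name in labels]


# Hypothetical league, for illustration only.
league = [
    ("Team A", "East", 0.90),
    ("Team B", "East", 0.85),
    ("Team C", "Central", 0.80),
    ("Team D", "West", 0.60),
    ("Team E", "West", 0.55),
    ("Team F", "Central", 0.50),
    ("Team G", "East", 0.30),
]

field = pick_postseason_field(league)
for rank, (name, prob, label) in enumerate(field, start=1):
    print(f"{rank}. {name} ({prob:.0%}) - {label}")
```

Under this rule, a team can rank second overall and still be a wild card if a division rival ranks above it, as Team B does here.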

Predicted 2024 Postseason Bracket

PostseasonBracket

All 2024 Predictions:

2024_Predictions.csv

Conclusion

In this project, I examined the relationship between MLB teams' payroll and their output and success on the field. I also made predictions for team statistics based on preseason payroll data. Throughout the project, I found strong correlations between pitching stats and payroll rank. In addition, there were interesting correlations between payroll rank, winning percentage, and whether or not a team made the postseason that year. To make predictions for the 2024 season, I used a Random Forest Regressor wrapped in a Multi-Output Regressor. I am excited to measure how my preseason predictions match up at the end of the 2024 regular season in October!
