MLB Pitch Prediction Project

Introduction

This repository contains code and analysis for predicting a pitcher's next pitch using pitch-by-pitch data from Major League Baseball (MLB). The project aims to develop a predictive model and provide insights into a pitcher's behavior.

Project Structure

The project is structured as follows:

  • sandbox_and_EDA notebook: Initial exploratory data analysis (EDA) to understand the distribution of the dataset's features and become familiar with the data.

  • pitch_data.py: Module with a function that loads the dataset for use in the notebooks. The module includes docstrings and exception handling to demonstrate data engineering skills.

  • dummy_results.ipynb: Development of a dummy classifier to establish baseline predictions and set performance goals. This notebook contains both the code and the results for the baseline.

  • models_and_features.ipynb: Implementation of the predictive model and assessment of evaluation metrics. The notebook presents a model with an accuracy of 0.48 and outlines next steps for improvement.
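
To illustrate what a dummy baseline measures, here is a minimal sketch of a most-frequent-class baseline written with the standard library only. It is not the code from dummy_results.ipynb: the function names and the pitch-type labels below are made up for illustration, and the notebook may use a different strategy or library.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return the most frequent label seen in the training labels."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical pitch-type labels, for illustration only (not real MLB data).
train = ["FF", "FF", "SL", "FF", "CH", "SL"]
test = ["FF", "SL", "FF", "CH"]

# The baseline always predicts the most common training label.
majority = fit_majority_baseline(train)
predictions = [majority] * len(test)
baseline_accuracy = accuracy(test, predictions)

print(f"Baseline always predicts {majority!r}; accuracy = {baseline_accuracy:.2f}")
```

A trained model is only useful if its accuracy clearly exceeds what this kind of always-guess-the-most-common-pitch baseline achieves on the same data.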

Usage

To replicate the analysis or explore the project, follow these steps:

  1. Open the Jupyter notebooks in the order listed above.

  2. Execute the code cells to reproduce the analysis and view the results.
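
The notebooks rely on pitch_data.py to load the dataset. As a rough sketch of what a loader with a docstring and exception handling can look like, the example below reads a CSV file with the standard library. The function name `load_pitch_data`, the CSV format, and the error messages are assumptions for illustration; the actual module may differ.

```python
import csv
from pathlib import Path


def load_pitch_data(csv_path):
    """Load pitch-by-pitch rows from a CSV file.

    Hypothetical helper for illustration; not the function in pitch_data.py.

    Args:
        csv_path: Path to a CSV file with a header row.

    Returns:
        A list of dictionaries, one per pitch, keyed by column name.

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the file contains no data rows.
    """
    path = Path(csv_path)
    try:
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
    except FileNotFoundError as exc:
        # Re-raise with a message that names the missing file.
        raise FileNotFoundError(f"Pitch data file not found: {path}") from exc
    if not rows:
        raise ValueError(f"No pitch rows found in {path}")
    return rows
```

In a notebook, a function like this would be imported once at the top and called with the path to the local copy of the dataset.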

Next Steps

For further development and enhancement of the project, see the next steps outlined in the models_and_features.ipynb notebook.

Acknowledgments

Thank you for your time in reviewing this project.