statcast_search has broken due to batspeed and swing length being added #337

Open
mlascaleia opened this issue May 15, 2024 · 16 comments

@mlascaleia

The tibbles exported from Statcast used to have 92 columns; now they have 94!

I foresee this being a recurring error as more and more stats are added. Here is my suggested fix, which prints a message instead of breaking the package:

# (somewhere within the statcast_search function before the payload is searched for)
colos <- c("pitch_type", "game_date", 
            "release_speed", "release_pos_x", "release_pos_z", 
            "player_name", "batter", "pitcher", 
            "events", "description", "spin_dir", 
            "spin_rate_deprecated", "break_angle_deprecated", 
            "break_length_deprecated", "zone", "des", 
            "game_type", "stand", "p_throws", 
            "home_team", "away_team", "type", 
            "hit_location", "bb_type", "balls", 
            "strikes", "game_year", "pfx_x", 
            "pfx_z", "plate_x", "plate_z", 
            "on_3b", "on_2b", "on_1b", "outs_when_up", 
            "inning", "inning_topbot", "hc_x", 
            "hc_y", "tfs_deprecated", "tfs_zulu_deprecated", 
            "fielder_2", "umpire", "sv_id", 
            "vx0", "vy0", "vz0", "ax", 
            "ay", "az", "sz_top", "sz_bot", 
            "hit_distance_sc", "launch_speed", "launch_angle", 
            "effective_speed", "release_spin_rate", 
            "release_extension", "game_pk", "pitcher_1", 
            "fielder_2_1", "fielder_3", "fielder_4", 
            "fielder_5", "fielder_6", "fielder_7", 
            "fielder_8", "fielder_9", "release_pos_y", 
            "estimated_ba_using_speedangle", "estimated_woba_using_speedangle", 
            "woba_value", "woba_denom", "babip_value", 
            "iso_value", "launch_speed_angle", "at_bat_number", 
            "pitch_number", "pitch_name", "home_score", 
            "away_score", "bat_score", "fld_score", 
            "post_away_score", "post_home_score", 
            "post_bat_score", "post_fld_score", "if_fielding_alignment", 
            "of_fielding_alignment", "spin_axis", 
            "delta_home_win_exp", "delta_run_exp")
# once the payload has been acquired:
colNumber <- ncol(payload)
if (colNumber > length(colos)) {
  # give each column beyond the known ones a placeholder name
  newCols <- paste0("newStat", seq_len(colNumber - length(colos)))
  colos <- c(colos, newCols)
  message("New stats detected! baseballr will be updated soon to properly identify these stats")
}
# when the payload columns need to be named:
names(payload) <- colos

This way the function will still work when new stats are added, and their names can be filled in whenever you update the package.
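
As a rough illustration of the idea above, here is a minimal standalone sketch. The helper `name_columns()` and the toy data frame are hypothetical and are not part of baseballr; the sketch only shows the padding logic on a small example, and it stops with an error if there are fewer columns than known names.

```r
# Hypothetical helper, not part of baseballr: apply the known column names
# and give placeholder names to any extra columns.
name_columns <- function(df, known) {
  extra <- ncol(df) - length(known)
  if (extra > 0) {
    # More columns than known names: pad with newStat1, newStat2, ...
    known <- c(known, paste0("newStat", seq_len(extra)))
    message(extra, " unrecognised column(s) given placeholder names")
  } else if (extra < 0) {
    # Fewer columns than expected: the names cannot be aligned safely.
    stop("Expected at least ", length(known), " columns, got ", ncol(df))
  }
  names(df) <- known
  df
}

# Toy data standing in for a payload with two unexpected columns.
raw <- as.data.frame(matrix(0, nrow = 1, ncol = 5))
named <- name_columns(raw, c("pitch_type", "game_date", "release_speed"))
names(named)
```

Placeholder names keep the download usable, but they only line up correctly if the new columns are appended at the end of the export, which is an assumption here.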

@mattm14

mattm14 commented May 16, 2024

It also fails on this function: scrape_statcast_savant_pitcher. Is there a workaround that can be applied, similar to the above?

@camdenk
Collaborator

camdenk commented May 16, 2024

Download the dev version with devtools::install_github("BillPetti/baseballr") and this should be fixed!

@mlascaleia
Author

Thanks for updating! I do want to note that, with the fix that was implemented, the code will still break in the same way if the Statcast tibbles ever have anything other than exactly 94 columns.

@camdenk
Copy link
Collaborator

camdenk commented May 16, 2024

Yep! We're going to add a more permanent fix, but wanted to get the hotfix out asap once the switch was made.

Thanks!

@mattm14

mattm14 commented May 17, 2024

thanks for the update!

@aglobos49

I reinstalled and am still getting the same column number error. I even did force = TRUE to make sure I got the newest version. Anything else I can try?

@kylemccarthy2

> I reinstalled and am still getting the same column number error. I even did force = TRUE to make sure I got the newest version. Anything else I can try?

I am having the same issue. Would appreciate any possible help!

@camdenk
Collaborator

camdenk commented May 28, 2024

Did you install with install.packages("baseballr") or devtools::install_github("BillPetti/baseballr")?

@kylemccarthy2

I used devtools::install_github("BillPetti/baseballr"). Then, to load the library, it is library(baseballr), correct?

@camdenk
Collaborator

camdenk commented May 28, 2024

Yep! Did you restart your R session between installing and then using the package?
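
For anyone unsure which build is actually installed after a restart, one possible check is to read the package's DESCRIPTION metadata. This is only a sketch: `describe_install()` is a hypothetical helper, and it assumes that GitHub installs made through devtools/remotes record `RemoteUsername`, `RemoteRepo`, and `RemoteSha` fields. It is demonstrated on the base package `stats` so that it runs anywhere.

```r
# Hypothetical helper: summarise where an installed package came from.
describe_install <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  # Assumption: these fields are present only for remote (e.g. GitHub) installs.
  from_remote <- !is.null(desc$RemoteUsername) && !is.null(desc$RemoteRepo)
  list(
    version = desc$Version,
    source  = if (from_remote) {
      paste0(desc$RemoteUsername, "/", desc$RemoteRepo)
    } else {
      "no Remote fields recorded"
    },
    sha     = desc$RemoteSha  # NULL when no commit hash was recorded
  )
}

info <- describe_install("stats")
# In a fresh R session, the equivalent call for this thread would be:
# describe_install("baseballr")
```

If the reported source is not the repository that was just installed, the session is probably still using an older copy of the package.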

@kylemccarthy2

I believe I got it, thank you so much!

@SFall34

SFall34 commented Jun 22, 2024

Thanks so much for sharing this. Can you please explain how I would work this fix into the following line of code:
Season_Data <- scrape_statcast_savant_batter_all(start_date = "2023-09-27", end_date = "2023-10-01")

When I run the next line,
colNumber <- ncol(payload)
I get the following error:
Error in ncol(payload) : object 'payload' not found

@jmorgs11

It's happening again:
Error in setnames(x, value) :
Can't assign 92 names to a 112 column data.table

The dev installation method hasn't helped so far.

@mlascaleia
Author

> Thanks so much for sharing this. Can you please explain how I would work this fix into the following line of code: Season_Data <- scrape_statcast_savant_batter_all(start_date = "2023-09-27", end_date = "2023-10-01")
>
> When I run the next line, colNumber <- ncol(payload) I get the following error: Error in ncol(payload) : object 'payload' not found

This won't work because payload only exists within the context of the function. You have to edit the function itself with the code I wrote above, then run the function with the updated code.
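
To make the scoping point concrete, here is a small toy sketch. `toy_search()` is a made-up stand-in and not the real baseballr code; it only shows that a variable created inside a function is not visible from the console afterwards.

```r
# Toy stand-in for a scraper function; NOT the real baseballr implementation.
toy_search <- function() {
  # This variable exists only while the function is running.
  payload <- data.frame(a = 1:2, b = 3:4)
  ncol(payload)
}

n_cols <- toy_search()  # works: payload is visible inside the function

# Back at the top level there is no object called payload,
# which is why ncol(payload) fails when typed at the console.
payload_visible <- exists("payload", envir = globalenv(), inherits = FALSE)

# One interactive way to edit an installed function's body is
# utils::fixInNamespace(), which opens an editor (not run here):
# utils::fixInNamespace("statcast_search", ns = "baseballr")
```

Which function in the package actually needs the edit is not confirmed in this thread, so the name in the commented-out call is only an example.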

@mlascaleia
Author

Alright gang, I made a bad, janky fix for this issue that will act as a stopgap before the package is actually updated. It works by taking unaccounted-for columns and just calling them "newStat". I make no guarantees that it doesn't just ruin other functionalities of the package, but it will get you what you need in the meantime. I'd love to make a cleaner fix but this is a hobby not my job lol.

Run 'devtools::install_github("mlascaleia/baseballr")' in a new R session to install. This will overwrite your current version of baseballr, and the fork will not receive any updates that baseballr receives.

@jmorgs11

That works beautifully. Thank you very much for your efforts!!!


7 participants