load_results() outputs inconsistent tibble structures #272

Open
awanderingspirit opened this issue Jan 10, 2025 · 8 comments
@awanderingspirit
awanderingspirit commented Jan 10, 2025

The following code returns errors:

library('f1dataR')
library('tibble')
results=tibble()
for(year in 1950:2024){
    last_rnd=length(load_schedule(year)$round)
    for(rnd in 1:last_rnd){
        results=rbind(results,load_results(season=year,round=rnd))
    }
}
dir.create("data")
write.csv(results,file=paste0(getwd(),"/data/results.csv"))

These errors occur because load_results() returns inconsistent structures across seasons. For example, compared with the tibble from load_results(season=2024), the one returned by load_results(season=1950):

  1. has one fewer column ("time_sec" is missing), and
  2. names its 11th column "top_speed_kpt" instead of "top_speed_kph".
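
A quick way to see such differences is to compare the column names of two blocks directly. The sketch below uses small stand-in data frames with placeholder values (not real load_results() output), reduced to the columns that matter here:

```r
# Stand-in blocks with placeholder values; in practice these would be the
# tibbles returned by load_results() for two different seasons.
old_block <- data.frame(driver_id = "driver_a", top_speed_kpt = NA_real_)
new_block <- data.frame(driver_id = "driver_b", top_speed_kph = NA_real_,
                        time_sec = NA_real_)

# Columns that appear in one block but not in the other.
only_in_old <- setdiff(names(old_block), names(new_block))
only_in_new <- setdiff(names(new_block), names(old_block))

print(only_in_old)
print(only_in_new)
```

Any non-empty result means rbind() cannot stack the two blocks as they are.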

The following updated script works around both problems and writes the intended file:

library('f1dataR')
library('tibble')
results=tibble()
for(year in 1950:2024){
    last_rnd=length(load_schedule(year)$round)
    for(rnd in 1:last_rnd){
        result_block=load_results(season=year,round=rnd)
        # If the "time_sec" column is missing, add it filled with NA.
        if(!("time_sec" %in% colnames(result_block))){
            result_block$time_sec<-rep(NA,nrow(result_block))
        }
        # If the "top_speed_kph" column is misspelled, rename it to the correct spelling.
        if(names(result_block)[11]=="top_speed_kpt"){
            names(result_block)[11]<-"top_speed_kph"
        }
        results=rbind(results,result_block)
    }
}
dir.create("data")
write.csv(results,file=paste0(getwd(),"/data/results.csv"))
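
A slightly more general variant of this workaround is sketched below. It renames the column by name instead of by position and fills in any missing expected column, not only "time_sec". normalise_block() and expected_cols are hypothetical names introduced for illustration, and the data frames are placeholder stand-ins for load_results() output:

```r
# Hypothetical helper: bring one block of results to a common set of columns.
normalise_block <- function(block, expected_cols) {
  # Rename by name rather than by position, so a reordered table still works.
  names(block)[names(block) == "top_speed_kpt"] <- "top_speed_kph"
  # Add every expected column that is missing, filled with NA.
  for (col in setdiff(expected_cols, names(block))) {
    block[[col]] <- rep(NA, nrow(block))
  }
  # Keep only the expected columns, in a fixed order.
  block[, expected_cols, drop = FALSE]
}

# Assumed column set for this demo; the real tables have more columns.
expected_cols <- c("driver_id", "top_speed_kph", "time_sec")

# Placeholder blocks standing in for two load_results() calls.
blocks <- list(
  data.frame(driver_id = "driver_a", top_speed_kpt = NA_real_),
  data.frame(driver_id = "driver_b", top_speed_kph = NA_real_, time_sec = 0)
)

# Collect the blocks in a list and bind once, instead of growing a tibble
# with rbind() inside the loop.
results <- do.call(rbind, lapply(blocks, normalise_block,
                                 expected_cols = expected_cols))
print(names(results))
print(nrow(results))
```

If dplyr is available, dplyr::bind_rows() is another option for the binding step, since it fills columns missing from some inputs with NA; the rename would still be needed.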
@pbulsink
Collaborator

Hey, thanks for the bug report. Can you confirm if you're using the CRAN version of the package or the current development version?

If you're not sure, post the output from Sys.info(). Feel free to remove personally identifying information from that output before posting.

I ask because the data backend is changing very soon (in the next few weeks). The new backend might be more consistent.

Thanks.

@awanderingspirit
Author

I think you meant to ask for packageDescription() output instead, but I'll post both.

Input: Sys.info()
Output:

          sysname           release           version          nodename
        "Windows"          "10 x64"     "build 19045" 
          machine             login              user    effective_user
         "x86-64"                          

Input: packageDescription("f1dataR")
Output:

Package: f1dataR
Title: Access Formula 1 Data
Version: 1.6.0
Authors@R: c( person("Santiago", "Casanova", ,
        "[email protected]", role = c("aut", "cre", "cph")),
        person("Philip", "Bulsink", , "[email protected]", role =
        "aut", comment = c(ORCID = "0000-0001-9668-2429")) )
Description: Obtain Formula 1 data via the 'Ergast API'
        <https://ergast.com/mrd/> and the unofficial API
        <https://www.formula1.com/en/timing/f1-live> via the 'fastf1'
        'Python' library <https://docs.fastf1.dev/>.
Config/reticulate: list( packages = list( list(package = "fastf1", pip
        = TRUE) ) )
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.1
Depends: R (>= 3.5.0), reticulate (>= 1.14),
Imports: glue, magrittr, tibble, jsonlite, httr2, memoise, janitor,
        dplyr, tidyr, rlang, lifecycle, cli, rappdirs, cachem, withr
Suggests: ggplot2, httptest2, knitr, rmarkdown, testthat (>= 3.0.0),
VignetteBuilder: knitr
URL: https://scasanova.github.io/f1dataR/,
        https://github.com/SCasanova/f1dataR
BugReports: https://github.com/SCasanova/f1dataR/issues
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-08-27 21:13:40 UTC; santiagocasanova
Author: Santiago Casanova [aut, cre, cph], Philip Bulsink [aut]
        (<https://orcid.org/0000-0001-9668-2429>)
Maintainer: Santiago Casanova <[email protected]>
Repository: CRAN
Date/Publication: 2024-08-27 21:30:02 UTC
Built: R 4.4.2; ; 2025-01-04 03:29:33 UTC; windows

The line Repository: CRAN should answer your question.
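
For future reports, the relevant fields can be pulled out directly instead of posting the whole description. This is a minimal sketch; describe_install() is a hypothetical helper name, and the demo call uses the base package "utils" only so that it runs anywhere:

```r
# Hypothetical helper: summarise where an installed package came from.
describe_install <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  list(
    version     = as.character(utils::packageVersion(pkg)),
    repository  = desc$Repository,  # e.g. "CRAN"; NULL when the field is absent
    remote_type = desc$RemoteType   # e.g. "github"; NULL when the field is absent
  )
}

# Replace "utils" with "f1dataR" to check the package discussed here.
info <- describe_install("utils")
print(info$version)
```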

@pbulsink
Collaborator

Thanks again. Yeah, I think I meant sessionInfo() actually, but packageDescription() is good too.

Regardless, if you have a chance, try installing the dev version (devtools::install_github("SCasanova/f1dataR")) and see whether it still has the issue. The dev version uses the new data source, so if the issue doesn't exist in their tables, we won't need to change any code in the package.

@awanderingspirit
Author

I installed the development version of f1dataR and re-ran my original script (without the tibble workarounds). The error was reproduced as follows:

Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match
Calls: rbind -> rbind
Execution halted

See version:

Package: f1dataR
Title: Access Formula 1 Data
Version: 1.6.0.9000
Authors@R: c( person("Santiago", "Casanova", ,
        "[email protected]", role = c("aut", "cre", "cph")),
        person("Philip", "Bulsink", , "[email protected]", role =
        "aut", comment = c(ORCID = "0000-0001-9668-2429")) )
Description: Obtain Formula 1 data via the 'Jolpica API'
        <https://jolpi.ca> and the unofficial API
        <https://www.formula1.com/en/timing/f1-live> via the 'fastf1'
        'Python' library <https://docs.fastf1.dev/>.
Config/reticulate: list( packages = list( list(package = "fastf1", pip
        = TRUE) ) )
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Depends: R (>= 3.5.0), reticulate (>= 1.14),
Imports: glue, magrittr, tibble, jsonlite, httr2, memoise, janitor,
        dplyr, tidyr, rlang, lifecycle, cli, rappdirs, cachem, withr
Suggests: ggplot2, httptest2, knitr, rmarkdown, testthat (>= 3.0.0),
VignetteBuilder: knitr
URL: https://scasanova.github.io/f1dataR/,
        https://github.com/SCasanova/f1dataR
BugReports: https://github.com/SCasanova/f1dataR/issues
Config/testthat/edition: 3
RemoteType: github
RemoteHost: api.github.com
RemoteRepo: f1dataR
RemoteUsername: SCasanova
RemoteRef: HEAD
RemoteSha: 4f0c5a6e57035443ae54109b6470063203a34a56
GithubRepo: f1dataR
GithubUsername: SCasanova
GithubRef: HEAD
GithubSHA1: 4f0c5a6e57035443ae54109b6470063203a34a56
NeedsCompilation: no
Packaged: 2025-01-11 20:21:53 UTC; richa
Author: Santiago Casanova [aut, cre, cph], Philip Bulsink [aut]
        (<https://orcid.org/0000-0001-9668-2429>)
Maintainer: Santiago Casanova <santiago.casanova@yahoo.com>
Built: R 4.4.2; ; 2025-01-11 20:21:56 UTC; windows

-- File: C:/Program Files/R/R-4.4.2/library/f1dataR/Meta/package.rds

@awanderingspirit
Author

I haven't tested whether the issue also affects the other load_*() functions.

@pbulsink pbulsink self-assigned this Jan 15, 2025
pbulsink added a commit to pbulsink/f1dataR that referenced this issue Jan 15, 2025
@pbulsink
Collaborator

So I think I have it fixed, but some other issues are blocking the merge of the fixed code into the repo (the automated test system doesn't like something about Python 3.12.8 right now).

If you want, you could install the package with the changes I'm trying to merge in by running the following:

devtools::install_github("pbulsink/f1dataR", ref="bugfix")

Note that everything passes when I test the package on my local machine, and when the automated system tests it with Python versions other than 3.12.

If you do test this version and it works, please let us know.

@awanderingspirit
Author

Ran:

library('f1dataR')
library('tibble')
results=tibble()
for(year in 1950:2024){
    last_rnd=length(load_schedule(year)$round)
    for(rnd in 1:last_rnd){
        result_block=load_results(season=year,round=rnd)
        result_block$season<-rep(year,nrow(result_block))
        result_block$round<-rep(rnd,nrow(result_block))
        results=rbind(results,result_block)
    }
}
dir.create("data")
write.csv(results,file=paste0(getwd(),"/data/results.csv"))

Obtained:

! Failure at Jolpica with https:// connection. Retrying as http://.Error getting Jolpica data, http status code 429.
Too Many Requests
Error in rep(year, nrow(result_block)) : invalid 'times' argument
Execution halted
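
The 429 status means the API is rate limiting the requests. The rep() error that follows is consistent with load_results() giving back something without rows after the failed request (for example, nrow(NULL) is NULL, which rep() rejects), although I have not confirmed what the function returns on failure. A minimal sketch of a retry wrapper with a growing pause is below; fetch_with_retry() and flaky_fetch() are hypothetical names, and flaky_fetch() only simulates a rate-limited call:

```r
# Hypothetical wrapper: call `fetch` up to `max_tries` times, waiting longer
# after each failed attempt. Returns NULL if no attempt yields any rows.
fetch_with_retry <- function(fetch, max_tries = 3, base_wait = 1) {
  for (attempt in seq_len(max_tries)) {
    block <- tryCatch(fetch(), error = function(e) NULL)
    if (is.data.frame(block) && nrow(block) > 0) {
      return(block)
    }
    if (attempt < max_tries) {
      Sys.sleep(base_wait * 2^(attempt - 1))  # base_wait, then 2x, 4x, ...
    }
  }
  NULL
}

# Simulated fetcher that fails twice before succeeding (placeholder data).
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("HTTP 429 Too Many Requests")
  data.frame(driver_id = "driver_a")
}

# In the real script this would be something like:
#   fetch_with_retry(function() load_results(season = year, round = rnd))
block <- fetch_with_retry(flaky_fetch, max_tries = 3, base_wait = 0)

# Only add the extra columns when a block actually came back.
if (!is.null(block)) {
  block$season <- rep(1950, nrow(block))
}
print(calls)
```

Pausing briefly between successful requests as well may help avoid hitting the limit in the first place; the appropriate wait time is not something this thread establishes.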

SCasanova added a commit that referenced this issue Feb 28, 2025
@SCasanova
Owner

Sorry about the delay. The PR with a likely fix from @pbulsink has been merged.
