RforMassSpectrometry development call 2023-05-19 #32

Open
2 of 4 tasks
jorainer opened this issue May 18, 2023 · 1 comment

jorainer commented May 18, 2023

Topics to discuss:

  • (Better) support for peak annotations in Spectra (see "Add support for peak annotations", Spectra#287):
    • peaksVariables lists the available peaks variables (default: "mz", "intensity").
    • MsBackendSQL, MsBackendMassbank, MsBackendTimsTof, MsBackendCompDb and MsBackendMemory support additional peaks variables. peaksData returns them, but as a list of matrices.
    • A method to specifically access/set peak annotations (not m/z and intensity values!): peaksAnnotation? peaksDataFrame (a method for MsBackend) would return all peaks variables except m/z and intensity as a list of data.frames, with a setter method to add/replace these values.
    • Ensure filtering does not cause data corruption; peaksAnnotation,Spectra needs to take care of that.
  • Split the Spectra documentation into smaller chunks (see "Split Spectra documentation into smaller chunks", Spectra#288).
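
The accessor idea above can be sketched in base R without Spectra. This is an illustration only, not the package's API: `peaks_annotation` and `set_peaks_annotation` are hypothetical helper names, and a plain list of `data.frame` objects (one per spectrum) stands in for the backend's peak data.

```r
# Hypothetical stand-in for backend peak data: one data.frame per spectrum.
peaks <- list(
    data.frame(mz = c(13, 14.1, 22), intensity = c(100, 300, 30),
               ann = c("a", NA, "b"), stringsAsFactors = FALSE),
    data.frame(mz = c(45.1, 56), intensity = c(345, 234),
               ann = c("e", "f"), stringsAsFactors = FALSE)
)

# Return all peaks variables except m/z and intensity as a list of data.frame.
peaks_annotation <- function(x) {
    lapply(x, function(p) {
        p[, !colnames(p) %in% c("mz", "intensity"), drop = FALSE]
    })
}

# Add or replace one annotation column. `value` holds one vector per spectrum,
# and each vector must match the number of peaks of that spectrum.
set_peaks_annotation <- function(x, name, value) {
    stopifnot(length(value) == length(x))
    Map(function(p, v) {
        stopifnot(length(v) == nrow(p))
        p[[name]] <- v
        p
    }, x, value)
}

ann <- peaks_annotation(peaks)
peaks2 <- set_peaks_annotation(peaks, "score",
                               list(c(0.1, 0.2, 0.3), c(0.9, 0.8)))
```

The length checks in the setter are the part that guards against misaligned annotations: a value vector that does not match the number of peaks is rejected rather than recycled.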

jorainer (Member, Author) commented:

  • Agreed on better support for peak annotations.
  • Peaks variables can be added with backendInitialize,MsBackend by specifying which columns of the submitted DataFrame contain the peaks variables (see the example below).
  • To add or change peaks variables of a Spectra: addPeaksVariable <- function(x, i = seq_len(x), value = list(), name = character()). This adds or updates a single peaks variable. Parameter i specifies the spectrum/spectra in x to which the annotation should be added.
  • Problem with filtering of peaks matrices (e.g. with filterMzValues, filterIntensity etc.): these functions subset the peaksData on the fly, and the current implementations fail to subset the peak annotations correctly. The proposed solution of adding rownames to the peaks matrix (representing the index of each peak in the original data as returned by the backend) was discarded. A solution based on peaksData returning a data.frame instead of a matrix will be tested.
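
Why a `data.frame` could avoid the corruption can be sketched in base R. This is not Spectra's implementation: `filter_mz` is a hypothetical helper, and the tolerance check is a plain absolute difference. Because all peaks variables live in one `data.frame`, a single row subset keeps the annotations aligned with m/z and intensity, and the numeric columns stay numeric.

```r
# Peaks of one spectrum stored as a data.frame, so numeric and character
# peaks variables can coexist without coercion.
peaks <- data.frame(
    mz = c(13, 14.1, 22, 23, 24, 49),
    intensity = c(100, 300, 30, 120, 12, 34),
    ann = c("a", NA, "b", "c", "d", NA),
    stringsAsFactors = FALSE
)

# Hypothetical filter: keep peaks whose m/z is within `tolerance` of `mz`.
filter_mz <- function(x, mz, tolerance = 0) {
    keep <- abs(x$mz - mz) <= tolerance
    x[keep, , drop = FALSE]
}

res <- filter_mz(peaks, 23, tolerance = 1)
res$mz   # 22 23 24
res$ann  # "b" "c" "d"
```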

Examples

library(Spectra)

df <- data.frame(rtime = c(1.1, 1.2, 1.3, 1.4),
                 msLevel = 1L)
df$mz <- list(c(13, 14.1, 22, 23, 24, 49),
              c(45.1, 56),
              c(34.3, 134.4, 344, 443),
              c(12.1, 31))
df$intensity <- list(c(100, 300, 30, 120, 12, 34),
                     c(345, 234),
                     c(123, 124, 145, 3),
                     c(122, 421))

#' add some arbitrary information for each peak to the data.frame
df$ann <- list(c("a", NA, "b", "c", "d", NA),
               c("e", "f"),
               c("g", "h", "i", NA),
               c("j", "k"))

B <- Spectra(df, peaksVariables = c("mz", "intensity", "ann"))

#' Peak annotations get stored into @peaksDataFrame
B@backend@peaksDataFrame

peaksVariables(B)
[1] "mz"        "intensity" "ann" 

peaksData(B@backend, columns = peaksVariables(B))[[1L]]
     mz     intensity ann
[1,] "13.0" "100"     "a"
[2,] "14.1" "300"     NA 
[3,] "22.0" " 30"     "b"
[4,] "23.0" "120"     "c"
[5,] "24.0" " 12"     "d"
[6,] "49.0" " 34"     NA 


#' Not ideal: a matrix is returned, so m/z and intensity are coerced to character.
#' In addition, the "generic" access via `$` is also supported:
B$ann

Problems with filtering:

B2 <- filterMzValues(B, 23, tolerance = 1)
peaksData(B2, columns = c("mz", "intensity"))[[1L]]
     mz intensity
[1,] 22        30
[2,] 23       120
[3,] 24        12

#' Peaks variables were/are not updated properly.
B2$ann[[1L]]
[1] "a" NA  "b" "c" "d" NA 

#' Error extracting peaksData (since a character matrix is returned by `peaksData`)
peaksData(B2, columns = c("mz", "intensity", "ann"))[[1L]]
Error in x * ppm : non-numeric argument to binary operator
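
The error above is consistent with how base R builds matrices: a matrix has a single type, so combining numeric m/z values with a character annotation coerces everything to character, and any later arithmetic on the m/z column fails. A minimal base R sketch, independent of Spectra (the `ppm` value is an arbitrary illustration, and the actual code path inside Spectra is not shown here):

```r
mz <- c(22, 23, 24)
ann <- c("b", "c", "d")

# cbind on mixed types yields a character matrix.
m <- cbind(mz = mz, ann = ann)
is.character(m)  # TRUE

# Arithmetic on the coerced m/z column raises an error; capture its message.
ppm <- 10
msg <- tryCatch(
    {
        m[, "mz"] * ppm
        NA_character_
    },
    error = function(e) conditionMessage(e)
)
msg  # in an English locale: "non-numeric argument to binary operator"
```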
