SITS: Satellite Image Time Series Analysis for Earth Observation Data Cubes (submit R package for review) #596

gilbertocamara opened this issue Jul 2, 2023 · 82 comments


gilbertocamara commented Jul 2, 2023

Submitting Author Name: Gilberto Camara
Submitting Author Github Handle: @gilbertocamara
Other Package Authors Github handles: @rolfsimoes, @OldLipe, @pedro-andrade-inpe
Version submitted: 1.4.2
Submission type: Standard
Editor: @mpadge
Reviewers: @mikemahoney218, @paleolimbot

Due date for @mikemahoney218: 2024-08-07

Due date for @paleolimbot: 2024-08-07
Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: sits
Type: Package
Version: 1.4.2
Title: Satellite Image Time Series Analysis for Earth Observation Data Cubes
Authors@R: c(person('Rolf', 'Simoes', role = c('aut'), email = '[email protected]'),
             person('Gilberto', 'Camara', role = c('aut', 'cre'), email = '[email protected]'),
             person('Felipe', 'Souza', role = c('aut'), email = '[email protected]'),
             person('Lorena', 'Santos', role = c('aut'), email = '[email protected]'),
             person('Pedro', 'Andrade', role = c('aut'), email = '[email protected]'),
             person('Karine', 'Ferreira', role = c('aut'), email = '[email protected]'),
             person('Alber', 'Sanchez', role = c('aut'), email = '[email protected]'),
             person('Gilberto', 'Queiroz', role = c('aut'), email = '[email protected]'))
Maintainer: Gilberto Camara <[email protected]>
Description: An end-to-end toolkit for land use and land cover classification
    using big Earth observation data, based on machine learning methods 
    applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>.
    Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, 
    Brazil Data Cube, and Digital Earth Africa using the Spatio-temporal Asset Catalog (STAC) 
    protocol (<> and the 'gdalcubes' R package 
    developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>.
    Supports visualization methods for images and time series and 
    smoothing filters for dealing with noisy time series.
    Includes functions for quality assessment of training samples using self-organized maps 
    as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. 
    Provides machine learning methods including support vector machines, 
    random forests, extreme gradient boosting, multi-layer perceptrons,
    temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, 
    residual networks by Fawaz et al (2019) <doi:10.1007/s10618-019-00619-1>, and temporal attention encoders
    by Garnot and Landrieu (2020) <arXiv:2007.00586>.
    Performs efficient classification of big Earth observation data cubes and includes 
    functions for post-classification smoothing based on Bayesian inference, and 
    methods for uncertainty assessment. Enables best
    practices for estimating area and assessing accuracy of land change as 
    recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>.
    Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.
Encoding: UTF-8
Language: en-US
Depends: R (>= 4.1.0)
License: GPL-2
ByteCompile: true
LazyData: true
    dplyr (>= 1.0.0),
    parallel (>= 4.0.5),
    purrr (>= 0.3.0),
    rstac (>= 0.9.2-3),
    sf (>= 1.0-12),
    slider (>= 0.2.0),
    terra (>= 1.5-17),
    tibble (>= 3.1),
    tidyr (>= 1.2.0),
    torch (>= 0.9.0),
    gdalcubes (>= 0.6.0),
    kohonen (>= 3.0.11),
    leafem (>= 0.2.0),
    leaflet (>= 2.1.1),
    luz (>= 0.3.0),
    RcppArmadillo (>= 0.11),
    stars (>= 0.6),
    testthat (>= 3.1.3),
    tmap (>= 3.3),
    torchopt (>= 0.1.2),
Config/testthat/edition: 3
Config/testthat/parallel: false
Config/testthat/start-first: cube, raster, regularize, data, ml
RoxygenNote: 7.2.3


  • Please indicate which category or categories from our package fit policies this package falls under:

    • [X ] geospatial data
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):
    sits is a package for satellite image time series analysis that works with big Earth observation data sets.

  • Who is the target audience and what are scientific applications of this package?
    The target audience consists of remote sensing and environmental experts who want to classify remote sensing images for applications such as deforestation detection, agricultural and land use/land cover mapping, biodiversity conservation, and land degradation monitoring.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
    There are currently no other open source software packages that have the same capabilities.

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
    Not applicable

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
    Not applicable

  • Explain reasons for any pkgcheck items which your package is unable to pass.

(a) Vignettes: instead of preparing vignettes, the authors have written an on-line book that describes the contents of the package in detail. The book is available at the URL

Important notes:

(1) To run the tests, examples, and code coverage, please make
sure the following environment variables are set in the R session:
Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
sits is a fairly large package, and the tests take a long time to run, since they access cloud services. For this reason, testing needs to be manually enabled.
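
The opt-in mechanism described in note (1) can be sketched in base R as follows. This is only an illustration: the helper name `is_enabled()` is hypothetical, and the thread does not show how sits itself reads these variables.

```r
# Hypothetical helper (not part of sits): TRUE only when the
# environment variable is set to exactly "YES".
is_enabled <- function(var) {
    unname(Sys.getenv(var, unset = "")) == "YES"
}

# Opt in for the current R session, as in the submission notes.
Sys.setenv("SITS_RUN_TESTS" = "YES")

if (is_enabled("SITS_RUN_TESTS")) {
    message("long-running tests enabled")
} else {
    message("long-running tests skipped")
}
```

With a gate of this kind, a default run (variable unset) skips the slow, cloud-dependent tests, which would explain a very low coverage figure from an automated check.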

(2) Please review version 1.4.2, not yet on CRAN, which is available in the "dev" branch in the github repository.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
    The package is already on CRAN.

  • Do you intend for this package to go on Bioconductor?

  • [ x] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

Editor check started


Checks for sits (v1.4.1)

git hash: 6eac9edf

  • ✖️ Package name is not available (on CRAN).
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✖️ The following function has no documented return value: [sits_filter]
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✖️ Package has no HTML vignettes
  • ✖️ These functions do not have examples: [plot.sits_cluster, sits_filter, sits_list_collections].
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage is 0.1% (should be at least 75%).
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL-2

1. Package Dependencies

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in C++ (5% in 14 files) and R (95% in 131 files)
  • 8 authors
  • no vignette
  • 4 internal data files
  • 18 imported packages
  • 164 exported functions (median 11 lines of code)
  • 1988 non-exported functions in R (median 7 lines of code)
  • 73 C++ functions (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure                   value    percentile  noteworthy
files_R                   131      99.5
files_src                 14       95.0
files_vignettes           0        0.0         TRUE
files_tests               46       99.0
loc_R                     18106    99.6        TRUE
loc_src                   1021     62.7
loc_tests                 5220     98.3        TRUE
num_vignettes             0        0.0         TRUE
data_size_total           186830   86.9
data_size_median          50517    88.2
n_fns_r                   2152     99.9        TRUE
n_fns_r_exported          164      97.9        TRUE
n_fns_r_not_exported      1988     99.9        TRUE
n_fns_src                 73       74.2
n_fns_per_file_r          9        84.0
n_fns_per_file_src        5        49.3
num_params_per_fn         4        54.6
loc_per_fn_r              8        20.0
loc_per_fn_r_exp          11       25.1
loc_per_fn_r_not_exp      7        18.0
loc_per_fn_src            11       28.5
rel_whitespace_R          12       98.9        TRUE
rel_whitespace_src        13       56.8
rel_whitespace_tests      14       97.2        TRUE
doclines_per_fn_exp       44       55.5
doclines_per_fn_not_exp   0        0.0         TRUE
fn_call_network_size      3546     99.5        TRUE

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges


GitHub Workflow Results

id name conclusion sha run_number date
5438924419 R-CMD-check success bc5d6c 116 2023-07-02

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking installed package size ... NOTE
    installed size is 16.7Mb
    sub-directories of 1Mb or more:
    libs 14.1Mb

R CMD check generated the following check_fail:

  1. rcmdcheck_reasonable_installed_size

Test coverage with covr

Package coverage: 0.08

The following files are not completely covered by tests:

file coverage
R/api_accessors.R 0%
R/api_accuracy.R 0%
R/api_apply.R 0%
R/api_band.R 0%
R/api_bbox.R 0%
R/api_block.R 0%
R/api_check.R 0%
R/api_chunks.R 0%
R/api_classify.R 0%
R/api_cluster.R 0%
R/api_combine_predictions.R 0%
R/api_comp.R 0%
R/api_conf.R 0%
R/api_csv.R 0%
R/api_cube.R 0%
R/api_data.R 0%
R/api_debug.R 0%
R/api_download.R 0%
R/api_expressions.R 0%
R/api_factory.R 0%
R/api_file_info.R 0%
R/api_file.R 0%
R/api_gdal.R 0%
R/api_gdalcubes.R 0%
R/api_imputation.R 0%
R/api_jobs.R 0%
R/api_label_class.R 0%
R/api_mixture_model.R 0%
R/api_ml_model.R 0%
R/api_mosaic.R 0%
R/api_parallel.R 0%
R/api_period.R 0%
R/api_plot_raster.R 0%
R/api_plot_time_series.R 0%
R/api_point.R 0%
R/api_predictors.R 0%
R/api_raster_sub_image.R 0%
R/api_raster_terra.R 0%
R/api_raster.R 0%
R/api_reclassify.R 0%
R/api_roi.R 0%
R/api_samples.R 0%
R/api_segments.R 0%
R/api_sf.R 0%
R/api_shp.R 0%
R/api_signal.R 0%
R/api_smooth.R 0%
R/api_smote.R 0%
R/api_som.R 0%
R/api_source_aws.R 0%
R/api_source_bdc.R 0%
R/api_source_deafrica.R 0%
R/api_source_hls.R 0%
R/api_source_local.R 0%
R/api_source_mpc.R 0%
R/api_source_sdc.R 0%
R/api_source_stac.R 0%
R/api_source_usgs.R 0%
R/api_source.R 0%
R/api_space_time_operations.R 0%
R/api_stac.R 0%
R/api_stats.R 0%
R/api_summary.R 0%
R/api_tibble.R 0%
R/api_tile.R 0%
R/api_timeline.R 0%
R/api_torch.R 0%
R/api_ts.R 0%
R/api_tuning.R 0%
R/api_uncertainty.R 0%
R/api_utils.R 0%
R/api_variance.R 0%
R/api_view.R 0%
R/sits_accuracy.R 0%
R/sits_active_learning.R 0%
R/sits_apply.R 0%
R/sits_bands.R 0%
R/sits_bbox.R 0%
R/sits_classify.R 0%
R/sits_cluster.R 0%
R/sits_colors.R 0%
R/sits_combine_predictions.R 0%
R/sits_config.R 0%
R/sits_csv.R 0%
R/sits_cube_copy.R 0%
R/sits_cube.R 0%
R/sits_factory.R 0%
R/sits_filters.R 0%
R/sits_geo_dist.R 0%
R/sits_get_data.R 0%
R/sits_label_classification.R 0%
R/sits_labels.R 0%
R/sits_lighttae.R 0%
R/sits_machine_learning.R 0%
R/sits_merge.R 0%
R/sits_mixture_model.R 0%
R/sits_mlp.R 0%
R/sits_model_export.R 0%
R/sits_mosaic.R 0%
R/sits_patterns.R 0%
R/sits_plot.R 0%
R/sits_predictors.R 0%
R/sits_reclassify.R 0%
R/sits_regularize.R 0%
R/sits_resnet.R 0%
R/sits_sample_functions.R 0%
R/sits_segmentation.R 0%
R/sits_select.R 0%
R/sits_sf.R 0%
R/sits_smooth.R 0%
R/sits_som.R 0%
R/sits_summary.R 0%
R/sits_tae.R 0%
R/sits_tempcnn.R 0%
R/sits_temporal_segmentation.R 0%
R/sits_timeline.R 0%
R/sits_train.R 0%
R/sits_tuning.R 0%
R/sits_uncertainty.R 0%
R/sits_utils.R 12.5%
R/sits_validate.R 0%
R/sits_values.R 0%
R/sits_variance.R 0%
R/sits_view.R 0%
R/sits_xlsx.R 0%
src/combine_data.cpp 0%
src/kernel.cpp 0%
src/label_class.cpp 0%
src/linear_interp.cpp 0%
src/nnls_solver.cpp 0%
src/normalize_data_0.cpp 0%
src/normalize_data.cpp 0%
src/sampling_window.cpp 0%
src/smooth_bayes.cpp 0%
src/smooth_sgp.cpp 0%
src/smooth_whit.cpp 0%
src/smooth.cpp 0%
src/uncertainty.cpp 0%

Cyclocomplexity with cyclocomp

The following function has cyclocomplexity >= 15:

function cyclocomplexity
sits_cube.stac_cube 16

Static code analyses with lintr

lintr found the following 23 potential issues:

message number of times
Avoid library() and require() calls in packages 16
Lines should not be more than 80 characters. 3
Use <-, not =, for assignment. 4

Package Versions

package version

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

Many thanks for your response. Please see the explanation below, which was included as an "Important Note" in the submission but may have failed to catch the attention of the reviewers.

  • Explain reasons for any pkgcheck items which your package is unable to pass.
    ✖️ Package name is not available (on CRAN).
    Package is already on CRAN. See

✖️ The following function has no documented return value: [sits_filter]
✖️ These functions do not have examples: [plot.sits_cluster, sits_filter, sits_list_collections].
These problems have been fixed in version 1.4.2 of the package, which is available in the "dev" branch on GitHub. To clone the "dev" branch please use the command

git clone --branch dev

✖️ Package has no HTML vignettes
Instead of preparing vignettes, the authors have written an on-line book that describes the contents of the package in detail. The book is available at the URL

✖️ Package coverage is 0.1% (should be at least 75%).
Package coverage is actually 95%. Please see

sits is a large package. There are more than 1,100 individual tests that take a long time to run. Some of these tests access cloud services, which might be temporarily offline. For this reason, testing needs to be manually enabled. To run the tests, examples, and code coverage, please set the following environment variables in the R session:

Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")

We are confident that sits meets the required criteria for ROpenSci review.

We would also like to respond to the lintr message:

Avoid library() and require() calls in packages - 16 

The package directly imports 17 packages, which are required for most functions. It also suggests 33 packages, which are typically used in only a few functions and need to be loaded only on an "as-is" basis. This is based on CRAN policies that restrict the number of imported packages.
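
For context on the lintr message, a commonly recommended alternative to calling library() or require() for a suggested package is to test for it with requireNamespace() and then call its functions with the `pkg::fun()` form. The sketch below illustrates that pattern; the helper name `check_suggested()` is hypothetical and is not taken from sits.

```r
# Hypothetical helper (not part of sits): stop with a clear message
# when an optional package is not installed.
check_suggested <- function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        stop("package '", pkg, "' is required for this function; ",
             "please install it", call. = FALSE)
    }
    invisible(TRUE)
}

# 'stats' ships with R, so this check passes silently.
check_suggested("stats")
# After the check, functions are called with an explicit namespace.
stats::median(c(1, 3, 5))
```

This keeps the optional dependency out of the search path and produces an informative error only in the functions that actually need it.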


maelle commented Jul 7, 2023

Thank you for your submission, @gilbertocamara, and for your careful response to the automatic checks. We agree with all your responses above.

However, as this is a package that implements statistical and ML methods for geospatial data, rather than just the “accessing, manipulating, converting” functionality in our scope, it falls under our newer statistical peer review program, which has its own time series and geospatial standards. Submission requirements are different for this program, as authors need to document their standards compliance with our code annotation system.

A note: SITS is a package that is very large in scope and code base, as exemplified by the fact that it has a whole book for its documentation. As such, we anticipate that it will be challenging to find reviewers, and we will need to give them considerably longer than usual to review the code base and documentation in full. Most of our submissions are not as large or mature at the point of review, and are open to significant API or architecture changes in response to review. For something in an earlier stage we would likely have suggested breaking functionality up into smaller, more focused packages. Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the withr::local_envvar() function or similar.
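
As a rough illustration of the withr suggestion, the variables could be set for the duration of a single function call and restored afterwards. The wrapper name `with_sits_flags()` is hypothetical, and the sketch assumes the withr package is installed.

```r
# Hypothetical wrapper (not part of sits): evaluate `code` with both
# flags set; withr restores the previous values when the function exits.
with_sits_flags <- function(code) {
    withr::local_envvar(c(SITS_RUN_TESTS = "YES",
                          SITS_RUN_EXAMPLES = "YES"))
    force(code)
}

# The flag is visible while the expression is evaluated ...
inside <- with_sits_flags(Sys.getenv("SITS_RUN_TESTS"))
# ... and is back to its previous value afterwards.
after <- Sys.getenv("SITS_RUN_TESTS")
```

Scoping the variables this way avoids leaving them set in the session, which matters when the same session later runs checks that are meant to skip the slow tests.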

Thanks again! We're happy to answer further questions.


Dear @maelle, many thanks for your response. Please see my comments below:

Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

Good! Looking at the specific requirements for ROpenSci statistical packages, the sits package meets most of them, such as G.2 (related to data input), G.3 (algorithms), and G.4 (output data). We will have to review the sits package carefully with respect to requirements G.1 (documentation), G.5 (testing), and those for machine learning.

At first glance, sits complies with requirements SP (spatial software), TS (time series), and UL (unsupervised learning). Since these requirements are very detailed, we will carefully review them to ensure compliance. We believe we meet the PD requirements (probability distributions).

One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the withr::local_envvar() function or similar.

Allow me to propose an alternative: please consider the information provided in "" to be sufficient to assert that sits meets the code coverage requirements of ROpenSci. If you accept this proposal, it will save us both time and work.

We will work on improving sits so that it meets the specifications for ROpenSci statistical packages. We will report back to you when we have a new version that fully meets such specs.



maelle commented Jul 7, 2023

Thank you! 🎉 Here's a direct link to the author guide for stat submissions:


maelle commented Jul 8, 2023

@gilbertocamara just a clarification: your package will have to comply with one of the category standards of the statistical review system, probably spatial (because time series is for class-based manipulation of time-series data, which is not what your package does as far as I understand).

Probability distributions should be considered an "additional" category that may be complied with in addition to the main categories.

Thank you!


This comment was marked as resolved.


maelle commented Jul 11, 2023

@gilbertocamara the CONTRIBUTING guide could link to the book chapter, as long as it's easy to find all information.

I'm still waiting for R CMD check to finish, but examples ran without error (tests now running).

Thanks for the tips!

maelle commented Jul 11, 2023

Tests passed! Now on to trying autotest...


Dear @maelle, autotest runs OK in "sits". Now, we have to work on the recommendations. Thanks!


Dear @maelle @mpadge I would like to ask for your help to understand how autotest works. As I understand it, autotest runs different diagnostics on the functions of a package. It aims to test the resilience of each function to unexpected values of the parameters, for example NA values. It also tries to guess the parameter type from the Rd documentation; here, it tests the function for invalid entries, e.g., numeric inputs for integer parameters. That's important and valuable for software designers.
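
The kind of check described in this paragraph can be sketched in a few lines of base R. This is only an illustration of the general idea of mutating inputs; it is not how autotest is implemented, and `take_first()` is a hypothetical function under test.

```r
# Hypothetical function under test: `n` must be a single whole number >= 1.
take_first <- function(x, n = 1L) {
    if (length(n) != 1L || is.na(n) || n != round(n) || n < 1) {
        stop("invalid n parameter", call. = FALSE)
    }
    head(x, n)
}

# Unexpected values for `n`, loosely modelled on the cases named above.
mutations <- list(
    na         = NA_integer_,
    fractional = 0.5,
    negative   = -1L,
    vector     = c(1L, 2L)
)

# Record whether each mutated call is rejected with an error.
results <- vapply(mutations, function(value) {
    tryCatch({
        take_first(letters, n = value)
        "no error"
    }, error = function(e) "error")
}, character(1))

print(results)
```

A function that validates its inputs should report "error" for every mutation here; a "no error" entry would point to an input the function silently accepts.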

In the sits package, the authors have been very careful to include pre-conditions for all parameters of all functions. All parameters are checked for valid values, and an error message is provided. However, we are finding there is a mismatch between the error messages provided by sits and those expected by autotest. For us, it is not clear what autotest considers as a valid response.

Consider the following function, which takes as input a set of spatially referenced time series and allows the user to select some of its members. Users can either select a number or a fraction of the series. The relevant part of the code is shown below:

#' @title Sample a percentage of a time series
#' @name sits_sample
#' @author Rolf Simoes, \email{}
#' @description Takes a sits tibble with different labels and
#'              returns a new tibble. For a given field as a group criterion,
#'              this new tibble contains a given number or percentage
#'              of the total number of samples per group.
#'              Parameter n: number of random samples.
#'              Parameter frac: a fraction of random samples.
#'              If n is greater than the number of samples for a given label,
#'              that label will be sampled with replacement. Also,
#'              if frac > 1 , all sampling will be done with replacement.
#' @param  data       Sits time series.
#' @param  n          Integer: number of samples to select (range: 1 to nrow(data)).
#' @param  frac       Percentage of samples to pick from each group of data.
#' @param  oversample Oversample classes with small number of samples?
#' @return            A sits tibble with a fixed quantity of samples.
#' @examples
#' # Retrieve a set of time series with 2 classes
#' data(cerrado_2classes)
#' # Print the labels of the resulting tibble
#' summary(cerrado_2classes)
#' # Samples the data set
#' data_100 <- sits_sample(cerrado_2classes, n = 100)
#' # Print the labels
#' summary(data_100)
#' # Sample by fraction
#' data_02 <- sits_sample(cerrado_2classes, frac = 0.2)
#' # Print the labels
#' summary(data_02)
#' @export
sits_sample <- function(data,
                        n = NULL,
                        frac = NULL,
                        oversample = TRUE) {
    # set caller to show in errors
    # verify if data is valid
    # verify if either n or frac is informed
    .check_that(
        x = !(purrr::is_null(n) & purrr::is_null(frac)),
        local_msg = "neither 'n' or 'frac' parameters were informed",
        msg = "invalid sample parameters"
    )
    # check oversample
    # check n and frac parameters
    if (!purrr::is_null(n))
        .check_num(n, allow_na = FALSE, is_integer = TRUE,
                   min = 1, max = nrow(data),
                   len_min = 1, len_max = 1,
                   msg = "invalid n parameter")
    if (!purrr::is_null(frac))
        .check_num(frac, allow_na = FALSE, is_integer = FALSE,
                   min = 0.0, max = 10.0,
                   len_min = 1, len_max = 1,
                   msg = "invalid frac parameter")

The output for autotest for this function is:

  type  test_name fn_name     parameter parameter_type operation content               test  
1 error NA        sits_sample NA        NA             NA        sits_sample: invalid… TRUE 

In the above the content column is:

sits_sample: invalid n parameter (value is not integer)

We are failing to understand what is being tested by autotest and what is the expected response. As you can see from the code above, we explicitly test for NA and test for the valid values of the input parameters. In principle, we cannot find flaws in the error messages we provide. Please see some examples below.

> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = 0.3)
> Error: sits_sample: invalid n parameter (value is not integer)

sits_sample(cerrado_2classes, frac = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, frac =  30)
> Error: sits_sample: invalid frac parameter (value should be <= 10)
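
To make the behaviour behind these messages concrete, here is a minimal stand-alone sketch of a numeric validator. The function `check_num()` below is hypothetical: it only borrows the argument names of the `.check_num()` calls shown above and is not the sits implementation.

```r
# Hypothetical validator mirroring the arguments used in the calls above.
check_num <- function(x, allow_na = FALSE, is_integer = FALSE,
                      min = -Inf, max = Inf,
                      len_min = 1, len_max = 1,
                      msg = "invalid parameter") {
    fail <- function(reason) stop(msg, " (", reason, ")", call. = FALSE)
    if (!is.numeric(x) && !all(is.na(x))) fail("value is not numeric")
    if (length(x) < len_min || length(x) > len_max) fail("invalid length")
    if (!allow_na && anyNA(x)) fail("NA value is not allowed")
    # A whole number stored as double (e.g. 100) passes; 0.3 does not.
    if (is_integer && any(x != round(x), na.rm = TRUE)) {
        fail("value is not integer")
    }
    if (any(x < min, na.rm = TRUE)) fail(paste0("value should be >= ", min))
    if (any(x > max, na.rm = TRUE)) fail(paste0("value should be <= ", max))
    invisible(x)
}

# A fractional value for an integer parameter is rejected.
msg_n <- tryCatch(
    check_num(0.3, is_integer = TRUE, min = 1, max = 100,
              msg = "invalid n parameter"),
    error = function(e) conditionMessage(e)
)
msg_n
```

Under this sketch, an error of the form "invalid n parameter (value is not integer)" is simply what a validator of this shape produces whenever a caller supplies a non-integer value for `n`.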

We are failing to see what we might be doing wrong. What are the expectations of autotest which are not met by our input parameter tests?

We would appreciate your response.



Dear @maelle @mpadge

Please, could you explain what appears to be an unexpected behaviour of autotest?

Today, I ran autotest twice on version 1.4.2 (dev) of the sits package. The first response had 16 issues (please see the RDS file in From what I could understand from the autotest output, it complains about the expected return values of R functions that are called for side-effects.

I tried to fix some of these problems by considering the recommendations of the tidyverse design guide. In Section 26 ("Side-effect functions should return invisibly"), the guide states: "If a function is called primarily for its side-effects, it should invisibly return a useful output. If there’s no obvious output, return the first argument". See more at

I am assuming that autotest follows the same guidelines. Thus, I included invisible return values in all sits functions that are called for side-effects. Then, I ran autotest again. To my surprise, it flagged 48 issues. Please see the second autotest output at
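
The convention quoted from the tidyverse design guide can be illustrated with a small base-R sketch. The function `write_report()` is hypothetical and unrelated to sits; it only shows a side-effect function returning its first argument invisibly.

```r
# Hypothetical side-effect function: writes `x` to `path` and
# returns its first argument invisibly.
write_report <- function(x, path) {
    writeLines(as.character(x), path)
    invisible(x)
}

tmp <- tempfile(fileext = ".txt")
# withVisible() reports both the value and whether it would auto-print.
res <- withVisible(write_report(1:3, tmp))
res$visible
res$value
```

Because the value is still returned, a function written this way has a return value that can be documented and checked, even though nothing is printed at the console.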

Could you please help me and explain why autotest increases its number of issues from 16 to 48? Your help will be most appreciated.

*** MWE ***

# install dev version
# enable examples and tests
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
Sys.setenv("SITS_RUN_TESTS" = "YES")
# first run of autotest
autotest_1 <- autotest::autotest_package(package = "sits", test = TRUE)
# second run of autotest
autotest_2 <- autotest::autotest_package(package = "sits", test = TRUE)

Best regards


maelle commented Jul 17, 2023

Hello! I'll get to this later this week, thanks for your patience!


Dear @maelle @mpadge Begging your indulgence for being insistent, I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest. Consider the following case. The sits package deals with big data, processing time series of satellite images. All functions that produce new images need to specify a directory where the results are stored. This is achieved by a parameter called output_dir, which is used in 18 functions, with the same parameter name and the same use.

Out of these 18 instances, autotest produces a diagnostic in only two (2) cases. In both instances, it produces a single_char_case diagnostic. As I understand it, this diagnostic works on the premise that changing the case of a character parameter should yield the same result. Obviously, this expectation cannot be met on operating systems where directory names are case-sensitive.
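
The distinction at stake can be sketched as follows: a parameter that selects one option from a fixed set can reasonably be matched without regard to case, whereas a file-system path cannot. The helper `match_choice()` and the option values are hypothetical and are not taken from sits or autotest.

```r
# Hypothetical helper: match `value` against `choices`, ignoring case,
# and return the canonical spelling.
match_choice <- function(value, choices) {
    stopifnot(is.character(value), length(value) == 1L, !is.na(value))
    idx <- match(tolower(value), tolower(choices))
    if (is.na(idx)) {
        stop("invalid option '", value, "' (expected one of: ",
             paste(choices, collapse = ", "), ")", call. = FALSE)
    }
    choices[[idx]]
}

# Illustrative option values only.
methods <- c("dtw_basic", "euclidean")
match_choice("DTW_BASIC", methods)
match_choice("Euclidean", methods)
```

Normalising case in this way is appropriate only for option-like parameters; a directory argument such as output_dir has to be passed through unchanged, so a case-sensitivity diagnostic on it describes the file system rather than a defect in the function.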

Since the condition to proceed with the revision for statistical packages submitted to ROpenSci is that autotest should not find any problems with the code (no diagnostics, no warnings, no errors), I am at a loss on how to proceed. Please advise on what can be done in this case.

Please also explain why autotest only flags this condition in 2 out of the 18 cases where the parameter output_dir is used.

Many thanks for your help,


maelle commented Jul 18, 2023

res <- autotest::autotest_package("/home/maelle/Documents/ropensci/SOFTWARE-REVIEW/sits", test = TRUE)
#> Loading required namespace: devtools
#> ℹ Loading sits
#> SITS - satellite image time series analysis.
#> Loaded sits v1.4.2.
#>         See ?sits for help, citation("sits") for use in publication.
#>         Documentation avaliable in
#> ★ Extracting example code from 107 .Rd files
#> ✔ Converted examples to yaml
#> ── autotesting sits ──
#> ✔ [1 / 19]: sits_clean
#> ✔ [2 / 19]: sits_cluster_clean

#> ✔ [3 / 19]: sits_cluster_dendro
#> ✔ [4 / 19]: sits_cluster_frequency
#> ✔ [5 / 19]: sits_config_show
#> ✔ [6 / 19]: sits_labels
#> ✔ [7 / 19]: sits_pred_features
#> ✔ [8 / 19]: sits_pred_normalize
#> ✔ [9 / 19]: sits_pred_references
#> ✔ [10 / 19]: sits_pred_sample
#> ✔ [11 / 19]: sits_predictors
#> ✔ [12 / 19]: sits_reclassify
#> ✔ [13 / 19]: sits_sample
#> ✔ [14 / 19]: sits_select
#> ✔ [15 / 19]: sits_select
#> ✔ [16 / 19]: sits_stats
#> ✔ [17 / 19]: sits_timeline
#> ✔ [18 / 19]: sits_to_csv
#> ✔ [19 / 19]: sits_validate
type test_name fn_name parameter parameter_type operation content test yaml_hash
error NA sits_clean NA NA normal function call argument “cube” is missing, with no default TRUE 5730a40764ee0ff672f0fb3f696ee882
error NA sits_clean NA NA NA argument “cube” is missing, with no default TRUE 5730a40764ee0ff672f0fb3f696ee882
error negate_logical sits_clean progress single logical Negate default value of logical parameter argument “cube” is missing, with no default TRUE 5730a40764ee0ff672f0fb3f696ee882
error return_successful sits_clean (return object) (return object) error from normal operation argument “cube” is missing, with no default TRUE 5730a40764ee0ff672f0fb3f696ee882
error NA sits_pred_sample NA NA normal function call :group_by(pred, .data[[“label”]]): argument “pred” is missing, with no default TRUE 3f65689ba22d83703d7e8dfea39329c8
error return_successful sits_pred_sample (return object) (return object) error from normal operation argument “pred” is missing, with no default TRUE 3f65689ba22d83703d7e8dfea39329c8
error NA sits_reclassify NA NA normal function call argument “cube” is missing, with no default TRUE 7b320a8a78d142a00c6cd5e03f99106b
error NA sits_reclassify NA NA NA argument “cube” is missing, with no default TRUE 7b320a8a78d142a00c6cd5e03f99106b
error return_successful sits_reclassify (return object) (return object) error from normal operation argument “cube” is missing, with no default TRUE 7b320a8a78d142a00c6cd5e03f99106b
error NA sits_sample NA NA NA sits_sample: invalid value - param is not integer (value is not integer) TRUE 20524a33c207002d7fb57cf205a60f57
error NA sits_select NA NA normal function call sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) TRUE 48668453b1a14d43448243a281745542
error return_successful sits_select (return object) (return object) error from normal operation sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) TRUE 48668453b1a14d43448243a281745542
warning par_is_demonstrated sits_cluster_dendro bands NA Check that parameter usage is demonstrated Examples do not demonstrate usage of this parameter TRUE NA
warning par_is_demonstrated sits_cluster_dendro k NA Check that parameter usage is demonstrated Examples do not demonstrate usage of this parameter TRUE NA
warning par_is_demonstrated sits_validate samples_validation NA Check that parameter usage is demonstrated Examples do not demonstrate usage of this parameter TRUE NA
diagnostic int_range sits_clean window_size single integer Ascertain permissible range Function [sits_clean] does not respond appropriately for specified/default input [window_size = 5] TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic int_range sits_clean memsize single integer Ascertain permissible range Function [sits_clean] does not respond appropriately for specified/default input [memsize = 8] TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic int_range sits_clean multicores single integer Ascertain permissible range Function [sits_clean] does not respond appropriately for specified/default input [multicores = 2] TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic single_char_case sits_clean output_dir single character lower-case character parameter is case dependent TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic single_char_case sits_clean output_dir single character upper-case character parameter is case dependent TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic single_char_case sits_clean version single character lower-case character parameter is case dependent TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic single_char_case sits_clean version single character upper-case character parameter is case dependent TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic return_desc_includes_class sits_clean (return object) (return object) Check whether description of return value specifies class Function [sits_clean] returns a value of class [simpleError, error, condition], which differs from the value provided in the description TRUE 5730a40764ee0ff672f0fb3f696ee882
diagnostic vector_custom_class sits_cluster_dendro samples vector Custom class definitions for vector input Function [sits_cluster_dendro] errors on vector columns with different classes when submitted as samples Error message: tryCatch: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro dist_method single character lower-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro dist_method single character upper-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro linkage single character lower-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro linkage single character upper-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro palette single character lower-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic single_char_case sits_cluster_dendro palette single character upper-case character parameter is case dependent TRUE bb0d908f751509f98d150384274f8529
diagnostic random_char_string sits_cluster_dendro palette single character random character string as parameter does not match arguments to expected values TRUE bb0d908f751509f98d150384274f8529
diagnostic single_par_as_length_2 sits_cluster_dendro palette single character Length 2 vector for length 1 parameter Parameter [palette] of function [sits_cluster_dendro] is only used a single character value, but responds to vectors of length > 1 TRUE bb0d908f751509f98d150384274f8529
diagnostic subst_int_for_logical sits_cluster_dendro .plot single logical Substitute integer values for logical parameter (Function call should still work unless explicitly prevented) TRUE bb0d908f751509f98d150384274f8529
diagnostic return_desc_includes_class sits_cluster_dendro (return object) (return object) Check whether description of return value specifies class Function [sits_cluster_dendro] returns a value of class [sits_cluster, sits, tbl_df, tbl, data.frame], which differs from the value provided in the description TRUE bb0d908f751509f98d150384274f8529
diagnostic vector_custom_class sits_labels data vector Custom class definitions for vector input Function [sits_labels] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame TRUE 19a24496601a7a343f58f03b8bb428f0
diagnostic return_desc_includes_class sits_pred_sample (return object) (return object) Check whether description of return value specifies class Function [sits_pred_sample] returns a value of class [simpleError, error, condition], which differs from the value provided in the description TRUE 3f65689ba22d83703d7e8dfea39329c8
diagnostic vector_custom_class sits_predictors samples vector Custom class definitions for vector input Function [sits_predictors] errors on vector columns with different classes when submitted as samples Error message: tryCatch: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) TRUE 1f67e542daa971702a44e33b80cbf7df
diagnostic single_char_case sits_reclassify rules single character lower-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic single_char_case sits_reclassify rules single character upper-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic int_range sits_reclassify memsize single integer Ascertain permissible range Function [sits_reclassify] does not respond appropriately for specified/default input [memsize = 4] TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic int_range sits_reclassify multicores single integer Ascertain permissible range Function [sits_reclassify] does not respond appropriately for specified/default input [multicores = 2] TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic single_char_case sits_reclassify output_dir single character lower-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic single_char_case sits_reclassify output_dir single character upper-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic single_char_case sits_reclassify version single character lower-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic single_char_case sits_reclassify version single character upper-case character parameter is case dependent TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic return_desc_includes_class sits_reclassify (return object) (return object) Check whether description of return value specifies class Function [sits_reclassify] returns a value of class [simpleError, error, condition], which differs from the value provided in the description TRUE 7b320a8a78d142a00c6cd5e03f99106b
diagnostic vector_custom_class sits_sample data vector Custom class definitions for vector input Function [sits_sample] errors on vector columns with different classes when submitted as data Error message: sits_sample: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) TRUE 20524a33c207002d7fb57cf205a60f57
diagnostic int_range sits_sample n single integer Ascertain permissible range Parameter [n] defines only one positive or negative limit; plese either specify both lower and upper limits, or that values must be ‘positive’ or ‘negative’ TRUE 20524a33c207002d7fb57cf205a60f57
diagnostic vector_custom_class sits_select data vector Custom class definitions for vector input Function [sits_select] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame TRUE aa507da50025d46879f398dbdd2851a9
diagnostic single_char_case sits_select bands single character lower-case character parameter is case dependent TRUE aa507da50025d46879f398dbdd2851a9
diagnostic vector_custom_class sits_select data vector Custom class definitions for vector input Function [sits_select] errors on vector columns with different classes when submitted as data Error message: sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) TRUE 48668453b1a14d43448243a281745542
diagnostic return_desc_includes_class sits_select (return object) (return object) Check whether description of return value specifies class Function [sits_select] returns a value of class [simpleError, error, condition], which differs from the value provided in the description TRUE 48668453b1a14d43448243a281745542
diagnostic vector_custom_class sits_timeline data vector Custom class definitions for vector input Function [sits_timeline] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame TRUE 61a55bad554b362221b02d495c3015f7
diagnostic vector_custom_class sits_to_csv data vector Custom class definitions for vector input Function [sits_to_csv] errors on vector columns with different classes when submitted as data Error message: sits_metadata_to_csv: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) TRUE 6f7afd7299bacb77f4656d9672c7e46b
diagnostic single_char_case sits_to_csv file single character lower-case character parameter is case dependent TRUE 6f7afd7299bacb77f4656d9672c7e46b
diagnostic single_char_case sits_to_csv file single character upper-case character parameter is case dependent TRUE 6f7afd7299bacb77f4656d9672c7e46b
diagnostic random_char_string sits_to_csv file single character random character string as parameter does not match arguments to expected values TRUE 6f7afd7299bacb77f4656d9672c7e46b
diagnostic return_desc_includes_class sits_to_csv (return object) (return object) Check whether description of return value specifies class Function [sits_to_csv] returns a value of class [sits, tbl_df, tbl, data.frame], which differs from the value provided in the description TRUE 6f7afd7299bacb77f4656d9672c7e46b
message NA sits_cluster_dendro NA NA normal function call calculating dendrogram… TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call finding the best cut… TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call best number of clusters = 6 TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call best height for cutting the dendrogram = 20.3965461960775 TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call cutting the tree… TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call Plotting dendrogram… TRUE bb0d908f751509f98d150384274f8529
message NA sits_cluster_dendro NA NA normal function call result is a tibble with cluster indexes… TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter calculating dendrogram… TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter finding the best cut… TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter best number of clusters = 6 TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter best height for cutting the dendrogram = 20.3965461960775 TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter cutting the tree… TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter Plotting dendrogram… TRUE bb0d908f751509f98d150384274f8529
message negate_logical sits_cluster_dendro .plot single logical Negate default value of logical parameter result is a tibble with cluster indexes… TRUE bb0d908f751509f98d150384274f8529

Created on 2023-07-18 with reprex v2.0.2
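
Several of the diagnostics above probe how a function reacts to case changes in character parameters, out-of-range integers, and malformed dates. The following is a minimal base R sketch of input checks of that kind. `validate_args()` and its parameters are hypothetical names chosen for illustration, the range limit of 64 is an arbitrary assumption, and none of this is taken from the sits source code.

```r
# Hypothetical helper (not part of sits): checks three kinds of input
# that the autotest diagnostics exercise.
validate_args <- function(version = "v1", multicores = 2L,
                          start_date = "2020-01-01") {
  # Single character value, normalised so "V1" and "v1" behave the same
  stopifnot(is.character(version), length(version) == 1L)
  version <- tolower(version)

  # Single whole number inside an explicit range (upper limit is an assumption)
  stopifnot(
    is.numeric(multicores), length(multicores) == 1L, !is.na(multicores),
    multicores == round(multicores), multicores >= 1, multicores <= 64
  )

  # Strict YYYY-MM-DD format, then a calendar check through as.Date()
  if (!is.character(start_date) || length(start_date) != 1L ||
      !grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", start_date)) {
    stop("invalid date format: 'start_date' should follow YYYY-MM-DD")
  }
  parsed <- as.Date(start_date, format = "%Y-%m-%d")
  if (is.na(parsed)) {
    stop("invalid date: 'start_date' is not a valid calendar date")
  }

  list(version = version, multicores = as.integer(multicores),
       start_date = parsed)
}

# Upper-case input is accepted and normalised
str(validate_args(version = "V1", multicores = 4, start_date = "2020-01-31"))
```

Note that case dependence is not always a defect: parameters holding file paths, such as `output_dir` or `file`, are expected to be case dependent on case-sensitive file systems, so lower-casing them would be wrong.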

maelle commented Jul 18, 2023

@gilbertocamara regarding the output you mentioned in your comment #596 (comment), can you confirm it's gone? I don't see that exact error (I'm going through your comments chronologically).

Copy link

Dear @maelle, above you have shown the autotest output without running the actual tests. My comments above refer to the output with the parameter test set to TRUE. This is the result that counts.

maelle commented Jul 18, 2023

I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest

Good question. To me the best answer is currently, does it help? I opened an issue in autotest because I agree the documentation could be improved on this front ropensci-review-tools/autotest#83
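
For readers following along, here is a small sketch of how the test types can be listed and how the `test` parameter discussed above changes what autotest does. It assumes the autotest package is installed (the code is skipped otherwise) and uses `stats::var` as a small stand-in target rather than sits, purely to keep the run short.

```r
# Guarded so the script still runs where autotest is not installed
autotest_available <- requireNamespace("autotest", quietly = TRUE)

if (autotest_available) {
  # Catalogue of the test types autotest knows about, with descriptions
  types <- autotest::autotest_types()
  print(types)

  # test = FALSE only lists the tests that would be applied
  planned <- autotest::autotest_package(
    package = "stats", functions = "var", test = FALSE
  )

  # test = TRUE actually runs them and returns the resulting diagnostics
  results <- autotest::autotest_package(
    package = "stats", functions = "var", test = TRUE
  )
  print(results)
} else {
  message("autotest is not installed; skipping the example")
}
```

Replacing `"stats"` with `"sits"` (or with the path to a local source copy) and dropping `functions` would apply the same calls to the whole package, at the cost of a much longer run.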

maelle commented Jul 18, 2023

currently actually running the tests 😅 sorry about that

maelle commented Jul 18, 2023

Regarding the flagging of 2/18 functions, obviously I'll have a better idea once I have the results locally, but since autotest works by scraping examples, this might be due to different examples in these 2 functions?

maelle commented Jul 18, 2023

I updated the results I get. Are they the same as on your machine @gilbertocamara?

Copy link

Unfortunately, no. Please wait a little bit. Yesterday, we made some changes to sits trying to match the expectations of autotest. We are currently running the latest test. Please give me until the end of the morning BRT to provide you with an update.

mpadge commented Jun 7, 2024

@ropensci-review-bot assign @mikemahoney218 as reviewer

Copy link

@mikemahoney218 added to the reviewers list. Review due date is 2024-06-28. Thanks @mikemahoney218 for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

mpadge commented Jun 7, 2024

@ropensci-review-bot assign @paleolimbot as reviewer

mpadge commented Jun 7, 2024

@gilbertocamara Excited that we now have two reviewers for your package. Owing to its large size, the review period has been extended from the usual 3 weeks to 2 months.

gilbertocamara commented Jun 7, 2024

Dear @mpadge @paleolimbot @mikemahoney218
Many thanks to @mpadge for finding two reviewers and to @paleolimbot and @mikemahoney218 for agreeing to do it.

May I suggest to @paleolimbot and @mikemahoney218 that they start the software review by browsing the online book? The book is a self-learning document and takes a step-by-step approach with reproducible code.

Unlike most R packages, which are a collection of mostly independent functions, the sits API is an end-to-end solution for analysing big Earth observation data. Thus, the functions have to be executed in a well-defined order. Looking at the scripts in the online book will give one a better sense of how the package has been organised.
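
As a rough illustration of that ordering, the sketch below chains the main steps so that each one consumes the output of the previous one. `classify_example()` is a hypothetical wrapper written for this note. The function and dataset names follow the package's documented examples, but argument values such as the collection name are assumptions that differ between sits versions, so the wrapper is only defined here, not executed.

```r
# Hypothetical wrapper showing the order of operations in a sits workflow.
# Argument values are illustrative assumptions; check them against the
# installed sits version before running.
classify_example <- function(collection = "MOD13Q1-6",
                             output_dir = tempdir()) {
  stopifnot(requireNamespace("sits", quietly = TRUE))

  # 1. Build a data cube from the image files shipped with the package
  cube <- sits::sits_cube(
    source = "BDC",
    collection = collection,
    data_dir = system.file("extdata/raster/mod13q1", package = "sits")
  )

  # 2. Train a model on labelled time series samples
  model <- sits::sits_train(
    sits::samples_modis_ndvi, ml_method = sits::sits_rfor()
  )

  # 3. Classify the cube, producing class probabilities
  probs <- sits::sits_classify(
    data = cube, ml_model = model, output_dir = output_dir
  )

  # 4. Smooth the probabilities (Bayesian post-processing)
  smoothed <- sits::sits_smooth(cube = probs, output_dir = output_dir)

  # 5. Assign a final label to each pixel
  sits::sits_label_classification(cube = smoothed, output_dir = output_dir)
}
```

Calling `classify_example()` with sits installed would run the five steps in sequence. Swapping two steps, for instance smoothing before classifying, fails because each function expects the type of object produced by the step before it.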

Again, many thanks to @paleolimbot and @mikemahoney218 for your generosity.

Copy link

Hi @paleolimbot and @mikemahoney218, as the current EIC I'm just checking in to see how you're going with your reviews, which were due at the beginning of this month. Please just let us know if you need some more time so we can update this issue.

Copy link

I deeply, deeply apologize for how far behind I am on this one. It's now on top of my pile and I should have a review soon. Apologies again for the delay.

Copy link

Dear @mikemahoney218, many thanks for your time and effort! Please note that sits is now on version 1.5.2 on CRAN. Before looking at the code, may I suggest you browse the online book that describes the package? Some additional tips that might help you:

(1) sits differs from most traditional R packages, which are collections of functions that can be used independently.
It is an end-to-end solution for Earth observation (EO) land use and land cover classification. As such, there is a certain preferred order of function execution. Please see the Introduction chapter of the online book for more details.

(2) sits is more than a collection of machine learning algorithms for EO analytics. Much effort went into providing access to EO cloud collections and organising analysis-ready data (ARD) images into data cubes. In R, there is nothing similar. Even compared with EO-ML packages in Python, sits is the only software that allows easy access to cloud collections.

(3) To access EO cloud collections, sits works together with rstac, which is the main support in R for STAC collection access. rstac has also been developed by our team.

(4) A useful comparison table has been prepared by the Python TorchGEO team. It provides a fair appraisal of the ML capabilities of different EO-ML packages.

(5) As far as we know, the algorithms for SOM-based training data quality control and Bayesian inference for post-processing are only available in sits.

(6) Much thought and effort went into designing the sits API. Please see a comparison of the sits API and that of the Google Earth Engine.

Copy link

Apologies on my end too! I'll prioritize taking a look at this this week 🙂

Copy link

Dear @paleolimbot @mikemahoney218 any news on your review of the sits package? Many thanks in advance.

Copy link

Hi @gilbertocamara , and deep apologies for the radio silence here. I'm about 8-10 hours into my review so far and am still making progress. That's not to say I'm finding a lot of concerns, but rather reviewing the book and the package has become quite a bit more of a task than anticipated.

Sorry to not have more for you this instant, other than "I've started and am making progress". I'll post comments on the book once they're finished and then will follow with package-specific comments after that.

Copy link

Dear @mikemahoney218 Many thanks for your generous effort to review the package. Indeed, we understand there is a lot to review, which only underscores our gratitude for your time.

mpadge commented Oct 30, 2024

@paleolimbot Could you please respond to Gilberto's question above? Thanks

Copy link

Dear @mpadge, apologies for bothering you. Any news on the review?

Copy link

Apologies to all for not responding here for a very long time! As you may have suspected, my ability to put adequate time into a thorough review was really just not there, and I apologize for not responding earlier to at least say it 😬

I did take a severely time-boxed look through the book, package code, and issues this evening and I'm happy to share my thoughts (take them with a grain of salt given the amount of content that is here and the amount of time I spent reviewing it).

First, it's great! You have a well-organized repository with a number of contributors, a documented set of design principles, and files with names that make sense that contain reasonable amounts of code. Everything is documented (there's even a book, for crying out loud!), and the formatting of the code itself does not distract from its content. There's a clarity of vision here and that comes through in all aspects of the repo content.

One of the challenges with providing comments on a well-established package with several releases, users, and existing contributors, is that substantial comments about design are sort of unaddressable (because they would break compatibility for existing users). In addition to the fact that there is just a lot of content here to cover well, I have to admit that was also a factor in me taking so long to get here (i.e., do you need me!?).

The one challenge I will make here is one of scope: this is a package that does a lot of things. I don't think you should change that (for your existing users), but it may be worth thinking about whether there are some self-contained pieces of functionality that are useful outside of specifically satellite imagery or whose rate of change might differ from other parts of the package (e.g., collections of imagery are necessarily something that requires updating as the APIs progress or new collections become available; a data frame-based raster time series model has much broader applicability).
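
To make the "data frame-based time series model" concrete, here is a small base R sketch of one row per sample location with a nested time series in a list column. The column names are an assumption modelled on the sample tables described in the sits documentation, and the coordinates and NDVI values are invented purely for illustration.

```r
# Build one small time series as a data frame (dates plus one band)
make_series <- function(dates, ndvi) {
  data.frame(Index = as.Date(dates), NDVI = ndvi)
}

dates <- c("2013-09-14", "2013-09-30", "2013-10-16")

# One row per sample location; all values are made up for illustration
samples <- data.frame(
  longitude  = c(-55.2, -57.8),
  latitude   = c(-11.5, -9.8),
  start_date = as.Date(c("2013-09-14", "2013-09-14")),
  end_date   = as.Date(c("2013-10-16", "2013-10-16")),
  label      = c("Forest", "Pasture"),
  cube       = "MOD13Q1-6"
)

# List column: each cell holds the full time series for that location
samples$time_series <- list(
  make_series(dates, c(0.7, 0.8, 0.9)),
  make_series(dates, c(0.3, 0.4, 0.5))
)

# A structural check on the expected columns (names assumed, see above)
required <- c("longitude", "latitude", "start_date", "end_date",
              "label", "cube", "time_series")
stopifnot(all(required %in% colnames(samples)))

# Ordinary data frame tools work on the nested series
mean_ndvi <- vapply(samples$time_series, function(ts) mean(ts$NDVI), numeric(1))
print(data.frame(label = samples$label, mean_ndvi = mean_ndvi))
```

Nothing in this structure is specific to satellite imagery: any set of labelled locations with an attached time series fits the same shape, which is the broader applicability the comment above points to.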

Again, apologies for the inexcusably late (and partial/cursory) review 😬

mpadge commented Jan 30, 2025

Thanks @paleolimbot for your response, and we appreciate your explanations. @mikemahoney218 How are things looking from your side?

Copy link

Dear @paleolimbot, many thanks for your kind comments. Regarding the idea of breaking sits into smaller, self-contained packages, it is a question that the developers have discussed a lot. Ultimately, we decided to keep sits together for the following reasons:

  • sits is an operational package for end-users, providing an end-to-end workflow. Many of these users have limited knowledge of R, so it is easier for them to follow examples such as those in the book. They don't have the mindset of R experts, who know how to combine pieces from different packages.
  • Considering the challenges of documentation, software maintenance, and dealing with the CRAN policy, it is easier to have a single package. Also, there is a single point of communication between users and developers, which is the issues interface on GitHub.
  • There is also the issue of branding. Many prospective sits users are not part of the R community. In fact, most EO (Earth observation) package developers work in Python. To attract non-R experts to use sits, we must build a reputation in the global EO community. Arguably, having a single brand (sits) projects a stronger image to the global community than a number of smaller packages. Even Hadley Wickham acknowledged this problem when he created the tidyverse brand.
  • Finally, there is the question of the book. As you have noticed, we put a lot of effort into producing a book which is not simply a description of the package's functions. It is a guide on how to use big EO data for land classification, where the actions are illustrated by examples using sits. The inspiration for the book was the work "R for Data Science" by Wickham and Grolemund.

So, I hope to have managed to explain why sits is a single package. Many thanks again for your time and effort.

mpadge commented Jan 30, 2025

Thanks @gilbertocamara. Please accept once again my apologies for this process taking so long. You are nevertheless very aware that this really is an enormous package to review. I have some ideas for how we can nevertheless proceed here, but will just wait for an update from @mikemahoney218.

Copy link

Dear @gilbertocamara and @mpadge this is just to mark the start of my EiC rotation. As a note to myself, I see the communication is fluid and the last interaction was very recent. Things are taking long but it's expected given the size of the package. I'll step back and let things flow. Thanks!

Copy link

Dear @maurolepore, many thanks for your contribution!! Since you are an expert on the use of geospatial data for forest studies, it seems relevant to inform you that the sits package is being used operationally in Brazil to support the production of official maps of LUCC classification in the Amazon and Cerrado biomes by INPE (Brazil's National Institute for Space Research).

mpadge commented Feb 12, 2025

@gilbertocamara We just posted this message in our semi-private slack, repeating here for transparency, with further detail for you following that:

Attention all #spatial people: We at rOpenSci are planning to conduct our first ever Review by Committee. We have a submission which has proved simply too large to be reviewed by single reviewers. This is an open invitation for anybody keen to help the future of spatial and temporal analysis in R to be part of our review committee. Please indicate potential times of availability at We'll decide on a time by end of coming week (Fri 21st Feb), and email all those who can attend with further details.

Review process will be something like:

  1. A week or so prior, we'll circulate a document with numerous, manageably small review tasks focussing on particular aspects of software design and documentation, and either ask people to select tasks, or we'll randomly assign to ensure everything gets covered.
  2. We'll ask and expect each person to devote an hour or so at some stage prior to the meeting to having a look through code and/or documentation in the context of their tasks.
  3. The committee meeting itself will be 90 minutes long, held via zoom. With approval of all, we'll use an automated service to generate a full transcript, so nobody will need to take notes.
  4. We'll ask for one or two volunteers to then spend an additional hour or two converting the notes into an actual review which can be pasted into the GitHub issue thread.

Please also note that package authors will also be invited to attend, and are in Brazil, so UTC minus 3 hours. If possible, we'll preferentially select times at which they can attend. We may also schedule two reviews if unable to fit with everybody.

We're excited about trialling a new form of software review, and hope also to learn a lot about the process from all who attend. Thanks in advance!

Additional Information

This notion of review by committee is similar to the Live Reviews of As with that scheme, you are definitely invited to attend, although the review will proceed even if you can not. Please ensure that this invitation also reaches any co-authors who may also wish to attend.

Reflective of rOpenSci's constructive and non-adversarial review processes, the review-by-committee will focus on ways by which you might improve aspects of software design or documentation. It will primarily be an opportunity for you as package authors to engage a committee of people with (presumably) no previous familiarity with the SITS package to consider as many aspects as we can. I'll contact you privately and ask you to send a list of aspects of your package which:

  • you would particularly like to be considered,
  • about which you have concerns or uncertainties,
  • or which you think might gain the most from review consideration

Please once again accept my sincere apologies for the slow review process thus far. I think this approach will both generate a maximally useful review for you, and will expand general processes of software peer review at rOpenSci. Thanks!

