mirai

Minimalist Async Evaluation Framework for R

mirai — future in Japanese — allows you to perform computationally intensive tasks without blocking the R session.

→ Run R code in the background, with results available once ready

→ Distribute workloads across local or remote machines

→ Execute tasks on different compute resources based on requirements

→ Perform actions as soon as tasks complete via promises


Installation

install.packages("mirai")

Quick Start

mirai(): Evaluate an R expression asynchronously in a parallel process.

daemons(): Set and launch persistent background processes, local or remote, on which to run mirai tasks.

library(mirai)

daemons(5)
#> [1] 5

m <- mirai({
  Sys.sleep(1)
  100 + 42
})

mp <- mirai_map(1:9, \(x) {
  Sys.sleep(1)
  x^2
})

m
#> < mirai [] >
m[]
#> [1] 142

mp
#> < mirai map [4/9] >
mp[.flat]
#> [1]  1  4  9 16 25 36 49 64 81

daemons(0)
#> [1] 0

Design Concepts

mirai is designed from the ground up to provide a production-grade experience.

→ Modern

  • Current technologies built on nanonext and NNG
  • Communications layer supports IPC (Inter-Process Communication), TCP/IP and TLS

→ Efficient

  • 1,000x more responsive than alternatives [1]
  • Ideal for low-latency applications e.g. real time inference & Shiny apps

→ Reliable

  • No reliance on global options or variables for consistent behaviour
  • Explicit evaluation for transparent and predictable results

→ Scalable

  • Capacity for millions of tasks over thousands of connections
  • Proven track record for heavy-duty workloads in the life sciences industry

Key Features

→ Distributed Execution: Run tasks across networks and clusters using various deployment methods (SSH, HPC clusters using Slurm, SGE, Torque, PBS, LSF, etc.)
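As a sketch of remote deployment, daemons may be launched over SSH by supplying a remote configuration; the IP address below is purely illustrative:

```r
library(mirai)

# Launch 2 daemons on a remote machine over SSH (address is illustrative).
# host_url() constructs a URL at which the local host listens for daemons.
daemons(
  n = 2,
  url = host_url(),
  remote = ssh_config(remotes = "ssh://10.75.32.90")
)

# This task is evaluated on one of the remote daemons:
m <- mirai(Sys.info()[["nodename"]])
m[]

daemons(0)  # reset
```

HPC deployment follows the same pattern, with a cluster resource manager configuration supplied in place of the SSH one.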

→ Compute Profiles: Manage different sets of daemons independently, allowing tasks with different requirements to be executed on appropriate resources.
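For example, two profiles can run side by side; the profile name "gpu" below is an arbitrary label chosen for illustration:

```r
library(mirai)

daemons(4)                    # default profile: general-purpose pool
daemons(1, .compute = "gpu")  # separate profile for GPU-bound tasks

m1 <- mirai(sum(rnorm(1e6)))                 # runs on the default pool
m2 <- mirai(Sys.getpid(), .compute = "gpu")  # runs on the 'gpu' pool
m1[]
m2[]

daemons(0)                    # reset each profile independently
daemons(0, .compute = "gpu")
```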

→ Promises Integration: An event-driven implementation performs actions on returned values as soon as tasks complete, ensuring minimal latency.
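A minimal sketch of attaching a promise to a mirai, using the promises package:

```r
library(mirai)
library(promises)

daemons(1)

m <- mirai({
  Sys.sleep(1)
  "task complete"
})

# The action fires as soon as the mirai resolves; no polling is involved.
# In an interactive session or a Shiny app, the message appears once the
# task finishes, without blocking the session.
p <- then(m, onFulfilled = function(value) message(value))
```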

→ Serialization Support: Native serialization support for reference objects such as Arrow Tables, Polars DataFrames or torch tensors.
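As a sketch, assuming the torch package is installed, a serialization configuration registers torch's own reference-safe functions so tensors cross the process boundary transparently:

```r
library(mirai)

# Register native serialization for torch tensors using torch's
# torch_serialize() / torch_load() pair (class name, then the
# serialization and unserialization functions).
cfg <- serial_config("torch_tensor", torch::torch_serialize, torch::torch_load)

daemons(1, serial = cfg)

m <- mirai(torch::torch_rand(3) * 2)  # a tensor is returned intact
m[]

daemons(0)
```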

→ Error Handling: Robust error handling and reporting, with full stack traces for debugging.
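For instance, an error raised in a task is caught and returned as a 'miraiError' rather than crashing the session:

```r
library(mirai)

daemons(1)

# The error is captured and returned as the mirai's value.
m <- mirai(stop("something went wrong"))
m[]

# Distinguish errors from ordinary results:
is_mirai_error(m$data)  # TRUE

daemons(0)
```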

→ RNG Management: L’Ecuyer-CMRG RNG streams for reproducible random number generation in parallel execution.

Powering the Ecosystem

mirai serves as a foundation for asynchronous and parallel computing in the R ecosystem:

R parallel   Implements the first official alternative communications backend for R — the ‘MIRAI’ parallel cluster — fulfilling a feature request by R-Core at R Project Sprint 2023.
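A 'MIRAI' cluster is a drop-in replacement for other parallel cluster types, as a brief sketch shows:

```r
library(mirai)
library(parallel)

cl <- make_cluster(2)             # a 'miraiCluster'
parLapply(cl, 1:4, function(x) x^2)
stop_cluster(cl)
```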

purrr   Powers parallel map for the purrr functional programming toolkit, a core tidyverse package.
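A minimal sketch, assuming purrr >= 1.1.0 which introduced in_parallel():

```r
library(mirai)
library(purrr)

daemons(4)

# in_parallel() marks a function for execution on the mirai daemons.
map_dbl(1:8, in_parallel(\(x) x^2))

daemons(0)
```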

promises   Promises for ‘mirai’ and ‘mirai_map’ objects are event-driven, providing the lowest latency and highest responsiveness for performance-critical applications.

Shiny   The primary async backend for Shiny, with full ExtendedTask support, providing the next level of responsiveness and scalability for Shiny apps.

plumber2   The built-in async evaluator behind the @async tag in plumber2; also provides an async backend for Plumber.

torch   Allows Torch tensors and complex objects such as models and optimizers to be used seamlessly across parallel processes.

Arrow   Allows queries using the Apache Arrow format to be handled seamlessly over ADBC database connections hosted in background processes.

Polars   R Polars is a pioneer of mirai’s serialization registration mechanism, which allows transparent use of Polars objects across parallel processes, with no user setup required.

targets   Targets, a make-like pipeline tool, uses crew as its default high-performance computing backend. Crew is a distributed worker launcher extending mirai to different computing platforms, from traditional clusters to cloud services.

Thanks

We would like to thank in particular:

Will Landau for being instrumental in shaping development of the package, from initiating the original request for persistent daemons, through to orchestrating robustness testing for the high performance computing requirements of crew and targets.

Joe Cheng for integrating the ‘promises’ method to work seamlessly within Shiny, and prototyping event-driven promises.

Luke Tierney of R Core, for discussion on L’Ecuyer-CMRG streams to ensure statistical independence in parallel processing, and making it possible for mirai to be the first ‘alternative communications backend for R’.

Travers Ching for a novel idea in extending the original custom serialization support in the package.

Hadley Wickham for original implementations of the scoped helper functions, on which ours are based.

Henrik Bengtsson for valuable insights leading to the interface accepting broader usage patterns.

Daniel Falbel for discussion around an efficient solution to serialization and transmission of torch tensors.

Kirill Müller for discussion on using parallel processes to host Arrow database connections.

Links & References

◈ mirai R package: https://mirai.r-lib.org/
◈ nanonext R package: https://nanonext.r-lib.org/

mirai is listed in the CRAN High Performance Computing Task View:
https://cran.r-project.org/view=HighPerformanceComputing

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
