The futurize package makes it extremely simple to parallelize your
existing map-reduce calls, as well as a growing set of domain-specific
calls. All you need to know is that there is a single function called
futurize() that will take care of everything, e.g.
y <- lapply(x, fcn) |> futurize()
y <- map(x, fcn) |> futurize()
b <- boot(city, ratio, R = 999) |> futurize()

The futurize() function parallelizes via futureverse, meaning
your code can take advantage of any supported future backends,
whether it be parallelization on your local computer, across multiple
computers, in the cloud, or on a high-performance compute (HPC) cluster.
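
As a rough sketch of what switching backends can look like: the futurized call stays the same and only the plan() line changes. Here plan(), sequential, multisession, and cluster come from the future package, which is loaded explicitly for clarity. The host names n1 and n2 are hypothetical placeholders, so that line is left commented out.

```r
library(future)
library(futurize)

xs <- 1:10

# Run sequentially (no parallelization)
plan(sequential)
y_seq <- lapply(xs, sqrt) |> futurize()

# Run on two background R sessions on the local computer
plan(multisession, workers = 2)
y_par <- lapply(xs, sqrt) |> futurize()

# Hypothetical: run on two other machines reachable over SSH
# plan(cluster, workers = c("n1", "n2"))

# The result should not depend on the backend
stopifnot(identical(y_seq, y_par))

# Switch back to sequential processing, which also shuts down the workers
plan(sequential)
```

The design point is that the choice of backend lives in one place, the plan() call, rather than in every parallelized expression.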
The futurize package has only one hard dependency: the
future package. All other dependencies are optional "buy-in"
dependencies, as shown in the tables below.
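
Because these dependencies are optional, it can be handy to check up front whether the ones you need are installed. The following is a minimal sketch that uses only base R's requireNamespace(). The package names are those listed in the "Requires" column of the tables in this document, and the variable names are illustrative.

```r
# Packages listed in the "Requires" column of the tables in this document
needed <- c("future.apply", "furrr", "doFuture")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report the optional packages that are not yet installed, if any
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Consider installing: ", paste(missing_pkgs, collapse = ", "))
}
```

Only the packages required by the functions you actually futurize need to be installed.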
In addition to getting access to all future-based parallel backends,
by using futurize() you also get access to all the benefits that
come with futureverse, including structured concurrency. For
example, it ensures that remaining parallel tasks are cancelled if
there is an error or an interrupt. Also, if the function you
parallelize outputs messages and warnings, they will be relayed from
the parallel worker to your main R session, just as you get when
running sequentially. This is particularly useful when troubleshooting
or debugging.
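
To illustrate the relaying of messages and warnings, here is a small sketch. The function slow_sqrt() is a made-up example, not part of any package, and the multisession backend with two workers is just one possible choice.

```r
library(future)
library(futurize)

plan(multisession, workers = 2)

# Hypothetical function that signals a message for every element
# and a warning for negative input
slow_sqrt <- function(x) {
  message("Processing x = ", x)
  if (x < 0) warning("Negative input: ", x)
  sqrt(abs(x))
}

# The messages and the warning are signaled on the parallel workers,
# and are expected to be relayed to the main R session
ys <- lapply(c(4, -9, 16), slow_sqrt) |> futurize()

# Switch back to sequential processing, which also shuts down the workers
plan(sequential)
```

Running the same lapply() call without futurize() should produce the same messages and warning, which makes it easy to compare sequential and parallel runs when debugging.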
Using futurize comes with a zero-risk buy-in. If there is ever a
parallel universe where futurize() suddenly stops working, setting
futurize <- identity avoids rewrites while making all code run
sequentially.
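
A minimal sketch of this fallback, using only base R: identity() returns its argument unchanged, so a piped call is evaluated as an ordinary sequential call and futurize() becomes a no-op.

```r
# Fallback: turn futurize() into a no-op
futurize <- identity

xs <- 1:10

# Evaluated as futurize(lapply(xs, sqrt)), i.e. a plain sequential lapply()
ys <- lapply(xs, sqrt) |> futurize()

# Same result as the call without the pipe
stopifnot(identical(ys, lapply(xs, sqrt)))
```

Removing the assignment, or restarting R and loading the package again, restores the parallel behavior without touching the rest of the code.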
The futurize package supports transpilation of functions from multiple packages. The tables below summarize the supported map-reduce (Table 1) and domain-specific (Tables 2 and 3) functions, respectively. To programmatically see which packages are currently supported, use:
futurize_supported_packages()

To see which functions are supported for a specific package, use:

futurize_supported_functions("caret")

| Package | Functions | Requires |
|---|---|---|
| base | lapply(), sapply(), tapply(), vapply(), mapply(), .mapply(), Map(), eapply(), apply(), by(), replicate(), Filter() | future.apply |
| stats | kernapply() | future.apply |
| purrr | map() and variants, map2() and variants, pmap() and variants, imap() and variants, modify(), modify_if(), modify_at(), map_if(), map_at() | furrr |
| crossmap | xmap() and variants, xwalk(), map_vec(), map2_vec(), pmap_vec(), imap_vec() | - |
| foreach | %do%, e.g. foreach() %do% { }, times() %do% { } | doFuture |
| plyr | aaply() and variants, ddply() and variants, llply() and variants, mlply() and variants | doFuture |
| pbapply | pblapply(), pbsapply() and variants, pbby(), pbreplicate() and pbwalk() | future.apply |
| BiocParallel | bplapply(), bpmapply(), bpvec(), bpiterate(), bpaggregate() | doFuture |
Table 1: Map-reduce functions currently supported by futurize() for parallel transpilation.
Here are some examples:
library(futurize)
plan(multisession)
xs <- 1:10
ys <- lapply(xs, sqrt) |> futurize()
xs <- 1:10
ys <- purrr::map(xs, sqrt) |> futurize()
xs <- 1:10
ys <- crossmap::xmap_dbl(xs, ~ .y * .x) |> futurize()
library(foreach)
xs <- 1:10
ys <- foreach(x = xs) %do% { sqrt(x) } |> futurize()
xs <- 1:10
ys <- plyr::llply(xs, sqrt) |> futurize()
xs <- 1:10
ys <- pbapply::pblapply(xs, sqrt) |> futurize()
xs <- 1:10
ys <- BiocParallel::bplapply(xs, sqrt) |> futurize()

and

ys <- replicate(3, rnorm(1)) |> futurize()
y <- by(warpbreaks, warpbreaks[,"tension"],
        function(x) lm(breaks ~ wool, data = x)) |> futurize()
xs <- EuStockMarkets[, 1:2]
k <- kernel("daniell", 50)
xs_smooth <- stats::kernapply(xs, k = k) |> futurize()

You can also futurize calls from a growing set of domain-specific CRAN and Bioconductor packages that have optional built-in support for parallelization.
| Package | Functions | Requires |
|---|---|---|
| boot | boot(), censboot(), tsboot() | - |
| caret | bag(), gafs(), nearZeroVar(), rfe(), safs(), sbf(), train() | doFuture |
| DiceKriging | km() | doFuture |
| ez | ezBoot(), ezPerm(), ezPlot2() | doFuture |
| fwb | fwb(), vcovFWB() | - |
| gamlss | add1All(), add1TGD(), drop1All(), drop1TGD(), gamlssCV() | - |
| glmmTMB | profile() for 'glmmTMB' | - |
| glmnet | cv.glmnet() | doFuture |
| kernelshap | kernelshap(), permshap() | doFuture |
| lme4 | allFit(), bootMer(), influence() and profile() for 'merMod' | - |
| metafor | profile(), rstudent(), cooks.distance(), dfbetas() for 'rma' | - |
| mgcv | bam(), predict() for 'bam' | - |
| modelsummary | modelsummary(), msummary(), modelplot() | future.apply |
| parameters | bootstrap_model(), bootstrap_parameters() | - |
| partykit | cforest(), ctree_control(), mob_control(), varimp() for 'cforest' | future.apply |
| pls | mvr(), plsr(), pcr(), cppls(), crossval() | - |
| pvclust | pvclust() | - |
| riskRegression | Score() for 'list' | doFuture |
| rugarch | arfimacv(), arfimadistribution(), arfimaroll(), autoarfima(), multifilter(), multifit(), multiforecast(), ugarchboot(), ugarchdistribution(), ugarchroll() | - |
| sandwich | vcovBS(), vcovJK() | future.apply |
| seriation | seriate_best(), seriate_rep() | doFuture |
| shapr | explain(), explain_forecast() | - |
| Sim.DiffProc | MCM.sde() | - |
| SimDesign | runSimulation(), runArraySimulation() | - |
| stars | st_apply() | future.apply |
| strucchange | breakpoints() for 'formula' | doFuture |
| SuperLearner | CV.SuperLearner() | - |
| tm | TermDocumentMatrix(), tm_index(), tm_map() | - |
| TSP | solve_TSP() | doFuture |
| vegan | adonis(), adonis2(), anova() for 'cca', anosim(), cascadeKM(), estaccumR(), mantel(), mantel.partial(), metaMDSiter(), mrpp(), oecosimu(), ordiareatest(), permutest() for 'betadisper' and 'cca' | - |
Table 2: CRAN packages with domain-specific functions currently
supported by futurize() for parallel transpilation.
Here are some examples:
ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
b <- boot::boot(boot::city, ratio, R = 999, stype = "w") |> futurize()
ctrl <- caret::trainControl(method = "cv", number = 10)
model <- caret::train(Species ~ ., data = iris, method = "rf", trControl = ctrl) |> futurize()
rt <- ez::ezBoot(data = ANT, dv = rt, wid = subnum, within = .(cue, flank), between = group) |> futurize()
f <- fwb::fwb(boot::city, ratio, R = 999) |> futurize()
m <- DiceKriging::km(~., design = design, response = response, multistart = 8L) |> futurize()
cv <- gamlss::gamlssCV(y ~ pb(x), data = abdom, K.fold = 10) |> futurize()
cv <- glmnet::cv.glmnet(x, y) |> futurize()
ks <- kernelshap::kernelshap(model, X = x_explain, bg_X = bg_X) |> futurize()
m <- lme4::allFit(models) |> futurize()
fit <- metafor::rma(yi, vi)
pr <- profile(fit) |> futurize()
b <- mgcv::bam(y ~ s(x0, bs = bs) + s(x1, bs = bs), data = dat) |> futurize()
fit <- parameters::bootstrap_model(model, iterations = 1000) |> futurize()
cf <- partykit::cforest(dist ~ speed, data = cars) |> futurize()
m <- pls::plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "CV") |> futurize()
fit <- pvclust::pvclust(mtcars, nboot = 1000) |> futurize()
v <- sandwich::vcovBS(fm) |> futurize()
sc <- riskRegression::Score(list("CSC" = fit), data = d,
                            formula = Hist(time, event) ~ 1, times = 5, B = 100,
                            split.method = "bootcv") |> futurize()
roll <- rugarch::ugarchroll(spec, sp500ret, n.start = 1000,
                            refit.window = "moving", refit.every = 100) |> futurize()
result <- shapr::explain(model, x_explain, x_train, approach = "empirical", phi0 = phi0) |> futurize()
o <- seriation::seriate_best(d_supreme) |> futurize()
res <- Sim.DiffProc::MCM.sde(model, statistic = stat, R = 100) |> futurize()
res <- SimDesign::runSimulation(Design, replications = 1000,
                                generate = Generate, analyse = Analyse, summarise = Summarise) |> futurize()
s <- stars::st_as_stars(matrix(1:20, nrow = 5, ncol = 4))
res <- stars::st_apply(s, MARGIN = 1, FUN = mean) |> futurize()
bp <- strucchange::breakpoints(Nile ~ 1) |> futurize()
res <- SuperLearner::CV.SuperLearner(Y, X, SL.library = SL.library) |> futurize()
m <- tm::tm_map(crude, content_transformer(tolower)) |> futurize()
tour <- TSP::solve_TSP(USCA50, method = "nn", rep = 10) |> futurize()
md <- vegan::mrpp(dune, Management) |> futurize()

| Package | Functions | Requires |
|---|---|---|
| DESeq2 | DESeq(), lfcShrink(), results() | doFuture |
| fgsea | fgsea(), fgseaMultilevel(), fgseaSimple(), fgseaLabel(), geseca(), gesecaSimple(), collapsePathwaysGeseca() | doFuture |
| GenomicAlignments | summarizeOverlaps() | doFuture |
| GSVA | gsva(), gsvaRanks(), gsvaScores(), spatCor() | doFuture |
| Rsamtools | countBam(), scanBam() | doFuture |
| scater | calculatePCA(), calculateTSNE(), calculateUMAP(), runPCA(), runTSNE(), runUMAP(), runColDataPCA(), nexprs(), getVarianceExplained(), plotRLE() | doFuture |
| scuttle | calculateAverage(), logNormCounts(), normalizeCounts(), perCellQCMetrics(), perFeatureQCMetrics(), addPerCellQCMetrics(), addPerFeatureQCMetrics(), addPerCellQC(), addPerFeatureQC(), numDetectedAcrossCells(), numDetectedAcrossFeatures(), sumCountsAcrossCells(), sumCountsAcrossFeatures(), summarizeAssayByGroup(), aggregateAcrossCells(), aggregateAcrossFeatures(), librarySizeFactors(), computeLibraryFactors(), geometricSizeFactors(), computeGeometricFactors(), medianSizeFactors(), computeMedianFactors(), pooledSizeFactors(), computePooledFactors(), fitLinearModel() | doFuture |
| SingleCellExperiment | applySCE() | doFuture |
| sva | ComBat(), read.degradation.matrix() | doFuture |
Table 3: Bioconductor packages with domain-specific functions
currently supported by futurize() for parallel transpilation.
Here are some examples:
dds <- DESeq2::DESeq(dds) |> futurize()
res <- fgsea::fgsea(pathways, stats) |> futurize()
se <- GenomicAlignments::summarizeOverlaps(features, bam_files) |> futurize()
es <- GSVA::gsva(GSVA::gsvaParam(expr, geneSets)) |> futurize()
counts <- Rsamtools::countBam(bamViews) |> futurize()
sce <- scater::runPCA(sce) |> futurize()
qc <- scuttle::perFeatureQCMetrics(sce) |> futurize()
result <- SingleCellExperiment::applySCE(sce, scuttle::perFeatureQCMetrics) |> futurize()
adjusted <- sva::ComBat(dat = dat, batch = batch) |> futurize()