furrr works perfectly with future.batchtools. If you have a loop with 3 elements you get 3 jobs on the cluster:
library(furrr)
library(future.batchtools)
plan(batchtools_sge)
nothingness <- future_map(c(2, 2, 2), ~Sys.sleep(.x))
job-ID  prior    name        user  state  submit/start at      queue                   slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14375   0.50500  jobb856749  fred  r      06/03/2023 22:23:03  all.q@ip-xxxec2.inte    1
14376   0.50500  job2b8fc0d  fred  r      06/03/2023 22:23:03  all.q@ip-1xxx.ec2.inte  1
14377   0.50500  jobfdf28f1  fred  r      06/03/2023 22:23:03  all.q@ip-xx.ec2.inte    1
With foreach, however, I only get 1 job:
library(foreach)
library(future)
library(furrr)
library(future.batchtools)
library(doFuture)
mu <- 1.0
sigma <- 2.0
registerDoFuture()
plan(batchtools_sge)
x %<-% {
  foreach(i = 1:3) %dopar% {
    Sys.sleep(3)
    set.seed(123)
    rnorm(i, mean = mu, sd = sigma)
  }
}
job-ID  prior    name        user  state  submit/start at      queue                  slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14378   0.50500  job7472e4f  fred  r      06/03/2023 22:26:03  all.q@ip-xxx.ec2.inte  1
and when it returns, I get the following two strange warnings:
Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: UNRELIABLE VALUE: Future ( ) unexpectedly generated random numbers without specifying argument 'seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".
Why does it say there is no parallel backend registered when I'm running registerDoFuture()?
Alternatively, I tried %dofuture% instead of %dopar% but it still only generates 1 job.
x %<-% {
  foreach(i = 1:10) %dofuture% {
    Sys.sleep(3)
    set.seed(123)
    rnorm(i, mean = mu, sd = sigma)
  }
}
f = futureOf(x)
resolved(f)
x
job-ID  prior    name        user  state  submit/start at      queue         slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14379   0.50500  job4b3cd22  fred  r      06/03/2023 22:28:48  all.q@ipxxx   1
This time I only get the random number warning:
Warning message:
UNRELIABLE VALUE: At least one of iterations 1-10 of the foreach() %dofuture% { … }, part of chunk #1 ( doFuture2-1 ), unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify foreach() argument '.options.future = list(seed = TRUE)'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, set option 'doFuture.rng.onMisuse' to "ignore".
I also tried other foreach options, such as .options.future = list(scheduling = 1), but they don't seem to have any effect.
Is it possible that foreach somehow chunks all the iterations into one task? Or is something else not working?
Thanks
Fred