Hi
I'm trying to use the 'future' package to parallelize for-loops. However, when I timed the sequential plan against the multiprocess plan, the parallel run was no faster (it was actually slightly slower), and I wonder why.
It could be something in my code or a gap in my understanding; either way, I'm here to learn :)
In the code I call the FindAllMarkers function (from the Seurat package) on a subset of my data.
A snippet of my code:
Sequential - took 33 seconds
plan(sequential)
p <- list()
start_time <- Sys.time()
# Run the same FindAllMarkers() call three times
for (i in seq_len(3)) {
  markers <- FindAllMarkers(CD3, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25, return.thresh = 0.05)
  # print(p[i])
}
end_time <- Sys.time()
end_time-start_time
Console:
Calculating cluster 1
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 3
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02s
Calculating cluster 5
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=05s
Calculating cluster 1
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 3
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 5
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=05s
Calculating cluster 1
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 3
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02s
Calculating cluster 5
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=05s
end_time <- Sys.time()
end_time-start_time
Time difference of 32.96481 secs
Multiprocess - took 38 seconds
plan("multiprocess", workers=6)
p <- list()
start_time <- Sys.time()
# Run the same FindAllMarkers() call three times
for (i in seq_len(3)) {
  markers <- FindAllMarkers(CD3, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25, return.thresh = 0.05)
  # print(p[i])
}
end_time <- Sys.time()
end_time-start_time
Console:
Calculating cluster 1
Calculating cluster 3
Calculating cluster 5
Calculating cluster 1
Calculating cluster 3
Calculating cluster 5
Calculating cluster 1
Calculating cluster 3
Calculating cluster 5
end_time <- Sys.time()
end_time-start_time
Time difference of 38.50585 secs
The first thing that stands out to me in the console output is that the multiprocess run still appears to process the clusters sequentially.
If more information is needed, I'd be happy to share.
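To make the question concrete, here is a minimal sketch of what I suspect is going on; it is not a confirmed diagnosis. My assumption is that plan() only changes how futures are resolved, so a plain for-loop still runs its iterations one after another unless something inside the loop creates futures. slow_task() is a hypothetical stand-in for an expensive call such as FindAllMarkers, and future_lapply() comes from the future.apply package, which I did not use in my code above. I use multisession here because multiprocess is deprecated in recent versions of future.

```r
library(future)
library(future.apply)

# Hypothetical stand-in for an expensive call such as FindAllMarkers()
slow_task <- function(i) {
  Sys.sleep(1)
  i^2
}

# Assumption: three background R sessions are available on this machine
plan(multisession, workers = 3)

# A plain for-loop creates no futures, so the plan has no effect on it:
# the three iterations run one after another in the main session
t_loop <- system.time({
  res_loop <- vector("list", 3)
  for (i in seq_len(3)) res_loop[[i]] <- slow_task(i)
})[["elapsed"]]

# future_lapply() wraps the iterations in futures, which the plan can
# hand out to the background workers
t_par <- system.time({
  res_par <- future_lapply(seq_len(3), slow_task)
})[["elapsed"]]

# Return to sequential processing and shut down the workers
plan(sequential)

print(c(loop = t_loop, future_lapply = t_par))
```

If this reasoning is right, the loop should take at least three seconds regardless of the plan, while the future_lapply() version can overlap the three calls, minus whatever overhead the workers add.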
Thanks very much!!