Cascade effect in vCPU usage using future. #481

@juliombarros

Hello!
I am using future_pmap with the multicore plan on a machine running Ubuntu to compute atmospheric trajectories of air parcels (using the HYSPLIT model and the related package splitr). I set up my code so that I map across each air parcel. I would add a reproducible example here, but I can't think of something equally intensive.

I'm seeing very weird behavior in the CPU usage: after running at 100% of the vCPUs for a while, usage starts to drop and keeps dropping until the end of the process. By the end of this domino effect, CPU usage is close to 0% and the computation of each air parcel trajectory becomes very sluggish.
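To put numbers on the slowdown, each task could record its own start time and elapsed time. The following is only a minimal sketch: `with_timing` and `slow_task` are hypothetical names that are not part of my code, and `slow_task` merely stands in for the real per-parcel model call.

```r
# Wrap any function so that each call also reports when it started,
# how long it took (wall-clock seconds), and which process ran it.
with_timing <- function(f) {
  function(...) {
    started <- Sys.time()
    # system.time() evaluates the assignment in this function's frame,
    # so `value` is available afterwards.
    elapsed <- system.time(value <- f(...))[["elapsed"]]
    list(value = value, started = started, elapsed = elapsed,
         pid = Sys.getpid())
  }
}

# Hypothetical stand-in for the real per-parcel model call.
slow_task <- function(x) {
  Sys.sleep(0.01)
  x * 2
}

timed_task <- with_timing(slow_task)
results <- lapply(1:3, timed_task)

# Elapsed seconds per task, in task order.
elapsed <- vapply(results, function(r) r$elapsed, numeric(1))
```

Plotting `elapsed` against `started` for the real tasks should show whether individual trajectories get slower over time or whether the workers are simply sitting idle.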

This picture might give a hint of what is happening. It shows two processes separated by a couple of hours (they are the only ones; I am not running anything else). The first one runs fine, but takes longer than it would if all the vCPUs were in use throughout. The second one didn't finish, maybe because of that big slump around hour 20. At that time, the code automatically downloaded a couple of files from NOAA's website that were missing from the directory. That is, it added another process (wget) to what was already running.
image

Each of the processes above corresponds to one year, and for each year I run the following:

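# Packages used below: dplyr (relocate, filter), lubridate (year),
# furrr (future_pmap), future (plan), tictoc (tic, toc).
library(dplyr)
library(lubridate)
library(furrr)
library(future)
library(tictoc)

# hysplit_model(), HALF_LIFE_BC and HALF_LIFE_OC are defined elsewhere.
run_model_year <- function(inp_data, yr) {

  #' Run the HYSPLIT model on a yearly basis
  #'
  #' @param inp_data `data.frame` containing date, time, lat, long, BCSMASS, OCCMASS
  #' @param yr `integer` of the year to run

  # Put the six model inputs first and keep only the rows for the requested year
  df_run <- inp_data %>%
    relocate(acq_date, acq_time, latitude, longitude, BCSMASS, OCCMASS) %>%
    filter(year(acq_date) == yr)

  tic() ## START

  plan(multicore)
  # One HYSPLIT run per row (air parcel); ..1 to ..6 are the first six columns
  hysplit_runs <- future_pmap(df_run,
                              ~hysplit_model(..1, ..2, ..3, ..4, ..5, ..6,
                                             HALF_LIFE_BC, HALF_LIFE_OC),
                              .progress = TRUE)
  toc() ## STOP

  # Return the results without printing them at the top level
  invisible(hysplit_runs)
}

run_model_year(df, yr = 2016)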

Here hysplit_model is just a wrapper around the splitr pipeline create_trajectory_model() %>% add_trajectory_params(...) %>% run_model().
By the way, when I use top to check what the machine is running while future is not using 100% of the vCPUs, I see many of the hysplit executables in state "D" (uninterruptible sleep).
I'll add a picture here when I get to that point again.
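In the meantime, the same information could be collected from R instead of reading top by eye. This is a rough sketch that assumes a Linux `ps` (procps) is on the PATH; `parse_ps_states` and `current_process_states` are hypothetical helper names, and the pattern `"hyts"` is only my guess at how the HYSPLIT executables show up in the process list.

```r
# Pure helper: tally process states from lines formatted as "<STAT> <COMMAND>".
# Only the first character of STAT is used (R = running, S = sleeping,
# D = uninterruptible sleep, ...).
parse_ps_states <- function(lines, pattern = NULL) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  state <- substr(lines, 1, 1)
  # Drop the STAT column to get the command name.
  command <- trimws(sub("^[^[:space:]]+[[:space:]]*", "", lines))
  if (!is.null(pattern)) {
    state <- state[grepl(pattern, command)]
  }
  table(state)
}

# Thin wrapper that asks `ps` for the live process list.
# "stat=" and "comm=" request the two columns without header lines.
current_process_states <- function(pattern = NULL) {
  lines <- system2("ps", c("-e", "-o", "stat=", "-o", "comm="),
                   stdout = TRUE)
  parse_ps_states(lines, pattern)
}

# Example on captured lines (made-up sample, not real output):
sample_lines <- c("D    hyts_std", "R+   hyts_std",
                  "Ss   bash", "D    hyts_std")
sample_counts <- parse_ps_states(sample_lines, pattern = "hyts")
```

Calling `current_process_states("hyts")` every few minutes during a run would show whether the number of processes in state "D" grows as the CPU usage drops, which would point to the workers waiting on disk or network I/O rather than on the CPU.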
