Junghwan John Goh edited this page Apr 3, 2017 · 7 revisions

create-batch : job submitter to the KCMS batch clusters

create-batch is a script that submits jobs to the batch clusters available to KCMS members: the KISTI tier-3, CERN lxplus, the KNU tier-2, etc. These farms are based on different batch systems such as condor, LSF or torque (PBS). create-batch splits CMSSW batch jobs, creates a temporary work area, prepares the job configuration files and submits them to the batch system.

create-batch  : create pbs jobs
  Mandatory options :
   --jobName  NAME                  Name of job
   --fileList DATA_FILES            File list text file
   --maxFiles N                     Maximum number of files per job
   --nJobs    N                     Number of Job sections
   --cfg      CONFIG_FILE_cfg.py    Configuration file
  Optional :
   --queue QUEUE_NAME               Set the batch queue name
   --secondFileList DATA_FILES      Secondary file list text file
   -n                               Do not submit jobs to batch
   --transferDest OUTPUT_LOCATION   OUTPUT DIRECTORY (/store will be assumed to SE)
   -g                               Grid certificate is required
   --maxEvent N                     Maximum number of events per job (-1 by default)
   --transferFiles                  Additional files to transfer
   --customise CUSTOMISE_cfg.py     Configuration file for customization
   --args                           general arguments
   --firstRun N                     For MC: run number
  Optional, condor-specific :
   --blacklist HOST1,HOST2,...      Remove specific hosts

Working examples

User data analysis

Prepare your usual cfg file (e.g., myAnalysis_cfg.py) and a text file listing the input ROOT files (e.g., InputFiles.txt).
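If the input files sit in a single directory, the list file can be produced with standard tools. The helper below is only a sketch: the function name make_file_list is hypothetical, and it assumes that create-batch expects one file path per line, which this page does not state explicitly.

```shell
# Write the paths of all *.root files found directly under a directory
# to a list file, one path per line (assumed list format).
#   $1 : directory to scan
#   $2 : output list file
make_file_list() {
    find "$1" -maxdepth 1 -type f -name '*.root' | sort > "$2"
}

# Example usage (hypothetical directory):
# make_file_list /path/to/ntuples InputFiles.txt
```

Files on a storage element may need a different path form; check what your cfg file normally accepts in its file names.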

create-batch --jobName MyAnalysis --fileList InputFiles.txt \
             --cfg myAnalysis_cfg.py --maxFiles 10

This creates the working directory MyAnalysis. The job archive file (job.tar.gz), the run scripts (run_YOUR_JOB_NAME.sh, submit.jds for the condor system, ...), etc. are located under this directory. Configuration files job_000_cfg.py, job_001_cfg.py, ... are also available in the same directory, but they are only for debugging.
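To get a feeling for how many job sections a given file list produces, the count can be estimated before submitting. This sketch assumes that files are grouped with a simple ceiling division by --maxFiles; the function name count_jobs is hypothetical and the actual splitting inside create-batch may differ.

```shell
# Estimate the number of job sections for n_files inputs with at most
# max_files per job (ceiling division; assumed splitting rule).
count_jobs() {
    n_files="$1"
    max_files="$2"
    echo $(( (n_files + max_files - 1) / max_files ))
}

# Example: 25 input files with --maxFiles 10
n_jobs=$(count_jobs 25 10)
echo "$n_jobs job sections"

# List the per-job configuration file names expected in the working directory.
i=0
while [ "$i" -lt "$n_jobs" ]; do
    printf 'job_%03d_cfg.py\n' "$i"
    i=$((i + 1))
done
```

For a real list, the first argument can be taken from the file itself, e.g. with wc -l on InputFiles.txt.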

If you want to submit the jobs later, add the -n option.

Saving output files to storage elements

rfmkdir /store/user/$USER/MyAnalysis
create-batch --jobName MyAnalysis --fileList InputFiles.txt \
             --cfg myAnalysis_cfg.py --maxFiles 10 \
             --transferDest /store/user/$USER/MyAnalysis
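According to the option help, a --transferDest value under /store is assumed to be on the storage element (SE). The sketch below only illustrates that rule. The function name dest_kind is hypothetical, and treating every other path as an ordinary directory is an assumption.

```shell
# Classify a --transferDest value: paths under /store are taken to be on the
# storage element (SE), per the option help; anything else is assumed here
# to be an ordinary directory.
dest_kind() {
    case "$1" in
        /store/*) echo "SE" ;;
        *)        echo "local" ;;
    esac
}

# Example with hypothetical destinations
dest_kind "/store/user/someuser/MyAnalysis"
dest_kind "/tmp/MyAnalysis"
```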

MC generation, EmptySource

create-batch also supports MC generation, where there are no input ROOT files and random seeds have to be assigned according to the job section number. create-batch automatically detects MC generation jobs by looking at the process.source type and switches to MC generation mode if it is EmptySource. In this mode it does not ask for a file list, and job splitting is done with the --nJobs option.

create-batch --jobName MyMCGeneration --cfg generator_cfg.py \
             --nJobs 100 --firstRun 5
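To check beforehand which mode a configuration is likely to trigger, the cfg file can be searched for an EmptySource declaration. This is a text-based heuristic only: create-batch itself looks at the process.source type, while the hypothetical helper uses_empty_source below merely greps the file and can be fooled by comments or unusual formatting.

```shell
# Return success if the cfg file text declares cms.Source("EmptySource").
# Heuristic only: the Python configuration is not evaluated.
#   $1 : path to the cfg file
uses_empty_source() {
    grep -Eq "cms\.Source\([[:space:]]*[\"']EmptySource[\"']" "$1"
}

# Example usage:
# if uses_empty_source generator_cfg.py; then
#     echo "MC generation mode: use --nJobs"
# else
#     echo "Data mode: use --fileList and --maxFiles"
# fi
```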