Setting run parameters
Our workflows are executed with defaults that set each job's requirements for memory, threads, environment, etc. Each of these parameters can be modified within the pipeline.
Modifiable run parameters
job_memory: Memory to request for the task. Default: "4G"
job_threads: Number of slots (threads/cores/CPU) to use for the task.
job_total_memory: Total memory to use for a job.
to_cluster: Send the job to the cluster. Default: True
without_cluster: When this is set to True, the job is run locally. Default: False
cluster_memory_ulimit: Restrict virtual memory. Default: False
job_condaenv: Name of the conda environment to use for each job. Default: the environment specified in bashrc
job_array: If set, run the statement as an array job. job_array should be a tuple of (start, end, increment). Default: False
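The relationship between defaults and per-task overrides can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper `resolve_run_options` and the dictionary `DEFAULT_RUN_OPTIONS` are hypothetical names, and the default values are simply those listed above.

```python
# Illustrative only: a hypothetical helper, not the pipeline library's code.
# Default values are copied from the parameter list above.
DEFAULT_RUN_OPTIONS = {
    "job_memory": "4G",
    "to_cluster": True,
    "without_cluster": False,
    "cluster_memory_ulimit": False,
    "job_array": False,
}


def resolve_run_options(**overrides):
    """Return the defaults with any per-task overrides applied."""
    unknown = set(overrides) - set(DEFAULT_RUN_OPTIONS)
    if unknown:
        raise KeyError(f"unknown run parameter(s): {sorted(unknown)}")
    options = dict(DEFAULT_RUN_OPTIONS)  # copy so the defaults stay untouched
    options.update(overrides)
    return options


# A task that needs more memory and a virtual-memory limit overrides
# only those two values; everything else keeps its default.
options = resolve_run_options(job_memory="8G", cluster_memory_ulimit=True)
print(options)
```

Rejecting unknown names is a design choice of this sketch: it makes a misspelled parameter fail loudly rather than be silently ignored.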
Specifying parameters for a job
Parameters can be set within a pipeline task as follows:
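# Assumes the ruffus and cgatcore packages are installed
from ruffus import transform, suffix
from cgatcore import pipeline as P

@transform('*.unsorted', suffix('.unsorted'), '.sorted')
def sortFile(infile, outfile):
    # -T sets the directory that sort uses for temporary files
    statement = '''sort -T %(tmpdir)s %(infile)s > %(outfile)s'''
    # Memory values are passed as strings such as "30G"
    P.run(statement,
          job_condaenv="sort_environment",
          job_memory="30G",
          job_threads=2,
          without_cluster=False,
          job_total_memory="50G")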
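The `%(name)s` placeholders in the statement are ordinary Python mapping-style format fields. A minimal sketch of how such a statement is filled in, assuming the values come from a plain dictionary (the paths below are hypothetical, and the way the pipeline itself collects these values is not shown here):

```python
# Hypothetical values standing in for the task's variables and configuration.
values = {
    "tmpdir": "/tmp",
    "infile": "sample.unsorted",
    "outfile": "sample.sorted",
}

# Same shape as the statement in the task above; -T is sort's
# temporary-directory option.
statement = "sort -T %(tmpdir)s %(infile)s > %(outfile)s"

# Python's % operator replaces each %(name)s with values["name"].
command = statement % values
print(command)  # sort -T /tmp sample.unsorted > sample.sorted
```

A placeholder with no matching key raises a `KeyError`, which is a quick way to spot a misspelled variable name in a statement.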