CGAT-core documentation!

CGAT-core is a workflow management system that allows users to quickly and reproducibly build scalable data analysis pipelines. It is a set of libraries and helper functions that enable researchers to design and build computational workflows for large-scale data analysis.

We have demonstrated the functionality of our flexible workflow management system, used in combination with CGAT-apps, with a simple RNA-seq pipeline in cgat-showcase.

CGAT-core is open source, powerful and user-friendly, and has been continually developed as a Next Generation Sequencing (NGS) workflow management system over the past 10 years.

For more advanced examples of cgatcore utilities, please refer to our cgat-flow repository. Please be aware, however, that it is in constant development and has many software dependencies.

Citation

Our workflow management system is published in F1000 Research:

Cribbs AP, Luna-Valero S, George C et al. CGAT-core: a python framework for building scalable, reproducible computational biology workflows [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2019, 8:377 (https://doi.org/10.12688/f1000research.18674.1)

Support

  • Please refer to our FAQs section.

  • For bugs and issues, please raise an issue on GitHub.

  • For contributions, please refer to our contributor section and the GitHub source code.

Examples

cgat-showcase

This is a toy example of how to develop a simple workflow. Please refer to the GitHub page and the documentation.

cgat-flow

As an example of the flexibility and functionality of CGAT-core, we have developed a set of fully tested production pipelines for automating the analysis of our NGS data. Please refer to the GitHub page for information on how to install and use our code.

Single cell RNA-seq

The Cribbs lab use CGAT-core to develop pseudoalignment pipelines for single-cell Drop-seq methods. The Sansom lab use the CGAT-core workflow engine to develop single-cell sequencing analysis workflows.

Selected publications using CGAT-core

CGAT-core has been developed over the past 10 years and, as such, has been used in many published articles.

For a non-comprehensive list of citations, please see our Citing and Citations section.