Integrating Bacterial ChIP-seq and RNA-seq Data With SnakeChunks
- Escherichia coli K‐12
- FAIR Guiding Principles
- Reproducible science
Next‐generation sequencing (NGS) is becoming a routine approach in most domains of the life sciences. To ensure reproducibility of results, there is a crucial need to improve the automation of NGS data processing and to enable forthcoming studies relying on big datasets. Although user‐friendly interfaces now exist, there remains a strong need for accessible solutions that allow experimental biologists to analyze and explore their results in an autonomous and flexible way. The protocols here describe a modular system that enables a user to compose and fine‐tune workflows based on SnakeChunks, a library of rules for the Snakemake workflow engine. They are illustrated using a study combining ChIP‐seq and RNA‐seq to identify target genes of the global transcription factor FNR in Escherichia coli. This case study has the advantage that results can be compared with the most up‐to‐date collection of existing knowledge about transcriptional regulation in this model organism, extracted from the RegulonDB database.