Nextflow Integration#
Use bdp generate nextflow to emit a Nextflow config file with resolved file paths as params.bdp.* variables, ready to include in your DSL2 pipeline.
Generate Config#
```bash
bdp generate nextflow
```
Output (conf/bdp_data.config):
```groovy
// BDP data paths - auto-generated by 'bdp generate nextflow'. Do not edit.
// Include in nextflow.config: includeConfig 'conf/bdp_data.config'
params {
    bdp {
        clinvar_variants_vcf = "${projectDir}/.bdp/data/sources/clinvar/variants/2024.01/variants_2024.01.vcf"
        ensembl_homo_sapiens_gtf = "${projectDir}/.bdp/data/sources/ensembl/homo_sapiens/110/homo_sapiens_110.gtf"
        uniprot_p01308_fasta = "${projectDir}/.bdp/data/sources/uniprot/P01308/2024.03/P01308_2024.03.fasta"
    }
}
```
Include in nextflow.config#
```groovy
// nextflow.config
includeConfig 'conf/bdp_data.config'
```
Example: Variant Annotation Pipeline (DSL2)#
A two-process pipeline using BioContainers images for reproducibility:
```groovy
nextflow.enable.dsl = 2

process FILTER_PATHOGENIC {
    container 'biocontainers/bcftools:1.19--h8b25389_1'

    input:
    path vcf

    output:
    path "pathogenic.vcf", emit: vcf

    script:
    """
    bcftools view -i 'INFO/CLNSIG~"Pathogenic"' ${vcf} > pathogenic.vcf
    """
}

process ANNOTATE_GENES {
    container 'biocontainers/bedtools:2.31.1--hf5e1c6e_1'

    input:
    path vcf
    path gtf

    output:
    path "variant_genes.tsv", emit: tsv

    script:
    """
    bedtools intersect -a ${vcf} -b ${gtf} -wa -wb > variant_genes.tsv
    """
}

workflow {
    ch_vcf = Channel.fromPath(params.bdp.clinvar_variants_vcf, checkIfExists: true)
    ch_gtf = Channel.fromPath(params.bdp.ensembl_homo_sapiens_gtf, checkIfExists: true)

    FILTER_PATHOGENIC(ch_vcf)
    ANNOTATE_GENES(FILTER_PATHOGENIC.out.vcf, ch_gtf)
}
```
Run:
```bash
nextflow run main.nf
```
Workflow#
- Add sources: bdp source add clinvar:variants-vcf@2024.01
- Pull data: bdp pull
- Generate config: bdp generate nextflow
- Include in pipeline: includeConfig 'conf/bdp_data.config'
- Access via params.bdp.clinvar_variants_vcf etc.
- Commit bdp.yml, bdp.lock, and conf/bdp_data.config
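The steps above can be chained in a small wrapper script. The sketch below is one possible shape, assuming bdp is on your PATH. The function name bdp_refresh, the run helper, and the DRY_RUN switch are hypothetical conveniences introduced here, not part of BDP; only the bdp commands shown on this page are invoked.

```shell
# Sketch: add sources (optional), pull data, and regenerate the Nextflow config.
# bdp_refresh, run, and DRY_RUN are hypothetical names used for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Usage: bdp_refresh [source-spec ...]
bdp_refresh() {
    local spec
    # Register any source specs passed as arguments.
    for spec in "$@"; do
        run bdp source add "$spec" || return 1
    done
    run bdp pull || return 1               # fetch the declared sources
    run bdp generate nextflow || return 1  # emit conf/bdp_data.config
}
```

With DRY_RUN=1 set, calling bdp_refresh clinvar:variants-vcf@2024.01 only prints the three commands it would run, which is a cheap way to check the sequence before touching any data. Stopping at the first failing step keeps a stale config from being regenerated after a failed pull.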
DSL2 Patterns#
Access BDP data through channels:
```groovy
workflow {
    // Create channels from BDP-managed files
    ch_vcf = Channel.fromPath(params.bdp.clinvar_variants_vcf, checkIfExists: true)
    ch_gtf = Channel.fromPath(params.bdp.ensembl_homo_sapiens_gtf, checkIfExists: true)
    ch_fasta = Channel.fromPath(params.bdp.uniprot_p01308_fasta, checkIfExists: true)

    // Wire into processes
    FILTER_PATHOGENIC(ch_vcf)
    ANNOTATE_GENES(FILTER_PATHOGENIC.out.vcf, ch_gtf)
}
```
Passing checkIfExists: true makes the pipeline fail fast if bdp pull hasn't been run.
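A similar fail-fast check can be run before launching Nextflow at all. The sketch below scans a generated config for quoted paths and reports any that are missing on disk. It assumes the layout shown in the sample output (one key = "path" assignment per line, each path prefixed with ${projectDir}); the function name bdp_check_paths is hypothetical and not a bdp command.

```shell
# Sketch: report files listed in the generated config that are missing on disk.
# bdp_check_paths is a hypothetical helper, not part of BDP.
# Usage: bdp_check_paths [config-file] [project-dir]
bdp_check_paths() {
    local config project_dir missing
    config="${1:-conf/bdp_data.config}"
    project_dir="${2:-.}"

    if [ ! -f "$config" ]; then
        printf 'config not found: %s\n' "$config" >&2
        return 2
    fi

    # Extract the part after "${projectDir}/" from each assignment line,
    # then keep only the paths that do not exist under the project directory.
    missing=$(
        sed -n 's|.*= *"\${projectDir}/\(.*\)".*|\1|p' "$config" |
        while IFS= read -r rel; do
            [ -f "$project_dir/$rel" ] || printf '%s\n' "$rel"
        done
    )

    if [ -n "$missing" ]; then
        printf 'missing BDP files (has bdp pull been run?):\n%s\n' "$missing" >&2
        return 1
    fi
    return 0
}
```

Run from the project root, bdp_check_paths with no arguments checks conf/bdp_data.config against the current directory and returns non-zero when any listed file is absent, so it can gate nextflow run main.nf in a script or CI job.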
Full Example#
See the complete Nextflow pipeline example on Codeberg, which includes a DSL2 variant annotation pipeline with BioContainers.