BDP CLI Reference#

This documentation is auto-generated from the CLI source code.

Overview#

BDP (Bioinformatics Dependencies Platform) is a command-line tool for managing biological datasets with version control, checksums, and audit trails.

Installation#

From Source#

bash
git clone https://codeberg.org/datadir/bdp.git
cd bdp
cargo install --path crates/bdp-cli

Using Cargo#

bash
cargo install bdp-cli

Quick Start#

bash
# Initialize a new project
bdp init --name my-project
# Add data sources
bdp source add uniprot:P01308-fasta@1.0
# Download sources
bdp pull
# Check status
bdp status
# View audit trail
bdp audit list

Commands#

bdp#

BDP - Bioinformatics Dependencies Platform

Usage: bdp [OPTIONS] [COMMAND]

Subcommands:#
  • init — Initialize a new BDP project
  • source — Manage data sources
  • pull — Download and cache sources from manifest
  • status — Show status of cached sources
  • audit — Audit trail management
  • clean — Clean cache
  • config — Manage configuration
  • uninstall — Uninstall BDP from your system
  • search — Search for data sources and tools in the registry
  • cache — Manage local data cache directory
  • generate — Generate workflow integration files (Python, Snakemake, Nextflow, CWL, R)
  • completions — Generate shell completion scripts
  • auth — Manage authentication
  • query — Advanced SQL-like querying of data sources and metadata

Options:#
  • -v, --verbose — Verbose output

  • --server-url <SERVER_URL> — Server URL

    Default value: http://localhost:8000

bdp init#

Initialize a new BDP project

Usage: bdp init [OPTIONS] [PATH]

Arguments:#
  • <PATH> — Project directory or name (creates directory if it doesn't exist)

    Default value: .

Options:#
  • -n, --name <NAME> — Project name (defaults to directory name)

  • -V, --version <VERSION> — Project version

    Default value: 0.1.0

  • -d, --description <DESCRIPTION> — Project description

  • -f, --force — Force overwrite if bdp.yml exists

bdp source#

Manage data sources

Usage: bdp source <COMMAND>

Subcommands:#
  • add — Add a source to the manifest
  • remove — Remove a source from the manifest
  • list — List sources in the manifest

bdp source add#

Add a source to the manifest

Usage: bdp source add <SOURCE>

Arguments:#
  • <SOURCE> — Source specification (e.g., "uniprot:P01308-fasta@1.0")
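
The example specification above appears to follow a `<registry>:<name>@<version>` shape. The following is a minimal sketch of splitting such a string, assuming that shape holds for every source; the `SourceSpec` type, its field names, and `parse_source_spec` are hypothetical and are not part of BDP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpec:
    registry: str  # text before the first ':' (e.g. "uniprot")
    name: str      # text between ':' and the last '@' (e.g. "P01308-fasta")
    version: str   # text after the last '@' (e.g. "1.0" or "latest")


def parse_source_spec(spec: str) -> SourceSpec:
    """Split a spec of the assumed shape '<registry>:<name>@<version>'."""
    registry, sep, rest = spec.partition(":")
    if not sep or not registry:
        raise ValueError(f"missing '<registry>:' prefix in {spec!r}")
    # Split on the last '@' so that names containing dots or dashes stay intact.
    name, sep, version = rest.rpartition("@")
    if not sep or not name or not version:
        raise ValueError(f"missing '@<version>' suffix in {spec!r}")
    return SourceSpec(registry, name, version)


print(parse_source_spec("uniprot:P01308-fasta@1.0"))
```

A helper like this can be useful for validating specifications in a script before passing them to `bdp source add`.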

bdp source remove#

Remove a source from the manifest

Usage: bdp source remove <SOURCE>

Arguments:#
  • <SOURCE> — Source specification

bdp source list#

List sources in the manifest

Usage: bdp source list

bdp pull#

Download and cache sources from manifest

Usage: bdp pull [OPTIONS]

Options:#
  • -f, --force — Force re-download even if cached
  • --dry-run — Show what would be downloaded without fetching anything

bdp status#

Show status of cached sources

Usage: bdp status

bdp audit#

Audit trail management

Usage: bdp audit <COMMAND>

Subcommands:#
  • list — List audit events
  • verify — Verify audit trail integrity
  • export — Export audit trail to regulatory format

bdp audit list#

List audit events

Usage: bdp audit list [OPTIONS]

Options:#
  • -l, --limit <LIMIT> — Limit number of events to show

    Default value: 20

  • -s, --source <SOURCE> — Show events from specific source

bdp audit verify#

Verify audit trail integrity

Usage: bdp audit verify

bdp audit export#

Export audit trail to regulatory format

Usage: bdp audit export [OPTIONS]

Options:#
  • -f, --format <FORMAT> — Export format (fda, nih, ema, das, json)

    Default value: fda

  • -o, --output <OUTPUT> — Output file path (optional, defaults to audit-{format}.{ext})

  • --from <FROM> — Filter events from date (ISO 8601)

  • --to <TO> — Filter events to date (ISO 8601)

  • -n, --project-name <PROJECT_NAME> — Project name for report

  • --project-version <PROJECT_VERSION> — Project version for report

bdp clean#

Clean cache

Usage: bdp clean [OPTIONS]

Options:#
  • -a, --all — Clean all cached files
  • --search-cache — Clean only search cache
  • -y, --yes — Skip confirmation prompt

bdp config#

Manage configuration

Usage: bdp config <COMMAND>

Subcommands:#
  • get — Get configuration value
  • set — Set configuration value
  • show — Show all configuration

bdp config get#

Get configuration value

Usage: bdp config get <KEY>

Arguments:#
  • <KEY> — Configuration key

bdp config set#

Set configuration value

Usage: bdp config set <KEY> <VALUE>

Arguments:#
  • <KEY> — Configuration key
  • <VALUE> — Configuration value

bdp config show#

Show all configuration

Usage: bdp config show

bdp uninstall#

Uninstall BDP from your system

Usage: bdp uninstall [OPTIONS]

Options:#
  • -y, --yes — Skip confirmation prompt
  • --purge — Also remove cache and configuration files

bdp search#

Search for data sources and tools in the registry

Usage: bdp search [OPTIONS] <QUERY>...

Arguments:#
  • <QUERY> — Search query (multiple words will be joined)

Options:#
  • -o, --org <ORG> — Filter by organization (e.g., uniprot, ncbi)

  • -t, --type <ENTRY_TYPE> — Filter by entry type (can be repeated)

  • -s, --source-type <SOURCE_TYPE> — Filter by source type (can be repeated)

  • -f, --format <FORMAT> — Output format

    Default value: interactive

  • --no-interactive — Force non-interactive mode

  • -l, --limit <LIMIT> — Number of results per page (1-100)

    Default value: 10

  • -p, --page <PAGE> — Page number (for non-interactive pagination)

    Default value: 1

bdp cache#

Manage local data cache directory

Usage: bdp cache <COMMAND>

Subcommands:#
  • set — Set cache directory path
  • show — Show current cache directory
  • reset — Reset cache path to default (.bdp/data)

bdp cache set#

Set cache directory path

Usage: bdp cache set <PATH>

Arguments:#
  • <PATH> — Path to cache directory (relative to project root, or absolute)

bdp cache show#

Show current cache directory

Usage: bdp cache show

bdp cache reset#

Reset cache path to default (.bdp/data)

Usage: bdp cache reset

bdp generate#

Generate workflow integration files (Python, Snakemake, Nextflow, CWL, R)

Usage: bdp generate <COMMAND>

Subcommands:#
  • python — Generate Python data paths module (bdp_data.py)
  • snakemake — Generate Snakemake config file (config/bdp_data.yaml)
  • nextflow — Generate Nextflow config file (conf/bdp_data.config)
  • cwl — Generate CWL v1.2 inputs file (cwl/bdp-inputs.yml)
  • r — Generate R data config and loader (r/bdp_data.yml + r/bdp_data.R)

bdp generate python#

Generate Python data paths module (bdp_data.py)

Usage: bdp generate python

bdp generate snakemake#

Generate Snakemake config file (config/bdp_data.yaml)

Usage: bdp generate snakemake

bdp generate nextflow#

Generate Nextflow config file (conf/bdp_data.config)

Usage: bdp generate nextflow

bdp generate cwl#

Generate CWL v1.2 inputs file (cwl/bdp-inputs.yml)

Usage: bdp generate cwl

bdp generate r#

Generate R data config and loader (r/bdp_data.yml + r/bdp_data.R)

Usage: bdp generate r

bdp completions#

Generate shell completion scripts

Usage: bdp completions <SHELL>

Arguments:#
  • <SHELL> — Shell to generate completions for

    Possible values: bash, elvish, fish, powershell, zsh

bdp auth#

Manage authentication

Usage: bdp auth <COMMAND>

Subcommands:#
  • login — Authenticate with the BDP server
  • logout — Clear stored credentials
  • status — Show current authentication state
  • api-key — Manage API keys

bdp auth login#

Authenticate with the BDP server

Usage: bdp auth login [OPTIONS]

Options:#
  • --api-key <API_KEY> — Authenticate with an API key instead of email/password

bdp auth logout#

Clear stored credentials

Usage: bdp auth logout

bdp auth status#

Show current authentication state

Usage: bdp auth status

bdp auth api-key#

Manage API keys

Usage: bdp auth api-key <COMMAND>

Subcommands:#
  • create — Create a new API key
  • list — List your API keys
  • revoke — Revoke an API key by ID

bdp auth api-key create#

Create a new API key

Usage: bdp auth api-key create <NAME>

Arguments:#
  • <NAME> — Human-readable name for this key

bdp auth api-key list#

List your API keys

Usage: bdp auth api-key list

bdp auth api-key revoke#

Revoke an API key by ID

Usage: bdp auth api-key revoke <ID>

Arguments:#
  • <ID> — API key ID to revoke

bdp query#

Advanced SQL-like querying of data sources and metadata

Usage: bdp query [OPTIONS] [ENTITY]

Arguments:#
  • <ENTITY> — Entity to query (protein, gene, genome, tools, orgs, etc.) or use --sql for raw SQL

Options:#
  • --select <SELECT> — Select specific fields (comma-separated)

  • -w, --where <WHERE_CLAUSE> — Filter results (can be repeated; repeated clauses are combined with AND)

    Simple: --where organism=human

    Complex: --where "organism='human' AND downloads>1000"

  • --order-by <ORDER_BY> — Sort results by field[:asc|desc]

  • -l, --limit <LIMIT> — Limit number of results

    Default value: 1000

  • --offset <OFFSET> — Skip first N results

  • --group-by <GROUP_BY> — Group results by field

  • --aggregate <AGGREGATE> — Aggregation expression (COUNT(*), SUM(field), etc.)

  • --having <HAVING> — Filter grouped results

  • --join <JOIN> — Join with another entity/table

  • --on <ON> — Join condition

  • --sql <SQL> — Execute raw SQL query directly

  • -f, --format <FORMAT> — Output format

  • -o, --output <OUTPUT> — Write output to file instead of stdout

  • --no-header — Omit header row (for CSV/TSV)

  • --explain — Show query execution plan

  • --dry-run — Show generated SQL without executing
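
To make the option semantics above concrete, here is a rough sketch of how repeated `--where` clauses, `--order-by field[:asc|desc]`, `--limit`, and `--offset` could map onto a SQL string. This is one plausible mapping for illustration only, not BDP's actual query builder; `build_query` is a hypothetical name, and the clauses are interpolated verbatim, which would not be safe for untrusted input. Use `bdp query --dry-run` to see the SQL the tool really generates.

```python
def build_query(entity, select=None, where=(), order_by=None, limit=1000, offset=None):
    """Assemble a SELECT statement from query-style options (illustrative only)."""
    columns = ", ".join(select) if select else "*"
    sql = f"SELECT {columns} FROM {entity}"
    if where:
        # Repeated --where clauses are AND-combined, as the option help states.
        sql += " WHERE " + " AND ".join(f"({clause})" for clause in where)
    if order_by:
        # Accepts "field" or "field:asc" / "field:desc"; ascending is assumed as default.
        field, _, direction = order_by.partition(":")
        sql += f" ORDER BY {field} {(direction or 'asc').upper()}"
    sql += f" LIMIT {limit}"
    if offset:
        sql += f" OFFSET {offset}"
    return sql


print(build_query(
    "protein",
    select=["organism", "downloads"],
    where=["organism='human'", "downloads>1000"],
    order_by="downloads:desc",
    limit=10,
))
# SELECT organism, downloads FROM protein WHERE (organism='human') AND (downloads>1000) ORDER BY downloads DESC LIMIT 10
```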


This document was generated automatically by clap-markdown.

Environment Variables#

  • BDP_SERVER_URL - Backend server URL (default: http://localhost:8000)
  • RUST_LOG - Logging level (e.g., debug, info, warn, error)
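
The server URL can be given through both the `--server-url` option and `BDP_SERVER_URL`. The sketch below shows the precedence most CLI tools use (explicit flag, then environment variable, then built-in default). That ordering is an assumption here, since this reference does not state it, and `resolve_server_url` is a hypothetical helper rather than BDP code.

```python
import os
from typing import Mapping, Optional

DEFAULT_SERVER_URL = "http://localhost:8000"  # default documented for --server-url


def resolve_server_url(flag_value: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Pick the server URL: explicit flag, then BDP_SERVER_URL, then the default (assumed order)."""
    if flag_value:
        return flag_value
    return env.get("BDP_SERVER_URL") or DEFAULT_SERVER_URL


print(resolve_server_url(None, {}))  # http://localhost:8000
```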

Configuration#

BDP uses a bdp.yml manifest file in your project directory. This file is created automatically when you run bdp init.

Example bdp.yml:

yaml
name: my-project
version: 0.1.0
description: My biological data project
sources:
  - id: uniprot:P01308-fasta@1.0
    checksum: sha256:abc123...
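
The `checksum` field in the example pairs an algorithm name with a hex digest. As a rough illustration of what checking a downloaded file against such a value involves, the sketch below hashes a file in chunks and compares the result. It assumes the `<algorithm>:<hex>` layout seen in the example; `verify_checksum` is a hypothetical helper, and how BDP itself performs the check is not described in this reference.

```python
import hashlib
import hmac
from pathlib import Path


def verify_checksum(path: Path, expected: str) -> bool:
    """Compare a file's digest with a '<algorithm>:<hex>' string such as 'sha256:<64 hex chars>'."""
    algorithm, _, expected_hex = expected.partition(":")
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large sequence files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.lower())
```

Calling `verify_checksum(Path("some-file.fasta"), "sha256:<digest>")` returns `True` only when the computed digest matches; the file name here is a placeholder.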

Audit Trail#

BDP maintains a cryptographically linked audit trail of all operations in .bdp/audit.db. This provides:

  • Tamper-evident logging
  • Regulatory compliance (FDA 21 CFR Part 11, NIH, EMA)
  • Full traceability of data sources
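
Tamper evidence of this kind is commonly achieved with a hash chain, where each event stores the hash of the event before it. The sketch below illustrates that general technique in memory. It is not BDP's implementation: the actual schema of `.bdp/audit.db` is not documented here, and the field names (`payload`, `prev`, `hash`) and the all-zero genesis value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous event's hash together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev_hash, "hash": event_hash(prev_hash, payload)})


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered event breaks the chain."""
    prev_hash = GENESIS
    for event in chain:
        if event["prev"] != prev_hash or event["hash"] != event_hash(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True


chain = []
append_event(chain, {"op": "source add", "source": "uniprot:P01308-fasta@1.0"})
append_event(chain, {"op": "pull"})
print(verify_chain(chain))  # True

chain[0]["payload"]["source"] = "uniprot:P99999-fasta@1.0"  # simulate tampering
print(verify_chain(chain))  # False
```

Because each hash depends on everything before it, changing one early event invalidates all later links, which is the property that `bdp audit verify` is described as checking.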

Export audit trails for compliance:

bash
# Export to FDA format
bdp audit export --format fda --output audit-report.pdf
# Export to JSON
bdp audit export --format json --output audit.json

Examples#

Working with Multiple Sources#

bash
# Initialize project
bdp init --name multi-source-project
# Add multiple sources
bdp source add uniprot:P01308-fasta@1.0
bdp source add ncbi-taxonomy:taxdump@1.0
bdp source add genbank:NC_000001.11@latest
# List all sources
bdp source list
# Pull all sources
bdp pull
# Check what's cached
bdp status
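
The same sequence can be scripted, for example in a CI job. The sketch below only builds and prints the argument lists for the commands shown above; `workflow_commands` is a hypothetical helper, and actually running the commands (for instance with `subprocess.run(argv, check=True)`) requires `bdp` to be installed and on `PATH`.

```python
import shlex


def workflow_commands(project_name: str, sources: list) -> list:
    """Return argv lists for: init, one 'source add' per spec, source list, pull, status."""
    commands = [["bdp", "init", "--name", project_name]]
    commands += [["bdp", "source", "add", spec] for spec in sources]
    commands += [["bdp", "source", "list"], ["bdp", "pull"], ["bdp", "status"]]
    return commands


for argv in workflow_commands(
    "multi-source-project",
    ["uniprot:P01308-fasta@1.0", "ncbi-taxonomy:taxdump@1.0"],
):
    print(shlex.join(argv))  # shell-quoted form of each command
```

Keeping each command as a list rather than a single string avoids shell-quoting problems when a specification or project name contains special characters.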

Verifying Data Integrity#

bash
# Verify audit trail integrity
bdp audit verify
# Export audit trail with date filter
bdp audit export \
  --format fda \
  --from 2024-01-01 \
  --to 2024-12-31 \
  --project-name "My Project" \
  --project-version "1.0.0"

Support#

  • Codeberg Issues: https://codeberg.org/datadir/bdp/issues
  • Documentation: https://bdp.datadir.io/docs

To update this documentation, run cargo xtask docs cli.