CLI Reference¶
AlleleFlux ships a main alleleflux entrypoint plus console scripts that power the Snakemake workflow. Most users only need alleleflux run; the other tools are available for advanced or ad‑hoc use.
Main commands¶
alleleflux run — execute the workflow¶
Execute the complete AlleleFlux pipeline with flexible resource control and scheduling options.
alleleflux run --config config.yml [options] [-- <extra snakemake args>]
Arguments¶
Option |
Default |
Description |
|---|---|---|
|
(required) |
Path to the AlleleFlux configuration YAML. |
|
|
Working directory for Snakemake execution. |
|
None |
Max concurrent jobs (local only; ignored when |
|
None |
Total threads available for local runs. |
|
None |
Total memory for local runs (e.g., |
|
None |
Snakemake profile directory for cluster/HPC execution (e.g., |
|
False |
Plan the DAG without running jobs. |
|
False |
Unlock a previously crashed working directory. |
|
None |
Quoted string of extra Snakemake flags (alternative to |
Examples¶
# Run with a config file
alleleflux run --config config.yml
# Run with limited resources
alleleflux run --config config.yml --threads 16 --memory 64G
# Dry run to see what would be executed
alleleflux run --config config.yml --dry-run
# Run with SLURM profile
alleleflux run --config config.yml --profile slurm_profile/
# Force a rerun of all jobs and print the reason for each
alleleflux run --config config.yml -- --forceall --reason
# Run with specific working directory
alleleflux run --config config.yml --working-dir /path/to/workdir
Notes¶
Pass additional Snakemake flags either after -- or via --snakemake-args.
When using --profile, job/thread/memory parameters are overridden by profile settings.
See Running the Workflow for detailed scheduling instructions.
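Forwarding flags after the -- separator can be illustrated with a small wrapper. This is a minimal sketch, not part of AlleleFlux: build_run_command is a hypothetical helper, and it uses only flags that appear in this section (--config, --threads, --memory, --dry-run, and the -- separator).

```python
import shlex


def build_run_command(config, threads=None, memory=None, dry_run=False, extra=()):
    """Assemble an `alleleflux run` argument list (hypothetical helper)."""
    cmd = ["alleleflux", "run", "--config", str(config)]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if memory is not None:
        cmd += ["--memory", str(memory)]
    if dry_run:
        cmd.append("--dry-run")
    if extra:
        # Everything after the bare "--" is meant to be forwarded to Snakemake.
        cmd += ["--", *extra]
    return cmd


cmd = build_run_command("config.yml", threads=16, memory="64G",
                        extra=["--forceall", "--reason"])
# Print a shell-ready command line for inspection.
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) directly, which avoids shell quoting issues when extra flags contain spaces.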
alleleflux init — create a config¶
Create a new AlleleFlux configuration file interactively or from a template.
alleleflux init [--template] [--output alleleflux_config.yml]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
False |
Print a template config to stdout instead of interactive mode. |
|
|
Output configuration file path. |
Examples¶
# Interactive mode (prompts for settings)
alleleflux init
# Print template to stdout
alleleflux init --template
# Interactive mode with custom output file
alleleflux init --output my_alleleflux_config.yml
# Save template to file
alleleflux init --template > my_template.yml
alleleflux info — show install paths¶
Display version, package location, and Snakefile paths. Useful for debugging installation issues.
alleleflux info
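When debugging an installation, it can help to first confirm that the entry points are on PATH at all. The following is a minimal sketch using only the Python standard library; find_scripts is a hypothetical helper and not part of AlleleFlux.

```python
import shutil


def find_scripts(names):
    """Map each script name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


# Script names taken from this reference; extend the list as needed.
status = find_scripts(["alleleflux", "alleleflux-profile", "alleleflux-cmh"])
for name, path in status.items():
    print(f"{name}: {path or 'not found on PATH'}")
```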
alleleflux tools — list console scripts¶
List all available console scripts grouped by functional category.
alleleflux tools [--category {Analysis,Preprocessing,Statistics,Evolution,Accessory,Visualization}]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
None |
Filter by category (optional). Lists all if not specified. |
Examples¶
# List all tools
alleleflux tools
# List only Analysis tools
alleleflux tools --category Analysis
# List Preprocessing tools
alleleflux tools --category Preprocessing
Console scripts by stage¶
These are invoked automatically by the workflow but can be run manually for testing or custom tasks. Run any script with --help for full arguments.
Analysis tools¶
alleleflux-profile — profile BAM files into per-MAG allele tables¶
Extract base-level coverage and allele information from aligned BAM files.
alleleflux-profile --bam-path BAM --fasta-path FASTA --prodigal-fasta GENES \
--mag-mapping-file MAPPING --output-dir DIR [options]
Required Arguments¶
Argument |
Description |
|---|---|
|
Path to sorted BAM file. |
|
Path to reference FASTA file (must match BAM alignment reference). |
|
Path to Prodigal predicted genes (DNA FASTA format). |
|
Tab-separated file mapping contigs to MAG IDs (columns: |
|
Output directory for profiles. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
All available CPUs |
Number of processors to use. |
|
From BAM filename |
Sample identifier (auto-extracted if not provided). |
|
30 |
Minimum base quality score to include a base. |
|
2 |
Minimum mapping quality score to include a read. |
|
False |
Include reads without a properly paired mate. |
|
False |
Do not ignore overlapping read segments (may double-count). |
|
INFO |
Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL). |
Output¶
Creates {output_dir}/{sampleID}/{sampleID}_{mag_id}_profiled.tsv.gz with columns:
contig: Contig identifier
position: 0-based genomic position
ref_base: Reference base at position
total_coverage: Total read coverage
A, C, G, T, N: Base counts
mapq_scores: MAPQ scores for reads
gene_id: Overlapping gene identifier (if any)
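The base-count columns can be turned into per-position allele frequencies with a few lines of Python. This is a minimal sketch on made-up rows, not AlleleFlux's own calculation: allele_frequencies is a hypothetical helper, and excluding N from the denominator is an assumption.

```python
import csv
import io

# Two made-up rows using a subset of the column names listed above.
PROFILE_TSV = (
    "contig\tposition\tref_base\ttotal_coverage\tA\tC\tG\tT\tN\n"
    "contig_1\t0\tA\t20\t15\t5\t0\t0\t0\n"
    "contig_1\t1\tG\t10\t0\t0\t10\t0\t0\n"
)


def allele_frequencies(row):
    """Return {base: fraction} over A/C/G/T counts; empty dict if no counts."""
    counts = {base: int(row[base]) for base in "ACGT"}
    total = sum(counts.values())  # assumption: N is left out of the denominator
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}


rows = list(csv.DictReader(io.StringIO(PROFILE_TSV), delimiter="\t"))
freqs = [allele_frequencies(r) for r in rows]
print(freqs[0]["A"])
```

For a real profile, replace the in-memory text with gzip.open(path, "rt"), since the output files are gzip-compressed.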
Examples¶
# Basic profiling
alleleflux-profile --bam-path sample1.bam --fasta-path reference.fa \
--prodigal-fasta genes.fna --mag-mapping-file mag_mapping.tsv \
--output-dir profiles/
# With custom sample ID and resource limits
alleleflux-profile --bam-path sample1.bam --fasta-path reference.fa \
--prodigal-fasta genes.fna --mag-mapping-file mag_mapping.tsv \
--output-dir profiles/ --sampleID my_sample --cpus 8
# Stricter quality filtering
alleleflux-profile --bam-path sample1.bam --fasta-path reference.fa \
--prodigal-fasta genes.fna --mag-mapping-file mag_mapping.tsv \
--output-dir profiles/ --min-base-quality 35 --min-mapping-quality 10
alleleflux-allele-freq — compute allele frequencies per MAG¶
Analyze allele frequencies across samples and timepoints.
alleleflux-allele-freq --mag-id MAG --mag-metadata-file METADATA \
--fasta FASTA --output-dir DIR [options]
Required Arguments¶
| Argument | Description |
|---|---|
| --mag-id | MAG identifier to process. |
| --mag-metadata-file | Path to MAG metadata file (TSV with sample_id, file_path, group, time). |
| --fasta | Path to reference FASTA file. |
| --output-dir | Output directory for results. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
0.1 |
Minimum breadth of coverage (0-1). |
|
|
Analysis type: |
|
False |
Keep constant positions (all samples share the same allele). |
Output¶
Creates {mag_id}_allele_freq.tsv.gz with allele frequency data per position/sample.
alleleflux-scores — calculate parallelism and divergence scores¶
Derive MAG-level scores from statistical test results.
alleleflux-scores --rootDir DIR --output-dir DIR [options]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
Required |
Directory containing |
|
Required |
Output directory. |
|
All available |
Number of processors. |
Examples¶
# Score all MAGs
alleleflux-scores --rootDir metadata/ --output-dir scores/
# With custom CPU count
alleleflux-scores --rootDir metadata/ --output-dir scores/ --cpus 16
alleleflux-cmh-scores — CMH-specific score aggregation¶
Calculate CMH test scores for a MAG. See also: alleleflux-cmh for running CMH tests.
alleleflux-cmh-scores --cmh-df INPUT --mag-id MAG --output-dir DIR [options]
Preprocessing tools¶
alleleflux-metadata — build MAG metadata from profiles¶
Generate MAG metadata files from sample profiles and sample sheet.
alleleflux-metadata --metadata-file INPUT --profiles-dir DIR \
--mag-id MAG --output-dir DIR [options]
Arguments¶
| Argument | Description |
|---|---|
| --metadata-file | Input sample metadata file (CSV/TSV). |
| --profiles-dir | Directory containing profile files. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
alleleflux-qc — quality control on profiles¶
Perform coverage and breadth QC on MAG profiles.
alleleflux-qc --root-dir PROFILES --mag-id MAG --output-dir DIR [options]
Required Arguments¶
| Argument | Description |
|---|---|
| --root-dir | Directory containing profile files. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
None |
Path to reference FASTA (optional). |
|
None |
Contig-to-MAG mapping file (optional). |
|
0.1 |
Minimum breadth of coverage (0-1). |
|
1.0 |
Minimum average coverage depth. |
|
|
Analysis type: |
Output¶
Creates {mag_id}_QC.tsv with QC results including breadth_threshold_passed column.
Examples¶
# Basic QC
alleleflux-qc --root-dir profiles/ --mag-id MAG000001 --output-dir qc/
# Custom thresholds
alleleflux-qc --root-dir profiles/ --mag-id MAG000001 --output-dir qc/ \
--breadth-threshold 0.2 --coverage-threshold 5.0
alleleflux-eligibility — generate MAG eligibility tables¶
Create eligibility tables for statistical tests based on QC results.
alleleflux-eligibility --qc-dir QC_DIR --output-file OUTPUT [options]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
Required |
Directory containing QC files. |
|
Required |
Output eligibility file path. |
|
4 |
Minimum number of samples required. |
|
|
Analysis type: |
Output¶
Creates eligibility table with columns:
mag_id: MAG identifier
unpaired_test_eligible: Eligible for unpaired tests
paired_test_eligible: Eligible for paired tests
single_sample_eligible_*: Per-group single-sample eligibility
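The eligibility table can be used to pick the MAGs to pass to a given test. This is a minimal sketch on a made-up table: eligible_mags is a hypothetical helper, and the True/False encoding of the eligibility columns is an assumption that should be checked against a real output file.

```python
import csv
import io

# Made-up table; the True/False encoding of the flags is an assumption.
ELIGIBILITY_TSV = (
    "mag_id\tunpaired_test_eligible\tpaired_test_eligible\n"
    "MAG000001\tTrue\tTrue\n"
    "MAG000002\tTrue\tFalse\n"
    "MAG000003\tFalse\tFalse\n"
)


def eligible_mags(handle, column):
    """Return MAG IDs whose value in `column` reads as true."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [r["mag_id"] for r in reader if r[column].strip().lower() == "true"]


paired = eligible_mags(io.StringIO(ELIGIBILITY_TSV), "paired_test_eligible")
unpaired = eligible_mags(io.StringIO(ELIGIBILITY_TSV), "unpaired_test_eligible")
print(paired, unpaired)
```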
Statistical test tools¶
alleleflux-cmh — Cochran-Mantel-Haenszel stratified test¶
Run CMH tests for stratified allele frequency analysis (typically stratified by replicate).
alleleflux-cmh --input-df INPUT --mag-id MAG --output-dir DIR [options]
Required Arguments¶
| Argument | Description |
|---|---|
| --input-df | Path to input allele frequency dataframe. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
None |
Path to filtered dataframe for position filtering. |
|
4 |
Minimum number of strata (replicates) required. |
|
|
Analysis mode: |
|
None |
Group name for |
|
All available |
Number of processors. |
Output¶
Creates {mag_id}_cmh.tsv.gz with columns:
mag_id: MAG identifier
contig: Contig identifier
gene_id: Gene identifier
position: 0-based position
num_pairs: Number of replicate pairs tested
p_value_CMH: CMH test p-value
time: Timepoint (for longitudinal data)
notes: Error messages or warnings
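A common follow-up is to pull out the sites whose CMH p-value falls below a threshold. This is a minimal sketch on made-up rows: significant_sites is a hypothetical helper, the 0.05 cutoff is illustrative, and treating unparsable p-values as "test not run" is an assumption about how failures are recorded.

```python
import csv
import io

# Made-up rows with a subset of the columns listed above.
CMH_TSV = (
    "mag_id\tcontig\tposition\tp_value_CMH\tnotes\n"
    "MAG000001\tcontig_1\t10\t0.001\t\n"
    "MAG000001\tcontig_1\t42\t0.20\t\n"
    "MAG000001\tcontig_2\t7\t\ttoo few strata\n"
)


def significant_sites(handle, alpha=0.05):
    """Return (contig, position, p) tuples with p < alpha, smallest p first."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            p = float(row["p_value_CMH"])
        except ValueError:
            continue  # no usable p-value; the notes column may say why
        if p < alpha:
            hits.append((row["contig"], int(row["position"]), p))
    return sorted(hits, key=lambda hit: hit[2])


hits = significant_sites(io.StringIO(CMH_TSV))
print(hits)
```

No multiple-testing correction is applied here; that step is left out of the sketch.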
Examples¶
# Basic CMH test
alleleflux-cmh --input-df allele_freq.tsv --mag-id MAG000001 --output-dir cmh_results/
# Across timepoints mode
alleleflux-cmh --input-df allele_freq.tsv --mag-id MAG000001 \
--output-dir cmh_results/ --data-type across_time --group fat
# With preprocessing filter
alleleflux-cmh --input-df allele_freq.tsv --mag-id MAG000001 \
--output-dir cmh_results/ --preprocessed-df preproc.tsv --cpus 16
alleleflux-lmm — Linear mixed models for longitudinal analysis¶
Run LMM tests for longitudinal data with mixed effects.
alleleflux-lmm --input-df INPUT --preprocessed-df PREPROCESSED \
--group GROUP --mag-id MAG --output-dir DIR [options]
Arguments¶
| Argument | Description |
|---|---|
| --input-df | Path to input allele frequency dataframe. |
| --preprocessed-df | Path to filtered dataframe. |
| --group | Group name to analyze. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
alleleflux-two-sample-unpaired — unpaired two-sample tests¶
Perform unpaired Mann-Whitney U tests comparing two groups.
alleleflux-two-sample-unpaired --input-df INPUT --mag-id MAG \
--output-dir DIR [options]
Arguments¶
| Argument | Description |
|---|---|
| --input-df | Path to input allele frequency dataframe. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
alleleflux-two-sample-paired — paired two-sample tests¶
Perform paired Wilcoxon signed-rank tests on matched samples.
alleleflux-two-sample-paired --input-df INPUT --mag-id MAG \
--output-dir DIR [options]
Arguments¶
| Argument | Description |
|---|---|
| --input-df | Path to input allele frequency dataframe. |
| --mag-id | MAG ID to process. |
| --output-dir | Output directory. |
Evolution tools¶
alleleflux-dnds-from-timepoints — calculate dN/dS ratios¶
Compute dN/dS ratios from significant evolutionary sites.
alleleflux-dnds-from-timepoints --input INPUT --output OUTPUT [options]
Arguments¶
| Argument | Default | Description |
|---|---|---|
| --input | Required | Path to input significant sites table. |
| --output | Required | Output dN/dS results file. |
See dN/dS Analysis Guide for detailed workflow.
Accessory tools¶
alleleflux-create-mag-mapping — generate MAG mapping and combined FASTA¶
Create a contig-to-MAG mapping file and concatenate the individual MAG FASTA files into one combined FASTA.
alleleflux-create-mag-mapping --dir MAG_DIR --extension EXT \
--output-fasta COMBINED --output-mapping MAPPING [options]
Required Arguments¶
Argument |
Description |
|---|---|
|
Directory containing individual MAG FASTA files. |
|
File extension of MAG files (e.g., |
|
Path for combined output FASTA. |
|
Path for contig-to-MAG mapping file (TSV). |
Output¶
Combined FASTA: all contigs from all MAGs concatenated
Mapping file: contig_name\tmag_id (tab-separated)
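The mapping format can be illustrated by deriving it from FASTA headers by hand. This is a minimal sketch on in-memory data, not the tool's implementation: contig_names and build_mapping are hypothetical helpers, and taking the contig name as the header text up to the first whitespace and the MAG ID as the file stem are both assumptions. Whether the real file carries a header row is not stated here.

```python
import io


def contig_names(fasta_handle):
    """Yield the contig name from each FASTA header (text up to first whitespace)."""
    for line in fasta_handle:
        if line.startswith(">"):
            yield line[1:].split()[0]


def build_mapping(mags):
    """Return 'contig<TAB>mag_id' lines for a {mag_id: fasta_handle} dict."""
    return [f"{contig}\t{mag_id}"
            for mag_id, handle in mags.items()
            for contig in contig_names(handle)]


# Made-up MAGs; using the file stem as the MAG ID is an assumption.
mags = {
    "MAG000001": io.StringIO(">contig_1 len=8\nACGTACGT\n>contig_2\nGGCC\n"),
    "MAG000002": io.StringIO(">contig_3\nTTAA\n"),
}
mapping = build_mapping(mags)
print("\n".join(mapping))
```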
Examples¶
# Create mapping from directory of MAG FASTAs
alleleflux-create-mag-mapping --dir mags/ --extension fa \
--output-fasta combined_reference.fa --output-mapping mag_mapping.tsv
# With different extension
alleleflux-create-mag-mapping --dir mags/ --extension fasta \
--output-fasta reference.fasta --output-mapping mapping.tsv
alleleflux-add-bam-path — add BAM file paths to metadata¶
Fill the bam_path column in the sample metadata by matching samples to BAM files.
alleleflux-add-bam-path --metadata INPUT --output OUTPUT \
--bam-dir DIR [options]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
Required |
Path to input metadata file. |
|
Required |
Path to save updated metadata. |
|
|
Directory containing BAM files. |
|
|
Extension of BAM files. |
|
False |
Drop samples without matching BAM files. |
alleleflux-coverage-allele-stats — compute coverage and allele statistics¶
Calculate coverage and allele statistics summary for all MAGs.
alleleflux-coverage-allele-stats --input-dir DIR --output-file OUTPUT [options]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
Required |
Directory containing profile files. |
|
Required |
Output statistics file path. |
|
All available |
Number of processors. |
Output¶
Summary statistics per MAG: mean coverage, breadth, allele diversity metrics.
alleleflux-list-mags — enumerate MAG IDs¶
List all unique MAG IDs from a directory of profile files.
alleleflux-list-mags --input-dir DIR [--output-file FILE]
Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
Required |
Directory containing MAG profile files. |
|
None |
Optional output file (prints to stdout if not specified). |
|
|
Glob pattern for file matching. |
Additional accessory tools¶
alleleflux-positions-qc: Position-level QC filtering
alleleflux-copy-profiles: Copy or symlink profile files
alleleflux-single-sample: Within-group single-sample test
alleleflux-preprocess-between-groups: Position filtering between groups
alleleflux-preprocess-within-group: Position filtering within groups
alleleflux-preprocessing-eligibility: Aggregate preprocessing status
alleleflux-p-value-summary: Summarize p-values across tests
alleleflux-outliers: Flag outlier genes
alleleflux-taxa-scores: Derive taxa-level scores
alleleflux-gene-scores: Derive gene-level scores
Visualization tools¶
alleleflux-plot-trajectories — plot allele frequency trajectories¶
Generate allele frequency trajectory visualizations from tracked allele data.
alleleflux-plot-trajectories --input-file FILE [options]
Required Arguments¶
Argument |
Description |
|---|---|
|
Long-format frequency table from |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
|
Column for ranking sites: |
|
10 |
Number of top sites for line plots (or |
|
|
Number of sites for box/violin plots. |
|
|
X-axis column: |
|
None |
Custom x-axis order (space-separated values). |
|
|
Plot types: |
|
False |
Generate individual plots per site. |
|
None |
Number of sites for per-site plots. |
|
|
Output directory. |
|
|
Format: |
|
False |
Aggregate trajectories by replicate. |
|
None |
Day binning width (requires |
|
1 |
Minimum samples per time bin. |
|
0.8 |
Line transparency (0-1). |
Output¶
{mag_id}_line_plot.{format}: Combined line trajectories
{mag_id}_box_plot.{format}: Box plots by timepoint
{mag_id}_violin_plot.{format}: Violin plots by timepoint
per_site/{contig}_{position}_{gene}_line.{format}: Per-site plots (if enabled)
Examples¶
# Basic plotting
alleleflux-plot-trajectories --input-file tracked_alleles.tsv
# Multiple plot types with custom output
alleleflux-plot-trajectories --input-file tracked_alleles.tsv \
--plot-types line box violin --output-dir results/plots/ \
--output-format pdf
# Per-site plots for top 5 sites
alleleflux-plot-trajectories --input-file tracked_alleles.tsv \
--per-site --n-sites-per-site 5 --output-format svg
# With binning and custom axis order
alleleflux-plot-trajectories --input-file tracked_alleles.tsv \
--bin-width 7 --x-order "baseline week1 week2 week4 week8"
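The effect of day binning can be pictured with a small standalone example. This is a minimal sketch under stated assumptions, not the plotting tool's code: bin_days is a hypothetical helper, bins are assumed to start at multiples of the bin width, and bins with too few samples are assumed to be dropped.

```python
from collections import defaultdict


def bin_days(samples, bin_width, min_samples=1):
    """Group (day, frequency) pairs into fixed-width day bins.

    Returns {bin_start: mean frequency}, dropping bins that contain fewer
    than `min_samples` observations.
    """
    bins = defaultdict(list)
    for day, freq in samples:
        bins[(day // bin_width) * bin_width].append(freq)
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(bins.items())
            if len(vals) >= min_samples}


# Made-up observations: (sampling day, allele frequency).
obs = [(0, 0.10), (3, 0.20), (7, 0.40), (13, 0.60), (21, 0.90)]
weekly = bin_days(obs, bin_width=7)
weekly_min2 = bin_days(obs, bin_width=7, min_samples=2)
print(weekly)
print(weekly_min2)
```

With a width of 7, days 0 and 3 share one bin and days 7 and 13 share the next; requiring two samples per bin removes the bin that holds only day 21.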
alleleflux-track-alleles — track allele trajectories¶
Track anchor allele frequencies across all samples and timepoints.
alleleflux-track-alleles --mag-id MAG --anchor-file FILE \
--metadata META --output-dir DIR [options]
Required Arguments¶
Argument |
Description |
|---|---|
|
MAG identifier to process. |
|
Path to terminal nucleotides file (from |
|
Enhanced metadata file with |
|
Output directory. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
|
Anchor column to use for tracking. |
|
0 |
Minimum coverage required per site. |
|
All available |
Number of processors. |
Output¶
{mag_id}_frequency_table.wide.tsv: Sites × samples matrix
{mag_id}_frequency_table.long.tsv: Tidy format (for plotting)
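The relationship between the two tables is a pivot: the long table has one row per site and sample, the wide table one row per site. This is a minimal sketch on made-up records; to_wide is a hypothetical helper, and the field names site, sample_id, and frequency are illustrative rather than the actual column names.

```python
from collections import defaultdict

# Made-up long-format records; the field names are hypothetical.
long_rows = [
    {"site": "contig_1:10", "sample_id": "S1", "frequency": 0.10},
    {"site": "contig_1:10", "sample_id": "S2", "frequency": 0.35},
    {"site": "contig_2:7", "sample_id": "S1", "frequency": 0.80},
]


def to_wide(rows):
    """Pivot tidy rows into {site: {sample_id: frequency}}."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row["site"]][row["sample_id"]] = row["frequency"]
    return dict(wide)


wide = to_wide(long_rows)
samples = sorted({r["sample_id"] for r in long_rows})
for site, values in wide.items():
    # Missing site/sample combinations are shown as "NA".
    print(site, *[values.get(s, "NA") for s in samples], sep="\t")
```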
alleleflux-prepare-metadata — prepare metadata for visualization¶
Standardize and combine metadata tables for visualization workflows.
alleleflux-prepare-metadata --metadata-in INPUT --metadata-out OUTPUT \
--base-profile-dir DIR [options]
Required Arguments¶
| Argument | Description |
|---|---|
| --metadata-in | Input metadata table (TSV). |
| --metadata-out | Output standardized metadata file. |
| --base-profile-dir | Base directory containing sample profile subdirectories. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
|
Column name for sample IDs. |
|
|
Column name for experimental groups. |
|
|
Column name for timepoints. |
|
|
Column name for day/order (optional). |
|
|
Column name for replicates (optional). |
|
|
Column name for subject IDs. |
Output¶
Standardized metadata with columns: sample_id, group, time, subjectID, sample_profile_dir.
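The standardization step amounts to renaming user columns and attaching each sample's profile directory. This is a minimal sketch, not the tool's implementation: standardize is a hypothetical helper, the input column names are made up, and deriving sample_profile_dir as the base directory joined with the sample ID is an assumption based on the profile output layout described earlier.

```python
from pathlib import PurePosixPath


def standardize(row, base_profile_dir, columns):
    """Rename user columns to the standard names and add the profile directory.

    `columns` maps standard name -> name used in the input table.
    """
    out = {std: row[src] for std, src in columns.items()}
    # Assumption: each sample's profiles live in <base_profile_dir>/<sample_id>.
    out["sample_profile_dir"] = str(PurePosixPath(base_profile_dir) / out["sample_id"])
    return out


# Made-up input row with non-standard column names.
row = {"Sample": "S1", "Treatment": "control", "Visit": "week1", "Mouse": "m01"}
columns = {"sample_id": "Sample", "group": "Treatment",
           "time": "Visit", "subjectID": "Mouse"}
result = standardize(row, "profiles", columns)
print(result)
```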
alleleflux-terminal-nucleotide — identify terminal nucleotides¶
Find dominant terminal alleles at significant genomic sites.
alleleflux-terminal-nucleotide --significant-sites SITES \
--profile-dir DIR --metadata META --group GROUP \
--timepoint TP --output DIR [options]
Required Arguments¶
| Argument | Description |
|---|---|
| --significant-sites | Path to significant sites table (from p-value summary). |
| --profile-dir | Directory containing sample profile subdirectories. |
| --metadata | Sample metadata file. |
| --group | Target group name for terminal nucleotide calculation. |
| --timepoint | Target timepoint (typically endpoint). |
| --output | Output directory. |
Optional Arguments¶
Argument |
Default |
Description |
|---|---|---|
|
|
Significance column: |
|
0.05 |
Maximum p-value to include site. |
|
|
Test type to filter sites. |
|
None |
Optional additional group filter. |
|
All available |
Number of processors. |
|
INFO |
Logging level. |
Output¶
{mag_id}/{mag_id}_terminal_nucleotides.tsv: Terminal alleles per site
{mag_id}/{mag_id}_frequencies.tsv: Full frequency data
terminal_nucleotide_analysis_summary.tsv: Summary across MAGs
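One way to read "dominant terminal allele" is the base with the highest mean frequency among the target group's samples at the target timepoint. The following minimal sketch implements that reading on made-up data; dominant_allele is a hypothetical helper, and both the mean-frequency rule and the alphabetical tie-break are assumptions rather than documented AlleleFlux behavior.

```python
def dominant_allele(samples):
    """Return the base with the highest mean frequency across samples.

    `samples` is a non-empty list of {base: frequency} dicts, one per sample.
    Ties are broken alphabetically so the result is deterministic.
    """
    bases = "ACGT"
    means = {b: sum(s.get(b, 0.0) for s in samples) / len(samples) for b in bases}
    # max() keeps the first maximum, and sorted() puts bases in A, C, G, T order.
    return max(sorted(means), key=means.get)


# Made-up frequencies for one site in the target group at the target timepoint.
site = [
    {"A": 0.2, "G": 0.8},
    {"A": 0.4, "G": 0.6},
    {"A": 0.1, "G": 0.9},
]
print(dominant_allele(site))
```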
Getting help¶
View detailed help for any tool:
# Main command help
alleleflux --help
# Subcommand help
alleleflux run --help
alleleflux init --help
# Console script help
alleleflux-profile --help
alleleflux-cmh --help
alleleflux-plot-trajectories --help
For configuration details, see Configuration Reference. For how to run the workflow end to end, see Running the Workflow.