call-germline-dna-vars

Call germline DNA variants in a long-read DNA BAM file.

Usage

Template:

exacto call-germline-dna-vars \
    --bam-file <bam_file> \
    --bam-bai-file <bam_bai_file> \
    --fasta-file <fasta_file> \
    --output-tsv-file <output_tsv_file> \
    [--preset PRESET] \
    [--num-threads NUM_THREADS] \
    [--regions REGIONS [REGIONS ...]] \
    [--min-reads MIN_READS] \
    [--min-mapping-quality MIN_MAPPING_QUALITY] \
    [--min-base-quality MIN_BASE_QUALITY] \
    [--min-total-depth MIN_TOTAL_DEPTH] \
    [--min-alt-allele-fraction MIN_ALT_ALLELE_FRACTION] \
    [--min-size-proportion MIN_SIZE_PROPORTION] \
    [--max-ins-norm-edit-distance MAX_INS_NORM_EDIT_DISTANCE] \
    [--max-intrachromosomal-distance MAX_INTRACHROMOSOMAL_DISTANCE] \
    [--max-intrachromosomal-distance-tau MAX_INTRACHROMOSOMAL_DISTANCE_TAU] \
    [--max-interchromosomal-distance MAX_INTERCHROMOSOMAL_DISTANCE] \
    [--max-slippage-repeat-length MAX_SLIPPAGE_REPEAT_LENGTH] \
    [--chunk-size CHUNK_SIZE] \
    [--max-records MAX_RECORDS] \
    [--expected-variant-allele-fraction EXPECTED_VARIANT_ALLELE_FRACTION] \
    [--expected-mutation-rate EXPECTED_MUTATION_RATE] \
    [--expected-sequencing-error EXPECTED_SEQUENCING_ERROR] \
    [--expected-slippage-probability EXPECTED_SLIPPAGE_PROBABILITY] \
    [--max-f1-fraction MAX_F1_FRACTION] \
    [--max-fpr MAX_FPR] \
    [--temp-dir TEMP_DIR]

Example:

exacto call-germline-dna-vars \
    --bam-file sample_dna.sorted.bam \
    --bam-bai-file sample_dna.sorted.bam.bai \
    --fasta-file reference_genome.fasta \
    --preset pb \
    --output-tsv-file sample_germline_dna_variants.tsv
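
When this command is launched from a pipeline script, it can help to assemble the argument list programmatically. The following is a minimal Python sketch, not part of Exacto: the helper name `build_call_command` is hypothetical, it assumes the index file sits next to the BAM as `<bam>.bai`, and it applies the documented suggestion of setting the slippage probability to 2x the sequencing error rate whenever an explicit error rate is given. Only flags listed in the template above are used.

```python
import shlex


def build_call_command(bam, fasta, output_tsv, preset=None, regions=None,
                       expected_sequencing_error=None):
    """Assemble an argument list for `exacto call-germline-dna-vars`.

    Hypothetical helper for illustration; not part of Exacto itself.
    """
    cmd = [
        "exacto", "call-germline-dna-vars",
        "--bam-file", bam,
        "--bam-bai-file", bam + ".bai",  # assumption: index is <bam>.bai
        "--fasta-file", fasta,
        "--output-tsv-file", output_tsv,
    ]
    if preset is not None:
        cmd += ["--preset", preset]
    if regions:
        # --regions accepts one or more space-separated values
        cmd += ["--regions", *regions]
    if expected_sequencing_error is not None:
        # Explicit values override the preset; the slippage probability
        # follows the suggested 2x rule from the option description.
        cmd += [
            "--expected-sequencing-error", str(expected_sequencing_error),
            "--expected-slippage-probability", str(2 * expected_sequencing_error),
        ]
    return cmd


cmd = build_call_command(
    "sample_dna.sorted.bam",
    "reference_genome.fasta",
    "sample_germline_dna_variants.tsv",
    preset="pb",
    regions=["chr1", "chr2:1-1000000"],
)
print(shlex.join(cmd))  # shell-quoted command line, ready to log or run
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell-quoting problems with file paths.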

Description

Call germline DNA variants in a long-read DNA BAM file.

Note: At a glance

Inputs: *.bam, *.bam.bai, *.fasta

Outputs: *.tsv (one row per germline DNA variant call)

Typical next step: annotate-vars

Required arguments

Flag Type Description
--bam-file str Input BAM file.
--bam-bai-file str Input BAM.BAI file.
--fasta-file str Input reference genome FASTA file.
--output-tsv-file str Output TSV file.

Optional arguments

Flag Type Default Description
--preset str (pb|ont) Sequencing-platform preset that fills in platform-typical defaults for --expected-sequencing-error and --expected-slippage-probability. Choices: ‘pb’ (PacBio HiFi) or ‘ont’ (Oxford Nanopore). Any explicitly set parameter overrides the preset.
--num-threads int 4 Number of threads.
--regions str Genomic regions in which to identify variants (e.g. --regions chr1 chr2 or --regions chr1:1-1000000 chr2:1-1000000). If unspecified, Exacto identifies variants in all contigs found in the BAM file (--bam-file BAM_FILE).
--min-reads int 3 Minimum number of supporting reads.
--min-mapping-quality int 4 Minimum mapping quality.
--min-base-quality int 30 Minimum base quality.
--min-total-depth int 3 Minimum total depth.
--min-alt-allele-fraction float 0.2 Minimum alternate allele fraction.
--min-size-proportion float 0.5 Minimum size proportion between two variants. Size proportion = smaller variant size / larger variant size.
--max-ins-norm-edit-distance float 0.5 Maximum insertion normalized edit (Levenshtein) distance. Normalized edit distance = edit distance / longer insertion size.
--max-intrachromosomal-distance int 1000 Maximum distance for clustering intrachromosomal variants.
--max-intrachromosomal-distance-tau int 2000 Maximum distance tau for clustering intrachromosomal variants.
--max-interchromosomal-distance int 1000 Maximum distance for clustering interchromosomal variants.
--max-slippage-repeat-length int 30 Maximum slippage repeat length.
--chunk-size int 100000 Chunk size for variant calling.
--max-records int 7 Maximum number of records per read name. Reads whose name appears in more than this many records are excluded.
--expected-variant-allele-fraction float 0.5 Expected variant allele fraction.
--expected-mutation-rate float 0.001 Expected mutation rate.
--expected-sequencing-error float Expected sequencing error rate.
--expected-slippage-probability float Expected slippage probability. Suggested: 2x the expected sequencing error rate.
--max-f1-fraction float 0.99 Maximum F1 value fraction.
--max-fpr float 1e-06 Maximum false positive rate.
--temp-dir str Temp directory.
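
The two ratios defined for --min-size-proportion and --max-ins-norm-edit-distance can be illustrated with a short Python sketch. This is an illustration of the formulas as stated in the table, not Exacto's implementation: the function names are hypothetical, the Levenshtein routine is a textbook dynamic-programming version, and treating both thresholds as inclusive in `is_similar` is an assumption.

```python
def size_proportion(size_a, size_b):
    """Smaller variant size divided by larger variant size."""
    small, large = sorted((abs(size_a), abs(size_b)))
    return small / large if large else 1.0


def levenshtein(a, b):
    """Textbook dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character from a
                curr[j - 1] + 1,           # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(ins_a, ins_b):
    """Edit distance divided by the length of the longer insertion."""
    longer = max(len(ins_a), len(ins_b))
    return levenshtein(ins_a, ins_b) / longer if longer else 0.0


def is_similar(ins_a, ins_b, min_size_proportion=0.5,
               max_ins_norm_edit_distance=0.5):
    """Assumed check: both thresholds are treated as inclusive here."""
    return (
        size_proportion(len(ins_a), len(ins_b)) >= min_size_proportion
        and normalized_edit_distance(ins_a, ins_b) <= max_ins_norm_edit_distance
    )


# Two hypothetical inserted sequences that differ by one base
print(is_similar("ACGTACGT", "ACGTACGA"))
```

With the default thresholds, two insertions of very different lengths fail the size-proportion test before their sequences are compared, which is why both values default to the same permissive 0.5.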