build-genome-var-graph
Build a genome variation graph.
Usage
Template:
exacto build-genome-var-graph \
--variants-tsv-file <variants_tsv_file> \
--fasta-file <fasta_file> \
--output-fasta-file <output_fasta_file> \
--sequence-prefix <sequence_prefix> \
[--remove-unknown-bases REMOVE_UNKNOWN_BASES] \
[--only-variant-sequences ONLY_VARIANT_SEQUENCES] \
[--graph-type GRAPH_TYPE] \
[--num-threads NUM_THREADS]
Example:
exacto build-genome-var-graph \
--variants-tsv-file tumor_dna_rna_variants_integrated.tsv \
--fasta-file reference_genome.fasta \
--output-fasta-file genome_var_graph.fasta \
--sequence-prefix sample01
Description
Build a genome variation graph.
At a glance
Inputs: *.tsv (variants), *.fasta (reference genome)
Outputs: *.fasta (output sequences), written to the path given by --output-fasta-file
Typical next step: Endpoint — feeds downstream graph-aware analyses
Required arguments
| Flag | Type | Description |
|---|---|---|
| --variants-tsv-file | str | Variants TSV file. Expected columns: ‘variant_id’, ‘chromosome_1’, ‘position_1’, ‘operation_1’, ‘strand_1’, ‘chromosome_2’, ‘position_2’, ‘operation_2’, ‘strand_2’, ‘sequence’. |
| --fasta-file | str | Reference genome FASTA file (variation graph backbone). |
| --output-fasta-file | str | Output FASTA file. |
| --sequence-prefix | str | Sequence prefix. |
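Because --variants-tsv-file expects a fixed set of columns, it can help to check the header before starting a run. The following is a minimal sketch, not part of exacto: the helper name `missing_columns` is hypothetical, and it assumes the TSV has a header row and that column order does not matter.

```python
import csv
import io

# Column names copied from the --variants-tsv-file description above.
EXPECTED_COLUMNS = [
    "variant_id",
    "chromosome_1", "position_1", "operation_1", "strand_1",
    "chromosome_2", "position_2", "operation_2", "strand_2",
    "sequence",
]


def missing_columns(tsv_handle):
    """Return the expected columns that are absent from the TSV header row.

    Hypothetical helper: assumes the first row is a header and ignores
    column order and any extra columns.
    """
    reader = csv.reader(tsv_handle, delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Illustrative in-memory header that lacks the 'sequence' column.
example = io.StringIO("\t".join(EXPECTED_COLUMNS[:-1]) + "\n")
print(missing_columns(example))  # → ['sequence']
```

For a real file, the same function could be called on `open(path, newline="")`; an empty list would mean every expected column is present.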
Optional arguments
| Flag | Type | Default | Description |
|---|---|---|---|
| --remove-unknown-bases | str2bool | yes | If ‘yes’, then unknown nucleotides (‘N’ or ‘n’) are removed. |
| --only-variant-sequences | str2bool | no | If ‘yes’, then only variant sequences will be output. |
| --graph-type | str | individual | Variation graph type. Either ‘individual’ or ‘population’. |
| --num-threads | int | 4 | Number of threads. |
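When the command is launched from a script, the flags and defaults listed above can be assembled programmatically. The sketch below is an illustration only: `build_command` is a hypothetical helper, not part of exacto, and it assumes the str2bool options accept the literal strings `yes` and `no` shown in the Default column.

```python
import shlex


def build_command(variants_tsv, fasta, output_fasta, sequence_prefix,
                  remove_unknown_bases=True, only_variant_sequences=False,
                  graph_type="individual", num_threads=4):
    """Assemble an argv list for `exacto build-genome-var-graph`.

    Hypothetical helper; keyword defaults mirror the documented defaults.
    """
    if graph_type not in ("individual", "population"):
        raise ValueError("graph_type must be 'individual' or 'population'")

    def yes_no(flag):
        # Assumption: the CLI's str2bool type accepts 'yes' and 'no'.
        return "yes" if flag else "no"

    return [
        "exacto", "build-genome-var-graph",
        "--variants-tsv-file", variants_tsv,
        "--fasta-file", fasta,
        "--output-fasta-file", output_fasta,
        "--sequence-prefix", sequence_prefix,
        "--remove-unknown-bases", yes_no(remove_unknown_bases),
        "--only-variant-sequences", yes_no(only_variant_sequences),
        "--graph-type", graph_type,
        "--num-threads", str(num_threads),
    ]


# Illustrative file names; replace them with real paths.
cmd = build_command(
    "tumor_dna_rna_variants_integrated.tsv",
    "reference_genome.fasta",
    "genome_var_graph.fasta",
    "sample01",
    graph_type="population",
)
print(shlex.join(cmd))  # shell-quoted form, convenient for logging
```

The resulting list could then be passed to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting problems with unusual file names.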