tcrpmhcdataset.pMHC
This Python 3 module implements the pMHC dataclass.
Defines a meaningful pMHC class.
Args:
- peptide (str): The peptide sequence
- hla_allele (str): The HLA allele
- cognate_tcr (TCR): The cognate TCR
- reference (str): The reference for this pMHC
Attributes:
- peptide (str): The peptide sequence
- allele (str): The HLA allele
- tcrs (set): The set of cognate TCRs
- mhc (str): The MHC sequence
- pseudo (str): The pseudo MHC sequence
- references (set): The set of references for this pMHC
Implements:
- __post_init__: Initialize the pMHC object
- __repr__: Return a string representation of the pMHC object for tokenization purposes
- __str__: Return a string representation of the pMHC object for user interaction
- __eq__: Check if two pMHC objects are equal
- __hash__: Return the hash of the pMHC object
- add_tcr: Add a cognate TCR to the set of cognate TCRs for this pMHC
- get_tcrs: Get the set of cognate TCRs for this pMHC
- add_reference: Add a reference to the set of references for this pMHC
- get_references: Get the set of references for this pMHC
- hla_allele_parser: Custom HLA allele parser that standardizes the allele and imputes the canonical HLA from haplotypes
- check_mutations: Check the locus of the mutations before making mutations to base sequence
- apply_mutations: Apply mutations to the base sequence
- hla_allele2seq: Take an MHCgnomes-standardized allele name and return the IMGT/HLA database sequence
- hla_allele2pseudo: Take an imperfect allele name and return the NetMHC pseudo-sequence
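The pairing of __eq__ and __hash__ listed above is what lets pMHC objects act as set members and dictionary keys. The following is a minimal sketch of that pattern, not the library's implementation: the class name PMHCSketch is hypothetical, and it assumes that two pMHCs count as equal when their peptide and allele both match, which the documentation does not state.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: we supply __eq__ and __hash__ ourselves
class PMHCSketch:
    """Hypothetical stand-in for the pMHC dataclass."""

    peptide: str
    allele: str
    tcrs: set = field(default_factory=set)
    references: set = field(default_factory=set)

    def __eq__(self, other):
        # Assumption: identity is the (peptide, allele) pair only.
        if not isinstance(other, PMHCSketch):
            return NotImplemented
        return (self.peptide, self.allele) == (other.peptide, other.allele)

    def __hash__(self):
        # Must agree with __eq__, so hash the same pair.
        return hash((self.peptide, self.allele))


first = PMHCSketch("GILGFVFTL", "HLA-A*02:01")
duplicate = PMHCSketch("GILGFVFTL", "HLA-A*02:01")
unique_pmhcs = {first, duplicate}  # collapses to a single entry
```

Keeping the mutable tcrs and references sets out of the hash means an object's hash stays stable as TCRs and references are added to it.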
Add a cognate TCR to the set of cognate TCRs for this pMHC
Args:
- cognate_tcr (TCR): The TCR to add
Returns:
- None
Get the set of cognate TCRs for this pMHC
Returns:
- tcrs: The set of cognate TCRs for this pMHC
If another reference supports this pMHC, add it to the existing set of references.
Args:
- reference (str): The reference to add. Can be in any format recognized by the user.
Returns:
- None
Get the set of references for this pMHC
Returns:
- references: The set of references for this pMHC
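The four accessors above can be sketched as thin wrappers around Python sets. This is an illustrative sketch only: the class name PMHCRecord is hypothetical, and plain strings stand in for TCR objects and references.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class PMHCRecord:
    """Hypothetical stand-in showing the add/get accessor pattern."""

    peptide: str
    allele: str
    tcrs: set = field(default_factory=set)
    references: set = field(default_factory=set)

    def add_tcr(self, cognate_tcr):
        # Set semantics: adding the same TCR twice keeps one copy.
        self.tcrs.add(cognate_tcr)

    def get_tcrs(self):
        return self.tcrs

    def add_reference(self, reference):
        # Any hashable format works, e.g. a string chosen by the user.
        self.references.add(reference)

    def get_references(self):
        return self.references


record = PMHCRecord("GILGFVFTL", "HLA-A*02:01")
record.add_tcr("tcr_1")            # placeholder for a TCR object
record.add_reference("study_A")    # placeholder reference labels
record.add_reference("study_A")    # duplicate, ignored by the set
record.add_reference("study_B")
```

Because both collections are sets, repeated observations of the same TCR or reference do not inflate the record.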
NOTE: Some edge cases cause imputation to fail.
Enthusiastic parser. Matches the HLA gene and, if serotype information is given, cycles through the allele possibilities (1,11) and takes the first one found among the available sequenced MHCs as the canonical allele for that serotype.
Args:
- hla_string (str): The HLA information
Returns:
- parsed_hla_string (str): The standardized HLA allele string
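The imputation step described above can be sketched roughly as follows. This is not the library's parser: the function name impute_allele, the regular expression, and the `available` set of sequenced alleles are all hypothetical, and "(1,11)" is read here as range(1, 11), which is an assumption.

```python
import re


def impute_allele(hla_string, available):
    """Return the first candidate allele found in `available`, else None.

    Hypothetical sketch: handles only simple inputs such as "HLA-A2" or "B7".
    """
    match = re.fullmatch(r"(?:HLA-)?([ABC])\*?(\d{1,2})", hla_string.strip())
    if match is None:
        return None  # one of the edge cases where imputation fails
    gene, group = match.group(1), int(match.group(2))
    # Assumption: "(1,11)" means trying protein fields 01 through 10.
    for protein in range(1, 11):
        candidate = f"HLA-{gene}*{group:02d}:{protein:02d}"
        if candidate in available:
            return candidate
    return None


# Hypothetical set of alleles for which a sequence is available.
sequenced = {"HLA-A*02:01", "HLA-A*02:05", "HLA-B*07:02"}
imputed = impute_allele("HLA-A2", sequenced)
```

Returning None for unmatched input keeps failure explicit, so a caller can decide whether to skip the record or raise.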
Check the locus of the mutations before making mutations to base sequence
Args:
- mutations: Tuple of (position, aa_original, aa_mutant)
- og_seq: WT consensus sequence (may or may not include a linker sequence)
Returns:
- True if the original residue matches at every locus, False otherwise
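A check of this kind might look like the sketch below. The function name check_mutations_sketch is hypothetical, and two assumptions are made that the documentation does not confirm: mutations arrive as an iterable of (position, aa_original, aa_mutant) tuples, and positions are 1-based.

```python
def check_mutations_sketch(mutations, og_seq):
    """Return True only if every mutation's original residue matches og_seq."""
    for position, aa_original, _aa_mutant in mutations:
        index = position - 1  # assumption: positions are 1-based
        if not 0 <= index < len(og_seq):
            return False  # position falls outside the sequence
        if og_seq[index] != aa_original:
            return False  # wild-type residue does not match
    return True


ok = check_mutations_sketch([(2, "C", "A")], "ACDE")
mismatch = check_mutations_sketch([(2, "D", "A")], "ACDE")
```

If og_seq carries a linker sequence, positions would need to be offset accordingly before the comparison; that offset is not handled here.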
Apply mutations to the base sequence.
Args:
- mutations: Tuple of (position, aa_original, aa_mutant)
- og_seq: WT consensus sequence (may or may not include a linker sequence)
Returns:
- seq: The mutated sequence
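Applying the mutations can be sketched as a residue-by-residue substitution. As before, this is an illustration under assumptions rather than the library's code: apply_mutations_sketch is a hypothetical name, mutations are taken to be an iterable of (position, aa_original, aa_mutant) tuples, and positions are taken to be 1-based.

```python
def apply_mutations_sketch(mutations, og_seq):
    """Return og_seq with each (position, aa_original, aa_mutant) applied."""
    residues = list(og_seq)  # strings are immutable, so edit a list
    for position, aa_original, aa_mutant in mutations:
        index = position - 1  # assumption: positions are 1-based
        if residues[index] != aa_original:
            # Guard added for this sketch; the locus check is documented
            # as a separate step that runs before mutations are applied.
            raise ValueError(f"expected {aa_original} at position {position}")
        residues[index] = aa_mutant
    return "".join(residues)


mutated = apply_mutations_sketch([(2, "C", "A"), (4, "E", "K")], "ACDE")
```

Working on a list and joining once at the end avoids rebuilding the string for every substitution.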