tcrpmhcdataset.pMHC

This Python 3 module implements the pMHC dataclass.

@dataclass(frozen=True)
class pMHC:

Define a meaningful pMHC (peptide-MHC) class.

Args:

  • peptide (str): The peptide sequence
  • hla_allele (str): The HLA allele
  • cognate_tcr (TCR): The cognate TCR
  • reference (str): The reference for this pMHC

Attributes:

  • peptide (str): The peptide sequence
  • allele (str): The HLA allele
  • tcrs (set): The set of cognate TCRs
  • mhc (str): The MHC sequence
  • pseudo (str): The pseudo MHC sequence
  • references (set): The set of references for this pMHC

Implements:

  • __post_init__: Initialize the pMHC object
  • __repr__: Return a string representation of the pMHC object for tokenization purposes
  • __str__: Return a string representation of the pMHC object for user interaction
  • __eq__: Check if two pMHC objects are equal
  • __hash__: Return the hash of the pMHC object
  • add_tcr: Add a cognate TCR to the set of cognate TCRs for this pMHC
  • get_tcrs: Get the set of cognate TCRs for this pMHC
  • add_reference: Add a reference to the set of references for this pMHC
  • get_references: Get the set of references for this pMHC
  • hla_allele_parser: Custom HLA allele parser that standardizes the allele and imputes the canonical HLA from haplotypes
  • check_mutations: Check the locus of the mutations before making mutations to base sequence
  • apply_mutations: Apply mutations to the base sequence
  • hla_allele2seq: Take an MHCgnomes-standardized allele name and return the IMGT/HLA database sequence
  • hla_allele2pseudo: Take an imperfect allele name and return the NetMHC Pseudo-sequence
pMHC( peptide: Optional[str] = None, hla_allele: Optional[str] = None, cognate_tcr: Optional[object] = None, reference: Optional[str] = None, use_pseudo: bool = True, use_mhc: bool = False, eager_impute: bool = False, tcrs: Set[object] = <factory>, references: Set[str] = <factory>)
peptide: Optional[str] = None
hla_allele: Optional[str] = None
cognate_tcr: Optional[object] = None
reference: Optional[str] = None
use_pseudo: bool = True
use_mhc: bool = False
eager_impute: bool = False
tcrs: Set[object]
references: Set[str]
allele
mhc
pseudo
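The signature above combines `frozen=True` with set-valued fields built by a factory, which can look contradictory at first. The following is a minimal sketch of how that pattern can work, not the library's implementation: `PMHCSketch` is a hypothetical stand-in, and setting derived attributes such as `allele` through `object.__setattr__` in `__post_init__` is an assumption about one way to do it on a frozen dataclass.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Optional, Set


@dataclass(frozen=True)
class PMHCSketch:
    """Hypothetical stand-in for pMHC; not the library's implementation."""

    peptide: Optional[str] = None
    hla_allele: Optional[str] = None
    reference: Optional[str] = None
    # default_factory gives every instance its own, unshared set.
    references: Set[str] = field(default_factory=set)

    def __post_init__(self):
        # frozen=True blocks normal attribute assignment, so a derived
        # attribute has to be set via object.__setattr__ (assumed approach).
        object.__setattr__(self, "allele", self.hla_allele)
        if self.reference is not None:
            self.references.add(self.reference)

    def add_reference(self, reference: str) -> None:
        # Mutating the set in place is allowed; rebinding the field is not.
        self.references.add(reference)

    def get_references(self) -> Set[str]:
        return self.references


p = PMHCSketch(peptide="GILGFVFTL", hla_allele="HLA-A*02:01", reference="ref-1")
p.add_reference("ref-2")
```

In this sketch, freezing protects the identifying fields (`peptide`, `hla_allele`) from reassignment, while the set of references can still grow as more supporting sources are found.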
def add_tcr(self, cognate_tcr):

Add a cognate TCR to the set of cognate TCRs for this pMHC

Args:

  • cognate_tcr (TCR): The TCR to add

Returns:

  • None
def get_tcrs(self):

Get the set of cognate TCRs for this pMHC

Returns:

  • tcrs: The set of cognate TCRs for this pMHC
def add_reference(self, reference):

If another reference supports this pMHC, add it to the existing set of references.

Args:

  • reference (str): The reference to add. Can be in any format recognized by the user.

Returns:

  • None
def get_references(self):

Get the set of references for this pMHC

Returns:

  • references: The set of references for this pMHC
def hla_allele_parser(self, hla_string):

NOTE: Some edge cases cause imputation to fail.

Enthusiastic parser. Match the HLA gene and, if serotype information is given, cycle through the allele possibilities (1,11) to identify the canonical serotype as the first match among the available sequenced MHCs.

Args:

  • hla_string (str): The HLA information

Returns:

  • parsed_hla_string (str): The standardized HLA allele string
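The imputation step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the parser's actual code: `impute_canonical_allele` and `known_alleles` are hypothetical names, the accepted input format is deliberately narrow, and reading "(1,11)" as Python's `range(1, 11)` is an assumption.

```python
import re
from typing import Optional, Set


def impute_canonical_allele(hla_string: str, known_alleles: Set[str]) -> Optional[str]:
    """Hypothetical helper: return the first HLA-<gene>*<group>:<nn> found in known_alleles."""
    # Accept inputs such as "HLA-A2", "A2", or "A*02" (gene plus serotype/group only).
    match = re.fullmatch(r"(?:HLA-)?([ABC])\*?(\d{1,2})", hla_string.strip())
    if match is None:
        return None  # unrecognized format; the real parser may handle more cases
    gene, group = match.group(1), int(match.group(2))
    for protein in range(1, 11):  # tries :01 through :10 (assumed meaning of "(1,11)")
        candidate = f"HLA-{gene}*{group:02d}:{protein:02d}"
        if candidate in known_alleles:
            return candidate  # first match is treated as canonical
    return None


# Toy set standing in for the alleles with an available sequence.
known = {"HLA-A*02:01", "HLA-A*02:06", "HLA-B*07:02"}
```

With this toy set, `impute_canonical_allele("HLA-A2", known)` resolves to `HLA-A*02:01` because it is the first candidate present, and an input with no sequenced candidate returns `None`, which is one way the "failure to impute" cases could surface.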
@staticmethod
def check_mutations(mutations, og_seq):

Check the locus of the mutations before making mutations to base sequence

Args:

  • mutations (tuple): Tuple of (position, aa_original, aa_mutant)
  • og_seq (str): WT consensus sequence; may or may not include a linker sequence

Returns:

  • bool: True if the original residue matches at every locus, False otherwise
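A check of this kind could look like the sketch below. It is not the library's code: `check_mutations_sketch` is a hypothetical name, positions are assumed to be 1-based, and `mutations` is assumed to be an iterable of (position, aa_original, aa_mutant) tuples, none of which the source states explicitly.

```python
from typing import Iterable, Tuple

Mutation = Tuple[int, str, str]  # (position, aa_original, aa_mutant)


def check_mutations_sketch(mutations: Iterable[Mutation], og_seq: str) -> bool:
    """Return True only if every aa_original matches og_seq at its position.

    Positions are assumed to be 1-based; the source does not say.
    """
    for position, aa_original, _aa_mutant in mutations:
        index = position - 1  # convert the assumed 1-based position to a 0-based index
        if not 0 <= index < len(og_seq):
            return False  # position falls outside the sequence
        if og_seq[index] != aa_original:
            return False  # wild-type residue does not match
    return True
```

For the toy sequence `"ACDE"`, the mutation `(2, "C", "W")` passes because position 2 holds `C`, while `(2, "D", "W")` fails.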
@staticmethod
def apply_mutations(mutations, og_seq):

Apply mutations to the base sequence.

Args:

  • mutations (tuple): Tuple of (position, aa_original, aa_mutant)
  • og_seq (str): WT consensus sequence; may or may not include a linker sequence

Returns:

  • seq (str): The mutated sequence
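As an illustration of applying point mutations to a base sequence, here is a minimal sketch. It makes the same assumptions as any reading of the docstring must: `apply_mutations_sketch` is a hypothetical name, positions are treated as 1-based, `mutations` is treated as an iterable of tuples, and raising `ValueError` on a mismatch is a design choice of this sketch rather than documented behavior.

```python
from typing import Iterable, Tuple

Mutation = Tuple[int, str, str]  # (position, aa_original, aa_mutant)


def apply_mutations_sketch(mutations: Iterable[Mutation], og_seq: str) -> str:
    """Return a copy of og_seq with each substitution applied (1-based positions assumed)."""
    residues = list(og_seq)  # strings are immutable, so edit a list of residues
    for position, aa_original, aa_mutant in mutations:
        index = position - 1
        # Validate against the original sequence before substituting.
        if not 0 <= index < len(og_seq) or og_seq[index] != aa_original:
            raise ValueError(f"Mutation {aa_original}{position}{aa_mutant} does not match the sequence")
        residues[index] = aa_mutant
    return "".join(residues)


mutated = apply_mutations_sketch([(2, "C", "W"), (4, "E", "K")], "ACDE")
```

Validating each locus before substitution mirrors the role the documentation gives to `check_mutations`; the real methods may split these responsibilities differently.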
def hla_allele2seq(self):

Take an MHCgnomes-standardized allele name and return the IMGT/HLA database sequence. Capable of handling mutations through the MHCgnomes parser. DISCLAIMER: Potential error with C*03:03.

Returns:

  • seq (str): The sequence of the MHC allele
def hla_allele2pseudo(self):

Take an imperfect allele name and return the NetMHC pseudo-sequence.

NOTE: Pseudo-sequences are predefined, so not every allele has a value and mutations are not handled.

Returns:

  • pseudo_seq (str): The pseudo-sequence of the MHC allele