Custom classes
Defines classes for representing proteins and their structural components. This includes handling domains, fragments, and subsections within protein sequences, intended for use as part of the AlphaFragment workflow.
- Classes:
Domain: Represents a domain within a protein sequence.
Protein: Models a protein, including sequence, domains, and fragments.
ProteinSubsection: Represents a subsection of a protein sequence.
- class alphafragment.classes.Domain(identifier, start, end, domain_type)
Bases: object
Represents a domain in a protein sequence, defined by start/end positions and a type.
- Attributes:
id (str): Identifier for the domain.
start (int): Start position of the domain in the sequence. Must be >= 0.
end (int): End position of the domain in the sequence. Must be >= 0.
domain_type (str): Type of the domain.
- Raises:
ValueError: If start or end is less than 0, or ‘start’ > ‘end’.
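The validation rules listed above can be illustrated with a small stand-in class. This is a sketch built only from the documented rules, not the AlphaFragment source: `DomainSketch` is a hypothetical name, and the identifier and type strings are arbitrary placeholders.

```python
class DomainSketch:
    """Hypothetical stand-in that mirrors the documented Domain checks."""

    def __init__(self, identifier, start, end, domain_type):
        # Documented rule: start and end must both be >= 0.
        if start < 0 or end < 0:
            raise ValueError("start and end must be >= 0")
        # Documented rule: start must not be greater than end.
        if start > end:
            raise ValueError("start must not be greater than end")
        self.id = identifier
        self.start = start
        self.end = end
        self.domain_type = domain_type


valid = DomainSketch("D1", 10, 120, "example_type")
print(valid.id, valid.start, valid.end, valid.domain_type)

try:
    DomainSketch("D2", 50, 20, "example_type")
except ValueError as err:
    print("rejected:", err)
```

Note that, as documented, only start > end is rejected, so a domain with equal start and end passes these checks.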
- class alphafragment.classes.Protein(name, accession_id, sequence, first_res=0, last_res=None, domain_list=None, fragment_list=None, fragment_sequences=None)
Bases: object
Models a protein, including sequence, domains and fragments. Domains are significant structural or functional units within the protein, while fragments refer to specific subsequences of the protein sequence created as part of the AlphaFragment workflow.
- Attributes:
name (str): Name of the protein.
accession_id (str): UniProt accession ID.
sequence (str): Amino acid sequence of the protein.
first_res (int): Index of the first residue.
last_res (int): Index of the last residue. Defaults to the sequence length.
domain_list (list of Domain instances, optional): Domains within the protein.
fragment_list (list of tuples, optional): Fragments identified in the protein sequence, each represented as a tuple of the form (start_pos, end_pos) using Python slice notation (i.e. inclusive of start, exclusive of end).
fragment_sequences (list of str, optional): Sequences of the fragments. Only used when reinitializing fragments, so that dimeric fragments can be added.
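The slice convention for fragment tuples can be shown with plain Python, independent of the library. The sequence and fragment boundaries below are made-up illustration values, not output from AlphaFragment.

```python
# Made-up example sequence (22 residues) and fragment boundaries.
sequence = "MKTAYIAKQRQISFVKSHFSRQ"
fragment_list = [(0, 8), (8, 15), (15, 22)]

# Each (start_pos, end_pos) tuple maps directly onto a Python slice:
# start is included, end is excluded.
fragment_sequences = [sequence[start:end] for start, end in fragment_list]

for (start, end), fragment in zip(fragment_list, fragment_sequences):
    print(start, end, fragment)
```

Because the end of one tuple equals the start of the next in this example, the slices cover the sequence with no gap and no repeated residue.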
- add_domain(domain)
Adds a Domain instance to the protein’s domain list.
- Parameters:
domain (Domain): The Domain instance to be added.
- Raises:
ValueError: If the input is not an instance of the Domain class.
- add_fragment(*args)
Adds a fragment to the protein’s fragment list. Allows input as either separate start and end integers or a tuple (start, end). Ensures that the fragment’s start and end positions are positive integers and that the start is less than the end. Also checks for continuity with the last fragment added.
- Parameters:
args: Can be two integers (start, end) or a single tuple (start, end).
- Raises:
ValueError: If start or end are not positive integers, if start is not less than end, if the fragment does not follow sequentially after the last added fragment, or if it does not fit within the sequence bounds.
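The two accepted call forms can be sketched with a standalone helper. This is an illustration of the argument handling described above, not the library's implementation: `parse_fragment_args` is a hypothetical name, the sketch assumes 0 is an acceptable start position, and it omits the continuity and sequence-bounds checks because their exact rules are not specified here.

```python
def parse_fragment_args(*args):
    """Hypothetical helper: normalise the two call forms into a (start, end) tuple."""
    if len(args) == 1 and isinstance(args[0], tuple) and len(args[0]) == 2:
        # Called as parse_fragment_args((start, end))
        start, end = args[0]
    elif len(args) == 2:
        # Called as parse_fragment_args(start, end)
        start, end = args
    else:
        raise ValueError("expected two integers or a single (start, end) tuple")

    for value in (start, end):
        # bool is a subclass of int, so it is excluded explicitly.
        # Assumption: 0 is treated as a valid position in this sketch.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError("start and end must be non-negative integers")

    if start >= end:
        raise ValueError("start must be less than end")
    return (start, end)


print(parse_fragment_args(0, 10))
print(parse_fragment_args((10, 25)))
```

Normalising both forms to a single tuple up front keeps the later validation in one code path.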
- add_fragment_sequences(sequence_list)
Adds a list of fragment sequences to the protein.
- Parameters:
sequence_list (list of str): List of sequences corresponding to the fragments.
- class alphafragment.classes.ProteinSubsection(parent_protein, first_res, last_res)
Bases: Protein
Represents a specific subsection of a protein sequence, inheriting from the Protein class. Initialized from a parent protein and the specified first_res/last_res positions within the parent's sequence. Takes the sequence within the specified region and inherits all domains and fragments from the parent.
- Attributes:
parent_protein (Protein): The original protein from which the subsection is derived.
first_res (int): Start position of the subsection in the parent protein sequence.
last_res (int): End position of the subsection in the parent protein sequence.
- Note:
first_res and last_res use 0-based indexing and are both inclusive.
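Since the bounds are inclusive while Python slices exclude their end index, extracting the matching region from a sequence needs an offset of one. The helper below is a hypothetical illustration of that indexing convention only; how ProteinSubsection performs the extraction internally is not shown in this reference, and the sequence is a made-up example.

```python
def subsection_sequence(parent_sequence, first_res, last_res):
    """Hypothetical helper: extract an inclusive, 0-based residue range."""
    # Python slices exclude the end index, so add 1 to include last_res.
    return parent_sequence[first_res:last_res + 1]


parent_sequence = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up example sequence
region = subsection_sequence(parent_sequence, 2, 5)
print(region)       # TAYI
print(len(region))  # 4, i.e. last_res - first_res + 1
```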