Utility functions for protein fragmentation
Internal utility functions for the protein fragmentation process.
- Functions:
validate_fragmentation_parameters: Validates the parameters used for protein fragmentation.
merge_overlapping_domains: Merges overlapping domains within a list of domains.
check_valid_cutpoint: Helper function to validate potential fragment boundaries.
recursive_fragmentation: Main function for recursively generating fragments.
- Dependencies:
Domain: A class representing a domain within a protein sequence.
Protein: A class representing a protein sequence.
- alphafragment.fragmentation_methods.check_valid_cutpoint(res, domains, sequence_end)
Checks if a slicing index is a valid cutpoint.
- Parameters:
res (int): The residue position to check; the sequence would be cut immediately before this residue.
domains (list of Domain): The domains within the protein.
sequence_end (int): The last residue position in the protein sequence.
- Returns:
bool: True if the residue position is a valid cutpoint; False otherwise.
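As an illustration of the check described above, here is a minimal sketch. It is not the library's implementation. The Domain stand-in with inclusive start and end attributes, the name is_valid_cutpoint_sketch, and the rule that positions outside 0..sequence_end + 1 are rejected are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical stand-in for the library's Domain class."""
    start: int  # first residue of the domain (assumed inclusive)
    end: int    # last residue of the domain (assumed inclusive)


def is_valid_cutpoint_sketch(res, domains, sequence_end):
    """Return True if cutting immediately before `res` splits no domain."""
    # Assumption: a cut outside the sequence is never valid.
    if res < 0 or res > sequence_end + 1:
        return False
    # A cut before `res` separates residues res - 1 and res, so it breaks
    # a domain only when both of those residues lie inside that domain.
    return not any(d.start < res <= d.end for d in domains)


domains = [Domain(10, 20)]
print(is_valid_cutpoint_sketch(10, domains, 100))  # cut just before the domain
print(is_valid_cutpoint_sketch(15, domains, 100))  # cut inside the domain
```

Under these assumptions a cut directly before a domain's first residue is allowed, because the whole domain stays in the following fragment.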
- alphafragment.fragmentation_methods.merge_overlapping_domains(domains)
Merges overlapping domains within a list of domains.
- Parameters:
domains (list of Domain): List of domain objects.
- Returns:
list of Domain: A list of domains where overlapping domains have been merged into single entries.
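Merging overlapping domains is a standard interval-merge problem, and one common way to do it is sketched below. This is an illustration only, not the library's code. The Domain stand-in with inclusive start and end attributes and the name merge_overlapping_sketch are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical stand-in for the library's Domain class."""
    start: int  # first residue of the domain (assumed inclusive)
    end: int    # last residue of the domain (assumed inclusive)


def merge_overlapping_sketch(domains):
    """Return a new list in which overlapping domains are combined."""
    merged = []
    # Sorting by start means each domain can only overlap the last merged one.
    for d in sorted(domains, key=lambda d: d.start):
        if merged and d.start <= merged[-1].end:
            # Overlap: extend the previous entry instead of adding a new one.
            merged[-1] = Domain(merged[-1].start, max(merged[-1].end, d.end))
        else:
            merged.append(Domain(d.start, d.end))
    return merged


print(merge_overlapping_sketch([Domain(40, 80), Domain(10, 50), Domain(100, 120)]))
# [Domain(start=10, end=80), Domain(start=100, end=120)]
```

The sketch builds new objects rather than mutating its input, so the original domain list is left untouched.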
- alphafragment.fragmentation_methods.recursive_fragmentation(protein, domains, fragment_start, length, overlap, cutpoints=None)
Recursively splits a protein sequence into overlapping fragments, avoiding breaking domains.
- Parameters:
protein (Protein): The protein object to fragment.
domains (list of Domain): The domains within the protein. These are passed in separately rather than read from protein.domain_list, because overlapping domains should be merged first.
fragment_start (int): The starting position for fragmentation.
length (dict): Dictionary containing the minimum, ideal, and maximum length values, in the format {'min': min_len, 'ideal': ideal_len, 'max': max_len}. All three values are integers, with min_len <= ideal_len <= max_len.
overlap (dict): Dictionary containing the minimum, ideal, and maximum overlap values, in the format {'min': min_overlap, 'ideal': ideal_overlap, 'max': max_overlap}. All three values are integers, with min_overlap <= ideal_overlap <= max_overlap.
cutpoints (list of tuples, optional): Accumulator for storing fragment cutpoints across recursive calls. Defaults to None.
- Returns:
list of tuples or None: The list of fragment cutpoints if successful; otherwise, None.
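To make the recursive idea concrete, the following is a simplified backtracking sketch under stated assumptions. It is not the library's algorithm. It takes a plain sequence length instead of a Protein object, uses 0-based positions with half-open (start, end) fragments, tries lengths and overlaps closest to the ideal first, and relies on a hypothetical Domain stand-in with inclusive start and end. The names fragment_sketch, cut_is_valid, and by_closeness are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical stand-in for the library's Domain class."""
    start: int  # assumed inclusive
    end: int    # assumed inclusive


def cut_is_valid(res, domains):
    """A cut before `res` is valid if it separates no two residues of a domain."""
    return not any(d.start < res <= d.end for d in domains)


def by_closeness(bounds):
    """All integers in [min, max], ordered by distance from the ideal value."""
    return sorted(range(bounds["min"], bounds["max"] + 1),
                  key=lambda v: abs(v - bounds["ideal"]))


def fragment_sketch(seq_len, domains, start, length, overlap, cutpoints=None):
    """Return a list of (start, end) fragments, or None if none can be found."""
    cutpoints = [] if cutpoints is None else cutpoints
    remaining = seq_len - start
    if remaining <= length["max"]:
        # The rest fits in one fragment; accept it only if it is long enough.
        if remaining >= length["min"]:
            return cutpoints + [(start, seq_len)]
        return None
    for frag_len in by_closeness(length):
        end = start + frag_len
        if not cut_is_valid(end, domains):
            continue
        for ov in by_closeness(overlap):
            next_start = end - ov
            if next_start <= start or not cut_is_valid(next_start, domains):
                continue
            # Recurse on the rest; a None result makes us try the next option.
            result = fragment_sketch(seq_len, domains, next_start,
                                     length, overlap, cutpoints + [(start, end)])
            if result is not None:
                return result
    return None


fragments = fragment_sketch(
    100, [Domain(30, 45)], 0,
    length={"min": 20, "ideal": 40, "max": 50},
    overlap={"min": 2, "ideal": 5, "max": 10},
)
print(fragments)
```

Returning None from a failed branch is what lets the caller back up and try a different fragment length or overlap, which matches the documented "list of tuples or None" return value.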
- alphafragment.fragmentation_methods.validate_fragmentation_parameters(protein, length, overlap)
Validates the parameters used for protein fragmentation.
- Parameters:
protein (Protein): The protein object to be fragmented.
length (dict): Dictionary containing the minimum, ideal, and maximum length values, in the format {'min': min_len, 'ideal': ideal_len, 'max': max_len}. All three values are integers, with min_len <= ideal_len <= max_len.
overlap (dict): Dictionary containing the minimum, ideal, and maximum overlap values, in the format {'min': min_overlap, 'ideal': ideal_overlap, 'max': max_overlap}. All three values are integers, with min_overlap <= ideal_overlap <= max_overlap.
- Returns:
None
- Raises:
ValueError: If any parameter fails validation.
TypeError: If the protein input is not an instance of the Protein class.
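The documented constraints can be illustrated with a short sketch. It covers only the rules stated above (a Protein instance, integer values, and min <= ideal <= max); the library may check more. The Protein stand-in and the names validate_sketch and _check_bounds are assumptions made for this example.

```python
class Protein:
    """Hypothetical stand-in for the library's Protein class."""

    def __init__(self, sequence):
        self.sequence = sequence


def _check_bounds(name, bounds):
    """Raise ValueError unless `bounds` holds ordered integer min/ideal/max."""
    keys = ("min", "ideal", "max")
    missing = [k for k in keys if k not in bounds]
    if missing:
        raise ValueError(f"{name} is missing keys: {missing}")
    if not all(isinstance(bounds[k], int) for k in keys):
        raise ValueError(f"{name} values must all be integers")
    if not bounds["min"] <= bounds["ideal"] <= bounds["max"]:
        raise ValueError(f"{name} must satisfy min <= ideal <= max")


def validate_sketch(protein, length, overlap):
    """Return None if the inputs look valid; raise otherwise."""
    if not isinstance(protein, Protein):
        raise TypeError("protein must be a Protein instance")
    _check_bounds("length", length)
    _check_bounds("overlap", overlap)


# Valid input passes silently.
validate_sketch(Protein("MKTAYIAK"),
                {"min": 20, "ideal": 40, "max": 50},
                {"min": 2, "ideal": 5, "max": 10})

# An out-of-order range is rejected.
try:
    validate_sketch(Protein("MKTAYIAK"),
                    {"min": 60, "ideal": 40, "max": 50},
                    {"min": 2, "ideal": 5, "max": 10})
except ValueError as err:
    print(f"rejected: {err}")
```

Raising instead of returning a boolean means a caller cannot accidentally ignore a failed validation.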