Creating input files for AlphaFold or AlphaPulldown

This module contains functions for generating pairs of protein fragments and writing them to files for use as input to AlphaFold or AlphaPulldown.

Functions:
  • get_protein_combinations: Generates a list of protein pairs based on the method selected.

  • output_fastas: Creates fasta files for each combination of protein fragments.

  • output_pulldown: Creates a fasta file and a txt file with protein fragment combinations compatible with AlphaPulldown.

Dependencies:
  • csv: For reading CSV files.

  • os: For creating directories and files.

  • combinations: For generating combinations of proteins.

  • Protein: A class representing a protein with sequence and domain information.

alphafragment.fragment_file_creation.get_protein_combinations(proteins, method, combinations_csv, one_protein)

Generates a list of protein pairs based on the method selected.

Parameters:
  • proteins (list): A list of Protein objects.

  • method (str): Which protein combinations to use - ‘all’ for all v all, ‘one’ for all v one, ‘specific’ for specific combinations (specified in a csv file using the combinations_csv argument).

  • combinations_csv (csv path, optional): The path to a CSV file containing specific protein combinations. Each row in the CSV should represent one combination, with two columns for the names of the two proteins, and no column headings. Required if method is ‘specific’, otherwise ignored.

  • one_protein (str, optional): Name of a protein to use for one v all combinations. Required if method is ‘one’, otherwise ignored.

Returns:
  • protein_combinations (list): A list of tuples, where each tuple contains two Protein objects.

Raises:
  • ValueError: If any of the inputs are invalid.

Note:
  • For the ‘all’ and ‘one’ methods, self-combinations (a protein paired with itself) are included.
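
The three methods described above can be illustrated with a minimal sketch. This is not the package's implementation: the Protein namedtuple is a hypothetical stand-in for the real Protein class, sketch_protein_combinations is an illustrative name, and itertools.combinations_with_replacement is an assumed way of including self-combinations.

```python
import csv
from collections import namedtuple
from itertools import combinations_with_replacement

# Hypothetical stand-in for the package's Protein class (assumption).
Protein = namedtuple("Protein", ["name", "sequence"])


def sketch_protein_combinations(proteins, method, combinations_csv=None, one_protein=None):
    """Return a list of (Protein, Protein) tuples for the chosen method."""
    by_name = {protein.name: protein for protein in proteins}

    if method == "all":
        # Every unordered pair, including each protein paired with itself.
        return list(combinations_with_replacement(proteins, 2))

    if method == "one":
        if one_protein not in by_name:
            raise ValueError(f"one_protein {one_protein!r} is not in proteins")
        target = by_name[one_protein]
        # The chosen protein against every protein, itself included.
        return [(target, other) for other in proteins]

    if method == "specific":
        if combinations_csv is None:
            raise ValueError("combinations_csv is required when method is 'specific'")
        pairs = []
        with open(combinations_csv, newline="") as handle:
            # Two columns per row, no header row.
            for row in csv.reader(handle):
                if not row:
                    continue  # skip blank lines
                if len(row) != 2:
                    raise ValueError(f"expected two protein names, got {row!r}")
                first, second = (name.strip() for name in row)
                if first not in by_name or second not in by_name:
                    raise ValueError(f"unknown protein name in row {row!r}")
                pairs.append((by_name[first], by_name[second]))
        return pairs

    raise ValueError(f"unknown method {method!r}")


proteins = [Protein("A", "MKTA"), Protein("B", "GSHM"), Protein("C", "LVPR")]
all_pairs = sketch_protein_combinations(proteins, "all")
one_pairs = sketch_protein_combinations(proteins, "one", one_protein="B")
```

With three proteins, the ‘all’ branch of this sketch yields six pairs (three self-pairs plus three distinct pairs) and the ‘one’ branch yields three.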

alphafragment.fragment_file_creation.output_fastas(proteins, save_location=None, method='all', combinations_csv=None, one_protein=None)

Creates a folder for each protein pair, containing fasta files for each combination of fragments for that pair.

Parameters:
  • proteins (list): A list of Protein objects.

  • save_location (str, optional): The path to the directory where the folders and fasta files will be saved. Default is None, which saves in the current working directory.

  • method (str, optional): Which protein combinations to use - ‘all’ for all v all, ‘one’ for all v one, ‘specific’ for specific combinations (specified using the combinations_csv argument). Default is ‘all’.

  • combinations_csv (csv path, optional): The path to a CSV file containing specific protein combinations to generate. Each row in the CSV should represent one combination, with two columns for the names of the two proteins, and no column headings. Default is None. Required if method is ‘specific’, otherwise ignored.

  • one_protein (str, optional): Name of a protein to use for one v all combinations. Default is None. Required if method is ‘one’, otherwise ignored.

Returns:
  • None, but creates folders and .fasta files for each protein pair.
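
The folder-per-pair layout described above might look like the following sketch. The function name, the (start, end, sequence) fragment tuples, and the file and header naming scheme are all assumptions made for illustration; the names that output_fastas actually writes are not specified here.

```python
import os
import tempfile


def sketch_write_pair_fastas(name_a, fragments_a, name_b, fragments_b, save_location):
    """Write one FASTA file per fragment pairing into a folder for the protein pair.

    fragments_a and fragments_b are lists of (start, end, sequence) tuples;
    this fragment representation is an assumption, not the package's own.
    """
    pair_dir = os.path.join(save_location, f"{name_a}_{name_b}")
    os.makedirs(pair_dir, exist_ok=True)
    written = []
    for start_a, end_a, seq_a in fragments_a:
        for start_b, end_b, seq_b in fragments_b:
            id_a = f"{name_a}_{start_a}-{end_a}"
            id_b = f"{name_b}_{start_b}-{end_b}"
            path = os.path.join(pair_dir, f"{id_a}_{id_b}.fasta")
            # Two records per file: one for each fragment in the pairing.
            with open(path, "w") as handle:
                handle.write(f">{id_a}\n{seq_a}\n>{id_b}\n{seq_b}\n")
            written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    files = sketch_write_pair_fastas(
        "ProtA", [(1, 4, "MKTA"), (5, 8, "YIAK")],
        "ProtB", [(1, 3, "GSH")],
        tmp,
    )
    print(len(files))  # 2
```

Two fragments of one protein against one fragment of the other give two files, both inside a single folder for that protein pair.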

alphafragment.fragment_file_creation.output_pulldown(proteins, output_name='pulldown_input.txt', fasta_name='pulldown_sequences.fasta', method='all', combinations_csv=None, one_protein=None)

Creates a fasta and txt file with protein fragment combinations compatible with AlphaPulldown.

Parameters:
  • proteins (list): A list of Protein objects.

  • output_name (str, optional): The name of the output txt file. Default is ‘pulldown_input.txt’.

  • fasta_name (str, optional): The name of the fasta output containing all protein sequences. Default is ‘pulldown_sequences.fasta’.

  • method (str, optional): Which protein combinations to use - ‘all’ for all v all, ‘one’ for all v one, ‘specific’ for specific combinations (specified using the combinations_csv argument). Default is ‘all’.

  • combinations_csv (csv path, optional): The path to a CSV file containing specific protein combinations to generate. Each row in the CSV should represent one combination, with two columns for the names of the two proteins, and no column headings. Default is None. Required if method is ‘specific’, otherwise ignored.

  • one_protein (str, optional): Name of a protein to use for one v all combinations. Default is None. Required if method is ‘one’, otherwise ignored.

Returns:
  • None, but creates a fasta file with the protein sequences and a txt file with protein fragment combinations.
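
As a rough illustration of the two outputs, the sketch below writes a fasta file of fragment sequences and a txt file with one fragment pair per line. The function name, the fragment identifiers, and the semicolon-separated line format are assumptions; the exact lines that output_pulldown emits are not described here.

```python
import os
import tempfile


def sketch_write_pulldown(fragment_pairs, sequences, output_name, fasta_name):
    """Write a FASTA of all sequences and a text file listing fragment pairs.

    fragment_pairs: list of (id_a, id_b) tuples (hypothetical identifiers).
    sequences: dict mapping each identifier to its sequence.
    """
    with open(fasta_name, "w") as fasta:
        for frag_id, seq in sequences.items():
            fasta.write(f">{frag_id}\n{seq}\n")
    with open(output_name, "w") as out:
        for id_a, id_b in fragment_pairs:
            # Assumed format: one pair per line, identifiers joined by ';'.
            out.write(f"{id_a};{id_b}\n")


with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "pulldown_input.txt")
    fasta_path = os.path.join(tmp, "pulldown_sequences.fasta")
    sketch_write_pulldown(
        [("ProtA_1-4", "ProtB_1-3"), ("ProtA_1-4", "ProtA_1-4")],
        {"ProtA_1-4": "MKTA", "ProtB_1-3": "GSH"},
        txt_path,
        fasta_path,
    )
    with open(txt_path) as handle:
        pair_lines = handle.read().splitlines()
    with open(fasta_path) as handle:
        fasta_text = handle.read()
```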