Usage Guide
This section is a quick-start guide to the AlphaFragment package. The workflow shows how to load protein data from a CSV file, identify domains, fragment the proteins, and visualize the results. For a more thorough guide that explains how the package works and the options available, refer to AlphaFragment usage guide.ipynb.
Workflow
Setting Up Your Python Script
Begin by setting up your Python script with the necessary paths for input and output:
input_csv_path = "input.csv"
output_csv_path = "output.csv"
image_save_location = "folder_path"
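Before running the rest of the workflow, it can help to confirm that the input file exists and that the image folder is present. The sketch below uses only the standard library. `prepare_paths` is a hypothetical helper, not part of AlphaFragment, and since this guide does not say whether the package creates the output folder itself, creating it up front is only a precaution.

```python
from pathlib import Path


def prepare_paths(input_csv, image_dir):
    """Check the input CSV exists and make sure the image folder is present.

    Hypothetical helper for illustration; not part of AlphaFragment.
    """
    input_path = Path(input_csv)
    if not input_path.is_file():
        raise FileNotFoundError(f"Input CSV not found: {input_path}")

    image_path = Path(image_dir)
    # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
    image_path.mkdir(parents=True, exist_ok=True)
    return input_path, image_path
```

It could be called as `prepare_paths(input_csv_path, image_save_location)` right after the paths are defined.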
Initializing Protein Data from CSV
Import protein data from a CSV file. This step converts the CSV into a list of Protein objects and a DataFrame containing all data in the input file.
from alphafragment import initialize_proteins_from_csv

proteins, df = initialize_proteins_from_csv(input_csv_path)
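If loading fails, a quick look at the raw file often shows why. The following sketch reads the header and counts the data rows with the standard `csv` module. `summarize_csv` is a hypothetical helper, and it makes no assumption about which columns AlphaFragment expects.

```python
import csv


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file.

    Hypothetical helper for illustration; not part of AlphaFragment.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for row in reader if row)  # blank lines are skipped
    return header, row_count
```

Calling `summarize_csv(input_csv_path)` before the import step gives a quick check that the file is non-empty and has the header you expect.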
Protein Domain Identification and Fragmentation
For each protein in the dataset, identify domains and fragment the protein accordingly:
from alphafragment import compile_domains, fragment_protein

for protein in proteins:
    # Identify domains and attach each one to the protein
    domains = compile_domains(protein, protein_data=df)
    for domain in domains:
        protein.add_domain(domain)

    # Fragment the protein and record each fragment
    # (assumes fragment_protein returns (start, end) pairs)
    fragments = fragment_protein(protein)
    for start, end in fragments:
        protein.add_fragment(start, end)
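The calls to `add_fragment` need to run in an ordinary `for` loop. A generator expression in parentheses is lazy: Python builds the generator but calls nothing until it is iterated. The sketch below shows the difference with a stand-in class. `DemoProtein` is hypothetical and only mimics an `add_fragment(start, end)` method; it is not the AlphaFragment `Protein` class.

```python
class DemoProtein:
    """Stand-in with an add_fragment method; not the AlphaFragment Protein class."""

    def __init__(self):
        self.fragments = []

    def add_fragment(self, start, end):
        self.fragments.append((start, end))


pairs = [(0, 100), (90, 250)]

lazy = DemoProtein()
# A bare generator expression is created and discarded; add_fragment never runs.
(lazy.add_fragment(start, end) for start, end in pairs)

eager = DemoProtein()
# An explicit loop executes each call.
for start, end in pairs:
    eager.add_fragment(start, end)

print(len(lazy.fragments), len(eager.fragments))  # prints "0 2"
```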
Visualization of Fragmentation
Create a graphic that illustrates the domain locations and fragmentation results:
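from alphafragment import fragment_protein, plot_fragmentation_output

for protein in proteins:
    # Use this protein's own fragments so each plot matches its protein
    fragments = fragment_protein(protein)
    plot_fragmentation_output(protein, fragments, image_save_location)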
Updating and Saving Output Data
Update the DataFrame with protein information and save the updated data to a CSV file:
from alphafragment import update_csv_with_fragments

update_csv_with_fragments(df, output_csv_path, proteins)
Output Generation
Generate FASTA files and AlphaPulldown input files for further analysis:
from alphafragment import output_fastas, output_pulldown

output_fastas(proteins)
output_pulldown(proteins)