Robin Dowell & Debra Goldberg
Objective: Build a framework (skeleton) for a mini-project that can be adapted to several related but distinct classes. The basic framework we outlined:
Protein sequences fold into a 3D structure within the cell (see PDB). The structure of the protein constrains the permissible evolutionary changes to its sequence. This exercise is designed to explore the relationship between 3D structure and evolutionary conservation using ConSurf.
This general workflow can be carried out entirely within ConSurf:
sequence => generate an MSA => score conservation => visualize
Basic Analysis Steps:
- 1. Begin with a sequence of interest.
- This sequence should have a known 3D structure (e.g. in PDB) in order for subsequent visualization to work.
- 2. Identify related (homologous) sequences.
- 3. Align the set of sequences using a multiple sequence alignment (MSA) algorithm.
- Alignment algorithms available in ConSurf include MAFFT and ClustalW, among others.
- Ultimately, an MSA depends on an underlying scoring scheme and some inference of a phylogeny (the relatedness among the sequences).
- 4. Score the multiple sequence alignment for conservation.
- Scoring methods available in ConSurf include Bayesian and Maximum Likelihood.
- 5. Visualize various relationships:
- The primary sequences utilized
- The phylogenetic relationship between the sequences
- The multiple sequence alignment, colored by conservation
- The 3D structure, colored by conservation
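To make step 4 concrete for a class that wants to look inside the scoring, here is a minimal sketch that scores each alignment column by Shannon entropy. This is a deliberately simpler stand-in and is not the Bayesian or Maximum Likelihood method that ConSurf uses; the function name `column_conservation` and the toy alignment are assumptions made up for illustration.

```python
import math
from collections import Counter


def column_conservation(msa, gap="-"):
    """Score each column of an alignment by normalized Shannon entropy.

    Illustrative only: this is NOT ConSurf's scoring method.
    Returns one value per column, where 1.0 means fully conserved
    and lower values mean more variable. Gaps are ignored.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # assumes the 20 standard amino acids
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in msa if seq[i] != gap]
        if not residues:
            scores.append(0.0)  # all-gap column: no evidence of conservation
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment (not real data)
toy_msa = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MK-LA",
]
for position, score in enumerate(column_conservation(toy_msa), start=1):
    print(position, round(score, 3))
```

Unlike the ConSurf methods, this score ignores the phylogeny and the biochemical similarity of residues, which is itself a useful discussion point when comparing it with ConSurf's output.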
So how do we adapt this for different classes? A graduate-level algorithms class and a senior undergraduate biology course might emphasize different aspects of this exercise.
Variations and Extensions:
- Modify the number of sequences considered in the multiple sequence alignment.
- Consider different levels of conservation given the same number of sequences.
- Compare different multiple sequence alignment methods:
- Select different ConSurf options
- Use or write your own sequence alignment method.
- If considering a protein domain, use the manually curated Pfam alignment.
- Compare the Bayesian and Maximum Likelihood scoring schemes for conservation.
- Contemplate why some aspects of the molecule are more conserved than others.
- Consider the problem of highlighting differences among individuals within the same species: use PyMOL to visualize key differences in an alignment of very closely related sequences.
- Prepare an entry for Proteopedia (http://www.proteopedia.org) using this information.
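For an algorithms class taking up the "write your own sequence alignment method" variation, a natural starting point is pairwise global alignment by dynamic programming (Needleman-Wunsch). The sketch below is only a starting point under simplifying assumptions: it aligns two sequences rather than many, and it uses a flat match/mismatch/gap scheme with made-up default values instead of a substitution matrix such as BLOSUM62. The function name and parameters are illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b (illustrative scoring values).

    Returns (aligned_a, aligned_b, score), with "-" marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print(total)
```

Students could then extend this toward a progressive multiple alignment, or swap in a substitution matrix, and compare the resulting conservation pattern with the one obtained from the ConSurf alignment options.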
One classroom approach would be to lead with an example (an excellent example by Team Volunteers) and then give the above as a handout, with details specific to the class objectives.