EXERCISE
There are three parts to this exercise.
1. Use the program GENSCAN to identify a possible coding
region within a genomic sequence.
2. Use the program BLASTP to see how well your prediction
matches the actual coding region.
3. Use the site UniGene to examine the location of this
gene on a chromosomal map.
The following four accession numbers identify genes that are relatively
small (<10,000 bp) and contain just a few exons. You could also use
other gene sequences or random segments of genomic DNA sequence.
J00265 Insulin
J00120 Myc
J00148 Human growth hormone
V00499 Beta globin
PART 1.
Use Entrez to obtain the sequences of these genes.
Select FASTA under Display to get just the DNA sequence.
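If you prefer to script the retrieval rather than use the Entrez web page, the sketch below builds request URLs for the NCBI E-utilities efetch service and parses a FASTA record. It is a minimal illustration, not part of the original exercise: the helper names fasta_url and parse_fasta are made up here, and the choice of the "nuccore" database is an assumption for these nucleotide accessions.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (programmatic counterpart to the Entrez page).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers listed in this exercise.
ACCESSIONS = {
    "J00265": "Insulin",
    "J00120": "Myc",
    "J00148": "Human growth hormone",
    "V00499": "Beta globin",
}


def fasta_url(accession):
    """Build an efetch URL that requests one nucleotide record as FASTA text."""
    # Assumption: these accessions are served from the "nuccore" database.
    params = {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Return (header, sequence) from the text of a single-record FASTA file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    header = lines[0][1:]
    # Join the wrapped sequence lines into one uppercase string.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


if __name__ == "__main__":
    for accession, name in ACCESSIONS.items():
        print(name, fasta_url(accession))
    # To download a record (needs network access):
    #   from urllib.request import urlopen
    #   text = urlopen(fasta_url("J00265")).read().decode()
    #   header, sequence = parse_fasta(text)
```

The sequence string returned by parse_fasta is what you would paste into GENSCAN, and len(sequence) gives the length of the genomic sequence you submit.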
Copy the sequence and open GENSCAN.
Paste your genomic sequence into the box and select Run GENSCAN.
You will get results which predict the location of intron and
exon splice sites and the predicted coding sequence.
How many introns were predicted?
What was the length of the genomic sequence you submitted?
What was the length of the regions that encoded protein?
What other regions were identified by the program (polyadenylation sites, promoters, repeated elements, CpG islands)?
What are the functions of these other regions?
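The counting questions above can be checked with a little arithmetic once you have the predicted exon coordinates. The sketch below assumes you have copied the exon start and end positions by hand into a list; the function name summarize_exons and the coordinates are hypothetical and do not come from real GENSCAN output. It also assumes all exons belong to a single predicted gene, so the number of introns is one less than the number of exons.

```python
def summarize_exons(exons, genomic_length):
    """Summarize (start, end) exon coordinates, 1-based and inclusive."""
    ordered = sorted(exons)
    # Each exon spans end - start + 1 bases because both ends are included.
    coding_length = sum(end - start + 1 for start, end in ordered)
    return {
        "exons": len(ordered),
        # Introns lie between consecutive exons of one gene.
        "introns": max(len(ordered) - 1, 0),
        "coding_bp": coding_length,
        "coding_fraction": coding_length / genomic_length,
    }


# Hypothetical coordinates for illustration only, not real GENSCAN output.
example_exons = [(101, 250), (601, 800), (1201, 1350)]
summary = summarize_exons(example_exons, genomic_length=2000)
print(summary)
```

With these made-up numbers the three exons total 500 bp, so a quarter of the 2,000 bp submitted would be coding.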
PART 2.
Once you think you have identified the coding sequence of the gene, copy the predicted amino acid sequence.
Use the predicted amino acid sequence to perform a BLASTP alignment with other proteins to see how well your prediction matches the known protein sequence. Instructions on the use of BLAST are provided.
In the BLAST program, paste the amino acid sequence into the box, select blastp under program, and then select search.
You will get back an alignment of known sequences.
Compare these amino acid sequences with the sequence you predicted.
Did the entire sequence match, or were there regions that did not match?
Explain any unexpected results.
If your prediction was not accurate, go back to GENSCAN to see if you can figure out where the error occurred.
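To locate where a prediction departs from a known protein, it can help to list the differing stretches explicitly. The sketch below uses Python's standard difflib module as a rough stand-in for reading the BLASTP alignment by eye; it is not how BLAST itself aligns sequences, and the function name mismatch_regions and both peptides are made up for illustration.

```python
from difflib import SequenceMatcher


def mismatch_regions(predicted, reference):
    """List the stretches where two protein sequences differ.

    Each item is (operation, predicted_segment, reference_segment), where
    operation is 'replace', 'delete', or 'insert' as reported by difflib.
    """
    matcher = SequenceMatcher(None, predicted, reference, autojunk=False)
    return [
        (tag, predicted[i1:i2], reference[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Made-up peptides for illustration; the prediction carries five extra residues.
predicted = "MKTAYIAKQRGGGSSLLEV"
reference = "MKTAYIAKQRLLEV"
print(mismatch_regions(predicted, reference))
# prints [('delete', 'GGGSS', '')]
```

A 'delete' entry marks residues present only in the prediction, and an 'insert' entry marks residues present only in the reference. Either pattern suggests checking the exon boundaries GENSCAN chose around that position.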
PART 3.
Go to the site UniGene to learn more about your gene.
Paste the accession number into the search box and hit GO.
Are similar proteins listed for other species? If so, what percent identity does the human protein have with each of the other species indicated?
Which chromosome is your gene located on?
Does it contain any STSs? If so, which ones?
Which organs express this gene (EST)?
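The percent figure reported between the human protein and each other species is typically a percent identity taken from a pairwise alignment. As a sketch of what such a number means, the function below counts identical residues over the aligned columns where neither sequence has a gap. That denominator is one of several conventions and is an assumption here, as are the function name percent_identity and the toy aligned fragments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        # Skip columns where either sequence has a gap (assumed convention).
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments for illustration, not real database entries.
human = "MALWMRLLPL"
other = "MALWTR-LPL"
print(round(percent_identity(human, other), 1))
# prints 88.9
```

Here nine columns are compared and eight match, giving 88.9 percent.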