Case Study 1
Constructing Phylogenies

Introduction  
This case study is designed to give students some experience with protein databases and the analysis of such data. It is based upon and largely excerpted from an article by Bilardello and Valdes (1998). Phylogenies, or evolutionary histories of organisms, help in classification and show possible evolutionary relationships. They are based on a variety of evidence, including fossils, morphology of living organisms, embryology, and molecular traits. Among the latter are the base sequences in DNA and RNA and the amino acid sequences of homologous proteins in different organisms. The purpose of this case study is to construct a phylogeny using molecular traits and then compare it to a phylogeny based on morphology.

 

When molecular trait data are used to develop phylogenies, differences in the sequence of DNA/RNA bases, or of the amino acids of a specific protein, are determined. This information is available on the internet at www.expasy.ch, which hosts Swiss-Prot, a protein sequence database. The number of differences indicates the evolutionary distance between the organisms being investigated. The greater the number of differences, the more distantly related the organisms presumably are; that is, their common ancestor existed longer ago than did the common ancestor of two organisms with fewer differences in their amino acid sequences.
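Counting differences between two sequences can be sketched in a few lines of Python. This is a minimal illustration, not part of the original exercise: it assumes the two sequences are already aligned to the same length, and the three ten-residue fragments are hypothetical values made up for the example.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical ten-residue fragments, for illustration only.
fragment_1 = "GDVEKGKKIF"
fragment_2 = "GDVEKGKKIF"
fragment_3 = "GDIEKGKKIY"

print(count_differences(fragment_1, fragment_2))  # 0
print(count_differences(fragment_1, fragment_3))  # 2
```

Under the reasoning above, the pair with 2 differences would be treated as more distantly related than the pair with 0 differences.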

 

For example, the amino acid sequences of cytochrome c for five different organisms are presented in the following table:

 

Human

GDVEKGKKIF   IMKCSQCHTV   EKGGKHKTGP   NLHGLFGRKT   GQAPGYSTYTA    ANKNKGIIWG

EDTLMEYLEN  PKKYIPGTKM   IFVGIKKKEE       RADLIAYLKK    ATNE

Horse

GDVEKGKKIF   VQKCQCHTV    EKGGKHKTGP   NLHGLFGRKT   GQAPGFTYTD   ANKNKGITWK

EETLMEYLEN   PKKYIPGTKM   IFAGIKKKTE      REDLIAYLKK    ATNE

Equus

GDVEKGKKIF   VQKCAQCHTV   EKGGKHKTGP   NLHGLFGRKT   GQAPGFSYTD   ANKNKGITWK

EETLMEYLEN   PKKYIPGTKM   IFAGIKKKTE        REDLIAYLKK    ATNE

Mouse

GDVEKGKKIF   VQKCAQCHTV   EKGGKHKTGP   NLHGLFGRKT   GQAAGFSYTD   ANKNKGITWG

EDTLMEYLEN   PKKYIPGTKM   IFAGIKKKGE   RADLIAYLKK      ATNE

Dog

GDVEKGKKIF   VQKCAQCHTV   EKGGKHKTGP   NLHGLFGRKT   GQAPGFSYTD   ANKNKGITWG

EETLMEYLEN   PKKYIPGTKM   IFAGIKKTGE   RADLIAYLKK       ATKE

 

D: aspartic acid; W: tryptophan; H: histidine; K: lysine; R: arginine; F: phenylalanine; I: isoleucine; L: leucine; V: valine; Y: tyrosine; M: methionine; A: alanine; C: cysteine; Q: glutamine; N: asparagine; E: glutamic acid; P: proline; S: serine; T: threonine; G: glycine

 

 

 

To develop a phylogeny of the five species, you must first develop a similarity matrix illustrating the similarities in amino acid sequences, as shown below.

 

 

Similarity matrix

        Hu      Ho      E       M       D
Hu      100     88      89      91      89
Ho      -       100     99      94      94
E       -       -       100     95      95
M       -       -       -       100     96
D       -       -       -       -       100
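A similarity matrix of this kind can also be computed by hand or with a short script. The sketch below assumes that each entry is a percent identity (matching positions divided by alignment length, times 100) and that the sequences have already been aligned to equal length; it treats every column the same way, including any gap characters, which is a simplification. The function names and the three short sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


def similarity_matrix(aligned):
    """Return the upper triangle of the matrix as {(name_1, name_2): percent}."""
    names = list(aligned)
    matrix = {(name, name): 100.0 for name in names}
    for first, second in combinations(names, 2):
        matrix[(first, second)] = percent_identity(aligned[first], aligned[second])
    return matrix


# Hypothetical aligned fragments, for illustration only.
toy_alignment = {
    "A": "GDVEKGKKIF",
    "B": "GDVEKGKKIY",
    "C": "GDIEKGKKIY",
}

for pair, value in similarity_matrix(toy_alignment).items():
    print(pair, value)
```

Only the upper triangle is stored because the matrix is symmetric, which matches the layout of the table above.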

 

 

The CLUSTALW website (http://clustalw.genome.jp/), an alignment tool, can help you do this. Next, a phylogenetic tree is developed from this information.

 

 

You can develop a tree (see Figure 1) using this similarity matrix, or by going to the bottom of the CLUSTALW page, choosing dendrogram from the select tree menu, and executing. The branching tree developed will be based upon the similarities in the amino acid sequences.

Figure 1. Dendrogram based on similarities in amino acid sequences.
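One way to turn a similarity matrix into a branching order by hand is average-linkage clustering (UPGMA): repeatedly join the two closest groups. The sketch below applies that idea to the cytochrome c similarity values given in this handout. It is an illustrative assumption, not necessarily the method CLUSTALW uses, and it converts similarity to distance as 100 minus similarity, which is also an assumption.

```python
from itertools import combinations

# Percent similarities copied from the cytochrome c similarity matrix.
SIMILARITY = {
    ("Hu", "Ho"): 88, ("Hu", "E"): 89, ("Hu", "M"): 91, ("Hu", "D"): 89,
    ("Ho", "E"): 99, ("Ho", "M"): 94, ("Ho", "D"): 94,
    ("E", "M"): 95, ("E", "D"): 95,
    ("M", "D"): 96,
}


def upgma(names, similarity):
    """Cluster taxa by average linkage; return the tree as nested tuples."""
    # Assumed conversion: distance = 100 - percent similarity.
    distance = {frozenset(pair): 100 - s for pair, s in similarity.items()}
    # Each key is a tuple of member names; each value is the subtree so far.
    clusters = {(name,): name for name in names}

    def average_distance(members_a, members_b):
        total = sum(distance[frozenset((a, b))]
                    for a in members_a for b in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        clusters[a + b] = (clusters.pop(a), clusters.pop(b))
    return next(iter(clusters.values()))


tree = upgma(["Hu", "Ho", "E", "M", "D"], SIMILARITY)
print(tree)  # ('Hu', (('Ho', 'E'), ('M', 'D')))
```

The nested tuples show only the order in which groups join; branch lengths are not computed in this sketch.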

 

 

 

 

 

 

 

 

Project

 

1. Use Netscape to find the amino acid sequences of the protein hemoglobin a for the following species:

HBA_ALLMI American alligator
HBA_AMMLE Barbary sheep
HBA_ATEGE Black-handed spider monkey
HBA_DANRE zebrafish
HBA_HETPO Port Jackson shark
HBA_HUMAN human
HBA_LATCH coelacanth
HBA_TARGR roughskin newt

Choose two more species from different groups (e.g., a bird).

 

2. Create a similarity matrix for hemoglobin a for the ten species.

To do this, open the ExPASy web site and type the name of the protein you wish to compare in the cell at the top of the page following Search Swiss-Prot/TrEMBL for ___________. In our case this will be hemoglobin a. Then select Go. The list of sequences produced is long and may seem confusing. For our species, go down to, for example, HBA_HUMAN. Click on the sequence to bring up the sequence entry. Scroll down to the sequence information and click on FASTA format. Copy the entire sequence in this FASTA format, then go to CLUSTALW (http://clustalw.genome.jp/; use another browser for this) and paste the sequence into the window provided. Do the same for the other nine species. Then run the multiple alignment by clicking the execute multiple alignment button at the bottom of the page. A CLUSTALW results form will be presented.
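A FASTA record is a header line beginning with ">" followed by one or more lines of sequence. If you save the sequences you copy into a single text file, a short script can check that all ten are present before you paste them into CLUSTALW. The sketch below is only an illustration: the function name, the header text, and the short sequences are hypothetical, and it keeps only the first word of each header as the record name.

```python
def parse_fasta(text):
    """Return {record_name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Header line: keep the first word after ">" as the name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record.
            records[name] += line
    return records


# Hypothetical records, for illustration only.
example = """>EXAMPLE_1 made-up header text
MVLSPA
DKTNVK
>EXAMPLE_2
MSLTRT
"""

for record_name, sequence in parse_fasta(example).items():
    print(record_name, len(sequence), sequence)
```

Checking the number of records and the length of each sequence is a quick way to catch a sequence that was only partly copied.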

 

Develop a similarity (or difference) matrix from these results. For example, the table below is a difference matrix we developed last semester for five species: human, horse, donkey, mouse and dog.

 

Difference matrix based upon amino acid differences for hemoglobin a.

        Hu      Ho      E       M       D
Hu      0       17      20      20      23
Ho      -       0       3       22      26
E       -       -       0       24      28
M       -       -       -       0       26
D       -       -       -       -       0

 

You can then develop a tree from these results by going to the bottom of the CLUSTALW page, choosing dendrogram from the select tree menu, and executing. The branching tree developed will be based upon the differences in the amino acid sequences. Compare this to the method used above. Are your results the same?
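To compare a difference matrix with a similarity matrix, the difference counts can be converted to percent similarity: similarity = 100 x (length - differences) / length. The sketch below applies this to the hemoglobin a difference counts in the table above. The alignment length of 141 is an assumption made for illustration (use the length reported in your own CLUSTALW results), and the function name is hypothetical.

```python
def differences_to_similarity(differences, alignment_length):
    """Convert difference counts to percent similarity."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return {
        pair: 100 * (alignment_length - count) / alignment_length
        for pair, count in differences.items()
    }


# Difference counts copied from the hemoglobin a difference matrix.
DIFFERENCES = {
    ("Hu", "Ho"): 17, ("Hu", "E"): 20, ("Hu", "M"): 20, ("Hu", "D"): 23,
    ("Ho", "E"): 3, ("Ho", "M"): 22, ("Ho", "D"): 26,
    ("E", "M"): 24, ("E", "D"): 28,
    ("M", "D"): 26,
}

ASSUMED_LENGTH = 141  # assumption: replace with your own alignment length

for pair, value in differences_to_similarity(DIFFERENCES, ASSUMED_LENGTH).items():
    print(pair, round(value, 1))
```

Because the conversion preserves the ordering of the pairs, the closest pair in the difference matrix is also the most similar pair after conversion.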

 

3. Show the results of your evolutionary phylogeny based upon these differences or similarities in hemoglobin a amino acid sequences.

 

4. Compare this to the development of cladograms in chapter 23 of your text by developing your own cladogram based upon morphological features.

 

5. Does your molecular cladogram agree with the cladogram you developed based upon morphological features? Explain.

 

6. Submit this as a report using Word.

 

7. You must work with a partner for this project, and each of you must independently fill out and hand in a Peer Evaluation Form.

 

8. Grading:
Table 1 = 3 points
Table 2 = 3 points

Table 3 = 3 points

Table 4 = 3 points

Figure 1 = 3 points

Discussion = 3 points

Peer evaluation forms (2) = 7 points

 

Project due March 19 at 11:00 AM for section 2 and 2:00 PM for section 1. Minus 25% for each day late.

 

References
Bilardello, N. and Valdes, L. 1998. Constructing phylogenies. The American Biology Teacher 60 (5): 369-373.