HIV Laboratory

Background

Richard Markham and colleagues (Proc. Natl. Acad. Sci. 95(21):12568-73, 1998) published data on the pattern of HIV evolution and the rate of CD4 T-cell decline in 15 subjects, based on blood samples collected at six-month intervals for up to four years. At each visit, CD4 T-cell counts were made and a 285 base pair region of the env gene was sequenced from each of the variants of the gene present (referred to as clones). Six hundred and sixty-six nucleotide sequences were deposited in GenBank, providing a rich resource for looking closely at the patterns of change in HIV over time.

Exploring HIV Evolution (686 kb pdf)

Data Summary

Subjects: 15

Number of visits: 3-9 for each subject

Number of clones observed per subject per visit: 2-18

Total number of sequences available: 666

CD4 cell counts for each visit

 

Subject  Total Number of Visits  Total Number of Clones  Visit Number  Number of Clones  CD4 Count¹
1        3                       42                      1             13                464
1                                                        2             16                305
1                                                        5             13                15
2        3²                      24                      1             6                 715
2                                                        3             9                 825
2                                                        4             9                 830
3        5                       39                      1             4                 819
3                                                        3             10                375
3                                                        4             9                 265
3                                                        5             10                100
3                                                        6             6                 45
4        4                       47                      1             3                 1028
4                                                        2             13                710
4                                                        3             18                470
4                                                        4             13                135
5        5                       43                      1             8                 749
5                                                        2             12                770
5                                                        3             11                650
5                                                        4             7                 550
5                                                        5             5                 700
6        7                       54                      1             3                 405
6                                                        2             3                 225
6                                                        3             9                 350
6                                                        4             12                390
6                                                        5             9                 475
6                                                        7             9                 400
6                                                        9             9                 560
7        5                       43                      1             10                1072
7                                                        2             7                 735
7                                                        3             8                 330
7                                                        4             9                 270
7                                                        5             9                 310
8        7                       49                      1             5                 538
8                                                        2             5                 800
8                                                        3             7                 605
8                                                        4             6                 510
8                                                        5             6                 625
8                                                        6             10                515
8                                                        7             10                250
9        8                       64                      1             5                 489
9                                                        2             5                 485
9                                                        3             8                 440
9                                                        4             11                370
9                                                        5             9                 365
9                                                        6             8                 665
9                                                        7             10                555
9                                                        8             8                 270
10       5                       49                      1             7                 833
10                                                       2             6                 850
10                                                       4             16                420
10                                                       5             10                150
10                                                       6             10                15
11       4                       32                      1             7                 753
11                                                       2             6                 600
11                                                       3             10                270
11                                                       4             9                 175
12       6                       37                      1             4                 772
12                                                       2             4                 780
12                                                       3             5                 1285
12                                                       4             6                 1030
12                                                       5             10                1395
12                                                       8             8                 850
13       5                       26                      1             4                 671
13                                                       2             2                 825
13                                                       3             7                 835
13                                                       4             7                 770
13                                                       5             6                 975
14       9                       77                      1             6                 523
14                                                       2             6                 580
14                                                       3             6                 570
14                                                       4             10                595
14                                                       5             7                 460
14                                                       6             11                420
14                                                       7             10                460
14                                                       8             9                 450
14                                                       9             12                350
15       4                       40                      1             12                707
15                                                       2             9                 250
15                                                       3             9                 75
15                                                       4             10                15

NOTE: Some of the visit numbers are not sequential. In all cases visit 1 represents the first time the subject was evaluated. The subsequent time points (visits 2 through 9) represent six-month intervals from the initial visit. Thus, if a subject missed their first six-month appointment, their visits would be numbered 1, 3, 4, etc.

Summary table of information available on the subjects studied in Markham et al. (1998).

1 The CD4 count for visit 1 is reported in Table 1 of Markham et al. (1998); the other values are estimated from Figure 1 of the same publication.

2 The paper reports 5 visits for subject 2, but only 3 visits were identified in the GenBank records.

The summary table is available electronically and there is a file containing all of the sequences for each subject in the data set on your computer.
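
The note on visit numbering implies a simple conversion: visit n falls (n - 1) × 6 months after the first visit. As a minimal sketch (the function names are illustrative and not part of the lab materials), the following Python converts visit numbers to elapsed months and estimates an average CD4 change per month between a subject's first and last sequenced visits, using the subject 1 values from the summary table. A first-to-last difference is only one crude way to summarize decline, and most of the counts were themselves estimated from a figure.

```python
def months_since_first_visit(visit_number: int) -> int:
    """Visits are scheduled six months apart, and visit 1 is month 0."""
    if visit_number < 1:
        raise ValueError("visit numbers start at 1")
    return (visit_number - 1) * 6


def mean_cd4_change_per_month(visits):
    """Average CD4 change per month between the first and last visit.

    `visits` is a list of (visit_number, cd4_count) pairs.
    """
    ordered = sorted(visits)
    (first_visit, first_cd4), (last_visit, last_cd4) = ordered[0], ordered[-1]
    elapsed = months_since_first_visit(last_visit) - months_since_first_visit(first_visit)
    if elapsed == 0:
        raise ValueError("need at least two distinct visits")
    return (last_cd4 - first_cd4) / elapsed


# Subject 1 from the summary table: visits 1, 2 and 5 (visits 3 and 4 are absent).
subject_1 = [(1, 464), (2, 305), (5, 15)]

if __name__ == "__main__":
    for visit, cd4 in subject_1:
        print(f"visit {visit}: month {months_since_first_visit(visit)}, CD4 {cd4}")
    print(f"mean CD4 change per month: {mean_cd4_change_per_month(subject_1):.1f}")
```

Because the function works from visit numbers rather than row positions, skipped visits such as those of subject 1 are placed at the correct point in time.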

HIV-1 GP120 core complex with CD4 and a neutralizing human antibody (available from the Protein Data Bank at http://www.rcsb.org; the ID is believed to be 1GC1, chain A)


Tools

  • Biology Workbench (http://workbench.sdsc.edu -- the San Diego one)
  • GeneDoc (locally installed)
  • SwissPDB Viewer (locally installed)
  • Microsoft Excel (locally installed)
  • JMP (installed on John's laptop)
  • If network access is interrupted, please use the locally installed versions of Phylip and ClustalX in place of Biology Workbench

Potential Investigations

Session I

    • What is the pattern of HIV evolution within a subject?
      • Does the number of clones change in any regular way over time?
      • Do certain clones appear to survive (leave descendants) over time while others disappear (go extinct)?
    • What is the pattern of HIV evolution across subjects?
      • Is the change in nucleotide sequence over time consistent within or across subjects?
      • Do the sequences in different individuals diverge over time?
      • Are clones within one patient monophyletic and more closely related to one another than to clones from other patients?
    • What is the pattern of HIV evolution within the env sequence?
      • Are there particular positions in these sequences that are more or less likely to mutate?
      • Are there different rates of synonymous (silent) and non-synonymous mutations? Are these distributed on particular branches of associated trees? How are these related to viral speciation?
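
Several of the Session I questions come down to counting differences between aligned sequences. As a minimal sketch, assuming the sequences have already been aligned to equal length (for example with CLUSTALW) and using short made-up sequences instead of real data, the following Python computes the proportion of differing sites between two sequences and lists the alignment columns that vary. The function names and the choice to skip gap positions are illustrative assumptions.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def variable_sites(alignment):
    """Return 1-based column positions where more than one base is observed."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for i in range(length):
        bases = {seq[i].upper() for seq in alignment} - {"-"}
        if len(bases) > 1:
            sites.append(i + 1)
    return sites


# Made-up toy alignment, NOT real HIV env data.
toy = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC", "ACG-ACGTAC"]

if __name__ == "__main__":
    print(p_distance(toy[0], toy[1]))
    print(variable_sites(toy))
```

Comparing clones from an early visit with clones from a later visit of the same subject in this way gives one rough measure of change over time; tree-building programs use more elaborate distance corrections.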

Session II

    • What are the patho-physiological effects of HIV evolution?
      • Is there any relationship between rate of sequence divergence and CD4 T-cell count?
      • Are particular types of sequence changes associated with a change in CD4 T-cell count?
      • Can you infer anything about the efficiency of treatment?
    • What can you establish about the relationships of the strains of the HIV virus that infected these 15 subjects?
      • Do you think they all came from the same source?
      • Do you think any of them experienced multiple (for example, serial) infections?
    • The Markham et al. (1998) paper mentions that subjects 1 and 2 were known to be epidemiologically related.
      • What do you think this means?
      • Using your analyses, can you defend or refute this statement?
    • Do some sequence changes play a more important structural/functional role?
      • Which of the nucleotide sequence changes lead to "significant" amino acid substitutions (hydrophobic/hydrophilic, uncharged/charged, bulky/tiny, aromatic/non-aromatic, etc.)?
      • Based on the position of the residue in the 3-dimensional structure of the protein, are some changes more significant than others (distances, electrostatics, surface, etc.)?
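
One way to start on the question of "significant" substitutions is to translate the codon before and after a change and compare the properties of the two amino acids. The sketch below is only illustrative: it assumes the standard genetic code, assumes you already know the correct reading frame for the region, and uses one simplified grouping of amino acid properties (other groupings are common, and the group names here are not from the lab materials).

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Simplified property groups (an assumption; textbooks differ on the details).
PROPERTY_GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "positive": set("KR"),
    "negative": set("DE"),
    "aromatic": set("FWY"),
    "small": set("GAS"),
}


def classify_codon_change(before: str, after: str):
    """Return (kind, changed_properties) for a change from one codon to another.

    changed_properties lists the groups that contain exactly one of the
    two amino acids, i.e. the properties that the substitution alters.
    """
    aa_before = CODON_TABLE[before.upper()]
    aa_after = CODON_TABLE[after.upper()]
    if aa_before == aa_after:
        return "synonymous", []
    changed = sorted(
        name
        for name, members in PROPERTY_GROUPS.items()
        if (aa_before in members) != (aa_after in members)
    )
    return f"non-synonymous ({aa_before}->{aa_after})", changed


if __name__ == "__main__":
    print(classify_codon_change("CTT", "CTC"))  # leucine to leucine
    print(classify_codon_change("GTT", "GAT"))  # valine to aspartate
```

A non-synonymous change that alters no listed property (for example leucine to isoleucine) would be reported with an empty list, which is one rough way to separate conservative from non-conservative substitutions before looking at the 3-dimensional structure.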

Goals

Session I

    • Identify definable questions that can be approached using phylogenetics
    • Use the data & tools provided to begin to explore these questions
    • Share your insights with the group and develop additional questions

Session II

    • Using your results from Session I, begin to approach the larger questions
    • Using your analyses and the data & tools provided, begin to explore these questions
    • Share your insights with the group
    • Evaluate the effectiveness of this exercise and these two sessions

Getting Started

Analyzing DNA Sequences for HIV env Protein 

GETTING INTO BIOLOGY WORKBENCH

1. Open Netscape Navigator or Internet Explorer and type the following URL into the Location box: http://workbench.sdsc.edu

OPENING YOUR ACCOUNT

2. Click on the large, blue, underlined link "Enter the Biology Workbench 3.2." This will bring up a small screen.

3. Supply your username and password. If you don't have one, click the NEW USER button and create an account for yourself.

4. Click on the "OK" button. This will give you a new screen.

STARTING A NEW SESSION

5. Scroll down and click on the "Session Tools" button.

6. Click on the "NEW" button.

7. At the next screen, create a session name (e.g. "HIV env") in the white box to the right of the words "Session Description."

8. Click on the "Start New Session" button.

UPLOADING NUCLEIC ACID SEQUENCES

9. Click on the "Nucleic Tools" button.

10. Scroll down and click on the "Add" button in the middle of the top line of buttons.

11. Click the "Browse" button.

12. Find the Desktop and double click on the "HIV Sequences" folder. This will open the folder. Go to the popup menu for "Files of type:". Select "All Files" from the menu. This will bring up the data for all subjects. Choose your first subject by double clicking on that subject's number.

13. Click on the "Upload File" button. At this point, you will be able to see the sequences on your screen.

14. IMPORTANT!!!! Click the "SAVE" button at the top of the page.

15. Repeat steps 12-14 to add the data for a second subject.

ALIGNING SELECTED SEQUENCES

16. Click on any number of DNA sequences PER SUBJECT to activate them. A small checkmark should appear in the box to the left of each sequence that you want to analyze.

Note: The names of the files will look like the following: S10V5-6. This stands for subject 10, visit 5, clone 6.
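
If you want to sort or group sequences outside of Biology Workbench, the naming scheme can be parsed mechanically. The following Python is a minimal sketch that assumes every name follows exactly the pattern of the single example given (S, subject number, V, visit number, a hyphen, clone number); the real files may contain variations, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the one example "S10V5-6" (an assumption).
NAME_PATTERN = re.compile(r"^S(\d+)V(\d+)-(\d+)$")


def parse_sequence_name(name: str):
    """Split a name such as 'S10V5-6' into (subject, visit, clone) integers."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized sequence name: {name!r}")
    subject, visit, clone = (int(group) for group in match.groups())
    return subject, visit, clone


def group_by_visit(names):
    """Map (subject, visit) to the list of clone numbers seen at that visit."""
    groups = defaultdict(list)
    for name in names:
        subject, visit, clone = parse_sequence_name(name)
        groups[(subject, visit)].append(clone)
    return dict(groups)


if __name__ == "__main__":
    print(parse_sequence_name("S10V5-6"))
    print(group_by_visit(["S10V5-6", "S10V5-7", "S1V2-3"]))
```

Grouping by subject and visit makes it easier to choose a few clones from each time point when selecting sequences to align.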

17. Click on the "CLUSTALW" button on the right of the second row of buttons at the bottom of the screen. This will give you a new screen with the selected sequences listed.

18. On this screen, click the "Submit" button. Then scroll down to see your alignment.

19. Your alignment can be downloaded to your PC and viewed more easily in GeneDoc.

MAKING AN EVOLUTIONARY TREE

20. Scroll to the top of the page and click on the "Import Alignment(s)" button.

21. In the next screen, click in the box to the left of "CLUSTALW-Nucleic" to activate the set of aligned sequences.

22. Click on the "DRAWGRAM" button.

23. Click on the "Submit" button. Then scroll down to see your tree.

24. Print your tree by choosing "Print" under the "File" menu.
 
 
Author: Sam Donovan -- 2002