(extracts for 1990-91 field test manual by Dan Hornbach)

The BioQUEST project is based on the pedagogy of the 3 P's - Problem Posing, Problem Solving, and Persuasion. Many people jump to the conclusion that statistics concerns only the "persuasion" part of the 3 P's. However, as you come to understand statistics, you learn that statistical analysis, and its underlying philosophy, begins with Problem Posing, just as the 3 P's do. Statistics can aid in problem posing by allowing you to organize your observations so that they suggest hypotheses to be tested.

Statistical analysis also aids in the development of experiments to test hypotheses (problem solving). If you are going to test a hypothesis, you must make sure that the experiment you construct provides an objective means by which you can judge the outcome. Experimental design is an entire branch of statistics.

Finally, statistical analysis does aid in persuasion. It allows you to describe the confidence you have in your analysis. You must be careful, however, for it is very easy to misuse statistics. You must be certain that you have chosen the correct tests, that your data conform to the underlying assumptions of the tests, and that you have not performed a posteriori tests (that is, tests devised only after you have collected your data).

There are a number of "canned" statistical packages available for the Macintosh. A good listing of these can be found in the Macintosh Buyers Guide, which is published quarterly and can be purchased at most book stores.

Most of the tests that are outlined in this module can easily be performed with a spreadsheet program such as Excel or Wingz. Many spreadsheet programs of this type include integrated graphics, so you can examine your data visually as well as quantitatively. The drawback to this type of analysis is that the equations must be entered by hand, and the size of the data set will often alter how the equations are written. The advantage, however, is that you learn more about what the statistical test actually does with your data.

In choosing a statistics package, you should consider one that has as its underlying focus Exploratory Data Analysis (EDA). This means that rather than just performing particular tests, the software allows you to explore your data: that is, to ask the "what if" questions. Two such programs that are available for the Macintosh are JMP (by SAS Institute) and DataDesk (by Odesta Corporation).

To help you in your introduction to statistical analysis, a HyperCard stack related to this manual has been included. The stack has been designed to help you decide, based on the data you have collected, which statistical tests are most appropriate for your situation. The stack is not all-encompassing; that is, it deals only with the statistical tests found in this manual and not with all possible situations.

Statistical analysis is one of the most widely used, and abused, techniques in the biological sciences. Statistics are ostensibly used to allow an investigator to be objective. That is, the researcher uses statistical tests to determine whether or not his/her hypothesis is supported by the data collected. Unfortunately, the choice of the particular statistical test is often not objective, and the underlying limitations of individual tests are often ignored by, or unknown to, the researcher. Yet statistical analysis, when appropriately applied, allows scientists to examine the probability that their hypotheses are or are not supported by the data collected.

Almost any hypothesis set forth by a biologist lends itself to statistical analysis. For example, if you are testing the effect of a drug on the survival of mice and 15 of 30 mice taking the drug survived while only 12 of 30 mice taking a placebo survived, would you conclude that the drug increases the chance of survival? Isn't it possible that, by chance, 3 more mice in the experimental group survived and their survival was unrelated to the drug effects? Another example: you are examining the potential effect of insect grazers on the growth of milkweed plants. You decide to measure the weight of the milkweed plants from 2 locations -- one where predation is high, the other where predation is low. You find that the average weight of a plant from each location is 13 grams, but in the location where predation is high the plants vary in weight from 2-25 grams, whereas in the location where predation is low the weights vary only from 10-15 grams. How do you test whether the insects affect the weight of the plants?
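One way to put a number on the question raised by the mouse example above is to ask how often chance alone would place at least 15 of the 27 surviving mice in the drug group. The following is a minimal sketch in Python, not a procedure taken from this manual: it applies a one-sided Fisher's exact test, a technique chosen here only for illustration, and the function name one_sided_fisher is hypothetical.

```python
from math import comb


def one_sided_fisher(surv_a, n_a, surv_b, n_b):
    """Probability that group A has at least surv_a survivors when
    survival is unrelated to group membership (hypergeometric model)."""
    total = n_a + n_b
    total_surv = surv_a + surv_b
    total_dead = total - total_surv
    # Number of equally likely ways to choose which animals form group A.
    all_splits = comb(total, n_a)
    # Group A cannot hold more survivors than exist, or than its own size.
    k_max = min(n_a, total_surv)
    favourable = 0
    for k in range(surv_a, k_max + 1):
        # k survivors and (n_a - k) non-survivors end up in group A.
        favourable += comb(total_surv, k) * comb(total_dead, n_a - k)
    return favourable / all_splits


# Counts from the text: 15 of 30 drug mice and 12 of 30 placebo mice survived.
p_value = one_sided_fisher(15, 30, 12, 30)
print(f"P(at least 15 survivors in the drug group by chance) = {p_value:.3f}")
```

For these counts the probability is well above the conventional 0.05 cutoff, so a surplus of 3 survivors in the drug group could easily arise by chance.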

There are many ways that statistical analysis can aid in hypothesis testing. The most important use of statistical analysis, however, is to allow a researcher to describe objectively the confidence he/she has in the acceptance or rejection of a given hypothesis. This is done by testing a NULL HYPOTHESIS (H0), which is the hypothesis that there is no difference between two or more groups being examined. For example, you might want to test the following null hypothesis: the weight attained by bean plants watered daily is the same as that of plants watered every third day. Every null hypothesis has an ALTERNATIVE HYPOTHESIS (HA), and in this experiment HA is that the weight attained by plants watered every third day is NOT the same as (either greater or less than) that of plants watered every day. After you collect your data (the weight of the plants after some period of time), you can statistically examine whether or not the average weights are different. There is always some possibility that the weights of the two groups differ by chance alone. How do you tell the difference? YOU DON'T! You can only state with some degree of certainty (which is based on the statistical test that you've performed) that they are different; you ASSUME that the difference is due to the treatment.
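The idea that two groups may differ "by chance alone" can be illustrated with a permutation test. The sketch below is not one of the tests described in this manual, the plant weights are invented purely for illustration, and the names DAILY, EVERY_THIRD_DAY and permutation_p_value are hypothetical. It repeatedly reshuffles the group labels, which is what H0 says should make no difference, and counts how often the reshuffled groups differ by at least as much as the observed ones.

```python
import random
from statistics import mean

# Hypothetical bean plant weights in grams (invented for illustration).
DAILY = [12.1, 13.4, 11.8, 14.0, 12.9, 13.7]
EVERY_THIRD_DAY = [10.2, 11.5, 9.8, 11.1, 10.9, 12.0]


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate how often randomly relabelled groups differ in mean weight
    by at least as much as the observed groups (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # under H0 the labels are interchangeable
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


p_value = permutation_p_value(DAILY, EVERY_THIRD_DAY)
print(f"Estimated probability of so large a difference under H0: {p_value:.4f}")
```

A small value means that chance alone rarely produces a difference this large. It does not prove that watering caused the difference; as stated above, that remains an assumption.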

There are two types of errors you can make when drawing a conclusion about your null hypothesis: you can reject the hypothesis when it is true, or you can accept the hypothesis when it is false. The first type of error is called a type I error (sometimes called an alpha error); the second is called a type II (or beta) error (see Table 1).
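The two kinds of error can be made concrete with a small simulation. This is a sketch under stated assumptions rather than anything prescribed by the manual: the data are drawn from normal distributions with a known standard deviation, the test is a simple two-sample z-test with a 0.05 cutoff, and the sample size, the true difference of 0.5, and all function names are hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import fmean

Z_CRITICAL = 1.96  # two-sided cutoff for a 0.05 significance level


def rejects_null(sample_a, sample_b, sigma):
    """Two-sample z-test with known sigma; True if H0 (equal means) is rejected."""
    std_error = sigma * sqrt(1 / len(sample_a) + 1 / len(sample_b))
    z = (fmean(sample_a) - fmean(sample_b)) / std_error
    return abs(z) > Z_CRITICAL


def rejection_rate(true_diff, n=20, sigma=1.0, trials=2000, seed=1):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_diff, sigma) for _ in range(n)]
        if rejects_null(a, b, sigma):
            rejections += 1
    return rejections / trials


# H0 is true (no real difference): every rejection is a type I error.
type_i_rate = rejection_rate(true_diff=0.0)
# H0 is false (real difference of 0.5): every non-rejection is a type II error.
type_ii_rate = 1 - rejection_rate(true_diff=0.5)

print(f"Estimated type I (alpha) error rate: {type_i_rate:.3f}")
print(f"Estimated type II (beta) error rate: {type_ii_rate:.3f}")
```

The type I rate should fall near the chosen 0.05 cutoff, whereas the type II rate depends on the size of the real difference, the variability of the data, and the sample size.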

The manual goes on from here to provide examples and analysis in a tutorial fashion.

Statistics on the Macintosh

Biometrics HyperCard Stack

Introduction

Error and Hypothesis Testing

Experimental Design

Statistical Analysis

Data Collection

Measurement Scales

Accuracy and Precision

Distributions

Measures of Central Tendency and Dispersion

Measures of Central Tendency

Measures of Variability or Dispersion

Statistical Tests

Goodness of Fit

F-Test and T-Test

Mann-Whitney

One-Way ANOVA

Kruskal-Wallis

Regression

Correlation

A Note on One-Tailed and Two-Tailed Tests

Two Good Biostatistical References

Appendix A. Critical Values of The χ² Distribution

Appendix B. Critical Values of The F Distribution

Appendix C. Critical Values of The t Distribution

Appendix D. Critical Values of The U Distribution

Appendix E. Critical Values of The q Distribution

Appendix F. Critical Values of The H For The Kruskal-Wallis Test

Appendix G. Critical Values of The Q For Multiple Comparisons

Appendix H. Critical Values of r, The Correlation Coefficient

Index