Distributions, Hypotheses and Guppies

Humans are pack rats. Not literally *Neotoma cinerea*, but pack rats all the same. We are especially fond of shiny bits of information, which we store in the middens of our minds.

We tend to store our information as reference distributions; that is, we determine typical and extreme values for stuff. For example, you might be the kind of pack rat that has a penchant for tee-shirts. You could tell anyone who asked whether a tee-shirt’s price was average, expensive, or a great buy. Someone else might store sports information (*e.g.*, batting averages), or even the number of pages in novels.

Have you ever noticed that when a new baby is born, everyone (OK, especially women who have already had a baby) comments on the baby’s birth weight? That’s right; people have reference distributions for babies’ weights squirreled away in their heads, too.

So, exactly what is a reference distribution? A reference distribution is a frequency distribution that we use for making comparisons. A frequency distribution is simply a graph, table or chart that shows the number of observations falling into each of several categories (see, *e.g.*, http://www.statcan.ca/english/edu/power/ch8/frequency.htm ).

A person might have it in mind—from looking at a lot of baby girls—that examples of girl birth weights are (in *lbs*, for a forty-week gestation):

Very small 5 ½

Small 6 ½

Average 7 ½

Big 8 ½

Very big 9 ½

(That means most baby girls are about 7 ½ lbs, and baby girls weighing less than 5 ½ lbs or more than 9 ½ lbs are relatively rare. Notice that here, the numbers—5 ½, 6 ½, 7 ½, etc—are the categories, while the words—very small, small, average, etc—hint at the relative frequencies of the categories.)

So, if their next-door neighbor delivers an 11 *lb* baby girl, they would know that’s some baby! They know this because of the reference distribution they carry around.
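A frequency distribution like the one described above is easy to tally by hand, or in a few lines of Python. The sketch below is only an illustration: the list of observed weights is invented for the example (it is not real birth data), and the variable names are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical birth weights in lbs, invented for illustration only.
observed_lbs = [7.5, 6.5, 7.5, 8.5, 7.5, 5.5, 6.5, 7.5, 9.5, 8.5, 7.5, 6.5]

# The categories are the weights; the counts are their frequencies.
frequency = Counter(observed_lbs)

for weight in sorted(frequency):
    count = frequency[weight]
    print(f"{weight:>4} lbs  {'#' * count}  ({count})")

# An 11 lb baby falls outside everything in this reference distribution.
print("11 lb babies seen so far:", frequency[11.0])
```

The printed rows of `#` marks are a crude bar chart: the tallest row is the "average" weight, and a weight that never appears at all is the kind of observation that makes people say "that’s some baby!"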

We create reference distributions for everything from gas prices to grades at school. Of course, some of us are more accurate than others in creating them, but the more accurate we are, the more reliable our inferences will be. (Yeah, that glow-in-the-dark Bobby Sherman tee-shirt is a steal at $62.99…not! For those who need to ask… http://www.youtube.com/watch?v=QJOuTr0BXb4 ). So, it is important to make sure you have accurate data before making an inference.

***

Scientists use reference distributions to evaluate hypotheses; this is called hypothesis testing. Whenever they do an experiment, they have a reference distribution in mind and compare their experimental results to it.

Guppies!

Let’s imagine a scientist is attempting to create a new and improved formula for guppy feed. If she is successful, she strikes gold, because she will make a ton of money selling it.

She has two fish tanks. Each tank holds 30 female guppies, all the same age. Tank ‘A’ guppies got the standard feed; tank ‘B’ guppies got the test formula feed. The guppies were fed on these regimes for the first two months of their lives and then weighed.

The average weight of guppies in Tank ‘A’ was 0.302 grams. When the scientist arranged the thirty Tank ‘A’ guppy weights from lightest to heaviest, she found:

- the lightest guppy weighed 0.219 grams,
- about 25% of the guppies (*i.e.*, the seven smallest guppies) each weighed less than 0.280 grams,
- half the guppies each weighed less than 0.300 grams,
- about 75% of them (twenty-two guppies) each weighed less than 0.321 grams, and
- the largest guppy weighed 0.384 grams.

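These Tank ‘A’ guppies gave the scientist her reference distribution. Very small, small, average, large and very large guppy weights were: 0.22, 0.28, 0.30, 0.32, and 0.38 grams, respectively. This is known as a five-number summary: the minimum, lower quartile, median, upper quartile and maximum (http://wand.cs.waikato.ac.nz/pubs/22/html/node35.html ). Note that the ‘average’ in this summary is the median (0.300 grams), which is close to, but not the same as, the mean (0.302 grams).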
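A five-number summary can be computed from any list of measurements. The sketch below is one possible way to do it in Python. The thirty individual Tank ‘A’ weights are not listed here, so the sample data are invented for the example, and `five_number_summary` is a hypothetical helper name. Quartile conventions differ slightly between textbooks and software; this sketch assumes the "inclusive" method of Python's `statistics.quantiles`.

```python
import statistics

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max)."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points between quarters.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical weights in grams, invented for this sketch
# (these are NOT the real Tank A measurements).
sample_weights = [0.31, 0.22, 0.29, 0.38, 0.30, 0.26, 0.32, 0.28, 0.35]

print(five_number_summary(sample_weights))
```

With real data, the five values returned would play the roles of very small, small, average, large and very large.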

She found that the average (*i.e.*, mean) weight of the thirty Tank ‘B’ guppies was 0.368 grams; the smallest guppy from this group weighed 0.291 grams.

Even the average guppy from Tank ‘B’ was bigger than large guppies from Tank ‘A’! Moreover, the smallest Tank ‘B’ guppy was bigger than any *small* guppy from Tank ‘A’.

She concluded (*i.e.*, inferred) that her new feed formula is better than the standard feed.
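The scientist's reasoning amounts to two comparisons against the Tank ‘A’ reference distribution. As a rough sketch, using only the numbers quoted above (the dictionary keys and variable names are assumptions made for this example):

```python
# Tank A five-number summary from the text, in grams.
tank_a = {"min": 0.219, "q1": 0.280, "median": 0.300, "q3": 0.321, "max": 0.384}

# Tank B results from the text, in grams.
tank_b_mean = 0.368
tank_b_min = 0.291

# Is the average Tank B guppy bigger than a "large" Tank A guppy?
mean_b_above_large_a = tank_b_mean > tank_a["q3"]

# Is the smallest Tank B guppy bigger than a "small" Tank A guppy?
smallest_b_above_small_a = tank_b_min > tank_a["q1"]

print(mean_b_above_large_a, smallest_b_above_small_a)
```

Both comparisons come out true, which is what leads her to infer that the new feed is better. This is an informal comparison against a reference distribution, not a formal statistical test.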

Hypothesis Testing

The guppy feed example demonstrates how an experiment works. The null hypothesis (H_{0}) is that there is no difference between the standard feed and the new feed (H_{0} is often referred to as the hypothesis of *no difference*). The alternative hypothesis (H_{A}) is that the new feed is better than the standard feed.

The scientist then has two possible choices:

**(1)** reject H_{0}, or

**(2)** fail to reject H_{0}

(Scientists never like to say they accept anything, so instead of stating that they *accept H_{A}* they state *reject H_{0}*, and instead of *accepting H_{0}* they *fail to reject H_{0}*. Also, note that the choice is always in reference to a particular H_{A}; a different H_{A} might result in a different choice regarding H_{0}.)

In this guppy example, the data indicates she should choose **(1)**. However, scientists are only human and so experiments are never perfect. The data might lead to one of two types of errors:

First, the data might indicate she should choose **(1)** when she should, in fact, choose **(2)**. That is a **Type I error** (in other words, deciding against H_{0} when it is actually true). Type I errors are also known as ‘false-positives’ (because the experimenter falsely believes the experimental treatment is different from the control).

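Second, the data might indicate she should choose **(2)** when she should, in fact, choose **(1)**. That is a **Type II error** (in other words, deciding in favor of H_{0} when it is actually false). Type II errors are also known as ‘false-negatives’ (because the experimenter falsely believes the experimental treatment is no different from the control).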
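The four possible combinations of truth and decision can be laid out as a small lookup. This is just a sketch to keep the definitions straight; `classify_outcome` is a hypothetical helper name, and in a real experiment nobody knows for certain whether H_{0} is true.

```python
def classify_outcome(null_is_true, rejected_null):
    """Label an experimental decision given the (unknowable) truth about H0."""
    if null_is_true and rejected_null:
        return "Type I error (false positive)"
    if not null_is_true and not rejected_null:
        return "Type II error (false negative)"
    return "correct decision"

# Walk through all four combinations of truth and decision.
for null_is_true in (True, False):
    for rejected_null in (True, False):
        label = classify_outcome(null_is_true, rejected_null)
        print(f"H0 true: {null_is_true!s:5}  rejected H0: {rejected_null!s:5}  -> {label}")
```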

**[**It can be difficult keeping the definitions for Type I and Type II errors straight. You can avoid confusion by using the parable of a prospector panning for gold in a river bed as a mnemonic device.

Imagine that a prospector has a nugget of something in his pan. His choice is to toss it back into the river or add it to his pile of gold. So, H_{0} is stating that “the nugget is no different than ordinary rock” and H_{A} is stating that “the nugget is more valuable than ordinary rock (*e.g.*, gold).”

Now, if it is actually rock but he adds it to his pile of gold, then that would be a mistake (a false hope: Type I error). If it is actually gold—but he tosses it back into the river—then that would be a mistake, too (a missed opportunity: Type II error).

Remember, a Type I error occurs when we reject a true null hypothesis—*T1RTN*. A Type II error occurs when we reject a true alternative hypothesis—*T2RTA*.**]**

You can learn more about hypothesis testing at these websites (which give a more advanced description):

http://www.psychstat.missouristate.edu/introbook/SBK18.htm

http://teacher.pas.rochester.edu/phy_labs/appendixe/appendixe.html

http://www.unc.edu/courses/2007spring/biol/145/001/docs/lectures/Sep29.html

http://www.stattutorials.com/understanding-hypothesis-testing.html

***

A note on statistical errors (Type I and Type II).

How can it be that the data would lead you to make an error? It happens just by chance, even with good data. Consider testing the fairness of flipping a coin. (H_{0}: the coin is *not different from* ‘fair’; H_{A}: the coin is ‘*not* fair’.) A fair coin should come up heads 1/2 of the time and tails 1/2 of the time.

You flip a coin five times, and five times it comes up heads. The data indicates you should declare the coin to be ‘not fair’ (reject H_{0}). However, that is not an absolute statement; it is one based on probability (in other words, you infer that the coin is *probably* not fair—and you reject H_{0}—but you are not absolutely certain that the coin is ‘not fair’).

It is possible that just by chance, with a fair coin, you could get five heads in a row—not very likely, but it is possible (actually, you can expect that to happen with a frequency of (½)^{5}, which is about 3% of the time. Imagine one thousand people, each flipping five fair coins; you could expect that about thirty of them would get all heads). If this is one of those ‘3%’ times, you would have made a Type I error in declaring a coin to be loaded when it is actually fair!
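The arithmetic behind that ‘3%’ can be checked directly, and the thousand-people thought experiment can be simulated. The sketch below does both; the function name, its parameters and the random seed are assumptions made for this example, and the simulated count will vary with the seed.

```python
from fractions import Fraction
import random

# Exact probability of five heads in five flips of a fair coin.
p_all_heads = Fraction(1, 2) ** 5                   # 1/32
expected_per_thousand = float(p_all_heads * 1000)   # 31.25

def count_all_heads(n_people=1000, n_flips=5, seed=0):
    """Simulate n_people each flipping a fair coin n_flips times.

    Returns how many of them saw heads on every flip.
    """
    rng = random.Random(seed)
    count = 0
    for _ in range(n_people):
        if all(rng.random() < 0.5 for _ in range(n_flips)):
            count += 1
    return count

print(p_all_heads, float(p_all_heads))   # 1/32 is 0.03125, about 3%
print(expected_per_thousand)             # about thirty people per thousand
print(count_all_heads())                 # one simulated run; varies with seed
```

Every person counted by the simulation flipped a perfectly fair coin, yet each of them would have rejected H_{0} on the strength of their own five flips. Those are the Type I errors.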

***

See also Science: Data, Description and Experimentation https://wepoplaski.wordpress.com/2008/10/06/science-data-description-and-experimentation/
