matched pairs

Learn by Doing

Matched Pairs: In this lab you will learn how to conduct a matched pairs T-test for a population mean using StatCrunch. We will work with a data set that has historical importance in the development of the T-test.

Some features of this activity may not work well on a cell phone or tablet. We highly recommend that you complete this activity on a computer.

Here are the directions, grading rubric, and definition of high-quality feedback for the Learn by Doing discussion board exercises.

A list of StatCrunch directions is provided at the bottom of this page.

Context

Gosset’s Seed Plot Data

William S. Gosset was employed by the Guinness brewing company of Dublin. Sample sizes available for experimentation in brewing were necessarily small. Gosset contacted the famous statistician Karl Pearson (1857-1936) and was told that there were no techniques for developing probability models for small data sets. Gosset studied under Pearson, and the outcome of his study was perhaps the most famous paper in statistical literature, “The Probable Error of a Mean” (1908), which introduced the T-distribution.

Since Gosset was employed by Guinness, any work he produced would be owned by Guinness, so he published under a pseudonym, “Student”; hence, the T-distribution is often referred to as Student’s T-distribution.

To illustrate his analysis, Gosset used the results of seeding 11 different plots of land with two different types of seed: regular and kiln-dried. He wanted to determine if drying seeds before planting increased plant yield. Since different plots of soil may be naturally more fertile, this confounding variable was eliminated by using the matched pairs design and planting both types of seed in all 11 plots.

The resulting data (corn yield in pounds per acre) are as follows.

Plot  Regular seed  Kiln-dried seed
1     1903          2009
2     1935          1915
3     1910          2011
4     2496          2463
5     2108          2180
6     1961          1925
7     2060          2122
8     1444          1482
9     1612          1542
10    1316          1443
11    1511          1535

We use these data to test the hypothesis that kiln-dried seed yields more corn than regular seed.

Because of the nature of the experimental design (matched pairs), we are testing the difference in yield.

Plot  Regular seed  Kiln-dried seed  Difference
1     1903          2009             -106
2     1935          1915             20
3     1910          2011             -101
4     2496          2463             33
5     2108          2180             -72
6     1961          1925             36
7     2060          2122             -62
8     1444          1482             -38
9     1612          1542             70
10    1316          1443             -127
11    1511          1535             -24

Note that the differences were calculated as regular – kiln-dried.
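The difference column can also be reproduced outside StatCrunch. The short Python sketch below is only an illustration and is not part of the lab directions; the variable names are our own. It recomputes regular – kiln-dried for each plot from the table above and then averages the differences.

```python
# Yields (pounds per acre) for plots 1-11, copied from the table above.
regular = [1903, 1935, 1910, 2496, 2108, 1961, 2060, 1444, 1612, 1316, 1511]
kiln_dried = [2009, 1915, 2011, 2463, 2180, 1925, 2122, 1482, 1542, 1443, 1535]

# Difference for each plot, in the order used in this lab: regular - kiln-dried.
differences = [r - k for r, k in zip(regular, kiln_dried)]

# Sample mean of the differences.
mean_difference = sum(differences) / len(differences)

print(differences)
print(round(mean_difference, 2))
```

A negative difference means the kiln-dried seed produced the larger yield on that plot.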

Variables

Regular seed: regular seeds that were traditionally used for planting
Kiln-dried seed: seeds that were kiln-dried before planting

Data

Download the seed data file, and then upload the file into StatCrunch.

Prompt

  1. State the hypotheses and define the parameter.
  2. Checking conditions: Since Gosset invented the T-distribution, we will assume that his sample meets the conditions and proceed with the T-test. Regardless, answer these questions to demonstrate your understanding of the conditions for use of the T-model.

    But first you will need to review the dotplots for the data.

    1. Which graph is used to check conditions? Why?
    2. What do we look for in the graph to verify that conditions are met?
    3. What else do we need to know about the sample of seeds before using the T-test?
  3. Use StatCrunch to find the T-score and the P-value. Hint: as you work through the StatCrunch directions, keep in mind that we want to calculate the differences as regular – kiln-dried. So you will choose Regular seed for Sample 1 and Kiln-dried seed for Sample 2.
    Copy and paste the information in the StatCrunch output window into your initial post.
  4. State a conclusion based on the context of this scenario.

Example Answer

1. Ho: μD = 0

Ha: μD > 0

Here μD is the mean difference in yield (regular – kiln-dried) for all plots. The sample mean difference is -33.73.

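2. a) We use the dotplot of the differences, because a matched pairs test analyzes the differences rather than the two separate samples.

b) We check that the distribution of the differences looks approximately normal: roughly symmetric, not strongly skewed, and without outliers.

c) We need to know whether the plots were randomly selected. The data set does not tell us.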
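To make the shape check in part (b) concrete, here is a rough text version of a dotplot of the differences. It is a sketch only: the bin width of 50 is an assumption chosen for this illustration, and the dotplots linked in the prompt remain the graphs to use for the lab.

```python
from collections import Counter

# Differences (regular - kiln-dried) for the 11 plots.
differences = [-106, 20, -101, 33, -72, 36, -62, -38, 70, -127, -24]

BIN_WIDTH = 50  # assumed bin width, chosen only for this illustration

# Floor division assigns each difference to the lower edge of its bin.
counts = Counter((d // BIN_WIDTH) * BIN_WIDTH for d in differences)

# Print one row per bin, with one dot per plot whose difference falls in it.
for lower in range(min(counts), max(counts) + BIN_WIDTH, BIN_WIDTH):
    upper = lower + BIN_WIDTH
    print(f"[{lower:>5}, {upper:>5})  " + "." * counts[lower])
```

With only 11 differences, a plot like this can reveal strong skew or an isolated outlier, but it cannot confirm normality.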

3.

Paired T hypothesis test:

μD = μ1 – μ2 : Mean of the difference between Regular seed and Kiln-dried seed
H0 : μD = 0
HA : μD > 0
Hypothesis test results:

Difference                      Mean        Std. Err.  DF  T-Stat      P-value
Regular seed – Kiln-dried seed  -33.727273  19.951346  10  -1.6904761  0.9391

Differences stored in column, Differences.
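The numbers in the StatCrunch output can be checked by hand. The sketch below uses only the Python standard library and is not how StatCrunch computes its results: the helper functions t_density and t_cdf are our own, and the T-distribution probability is approximated by numerical integration (Simpson's rule). The T-score is the sample mean difference divided by its standard error, with n – 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Differences (regular - kiln-dried) for the 11 plots.
differences = [-106, 20, -101, 33, -72, 36, -62, -38, 70, -127, -24]

n = len(differences)
df = n - 1                                   # degrees of freedom
std_err = stdev(differences) / math.sqrt(n)  # standard error of the mean difference
t_stat = mean(differences) / std_err         # the null value of the mean difference is 0


def t_density(x, df):
    """Density of the T-distribution with df degrees of freedom (hypothetical helper)."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_density(0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


p_upper = 1 - t_cdf(t_stat, df)  # P-value for HA: muD > 0, as in the output above
p_lower = t_cdf(t_stat, df)      # P-value for HA: muD < 0, shown for comparison

print(round(std_err, 4), df, round(t_stat, 4))
print(round(p_upper, 4), round(p_lower, 4))
```

The standard error, T-score, and upper-tail P-value should agree with the StatCrunch row above up to rounding. The two one-sided P-values always add to 1, which is why the direction of the alternative hypothesis matters so much here.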

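4. Based on the P-value of 0.9391, we do not have enough evidence to reject the null hypothesis. Note that because the differences were calculated as regular – kiln-dried, the claim that kiln-dried seed yields more corresponds to HA : μD < 0, and the P-value for that alternative is 1 – 0.9391 = 0.0609. This is still greater than 0.05, so at the usual 5% significance level there is no statistically significant evidence that kiln-dried seeds yield more than regular seeds.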
