Chapter 7: Inference for Means

Inference for the Mean of a Population

Beer Example

One night Jon was drinking a 12 pack of beer. After 3 beers he started worrying that he was getting cheated by the beer company and was not getting the amount of beer that he paid for. The remaining beer cans were taken to the UMM chemistry lab and the actual contents were accurately measured. The results in ounces are as follows:

15.8, 16.2, 16.3, 15.9, 15.5, 15.9, 16.0, 15.6, 15.8

Is Jon being cheated? Test at the 5% level. The data analyzed in Statlets give the following result:

Summary Statistics for Beer

Sample size = 9
Mean = 15.8889
Median = 15.9
Standard deviation = 0.257121
Minimum = 15.5
Maximum = 16.3
Range = 0.8
Standardized skewness = 0.192705
Standardized kurtosis = -0.245636

The hypotheses are: $H_0: \mu = 16$ vs $H_a: \mu < 16$. The test statistic is:

\[ T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{15.8889 - 16}{0.257121/\sqrt{9}} = -1.296 \]

The p-value is: $P(T < -1.296)$ with 8 degrees of freedom, which is between .10 and .15.

  1. The probability of observing data like ours is between .10 and .15 if the null hypothesis is true.
  2. These data are likely to occur if the null hypothesis was true.
  3. The data are consistent with the null hypothesis.
  4. No evidence to doubt the null hypothesis.
  5. No evidence to support the alternative hypothesis.
  6. No evidence to suggest that the beer cans are underfilled. No evidence that Jon is being cheated. Jon should drink the remaining beer and quit whining.
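
The arithmetic behind the test statistic and the table lookup can be sketched in Python. This is only an illustration: the variable and function names are made up here, and the critical values are the df = 8 row of a standard t table (upper-tail probabilities), typed in by hand rather than taken from Statlets.

```python
import math
import statistics

# Contents (ounces) of the nine remaining cans, from the example above.
beer = [15.8, 16.2, 16.3, 15.9, 15.5, 15.9, 16.0, 15.6, 15.8]
mu0 = 16.0  # value of the mean under the null hypothesis

n = len(beer)
xbar = statistics.mean(beer)
s = statistics.stdev(beer)  # sample standard deviation (divides by n - 1)
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# df = 8 row of a standard t table: upper-tail probability -> critical value.
# These are assumed table values, entered by hand.
t_table_df8 = {0.25: 0.706, 0.20: 0.889, 0.15: 1.108, 0.10: 1.397,
               0.05: 1.860, 0.025: 2.306, 0.01: 2.896, 0.005: 3.355}

def bracket_p(t_abs, row):
    """Return (low, high) bounds on a one-sided p-value from one table row."""
    low, high = 0.0, 0.5
    for tail_area, critical in row.items():
        if t_abs >= critical:
            high = min(high, tail_area)  # beyond this critical value
        else:
            low = max(low, tail_area)    # not beyond it
    return low, high

# The alternative is "less than" and t_stat is negative, so by symmetry
# the lower-tail p-value equals the upper-tail area beyond |t_stat|.
p_low, p_high = bracket_p(abs(t_stat), t_table_df8)
print(f"t = {t_stat:.3f}, one-sided p-value between {p_low} and {p_high}")
```

A table can only bracket the p-value; software reports a single number because it evaluates the t distribution directly.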

Beer CI

The research question is: Is 16 oz in the range of plausible values for this population?

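\[ \bar{x} \pm t^* \frac{s}{\sqrt{n}} = 15.8889 \pm 2.306 \cdot \frac{0.257121}{\sqrt{9}} = 15.8889 \pm 0.1976 = (15.69, 16.09) \]

where $t^* = 2.306$ is the 95 percent critical value and the degrees of freedom were df=n-1=9-1=8.

Conclusion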

Milk carton example

One day Jon went to TMC and bought 6 cartons of milk to see if he was getting cheated by the milk company and not getting as much milk as he paid for. The milk was then taken to a high tech measurement lab and the actual contents were measured. The results in ml were as follows:

233.5, 233, 233.5, 234, 233.5, 233.5

Is Jon being cheated? Test at the 5% level. The Statlets analysis below was produced by going into the stats tab and the t-test tab, clicking on options, and picking the less than option.

Summary Statistics for milk
 
Sample size = 6
Mean = 233.5
Median = 233.5
Standard deviation = 0.316228
Minimum = 233.0
Maximum = 234.0
Range = 1.0
Standardized skewness = 0.0
Standardized kurtosis = 1.25

Estimation of Population Mean for milk
 
Sample size = 6
Mean = 233.5
 
95.0% upper confidence bound for mean: 233.5 + 0.260142   [233.76]
 
t-test
------
Null hypothesis: mean = 236.0
Alt. hypothesis: less than
Computed t-statistic = -19.3649
P-value = 3.38744E-6
Reject the null hypothesis for alpha = 0.05

The hypotheses are: $H_0: \mu = 236$ vs $H_a: \mu < 236$. The test statistic is:

\[ T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{233.5 - 236}{0.316228/\sqrt{6}} = -19.36 \]

The p-value is: $P(T < -19.36)$ with 5 degrees of freedom, which is less than .0005.

  1. The probability of observing data like ours is less than .0005 if the null hypothesis is true.
  2. The data are unusual if the null hypothesis is true.
  3. The data are inconsistent with the null hypothesis.
  4. The data suggest evidence for the alternative hypothesis.
  5. Evidence for the alternative hypothesis means evidence that $\mu < 236$ ml. Evidence that the milk cartons are underfilled and that Jon is getting cheated. Jon should whine and complain to somebody.
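
Statlets reports an exact p-value where the table only gives a bound. One way to see where such a number comes from is to integrate the t density numerically. The sketch below uses Simpson's rule and only the Python standard library; it is not a claim about how Statlets computes the value, and the function names are invented for this illustration.

```python
import math
import statistics

milk = [233.5, 233, 233.5, 234, 233.5, 233.5]  # carton contents in ml
mu0 = 236.0

n = len(milk)
df = n - 1
t_stat = (statistics.mean(milk) - mu0) / (statistics.stdev(milk) / math.sqrt(n))

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=20000):
    """P(T > t) for t >= 0: one half minus the area from 0 to t."""
    h = t / steps  # steps must be even for Simpson's rule
    area = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - area * h / 3

# The alternative is "less than" and t_stat is negative, so by symmetry
# P(T < t_stat) equals P(T > |t_stat|).
p_value = upper_tail(abs(t_stat), df)
print(f"t = {t_stat:.4f}, p-value = {p_value:.3e}")
```

The result can be compared with the t-statistic and P-value lines of the Statlets output above.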

Milk Carton CI

The research question is: Is 236ml in the range of plausible values for this population?

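\[ \bar{x} \pm t^* \frac{s}{\sqrt{n}} = 233.5 \pm 2.571 \cdot \frac{0.316228}{\sqrt{6}} = 233.5 \pm 0.33 = (233.17, 233.83) \]

where $t^* = 2.571$ is the 95 percent critical value with df=n-1=6-1=5.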
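
As a rough check on a two-sided interval for the milk data, and on the one-sided bound in the Statlets output, the one-sample formula $\bar{x} \pm t^* s/\sqrt{n}$ can be evaluated directly. The critical values below are assumptions read from a standard t table for df = 5 (2.571 for a two-sided 95 percent interval, 2.015 for a one-sided 95 percent bound), and `t_interval` is a name made up for this sketch.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) = xbar -/+ t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Summary statistics for the milk cartons, from the Statlets output.
xbar, s, n = 233.5, 0.316228, 6

# Two-sided 95 percent interval, df = 5 (t* = 2.571 from a t table).
lower, upper = t_interval(xbar, s, n, t_star=2.571)

# One-sided 95 percent upper bound, df = 5 (t* = 2.015 from a t table).
_, upper_bound = t_interval(xbar, s, n, t_star=2.015)

print(f"95% interval: ({lower:.2f}, {upper:.2f})")
print(f"95% upper bound: {upper_bound:.2f}")
print("236 inside the interval?", lower <= 236 <= upper)
```

The one-sided bound uses a smaller critical value because all 5 percent of the error probability is placed on one side.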

Conclusion

Analysis of Matched Pairs Experiments

Boy's Shoes Example

A shoe company was trying to determine if two types of shoe materials had different durability. These materials are denoted A and B. Material B was suspected to wear out sooner, but it was a cheaper substance. Ten kids were tested, and the difference in wear was calculated as B-A for each child. The mean difference was .41 and the standard deviation of the differences was .39.

Hypothesis Test

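The hypotheses are: $H_0: \mu_d = 0$ vs $H_a: \mu_d > 0$, where $\mu_d$ is the population mean of the differences B-A. The test statistic is:

\[ T = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{0.41}{0.39/\sqrt{10}} \approx 3.32 \]

The p-value, the probability that a t variable with 9 degrees of freedom is at least as large as the observed statistic, is equal to .0043.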

  1. The probability of observing data like ours is .0043 if the null hypothesis is true.
  2. The data are unusual if the null hypothesis is true.
  3. The data are inconsistent with the null hypothesis.
  4. The data suggest evidence for the alternative hypothesis.
  5. Evidence for the alternative hypothesis means evidence that the mean of the differences B-A is greater than zero, $\mu_d > 0$. Evidence that material B is less durable than material A. The company should use material A even though it is more expensive.

Confidence Interval

Research question: Is material B less durable than material A?

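\[ \bar{d} \pm t^* \frac{s_d}{\sqrt{n}} = 0.41 \pm 2.262 \cdot \frac{0.39}{\sqrt{10}} = 0.41 \pm 0.28 = (0.13, 0.69) \]

where $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences B-A, and $t^* = 2.262$ is the 95 percent critical value with df=n-1=10-1=9.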
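
A matched pairs analysis is just a one-sample t procedure applied to the differences. The sketch below redoes the arithmetic from the three summary numbers given for this example; because those are rounded, the statistic may differ slightly from one computed on the raw data. The value 2.262 is the df = 9, 95 percent critical value from a t table (an assumption, as is the 95 percent level).

```python
import math

# Summary of the B-A differences for the 10 children (rounded values from the text).
n, mean_diff, sd_diff = 10, 0.41, 0.39

se = sd_diff / math.sqrt(n)    # standard error of the mean difference
t_stat = (mean_diff - 0) / se  # null hypothesis: mean difference is 0

t_star = 2.262                 # table value, df = n - 1 = 9, 95 percent
lower = mean_diff - t_star * se
upper = mean_diff + t_star * se

print(f"t = {t_stat:.2f} on {n - 1} df")
print(f"95% interval for the mean difference: ({lower:.2f}, {upper:.2f})")
print("zero inside the interval?", lower <= 0 <= upper)
```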

Conclusion

Inference for Two Means

In this section we are usually trying to determine if two populations have different means. That is, we are trying to compare the means $\mu_1$ and $\mu_2$. Both significance testing and confidence intervals are used to make these comparisons.

Hypothesis Test Approach

The usual null hypothesis is no difference between the two population means. This is usually written as $H_0: \mu_1 = \mu_2$. The alternative is usually that there is a difference between the two group means: $H_a: \mu_1 \neq \mu_2$. One-sided alternative hypotheses are also possible.

The test statistic is a T, with formula given by,

\[ T = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

where $\bar{x}_1, s_1, n_1$ are the sample mean, standard deviation, and sample size for the first group, and $\bar{x}_2, s_2, n_2$ are the sample values for the second group.

The next step is to compute the p-value. However, this can get a bit complicated because in theory this test statistic does not exactly follow a t distribution. Luckily for us it does follow a t distribution approximately, but the best degrees of freedom to use is a sticky problem. If you are doing the problem by hand without the aid of a computer, the degrees of freedom you should use is the smaller group sample size minus one, that is, the smaller of $n_1 - 1$ and $n_2 - 1$. If you are using a computer program that provides an estimate of the appropriate degrees of freedom on the output, go ahead and use what is provided, as it will be slightly more accurate than the by-hand method.
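
For reference, the degrees of freedom that software reports is usually the Welch-Satterthwaite approximation. The notes do not say which formula a given package uses, so treat the following as the common choice rather than a guarantee. Here $s_1, n_1$ and $s_2, n_2$ are the sample standard deviations and sample sizes of the two groups:

```latex
\[
  df \approx
  \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}
       {\dfrac{1}{n_1 - 1}\left( \dfrac{s_1^2}{n_1} \right)^2
      + \dfrac{1}{n_2 - 1}\left( \dfrac{s_2^2}{n_2} \right)^2}
\]
```

This value always lies between the smaller of $n_1 - 1$ and $n_2 - 1$ on the low end and $n_1 + n_2 - 2$ on the high end, which is why the by-hand rule is the cautious choice.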

The p-value is computed as usual: for a two-sided test, the p-value is $2P(T > |t|)$, where $t$ is the observed value of the test statistic, with the appropriate modifications for one-sided tests. Draw the conclusions as usual.

Confidence Interval

The confidence interval for $\mu_1 - \mu_2$ is given by the following formula:

\[ (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]

All that needs to be done is to place the appropriate sample values into the formula. The degrees of freedom rules should be the same as above, either use the smaller of the two sample sizes minus one, or use the complex degrees of freedom computed by your favorite stat package if provided.

The crucial value to compare with is zero. If zero is not contained in the interval it means that zero is not a plausible value for the difference in the two means. Zero not plausible means that no difference between the groups is not a reasonable statement, and thus there is evidence the groups differ. How do they differ? This depends on what the interval values are. If the whole interval is above zero, it implies that the first group mean is higher than the second, and vice-versa if the interval contains all negative values. If the interval contains zero it implies that no difference between the groups is a plausible statement, and thus there is no evidence to suggest they differ.
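
The decision rule in the paragraph above is mechanical enough to write down as a small function. This is only a sketch of that reasoning; the function name and the wording of the returned messages are made up here, and the three intervals are the ones quoted in the examples of this chapter.

```python
def interpret_difference_interval(lower, upper):
    """Say what an interval for (mean 1 - mean 2) implies about the two groups."""
    if lower > 0:
        return "evidence that the first group mean is higher"
    if upper < 0:
        return "evidence that the second group mean is higher"
    # Zero is a plausible value for the difference.
    return "no evidence that the group means differ"

print(interpret_difference_interval(18.5, 54.0))   # hot dogs: meat - poultry
print(interpret_difference_interval(-9.7, 2.6))    # tips: smiley face - no face
print(interpret_difference_interval(-2.9, -0.68))  # Braves: before - after
```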

Hot Dog Example

The MINITAB output below compares the fat or sodium content (I can't recall which at the moment) of two types of hot dogs: mystery meat and poultry. For our purposes here, the test statistic and p-value, along with the confidence interval on the output, show evidence that mystery meat hot dogs have a higher sodium/fat content than poultry dogs do. Do you see why?

 MTB > set c1
 DATA> 173 191 182 190 172 147 146 139 175 136 179 153 107 195 135 140 138
 DATA> end
 MTB > set c2
 DATA> 129 132 102 106 94 102 87 99 170 113 135 142 86 143 152 146 144
 DATA> end
 MTB > name c1 'meat'
 MTB > name c2 'poultry'
 MTB > twosample data in 'meat' 'poultry'
 
 TWOSAMPLE T FOR meat VS poultry
           N      MEAN     STDEV   SE MEAN
 meat     17     158.7      25.2       6.1
 poultry  17     122.5      25.5       6.2
 
 95 PCT CI FOR MU meat - MU poultry: ( 18.5,  54.0)
 
 TTEST MU meat = MU poultry (VS NE): T= 4.17  P=0.0002  DF=  31
 
 MTB > stack c1 c2 in c3;
 SUBC> subscripts in c4.
 MTB > boxplot c3;
 SUBC> by c4.
         
 C4      
 
                                       ---------------------
 1                     ----------------I      +            I--------
                                       ---------------------
 
                    ----------------------
 2          --------I             +      I-------------
                    ----------------------
           --------+---------+---------+---------+---------+--------C3      
                 100       120       140       160       180
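
The T and DF values on the Minitab output can be reproduced from the raw data with a short Python sketch. It uses the unpooled test statistic from this section and assumes the Welch-Satterthwaite formula for the degrees of freedom, rounded down; that assumption is not stated on the output, though it does give DF = 31 here. The function name is invented for this illustration.

```python
import math
import statistics

meat = [173, 191, 182, 190, 172, 147, 146, 139, 175,
        136, 179, 153, 107, 195, 135, 140, 138]
poultry = [129, 132, 102, 106, 94, 102, 87, 99, 170,
           113, 135, 142, 86, 143, 152, 146, 144]

def two_sample_t(x, y):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    n1, n2 = len(x), len(y)
    v1 = statistics.variance(x) / n1  # s1^2 / n1
    v2 = statistics.variance(y) / n2  # s2^2 / n2
    t_stat = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

t_stat, df = two_sample_t(meat, poultry)
df_by_hand = min(len(meat), len(poultry)) - 1  # smaller sample size minus one

print(f"T = {t_stat:.2f}")
print(f"software-style df = {math.floor(df)}, by-hand df = {df_by_hand}")
```

The by-hand rule gives far fewer degrees of freedom than the software value, so it leads to a somewhat larger p-value and a wider interval; with a statistic this large the conclusion is the same either way.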

An Assortment of Two Sample Examples

Male Waiter, Smiley Face Data

In this example there are two samples of tip percentages for a male waiter. In one set a smiley face was put on the check and in the other set, no smiley face was used. The data summary is as follows:

 MTB > twosample 'wface' 'wnoface'
 
 TWOSAMPLE T FOR wface VS wnoface
           N      MEAN     STDEV   SE MEAN
 wface    23     17.83      5.52       1.2
 wnoface  21      21.4      12.7       2.8
 
 95 PCT CI FOR MU wface - MU wnoface: ( -9.7,  2.6)
 
 TTEST MU wface = MU wnoface (VS NE): T= -1.19  P=0.25  DF=  26

Hypothesis Test Approach

Notice that in this solution I used the Minitab degrees of freedom (26). Without this information, the degrees of freedom would have been 20, the smaller sample size minus one.

  1. The hypotheses are $H_0: \mu_{face} = \mu_{noface}$ versus $H_a: \mu_{face} \neq \mu_{noface}$, because we are interested in an increase in tip percentage for either method.
  2. The test statistic is T=-1.19.
  3. The p-value is $2P(T > 1.19)$, which gives a probability of 2*(.1, .15)=(.2, .3).
  4. Conclusion.
    (a) The probability of observing data like ours is between .2 and .3 if the null hypothesis is true.
    (b) This is a big chance.
    (c) The data are likely to occur if the null hypothesis is true. The data are consistent with the null hypothesis. The data provide no evidence to doubt the null hypothesis, and no evidence for the alternative hypothesis.
    (d) No evidence for the alternative hypothesis means there is no evidence that the average tip percentage is different for the checks with the smiley face on them.

Confidence Interval Approach

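The confidence interval for this dataset would be:

\[ (\bar{x}_{face} - \bar{x}_{noface}) \pm t^* \sqrt{\frac{s_{face}^2}{n_{face}} + \frac{s_{noface}^2}{n_{noface}}} = (17.83 - 21.4) \pm t^* \sqrt{\frac{5.52^2}{23} + \frac{12.7^2}{21}} \]

With 26 degrees of freedom, the $t^*$ value is 2.056 for a 95 percent confidence interval. The resulting interval for $\mu_{face} - \mu_{noface}$ is (-9.7, 2.6). The interpretation is: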

(a) The plausible range of values for the difference in the two group averages is between -9.7 and 2.6.
(b) This range includes the crucial value of zero.
(c) Because zero is inside the interval, zero is a plausible value for the difference between the two means. If zero is a plausible value, this means that no difference between the group averages is a plausible statement.
(d) If no difference between the group means is a plausible statement, there is no evidence that the smiley face changes the tip percentage.
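
To see how much the choice of degrees of freedom matters here, the interval can be recomputed from the summary statistics with both rules. The two critical values are assumptions read from a t table (2.056 for df = 26, 2.086 for df = 20), and the variable names are made up for this sketch.

```python
import math

# Summary statistics from the Minitab output for the tip data.
n1, mean1, sd1 = 23, 17.83, 5.52  # smiley face
n2, mean2, sd2 = 21, 21.4, 12.7   # no smiley face

diff = mean1 - mean2
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# 95 percent critical values from a t table (assumed, entered by hand).
t_star_software = 2.056  # df = 26, the Minitab degrees of freedom
t_star_by_hand = 2.086   # df = 20, the smaller sample size minus one

ci_software = (diff - t_star_software * se, diff + t_star_software * se)
ci_by_hand = (diff - t_star_by_hand * se, diff + t_star_by_hand * se)

print(f"df = 26: ({ci_software[0]:.2f}, {ci_software[1]:.2f})")
print(f"df = 20: ({ci_by_hand[0]:.2f}, {ci_by_hand[1]:.2f})")
```

The by-hand interval is slightly wider, and both contain zero, so the conclusion does not depend on which degrees of freedom is used.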

Fred McGriff Example

This example concerns the possible benefit of adding the player Fred McGriff to the Atlanta Braves in the 1993 season.

Hypothesis Test Approach

  1. The null hypothesis is no change in the average runs scored, $H_0: \mu_{before} = \mu_{after}$; the alternative is that there has been an increase in average runs per game, $H_a: \mu_{before} < \mu_{after}$.
  2. The test statistic is a T,

    \[ T = \frac{\bar{x}_{before} - \bar{x}_{after}}{\sqrt{\dfrac{s_{before}^2}{n_{before}} + \dfrac{s_{after}^2}{n_{after}}}} \]

  3. The p-value is $P(T < t)$, where $t$ is the observed value of the test statistic.
  4. The conclusion.
    (a) The probability of observing data like ours is between .001 and .002 if the null hypothesis is true.
    (b) This is a small chance.
    (c) The data are unlikely to occur if the null hypothesis is true. The data are inconsistent with the null hypothesis. The data provide evidence against the null and support for the alternative hypothesis that $\mu_{before} < \mu_{after}$.
    (d) This means that there is evidence that the average runs per game increased after FM became an Atlanta Brave.

Confidence Interval Approach

I'll again use the minitab degrees of freedom of 124.

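\[ (\bar{x}_{before} - \bar{x}_{after}) \pm t^* \sqrt{\frac{s_{before}^2}{n_{before}} + \frac{s_{after}^2}{n_{after}}} \]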

This yields an interval of approximately (-2.9, -.68). Now for the conclusion.

(a) The plausible range of values for $\mu_{before} - \mu_{after}$ is (-2.9, -.68).
(b) The crucial value of zero is not inside the interval. All of the plausible values for the difference in the two means are below zero. This means that no difference is not a plausible statement for this problem; there is evidence of a difference between the two group means.
(c) Because the plausible values are all negative, the mean for before must be smaller than the mean for after. This means there is evidence that the addition of Fred McGriff increased the average runs per game for the Atlanta Braves.

Sex Partner Problem

This problem concerns the number of lifetime opposite sex partners as reported by males and females in the General Social Survey. There were 1682 males surveyed, with an average number of contacts of 11.719 and a standard deviation of 24.479. The minimum number for males was 0 and the maximum was 253 contacts. There were 1850 females surveyed, with an average of 3.322 contacts and a standard deviation of 6.054 contacts. The minimum was zero and the maximum was 100 contacts for females.

Test Approach

  1. Null hypothesis $H_0: \mu_M = \mu_F$ versus $H_a: \mu_M \neq \mu_F$, where $\mu_M$ and $\mu_F$ are the population mean numbers of reported contacts for males and females.
  2. The test statistic is T=13.69.
  3. The p-value is $2P(T > 13.69)$. Since $P(T > 13.69) < .0005$ when we use 1000 degrees of freedom in the t-distribution, the p-value is less than .001.
  4. Conclusion.
    (a) The probability of observing data like ours is less than .001 if the null hypothesis is true.
    (b) This is a small chance.
    (c) The data are very unlikely to occur if the null is true. The data are very inconsistent with the null hypothesis. This provides evidence against the null and support for the alternative hypothesis.
    (d) Evidence for the alternative hypothesis means we have evidence that the population group averages are different for the two groups. The fact that the sample average for males is much larger than that for females implies that there is evidence that males report larger average lifetime sexual contacts than females.
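
The test statistic can be checked from the summary statistics. With samples this large the t distribution is nearly indistinguishable from the standard normal, which is why stopping at the 1000 degrees of freedom row of the table does no harm. The sketch below swaps in the normal approximation for the p-value (it is not the t table lookup used above), and the variable names are made up here.

```python
import math

# Summary statistics reported in the text.
n_m, mean_m, sd_m = 1682, 11.719, 24.479  # males
n_f, mean_f, sd_f = 1850, 3.322, 6.054    # females

se = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)
t_stat = (mean_m - mean_f) / se

# Normal approximation to the two-sided p-value:
# 2 * P(Z > |t|) = erfc(|t| / sqrt(2)).
p_two_sided = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"T = {t_stat:.2f}")
print("two-sided p-value below .001:", p_two_sided < 0.001)
```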

Confidence Interval Approach

The 95 percent confidence interval for $\mu_M - \mu_F$, the difference between the male and female population means, is (7.194, 9.600) when using 1000 degrees of freedom for the t-distribution.

(a) This interval gives a plausible range of values for the difference between the two population means. This plausible range is (7.194, 9.600).
(b) This entire range is above zero, so zero is not a plausible value for this difference. If zero is not a plausible value for this difference, this means there is evidence for a difference in the two group means.
(c) The interval containing all positive values means that there is evidence that $\mu_M$ is larger than $\mu_F$ for these data. Thus there is evidence that males report more contacts on average than do females. The interval tells us that there are approximately 7.2 to 9.6 more lifetime contacts reported for males than for females.

Oops, what is going on here? How can this be possible when opposite sex contacts for males and females must match? It turns out that the most active males and females are causing most of the problems.

Jon E. Anderson
Mon Jul 23 17:11:32 CDT 2001