One night Jon was drinking a 12-pack of beer. After 3 beers he started worrying that he was getting cheated by the beer company and was not getting the amount of beer that he paid for. The remaining beer cans were taken to the UMM chemistry lab and the actual contents were accurately measured. The results in ounces are as follows:
15.8, 16.2, 16.3, 15.9, 15.5, 15.9, 16.0, 15.6, 15.8
Is Jon being cheated? Test at the 5% level. The data analyzed in Statlets give the following result:
Summary Statistics for Beer
Sample size = 9
Mean = 15.8889
Median = 15.9
Standard deviation = 0.257121
Minimum = 15.5
Maximum = 16.3
Range = 0.8
Standardized skewness = 0.192705
Standardized kurtosis = -0.245636

The hypotheses are: $H_0: \mu = 16$ vs $H_a: \mu < 16$.
The p-value is: $P(T < t)$ with $t = \frac{15.8889 - 16}{0.257121/\sqrt{9}} = -1.30$, which
is between .10 and .15.
The research question is: Is 16 oz in the range of plausible values for this population?
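The p-value uses df = n - 1 = 9 - 1 = 8 degrees of freedom.
Conclusion
Since the p-value is larger than .05, we fail to reject the null hypothesis: 16 oz is a plausible value for the mean, so there is no evidence that Jon is being cheated.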
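The beer calculation can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the critical values 1.108 and 1.397 are the usual t-table entries for 8 degrees of freedom at upper-tail areas .15 and .10, typed in by hand rather than computed.

```python
import math
import statistics

# Contents (in ounces) of the nine remaining cans, from the example above.
beer = [15.8, 16.2, 16.3, 15.9, 15.5, 15.9, 16.0, 15.6, 15.8]
mu0 = 16.0  # contents claimed under the null hypothesis

n = len(beer)
xbar = statistics.mean(beer)
s = statistics.stdev(beer)             # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / math.sqrt(n))  # one-sample t statistic
df = n - 1

# Assumed t-table entries for df = 8: upper-tail areas .15 and .10.
t_15, t_10 = 1.108, 1.397
p_between = t_15 < abs(t) < t_10       # True means .10 < one-sided p-value < .15

print(f"mean = {xbar:.4f}, sd = {s:.6f}, t = {t:.3f}, df = {df}")
print("one-sided p-value between .10 and .15:", p_between)
```

Bracketing the statistic between two table values is the same by-hand lookup that produces a statement such as "between .10 and .15".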
One day Jon went to TMC and bought 6 cartons of milk to see if he was getting cheated by the milk company and not getting as much milk as he paid for. The milk was then taken to a high-tech measurement lab and the actual contents were measured. The results were as follows:
233.5, 233, 233.5, 234, 233.5, 233.5
Is Jon being cheated? Test at the 5% level. The Statlets results below were obtained by going into the stats tab and the t-test tab, clicking on options, and picking the "less than" option.
Summary Statistics for milk
Sample size = 6
Mean = 233.5
Median = 233.5
Standard deviation = 0.316228
Minimum = 233.0
Maximum = 234.0
Range = 1.0
Standardized skewness = 0.0
Standardized kurtosis = 1.25

Estimation of Population Mean for milk
Sample size = 6
Mean = 233.5
95.0% upper confidence bound for mean: 233.5 + 0.260142 [233.76]

t-test
------
Null hypothesis: mean = 236.0
Alt. hypothesis: less than
Computed t-statistic = -19.3649
P-value = 3.38744E-6
Reject the null hypothesis for alpha = 0.05

The hypotheses are: $H_0: \mu = 236$ vs $H_a: \mu < 236$.
The p-value is: $P(T < -19.36)$, with df = n - 1 = 5, which
is less than .0005.
The research question is: Is 236ml in the range of plausible values for this population?
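Conclusion
Since the p-value is far below .05, we reject the null hypothesis: 236 ml is not a plausible value for the mean, so there is evidence that Jon is getting less milk than he paid for.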
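The Statlets numbers for the milk example can be reproduced from the raw data. The sketch below assumes the table value 2.015 (upper-tail area .05 with 5 degrees of freedom) for the one-sided bound; the variable names are illustrative.

```python
import math
import statistics

# Measured contents of the six cartons, from the example above.
milk = [233.5, 233, 233.5, 234, 233.5, 233.5]
mu0 = 236.0  # null-hypothesis mean used in the Statlets t-test

n = len(milk)
xbar = statistics.mean(milk)
se = statistics.stdev(milk) / math.sqrt(n)  # standard error of the mean
t = (xbar - mu0) / se                       # one-sample t statistic

# Assumed t-table value: upper-tail area .05 with df = n - 1 = 5.
t_star = 2.015
upper_bound = xbar + t_star * se            # one-sided 95% upper confidence bound

print(f"t = {t:.4f}")
print(f"95% upper confidence bound = {upper_bound:.2f}")
```

The null value 236 lies above the upper confidence bound, which agrees with the test rejecting the null hypothesis.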
A shoe company was trying to determine if two types of shoe materials had different durability. These materials are denoted materials A and B. Material B was suspected to wear out sooner, but it was a cheaper substance. Ten kids were tested and the difference was calculated as B-A for each child. The mean difference was .41 and the standard deviation of the differences was .39.
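The hypotheses are:
$H_0: \mu_d = 0$ vs $H_a: \mu_d > 0$,
where $\mu_d$ is the population mean of the B-A differences.
The test statistic is:
$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{.41}{.39/\sqrt{10}} = 3.32$
The p-value is:
$P(T > 3.32)$, with df = n - 1 = 9, which is equal to .0043.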
Research question: Is material B less durable than material A?
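Conclusion
Since the p-value is below .05, we reject the null hypothesis: there is evidence that material B is less durable than material A.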
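The paired calculation for the shoe example can be sketched from the summary values alone. The Python standard library has no t distribution, so the tail area here is approximated by integrating the t density with Simpson's rule. The helper names `t_pdf` and `upper_tail` are made up for this illustration, and the result should land close to, but not exactly on, the p-value quoted above.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary values for the B-A differences, from the example above.
n, dbar, s_d = 10, 0.41, 0.39

t = dbar / (s_d / math.sqrt(n))  # paired t statistic
p_value = upper_tail(t, n - 1)   # one-sided: material B suspected to wear more

print(f"t = {t:.2f}, df = {n - 1}, one-sided p-value = {p_value:.4f}")
```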
In this section we are usually trying to determine if two populations have
different means. That is, we are trying to compare the means $\mu_1$ and
$\mu_2$. Both significance testing and confidence intervals are used to
make these comparisons.
The usual null hypothesis is no difference between the two population
means. This is usually written as
$H_0: \mu_1 = \mu_2$. The alternative is usually that there is a difference
between the two group means,
$H_a: \mu_1 \neq \mu_2$. One-sided alternative hypotheses are
also possible.
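The test statistic is a T, with formula given by
$$T = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
where $\bar{x}_1$, $s_1$, $n_1$
are the sample mean, standard deviation,
and sample size for the first group, and
$\bar{x}_2$, $s_2$, $n_2$ are
the sample values for the second group.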
The next step is to compute the p-value. However, this can get a bit
complicated because this test statistic theoretically does not follow a T
distribution. Luckily for us it does follow a T distribution
approximately, but the best degrees of freedom to use is a sticky problem.
If you are doing the problem by hand without the aid of a computer, the
degrees of freedom you should use is the smaller group sample size
minus one, that is, the smaller of $n_1 - 1$ and $n_2 - 1$. If you are using a
computer program that provides an estimate of the appropriate degrees of
freedom on the output, go ahead and use what is provided, as it will be
slightly more accurate than the by-hand method.
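For reference, the degrees of freedom that statistics packages usually report for this kind of statistic come from the Welch-Satterthwaite approximation. These notes do not say which formula Statlets or MINITAB uses, so treat this as an assumption. Here $s_i$ and $n_i$ are the sample standard deviation and sample size of group $i$:

```latex
\[
df \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
     {\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2
      + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}
\]
```

This value always falls between the smaller of $n_1 - 1$ and $n_2 - 1$ and the total $n_1 + n_2 - 2$, so the by-hand rule is the cautious choice.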
The p-value is computed as usual: for a two-sided test, the p-value is
$2P(T > |t|)$, with the appropriate modifications
for one-sided tests. Do the conclusions as usual.
The confidence interval for
$\mu_1 - \mu_2$ is given by the following
formula:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
All that needs to be done is to place the appropriate sample values into the formula. The degrees of freedom rules are the same as above: either use the smaller of the two sample sizes minus one, or use the more complex degrees of freedom computed by your favorite stat package, if provided.
The crucial value to compare with is zero. If zero is not contained in the interval, then zero is not a plausible value for the difference in the two means. In that case "no difference between the groups" is not a reasonable statement, and there is evidence the groups differ. How do they differ? This depends on what the interval values are. If the whole interval is above zero, it implies that the first group mean is higher than the second, and vice versa if the interval contains only negative values. If the interval contains zero, it implies that no difference between the groups is a plausible statement, and thus there is no evidence to suggest they differ.
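The decision rule in the paragraph above can be written as a small function. This is an illustrative sketch: the function name and the wording of the returned messages are made up here, and the three intervals are the ones reported in the examples later in these notes.

```python
def interpret_interval(lower, upper):
    """Describe what an interval for (mean 1 - mean 2) says about the two groups."""
    if lower > 0:
        return "first group mean appears higher"
    if upper < 0:
        return "second group mean appears higher"
    return "no evidence of a difference"

# Intervals quoted in the examples later in these notes.
intervals = {
    "hot dog example": (18.5, 54.0),
    "tip example": (-9.7, 2.6),
    "McGriff example": (-2.9, -0.68),
}

for label, (lower, upper) in intervals.items():
    print(f"{label}: ({lower}, {upper}) -> {interpret_interval(lower, upper)}")
```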
The MINITAB output below compares the contents (fat or sodium? I can't recall at the moment) of two types of hot dogs: mystery meat and poultry. For our purposes here, the test statistic and p-value, along with the confidence interval on the output, show evidence that mystery meat hot dogs have higher sodium/fat content than do poultry dogs. Do you see why?
MTB > set c1
DATA> 173 191 182 190 172 147 146 139 175 136 179 153 107 195 135 140 138
DATA> end
MTB > set c2
DATA> 129 132 102 106 94 102 87 99 170 113 135 142 86 143 152 146 144
DATA> end
MTB > name c1 'meat'
MTB > name c2 'poultry'
MTB > twosample data in 'meat' 'poultry'
TWOSAMPLE T FOR meat VS poultry
N MEAN STDEV SE MEAN
meat 17 158.7 25.2 6.1
poultry 17 122.5 25.5 6.2
95 PCT CI FOR MU meat - MU poultry: ( 18.5, 54.0)
TTEST MU meat = MU poultry (VS NE): T= 4.17 P=0.0002 DF= 31
MTB > stack c1 c2 in c3;
SUBC> subscripts in c4.
MTB > boxplot c3;
SUBC> by c4.
C4
---------------------
1 ----------------I + I--------
---------------------
----------------------
2 --------I + I-------------
----------------------
--------+---------+---------+---------+---------+--------C3
100 120 140 160 180
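The T statistic and the interval on the hot dog output can be recomputed from the raw data. In this sketch the degrees of freedom (31) are taken from the MINITAB output, and the matching critical value 2.040 is an assumed t-table entry rather than something computed here.

```python
import math
import statistics

# Data entered into c1 and c2 in the MINITAB session above.
meat = [173, 191, 182, 190, 172, 147, 146, 139, 175,
        136, 179, 153, 107, 195, 135, 140, 138]
poultry = [129, 132, 102, 106, 94, 102, 87, 99, 170,
           113, 135, 142, 86, 143, 152, 146, 144]

diff = statistics.mean(meat) - statistics.mean(poultry)

# Unpooled standard error of the difference in sample means.
se = math.sqrt(statistics.variance(meat) / len(meat)
               + statistics.variance(poultry) / len(poultry))
t = diff / se

t_star = 2.040  # assumed table value: upper-tail area .025 with df = 31
lower, upper = diff - t_star * se, diff + t_star * se

print(f"T = {t:.2f}")
print(f"95% CI for mu meat - mu poultry: ({lower:.1f}, {upper:.1f})")
```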
In this example there are two samples of tip percentages for a male waiter. In one set a smiley face was put on the check and in the other set, no smiley face was used. The data summary is as follows:
MTB > twosample 'wface' 'wnoface'
TWOSAMPLE T FOR wface VS wnoface
N MEAN STDEV SE MEAN
wface 23 17.83 5.52 1.2
wnoface 21 21.4 12.7 2.8
95 PCT CI FOR MU wface - MU wnoface:
( -9.7, 2.6)
TTEST MU wface = MU wnoface (VS NE): T= -1.19
P=0.25 DF= 26
Notice that in this solution I used the MINITAB degrees of freedom (26). Without this information, the degrees of freedom would have been 20, the smaller sample size minus one.
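The two degrees-of-freedom choices for the tip data can be compared directly. The sketch below assumes that the package value comes from the Welch-Satterthwaite approximation and is truncated to a whole number; the notes do not state this. The function names are illustrative.

```python
def conservative_df(n1, n2):
    """By-hand rule: smaller sample size minus one."""
    return min(n1, n2) - 1

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (assumed to be what the package reports)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Summary values for wface and wnoface from the MINITAB output above.
by_hand = conservative_df(23, 21)
approx = welch_df(5.52, 23, 12.7, 21)

print("by-hand df:", by_hand)
print(f"approximate df: {approx:.2f}, truncated: {int(approx)}")
```

The approximation is pulled toward the by-hand value here because the group with the much larger standard deviation contributes most of the variability.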
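The confidence interval for this dataset would be:
$(17.83 - 21.4) \pm t^* \sqrt{\frac{5.52^2}{23} + \frac{12.7^2}{21}}$, with 26 degrees of freedom. This gives a
$t^*$ value of 2.056 for a 95 percent confidence interval.
This interval for
$\mu_{wface} - \mu_{wnoface}$ is ( -9.7, 2.6). The
interpretation is: because the interval contains zero, no difference in mean tip percentage is a plausible statement, so there is no evidence that the smiley face changes this waiter's tips.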
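The interval for the tip data can be recomputed from the summary values. The function below is a sketch with illustrative names. The value 2.056 is the critical value quoted above for 26 degrees of freedom, while 2.086 is an assumed t-table entry for the by-hand choice of 20 degrees of freedom.

```python
import math

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Interval for (mean 1 - mean 2) using the unpooled standard error."""
    diff = xbar1 - xbar2
    margin = t_star * math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return diff - margin, diff + margin

# wface and wnoface summary values from the MINITAB output.
minitab_ci = two_sample_ci(17.83, 5.52, 23, 21.4, 12.7, 21, t_star=2.056)  # df = 26
by_hand_ci = two_sample_ci(17.83, 5.52, 23, 21.4, 12.7, 21, t_star=2.086)  # df = 20

print("df = 26: ({:.1f}, {:.1f})".format(*minitab_ci))
print("df = 20: ({:.2f}, {:.2f})".format(*by_hand_ci))
```

The by-hand degrees of freedom give a slightly wider interval, which is the price of the more cautious rule.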
This example concerns the possible benefit of the addition of the player Fred McGriff to the Atlanta Braves in the 1993 season.
I'll again use the minitab degrees of freedom of 124.
This yields an interval of approximately (-2.9, -.68). Now for the conclusion.
This problem concerns the number of lifetime opposite sex partners as reported by males and females in the General Social Survey. There were 1682 males surveyed, with an average of 11.719 contacts and a standard deviation of 24.479 contacts. The minimum number for males was 0 and the maximum was 253 contacts. There were 1850 females surveyed, with an average of 3.322 contacts and a standard deviation of 6.054 contacts. The minimum was zero and the maximum was 100 contacts for females.
The 95 percent confidence interval for
$\mu_{male} - \mu_{female}$ is
(7.194, 9.600)
when using 1000 degrees of freedom for the t-distribution.
Oops, what is going on here? How can this be possible when opposite sex contacts for males and females must match? It turns out that the most active males and females are causing most of the problems.
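The interval for the survey data can be recomputed from the summary values given above. In this sketch the critical value 1.962 is an assumed t-table entry for 1000 degrees of freedom, and the variable names are illustrative.

```python
import math

# Summary values reported for the General Social Survey example.
n_m, mean_m, sd_m = 1682, 11.719, 24.479  # males
n_f, mean_f, sd_f = 1850, 3.322, 6.054    # females

t_star = 1.962  # assumed table value: upper-tail area .025 with df = 1000

diff = mean_m - mean_f
se = math.sqrt(sd_m ** 2 / n_m + sd_f ** 2 / n_f)  # unpooled standard error
lower, upper = diff - t_star * se, diff + t_star * se

print(f"difference = {diff:.3f}, standard error = {se:.4f}")
print(f"95% CI for mu male - mu female: ({lower:.3f}, {upper:.3f})")
```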
The translation was initiated by Jon E. Anderson on Mon Jul 23 17:11:32 CDT 2001