Survey of Math Chapter 6: Exploring Data

Example of Histogram

Table 6.5 of the text (For All Practical Purposes, 6th ed., COMAP), reproduced below, gives the number of medical doctors per 100,000 people in each state. Construct a histogram of the distribution, and describe the distribution.

State  Doctors    State  Doctors    State  Doctors
AL     198        LA     246        OH     235
AK     167        ME     223        OK     169
AZ     202        MD     374        OR     225
AR     190        MA     412        PA     291
CA     247        MI     224        RI     338
CO     238        MN     249        SC     207
CT     354        MS     163        SD     184
DE     234        MO     230        TN     246
FL     238        MT     190        TX     203
GA     211        NE     218        UT     200
HI     265        NV     173        VT     305
ID     154        NH     237        VA     241
IL     260        NJ     295        WA     235
IN     195        NM     212        WV     215
IA     173        NY     387        WI     227
KS     203        NC     232        WY     171
KY     209        ND     222        DC     737

Solution

The individuals here are the states. The variable is the number of medical doctors per 100,000 people, which varies from state to state.

Here are four histograms I have constructed to graphically represent the data. The width of the vertical bars has been changed in each case.

[Figure: four histograms of the data, each drawn with a different bar width]

We see that choosing the width of the bars is important. If the bars are too wide, the histogram lumps too many values together and hides the overall shape of the distribution. If the bars are too narrow, each bar holds only a few values, so the heights jump up and down drastically and the overall shape is again hard to see.
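To see the effect of bar width without drawing anything, here is a minimal Python sketch that counts the values in each class and prints a sideways text histogram. The function names `bin_counts` and `print_histogram` are my own, the widths 100 and 10 are assumed for illustration (the widths used in all four figures are not stated), and each class is assumed to include its left endpoint but not its right.

```python
from collections import Counter

# Doctors per 100,000 people, copied from Table 6.5 (50 states plus DC).
doctors = [
    198, 167, 202, 190, 247, 238, 354, 234, 238, 211, 265, 154, 260, 195, 173, 203, 209,
    246, 223, 374, 412, 224, 249, 163, 230, 190, 218, 173, 237, 295, 212, 387, 232, 222,
    235, 169, 225, 291, 338, 207, 184, 246, 203, 200, 305, 241, 235, 215, 227, 171, 737,
]


def bin_counts(values, width):
    """Count how many values fall in each class [left, left + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


def print_histogram(values, width):
    """Print one row of stars per non-empty class (empty classes are skipped)."""
    for left, count in bin_counts(values, width).items():
        print(f"{left:>3}-{left + width:<3} {'*' * count}")


# Assumed widths for illustration; the text names only 50 and 25.
for width in (100, 50, 25, 10):
    print(f"\nBar width {width}")
    print_histogram(doctors, width)
```

Because empty classes are skipped, the long gap between the states and the District of Columbia does not show up as blank rows in this printout.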

The best histograms strike a balance between these two extremes. The two histograms with bar widths 50 and 25 both give good graphical representations of the data. Let's choose the following histogram to work with:

[Figure: the chosen histogram of the data]

There appears to be one outlier in the distribution, which corresponds to the District of Columbia (737). Since the District of Columbia is a city and not a state, it is not surprising that its value for the variable is so different from those of the states. We can set this outlier aside as we continue with our description of the distribution.

The distribution appears to have one peak, in the 225-250 band. If you calculate the mean of all 51 values (the District of Columbia included) you will find it to be 244, and the median is 225.
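The mean and median can be checked with a short Python sketch using the standard `statistics` module. The variable names are my own. As an addition that is not part of the original solution, the sketch also recomputes both values with the District of Columbia left out, to show how much each one depends on the outlier.

```python
from statistics import mean, median

# Doctors per 100,000 people, copied from Table 6.5.
doctors = {
    "AL": 198, "AK": 167, "AZ": 202, "AR": 190, "CA": 247, "CO": 238,
    "CT": 354, "DE": 234, "FL": 238, "GA": 211, "HI": 265, "ID": 154,
    "IL": 260, "IN": 195, "IA": 173, "KS": 203, "KY": 209, "LA": 246,
    "ME": 223, "MD": 374, "MA": 412, "MI": 224, "MN": 249, "MS": 163,
    "MO": 230, "MT": 190, "NE": 218, "NV": 173, "NH": 237, "NJ": 295,
    "NM": 212, "NY": 387, "NC": 232, "ND": 222, "OH": 235, "OK": 169,
    "OR": 225, "PA": 291, "RI": 338, "SC": 207, "SD": 184, "TN": 246,
    "TX": 203, "UT": 200, "VT": 305, "VA": 241, "WA": 235, "WV": 215,
    "WI": 227, "WY": 171, "DC": 737,
}

# All 51 values, as in the solution above.
all_values = list(doctors.values())

# The 50 states only, with the outlier (DC) left out.
states_only = [v for state, v in doctors.items() if state != "DC"]

print(f"With DC:    mean = {mean(all_values):.1f}, median = {median(all_values)}")
print(f"Without DC: mean = {mean(states_only):.1f}, median = {median(states_only)}")
```

Dropping the single outlier lowers the mean by about 10 but changes the median by only half a unit, so the median is much less sensitive to the District of Columbia than the mean is.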

The distribution is right skewed (not symmetric), since from the center the distribution extends farther to the right (to 425) than to the left (to 150).
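One rough numerical check of the skew is to compare how far the data reach on each side of the center. The sketch below is only an illustration under my own assumptions: it takes the median as the "center", uses the raw minimum and maximum rather than the histogram's bar edges (150 and 425), and leaves out the District of Columbia as the outlier.

```python
from statistics import median

# Table 6.5 values for the 50 states; DC (737) is left out as the outlier.
states = [
    198, 167, 202, 190, 247, 238, 354, 234, 238, 211, 265, 154, 260, 195, 173, 203, 209,
    246, 223, 374, 412, 224, 249, 163, 230, 190, 218, 173, 237, 295, 212, 387, 232, 222,
    235, 169, 225, 291, 338, 207, 184, 246, 203, 200, 305, 241, 235, 215, 227, 171,
]

center = median(states)             # assumed definition of "the center"
left_reach = center - min(states)   # distance from the center down to the smallest value
right_reach = max(states) - center  # distance from the center up to the largest value

print(f"center = {center}")
print(f"reach to the left  = {left_reach}")
print(f"reach to the right = {right_reach}")

if right_reach > left_reach:
    print("The right tail is longer: consistent with a right-skewed distribution.")
else:
    print("The right tail is not longer than the left tail.")
```

The right tail comes out more than twice as long as the left tail, which agrees with what the histogram shows.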
