2000 Flex-Time and Flex-Mode Prob-Stat I Topic #7C Page


This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

45-733 PROBABILITY AND STATISTICS I Topic #7C

22 February 2000, 24 February 2000

Confidence Intervals

Sampling From a Normal Distribution With unknown $\mu$ and known $\sigma^2$.
The basic form of the confidence interval is:

$$P[|\bar{X}_n - \mu| < c] = 1 - \alpha,$$

or equivalently

$$P[\bar{X}_n - c < \mu < \bar{X}_n + c] = 1 - \alpha,$$

where typically $\alpha = .01$, $.05$, or $.10$.
Note that the random quantity here is the interval itself: a line segment of fixed length $2c$ whose center, $\bar{X}_n$, varies from sample to sample. The proper interpretation is that, in the long run, $100(1 - \alpha)$ percent of the intervals constructed around $\bar{X}_n$ will contain $\mu$.

Also note that, once we insert an observed value for $\bar{X}_n$, we no longer have a random variable; we have a fixed interval. Hence, we say that we are "$100(1 - \alpha)$ percent confident that the true mean is in the interval."

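To find the value of $c$:

$$\begin{aligned}
P[|\bar{X}_n - \mu| < c] &= P[-c < \bar{X}_n - \mu < c] \\
&= P\left[-\frac{c\sqrt{n}}{\sigma} < \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} < \frac{c\sqrt{n}}{\sigma}\right] \\
&= P\left[-\frac{c\sqrt{n}}{\sigma} < Z < \frac{c\sqrt{n}}{\sigma}\right] = \Phi\left(\frac{c\sqrt{n}}{\sigma}\right) - \Phi\left(-\frac{c\sqrt{n}}{\sigma}\right) \\
&= \Phi(z_{\alpha/2}) - \Phi(-z_{\alpha/2}) = 1 - \alpha
\end{aligned}$$

Hence:

$$z_{\alpha/2} = \frac{c}{\sigma/\sqrt{n}} \quad\text{and}\quad c = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

Which gives us our confidence interval:

$$P\left[\bar{X}_n - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X}_n + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$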

Example: Suppose we take a random sample of 25 from $N(\mu, 1)$. Construct a 95% confidence interval for $\mu$.
We are given that $\alpha = .05$, so that $\alpha/2 = .025$ and $z_{.025} = 1.96$. Hence
$z_{\alpha/2}\,\sigma/\sqrt{n} = (1.96 \cdot 1)/5 = .392$, so the interval is:

$$\bar{X}_n \pm .392$$
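
The long-run interpretation of this interval can be checked by simulation. The sketch below is illustrative only: it reuses the example's setting ($n = 25$, variance 1, 95% confidence) and assumes a true mean of 10, a value chosen purely for the demonstration. The function name `coverage`, the seed, and the number of trials are likewise assumptions, not part of the notes.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples whose interval xbar +/- c contains mu."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = .05
    c = z * sigma / n ** 0.5                  # half-width: z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - c < mu < xbar + c:
            hits += 1
    return hits / trials

# mu = 10 is an assumed "true" mean used only for this demonstration.
rate = coverage(mu=10.0, sigma=1.0, n=25, alpha=0.05, trials=2000)
print(f"observed coverage: {rate:.3f}")
```

The printed fraction should be close to .95; it is the interval that moves from sample to sample, not $\mu$.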

Sampling From a Normal Distribution With unknown $\mu$ and unknown $\sigma^2$ with large sample size ($n > 30$).
In this case we simply substitute the sample variance $s^2$ for the population variance $\sigma^2$, by appeal to the Central Limit Theorem, and obtain the confidence interval:

$$P\left[\bar{X}_n - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X}_n + z_{\alpha/2}\frac{s}{\sqrt{n}}\right] = 1 - \alpha$$

Suppose we have a large order of bolts delivered to our factory. We are concerned about the precision with which these bolts have been machined. In particular, we want to construct 95% confidence limits for the true mean length of the bolts. Assume that the length of the bolts is normally distributed.
We are given: $n = 500$, $\bar{X}_n = 6.1$ cm, $s = .1$ cm, $1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$. Hence, $z_{.025} = 1.96$.

So the confidence limits are $6.1 \pm (1.96 \cdot .1)/\sqrt{500} = 6.1 \pm .0088$ cm, or

(6.091, 6.109)

We are 95 percent confident that the true mean length of the bolts is in the interval.
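
As a quick check of the arithmetic, the large-sample interval can be computed from the summary statistics. This is a minimal sketch; the function name `mean_ci_large_sample` is an assumption made for illustration, and the critical value comes from Python's standard-library `NormalDist` rather than a printed table.

```python
from statistics import NormalDist

def mean_ci_large_sample(xbar, s, n, conf):
    """Large-sample interval xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    half_width = z * s / n ** 0.5
    return xbar - half_width, xbar + half_width

# Bolt example: n = 500, sample mean 6.1 cm, sample standard deviation .1 cm.
low, high = mean_ci_large_sample(xbar=6.1, s=0.1, n=500, conf=0.95)
print(round(low, 3), round(high, 3))
```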

Large Sample ($n > 50$) Confidence Interval for proportions.
With large sample sizes, we can appeal to the Central Limit Theorem and assume:

$$\hat{p} \sim N\left[p,\ \frac{\hat{p}(1 - \hat{p})}{n}\right]$$

so that the confidence interval is:

$$P\left\{\hat{p} - z_{\alpha/2}\left[\frac{\hat{p}(1 - \hat{p})}{n}\right]^{1/2} < p < \hat{p} + z_{\alpha/2}\left[\frac{\hat{p}(1 - \hat{p})}{n}\right]^{1/2}\right\} = 1 - \alpha$$

Problem 8.47 p.346
We are given: $n = 1506$, $\hat{p} = .73$, $1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$. Hence, $z_{.025} = 1.96$.

So the confidence limits are $.73 \pm 1.96[(.73 \cdot .27)/1506]^{1/2} = .73 \pm .0224$, or

(.7076, .7524)

We are 95 percent confident that the true proportion is in the interval.
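
The same calculation can be sketched in a few lines. The function name `proportion_ci` is an illustrative assumption; the formula is the large-sample interval given above.

```python
from statistics import NormalDist

def proportion_ci(p_hat, n, conf):
    """Large-sample interval p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width

# Problem 8.47: n = 1506, sample proportion .73, 95% confidence.
low, high = proportion_ci(p_hat=0.73, n=1506, conf=0.95)
print(round(low, 4), round(high, 4))
```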

Confidence Interval for $\sigma^2$ when the random sample is drawn from a Normal Distribution.
Ideally, the confidence interval would be built around the probability distribution of the sample variance $s^2$, our unbiased estimator for $\sigma^2$. Unfortunately, this distribution is not so easily used. However, the distribution of $(n-1)s^2/\sigma^2$ is known to be Chi-Square with $n-1$ degrees of freedom.

Since the Chi-Square is an asymmetric distribution, we define $c_1$ to be the point below which $\alpha/2$ of the probability lies, and $c_2$ to be the point above which $\alpha/2$ of the probability lies. Hence:

$$P\left[c_1 < \frac{(n-1)s^2}{\sigma^2} < c_2\right] = P\left[\frac{(n-1)s^2}{c_2} < \sigma^2 < \frac{(n-1)s^2}{c_1}\right] = 1 - \alpha$$

Problem 8.79 p.362
We are given: $n = 6$, df $= n - 1 = 5$, $1 - \alpha = .90$, $\alpha = .1$, $\alpha/2 = .05$. Hence, $c_1 = 1.145476$ and $c_2 = 11.0705$.
$s^2 = .502667$

$(n-1)s^2/c_2 = (5 \cdot .502667)/11.0705 = .227$, and
$(n-1)s^2/c_1 = (5 \cdot .502667)/1.145476 = 2.194$

Hence: (.227, 2.194)

We are 90 percent confident that $\sigma^2$ is in the interval.
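
A minimal sketch of this calculation follows. Python's standard library has no Chi-Square quantile function, so the two critical values are passed in from the table, exactly as in the problem; the function name `variance_ci` is an assumption made for illustration.

```python
def variance_ci(s2, n, c1, c2):
    """Interval ((n-1)s^2 / c2, (n-1)s^2 / c1) for the population variance.

    c1 and c2 are the lower and upper alpha/2 points of the Chi-Square
    distribution with n - 1 degrees of freedom, read from a table.
    """
    scaled = (n - 1) * s2
    return scaled / c2, scaled / c1   # dividing by the larger point gives the lower limit

# Problem 8.79: n = 6, s^2 = .502667, table values for 5 df and alpha/2 = .05.
low, high = variance_ci(s2=0.502667, n=6, c1=1.145476, c2=11.0705)
print(round(low, 3), round(high, 3))
```

Note that the interval is not symmetric about $s^2$, a consequence of the asymmetry of the Chi-Square distribution.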

Sampling From a Normal Distribution With unknown $\mu$ and unknown $\sigma^2$ with small sample size ($n < 30$).
In this case we build our confidence interval from the t distribution. In particular,

$$\frac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim t_{n-1}$$

So that we can write the confidence interval as:

$$P\left[\bar{X}_n - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X}_n + t_{\alpha/2}\frac{s}{\sqrt{n}}\right] = 1 - \alpha$$

Problem 8.68 p.358
We are given: $n = 20$, $s = 57$, $1 - \alpha = .9$, $\alpha = .1$, $\alpha/2 = .05$. Hence, $t_{.05,\,19\,\text{df}} = 1.729$.
1. Confidence Limits: $419 \pm (1.729 \cdot 57)/\sqrt{20} = 419 \pm 22.04$
  or (396.96, 441.04)
2. Yes, the population mean is in the interval. All values in the interval have 90% confidence.
3. Confidence Limits: $455 \pm (1.729 \cdot 69)/\sqrt{20} = 455 \pm 26.68$
  or (428.32, 481.68)
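
The two t intervals in this problem can be reproduced with a short sketch. The critical value is supplied from the t table (1.729 for 19 degrees of freedom), since the standard library has no t quantile function; the function name `t_ci` is an illustrative assumption.

```python
def t_ci(xbar, s, n, t_crit):
    """Small-sample interval xbar +/- t_crit * s / sqrt(n).

    t_crit is the upper alpha/2 point of the t distribution with
    n - 1 degrees of freedom, read from a table.
    """
    half_width = t_crit * s / n ** 0.5
    return xbar - half_width, xbar + half_width

# Problem 8.68, parts 1 and 3: n = 20, t_{.05, 19 df} = 1.729.
first = t_ci(xbar=419, s=57, n=20, t_crit=1.729)
third = t_ci(xbar=455, s=69, n=20, t_crit=1.729)
print([round(v, 2) for v in first])
print([round(v, 2) for v in third])
```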

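Confidence Interval for the Difference Between Two Means when sampling from two separate, independent, Normal Distributions with known variances.
Here we use the same technique as in the testing problem discussed in notes #10 (1). Namely, let the distributions of the sample means be:

$$\bar{X}_n \sim N\left[\mu_x,\ \frac{\sigma_x^2}{n}\right] \quad\text{and}\quad \bar{Y}_m \sim N\left[\mu_y,\ \frac{\sigma_y^2}{m}\right]$$

Then:

$$\bar{X}_n - \bar{Y}_m \sim N\left[\mu_x - \mu_y,\ \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right]$$

And the confidence interval is:

$$P\left\{\bar{X}_n - \bar{Y}_m - z_{\alpha/2}\left[\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right]^{1/2} < \mu_x - \mu_y < \bar{X}_n - \bar{Y}_m + z_{\alpha/2}\left[\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right]^{1/2}\right\} = 1 - \alpha$$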

Confidence Interval for the Difference Between Two Means when sampling from two separate, independent, Normal Distributions with unknown variances but large ($n > 30$ and $m > 30$) sample sizes.
Here the confidence interval has the same form as in the known-variance case, but the sample variances $s_x^2$ and $s_y^2$ are used in place of $\sigma_x^2$ and $\sigma_y^2$. Namely:

$$P\left\{\bar{X}_n - \bar{Y}_m - z_{\alpha/2}\left[\frac{s_x^2}{n} + \frac{s_y^2}{m}\right]^{1/2} < \mu_x - \mu_y < \bar{X}_n - \bar{Y}_m + z_{\alpha/2}\left[\frac{s_x^2}{n} + \frac{s_y^2}{m}\right]^{1/2}\right\} = 1 - \alpha$$

Problem 8.52 p.347

We are given: $n = 252$, $m = 307$, $\bar{X}_n = 11.48$, $\bar{Y}_m = 13.21$, $s_x = 5.69$, $s_y = 5.31$, $1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$. Hence, $z_{.025} = 1.96$.

Our confidence limits are:
$11.48 - 13.21 \pm 1.96[5.69^2/252 + 5.31^2/307]^{1/2} = -1.73 \pm .92$

Which produces the interval, (-2.65, -.81)

We are given: $n = 252$, $m = 307$, $\bar{X}_n = 22.05$, $\bar{Y}_m = 25.96$, $s_x = 5.12$, $s_y = 5.07$, $1 - \alpha = .90$, $\alpha = .10$, $\alpha/2 = .05$. Hence, $z_{.05} = 1.645$.

Our confidence limits are:
$22.05 - 25.96 \pm 1.645[5.12^2/252 + 5.07^2/307]^{1/2} = -3.91 \pm .71$

Which produces the interval, (-4.62, -3.20)

Note that neither interval includes 0. Hence, we are 95 and 90 percent confident, respectively, that there is a difference between men and women on these two scales.
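
Both intervals in this problem can be reproduced with one small function. This is a sketch under the large-sample formula above; the function name `diff_means_ci` is an assumption made for illustration.

```python
from statistics import NormalDist

def diff_means_ci(xbar, ybar, sx, sy, n, m, conf):
    """Large-sample interval (xbar - ybar) +/- z_{alpha/2} * sqrt(sx^2/n + sy^2/m)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * (sx ** 2 / n + sy ** 2 / m) ** 0.5
    diff = xbar - ybar
    return diff - half_width, diff + half_width

# Problem 8.52: first scale at 95% confidence, second scale at 90% confidence.
scale_1 = diff_means_ci(11.48, 13.21, 5.69, 5.31, n=252, m=307, conf=0.95)
scale_2 = diff_means_ci(22.05, 25.96, 5.12, 5.07, n=252, m=307, conf=0.90)
print([round(v, 2) for v in scale_1])
print([round(v, 2) for v in scale_2])

# An interval whose endpoints share a sign excludes 0.
print(scale_1[1] < 0, scale_2[1] < 0)
```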

Confidence Interval for the Difference Between Two Proportions with large ($n > 50$ and $m > 50$) sample sizes.
Here we appeal to the Central Limit Theorem to write:

$$\hat{p}_1 - \hat{p}_2 \sim N\left[p_1 - p_2,\ \frac{\hat{p}_1(1 - \hat{p}_1)}{n} + \frac{\hat{p}_2(1 - \hat{p}_2)}{m}\right]$$

And the confidence limits can be computed from:

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\left[\frac{\hat{p}_1(1 - \hat{p}_1)}{n} + \frac{\hat{p}_2(1 - \hat{p}_2)}{m}\right]^{1/2}$$

Problem 8.49 p.347
We are given: $\hat{p}_1 = .19$, $\hat{p}_2 = .70$, $n = 1250$, $m = 1251$, $1 - \alpha = .9$, $\alpha = .1$, $\alpha/2 = .05$. Hence, $z_{.05} = 1.645$.

The confidence limits are:

$.19 - .70 \pm 1.645[(.19 \cdot .81)/1250 + (.7 \cdot .3)/1251]^{1/2} = -.51 \pm .028$

The interval, (-.538, -.482), is well below 0 so we are 90% confident, based upon this evidence, that there was a change of opinion between the two periods.

Problem 8.50 p.347
We are given: $\hat{p}_1 = .67$, $\hat{p}_2 = .90$, $n = 1250$, $m = 1251$, $1 - \alpha = .98$, $\alpha = .02$, $\alpha/2 = .01$. Hence, $z_{.01} = 2.33$.

The confidence limits are:

$.67 - .90 \pm 2.33[(.67 \cdot .33)/1250 + (.90 \cdot .1)/1251]^{1/2} = -.23 \pm .0368$

The interval, (-.2668, -.1932), is well below 0 so we are 98% confident, based upon this evidence, that there was a change of opinion regarding smoke detectors between the two periods.
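
The two-proportion limits for Problems 8.49 and 8.50 can be checked with the sketch below. The function name `diff_proportions_ci` is an illustrative assumption. Because the critical value is computed rather than rounded from a table (2.326 instead of 2.33), the last digit of the second interval may differ slightly from the hand calculation.

```python
from statistics import NormalDist

def diff_proportions_ci(p1, p2, n, m, conf):
    """Large-sample interval (p1 - p2) +/- z * sqrt(p1(1-p1)/n + p2(1-p2)/m)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # The square root of the summed variances is the standard error of p1 - p2.
    std_err = (p1 * (1 - p1) / n + p2 * (1 - p2) / m) ** 0.5
    diff = p1 - p2
    return diff - z * std_err, diff + z * std_err

prob_849 = diff_proportions_ci(0.19, 0.70, n=1250, m=1251, conf=0.90)
prob_850 = diff_proportions_ci(0.67, 0.90, n=1250, m=1251, conf=0.98)
print([round(v, 3) for v in prob_849])
print([round(v, 4) for v in prob_850])
```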

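Confidence Interval for the Difference Between Two Means when sampling from two separate, independent, Normal Distributions with unknown variances and small ($n < 30$ and $m < 30$) sample sizes.

Here we must assume that $\sigma_x^2 = \sigma_y^2$ and use this assumption to combine the two sample sums of squares to obtain the pooled variance estimate $s^2$. Namely:

$$s^2 = \frac{(n - 1)s_x^2 + (m - 1)s_y^2}{n + m - 2}$$

The Confidence Interval is:

$$P\left\{\bar{X}_n - \bar{Y}_m - t_{\alpha/2}\,s\left(\frac{1}{n} + \frac{1}{m}\right)^{1/2} < \mu_x - \mu_y < \bar{X}_n - \bar{Y}_m + t_{\alpha/2}\,s\left(\frac{1}{n} + \frac{1}{m}\right)^{1/2}\right\} = 1 - \alpha$$

where $t_{\alpha/2}$ is taken from the t distribution with $n + m - 2$ degrees of freedom.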

Problem 8.71 p.359

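We are given: $\bar{X}_n = 11$, $\bar{Y}_m = 12$, $n = 16$, $m = 20$, $s_x = 6$, $s_y = 8$, $1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$. Hence, $t_{.025,\,34\,\text{df}} \approx 1.96$, using the large-sample (normal) value; the exact value for 34 degrees of freedom is about 2.03, which would widen the interval slightly.

Pooling the sample sums of squares:
$s^2 = (15 \cdot 36 + 19 \cdot 64)/34 = 51.647$
The confidence limits are:
$11 - 12 \pm 1.96[51.647(1/16 + 1/20)]^{1/2} = -1 \pm 4.72$
For an interval of: (-5.72, 3.72)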
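
The pooled calculation can be sketched as follows, using the numbers from the computation above (sample means 11 and 12). The critical value is a parameter, so either the 1.96 used in the worked problem or a table value for 34 degrees of freedom can be supplied; the function name `pooled_ci` is an assumption made for illustration.

```python
def pooled_ci(xbar, ybar, sx, sy, n, m, t_crit):
    """Pooled-variance interval for mu_x - mu_y, assuming equal population variances.

    t_crit is the upper alpha/2 point of the t distribution with
    n + m - 2 degrees of freedom, supplied by the caller.
    """
    # Pooled variance: weighted average of the two sample variances.
    s2 = ((n - 1) * sx ** 2 + (m - 1) * sy ** 2) / (n + m - 2)
    half_width = t_crit * (s2 * (1 / n + 1 / m)) ** 0.5
    diff = xbar - ybar
    return s2, (diff - half_width, diff + half_width)

# Problem 8.71 with the critical value used in the worked solution.
s2, (low, high) = pooled_ci(xbar=11, ybar=12, sx=6, sy=8, n=16, m=20, t_crit=1.96)
print(round(s2, 3), round(low, 2), round(high, 2))
```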