45-734 PROBABILITY AND STATISTICS II Homework Answers #4 (4th Mini AY1997-98)
The output for all 26 independent variables is shown below.
============================================================
LS // Dependent Variable is Y
Date: 03/22/98 Time: 22:01
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
D1 0.078227 0.004591 17.04017 0.0000
D2 0.074879 0.004386 17.07409 0.0000
D3 0.087562 0.004592 19.06738 0.0000
D4 0.079191 0.004296 18.43201 0.0000
D5 0.098049 0.004243 23.10639 0.0000
D6 0.064719 0.004327 14.95635 0.0000
D7 0.073621 0.004379 16.81131 0.0000
D8 0.053391 0.004567 11.69156 0.0000
D9 0.073663 0.004361 16.89258 0.0000
D10 0.046727 0.004461 10.47458 0.0000
D11 0.063839 0.004458 14.32120 0.0000
D12 0.045408 0.004325 10.49975 0.0000
D13 0.048226 0.004929 9.784037 0.0000
D14 0.038257 0.004458 8.580982 0.0000
D15 0.038148 0.004293 8.885784 0.0000
D16 0.036189 0.004350 8.319180 0.0000
D17 0.046678 0.004409 10.58735 0.0000
D18 0.038402 0.004924 7.798198 0.0000
APPOINT 0.010184 0.008949 1.138008 0.2552
DIED -0.003778 0.006090 -0.620311 0.5351
HIRUN 0.004321 0.003989 1.083138 0.2788
LOST -0.000101 0.003118 -0.032541 0.9740
NOTVOTE 0.022550 0.004588 4.915558 0.0000
REDIST 0.001407 0.002434 0.578122 0.5632
RETIRE -0.002357 0.003381 -0.697215 0.4857
VOTESHR -3.04E-05 4.46E-05 -0.680324 0.4963
============================================================
R-squared 0.103848 Mean dependent var 0.063764
Adjusted R-squared 0.100270 S.D. dependent var 0.059434
S.E. of regression 0.056376 Akaike info criterion -5.747300
Sum squared resid 19.90218 Schwarz criterion -5.719404
Log likelihood 9173.225 F-statistic 29.02601
Durbin-Watson stat 1.999388 Prob(F-statistic) 0.000000
============================================================
Note that, because our dependent variable is the absolute value of the
ideological shift, a negative coefficient means that the variable reduced the
shift.
The fixed effect indicators -- D1 through D18 -- are all statistically
significant. The overall mean ideological shift is .063764 (the underlying
scale runs from -1.0 to +1.0 so this is not a large effect) with the largest
shifts occurring early in the time series (compare D1 - D9 with D10 - D18,
which correspond to the time periods 1947-1968 and 1969-1986 respectively).
The negative coefficients for LOST and RETIRE do not make sense. Given
that the representative lost either the primary or general elections, this
indicates that the representative was out of touch with the wishes of her
constituency. All things being equal, we should expect the coefficient on
LOST to be zero or positive but not negative. However, it is so close to
zero that we need not be concerned about it.
In the case of RETIRE, the expectation is unambiguous -- this coefficient
should be positive. Note that this variable should be capturing the
pure "shirking" effect.
The negative coefficient for DIED is somewhat puzzling. Clearly, if
someone dies in office then they vote less! Hence, since the
ideological measure is based on the number of votes, dying will very likely
produce a (spurious) ideological shift. However, this is controlled for by
the NOTVOTE variable. A case could be made that a representative,
anticipating her death from some terminal disease, could begin voting her
"true" preferences thereby "shirking". But if the 97 representatives who
died all anticipated their death in such a fashion, the coefficient on DIED
should be positive!
Below is the output using an intercept term with D18 omitted to avoid
the "dummy variable trap". Doing a version of the thought experiment I
showed you in class, note that when D1=D2=D3=...=D17=0, the
coefficient for C picks up the omitted fixed effect dummy, D18, which is
equal to 0.038402. To get the correct value for D1...D17 we must add
0.038402 to them -- for example, on the first output D2=0.074879=0.036477+
0.038402. These two models are exactly the same (except for the
inconvenience of adding the coefficient of C to the other dummy variables) so
that the lower portions of the tables -- showing the R-squared, etc. -- are
identical.
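The bookkeeping described above can be checked with a few lines of Python. This is only a sketch: the names `intercept`, `offsets`, and `fixed_effects` are illustrative, and the numbers are copied from the C, D1, D2, D14, and D17 rows of the intercept version of the regression.

```python
# Coefficients copied from the regression that includes the intercept C
# (D18 omitted). The variable names are illustrative only.
intercept = 0.038402  # coefficient on C, i.e. the D18 fixed effect
offsets = {"D1": 0.039825, "D2": 0.036477, "D14": -0.000145, "D17": 0.008276}

# Adding C back to each dummy coefficient recovers the fixed effect
# reported by the regression without an intercept.
fixed_effects = {name: round(intercept + b, 6) for name, b in offsets.items()}

for name, value in fixed_effects.items():
    print(name, value)
```

The sums should agree with the D1, D2, D14, and D17 rows of the first output.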
Note that some of the fixed effect dummy variables are no longer
statistically "significant" even though the two regressions are absolutely
identical substantively! For example, the coefficient on D14 has a two-tail
p-value of .9755 even though the standard error barely changed. This is
simply an artifact -- the original coefficient on D14 was almost identical to
that of D18 (which is now the intercept term). Hence, the coefficient is now
near zero so the t-statistic is small.
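To see why the t-statistic collapses, recall that it is simply the coefficient divided by its standard error. The sketch below uses a hypothetical helper, `t_statistic`, and the D14 rows from the two outputs; because the printed coefficients are rounded, the results agree with the tables only to about two decimal places.

```python
def t_statistic(coefficient, std_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / std_error

# D14 in the regression without an intercept (first output).
t_no_intercept = t_statistic(0.038257, 0.004458)

# D14 in the regression with an intercept: the coefficient is now the
# difference between the D14 and D18 effects, which is close to zero.
t_with_intercept = t_statistic(-0.000145, 0.004726)

print(round(t_no_intercept, 2), round(t_with_intercept, 2))
```

The standard error is of similar size in both cases; only the numerator has changed.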
============================================================
LS // Dependent Variable is Y
Date: 03/22/98 Time: 22:20
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.038402 0.004924 7.798198 0.0000
D1 0.039825 0.004960 8.028512 0.0000
D2 0.036477 0.004878 7.478075 0.0000
D3 0.049160 0.004471 10.99636 0.0000
D4 0.040789 0.004815 8.470457 0.0000
D5 0.059647 0.004763 12.52293 0.0000
D6 0.026317 0.004852 5.423796 0.0000
D7 0.035219 0.004806 7.328188 0.0000
D8 0.014989 0.004387 3.416516 0.0006
D9 0.035261 0.004803 7.341743 0.0000
D10 0.008325 0.004448 1.871748 0.0613
D11 0.025437 0.004358 5.837158 0.0000
D12 0.007006 0.004699 1.491149 0.1360
D13 0.009824 0.004298 2.285615 0.0223
D14 -0.000145 0.004726 -0.030762 0.9755
D15 -0.000254 0.004822 -0.052679 0.9580
D16 -0.002213 0.004854 -0.455924 0.6485
D17 0.008276 0.004853 1.705603 0.0881
APPOINT 0.010184 0.008949 1.138008 0.2552
DIED -0.003778 0.006090 -0.620311 0.5351
HIRUN 0.004321 0.003989 1.083138 0.2788
LOST -0.000101 0.003118 -0.032541 0.9740
NOTVOTE 0.022550 0.004588 4.915558 0.0000
REDIST 0.001407 0.002434 0.578122 0.5632
RETIRE -0.002357 0.003381 -0.697215 0.4857
VOTESHR -3.04E-05 4.46E-05 -0.680324 0.4963
============================================================
R-squared 0.103848 Mean dependent var 0.063764
Adjusted R-squared 0.100270 S.D. dependent var 0.059434
S.E. of regression 0.056376 Akaike info criterion -5.747300
Sum squared resid 19.90218 Schwarz criterion -5.719404
Log likelihood 9173.225 F-statistic 29.02601
Durbin-Watson stat 1.999388 Prob(F-statistic) 0.000000
============================================================
Below are the results of regressing Y on the fixed effects dummy variables
alone. The coefficients are slightly larger than when all 26 variables are used
but notice that the pattern -- in terms of the relative magnitude -- is
exactly the same. To test whether or not the 8 omitted variables are jointly
significant we perform the F test:
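F(q, n-k-1) = [(SSE_R - SSE_UR)/q] / [SSE_UR/(n-k-1)]
= [(20.00789 - 19.90218)/8] / [19.90218/(6288-25-1)] = 4.1576
where SSE_R and SSE_UR are the sums of squared residuals from the restricted
and unrestricted models and q is the number of omitted variables.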
The P-Value [using @FDIST(4.1576,8,6262)] is .000057. We
reject the null hypothesis that the 8 omitted variables have
no effect. Consequently, one or more of the eight should be utilized in our
model.
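The arithmetic of this F test is easy to reproduce. The following is a minimal sketch, assuming a hypothetical helper `nested_f_statistic`; the sums of squared residuals are taken from the "Sum squared resid" lines of the restricted (18 dummies only) and unrestricted (all 26 variables) outputs. It computes only the test statistic, not the p-value.

```python
def nested_f_statistic(sse_restricted, sse_unrestricted, q, resid_df):
    """F statistic for q restrictions.

    resid_df is the residual degrees of freedom (n-k-1) of the
    unrestricted model.
    """
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / resid_df
    return numerator / denominator

n = 6288
resid_df = n - 25 - 1  # 6262, as in the formula above

f_stat = nested_f_statistic(20.00789, 19.90218, q=8, resid_df=resid_df)
print(round(f_stat, 4))
```

The same function applies to any pair of nested models: only the restricted sum of squared residuals and q change.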
============================================================
LS // Dependent Variable is Y
Date: 03/22/98 Time: 22:31
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
D1 0.082372 0.003256 25.29860 0.0000
D2 0.079507 0.003015 26.36894 0.0000
D3 0.091847 0.003096 29.67013 0.0000
D4 0.081544 0.002961 27.54074 0.0000
D5 0.100024 0.002913 34.33452 0.0000
D6 0.066523 0.003055 21.77813 0.0000
D7 0.076094 0.002977 25.55861 0.0000
D8 0.057115 0.003033 18.83433 0.0000
D9 0.077455 0.003091 25.05857 0.0000
D10 0.051914 0.003033 17.11900 0.0000
D11 0.069472 0.002917 23.81543 0.0000
D12 0.050331 0.002941 17.11507 0.0000
D13 0.053123 0.003024 17.56831 0.0000
D14 0.041207 0.003091 13.33131 0.0000
D15 0.040343 0.002973 13.56938 0.0000
D16 0.038589 0.003019 12.77985 0.0000
D17 0.049100 0.003024 16.23790 0.0000
D18 0.041590 0.003037 13.69480 0.0000
============================================================
R-squared 0.099087 Mean dependent var 0.063764
Adjusted R-squared 0.096645 S.D. dependent var 0.059434
S.E. of regression 0.056489 Akaike info criterion -5.744546
Sum squared resid 20.00789 Schwarz criterion -5.725234
Log likelihood 9156.568 F-statistic 40.56517
Durbin-Watson stat 1.997846 Prob(F-statistic) 0.000000
============================================================
Below is the regression omitting the 18 fixed effect dummy variables.
In this situation we should use the intercept term, C. The remaining
indicator variables do not sum to one so we need an intercept term. To test
whether or not the 18 fixed effect dummy variables are jointly significant we
perform the test:
F(q, n-k-1) = [(SSE_R - SSE_UR)/q] / [SSE_UR/(n-k-1)]
= [(22.00871 - 19.90218)/18] / [19.90218/(6288-25-1)] = 36.8220
The P-Value [using @FDIST(36.8220,18,6262)] is .0000000.
We reject the null hypothesis. The fixed effect dummy variables have a
considerable impact and must be left in the model.
The signs on DIED, LOST and RETIRE are negative; these are clearly wrong
for the reasons outlined earlier. The coefficient on C is picking up
(roughly) the average overall shift so that a negative coefficient means that
the shift is below the average. The coefficient on REDIST is also
negative and, unlike the others, statistically significant. This sign is also clearly
wrong. If a geographic constituency changes (which is what this indicator
variable is telling us), then the representative should be shifting his or
her voting pattern to match the new constituency (in other words, a greater
than average shift so the coefficient should be positive). The signs on
APPOINT and HIRUN are both positive which is what we should expect. Getting
appointed to, or running for, higher office should free the representative to
"shirk". The sign on VOTESHR is negative which indicates that the larger the
margin of victory in the previous election, the less the representative
shifted. Arguably, this is evidence of the fact that the representative must
be closely attuned to the preferences of her constituency in order to win by
larger than average margins. On the other hand, the larger the margin of
victory the greater the leeway the representative would have to "shirk" since
she could sacrifice a bit of her margin in a future election to vote against
the wishes of her constituency on occasion. In sum, the sign on this
variable is ambiguous. Finally, the coefficient on NOTVOTE is and should be
positive. As explained in the "Description of Variables" appendix to the
homework problem, this variable is simply picking up the loss of precision of
the estimation of the ideology variable.
Turning to statistical significance, the only variables that appear to
be important here are NOTVOTE, REDIST, and (marginally, with a p-value of
.0627) RETIRE. REDIST and RETIRE have
the wrong sign so that a mechanical application of hypothesis tests ignores
what our substantive knowledge tells us. Also, even though the R-squared is
only .008995, the overall F-statistic is statistically significant!
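The combination of a tiny R-squared and a significant overall F-statistic is a consequence of the large sample. For a regression with an intercept and k slope coefficients, the overall F-statistic can be written as (R-squared/k) / [(1 - R-squared)/(n-k-1)]. The sketch below, using a hypothetical helper `overall_f_from_r2`, plugs in the R-squared of 0.008995 reported in the output below with k = 8 and n = 6288; the result matches the reported F-statistic up to rounding of R-squared.

```python
def overall_f_from_r2(r_squared, k, n):
    """Overall F statistic for a regression with an intercept and k slopes."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

# R-squared, number of slope coefficients, and sample size taken from the
# regression that omits the 18 fixed effect dummies.
f_small_r2 = overall_f_from_r2(0.008995, k=8, n=6288)
print(round(f_small_r2, 3))
```

With more than 6,000 residual degrees of freedom, even a very small share of explained variance yields a large F.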
============================================================
LS // Dependent Variable is Y
Date: 03/22/98 Time: 22:47
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.062457 0.003393 18.41004 0.0000
APPOINT 0.009820 0.009378 1.047147 0.2951
DIED -0.001540 0.006365 -0.241869 0.8089
HIRUN 0.000339 0.004172 0.081332 0.9352
LOST -0.000538 0.003253 -0.165281 0.8687
NOTVOTE 0.026312 0.004623 5.691317 0.0000
REDIST -0.007402 0.001725 -4.290529 0.0000
RETIRE -0.006559 0.003523 -1.861612 0.0627
VOTESHR -3.77E-05 4.65E-05 -0.809000 0.4185
============================================================
R-squared 0.008995 Mean dependent var 0.063764
Adjusted R-squared 0.007732 S.D. dependent var 0.059434
S.E. of regression 0.059204 Akaike info criterion -5.652097
Sum squared resid 22.00871 Schwarz criterion -5.642441
Log likelihood 8856.909 F-statistic 7.123779
Durbin-Watson stat 1.813502 Prob(F-statistic) 0.000000
============================================================
Below is our preferred model. In the overall regression with all 26
variables, the 18 fixed effect dummy variables were statistically significant
at all reasonable levels and of the 8 additional variables, only NOTVOTE has
a small p-value. Performing the F test for excluding the other 7 variables
we get:
F(q, n-k-1) = [(SSE_R - SSE_UR)/q] / [SSE_UR/(n-k-1)]
= [(19.91641 - 19.90218)/7] / [19.90218/(6288-25-1)] = .6396
The P-Value [using @FDIST(.6396,7,6262)] is approximately .72.
We do not reject the null hypothesis that the omitted 7 variables have no
effect.
For reasons stated above, the signs on all the coefficients make sense.
============================================================
LS // Dependent Variable is Y
Date: 03/22/98 Time: 22:52
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
D1 0.076350 0.003437 22.21285 0.0000
D2 0.072874 0.003253 22.40529 0.0000
D3 0.086263 0.003259 26.46609 0.0000
D4 0.077145 0.003066 25.16208 0.0000
D5 0.095984 0.003003 31.96622 0.0000
D6 0.062665 0.003132 20.01125 0.0000
D7 0.071707 0.003081 23.27296 0.0000
D8 0.052033 0.003171 16.41113 0.0000
D9 0.071783 0.003260 22.01772 0.0000
D10 0.045623 0.003245 14.05984 0.0000
D11 0.062816 0.003164 19.85359 0.0000
D12 0.043591 0.003192 13.65723 0.0000
D13 0.047549 0.003191 14.90107 0.0000
D14 0.036427 0.003210 11.34733 0.0000
D15 0.036181 0.003066 11.79979 0.0000
D16 0.034113 0.003126 10.91235 0.0000
D17 0.044690 0.003127 14.29125 0.0000
D18 0.037780 0.003112 12.13909 0.0000
NOTVOTE 0.022108 0.004120 5.366135 0.0000
============================================================
R-squared 0.103206 Mean dependent var 0.063764
Adjusted R-squared 0.100631 S.D. dependent var 0.059434
S.E. of regression 0.056365 Akaike info criterion -5.748811
Sum squared resid 19.91641 Schwarz criterion -5.728426
Log likelihood 9170.976 F-statistic 40.08114
Durbin-Watson stat 1.998638 Prob(F-statistic) 0.000000
============================================================
Below is the regression with NOTVOTE as the dependent variable. The results
are very interesting. All the exit variables have positive signs and are
statistically significant. The largest coefficient is on DIED which makes
sense in this context. With respect to the remaining exit dummies, when a
member knows she is going to exit, she votes less but her voting pattern
given the smaller sample remains the same. In other words, "shirking" is not
ideological, it simply is voting less.
The redistricting dummy variable, REDIST, is positive and statistically
significant. This means that changing the geographic boundaries of
a representative's district produces a slight increase in not voting. This
does not appear to make much sense. One explanation might be that the
representative spends more of her time in the new district in order to
solidify her support and therefore votes less.
VOTESHR is positive and statistically significant. This implies that
the higher the previous election margin, the freer the representative feels
to vote less. Electorally safe representatives appear to vote less. However,
the effect is very small.
============================================================
LS // Dependent Variable is NOTVOTE
Date: 03/22/98 Time: 23:00
Sample: 1 6288
Included observations: 6288
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.076039 0.009210 8.256167 0.0000
APPOINT 0.286145 0.025340 11.29217 0.0000
DIED 0.416157 0.016560 25.12969 0.0000
HIRUN 0.174242 0.011173 15.59512 0.0000
LOST 0.077676 0.008825 8.801402 0.0000
REDIST 0.016437 0.004704 3.493949 0.0005
RETIRE 0.142396 0.009447 15.07345 0.0000
VOTESHR 0.001796 0.000125 14.36470 0.0000
============================================================
R-squared 0.180039 Mean dependent var 0.232428
Adjusted R-squared 0.179125 S.D. dependent var 0.178356
S.E. of regression 0.161595 Akaike info criterion -3.644055
Sum squared resid 163.9889 Schwarz criterion -3.635472
Log likelihood 2542.625 F-statistic 196.9853
Durbin-Watson stat 1.852625 Prob(F-statistic) 0.000000
============================================================
The principal-agent theory from economics, when applied to representation
in the U.S. House of Representatives, fails. The indicator variable RETIRE
-- which is a direct test of the "shirking" hypothesis -- is not
statistically significant and does not even have the correct sign in the full
model. Indeed, all the exit indicator variables appear to have no impact
upon ideological shifts beyond a very indirect effect through NOTVOTE. In
short -- all things being equal -- a representative does not alter her voting
pattern. Genuine ideological conversion is very rare in U.S. politics.
Representatives are voting their personal beliefs.
But if representatives are voting their personal beliefs, doesn't this
imply that they are trustees? Not necessarily. After all, they have to get
elected the first time! To use an American slang phrase, "what you see is
what you get" or, "a leopard cannot change its spots". The voters know what
they are getting the first time they vote! Hence, the representative should
not change her voting patterns.
What about the REDIST indicator variable? Shouldn't the representative
now shift her position to another stable pattern corresponding to the new
district? Not necessarily. If the representative truly believes that her
voting is correct, then she may not change. This is still an open area of
research.