45-734 PROBABILITY AND STATISTICS II Homework Answers #4 (4th Mini AY1997-98)



The output for all 26 independent variables is shown below.
============================================================
LS // Dependent Variable is Y                                         
Date: 03/22/98   Time: 22:01                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         D1          0.078227   0.004591   17.04017   0.0000          
         D2          0.074879   0.004386   17.07409   0.0000          
         D3          0.087562   0.004592   19.06738   0.0000          
         D4          0.079191   0.004296   18.43201   0.0000          
         D5          0.098049   0.004243   23.10639   0.0000          
         D6          0.064719   0.004327   14.95635   0.0000          
         D7          0.073621   0.004379   16.81131   0.0000          
         D8          0.053391   0.004567   11.69156   0.0000          
         D9          0.073663   0.004361   16.89258   0.0000          
        D10          0.046727   0.004461   10.47458   0.0000          
        D11          0.063839   0.004458   14.32120   0.0000          
        D12          0.045408   0.004325   10.49975   0.0000          
        D13          0.048226   0.004929   9.784037   0.0000          
        D14          0.038257   0.004458   8.580982   0.0000          
        D15          0.038148   0.004293   8.885784   0.0000          
        D16          0.036189   0.004350   8.319180   0.0000          
        D17          0.046678   0.004409   10.58735   0.0000          
        D18          0.038402   0.004924   7.798198   0.0000          
      APPOINT        0.010184   0.008949   1.138008   0.2552          
        DIED        -0.003778   0.006090  -0.620311   0.5351          
       HIRUN         0.004321   0.003989   1.083138   0.2788          
        LOST        -0.000101   0.003118  -0.032541   0.9740          
      NOTVOTE        0.022550   0.004588   4.915558   0.0000          
       REDIST        0.001407   0.002434   0.578122   0.5632          
       RETIRE       -0.002357   0.003381  -0.697215   0.4857          
      VOTESHR       -3.04E-05   4.46E-05  -0.680324   0.4963          
============================================================
R-squared            0.103848    Mean dependent var 0.063764          
Adjusted R-squared   0.100270    S.D. dependent var 0.059434          
S.E. of regression   0.056376    Akaike info criterion -5.747300
Sum squared resid    19.90218    Schwarz criterion -5.719404          
Log likelihood       9173.225    F-statistic        29.02601          
Durbin-Watson stat   1.999388    Prob(F-statistic)  0.000000          
============================================================
Note that, because our dependent variable is the absolute value of the ideological shift, a negative coefficient means that the variable reduced the shift.

The fixed effect indicators -- D1 through D18 -- are all statistically significant. The overall mean ideological shift is .063764 (the underlying scale runs from -1.0 to +1.0 so this is not a large effect) with the largest shifts occurring early in the time series (compare D1 - D9 with D10 - D18, which correspond to the time periods 1947-1968 and 1969-1986 respectively).
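
The early-versus-late comparison can be checked directly from the coefficients in the output above. The following is a minimal sketch in Python; the variable names are illustrative, and the simple unweighted average across periods is an assumption made here for convenience (it ignores differences in the number of observations per period).

```python
# Fixed-effect coefficients D1..D18, copied from the regression output above.
fixed_effects = [
    0.078227, 0.074879, 0.087562, 0.079191, 0.098049,
    0.064719, 0.073621, 0.053391, 0.073663,            # D1-D9
    0.046727, 0.063839, 0.045408, 0.048226, 0.038257,
    0.038148, 0.036189, 0.046678, 0.038402,            # D10-D18
]

early = fixed_effects[:9]   # D1-D9
late = fixed_effects[9:]    # D10-D18

# Unweighted means of the period effects (an illustrative simplification).
early_mean = sum(early) / len(early)
late_mean = sum(late) / len(late)

print(f"Mean of D1-D9:   {early_mean:.6f}")
print(f"Mean of D10-D18: {late_mean:.6f}")
print(f"Difference:      {early_mean - late_mean:.6f}")
```

On this rough measure the early-period effects are larger than the late-period effects, which is the pattern described above.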

The negative coefficients for LOST and RETIRE do not make sense. A representative who lost either the primary or the general election was presumably out of touch with the wishes of her constituency. All things being equal, we should expect the coefficient on LOST to be zero or positive but not negative. However, it is so close to zero that we need not be concerned about it.

In the case of RETIRE, the expectation is unambiguous -- this coefficient should be positive. Note that this variable should be capturing the pure "shirking" effect.

The negative coefficient for DIED is somewhat puzzling. Clearly, someone who dies in office votes less! Hence, since the ideological measure is based on the number of votes, dying will very likely produce a (spurious) ideological shift. However, this is controlled for by the NOTVOTE variable. A case could be made that a representative, anticipating her death from some terminal disease, could begin voting her "true" preferences, thereby "shirking". But if the 97 representatives who died all anticipated their death in such a fashion, the coefficient on DIED should be positive!

Below is the output using an intercept term with D18 omitted to avoid the "dummy variable trap". Doing a version of the thought experiment I showed you in class, note that when D1=D2=D3=...=D17=0, the coefficient for C picks up the omitted fixed effect dummy, D18, which is equal to 0.038402. To recover the values of D1...D17 from the first output we must add 0.038402 to the coefficients shown below -- for example, on the first output D2 = 0.074879 = 0.036477 + 0.038402. These two models are exactly the same (except for the inconvenience of adding the coefficient of C to the other dummy variables) so that the lower portions of the tables -- showing the R-squared, etc. -- are identical.

Note that some of the fixed effect dummy variables are no longer statistically "significant" even though the two regressions are absolutely identical substantively! For example, the coefficient on D14 has a two-tail p-value of .9755 even though the standard error barely changed. This is simply an artifact -- the original coefficient on D14 was almost identical to that of D18 (which is now the intercept term). Hence, the coefficient is now near zero so the t-statistic is small.
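
The arithmetic behind this reparameterization can be sketched in a few lines of Python. The numbers are copied from the two outputs (the first one above, the intercept version below); the dictionary names are illustrative only, and just a handful of the dummies are shown.

```python
# Coefficient on C in the intercept version (this is the omitted D18).
intercept = 0.038402

# Coefficients and standard error taken from the intercept version.
offsets = {"D1": 0.039825, "D2": 0.036477, "D14": -0.000145, "D17": 0.008276}
se_d14 = 0.004726

# Corresponding coefficients from the first output (no intercept).
original = {"D1": 0.078227, "D2": 0.074879, "D14": 0.038257, "D17": 0.046678}

# Adding C back to each offset recovers the first-output coefficient.
recovered = {name: intercept + off for name, off in offsets.items()}
for name in offsets:
    print(f"{name}: {offsets[name]:+.6f} + {intercept:.6f} = "
          f"{recovered[name]:.6f} (first output: {original[name]:.6f})")

# The t-statistic for D14 now tests "D14 equals D18", not "D14 equals zero".
# It is computed here from rounded table values, so it matches the printed
# t-statistic only approximately.
t_d14 = offsets["D14"] / se_d14
print(f"t-statistic for D14 in the intercept version: {t_d14:.4f}")
```

The small t-statistic on D14 reflects only how close D14 is to D18; it says nothing about whether the period effect itself is zero.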

============================================================
LS // Dependent Variable is Y                                         
Date: 03/22/98   Time: 22:20                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         C           0.038402   0.004924   7.798198   0.0000          
         D1          0.039825   0.004960   8.028512   0.0000          
         D2          0.036477   0.004878   7.478075   0.0000          
         D3          0.049160   0.004471   10.99636   0.0000          
         D4          0.040789   0.004815   8.470457   0.0000          
         D5          0.059647   0.004763   12.52293   0.0000          
         D6          0.026317   0.004852   5.423796   0.0000          
         D7          0.035219   0.004806   7.328188   0.0000          
         D8          0.014989   0.004387   3.416516   0.0006          
         D9          0.035261   0.004803   7.341743   0.0000          
        D10          0.008325   0.004448   1.871748   0.0613          
        D11          0.025437   0.004358   5.837158   0.0000          
        D12          0.007006   0.004699   1.491149   0.1360          
        D13          0.009824   0.004298   2.285615   0.0223          
        D14         -0.000145   0.004726  -0.030762   0.9755          
        D15         -0.000254   0.004822  -0.052679   0.9580          
        D16         -0.002213   0.004854  -0.455924   0.6485          
        D17          0.008276   0.004853   1.705603   0.0881          
      APPOINT        0.010184   0.008949   1.138008   0.2552          
        DIED        -0.003778   0.006090  -0.620311   0.5351          
       HIRUN         0.004321   0.003989   1.083138   0.2788          
        LOST        -0.000101   0.003118  -0.032541   0.9740          
      NOTVOTE        0.022550   0.004588   4.915558   0.0000          
       REDIST        0.001407   0.002434   0.578122   0.5632          
       RETIRE       -0.002357   0.003381  -0.697215   0.4857          
      VOTESHR       -3.04E-05   4.46E-05  -0.680324   0.4963          
============================================================
R-squared            0.103848    Mean dependent var 0.063764          
Adjusted R-squared   0.100270    S.D. dependent var 0.059434          
S.E. of regression   0.056376    Akaike info criterion -5.747300
Sum squared resid    19.90218    Schwarz criterion -5.719404          
Log likelihood       9173.225    F-statistic        29.02601          
Durbin-Watson stat   1.999388    Prob(F-statistic)  0.000000          
============================================================
Below are the results of regressing Y on the fixed effects dummy variables alone. The coefficients are slightly larger than when all 26 variables are used but notice that the pattern -- in terms of the relative magnitude -- is exactly the same. To test whether or not the 8 omitted variables are jointly significant we perform the F test:

F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
[(20.00789 - 19.90218)/8] / [19.90218/(6288-25-1)] = 4.1576

The P-Value [using @FDIST(4.1576,8,6262)] is .000057. We reject the null hypothesis that the 8 omitted variables have no effect. Consequently, one or more of the eight should be utilized in our model.
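
For readers without EViews, the same test can be reproduced with the Python standard library. This is a sketch under stated assumptions: `nested_f_test`, `f_sf`, and the helper functions are hypothetical names introduced here, and the upper-tail F probability is computed with a textbook continued-fraction evaluation of the regularized incomplete beta function rather than with EViews' own @FDIST routine.

```python
from math import exp, lgamma, log, log1p


def _beta_cf(a, b, x, max_iter=1000, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log1p(-x))
    front = exp(ln_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """Upper-tail probability P(F > f) for an F(d1, d2) distribution."""
    return reg_inc_beta(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


def nested_f_test(sse_r, sse_ur, q, df_resid):
    """F statistic and p-value for q restrictions on the unrestricted model."""
    f = ((sse_r - sse_ur) / q) / (sse_ur / df_resid)
    return f, f_sf(f, q, df_resid)


# Restricted: 18 dummies only. Unrestricted: all 26 variables.
f_stat, p_value = nested_f_test(20.00789, 19.90218, q=8, df_resid=6288 - 25 - 1)
print(f"F = {f_stat:.4f}, p-value = {p_value:.6f}")
```

The same function applies to the other F tests in this answer key by changing the restricted sum of squared residuals and the number of restrictions q.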

============================================================
LS // Dependent Variable is Y                                         
Date: 03/22/98   Time: 22:31                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         D1          0.082372   0.003256   25.29860   0.0000          
         D2          0.079507   0.003015   26.36894   0.0000          
         D3          0.091847   0.003096   29.67013   0.0000          
         D4          0.081544   0.002961   27.54074   0.0000          
         D5          0.100024   0.002913   34.33452   0.0000          
         D6          0.066523   0.003055   21.77813   0.0000          
         D7          0.076094   0.002977   25.55861   0.0000          
         D8          0.057115   0.003033   18.83433   0.0000          
         D9          0.077455   0.003091   25.05857   0.0000          
        D10          0.051914   0.003033   17.11900   0.0000          
        D11          0.069472   0.002917   23.81543   0.0000          
        D12          0.050331   0.002941   17.11507   0.0000          
        D13          0.053123   0.003024   17.56831   0.0000          
        D14          0.041207   0.003091   13.33131   0.0000          
        D15          0.040343   0.002973   13.56938   0.0000          
        D16          0.038589   0.003019   12.77985   0.0000          
        D17          0.049100   0.003024   16.23790   0.0000          
        D18          0.041590   0.003037   13.69480   0.0000          
============================================================
R-squared            0.099087    Mean dependent var 0.063764          
Adjusted R-squared   0.096645    S.D. dependent var 0.059434          
S.E. of regression   0.056489    Akaike info criterion -5.744546
Sum squared resid    20.00789    Schwarz criterion -5.725234          
Log likelihood       9156.568    F-statistic        40.56517          
Durbin-Watson stat   1.997846    Prob(F-statistic)  0.000000          
============================================================
Below is the regression omitting the 18 fixed effect dummy variables. In this situation we should use the intercept term, C. The remaining indicator variables do not sum to one so we need an intercept term. To test whether or not the 18 fixed effect dummy variables are jointly significant we perform the test:

F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
[(22.00871 - 19.90218)/18] / [19.90218/(6288-25-1)] = 36.8220

The P-Value [using @FDIST(36.8220,18,6262)] is .0000000. We reject the null hypothesis. The fixed effect dummy variables have a considerable impact and must be left in the model.

The signs on DIED, LOST and RETIRE are negative; these are clearly wrong for the reasons outlined earlier. The coefficient on C is picking up (roughly) the average overall shift, so a negative coefficient on one of the other variables means that the shift is below the average. The coefficient on REDIST is also negative, and it is statistically significant. This sign is also clearly wrong. If a geographic constituency changes (which is what this indicator variable is telling us), then the representative should be shifting his or her voting pattern to match the new constituency (in other words, a greater than average shift, so the coefficient should be positive).

The signs on APPOINT and HIRUN are both positive, which is what we should expect. Getting appointed to, or running for, higher office should free the representative to "shirk".

The sign on VOTESHR is negative, which indicates that the larger the margin of victory in the previous election, the less the representative shifted. Arguably, this is evidence that the representative must be closely attuned to the preferences of her constituency in order to win by larger than average margins. On the other hand, the larger the margin of victory, the greater the leeway the representative would have to "shirk", since she could sacrifice a bit of her margin in a future election to vote against the wishes of her constituency on occasion. In sum, the sign on this variable is ambiguous.

Finally, the coefficient on NOTVOTE is, and should be, positive. As explained in the "Description of Variables" appendix to the homework problem, this variable is simply picking up the loss of precision of the estimation of the ideology variable.

Turning to statistical significance, the only variables that appear to be important here are NOTVOTE, REDIST, and RETIRE. REDIST and RETIRE have the wrong sign, so a mechanical application of hypothesis tests ignores what our substantive knowledge tells us. Also, even though the R-squared is only .008995, the overall F-statistic is statistically significant!
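
Why a tiny R-squared can still give a significant overall F-statistic is easy to see from the standard relationship between the two in a regression with an intercept. The sketch below assumes that relationship, F = (R2/k) / ((1 - R2)/(n - k - 1)), and plugs in the R-squared from the output below; the variable names are illustrative.

```python
# Values from the regression of Y on C and the 8 non-dummy variables.
r_squared = 0.008995
n_obs = 6288
k_slopes = 8  # regressors other than the intercept

# Overall F statistic implied by R-squared (standard formula, assumed here).
explained_per_df = r_squared / k_slopes
unexplained_per_df = (1.0 - r_squared) / (n_obs - k_slopes - 1)
f_overall = explained_per_df / unexplained_per_df

print(f"Overall F implied by R-squared: {f_overall:.4f}")
```

With more than 6,000 residual degrees of freedom the denominator is very small, so even a model that explains under one percent of the variance produces an F-statistic of about 7, close to the F-statistic printed in the output.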

============================================================
LS // Dependent Variable is Y                                         
Date: 03/22/98   Time: 22:47                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         C           0.062457   0.003393   18.41004   0.0000          
      APPOINT        0.009820   0.009378   1.047147   0.2951          
        DIED        -0.001540   0.006365  -0.241869   0.8089          
       HIRUN         0.000339   0.004172   0.081332   0.9352          
        LOST        -0.000538   0.003253  -0.165281   0.8687          
      NOTVOTE        0.026312   0.004623   5.691317   0.0000          
       REDIST       -0.007402   0.001725  -4.290529   0.0000          
       RETIRE       -0.006559   0.003523  -1.861612   0.0627          
      VOTESHR       -3.77E-05   4.65E-05  -0.809000   0.4185          
============================================================
R-squared            0.008995    Mean dependent var 0.063764          
Adjusted R-squared   0.007732    S.D. dependent var 0.059434          
S.E. of regression   0.059204    Akaike info criterion -5.652097
Sum squared resid    22.00871    Schwarz criterion -5.642441          
Log likelihood       8856.909    F-statistic        7.123779          
Durbin-Watson stat   1.813502    Prob(F-statistic)  0.000000          
============================================================
Below is our preferred model. In the overall regression with all 26 variables, the 18 fixed effect dummy variables were statistically significant at all reasonable levels and, of the 8 additional variables, only NOTVOTE has a small p-value. The F test for excluding the other 7 variables gives:

F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
[(19.91641 - 19.90218)/7] / [19.90218/(6288-25-1)] = .6396

The P-Value [using @FDIST(.6396,7,6262), with 7 numerator degrees of freedom to match the 7 omitted variables] is about .72. We do not reject the null hypothesis that the omitted 7 variables have no effect.

For reasons stated above, the signs on all the coefficients make sense.
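
A second, informal check on the preferred model is the adjusted R-squared, which penalizes extra regressors. The sketch below assumes the usual formula, 1 - (1 - R2)(n - 1)/(n - k), with k the number of estimated coefficients; this reproduces the adjusted values printed in the outputs, but whether it is exactly how EViews computes them is an assumption. The function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_coefs):
    """Adjusted R-squared with n_coefs estimated coefficients (assumed formula)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_coefs)


n_obs = 6288

# Full model: 18 dummies plus 8 other variables = 26 coefficients.
adj_full = adjusted_r_squared(0.103848, n_obs, 26)

# Preferred model: 18 dummies plus NOTVOTE = 19 coefficients.
adj_preferred = adjusted_r_squared(0.103206, n_obs, 19)

print(f"Adjusted R-squared, full model:      {adj_full:.6f}")
print(f"Adjusted R-squared, preferred model: {adj_preferred:.6f}")
```

Dropping the 7 variables lowers R-squared slightly but raises the adjusted R-squared, which is consistent with the F test above.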
============================================================
LS // Dependent Variable is Y                                         
Date: 03/22/98   Time: 22:52                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         D1          0.076350   0.003437   22.21285   0.0000          
         D2          0.072874   0.003253   22.40529   0.0000          
         D3          0.086263   0.003259   26.46609   0.0000          
         D4          0.077145   0.003066   25.16208   0.0000          
         D5          0.095984   0.003003   31.96622   0.0000          
         D6          0.062665   0.003132   20.01125   0.0000          
         D7          0.071707   0.003081   23.27296   0.0000          
         D8          0.052033   0.003171   16.41113   0.0000          
         D9          0.071783   0.003260   22.01772   0.0000          
        D10          0.045623   0.003245   14.05984   0.0000          
        D11          0.062816   0.003164   19.85359   0.0000          
        D12          0.043591   0.003192   13.65723   0.0000          
        D13          0.047549   0.003191   14.90107   0.0000          
        D14          0.036427   0.003210   11.34733   0.0000          
        D15          0.036181   0.003066   11.79979   0.0000          
        D16          0.034113   0.003126   10.91235   0.0000          
        D17          0.044690   0.003127   14.29125   0.0000          
        D18          0.037780   0.003112   12.13909   0.0000          
      NOTVOTE        0.022108   0.004120   5.366135   0.0000          
============================================================
R-squared            0.103206    Mean dependent var 0.063764          
Adjusted R-squared   0.100631    S.D. dependent var 0.059434          
S.E. of regression   0.056365    Akaike info criterion -5.748811
Sum squared resid    19.91641    Schwarz criterion -5.728426          
Log likelihood       9170.976    F-statistic        40.08114          
Durbin-Watson stat   1.998638    Prob(F-statistic)  0.000000          
============================================================
Below is the regression with NOTVOTE as the dependent variable. The results are very interesting. All the exit variables have positive signs and are statistically significant. The largest coefficient is on DIED, which makes sense in this context. With respect to the remaining exit dummies, when a member knows she is going to exit, she votes less, but her voting pattern over the smaller sample of votes remains the same. In other words, "shirking" is not ideological; it is simply voting less.

The redistricting dummy variable, REDIST, is positive and statistically significant. This means that changing the geographic boundaries of a representative's district produces a slight increase in not voting. This does not appear to make much sense. One explanation might be that the representative spends more of her time in the new district in order to solidify her support and therefore votes less.

VOTESHR is positive and statistically significant. This implies that the higher the previous election margin, the freer the representative feels to vote less. Electorally safe representatives appear to vote less. However, the effect is very small.

============================================================
LS // Dependent Variable is NOTVOTE                                   
Date: 03/22/98   Time: 23:00                                          
Sample: 1 6288                                                        
Included observations: 6288                                           
============================================================
      Variable      Coefficient Std. Error t-Statistic  Prob.
============================================================
         C           0.076039   0.009210   8.256167   0.0000          
      APPOINT        0.286145   0.025340   11.29217   0.0000          
        DIED         0.416157   0.016560   25.12969   0.0000          
       HIRUN         0.174242   0.011173   15.59512   0.0000          
        LOST         0.077676   0.008825   8.801402   0.0000          
       REDIST        0.016437   0.004704   3.493949   0.0005          
       RETIRE        0.142396   0.009447   15.07345   0.0000          
      VOTESHR        0.001796   0.000125   14.36470   0.0000          
============================================================
R-squared            0.180039    Mean dependent var 0.232428          
Adjusted R-squared   0.179125    S.D. dependent var 0.178356          
S.E. of regression   0.161595    Akaike info criterion -3.644055
Sum squared resid    163.9889    Schwarz criterion -3.635472          
Log likelihood       2542.625    F-statistic        196.9853          
Durbin-Watson stat   1.852625    Prob(F-statistic)  0.000000          
============================================================
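
To get a rough sense of how large the indirect effect of the exit variables on the ideological shift is, the coefficients in the table above can be multiplied by the NOTVOTE coefficient from the preferred model (0.022108). This product-of-coefficients calculation is an illustrative assumption introduced here, not part of the original analysis, and the names are hypothetical.

```python
# Effect of NOTVOTE on the shift Y, from the preferred model.
notvote_on_shift = 0.022108

# Effect of each exit variable on NOTVOTE, from the table above.
exit_on_notvote = {
    "APPOINT": 0.286145,
    "DIED": 0.416157,
    "HIRUN": 0.174242,
    "LOST": 0.077676,
    "RETIRE": 0.142396,
}

# Rough indirect effect on Y: (exit -> NOTVOTE) times (NOTVOTE -> Y).
indirect = {name: coef * notvote_on_shift
            for name, coef in exit_on_notvote.items()}

for name, effect in sorted(indirect.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} indirect effect on shift: {effect:.6f}")
```

Even for DIED, the largest of these, the implied effect is under 0.01 on a scale where the mean shift is 0.063764.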
The principal-agent theory from economics fails when applied to representation in the U.S. House of Representatives. The indicator variable RETIRE -- which is a direct test of the "shirking" hypothesis -- is not statistically significant and does not even have the correct sign in the full model. Indeed, all the exit indicator variables appear to have no impact upon ideological shifts beyond a very indirect effect through NOTVOTE. In short -- all things being equal -- a representative does not alter her voting pattern. Genuine ideological conversion is very rare in U.S. politics. Representatives are voting their personal beliefs.

But if representatives are voting their personal beliefs, doesn't this imply that they are trustees? Not necessarily. After all, they have to get elected the first time! To use an American slang phrase, "what you see is what you get" or, "a leopard cannot change its spots". The voters know what they are getting the first time they vote! Hence, the representative should not change her voting patterns.

What about the REDIST indicator variable? Shouldn't the representative now shift her position to another stable pattern corresponding to the new district? Not necessarily. If the representative truly believes that her voting is correct, then she may not change. This is still an open area of research.