
This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

45-734 PROBABILITY AND STATISTICS II Homework Answers #3 (4th Mini AY1997-98)



  1. The following are the results needed to test, for each pair of the coefficients b1, b2, and b3, the hypothesis that both coefficients are equal to 0. We shall suppose that b1 corresponds to NICO, b2 corresponds to TAR, and b3 corresponds to WEIGHT.
    ============================================================
    LS // Dependent Variable is CARMON                                    
    Date: 03/16/98   Time: 23:07                                          
    Sample: 1 25                                                          
    Included observations: 25                                             
    ============================================================
          Variable     Coefficient Std. Error t-Statistic  Prob.
    ============================================================
             C           3.202188   3.461755   0.925019   0.3655          
            NICO        -2.631660   3.900556  -0.674688   0.5072          
            TAR          0.962574   0.242244   3.973566   0.0007          
           WEIGHT       -0.130480   3.885342  -0.033583   0.9735          
    ============================================================
    R-squared            0.918589    Mean dependent var 12.52800          
    Adjusted R-squared   0.906959    S.D. dependent var 4.739683          
    S.E. of regression   1.445726    Akaike info criter 0.882869          
    Sum squared resid    43.89258    Schwarz criterion  1.077890          
    Log likelihood      -42.50933    F-statistic        78.98385          
    Durbin-Watson stat   2.860469    Prob(F-statistic)  0.000000          
    ============================================================
    
    ============================================================
    LS // Dependent Variable is CARMON                                    
    Date: 03/16/98   Time: 23:15                                          
    Sample: 1 25                                                          
    Included observations: 25                                             
    ============================================================
          Variable     Coefficient Std. Error t-Statistic  Prob.
    ============================================================
             C           1.664666   0.993602   1.675386   0.1074          
            NICO         12.39541   1.054152   11.75865   0.0000          
    ============================================================
    R-squared            0.857378    Mean dependent var 12.52800          
    Adjusted R-squared   0.851177    S.D. dependent var 4.739683          
    S.E. of regression   1.828452    Akaike info criter 1.283558          
    Sum squared resid    76.89449    Schwarz criterion  1.381068          
    Log likelihood      -49.51794    F-statistic        138.2659          
    Durbin-Watson stat   2.673760    Prob(F-statistic)  0.000000          
    ============================================================
    
    ============================================================
    LS // Dependent Variable is CARMON                                    
    Date: 03/16/98   Time: 23:16                                          
    Sample: 1 25                                                          
    Included observations: 25                                             
    ============================================================
          Variable     Coefficient Std. Error t-Statistic  Prob.
    ============================================================
             C           2.743277   0.675206   4.062875   0.0005          
            TAR          0.800976   0.050320   15.91759   0.0000          
    ============================================================
    R-squared            0.916778    Mean dependent var 12.52800          
    Adjusted R-squared   0.913160    S.D. dependent var 4.739683          
    S.E. of regression   1.396721    Akaike info criter 0.744873          
    Sum squared resid    44.86908    Schwarz criterion  0.842383          
    Log likelihood      -42.78438    F-statistic        253.3698          
    Durbin-Watson stat   2.892673    Prob(F-statistic)  0.000000          
    ============================================================
    
    ============================================================
    LS // Dependent Variable is CARMON                                    
    Date: 03/16/98   Time: 23:16                                          
    Sample: 1 25                                                          
    Included observations: 25                                             
    ============================================================
          Variable     Coefficient Std. Error t-Statistic  Prob.
    ============================================================
             C          -11.79527   9.721627  -1.213302   0.2373          
           WEIGHT        25.06820   9.980283   2.511773   0.0195          
    ============================================================
    R-squared            0.215258    Mean dependent var 12.52800          
    Adjusted R-squared   0.181139    S.D. dependent var 4.739683          
    S.E. of regression   4.288984    Akaike info criter 2.988718          
    Sum squared resid    423.0939    Schwarz criterion  3.086228          
    Log likelihood      -70.83244    F-statistic        6.309001          
    Durbin-Watson stat   2.615425    Prob(F-statistic)  0.019481          
    ============================================================
    
    To test

    H0: b2 = b3 = 0
    H1: b2 ≠ 0 and/or b3 ≠ 0


    We use the first and second sets of results above to find:

    F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
    [(76.89449 - 43.89258)/2]/[43.89258/(25-3-1)] = 7.8947

    The table value is F(2, 21; .05) = 3.47. Since 7.8947 > 3.47 we reject the null hypothesis at the α = .05 significance level. Thus there is sufficient evidence to say that at least one of the two coefficients is not equal to 0.

    This is equivalent to performing the Wald Test:
    ====================================================
    Wald Test:                                                    
    Equation: Untitled                                            
    ====================================================
    Null Hypothesis: C(3)=0
                     C(4)=0
    ====================================================
    F-statistic     7.894729    Probability     0.002775          
    Chi-square      15.78946    Probability     0.000373          
    ====================================================
    
    To test

    H0: b1 = b3 = 0
    H1: b1 ≠ 0 and/or b3 ≠ 0


    We use the first and third sets of results above to find:

    F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
    [(44.86908 - 43.89258)/2]/[43.89258/(25-3-1)] = .2336

    The table value is F(2, 21; .05) = 3.47. Since .2336 < 3.47 we do not reject the null hypothesis at the α = .05 significance level. There is not enough evidence to say that at least one of the two coefficients is not equal to 0.

    This is equivalent to the Wald Test:
    ====================================================
    Wald Test:                                                    
    Equation: Untitled                                            
    ====================================================
    Null Hypothesis: C(2)=0
                     C(4)=0
    ====================================================
    F-statistic     0.233600    Probability     0.793708          
    Chi-square      0.467199    Probability     0.791679          
    ====================================================
    
    To test

    H0: b1 = b2 = 0
    H1: b1 ≠ 0 and/or b2 ≠ 0


    We use the first and fourth sets of results above to find:

    F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
    [(423.0939 - 43.89258)/2]/[43.89258/(25-3-1)] = 90.7127

    The table value is F(2, 21; .05) = 3.47. Since 90.7127 > 3.47 we reject the null hypothesis at the α = .05 significance level. There is enough evidence to say that at least one of the two coefficients is not equal to 0.

    This is equivalent to the Wald Test:
    ====================================================
    Wald Test:                                                    
    Equation: Untitled                                            
    ====================================================
    Null Hypothesis: C(2)=0
                     C(3)=0
    ====================================================
    F-statistic     90.71269    Probability     0.000000          
    Chi-square      181.4254    Probability     0.000000          
    ====================================================
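    The three restricted-versus-unrestricted F computations above can be reproduced with a few lines of Python. This is only a sketch: the function and variable names are illustrative, the sums of squared residuals are copied from the regression output, and 3.47 is the table value quoted in the text. The last column uses the fact that, for q linear restrictions, the Wald chi-square statistic equals q times the F statistic.

```python
def restricted_f(sse_r, sse_ur, q, n, k):
    """F statistic for q zero restrictions.

    sse_r  : sum of squared residuals, restricted model
    sse_ur : sum of squared residuals, unrestricted model
    n, k   : observations and number of slopes in the unrestricted model
    """
    return ((sse_r - sse_ur) / q) / (sse_ur / (n - k - 1))


SSE_UR = 43.89258   # model with NICO, TAR and WEIGHT
F_CRIT = 3.47       # F(2, 21; .05) table value quoted in the text

# Restricted sums of squared residuals, keyed by the null hypothesis tested.
sse_restricted = {
    "b2=b3=0": 76.89449,   # NICO only
    "b1=b3=0": 44.86908,   # TAR only
    "b1=b2=0": 423.0939,   # WEIGHT only
}

results = {}
for null, sse_r in sse_restricted.items():
    f_stat = restricted_f(sse_r, SSE_UR, q=2, n=25, k=3)
    results[null] = f_stat
    decision = "reject H0" if f_stat > F_CRIT else "do not reject H0"
    print(f"{null}: F = {f_stat:.4f}, chi-square = {2 * f_stat:.4f}, {decision}")
```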
    
    The results from the three F-tests indicate that b1 and b3 are not significantly different from 0, either individually or jointly. The null hypothesis was rejected in both F-tests that included b2, which should not be surprising because b2 itself was significant in the model that included all the coefficients. The proper choice of model is therefore the one that uses only the variable TAR and an intercept to predict the level of carbon monoxide. Notice, in fact, that the adjusted R-squared is higher in the model with only TAR (0.913160) than in the unrestricted model (0.906959).
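    The adjusted R-squared comparison can be checked from the reported R-squared values. The sketch below assumes the usual definition, 1 - (1 - R-squared)(n - 1)/(n - k - 1), where k is the number of slope coefficients; the function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k slopes plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# R-squared values copied from the regression output (n = 25 cigarettes).
adj_full = adjusted_r2(0.918589, n=25, k=3)      # NICO, TAR, WEIGHT
adj_tar_only = adjusted_r2(0.916778, n=25, k=1)  # TAR only

print(f"unrestricted: {adj_full:.6f}")
print(f"TAR only:     {adj_tar_only:.6f}")
```

    Dropping NICO and WEIGHT lowers R-squared only slightly, while the degrees-of-freedom penalty falls from 24/21 to 24/23, which is why the adjusted value rises.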

    The test

    H0: b2 = 1
    H1: b2 ≠ 1


    can be done using either the unrestricted model or the restricted model with only the variable TAR in it. However, based on what was said above, it should really be done on the model with only TAR and the intercept.

    For the unrestricted case:

    Test Statistic = (b2 - 1)/SE{b2} = (.962574 - 1)/.242244 = -.1545. Since |-.1545| < t(21; .025) = 2.080, we fail to reject the null hypothesis that b2 = 1 at the .05 significance level.

    The equivalent Wald Test is (note that F = t^2 here):
    ====================================================
    Wald Test:                                                    
    Equation: Untitled                                            
    ====================================================
    Null Hypothesis: C(3)=1
    ====================================================
    F-statistic     0.023870    Probability     0.878692          
    Chi-square      0.023870    Probability     0.877217          
    ====================================================
    
    For the restricted case:

    Test Statistic = (b2 - 1)/SE{b2} = (.800976 - 1)/.050320 = -3.9552. Since |-3.9552| > t(23; .025) = 2.069, we reject the null hypothesis b2 = 1 at the .05 significance level.

    The equivalent Wald Test is (note that F = t^2 here):
    ====================================================
    Wald Test:                                                    
    Equation: Untitled                                            
    ====================================================
    Null Hypothesis: C(2)=1
    ====================================================
    F-statistic     15.64324    Probability     0.000629          
    Chi-square      15.64324    Probability     0.000076          
    ====================================================
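    Both t-tests of b2 = 1, and the F = t^2 relation noted beside the Wald output, can be sketched as follows. The coefficients, standard errors, and critical values are copied from the text; the function name and dictionary layout are illustrative.

```python
def t_stat(coef, se, hypothesized):
    """t statistic for H0: coefficient equals the hypothesized value."""
    return (coef - hypothesized) / se


# (coefficient on TAR, standard error, two-sided 5% critical value)
models = {
    "unrestricted": (0.962574, 0.242244, 2.080),  # t(21; .025)
    "TAR only": (0.800976, 0.050320, 2.069),      # t(23; .025)
}

t_values = {}
for name, (coef, se, t_crit) in models.items():
    t = t_stat(coef, se, hypothesized=1.0)
    t_values[name] = t
    decision = "reject H0" if abs(t) > t_crit else "do not reject H0"
    # With a single restriction the Wald F statistic is the square of t.
    print(f"{name}: t = {t:.4f}, t^2 = {t * t:.5f}, {decision}")
```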
    
    In summary, the model that uses only the variable TAR and the intercept to predict the amount of carbon monoxide should be used, as there is no evidence that either of the other variables adds any predictive power over the amount of carbon monoxide given off by a cigarette once TAR is in the model.

    1. The model which predicts the number of wildcat wells from the variables OILCON, PCI, PRICEOK and VEHICLE is shown below:

      ============================================================
      LS // Dependent Variable is WILDCT2                                   
      Date: 03/20/98   Time: 22:23                                          
      Sample: 1936 1987                                                     
      Included observations: 52                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           29855.87   7030.455   4.246649   0.0001          
             OILCON        53073.31   248862.9   0.213263   0.8320          
              PCI         -1875.729   1549.480  -1.210554   0.2321          
            PRICEOK        2100.154   406.8240   5.162317   0.0000          
            VEHICLE        9139.570   39587.01   0.230873   0.8184          
      ============================================================
      R-squared            0.623885    Mean dependent var 43583.69          
      Adjusted R-squared   0.591875    S.D. dependent var 16546.58          
      S.E. of regression   10570.73    Akaike info criter 18.62290          
      Sum squared resid    5.25E+09    Schwarz criterion  18.81052          
      Log likelihood      -552.9802    F-statistic        19.49042          
      Durbin-Watson stat   0.403879    Prob(F-statistic)  0.000000          
      ============================================================
      
      The results of this model show that, individually, the only variable that is significant in predicting the number of wildcat oil wells drilled is the price of oil, PRICEOK. The coefficient is positive and says that as the price of a barrel of oil increases by one dollar, the expected number of wells drilled increases by about 2,100. The other variables are not individually significant. The result makes sense if one believes that the major cause for drilling more wells is financial. It could be that the demand for oil, the number of vehicles on the road, and even per capita income are exogenous and do not determine whether it is actually cost effective to drill wells. In fact, this is what the model shown here seems to imply.

      The regression below is used to perform the joint hypothesis test on the other coefficients.
      
      ============================================================
      LS // Dependent Variable is WILDCT2                                   
      Date: 03/20/98   Time: 22:26                                          
      Sample: 1936 1987                                                     
      Included observations: 52                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           20900.92   2991.614   6.986503   0.0000          
            PRICEOK        1820.533   209.4608   8.691520   0.0000          
      ============================================================
      R-squared            0.601729    Mean dependent var 43583.69          
      Adjusted R-squared   0.593763    S.D. dependent var 16546.58          
      S.E. of regression   10546.25    Akaike info criter 18.56475          
      Sum squared resid    5.56E+09    Schwarz criterion  18.63980          
      Log likelihood      -554.4684    F-statistic        75.54251          
      Durbin-Watson stat   0.343020    Prob(F-statistic)  0.000000          
      ============================================================
      
      To test

      H0: b_OILCON = b_PCI = b_VEHICLE = 0
      H1: b_OILCON ≠ 0 and/or b_PCI ≠ 0 and/or b_VEHICLE ≠ 0


      F(q, n-k-1) = {[SSE_R - SSE_UR]/q} / {SSE_UR/(n-k-1)} =
      [(5.56E09 - 5.25E09)/3]/[5.25E09/(52-4-1)] = .9251

      F(3, 47; .05) = 2.80. Since .9251 < 2.80 we do not reject the null hypothesis. We do not have sufficient evidence to say that at least one of the three coefficients is not equal to 0. (The sums of squared residuals are printed to only three significant figures; the unrounded F-statistic is .9229, as reported in the Wald test below.) Note again (as in problem one) that the adjusted R-squared is higher when we restrict the coefficients of OILCON, PCI and VEHICLE to be 0.

      The equivalent Wald Test is:
      ====================================================
      Wald Test:                                                    
      Equation: Untitled                                            
      ====================================================
      Null Hypothesis: C(2)=0
                       C(3)=0
                       C(5)=0
      ====================================================
      F-statistic     0.922888    Probability     0.437107          
      Chi-square      2.768665    Probability     0.428684          
      ====================================================
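      Because the sums of squared residuals are printed here to only three significant figures, the hand computation is sensitive to rounding. When both models share the same dependent variable, the same F statistic can be written in terms of R-squared, which the output reports to six decimals. The sketch below compares the two routes; the function names are illustrative and every input is copied from the regression output.

```python
def f_from_sse(sse_r, sse_ur, q, n, k):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((sse_r - sse_ur) / q) / (sse_ur / (n - k - 1))


def f_from_r2(r2_r, r2_ur, q, n, k):
    """Equivalent form using R-squared (same dependent variable in both models)."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))


# n = 52 years, k = 4 slopes in the unrestricted model, q = 3 restrictions.
f_rounded = f_from_sse(5.56e9, 5.25e9, q=3, n=52, k=4)
f_precise = f_from_r2(0.601729, 0.623885, q=3, n=52, k=4)

print(f"from rounded SSE values: {f_rounded:.4f}")
print(f"from R-squared values:   {f_precise:.4f}")
```

      Either value is far below the 2.80 critical value, so the conclusion does not depend on the rounding.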
      
    2. The results of fitting the model described in the homework are shown below. The variables were all transformed by taking logs.

      ============================================================
      LS // Dependent Variable is LWILDCT2                                  
      Date: 03/20/98   Time: 22:43                                          
      Sample: 1936 1987                                                     
      Included observations: 52                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           11.71659   1.257868   9.314645   0.0000          
            LOILCON        0.564075   0.362768   1.554919   0.1267          
              LPCI        -0.586528   0.290325  -2.020249   0.0491          
            LPRICEOK       0.707562   0.159539   4.435029   0.0001          
            LVEHICLE      -0.258323   0.396029  -0.652283   0.5174          
      ============================================================
      R-squared            0.552816    Mean dependent var 10.62026          
      Adjusted R-squared   0.514758    S.D. dependent var 0.348383          
      S.E. of regression   0.242681    Akaike info criter -2.740802
      Sum squared resid    2.768024    Schwarz criterion -2.553183          
      Log likelihood       2.476056    F-statistic        14.52556          
      Durbin-Watson stat   0.336451    Prob(F-statistic)  0.000000          
      ============================================================
      
      In this model the significance levels of several coefficients have changed. The variables LPCI and LOILCON are now individually more significant than they were before. The coefficient of the vehicle variable changed sign between the two models, but it was not significantly different from 0 in either case. Thus it does not make sense to worry about this coefficient too much. This model seems to indicate that both an increase in price and an increase in oil consumption cause increases in the number of wells drilled in the United States each year. It also indicates that increases in per capita income cause decreases in the number of wells drilled.

      These results are not intuitive. The negative coefficient of PCI could perhaps be explained as follows: if per capita income increases, consumers care less about how expensive oil is and are more willing to buy from foreign competition. On the other hand, by this reasoning, it is hard to see why an increase in oil consumption would cause an increase in the number of wells drilled. Perhaps both of these variables would be better used in a model that predicts the price of oil (PRICEOK). The fit of the data to the model is actually better in part a (higher adjusted R-squared), where the only variable that has a significant coefficient is PRICEOK.
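      Since every variable in this model is in logs, each coefficient can be read as an elasticity: the approximate percentage change in wells drilled for a one percent change in the regressor, holding the others fixed. The sketch below applies that reading to the estimated coefficients; the function name and the 1% and 10% scenarios are hypothetical illustrations, not part of the assignment.

```python
# Estimated coefficients from the log-log regression above.
ELASTICITY = {
    "OILCON": 0.564075,
    "PCI": -0.586528,
    "PRICEOK": 0.707562,
    "VEHICLE": -0.258323,
}


def pct_change_in_wells(variable, pct_change):
    """Predicted % change in wells for a % change in one regressor, others fixed."""
    ratio = (1.0 + pct_change / 100.0) ** ELASTICITY[variable]
    return 100.0 * (ratio - 1.0)


price_effect = pct_change_in_wells("PRICEOK", 1.0)   # hypothetical 1% price rise
income_effect = pct_change_in_wells("PCI", 10.0)     # hypothetical 10% income rise

print(f"1% higher price:   {price_effect:+.2f}% wells")
print(f"10% higher income: {income_effect:+.2f}% wells")
```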

    1. Presidential vote appears to be related to the performance of the economy. Note that the sign on GNP is positive.

      ============================================================
      LS // Dependent Variable is PRSVOTE                                   
      Date: 03/20/98   Time: 23:01                                          
      Sample: 1 19                                                          
      Included observations: 19                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           50.18956   1.802018   27.85186   0.0000          
              GNP          0.830757   0.286533   2.899338   0.0100          
      ============================================================
      R-squared            0.330871    Mean dependent var 53.05306          
      Adjusted R-squared   0.291510    S.D. dependent var 7.805461          
      S.E. of regression   6.569999    Akaike info criter 3.864328          
      Sum squared resid    733.8031    Schwarz criterion  3.963743          
      Log likelihood      -61.67095    F-statistic        8.406161          
      Durbin-Watson stat   2.336690    Prob(F-statistic)  0.009977          
      ============================================================
      
      Adding military mobilization (MILMOB) adds little to the model. This is not surprising given the results of the first homework where we found that -- excluding the big outlier year of 1946 -- GNP and MILMOB were significantly related.

      ============================================================
      LS // Dependent Variable is PRSVOTE                                   
      Date: 03/20/98   Time: 23:04                                          
      Sample: 1 19                                                          
      Included observations: 19                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           50.20606   1.862087   26.96225   0.0000          
              GNP          0.820986   0.306912   2.674985   0.0166          
             MILMOB        0.438611   3.765610   0.116478   0.9087          
      ============================================================
      R-squared            0.331438    Mean dependent var 53.05306          
      Adjusted R-squared   0.247868    S.D. dependent var 7.805461          
      S.E. of regression   6.769331    Akaike info criter 3.968744          
      Sum squared resid    733.1814    Schwarz criterion  4.117866          
      Log likelihood      -61.66290    F-statistic        3.965978          
      Durbin-Watson stat   2.339763    Prob(F-statistic)  0.039915          
      ============================================================
      
      Now, introducing the lag of GNP into the model produces a rather surprising result: its sign is negative. This does not make sense, and the coefficient is not significant at conventional levels (p = 0.1890). In addition, the lagged period here is 4 years earlier (these are presidential elections) and that is probably too far back to expect the voters to remember.

      ============================================================
      LS // Dependent Variable is PRSVOTE                                   
      Date: 03/20/98   Time: 23:06                                          
      Sample(adjusted): 2 19                                                
      Included observations: 18 after adjusting endpoints                   
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           52.04742   2.224612   23.39617   0.0000          
              GNP          0.762056   0.316581   2.407146   0.0304          
            GNP(-1)       -0.414956   0.300494  -1.380913   0.1890          
             MILMOB        1.031935   3.769765   0.273740   0.7883          
      ============================================================
      R-squared            0.428678    Mean dependent var 53.13232          
      Adjusted R-squared   0.306251    S.D. dependent var 8.023883          
      S.E. of regression   6.683218    Akaike info criter 3.992329          
      Sum squared resid    625.3156    Schwarz criterion  4.190190          
      Log likelihood      -57.47186    F-statistic        3.501518          
      Durbin-Watson stat   2.532604    Prob(F-statistic)  0.044066          
      ============================================================
      
      Hence, among the models considered so far, the one we should pick to predict the presidential vote is the simplest one, using only GNP.

      Adding a dummy variable for the incumbent's party (REPUB) brings us to our preferred model:

      
      ============================================================
      LS // Dependent Variable is PRSVOTE                                   
      Date: 03/20/98   Time: 23:08                                          
      Sample: 1 19                                                          
      Included observations: 19                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           45.55513   2.154329   21.14586   0.0000          
              GNP          1.088018   0.252156   4.314866   0.0005          
             REPUB         7.911778   2.656529   2.978239   0.0089          
      ============================================================
      R-squared            0.569517    Mean dependent var 53.05306          
      Adjusted R-squared   0.515707    S.D. dependent var 7.805461          
      S.E. of regression   5.431912    Akaike info criter 3.528521          
      Sum squared resid    472.0906    Schwarz criterion  3.677643          
      Log likelihood      -57.48079    F-statistic        10.58379          
      Durbin-Watson stat   2.459840    Prob(F-statistic)  0.001179          
      ============================================================
      
      Note the large and significant coefficient on the REPUB dummy variable, indicating that when the incumbent president is a Republican the incumbent party can expect 7.91 percentage points more of the vote than when the incumbent president is a Democrat. The coefficient of GNP is still positive and significant as well. The dummy variable REPUB should stay in the model because it tells us to add 7.91 percentage points to any prediction about presidential vote when the incumbent is a Republican and 0 when the incumbent president is a Democrat. Another way to say this is that the coefficient of the REPUB dummy variable estimates the expected difference in vote share between a Republican and a Democratic incumbent when all other factors (here only GNP) are held fixed.
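      The role of the dummy variable can be seen by evaluating the fitted equation for both values of REPUB. In the sketch below the coefficients are copied from the preferred model, while the function name and the GNP growth value of 2.0 are hypothetical choices for illustration.

```python
def predict_prsvote(gnp_growth, republican_incumbent):
    """Fitted presidential vote share for the incumbent party (percent)."""
    repub = 1 if republican_incumbent else 0
    return 45.55513 + 1.088018 * gnp_growth + 7.911778 * repub


gnp = 2.0  # hypothetical GNP growth, same for both scenarios
democrat = predict_prsvote(gnp, republican_incumbent=False)
republican = predict_prsvote(gnp, republican_incumbent=True)

print(f"Democratic incumbent: {democrat:.2f}")
print(f"Republican incumbent: {republican:.2f}")
print(f"Difference:           {republican - democrat:.2f}")
```

      The difference is the REPUB coefficient whatever value of GNP is used, because the dummy shifts the intercept and leaves the slope on GNP unchanged.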

    2. The two models whose estimation was requested are shown below:

      ============================================================
      LS // Dependent Variable is HOUSVOTE                                  
      Date: 03/20/98   Time: 23:10                                          
      Sample: 1 19                                                          
      Included observations: 19                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           24.80678   7.923629   3.130735   0.0065          
              GNP          0.089298   0.225553   0.395908   0.6974          
            PRSVOTE        0.467304   0.156172   2.992236   0.0086          
      ============================================================
      R-squared            0.494500    Mean dependent var 49.90647          
      Adjusted R-squared   0.431312    S.D. dependent var 5.609913          
      S.E. of regression   4.230514    Akaike info criter 3.028586          
      Sum squared resid    286.3560    Schwarz criterion  3.177708          
      Log likelihood      -52.73140    F-statistic        7.825901          
      Durbin-Watson stat   1.845674    Prob(F-statistic)  0.004264          
      ============================================================
      
      ============================================================
      LS // Dependent Variable is HOUSVOTE                                  
      Date: 03/20/98   Time: 23:10                                          
      Sample: 1 19                                                          
      Included observations: 19                                             
      ============================================================
            Variable     Coefficient Std. Error t-Statistic  Prob.
      ============================================================
               C           11.51443   5.967311   1.929584   0.0728          
              GNP         -0.465363   0.190953  -2.437049   0.0277          
            PRSVOTE        0.824670   0.128708   6.407271   0.0000          
             REPUB        -7.927619   1.705135  -4.649262   0.0003          
      ============================================================
      R-squared            0.792916    Mean dependent var 49.90647          
      Adjusted R-squared   0.751499    S.D. dependent var 5.609913          
      S.E. of regression   2.796532    Akaike info criter 2.241424          
      Sum squared resid    117.3089    Schwarz criterion  2.440253          
      Log likelihood      -44.25336    F-statistic        19.14481          
      Durbin-Watson stat   2.238649    Prob(F-statistic)  0.000022          
      ============================================================
      
      The estimated models are:

      HOUSVOTE = 24.80678 + .089298*GNP + .467304*PRSVOTE

      and

      HOUSVOTE = 11.51443 - .465363*GNP + .82467*PRSVOTE - 7.927619*REPUB

      The results of these two models are hard to interpret. In the first model the coefficient of GNP is positive but not significantly different from 0. In the second model the coefficient of GNP is negative and significantly different from 0. According to the first model, the House of Representatives vote for the incumbent's party goes up by .467304 percentage points for each one-point increase in the presidential vote for the incumbent party's presidential candidate.

      The increase in the House vote due to a unit change in presidential vote is even greater in the second model. This is because we are controlling for two factors which negatively influence the House vote as they increase. The fact that the incumbent is a Republican lowers the House vote by 7.9 percentage points. For each unit increase in GNP growth, the House vote for the incumbent's party decreases by .465363 points. This last statement does not make sense if one believes that a higher GNP implies that the voters are more likely to vote for the incumbent. One way that this might be explained is that the predictive power of the variable PRSVOTE on HOUSVOTE could be much greater than that of GNP, and in this model GNP is in essence a correction term.

      Note that in the first model GNP does not influence the House vote (large p-value) whereas it does in the second model. The R-squared is much higher in the second model with the REPUB dummy variable, indicating that it provides much of the punch of the model. Because GNP was not significant in the first model and it was in the second, the relationship between GNP and REPUB should be investigated.

      Below is the correlation matrix for the variables used in the two models.

                           Correlation Matrix
      ============================================================
                    HOUSVOTE      GNP       PRSVOTE      REPUB              
      ============================================================
        HOUSVOTE    1.000000    0.460028    0.699677   -0.270831            
          GNP       0.460028    1.000000    0.575214   -0.342567            
        PRSVOTE     0.699677    0.575214    1.000000    0.261906            
         REPUB     -0.270831   -0.342567    0.261906    1.000000            
      ============================================================
      
      The variables REPUB and GNP are negatively correlated. Thus when REPUB is added to the model, the variable GNP becomes a better predictor. However, we should be very suspicious about the role GNP plays in this model. The coefficient on GNP is negative -- which is counter-intuitive -- and it is not significant when used by itself in the absence of REPUB.
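      The correlation matrix alone is enough to recover the R-squared and the standardized coefficients of the first House model (HOUSVOTE on GNP and PRSVOTE), which shows how little GNP contributes once PRSVOTE is included. The sketch below uses the standard two-regressor formulas; the variable names are illustrative and the correlations are copied from the matrix above.

```python
# Pairwise correlations from the matrix above.
r_house_gnp = 0.460028
r_house_prs = 0.699677
r_gnp_prs = 0.575214

denom = 1.0 - r_gnp_prs ** 2

# Standardized (beta) coefficients for HOUSVOTE on GNP and PRSVOTE.
beta_gnp = (r_house_gnp - r_house_prs * r_gnp_prs) / denom
beta_prs = (r_house_prs - r_house_gnp * r_gnp_prs) / denom

# R-squared of the two-regressor model.
r2_house = beta_gnp * r_house_gnp + beta_prs * r_house_prs

print(f"standardized coefficient, GNP:     {beta_gnp:.4f}")
print(f"standardized coefficient, PRSVOTE: {beta_prs:.4f}")
print(f"R-squared:                         {r2_house:.6f}")
```

      The R-squared agrees with the 0.494500 reported for the first model. The standardized coefficient on GNP is small even though its simple correlation with HOUSVOTE is 0.46, because most of that correlation is shared with PRSVOTE.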