This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

Tenth Assignment
Due 12 November 2001

1. This problem continues our analysis of the 105th and 106th congressional district data from homeworks 3, 4, 5, 6, and 8. I made some additional corrections to the file, so download the new Stata file below:

105th and 106th Congressional District Data (HDMG106.DTA)

2000 Census Data For Congressional Districts (CD2000CENSUS.XLS)

Merge the census data in CD2000CENSUS.XLS into HDMG106.DTA. (Note that this will require some ingenuity on your part!) Use the following variable names and definitions (the variable district also appears in the Excel file, but it is one of the variables you will need for the merge itself):
```
statenmlong     str20  %20s                   Name of state (long)
total_pop       double %10.0g                 Total population of CD 2000
white00         float  %9.0g                  Percent White 2000
black00         float  %9.0g                  Percent Black 2000
asian00         float  %9.0g                  Percent Asian 2000
hispanic00      float  %9.0g                  Percent Hispanic 2000
owner00         float  %9.0g                  Percent Owner-Occupied Housing Units 2000
```
Run the d (describe) and summ (summarize) commands on the merged file and report the results.
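
One possible route for the merge is sketched below. It rests on several assumptions: a Stata release that has import excel (older releases need the spreadsheet saved as text and read with insheet), column names in the first row of the spreadsheet, and a variable statenmlong already built in HDMG106.DTA from whatever state identifier that file carries. That last step is not shown and is likely where the ingenuity comes in. The file name cd2000census is hypothetical.

```stata
* Sketch only: the merge key and cd2000census.dta are assumptions.
* Read the spreadsheet, taking variable names from its first row.
import excel using "CD2000CENSUS.XLS", firstrow clear

* Confirm that each state-district pair appears only once;
* isid stops with an error if the pair is not a unique key.
isid statenmlong district

* Save the census data as a Stata dataset so it can be merged.
save cd2000census, replace

* Load the district file and attach the census variables.
* m:1 allows several rows per district in the master file
* but requires unique keys in the using file.
use "HDMG106.DTA", clear
merge m:1 statenmlong district using cd2000census

* _merge==3 marks matched rows; inspect anything else before dropping it.
tabulate _merge

describe
summarize
```

Tabulating _merge before going further makes unmatched districts visible, which usually points to state names or district numbers that are coded differently in the two files.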

2. In Stata, run the regressions:

regress bush00 black00 south hispanic00 income owner00 dwnom1n dwnom2n dole96
regress gore00 black00 south hispanic00 income owner00 dwnom1n dwnom2n clint96

Analyze these two regressions. What do you think accounts for the differences between them? Be specific.
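
To compare the two fits, it can help to print them in adjacent columns. The sketch below assumes a Stata release that provides estimates store and estimates table; the names bush and gore are arbitrary labels.

```stata
* Fit each model and keep its results under a name.
regress bush00 black00 south hispanic00 income owner00 dwnom1n dwnom2n dole96
estimates store bush

regress gore00 black00 south hispanic00 income owner00 dwnom1n dwnom2n clint96
estimates store gore

* Print coefficients and standard errors in adjacent columns,
* with the number of observations and R-squared underneath.
estimates table bush gore, b(%9.4f) se(%9.4f) stats(N r2)
```

Because dole96 appears only in the first model and clint96 only in the second, each of those rows will show an entry in just one column.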

3. In Stata, compute the correlation matrix for the independent variables:

correlate black00 hispanic00 income owner00 dwnom1n dwnom2n

Examine the entries of the correlation matrix. Do you see anything that strikes you as odd? Be explicit.

4. To obtain the eigenvalues and eigenvectors of the correlation matrix, use the Stata command:

factor black00 hispanic00 income owner00 dwnom1n dwnom2n, pc

To obtain a graph of the eigenvalues, use the Stata command:

greigen, xlabel(1,2,3,4,5,6)

Does this graph lead you to believe that there is a significant multicollinearity problem with these independent variables? Why or why not?
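
In recent Stata releases the same eigenvalues and graph can be obtained with pca and screeplot. The sketch below assumes such a release. It also computes one common summary, the square root of the ratio of the largest to the smallest eigenvalue, and the variance inflation factors from one of the earlier regressions. Neither diagnostic is required by the assignment.

```stata
* Principal components of the correlation matrix (pca's default).
pca black00 hispanic00 income owner00 dwnom1n dwnom2n

* Plot the eigenvalues in descending order.
screeplot

* Condition number: square root of largest over smallest eigenvalue.
* e(Ev) is the row vector of eigenvalues that pca leaves behind.
matrix Ev = e(Ev)
local k = colsof(Ev)
display "condition number = " sqrt(Ev[1,1] / Ev[1,`k'])

* Variance inflation factors from a regression are a complementary check.
regress bush00 black00 south hispanic00 income owner00 dwnom1n dwnom2n dole96
estat vif
```

A smallest eigenvalue close to zero means that some linear combination of the variables is nearly constant. As a rough rule of thumb, a condition number above about 30 is often read as a sign of serious collinearity.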

2. This problem deals with congressional elections. Below you will find a dataset that includes variables created by David Lublin and Gary Jacobson. The observations are congressional districts for the 1960 to 1994 period. Some of the data are missing, so your regressions may not cover the entire time period. To load the dataset in Stata you will have to increase the default memory size. To do this, use the command:

set mem 20m

which allocates 20 megabytes of memory for Stata to work with.

Congressional Elections Data From Lublin and Jacobson (Stata Dataset)

Download the dataset and bring it up in Stata. If you issue the d command you will see:
```
. d

obs:         7,832
vars:            39                          1 Nov 2001 11:35
size:     1,057,320 (98.7% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
year            int    %8.0g                  year
congress        byte   %8.0g                  congress (87-104)
icpsrid         long   %12.0g                 icpsr id #
icpsrst         byte   %8.0g                  icpsr state code
cdist1          byte   %8.0g                  cong. district (p&r)
statenm         str7   %9s                    state name
cdist2          byte   %8.0g                  cong. district (lublin)
dempct          float  %9.0g                  demo. % two party vote
blkpct          float  %9.0g                  black percent of pop.
whpct           float  %9.0g                  white percent of pop.
forpct          double %10.0g                 foreign born % of pop.
south           byte   %8.0g                  south (1=confederacy + KY +OK,
                                                0=north)
incomewh        float  %9.0g                  white median family income
incomebl        long   %12.0g                 black median family income
hs25            float  %9.0g                  percent 25 and older completing
                                                high school or more
college         float  %9.0g                  percent 25 or older completed 4
                                                yrs college or more
party1          int    %8.0g                  party code (100=Dem, 200=Rep)
blackrep        byte   %8.0g                  blackrep =1 if black
                                                representative, 0 otherwise
latinorp        byte   %8.0g                  latinorp=1 if mexican, 2=PR,
                                                3=Cuban, 0 otherwise
womanrep        byte   %8.0g                  woman representative (1=woman,
                                                0=man)
incumb1         byte   %8.0g                  incumbency (0=repub, 1=demo.,
                                                2=open)
demvshr         float  %9.0g                  democrats share two-party vote
whowon          byte   %8.0g                  0 = repub won, 1= demo. won,
                                                99=3rd party won
incshr          float  %9.0g                  incumbents share 2-party vote,
                                                99.9=unopposed
incshrl         float  %9.0g                  incumbents share 2-party vote
                                                last elect, 99.9=unpposed
redist          byte   %8.0g                  redistricted: 0=district
                                                unchange, 1=re-districting
incumbst        byte   %8.0g                  incumbency status:
                                                0 = republican incumbent
                                                1 = democratic incumbent
                                                2 = open seat formerly held by democrat
                                                3 = open seat formerly held by republican
                                                4 = open seat, new (from redistricting)
                                                5 = two incumbents (from redistricting)
                                                9 = third-party incumbent
challeng        byte   %8.0g                  challenger quality
                                                0 = challenger has not held elective office
                                                1 = challenger has held elective office
                                                2 = only Democratic candidate for open seat has held office
                                                3 = only Republican candidate for open seat has held office
                                                4 = both candidates for open seat have held office
                                                5 = no challenger
                                                6 = no Democrat candidate (open)
                                                7 = no Republican candidate (open)
challenh        byte   %8.0g                  challenger misc. information
                                                0 = Nothing special  (ignore)
                                                1 = At Large or multi-candidate race
                                                2 = unopposed
                                                3 = incumbent switched parties since last election
                                                4 = challenger was state legislator
                                                5 = only Democrat was state legislator (open seat)
                                                6 = only Republican was state legislator (open seat)
                                                7 = both candidates for open seat were state legislators
                                                8 = challenger is former U.S. Representative
                                                9 = odd race, third party; in general, DO NOT USE
icpsrid2        long   %12.0g                 icpsr id number
party2          int    %8.0g                  party id (100=Dem, 200=Repub)
name            str11  %11s                   member name
dwnom1          float  %9.0g                  dwnominate 1st dimension
dwnom2          float  %9.0g                  dwnominate 2nd dimension
                                                (multiply by .3)
partynm         str13  %13s                   name of political party
xincome         long   %12.0g                 median family income
xhispct         float  %9.0g                  percent hispanic
-------------------------------------------------------------------------------
Sorted by:
```
If you issue the summ command you will see:
```
. summ

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
        year |    7832    1976.996   10.37915       1960       1994
    congress |    7832    95.49783   5.189574         87        104
     icpsrid |    7832     12325.2   7208.363          2      95120
     icpsrst |    7832    36.75447   21.00158          1         82
      cdist1 |    7832    9.979443   10.88324          1         99
     statenm |       0
      cdist2 |    7832    9.566394   9.151734          1         52
      dempct |    7832    56.98605   23.56704          0        100
      blkpct |    7595    11.12508   14.51647   .0194025   95.50033
       whpct |    7595    85.85531   15.74295   3.862633   99.89686
      forpct |    6723    5.732316    6.60551    .116483   58.52188
       south |    7832    .2858784   .4518606          0          1
    incomewh |    5859    17896.74   11820.14   2088.375      78717
    incomebl |    5856    12378.12   8885.767       1213      66320
        hs25 |    6723    57.11782    15.5113       14.8       92.3
     college |    6723    12.88006   7.004294        1.9       51.4
      party1 |    7830    140.3649   49.22029        100        329
    blackrep |    7832    .0390705   .1937751          0          1
    latinorp |    7832     .020429   .1716504          0          3
    womanrep |    7832     .046859   .2113504          0          1
     incumb1 |    7817    1.242548   .6364613          0          3
      votesd |    6382    83320.55   53081.79          0    1872351
      votesr |    6443    71985.16   57503.79          0    1786018
     demvshr |    7832    57.00632   23.59968          0        100
      whowon |    7832     .601762    .525787          0          9
      incshr |    7832    71.56447   18.54318       20.6     99.999
     incshrl |    7832    69.86711   17.08056       22.1     99.999
      redist |    7832    .2893258   .4534784          0          1
    incumbst |    7832    .8476762   .8639607          0          9
    challeng |    7832    1.135981   1.803747          0          9
    challenh |    7832    1.124362   2.017193          0          9
    icpsrid2 |    7832     12325.2   7208.363          2      95120
      party2 |    7832    140.4164   49.26602        100        329
        name |       0
      dwnom1 |    7832   -.0354424   .3335639      -1.07       1.37
      dwnom2 |    7832    .0107231   .5186352      -1.83       1.43
     partynm |       0
     xincome |    6723    15494.69   10600.03       1968      64199
     xhispct |    4780    6.610872   11.38954   .0137409   83.71677
       order |    7832      3916.5   2261.048          1       7832
```
1. Your assignment is to build a model of the Democratic Vote Share. That is, use demvshr as your dependent variable (note, do not use dempct -- it has some errors in it!). You are free to use any independent variables you want but you must include median family income (xincome) in your specification. Whatever other independent variables you use, you must have a reasonable explanation for your specification!

2. Note that xincome is in nominal dollars! To see the distribution of xincome use the graph command in Stata; namely:

graph xincome congress

To correct xincome, as well as incomewh and incomebl, multiply each variable by a price deflator. For congress 88 - 91 use 100/90.6, for 93 - 97 use 100/125.3, for 98 - 102 use 100/289.1, and for 103 - 104 use 100/420.3. These transformations convert the income variables to 1967 dollars.
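
The deflation step can be sketched as follows. The variable deflator and the r-prefixed names are hypothetical, and the scatter command assumes a recent Stata release (the older graph syntax shown above serves the same purpose).

```stata
* Sketch: build the deflator from the congress ranges given in the text.
* Congresses outside those ranges (87 and 92) are left missing
* because the text assigns them no deflator.
generate double deflator = .
replace deflator = 100/90.6  if inrange(congress, 88, 91)
replace deflator = 100/125.3 if inrange(congress, 93, 97)
replace deflator = 100/289.1 if inrange(congress, 98, 102)
replace deflator = 100/420.3 if inrange(congress, 103, 104)

* Convert each nominal income variable to 1967 dollars.
* The r-prefixed names are hypothetical.
foreach v of varlist xincome incomewh incomebl {
    generate double r`v' = `v' * deflator
}

* Check: real income should no longer jump between census periods.
scatter rxincome congress
```

For instance, a nominal median income of 10,000 in the 100th Congress becomes 10,000 × 100/289.1, or about 3,459 in 1967 dollars.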

3. When you have settled on your specification and finished your analysis in Stata, paste the variables you used into EViews and replicate your analysis there.