Measurement Theory UCSD Spring 2007: Sixth Assignment Page

LEGACY CONTENT. If you are looking for Voteview.com, PLEASE CLICK HERE

This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

POLI 279 MEASUREMENT THEORY
Sixth Assignment
Due 23 May 2006

In this problem we are going to use MCMCPack developed by Andrew Martin and Kevin Quinn to do a one-dimensional parametric unfolding of the 104^th U.S. Senate. Download the two R programs below along with the Stata file for the 104^th Senate roll calls and put them in the same directory:

keith2.r -- Program to create R dataframe from Stata file

keithMCMC.r -- Program to run Bayesian MCMC using MCMCPack

Sen104kh.dta

Stata

^th

Sen104kh.ord

ASCII

^th

Here is what keith2.r looks like:

#
#
#    keith2.r -- This Program was written by Kevin Quinn and modfied by
#                 Keith Poole to run on WINDOZE machines
#
rm(list=ls(all=TRUE))
#
library(foreign)              Be sure to load this library
#
## does the 104th senate
## (assumes the relevant .dta files are in the current directory)
#
## write out the filehead (only need to do that here)
filehead <- "c:/ucsd_homework_5/sen104kh"   Change Path Statements as Appropriate
filehead2 <- "sen104kh"                     This line needs to be changed as well
#
## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE
## #######################################################
#
## read data in
filestring <- paste(filehead, ".dta", sep="")  Creates "c:/ucsd_homework_5/sen104kh.dta"
mydata <- read.dta(filestring)                 read.dta is in the foreign library
#
## add rownames corresponding to legislators' names
rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column 
#                                                in the Stata file and deletes all spaces  
## recode votes to 0/1/NA                        rownames uses the "name" variable as labels for the rows of the matrix 
## 1 yea, 0 nay, NA missing
ncdata <- ncol(mydata)                         ncol gets the number of columns of the matrix 
submat <- mydata[,10:ncdata]                   Note that the first roll call is the 10^th variable in Stata
submat[submat==0] <- -999
submat[submat==7] <- -999
submat[submat==8] <- -999
submat[submat==9] <- -999
submat[submat==4] <- 0
submat[submat==5] <- 0
submat[submat==6] <- 0
submat[submat==2] <- 1
submat[submat==3] <- 1
submat[submat==-999] <- NA
mydata[,10:ncdata] <- submat
#
## change name of mydata to whatever filehead is    parse creates the character string "sen104kh <- mydata"
expr <- parse(text=paste(filehead2, " <- mydata"))    using the paste command to concatenate the strings
eval(expr)                                          expr (expression) creates an unevaluated call and eval(expression) 
#                                                      evaluates the unevaluated expression.  Note that this is a clever way
## save as an .rda file                                 of passing R a command that would normally be entered from the command line     
expr <- parse(text=paste("save(", filehead2, ", file=\"",
                filehead, ".rda\", ascii=TRUE)", sep="")) This produces 
eval(expr)                                         expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE))
                                                    and eval(expr) executes the save(...) command
                                                    Note that the \" says use " as part of this string

When you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.

keithMCMC.r actually runs the Bayesian MCMC analysis of the roll call matrix created by keith2.r.

#
#
#  keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack.
#                    Program written by Kevin Quinn and modified by Keith Poole
#
library(MCMCpack)
#
load("c:/ucsd_homework_5/sen104kh.rda")
#
Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)]    Pull in the roll calls
#
posterior <- MCMCirt1d(Sen.rollcalls,            Here is the call to the One-Dimensional IRT model in MCMCPack
                       theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2),  Note the Kennedy-Helms Constraint
                       burnin=2000, mcmc=100000, thin=20, verbose=500)      The Verbose switch writes to the screen every 500 iterations
geweke.diag(posterior)        geweke.diag is a convergence diagnostic for Markov Chains
pdf(file="c:/ucsd_homework_5/keithplots3.pdf")  This opens a PDF file for writing plots
plot(posterior)    Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator
summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain
sss <- summary(posterior)
write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt")  This just writes the statistics to a file
write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt")   This just writes the quantiles to a file
dev.off()          This turns off the PDF output file
plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain Matrix

Run keithMCMC.r and you should see something like this:

The quantiles file of the chains should look something like this:

The means and standard deviations should look something like this:

We need to get the Bayesian MCMC senator medians out of the file tab2.txt. To do this we are going to use a keyboard macro written as a text file like the one we used in Homework 4 Question 1. Using Epsilon bring up tab2.txt in a window and then split the screen (C-X2) and set it up so that you have tab2.txt in the top screen, sen104kh.ord the next window down, a blank file -- sen104MCMC.txt (you can call it something else if you want) -- and our macro, macro_hw5.txt (shown blank). Be sure to position the cursor at the top of each file except tab2.txt where it should be positioned in front of the record for President Clinton. It should look like this:

Here is the first part of the macro and what it looks like executed. The nice thing about constructing a complex macro is that you can do it piecemeal and watch what happens as long as you do not overwrite your files!!! Note that in the macro SPACE means Space Bar, and LEFT, RIGHT, UP, and DOWN refer to the corresponding Arrow Keys. Also note that the macro is written in two lines. Epsilon interprets this as a C-M, a Line Feed or an Enter. This is handy because your macro does not have to be one long line!

Here is what the completed macro looks like executed one time:

And here it is at the finish:

Here is the macro itself:

macro_hw5.txt

Save all of the above files and turn in a listing of your sen104MCMC.txt file.
sen104MCMC.txt is still not in a form that we can use because note that the decimal points in the ideal point medians are not lined up! This needs to be fixed. To begin, use Epsilon to change Kennedy's coordinate from -2 to -2.00000 and Helms' coordinate from 2 to 2.00000 (the number of zeroes is not crucial). Now, as in the previous question, bring sen104MCMC.txt up in Epsilon, split the window twice, and create the dummy file fixit.txt with a window at the bottom for another Macro. It should look like this:

The strategy here is to cut the coordinate out of sen104MCMC.txt, take it down to the fixit.txt window below and process it. The "trick" is to add some spaces at the beginning of the number, back up from the decimal point a set number of spaces and break the line. Note that this sets the decimal point a fixed number of spaces to the right. The number is then put into the kill buffer and returned to sen104MCMC.txt. This produces coordinates that are lined up.

This is what it looks like mid-way through the macro. Note that I added some spaces at the end of the line in sen104MCMC.txt so that I have plenty of room when I return to the file with the adjusted number.

Here it is at the end. Note that the coordinate medians are lined up appropriately.

Here is the macro itself:

macro_hw5_1.txt

One last thing needs to be done. Note that R puts lots of digits after the decimal point! Write a simple keyboard macro to limit the number of digits after the decimal point to 5. Turn in a listing of that macro and the final version of your sen104MCMC.txt file. (Note that you will need this file for the next problem!)
Perform a test to see if Ben Nighthorse Campbell (CO) significantly changed his voting behavior after he switched parties from Democrat to Republican. In particular, note that Campbell (D-CO) is column 13 of posterior and Campbell (R-CO) is column 14 of posterior. To do the test, compute the difference between the two chains and use the summary command to get the statistics. In particular, in R issue the three commands:

campbelldiff <- posterior[,14]-posterior[,13]
summary(campbelldiff)
plot(campbelldiff)

Turn in the results of the summary command and the plot.

From a statistical standpoint what do these results show?

The aim of this problem is to compare the rank-ordering from my Optimal Classification (OC) program with the ordering of the medians from MCMCPack. Download the program and the "control card" file:

PERFL_2006.EXE -- Maximum Classification Scaling Program

PERFSTRT_2006.DAT -- Control Card File

SEN104KH.ORD -- 104^th Senate Roll Call Data

and place them in the same folder on a WINDOZE machine. Recall that PERFSTRT_2006.DAT was used in homework 5 problem 1 to scale the 108^th Senate so you will have to use Epsilon to change the values in PERFSTRT_2006.DAT so it looks like this:

SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
    1  919   20   36    1    2   20 0.005    5
(36A1,12000I1)
(I5,1X,36A1,2I5,50F8.3)

The first line is the roll call data file, the second line is the title, the third line, in order, is the number of dimensions, the number of roll calls, the number of iterations (leave as is), the number of "characters" to read off the front of each legislator's record (leave as is), the record number of a Senator that you believe is on the "Left" (this just sets "liberal" on the left -- lower ranks or negative values on the first dimension if the scaling is in two or more dimensions -- and "conservative" on the right), and the record number of a Senator that you believe is on "Up" or on positive side of the second dimension (in the U.S. context, these are the Southern Democrats in an earlier era). The last line is the format for the coordinate output file. Please study the examples on the Optimal Classification Program page because we will be using this program several more times!

At the command line type

PERFL_2006

The program should only take a few seconds to run and it produces three output files -- PERF21.DAT, PERF23.DAT, and PERF25.DAT. PERF21.DAT should look something like this:


 09 MAY       2007  13.38.10.53.
 RANDOM NUMBER SEED     62600
SEN104KH.ORD                                          
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
    1  919   20   36    1    2   20 0.005    5
(36A1,25000I1)                                                  
(I5,1X,36A1,2I5,50F8.3)                                         
 ******************************************************************************
  1 ROLL CALLS   1    8178   81250  0.10065  0.89935  0.71583
  2 LEGISLATORS  1    8096   81250  0.09964  0.90036  0.71868  0.99848
  3 ROLL CALLS   1    8025   81250  0.09877  0.90123  0.72115
  4 LEGISLATORS  1    8016   81250  0.09866  0.90134  0.72146  0.99973
  5 ROLL CALLS   1    7989   81250  0.09833  0.90167  0.72240
  6 LEGISLATORS  1    7988   81250  0.09831  0.90169  0.72244  0.99991
  7 ROLL CALLS   1    7981   81250  0.09823  0.90177  0.72268
  8 LEGISLATORS  1    7981   81250  0.09823  0.90177  0.72268  0.99992
  9 ROLL CALLS   1    7970   81250  0.09809  0.90191  0.72306
 10 LEGISLATORS  1    7970   81250  0.09809  0.90191  0.72306  0.99999
 SHARPEN         2    7957   81250
 SHARPEN         3    7921   81250
 SHARPEN         4    7886   81250
 SHARPEN         5    7875   81250
 L-PERMUTATIONS  5    7875   81250  0.09692  0.90308  0.72636
                      13.38.10.54.
 ELAPSED TIME OF JOB  13.44.37.32.

The seventh column reports the correct classification. For the 104^th Senate the algorithm quickly converges to 0.90308 (90.3%).

Turn in a copy of this file.

The file PERF25.DAT contains the output coordinates for the Senators and the roll calls. In one dimension, this is reported as a rank ordering. The first part of the file shows the Senators in rank order and then in the order that they appear in the roll call file with their ranking:


 09 MAY       2007  13.38.10.53.
    1 1041427521 0ILLINOI 10000SIMON         88  818   0.892   1.000
    2 1044930925 0WISCONS 10000FEINGOLD     109  827   0.868   2.000
    3 1044910133 0MINNESO 10000WELLSTONE     54  826   0.935   3.000
    4 1041487172 0OREGON  10000WYDEN         24  264   0.909   4.000
    5 1041501171 0CALIFOR 10000BOXER         56  813   0.931   5.000
    6 10410808 3 0MASSACH 10000KENNEDY, ED   53  806   0.934   6.000
    7 1041470923 0MICHIGA 10000LEVIN, CARL   57  826   0.931   7.000
    8 1041470212 0NEW JER 10000BRADLEY       98  757   0.871   8.000
    9 1041491412 0NEW JER 10000LAUTENBERG    70  821   0.915   9.000
   10 10414307 6 0VERMONT 10000LEAHY         71  807   0.912  10.000
                 etc.
                 etc.
                 etc.
  100 1042936733 0MINNESO 20000GRAMS         43  816   0.947  99.500
  101 1044930447 0NORTH C 20000FAIRCLOTH     45  800   0.944 101.000
  102 1041480362 0COLORAD 20000BROWN, HANK   81  828   0.902 102.000
  103 1041503961 0ARIZONA 20000MCCAIN        86  801   0.893 103.000
  104 1041542961 0ARIZONA 20000KYL           40  818   0.951 104.000

Note that below the legislators are the rank positions for the roll call cutting points. Write an Epsilon macro to combine the rank ordering of the Senators with the Bayesian MCMC coordinates (the median from the quantiles file). Turn in the macro and a printout of the combined file. Leave the header on the combined file.

Write an R program that graphs the rank ordering (horizontal axis) against the Bayesian MCMC medians (vertical axis). Label the axes appropriately and label a few of the Senators including Campbell (D-CO) and Campbell (R-CO).
Report the correlation between the OC rank ordering and the Bayesian MCMC medians.
Use Epsilon to cut the listing of the eigenvalues out of the PERF23.DAT file and make a skree plot of the eigenvalues with the R program we discussed in class:

Use Epsilon to change the number of dimensions from 1 to 2 in PERFSTRT_2006.DAT and run PERFL_2006 again (be sure to save the output files from the one-dimensional run first!!!! -- they will be overwritten by the new ones!). PERF21.DAT should look something like this:


 09 MAY       2007  14.14.01.93.
 RANDOM NUMBER SEED     35000
SEN104KH.ORD                                          
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
    2  919   20   36    1    2   20 0.005    5
(36A1,25000I1)                                                  
(I5,1X,36A1,2I5,50F8.3)                                         
 ******************************************************************************
  1 ROLL CALLS   2    7240   81250  0.08911  0.91089  0.74843           0.00000
    LEGISLATORS  2    7199   81250  0.08860  0.91140  0.74985  0.00000
  2 ROLL CALLS   2    7099   81250  0.08737  0.91263  0.75333           0.99610
    LEGISLATORS  2    7087   81250  0.08722  0.91278  0.75374  0.99859
  3 ROLL CALLS   2    7065   81250  0.08695  0.91305  0.75451           0.99800
    LEGISLATORS  2    7061   81250  0.08690  0.91310  0.75465  0.99981
  4 ROLL CALLS   2    7050   81250  0.08677  0.91323  0.75503           0.99886
    LEGISLATORS  2    7046   81250  0.08672  0.91328  0.75517  0.99995
  5 ROLL CALLS   2    7039   81250  0.08663  0.91337  0.75541           0.99927
    LEGISLATORS  2    7036   81250  0.08660  0.91340  0.75552  0.99992
  6 ROLL CALLS   2    7033   81250  0.08656  0.91344  0.75562           0.99848
    LEGISLATORS  2    7031   81250  0.08654  0.91346  0.75569  0.99998
  7 ROLL CALLS   2    7025   81250  0.08646  0.91354  0.75590           0.99952
    LEGISLATORS  2    7024   81250  0.08645  0.91355  0.75593  0.99999
  8 ROLL CALLS   2    7021   81250  0.08641  0.91359  0.75604           0.99960
    LEGISLATORS  2    7020   81250  0.08640  0.91360  0.75607  1.00000
  9 ROLL CALLS   2    7020   81250  0.08640  0.91360  0.75607           0.99938
    LEGISLATORS  2    7020   81250  0.08640  0.91360  0.75607  0.99999
 10 ROLL CALLS   2    7019   81250  0.08639  0.91361  0.75611           0.99950
    LEGISLATORS  2    7016   81250  0.08635  0.91365  0.75621  0.99810
 11 ROLL CALLS   2    6994   81250  0.08608  0.91392  0.75698           0.99906
    LEGISLATORS  2    6994   81250  0.08608  0.91392  0.75698  1.00000
 12 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99927
    LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
 13 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99946
    LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
 14 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99961
    LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
 15 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99921
    LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
 16 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99998
    LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
 17 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99961
    LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
 18 ROLL CALLS   2    6989   81250  0.08602  0.91398  0.75715           0.99960
    LEGISLATORS  2    6989   81250  0.08602  0.91398  0.75715  1.00000
 19 ROLL CALLS   2    6988   81250  0.08601  0.91399  0.75718           0.99996
    LEGISLATORS  2    6988   81250  0.08601  0.91399  0.75718  1.00000
 20 ROLL CALLS   2    6988   81250  0.08601  0.91399  0.75718           0.99998
    LEGISLATORS  2    6988   81250  0.08601  0.91399  0.75718  1.00000
 MEAN VOLUME LEG.   0.0026   0.0032
 MACHINE PREC.   2    6988   81250  0.08601  0.91399  0.75718
 MACHINE PREC.   2    6988   81250  0.08601  0.91399  0.75718
                      14.14.01.93.
 ELAPSED TIME OF JOB  14.15.35.76.

The correct classification is now 91.399% (0.91399). The first few lines of PERF25.DAT should look very similar to this:


 09 MAY       2007  14.14.01.93.
    1 1049990999 0USA     10000CLINTON        9  134   0.933   0.034  -0.256   0.014
    2 1041470541 0ALABAMA 10000HEFLIN        95  803   0.882   0.002  -0.086   0.591
    3 1049465941 0ALABAMA 20000SHELBY        58  805   0.928   0.002   0.253   0.176
    4 1041490781 0ALASKA  20000MURKOWSKI     32  798   0.960   0.002   0.245   0.088
    5 1041210981 0ALASKA  20000STEVENS       52  799   0.935   0.002   0.218   0.081
    6 1041542961 0ARIZONA 20000KYL           40  818   0.951   0.002   0.301  -0.017
    7 1041503961 0ARIZONA 20000MCCAIN        84  801   0.895   0.002   0.272  -0.195
    8 1041430042 0ARKANSA 10000BUMPERS       70  794   0.912   0.003  -0.334  -0.005
    9 1041079142 0ARKANSA 10000PRYOR         67  762   0.912   0.002  -0.293   0.169
   10 1041501171 0CALIFOR 10000BOXER         53  813   0.935   0.002  -0.344  -0.120
                       etc etc etc

The last two columns are the two-dimensional coordinates. Make a two-dimensional plot of the coordinates using R. The party code is just to the right of the state name: 100=Democrat, 200=Republican. The state codes are to the left of the column of zeroes at the beginning of the state names. Use "D" for non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" representing President Clinton. You can modify my R program:

plot_oc_output_X_and_Z.r -- R Program to Plot OC Output

to do this. This program shows what state codes correspond to "South" and "non-South". Note that you will have to add a line to the program to display the "P" for President Clinton. Be sure to correctly label the graph. You also might want to use mtext(.....) to do the labeling.

Repeat problem (1) only now use the 108^th Senate. Download the data files:

Sen108kh.dta -- Stata Data file for 108^th Senate

Sen108kh.ord -- ASCII Data file for 108^th Senate

Use Epsilon to change keith2.r so it looks like this:

#
rm(list=ls(all=TRUE))
#
library(foreign)
#
## does the 102nd senate
## (assumes the relevant .dta files are in the current directory)
#
## write out the filehead (only need to do that here)
filehead <- "c:/ucsd_homework_5/sen108kh"
filehead2 <- "sen108kh"
#
## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE
## #######################################################
#
## read data in
filestring <- paste(filehead, ".dta", sep="")
mydata <- read.dta(filestring)
#
## add rownames corresponding to legislators' names
rownames(mydata) <- gsub(" ", "", mydata$name)
#
## recode votes to 0/1/NA
## 1 yea, 0 nay, NA missing
ncdata <- ncol(mydata)
submat <- mydata[,10:ncdata]
submat[submat==0] <- -999
submat[submat==7] <- -999
submat[submat==8] <- -999
submat[submat==9] <- -999
submat[submat==4] <- 0
submat[submat==5] <- 0
submat[submat==6] <- 0
submat[submat==2] <- 1
submat[submat==3] <- 1
submat[submat==-999] <- NA
mydata[,10:ncdata] <- submat

## change name of mydata to whatever filehead is
expr <- parse(text=paste(filehead2, " <- mydata"))
eval(expr)

## save as an .rda file
expr <- parse(text=paste("save(", filehead2, ", file=\"",
                filehead, ".rda\", ascii=TRUE)", sep="")) 
eval(expr)

This will create the SEN108KH.RDA file.

Use Epsilon to change keithMCMC.r so it looks like this:

#
library(MCMCpack)

load("c:/ucsd_homework_5/sen108kh.rda")

Sen.rollcalls <- sen108kh[,10:ncol(sen108kh)]

posterior <- MCMCirt1d(Sen.rollcalls,
                       theta.constraints=list("KENNEDYED"=-2, "KYL"=2),
                       burnin=2000, mcmc=100000, thin=20, verbose=500)
geweke.diag(posterior)
pdf(file="c:/ucsd_homework_5/keithplots3.pdf")
plot(posterior)
summary(posterior)
sss <- summary(posterior)
write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt")
write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt")
dev.off()
plot(posterior[,1])

Repeat parts (a) and (b) of problem (1) to create a neat listing of sen108MCMC.txt. Turn in the lists of your modified macros and sen108MCMC.txt.
Use PERFL_2006 to get the rank-ordering for the 108^th Senate. Use the R program that you wrote for (1.c) above and graph the rank ordering (horizontal axis) against the Bayesian MCMC medians (vertical axis). Label the axes appropriately and label a few of the Senators.