POLI 279 MEASUREMENT THEORY
Sixth Assignment
Due 23 May 2006
# # # keith2.r -- This Program was written by Kevin Quinn and modfied by # Keith Poole to run on WINDOZE machines # rm(list=ls(all=TRUE)) # library(foreign) Be sure to load this library # ## does the 104th senate ## (assumes the relevant .dta files are in the current directory) # ## write out the filehead (only need to do that here) filehead <- "c:/ucsd_homework_5/sen104kh" Change Path Statements as Appropriate filehead2 <- "sen104kh" This line needs to be changed as well # ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE ## ####################################################### # ## read data in filestring <- paste(filehead, ".dta", sep="") Creates "c:/ucsd_homework_5/sen104kh.dta" mydata <- read.dta(filestring) read.dta is in the foreign library # ## add rownames corresponding to legislators' names rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column # in the Stata file and deletes all spaces ## recode votes to 0/1/NA rownames uses the "name" variable as labels for the rows of the matrix ## 1 yea, 0 nay, NA missing ncdata <- ncol(mydata) ncol gets the number of columns of the matrix submat <- mydata[,10:ncdata] Note that the first roll call is the 10th variable in Stata submat[submat==0] <- -999 submat[submat==7] <- -999 submat[submat==8] <- -999 submat[submat==9] <- -999 submat[submat==4] <- 0 submat[submat==5] <- 0 submat[submat==6] <- 0 submat[submat==2] <- 1 submat[submat==3] <- 1 submat[submat==-999] <- NA mydata[,10:ncdata] <- submat # ## change name of mydata to whatever filehead is parse creates the character string "sen104kh <- mydata" expr <- parse(text=paste(filehead2, " <- mydata")) using the paste command to concatenate the strings eval(expr) expr (expression) creates an unevaluated call and eval(expression) # evaluates the unevaluated expression. Note that this is a clever way ## save as an .rda file of passing R a command that would normally be entered from the command line expr <- parse(text=paste("save(", filehead2, ", file=\"", filehead, ".rda\", ascii=TRUE)", sep="")) This produces eval(expr) expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE)) and eval(expr) executes the save(...) command Note that the \" says use " as part of this stringWhen you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.
#
#
# keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack.
# Program written by Kevin Quinn and modified by Keith Poole
#
library(MCMCpack)
#
load("c:/ucsd_homework_5/sen104kh.rda")
#
Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)] Pull in the roll calls
#
posterior <- MCMCirt1d(Sen.rollcalls, Here is the call to the One-Dimensional IRT model in MCMCPack
theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2), Note the Kennedy-Helms Constraint
burnin=2000, mcmc=100000, thin=20, verbose=500) The Verbose switch writes to the screen every 500 iterations
geweke.diag(posterior) geweke.diag is a convergence diagnostic for Markov Chains
pdf(file="c:/ucsd_homework_5/keithplots3.pdf") This opens a PDF file for writing plots
plot(posterior) Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator
summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain
sss <- summary(posterior)
write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt") This just writes the statistics to a file
write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt") This just writes the quantiles to a file
dev.off() This turns off the PDF output file
plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain Matrix
Run keithMCMC.r and you should see something like this:









SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
1 919 20 36 1 2 20 0.005 5
(36A1,12000I1)
(I5,1X,36A1,2I5,50F8.3)
The first line is the roll call data file, the second line is the title, the third
line, in order, is the number of dimensions, the number of roll calls, the
number of iterations (leave as is), the number of "characters" to read off
the front of each legislator's record (leave as is), the record number of a
Senator that you believe is on the "Left" (this just sets "liberal" on the left --
lower ranks or negative values on the first dimension if the scaling is in
two or more dimensions -- and
"conservative" on the right), and the record number of a Senator that you
believe is on "Up" or on positive side of the second dimension (in the U.S. context,
these are the Southern Democrats in an earlier era). The last line is the format for
the coordinate output file. Please study the examples on the
Optimal Classification Program page because
we will be using this program several more times!
09 MAY 2007 13.38.10.53.
RANDOM NUMBER SEED 62600
SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
1 919 20 36 1 2 20 0.005 5
(36A1,25000I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 1 8178 81250 0.10065 0.89935 0.71583
2 LEGISLATORS 1 8096 81250 0.09964 0.90036 0.71868 0.99848
3 ROLL CALLS 1 8025 81250 0.09877 0.90123 0.72115
4 LEGISLATORS 1 8016 81250 0.09866 0.90134 0.72146 0.99973
5 ROLL CALLS 1 7989 81250 0.09833 0.90167 0.72240
6 LEGISLATORS 1 7988 81250 0.09831 0.90169 0.72244 0.99991
7 ROLL CALLS 1 7981 81250 0.09823 0.90177 0.72268
8 LEGISLATORS 1 7981 81250 0.09823 0.90177 0.72268 0.99992
9 ROLL CALLS 1 7970 81250 0.09809 0.90191 0.72306
10 LEGISLATORS 1 7970 81250 0.09809 0.90191 0.72306 0.99999
SHARPEN 2 7957 81250
SHARPEN 3 7921 81250
SHARPEN 4 7886 81250
SHARPEN 5 7875 81250
L-PERMUTATIONS 5 7875 81250 0.09692 0.90308 0.72636
13.38.10.54.
ELAPSED TIME OF JOB 13.44.37.32.
The seventh column reports the correct classification. For the 104th
Senate the algorithm quickly converges to 0.90308 (90.3%).
09 MAY 2007 13.38.10.53.
1 1041427521 0ILLINOI 10000SIMON 88 818 0.892 1.000
2 1044930925 0WISCONS 10000FEINGOLD 109 827 0.868 2.000
3 1044910133 0MINNESO 10000WELLSTONE 54 826 0.935 3.000
4 1041487172 0OREGON 10000WYDEN 24 264 0.909 4.000
5 1041501171 0CALIFOR 10000BOXER 56 813 0.931 5.000
6 10410808 3 0MASSACH 10000KENNEDY, ED 53 806 0.934 6.000
7 1041470923 0MICHIGA 10000LEVIN, CARL 57 826 0.931 7.000
8 1041470212 0NEW JER 10000BRADLEY 98 757 0.871 8.000
9 1041491412 0NEW JER 10000LAUTENBERG 70 821 0.915 9.000
10 10414307 6 0VERMONT 10000LEAHY 71 807 0.912 10.000
etc.
etc.
etc.
100 1042936733 0MINNESO 20000GRAMS 43 816 0.947 99.500
101 1044930447 0NORTH C 20000FAIRCLOTH 45 800 0.944 101.000
102 1041480362 0COLORAD 20000BROWN, HANK 81 828 0.902 102.000
103 1041503961 0ARIZONA 20000MCCAIN 86 801 0.893 103.000
104 1041542961 0ARIZONA 20000KYL 40 818 0.951 104.000
Note that below the legislators are the rank positions for the roll call cutting
points. Write an Epsilon macro to combine the rank ordering
of the Senators with the Bayesian MCMC coordinates
(the median from the quantiles file). Turn in the macro and a printout of the
combined file. Leave the header on the combined file.
09 MAY 2007 14.14.01.93.
RANDOM NUMBER SEED 35000
SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
2 919 20 36 1 2 20 0.005 5
(36A1,25000I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 2 7240 81250 0.08911 0.91089 0.74843 0.00000
LEGISLATORS 2 7199 81250 0.08860 0.91140 0.74985 0.00000
2 ROLL CALLS 2 7099 81250 0.08737 0.91263 0.75333 0.99610
LEGISLATORS 2 7087 81250 0.08722 0.91278 0.75374 0.99859
3 ROLL CALLS 2 7065 81250 0.08695 0.91305 0.75451 0.99800
LEGISLATORS 2 7061 81250 0.08690 0.91310 0.75465 0.99981
4 ROLL CALLS 2 7050 81250 0.08677 0.91323 0.75503 0.99886
LEGISLATORS 2 7046 81250 0.08672 0.91328 0.75517 0.99995
5 ROLL CALLS 2 7039 81250 0.08663 0.91337 0.75541 0.99927
LEGISLATORS 2 7036 81250 0.08660 0.91340 0.75552 0.99992
6 ROLL CALLS 2 7033 81250 0.08656 0.91344 0.75562 0.99848
LEGISLATORS 2 7031 81250 0.08654 0.91346 0.75569 0.99998
7 ROLL CALLS 2 7025 81250 0.08646 0.91354 0.75590 0.99952
LEGISLATORS 2 7024 81250 0.08645 0.91355 0.75593 0.99999
8 ROLL CALLS 2 7021 81250 0.08641 0.91359 0.75604 0.99960
LEGISLATORS 2 7020 81250 0.08640 0.91360 0.75607 1.00000
9 ROLL CALLS 2 7020 81250 0.08640 0.91360 0.75607 0.99938
LEGISLATORS 2 7020 81250 0.08640 0.91360 0.75607 0.99999
10 ROLL CALLS 2 7019 81250 0.08639 0.91361 0.75611 0.99950
LEGISLATORS 2 7016 81250 0.08635 0.91365 0.75621 0.99810
11 ROLL CALLS 2 6994 81250 0.08608 0.91392 0.75698 0.99906
LEGISLATORS 2 6994 81250 0.08608 0.91392 0.75698 1.00000
12 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99927
LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000
13 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99946
LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000
14 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99961
LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000
15 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99921
LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000
16 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99998
LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000
17 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99961
LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000
18 ROLL CALLS 2 6989 81250 0.08602 0.91398 0.75715 0.99960
LEGISLATORS 2 6989 81250 0.08602 0.91398 0.75715 1.00000
19 ROLL CALLS 2 6988 81250 0.08601 0.91399 0.75718 0.99996
LEGISLATORS 2 6988 81250 0.08601 0.91399 0.75718 1.00000
20 ROLL CALLS 2 6988 81250 0.08601 0.91399 0.75718 0.99998
LEGISLATORS 2 6988 81250 0.08601 0.91399 0.75718 1.00000
MEAN VOLUME LEG. 0.0026 0.0032
MACHINE PREC. 2 6988 81250 0.08601 0.91399 0.75718
MACHINE PREC. 2 6988 81250 0.08601 0.91399 0.75718
14.14.01.93.
ELAPSED TIME OF JOB 14.15.35.76.
The correct classification is now 91.399% (0.91399). The first few lines of
PERF25.DAT should look very similar to this:
09 MAY 2007 14.14.01.93.
1 1049990999 0USA 10000CLINTON 9 134 0.933 0.034 -0.256 0.014
2 1041470541 0ALABAMA 10000HEFLIN 95 803 0.882 0.002 -0.086 0.591
3 1049465941 0ALABAMA 20000SHELBY 58 805 0.928 0.002 0.253 0.176
4 1041490781 0ALASKA 20000MURKOWSKI 32 798 0.960 0.002 0.245 0.088
5 1041210981 0ALASKA 20000STEVENS 52 799 0.935 0.002 0.218 0.081
6 1041542961 0ARIZONA 20000KYL 40 818 0.951 0.002 0.301 -0.017
7 1041503961 0ARIZONA 20000MCCAIN 84 801 0.895 0.002 0.272 -0.195
8 1041430042 0ARKANSA 10000BUMPERS 70 794 0.912 0.003 -0.334 -0.005
9 1041079142 0ARKANSA 10000PRYOR 67 762 0.912 0.002 -0.293 0.169
10 1041501171 0CALIFOR 10000BOXER 53 813 0.935 0.002 -0.344 -0.120
etc etc etc
The last two columns are the two-dimensional coordinates. Make a two-dimensional plot
of the coordinates using R. The party code is just to
the right of the state name: 100=Democrat, 200=Republican. The state codes are to the
left of the column of zeroes at the beginning of the state names. Use "D" for
non-Southern Democrats, "S" for Southern Democrats, "R" for
Republicans, and "P" representing President Clinton. You can modify my
R program:#
rm(list=ls(all=TRUE))
#
library(foreign)
#
## does the 102nd senate
## (assumes the relevant .dta files are in the current directory)
#
## write out the filehead (only need to do that here)
filehead <- "c:/ucsd_homework_5/sen108kh"
filehead2 <- "sen108kh"
#
## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE
## #######################################################
#
## read data in
filestring <- paste(filehead, ".dta", sep="")
mydata <- read.dta(filestring)
#
## add rownames corresponding to legislators' names
rownames(mydata) <- gsub(" ", "", mydata$name)
#
## recode votes to 0/1/NA
## 1 yea, 0 nay, NA missing
ncdata <- ncol(mydata)
submat <- mydata[,10:ncdata]
submat[submat==0] <- -999
submat[submat==7] <- -999
submat[submat==8] <- -999
submat[submat==9] <- -999
submat[submat==4] <- 0
submat[submat==5] <- 0
submat[submat==6] <- 0
submat[submat==2] <- 1
submat[submat==3] <- 1
submat[submat==-999] <- NA
mydata[,10:ncdata] <- submat
## change name of mydata to whatever filehead is
expr <- parse(text=paste(filehead2, " <- mydata"))
eval(expr)
## save as an .rda file
expr <- parse(text=paste("save(", filehead2, ", file=\"",
filehead, ".rda\", ascii=TRUE)", sep=""))
eval(expr)
This will create the SEN108KH.RDA file.#
library(MCMCpack)
load("c:/ucsd_homework_5/sen108kh.rda")
Sen.rollcalls <- sen108kh[,10:ncol(sen108kh)]
posterior <- MCMCirt1d(Sen.rollcalls,
theta.constraints=list("KENNEDYED"=-2, "KYL"=2),
burnin=2000, mcmc=100000, thin=20, verbose=500)
geweke.diag(posterior)
pdf(file="c:/ucsd_homework_5/keithplots3.pdf")
plot(posterior)
summary(posterior)
sss <- summary(posterior)
write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt")
write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt")
dev.off()
plot(posterior[,1])