POLI 279 MEASUREMENT THEORY
Fifth Assignment
Due 26 May 2006
# # # keith2.r -- This Program was written by Kevin Quinn and modfied by # Keith Poole to run on WINDOZE machines # rm(list=ls(all=TRUE)) # library(foreign) Be sure to load this library # ## does the 104th senate ## (assumes the relevant .dta files are in the current directory) # ## write out the filehead (only need to do that here) filehead <- "c:/ucsd_homework_5/sen104kh" Change Path Statements as Appropriate filehead2 <- "sen104kh" This line needs to be changed as well # ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE ## ####################################################### # ## read data in filestring <- paste(filehead, ".dta", sep="") Creates "c:/ucsd_homework_5/sen104kh.dta" mydata <- read.dta(filestring) read.dta is in the foreign library # ## add rownames corresponding to legislators' names rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column # in the Stata file and deletes all spaces ## recode votes to 0/1/NA rownames uses the "name" variable as labels for the rows of the matrix ## 1 yea, 0 nay, NA missing ncdata <- ncol(mydata) ncol gets the number of columns of the matrix submat <- mydata[,10:ncdata] Note that the first roll call is the 10th variable in Stata submat[submat==0] <- -999 submat[submat==7] <- -999 submat[submat==8] <- -999 submat[submat==9] <- -999 submat[submat==4] <- 0 submat[submat==5] <- 0 submat[submat==6] <- 0 submat[submat==2] <- 1 submat[submat==3] <- 1 submat[submat==-999] <- NA mydata[,10:ncdata] <- submat # ## change name of mydata to whatever filehead is parse creates the character string "sen104kh <- mydata" expr <- parse(text=paste(filehead2, " <- mydata")) using the paste command to concatenate the strings eval(expr) expr (expression) creates an unevaluated call and eval(expression) # evaluates the unevaluated expression. Note that this is a clever way ## save as an .rda file of passing R a command that would normally be entered from the command line expr <- parse(text=paste("save(", filehead2, ", file=\"", filehead, ".rda\", ascii=TRUE)", sep="")) This produces eval(expr) expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE)) and eval(expr) executes the save(...) command Note that the \" says use " as part of this stringWhen you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.
#
#
# keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack.
# Program written by Kevin Quinn and modified by Keith Poole
#
library(MCMCpack)
#
load("c:/ucsd_homework_5/sen104kh.rda")
#
Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)] Pull in the roll calls
#
posterior <- MCMCirt1d(Sen.rollcalls, Here is the call to the One-Dimensional IRT model in MCMCPack
theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2), Note the Kennedy-Helms Constraint
burnin=2000, mcmc=100000, thin=20, verbose=500) The Verbose switch writes to the screen every 500 iterations
geweke.diag(posterior) geweke.diag is a convergence diagnostic for Markov Chains
pdf(file="c:/ucsd_homework_5/keithplots3.pdf") This opens a PDF file for writing plots
plot(posterior) Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator
summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain
sss <- summary(posterior)
write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt") This just writes the statistics to a file
write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt") This just writes the quantiles to a file
dev.off() This turns off the PDF output file
plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain Matrix
Run keithMCMC.r and you should see something like this:









SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
1 919 20 36 1 2 20 0.005
(36A1,12000I1)
(I5,1X,36A1,2I5,50F8.3)
The first line is the roll call data file, the second line is the title, the third
line, in order, is the number of dimensions, the number of roll calls, the
number of iterations (leave as is), the number of "characters" to read off
the front of each legislator's record (leave as is), and the record number of a
Senator that you believe is on the "Left" (this just sets "liberal" on the left --
lower ranks or negative values on the first dimension if the scaling is in
two or more dimensions -- and
"conservative" on the right). The last line is the format for
the coordinate output file. Please study the examples on the
Optimal Classification Program page because
we will be using this program several more times!
17 MAY 2006 21.11.41.48.
RANDOM NUMBER SEED 67100
SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
1 919 20 36 1 2 20 0.005
(36A1,12000I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 1 8178 81250 0.10065 0.89935 0.71583
2 LEGISLATORS 1 8096 81250 0.09964 0.90036 0.71868 0.99848
3 ROLL CALLS 1 8025 81250 0.09877 0.90123 0.72115
4 LEGISLATORS 1 8016 81250 0.09866 0.90134 0.72146 0.99973
5 ROLL CALLS 1 7989 81250 0.09833 0.90167 0.72240
6 LEGISLATORS 1 7988 81250 0.09831 0.90169 0.72244 0.99991
7 ROLL CALLS 1 7981 81250 0.09823 0.90177 0.72268
8 LEGISLATORS 1 7981 81250 0.09823 0.90177 0.72268 0.99992
9 ROLL CALLS 1 7970 81250 0.09809 0.90191 0.72306
10 LEGISLATORS 1 7970 81250 0.09809 0.90191 0.72306 0.99999
21.11.41.70.
ELAPSED TIME OF JOB 21.11.43.60.
The seventh column reports the correct classification. For the 104th
Senate the algorithm quickly converges to 0.90191 (90.2%).
17 MAY 2006 21.11.41.48.
1 1044930925 0WISCONS 10000FEINGOLD 103 827 0.875 1.000
2 1044910133 0MINNESO 10000WELLSTONE 56 826 0.932 2.000
3 1041427521 0ILLINOI 10000SIMON 101 818 0.877 3.000
4 10410808 3 0MASSACH 10000KENNEDY, ED 52 806 0.935 4.000
5 1041501171 0CALIFOR 10000BOXER 58 813 0.929 5.000
6 1041487172 0OREGON 10000WYDEN 27 264 0.898 6.000
7 1041470923 0MICHIGA 10000LEVIN, CARL 57 826 0.931 7.000
8 1041470212 0NEW JER 10000BRADLEY 94 757 0.876 8.000
9 1041491412 0NEW JER 10000LAUTENBERG 69 821 0.916 9.000
10 10414307 6 0VERMONT 10000LEAHY 72 807 0.911 10.000
etc.
etc.
etc.
100 10415116 4 0NEW HAM 20000SMITH, ROBE 50 821 0.939 99.500
101 1044950134 0MISSOUR 20000ASHCROFT 40 816 0.951 101.000
102 1041462849 0TEXAS 20000GRAMM, PHIL 38 748 0.949 102.000
103 1044930447 0NORTH C 20000FAIRCLOTH 44 800 0.945 103.000
104 1041542961 0ARIZONA 20000KYL 39 818 0.952 104.000
******************************************************************************
1 1049990999 0USA 10000CLINTON 11 134 0.918 29.000
2 1041470541 0ALABAMA 10000HEFLIN 148 803 0.816 48.000
3 1049465941 0ALABAMA 20000SHELBY 67 805 0.917 77.500
4 1041490781 0ALASKA 20000MURKOWSKI 34 798 0.957 66.000
5 1041210981 0ALASKA 20000STEVENS 47 799 0.941 60.000
6 1041542961 0ARIZONA 20000KYL 39 818 0.952 104.000
7 1041503961 0ARIZONA 20000MCCAIN 98 801 0.878 99.500
8 1041430042 0ARKANSA 10000BUMPERS 73 794 0.908 16.500
9 1041079142 0ARKANSA 10000PRYOR 72 762 0.906 16.500
10 1041501171 0CALIFOR 10000BOXER 58 813 0.929 5.000
etc.
etc.
100 1041492256 0WEST VI 10000ROCKEFELLER 84 803 0.895 23.000
101 1044930925 0WISCONS 10000FEINGOLD 103 827 0.875 1.000
102 1041570325 0WISCONS 10000KOHL 131 827 0.842 21.000
103 1041471068 0WYOMING 20000SIMPSON 64 800 0.920 59.000
104 1041563368 0WYOMING 20000THOMAS 58 819 0.929 86.000
Note that below the legislators are the rank positions for the roll call cutting
points. Write an Epsilon macro to combine the rank ordering
of the Senators with the Bayesian MCMC coordinates
(the median from the quantiles file). Turn in the macro and a printout of the
combined file. Leave the header on the combined file.
etc.
etc.
PERFORMANCE INDEX EIGENVALUE/VECTOR ROUTINE= 1 104 0 0
1 10.7509 66.4575 66.4575 12.5245 55.4540 55.4540
2 0.4921 3.0418 69.4992 0.7115 3.1501 58.6042
3 0.3706 2.2909 71.7901 0.5712 2.5289 61.1330
4 0.2624 1.6217 73.4119 0.4574 2.0252 63.1582
5 0.2254 1.3931 74.8050 0.3656 1.6188 64.7771
6 0.2188 1.3526 76.1575 0.3504 1.5514 66.3284
7 0.1769 1.0935 77.2510 0.3011 1.3331 67.6615
8 0.1466 0.9062 78.1572 0.2914 1.2903 68.9518
9 0.1427 0.8821 79.0393 0.2630 1.1646 70.1164
10 0.1220 0.7541 79.7934 0.2344 1.0379 71.1543
11 0.1023 0.6323 80.4257 0.2172 0.9617 72.1161
12 0.0994 0.6142 81.0399 0.2026 0.8972 73.0133
13 0.0906 0.5600 81.5999 0.2007 0.8888 73.9020
14 0.0747 0.4616 82.0615 0.1753 0.7763 74.6784
15 0.0670 0.4144 82.4759 0.1683 0.7451 75.4235
16 0.0635 0.3924 82.8684 0.1647 0.7290 76.1525
17 0.0619 0.3823 83.2507 0.1589 0.7037 76.8562
18 0.0554 0.3424 83.5931 0.1466 0.6492 77.5053
19 0.0546 0.3376 83.9307 0.1393 0.6169 78.1223
20 0.0509 0.3144 84.2451 0.1370 0.6064 78.7286
etc.
etc.
The second column with a first entry of 10.7509 are the eigenvalues. Make
a graph of the eigenvalues like the one you did for
Homework 3 Q.3.d. In particular, you can
modify the code below to do this:
nrow <- 10
plot(T[,1],T[,2],
xlab="",
ylab="",
main="",
type="n",xlim=c(1,10),ylim=c(0.0,1.0),axes=FALSE)
#
# Main title: which side of the plot (1=bottom, 2=left, 3=top, 4=right).
mtext("Normalized Eigenvalues of Double-Centered\n Agreement Score Matrix for the 104th Senate",side=3,line=2.75,cex=1.2)
# x-axis title
mtext("Dimension",side=1,line=2.75,cex=1.2)
# y-axis title
mtext("Normalized Eigenvalue",side=2,line=2.5,cex=1.2)
#
# I am tricking R to do a nice looking x-axis by forcing
# the labels
#
axis(1,1:10,
labels=c(' 1',' 2',' 3',' 4',' 5',' 6',' 7',' 8',' 9','10'),
font=2,col.axis='black',cex.axis=1.2)
# font switch: 1 corresponds to plain text,
# 2 to bold face, 3 to italic and 4 to bold italic
# the at=NULL tells R to compute tick marks internally
axis(2,at=NULL,col.axis='black',cex.axis=1.2,font=2)
#
box()
points(T[,1],T[,2],pch=16,col="red")
lines(T[1:nrow,1],T[1:nrow,2],lty=1,lwd=3,col="blue")
17 MAY 2006 22.57.32.64.
RANDOM NUMBER SEED 107500
SEN104KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
2 919 20 36 1 2 20 0.005
(36A1,12000I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 2 7408 81250 0.09118 0.90882 0.74259 0.00000
LEGISLATORS 2 7370 81250 0.09071 0.90929 0.74391 0.00000
2 ROLL CALLS 2 7293 81250 0.08976 0.91024 0.74659 0.99639
LEGISLATORS 2 7289 81250 0.08971 0.91029 0.74673 0.99940
3 ROLL CALLS 2 7271 81250 0.08949 0.91051 0.74735 0.99823
LEGISLATORS 2 7267 81250 0.08944 0.91056 0.74749 0.99995
4 ROLL CALLS 2 7262 81250 0.08938 0.91062 0.74766 0.99851
LEGISLATORS 2 7260 81250 0.08935 0.91065 0.74773 0.99999
5 ROLL CALLS 2 7256 81250 0.08930 0.91070 0.74787 0.99990
LEGISLATORS 2 7253 81250 0.08927 0.91073 0.74798 0.99997
6 ROLL CALLS 2 7251 81250 0.08924 0.91076 0.74805 0.99987
LEGISLATORS 2 7250 81250 0.08923 0.91077 0.74808 0.99988
7 ROLL CALLS 2 7246 81250 0.08918 0.91082 0.74822 0.99932
LEGISLATORS 2 7243 81250 0.08914 0.91086 0.74832 0.99999
8 ROLL CALLS 2 7241 81250 0.08912 0.91088 0.74839 0.99959
LEGISLATORS 2 7237 81250 0.08907 0.91093 0.74853 0.99828
9 ROLL CALLS 2 7233 81250 0.08902 0.91098 0.74867 0.99964
LEGISLATORS 2 7231 81250 0.08900 0.91100 0.74874 0.99969
10 ROLL CALLS 2 7226 81250 0.08894 0.91106 0.74891 0.99973
LEGISLATORS 2 7226 81250 0.08894 0.91106 0.74891 0.99999
11 ROLL CALLS 2 7225 81250 0.08892 0.91108 0.74895 0.99939
LEGISLATORS 2 7223 81250 0.08890 0.91110 0.74902 0.99993
12 ROLL CALLS 2 7219 81250 0.08885 0.91115 0.74916 0.99929
LEGISLATORS 2 7219 81250 0.08885 0.91115 0.74916 1.00000
13 ROLL CALLS 2 7217 81250 0.08882 0.91118 0.74923 0.99942
LEGISLATORS 2 7217 81250 0.08882 0.91118 0.74923 1.00000
14 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99969
LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000
15 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99946
LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000
16 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99949
LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000
17 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99972
LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000
18 ROLL CALLS 2 7215 81250 0.08880 0.91120 0.74930 0.99961
LEGISLATORS 2 7215 81250 0.08880 0.91120 0.74930 1.00000
19 ROLL CALLS 2 7215 81250 0.08880 0.91120 0.74930 0.99984
LEGISLATORS 2 7215 81250 0.08880 0.91120 0.74930 1.00000
20 ROLL CALLS 2 7064 81250 0.08694 0.91306 0.75454 0.99511
LEGISLATORS 2 7062 81250 0.08692 0.91308 0.75461 0.99990
MEAN VOLUME LEG. 0.0034 0.0055
MACHINE PREC. 2 7062 81250 0.08692 0.91308 0.75461
MACHINE PREC. 2 7062 81250 0.08692 0.91308 0.75461
22.57.32.64.
ELAPSED TIME OF JOB 22.58.53.98.
The correct classification is now 91.308% (0.91308). The first few lines of
PERF25.DAT should look very similar to this:
17 MAY 2006 22.57.32.64.
1 1049990999 0USA 10000CLINTON 9 134 0.933 0.056 -0.291 0.211
2 1041470541 0ALABAMA 10000HEFLIN 92 803 0.885 0.002 -0.086 0.590
3 1049465941 0ALABAMA 20000SHELBY 63 805 0.922 0.002 0.240 0.166
4 1041490781 0ALASKA 20000MURKOWSKI 36 798 0.955 0.005 0.245 0.068
5 1041210981 0ALASKA 20000STEVENS 54 799 0.932 0.003 0.223 0.077
6 1041542961 0ARIZONA 20000KYL 43 818 0.947 0.003 0.304 -0.004
7 1041503961 0ARIZONA 20000MCCAIN 85 801 0.894 0.002 0.288 -0.173
8 1041430042 0ARKANSA 10000BUMPERS 67 794 0.916 0.002 -0.329 -0.004
9 1041079142 0ARKANSA 10000PRYOR 64 762 0.916 0.005 -0.292 0.175
10 1041501171 0CALIFOR 10000BOXER 54 813 0.934 0.002 -0.345 -0.119
11 1044930071 0CALIFOR 10000FEINSTEIN 119 822 0.855 0.002 -0.221 0.097
etc etc etc
The last two columns are the two-dimensional coordinates. Make a two-dimensional plot
of the coordinates using R. The party code is just to
the right of the state name: 100=Democrat, 200=Republican. The state codes are to the
left of the column of zeroes at the beginning of the state names. Use "D" for
non-Southern Democrats, "S" for Southern Democrats, "R" for
Republicans, and "P" representing President Clinton. You can modify my
R program: