POLI 279 MEASUREMENT THEORY
Fifth Assignment
Due 26 May 2006
# # # keith2.r -- This Program was written by Kevin Quinn and modfied by # Keith Poole to run on WINDOZE machines # rm(list=ls(all=TRUE)) # library(foreign) Be sure to load this library # ## does the 104th senate ## (assumes the relevant .dta files are in the current directory) # ## write out the filehead (only need to do that here) filehead <- "c:/ucsd_homework_5/sen104kh" Change Path Statements as Appropriate filehead2 <- "sen104kh" This line needs to be changed as well # ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE ## ####################################################### # ## read data in filestring <- paste(filehead, ".dta", sep="") Creates "c:/ucsd_homework_5/sen104kh.dta" mydata <- read.dta(filestring) read.dta is in the foreign library # ## add rownames corresponding to legislators' names rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column # in the Stata file and deletes all spaces ## recode votes to 0/1/NA rownames uses the "name" variable as labels for the rows of the matrix ## 1 yea, 0 nay, NA missing ncdata <- ncol(mydata) ncol gets the number of columns of the matrix submat <- mydata[,10:ncdata] Note that the first roll call is the 10th variable in Stata submat[submat==0] <- -999 submat[submat==7] <- -999 submat[submat==8] <- -999 submat[submat==9] <- -999 submat[submat==4] <- 0 submat[submat==5] <- 0 submat[submat==6] <- 0 submat[submat==2] <- 1 submat[submat==3] <- 1 submat[submat==-999] <- NA mydata[,10:ncdata] <- submat # ## change name of mydata to whatever filehead is parse creates the character string "sen104kh <- mydata" expr <- parse(text=paste(filehead2, " <- mydata")) using the paste command to concatenate the strings eval(expr) expr (expression) creates an unevaluated call and eval(expression) # evaluates the unevaluated expression. Note that this is a clever way ## save as an .rda file of passing R a command that would normally be entered from the command line expr <- parse(text=paste("save(", filehead2, ", file=\"", filehead, ".rda\", ascii=TRUE)", sep="")) This produces eval(expr) expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE)) and eval(expr) executes the save(...) command Note that the \" says use " as part of this stringWhen you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.
# # # keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack. # Program written by Kevin Quinn and modified by Keith Poole # library(MCMCpack) # load("c:/ucsd_homework_5/sen104kh.rda") # Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)] Pull in the roll calls # posterior <- MCMCirt1d(Sen.rollcalls, Here is the call to the One-Dimensional IRT model in MCMCPack theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2), Note the Kennedy-Helms Constraint burnin=2000, mcmc=100000, thin=20, verbose=500) The Verbose switch writes to the screen every 500 iterations geweke.diag(posterior) geweke.diag is a convergence diagnostic for Markov Chains pdf(file="c:/ucsd_homework_5/keithplots3.pdf") This opens a PDF file for writing plots plot(posterior) Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain sss <- summary(posterior) write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt") This just writes the statistics to a file write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt") This just writes the quantiles to a file dev.off() This turns off the PDF output file plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain MatrixRun keithMCMC.r and you should see something like this:
SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 1 919 20 36 1 2 20 0.005 (36A1,12000I1) (I5,1X,36A1,2I5,50F8.3)The first line is the roll call data file, the second line is the title, the third line, in order, is the number of dimensions, the number of roll calls, the number of iterations (leave as is), the number of "characters" to read off the front of each legislator's record (leave as is), and the record number of a Senator that you believe is on the "Left" (this just sets "liberal" on the left -- lower ranks or negative values on the first dimension if the scaling is in two or more dimensions -- and "conservative" on the right). The last line is the format for the coordinate output file. Please study the examples on the Optimal Classification Program page because we will be using this program several more times!
17 MAY 2006 21.11.41.48. RANDOM NUMBER SEED 67100 SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 1 919 20 36 1 2 20 0.005 (36A1,12000I1) (I5,1X,36A1,2I5,50F8.3) ****************************************************************************** 1 ROLL CALLS 1 8178 81250 0.10065 0.89935 0.71583 2 LEGISLATORS 1 8096 81250 0.09964 0.90036 0.71868 0.99848 3 ROLL CALLS 1 8025 81250 0.09877 0.90123 0.72115 4 LEGISLATORS 1 8016 81250 0.09866 0.90134 0.72146 0.99973 5 ROLL CALLS 1 7989 81250 0.09833 0.90167 0.72240 6 LEGISLATORS 1 7988 81250 0.09831 0.90169 0.72244 0.99991 7 ROLL CALLS 1 7981 81250 0.09823 0.90177 0.72268 8 LEGISLATORS 1 7981 81250 0.09823 0.90177 0.72268 0.99992 9 ROLL CALLS 1 7970 81250 0.09809 0.90191 0.72306 10 LEGISLATORS 1 7970 81250 0.09809 0.90191 0.72306 0.99999 21.11.41.70. ELAPSED TIME OF JOB 21.11.43.60.The seventh column reports the correct classification. For the 104th Senate the algorithm quickly converges to 0.90191 (90.2%).
17 MAY 2006 21.11.41.48. 1 1044930925 0WISCONS 10000FEINGOLD 103 827 0.875 1.000 2 1044910133 0MINNESO 10000WELLSTONE 56 826 0.932 2.000 3 1041427521 0ILLINOI 10000SIMON 101 818 0.877 3.000 4 10410808 3 0MASSACH 10000KENNEDY, ED 52 806 0.935 4.000 5 1041501171 0CALIFOR 10000BOXER 58 813 0.929 5.000 6 1041487172 0OREGON 10000WYDEN 27 264 0.898 6.000 7 1041470923 0MICHIGA 10000LEVIN, CARL 57 826 0.931 7.000 8 1041470212 0NEW JER 10000BRADLEY 94 757 0.876 8.000 9 1041491412 0NEW JER 10000LAUTENBERG 69 821 0.916 9.000 10 10414307 6 0VERMONT 10000LEAHY 72 807 0.911 10.000 etc. etc. etc. 100 10415116 4 0NEW HAM 20000SMITH, ROBE 50 821 0.939 99.500 101 1044950134 0MISSOUR 20000ASHCROFT 40 816 0.951 101.000 102 1041462849 0TEXAS 20000GRAMM, PHIL 38 748 0.949 102.000 103 1044930447 0NORTH C 20000FAIRCLOTH 44 800 0.945 103.000 104 1041542961 0ARIZONA 20000KYL 39 818 0.952 104.000 ****************************************************************************** 1 1049990999 0USA 10000CLINTON 11 134 0.918 29.000 2 1041470541 0ALABAMA 10000HEFLIN 148 803 0.816 48.000 3 1049465941 0ALABAMA 20000SHELBY 67 805 0.917 77.500 4 1041490781 0ALASKA 20000MURKOWSKI 34 798 0.957 66.000 5 1041210981 0ALASKA 20000STEVENS 47 799 0.941 60.000 6 1041542961 0ARIZONA 20000KYL 39 818 0.952 104.000 7 1041503961 0ARIZONA 20000MCCAIN 98 801 0.878 99.500 8 1041430042 0ARKANSA 10000BUMPERS 73 794 0.908 16.500 9 1041079142 0ARKANSA 10000PRYOR 72 762 0.906 16.500 10 1041501171 0CALIFOR 10000BOXER 58 813 0.929 5.000 etc. etc. 100 1041492256 0WEST VI 10000ROCKEFELLER 84 803 0.895 23.000 101 1044930925 0WISCONS 10000FEINGOLD 103 827 0.875 1.000 102 1041570325 0WISCONS 10000KOHL 131 827 0.842 21.000 103 1041471068 0WYOMING 20000SIMPSON 64 800 0.920 59.000 104 1041563368 0WYOMING 20000THOMAS 58 819 0.929 86.000Note that below the legislators are the rank positions for the roll call cutting points. Write an Epsilon macro to combine the rank ordering of the Senators with the Bayesian MCMC coordinates (the median from the quantiles file). Turn in the macro and a printout of the combined file. Leave the header on the combined file.
etc. etc. PERFORMANCE INDEX EIGENVALUE/VECTOR ROUTINE= 1 104 0 0 1 10.7509 66.4575 66.4575 12.5245 55.4540 55.4540 2 0.4921 3.0418 69.4992 0.7115 3.1501 58.6042 3 0.3706 2.2909 71.7901 0.5712 2.5289 61.1330 4 0.2624 1.6217 73.4119 0.4574 2.0252 63.1582 5 0.2254 1.3931 74.8050 0.3656 1.6188 64.7771 6 0.2188 1.3526 76.1575 0.3504 1.5514 66.3284 7 0.1769 1.0935 77.2510 0.3011 1.3331 67.6615 8 0.1466 0.9062 78.1572 0.2914 1.2903 68.9518 9 0.1427 0.8821 79.0393 0.2630 1.1646 70.1164 10 0.1220 0.7541 79.7934 0.2344 1.0379 71.1543 11 0.1023 0.6323 80.4257 0.2172 0.9617 72.1161 12 0.0994 0.6142 81.0399 0.2026 0.8972 73.0133 13 0.0906 0.5600 81.5999 0.2007 0.8888 73.9020 14 0.0747 0.4616 82.0615 0.1753 0.7763 74.6784 15 0.0670 0.4144 82.4759 0.1683 0.7451 75.4235 16 0.0635 0.3924 82.8684 0.1647 0.7290 76.1525 17 0.0619 0.3823 83.2507 0.1589 0.7037 76.8562 18 0.0554 0.3424 83.5931 0.1466 0.6492 77.5053 19 0.0546 0.3376 83.9307 0.1393 0.6169 78.1223 20 0.0509 0.3144 84.2451 0.1370 0.6064 78.7286 etc. etc.The second column with a first entry of 10.7509 are the eigenvalues. Make a graph of the eigenvalues like the one you did for Homework 3 Q.3.d. In particular, you can modify the code below to do this:
nrow <- 10 plot(T[,1],T[,2], xlab="", ylab="", main="", type="n",xlim=c(1,10),ylim=c(0.0,1.0),axes=FALSE) # # Main title: which side of the plot (1=bottom, 2=left, 3=top, 4=right). mtext("Normalized Eigenvalues of Double-Centered\n Agreement Score Matrix for the 104th Senate",side=3,line=2.75,cex=1.2) # x-axis title mtext("Dimension",side=1,line=2.75,cex=1.2) # y-axis title mtext("Normalized Eigenvalue",side=2,line=2.5,cex=1.2) # # I am tricking R to do a nice looking x-axis by forcing # the labels # axis(1,1:10, labels=c(' 1',' 2',' 3',' 4',' 5',' 6',' 7',' 8',' 9','10'), font=2,col.axis='black',cex.axis=1.2) # font switch: 1 corresponds to plain text, # 2 to bold face, 3 to italic and 4 to bold italic # the at=NULL tells R to compute tick marks internally axis(2,at=NULL,col.axis='black',cex.axis=1.2,font=2) # box() points(T[,1],T[,2],pch=16,col="red") lines(T[1:nrow,1],T[1:nrow,2],lty=1,lwd=3,col="blue")
17 MAY 2006 22.57.32.64. RANDOM NUMBER SEED 107500 SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 2 919 20 36 1 2 20 0.005 (36A1,12000I1) (I5,1X,36A1,2I5,50F8.3) ****************************************************************************** 1 ROLL CALLS 2 7408 81250 0.09118 0.90882 0.74259 0.00000 LEGISLATORS 2 7370 81250 0.09071 0.90929 0.74391 0.00000 2 ROLL CALLS 2 7293 81250 0.08976 0.91024 0.74659 0.99639 LEGISLATORS 2 7289 81250 0.08971 0.91029 0.74673 0.99940 3 ROLL CALLS 2 7271 81250 0.08949 0.91051 0.74735 0.99823 LEGISLATORS 2 7267 81250 0.08944 0.91056 0.74749 0.99995 4 ROLL CALLS 2 7262 81250 0.08938 0.91062 0.74766 0.99851 LEGISLATORS 2 7260 81250 0.08935 0.91065 0.74773 0.99999 5 ROLL CALLS 2 7256 81250 0.08930 0.91070 0.74787 0.99990 LEGISLATORS 2 7253 81250 0.08927 0.91073 0.74798 0.99997 6 ROLL CALLS 2 7251 81250 0.08924 0.91076 0.74805 0.99987 LEGISLATORS 2 7250 81250 0.08923 0.91077 0.74808 0.99988 7 ROLL CALLS 2 7246 81250 0.08918 0.91082 0.74822 0.99932 LEGISLATORS 2 7243 81250 0.08914 0.91086 0.74832 0.99999 8 ROLL CALLS 2 7241 81250 0.08912 0.91088 0.74839 0.99959 LEGISLATORS 2 7237 81250 0.08907 0.91093 0.74853 0.99828 9 ROLL CALLS 2 7233 81250 0.08902 0.91098 0.74867 0.99964 LEGISLATORS 2 7231 81250 0.08900 0.91100 0.74874 0.99969 10 ROLL CALLS 2 7226 81250 0.08894 0.91106 0.74891 0.99973 LEGISLATORS 2 7226 81250 0.08894 0.91106 0.74891 0.99999 11 ROLL CALLS 2 7225 81250 0.08892 0.91108 0.74895 0.99939 LEGISLATORS 2 7223 81250 0.08890 0.91110 0.74902 0.99993 12 ROLL CALLS 2 7219 81250 0.08885 0.91115 0.74916 0.99929 LEGISLATORS 2 7219 81250 0.08885 0.91115 0.74916 1.00000 13 ROLL CALLS 2 7217 81250 0.08882 0.91118 0.74923 0.99942 LEGISLATORS 2 7217 81250 0.08882 0.91118 0.74923 1.00000 14 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99969 LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000 15 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99946 LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000 16 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99949 LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000 17 ROLL CALLS 2 7216 81250 0.08881 0.91119 0.74926 0.99972 LEGISLATORS 2 7216 81250 0.08881 0.91119 0.74926 1.00000 18 ROLL CALLS 2 7215 81250 0.08880 0.91120 0.74930 0.99961 LEGISLATORS 2 7215 81250 0.08880 0.91120 0.74930 1.00000 19 ROLL CALLS 2 7215 81250 0.08880 0.91120 0.74930 0.99984 LEGISLATORS 2 7215 81250 0.08880 0.91120 0.74930 1.00000 20 ROLL CALLS 2 7064 81250 0.08694 0.91306 0.75454 0.99511 LEGISLATORS 2 7062 81250 0.08692 0.91308 0.75461 0.99990 MEAN VOLUME LEG. 0.0034 0.0055 MACHINE PREC. 2 7062 81250 0.08692 0.91308 0.75461 MACHINE PREC. 2 7062 81250 0.08692 0.91308 0.75461 22.57.32.64. ELAPSED TIME OF JOB 22.58.53.98.The correct classification is now 91.308% (0.91308). The first few lines of PERF25.DAT should look very similar to this:
17 MAY 2006 22.57.32.64. 1 1049990999 0USA 10000CLINTON 9 134 0.933 0.056 -0.291 0.211 2 1041470541 0ALABAMA 10000HEFLIN 92 803 0.885 0.002 -0.086 0.590 3 1049465941 0ALABAMA 20000SHELBY 63 805 0.922 0.002 0.240 0.166 4 1041490781 0ALASKA 20000MURKOWSKI 36 798 0.955 0.005 0.245 0.068 5 1041210981 0ALASKA 20000STEVENS 54 799 0.932 0.003 0.223 0.077 6 1041542961 0ARIZONA 20000KYL 43 818 0.947 0.003 0.304 -0.004 7 1041503961 0ARIZONA 20000MCCAIN 85 801 0.894 0.002 0.288 -0.173 8 1041430042 0ARKANSA 10000BUMPERS 67 794 0.916 0.002 -0.329 -0.004 9 1041079142 0ARKANSA 10000PRYOR 64 762 0.916 0.005 -0.292 0.175 10 1041501171 0CALIFOR 10000BOXER 54 813 0.934 0.002 -0.345 -0.119 11 1044930071 0CALIFOR 10000FEINSTEIN 119 822 0.855 0.002 -0.221 0.097 etc etc etcThe last two columns are the two-dimensional coordinates. Make a two-dimensional plot of the coordinates using R. The party code is just to the right of the state name: 100=Democrat, 200=Republican. The state codes are to the left of the column of zeroes at the beginning of the state names. Use "D" for non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" representing President Clinton. You can modify my R program: