POLI 279 MEASUREMENT THEORY
Sixth Assignment
Due 23 May 2006
# # # keith2.r -- This Program was written by Kevin Quinn and modfied by # Keith Poole to run on WINDOZE machines # rm(list=ls(all=TRUE)) # library(foreign) Be sure to load this library # ## does the 104th senate ## (assumes the relevant .dta files are in the current directory) # ## write out the filehead (only need to do that here) filehead <- "c:/ucsd_homework_5/sen104kh" Change Path Statements as Appropriate filehead2 <- "sen104kh" This line needs to be changed as well # ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE ## ####################################################### # ## read data in filestring <- paste(filehead, ".dta", sep="") Creates "c:/ucsd_homework_5/sen104kh.dta" mydata <- read.dta(filestring) read.dta is in the foreign library # ## add rownames corresponding to legislators' names rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column # in the Stata file and deletes all spaces ## recode votes to 0/1/NA rownames uses the "name" variable as labels for the rows of the matrix ## 1 yea, 0 nay, NA missing ncdata <- ncol(mydata) ncol gets the number of columns of the matrix submat <- mydata[,10:ncdata] Note that the first roll call is the 10th variable in Stata submat[submat==0] <- -999 submat[submat==7] <- -999 submat[submat==8] <- -999 submat[submat==9] <- -999 submat[submat==4] <- 0 submat[submat==5] <- 0 submat[submat==6] <- 0 submat[submat==2] <- 1 submat[submat==3] <- 1 submat[submat==-999] <- NA mydata[,10:ncdata] <- submat # ## change name of mydata to whatever filehead is parse creates the character string "sen104kh <- mydata" expr <- parse(text=paste(filehead2, " <- mydata")) using the paste command to concatenate the strings eval(expr) expr (expression) creates an unevaluated call and eval(expression) # evaluates the unevaluated expression. Note that this is a clever way ## save as an .rda file of passing R a command that would normally be entered from the command line expr <- parse(text=paste("save(", filehead2, ", file=\"", filehead, ".rda\", ascii=TRUE)", sep="")) This produces eval(expr) expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE)) and eval(expr) executes the save(...) command Note that the \" says use " as part of this stringWhen you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.
# # # keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack. # Program written by Kevin Quinn and modified by Keith Poole # library(MCMCpack) # load("c:/ucsd_homework_5/sen104kh.rda") # Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)] Pull in the roll calls # posterior <- MCMCirt1d(Sen.rollcalls, Here is the call to the One-Dimensional IRT model in MCMCPack theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2), Note the Kennedy-Helms Constraint burnin=2000, mcmc=100000, thin=20, verbose=500) The Verbose switch writes to the screen every 500 iterations geweke.diag(posterior) geweke.diag is a convergence diagnostic for Markov Chains pdf(file="c:/ucsd_homework_5/keithplots3.pdf") This opens a PDF file for writing plots plot(posterior) Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain sss <- summary(posterior) write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt") This just writes the statistics to a file write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt") This just writes the quantiles to a file dev.off() This turns off the PDF output file plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain MatrixRun keithMCMC.r and you should see something like this:
SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 1 919 20 36 1 2 20 0.005 5 (36A1,12000I1) (I5,1X,36A1,2I5,50F8.3)The first line is the roll call data file, the second line is the title, the third line, in order, is the number of dimensions, the number of roll calls, the number of iterations (leave as is), the number of "characters" to read off the front of each legislator's record (leave as is), the record number of a Senator that you believe is on the "Left" (this just sets "liberal" on the left -- lower ranks or negative values on the first dimension if the scaling is in two or more dimensions -- and "conservative" on the right), and the record number of a Senator that you believe is on "Up" or on positive side of the second dimension (in the U.S. context, these are the Southern Democrats in an earlier era). The last line is the format for the coordinate output file. Please study the examples on the Optimal Classification Program page because we will be using this program several more times!
09 MAY 2007 13.38.10.53. RANDOM NUMBER SEED 62600 SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 1 919 20 36 1 2 20 0.005 5 (36A1,25000I1) (I5,1X,36A1,2I5,50F8.3) ****************************************************************************** 1 ROLL CALLS 1 8178 81250 0.10065 0.89935 0.71583 2 LEGISLATORS 1 8096 81250 0.09964 0.90036 0.71868 0.99848 3 ROLL CALLS 1 8025 81250 0.09877 0.90123 0.72115 4 LEGISLATORS 1 8016 81250 0.09866 0.90134 0.72146 0.99973 5 ROLL CALLS 1 7989 81250 0.09833 0.90167 0.72240 6 LEGISLATORS 1 7988 81250 0.09831 0.90169 0.72244 0.99991 7 ROLL CALLS 1 7981 81250 0.09823 0.90177 0.72268 8 LEGISLATORS 1 7981 81250 0.09823 0.90177 0.72268 0.99992 9 ROLL CALLS 1 7970 81250 0.09809 0.90191 0.72306 10 LEGISLATORS 1 7970 81250 0.09809 0.90191 0.72306 0.99999 SHARPEN 2 7957 81250 SHARPEN 3 7921 81250 SHARPEN 4 7886 81250 SHARPEN 5 7875 81250 L-PERMUTATIONS 5 7875 81250 0.09692 0.90308 0.72636 13.38.10.54. ELAPSED TIME OF JOB 13.44.37.32.The seventh column reports the correct classification. For the 104th Senate the algorithm quickly converges to 0.90308 (90.3%).
09 MAY 2007 13.38.10.53. 1 1041427521 0ILLINOI 10000SIMON 88 818 0.892 1.000 2 1044930925 0WISCONS 10000FEINGOLD 109 827 0.868 2.000 3 1044910133 0MINNESO 10000WELLSTONE 54 826 0.935 3.000 4 1041487172 0OREGON 10000WYDEN 24 264 0.909 4.000 5 1041501171 0CALIFOR 10000BOXER 56 813 0.931 5.000 6 10410808 3 0MASSACH 10000KENNEDY, ED 53 806 0.934 6.000 7 1041470923 0MICHIGA 10000LEVIN, CARL 57 826 0.931 7.000 8 1041470212 0NEW JER 10000BRADLEY 98 757 0.871 8.000 9 1041491412 0NEW JER 10000LAUTENBERG 70 821 0.915 9.000 10 10414307 6 0VERMONT 10000LEAHY 71 807 0.912 10.000 etc. etc. etc. 100 1042936733 0MINNESO 20000GRAMS 43 816 0.947 99.500 101 1044930447 0NORTH C 20000FAIRCLOTH 45 800 0.944 101.000 102 1041480362 0COLORAD 20000BROWN, HANK 81 828 0.902 102.000 103 1041503961 0ARIZONA 20000MCCAIN 86 801 0.893 103.000 104 1041542961 0ARIZONA 20000KYL 40 818 0.951 104.000Note that below the legislators are the rank positions for the roll call cutting points. Write an Epsilon macro to combine the rank ordering of the Senators with the Bayesian MCMC coordinates (the median from the quantiles file). Turn in the macro and a printout of the combined file. Leave the header on the combined file.
09 MAY 2007 14.14.01.93. RANDOM NUMBER SEED 35000 SEN104KH.ORD NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE 2 919 20 36 1 2 20 0.005 5 (36A1,25000I1) (I5,1X,36A1,2I5,50F8.3) ****************************************************************************** 1 ROLL CALLS 2 7240 81250 0.08911 0.91089 0.74843 0.00000 LEGISLATORS 2 7199 81250 0.08860 0.91140 0.74985 0.00000 2 ROLL CALLS 2 7099 81250 0.08737 0.91263 0.75333 0.99610 LEGISLATORS 2 7087 81250 0.08722 0.91278 0.75374 0.99859 3 ROLL CALLS 2 7065 81250 0.08695 0.91305 0.75451 0.99800 LEGISLATORS 2 7061 81250 0.08690 0.91310 0.75465 0.99981 4 ROLL CALLS 2 7050 81250 0.08677 0.91323 0.75503 0.99886 LEGISLATORS 2 7046 81250 0.08672 0.91328 0.75517 0.99995 5 ROLL CALLS 2 7039 81250 0.08663 0.91337 0.75541 0.99927 LEGISLATORS 2 7036 81250 0.08660 0.91340 0.75552 0.99992 6 ROLL CALLS 2 7033 81250 0.08656 0.91344 0.75562 0.99848 LEGISLATORS 2 7031 81250 0.08654 0.91346 0.75569 0.99998 7 ROLL CALLS 2 7025 81250 0.08646 0.91354 0.75590 0.99952 LEGISLATORS 2 7024 81250 0.08645 0.91355 0.75593 0.99999 8 ROLL CALLS 2 7021 81250 0.08641 0.91359 0.75604 0.99960 LEGISLATORS 2 7020 81250 0.08640 0.91360 0.75607 1.00000 9 ROLL CALLS 2 7020 81250 0.08640 0.91360 0.75607 0.99938 LEGISLATORS 2 7020 81250 0.08640 0.91360 0.75607 0.99999 10 ROLL CALLS 2 7019 81250 0.08639 0.91361 0.75611 0.99950 LEGISLATORS 2 7016 81250 0.08635 0.91365 0.75621 0.99810 11 ROLL CALLS 2 6994 81250 0.08608 0.91392 0.75698 0.99906 LEGISLATORS 2 6994 81250 0.08608 0.91392 0.75698 1.00000 12 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99927 LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000 13 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99946 LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000 14 ROLL CALLS 2 6992 81250 0.08606 0.91394 0.75705 0.99961 LEGISLATORS 2 6992 81250 0.08606 0.91394 0.75705 1.00000 15 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99921 LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000 16 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99998 LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000 17 ROLL CALLS 2 6990 81250 0.08603 0.91397 0.75711 0.99961 LEGISLATORS 2 6990 81250 0.08603 0.91397 0.75711 1.00000 18 ROLL CALLS 2 6989 81250 0.08602 0.91398 0.75715 0.99960 LEGISLATORS 2 6989 81250 0.08602 0.91398 0.75715 1.00000 19 ROLL CALLS 2 6988 81250 0.08601 0.91399 0.75718 0.99996 LEGISLATORS 2 6988 81250 0.08601 0.91399 0.75718 1.00000 20 ROLL CALLS 2 6988 81250 0.08601 0.91399 0.75718 0.99998 LEGISLATORS 2 6988 81250 0.08601 0.91399 0.75718 1.00000 MEAN VOLUME LEG. 0.0026 0.0032 MACHINE PREC. 2 6988 81250 0.08601 0.91399 0.75718 MACHINE PREC. 2 6988 81250 0.08601 0.91399 0.75718 14.14.01.93. ELAPSED TIME OF JOB 14.15.35.76.The correct classification is now 91.399% (0.91399). The first few lines of PERF25.DAT should look very similar to this:
09 MAY 2007 14.14.01.93. 1 1049990999 0USA 10000CLINTON 9 134 0.933 0.034 -0.256 0.014 2 1041470541 0ALABAMA 10000HEFLIN 95 803 0.882 0.002 -0.086 0.591 3 1049465941 0ALABAMA 20000SHELBY 58 805 0.928 0.002 0.253 0.176 4 1041490781 0ALASKA 20000MURKOWSKI 32 798 0.960 0.002 0.245 0.088 5 1041210981 0ALASKA 20000STEVENS 52 799 0.935 0.002 0.218 0.081 6 1041542961 0ARIZONA 20000KYL 40 818 0.951 0.002 0.301 -0.017 7 1041503961 0ARIZONA 20000MCCAIN 84 801 0.895 0.002 0.272 -0.195 8 1041430042 0ARKANSA 10000BUMPERS 70 794 0.912 0.003 -0.334 -0.005 9 1041079142 0ARKANSA 10000PRYOR 67 762 0.912 0.002 -0.293 0.169 10 1041501171 0CALIFOR 10000BOXER 53 813 0.935 0.002 -0.344 -0.120 etc etc etcThe last two columns are the two-dimensional coordinates. Make a two-dimensional plot of the coordinates using R. The party code is just to the right of the state name: 100=Democrat, 200=Republican. The state codes are to the left of the column of zeroes at the beginning of the state names. Use "D" for non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" representing President Clinton. You can modify my R program:
# rm(list=ls(all=TRUE)) # library(foreign) # ## does the 102nd senate ## (assumes the relevant .dta files are in the current directory) # ## write out the filehead (only need to do that here) filehead <- "c:/ucsd_homework_5/sen108kh" filehead2 <- "sen108kh" # ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE ## ####################################################### # ## read data in filestring <- paste(filehead, ".dta", sep="") mydata <- read.dta(filestring) # ## add rownames corresponding to legislators' names rownames(mydata) <- gsub(" ", "", mydata$name) # ## recode votes to 0/1/NA ## 1 yea, 0 nay, NA missing ncdata <- ncol(mydata) submat <- mydata[,10:ncdata] submat[submat==0] <- -999 submat[submat==7] <- -999 submat[submat==8] <- -999 submat[submat==9] <- -999 submat[submat==4] <- 0 submat[submat==5] <- 0 submat[submat==6] <- 0 submat[submat==2] <- 1 submat[submat==3] <- 1 submat[submat==-999] <- NA mydata[,10:ncdata] <- submat ## change name of mydata to whatever filehead is expr <- parse(text=paste(filehead2, " <- mydata")) eval(expr) ## save as an .rda file expr <- parse(text=paste("save(", filehead2, ", file=\"", filehead, ".rda\", ascii=TRUE)", sep="")) eval(expr)This will create the SEN108KH.RDA file.
# library(MCMCpack) load("c:/ucsd_homework_5/sen108kh.rda") Sen.rollcalls <- sen108kh[,10:ncol(sen108kh)] posterior <- MCMCirt1d(Sen.rollcalls, theta.constraints=list("KENNEDYED"=-2, "KYL"=2), burnin=2000, mcmc=100000, thin=20, verbose=500) geweke.diag(posterior) pdf(file="c:/ucsd_homework_5/keithplots3.pdf") plot(posterior) summary(posterior) sss <- summary(posterior) write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt") write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt") dev.off() plot(posterior[,1])