POLS 6386 MEASUREMENT THEORY
Sixth Assignment
Due 11 March 2003
#
# Program to Illustrate Singular Value Decomposition
#
# Draws n points from bivariate normal distribution
# and draws graph of points along with the
# Singular Vectors
#
nrow <- 1000
ncol <- 2
#
# Create Variance-Covariance Matrix
#
# 1.0 0.9
# 0.9 1.0
#
Sigma <- matrix(c(1,.9,.9,1),2,2)
#
# Call Bivariate Normal with zero mean and
# Sigma Var-Cov Matrix, Place in X
#
X <- mvrnorm(n=nrow,rep(0,2),Sigma)
#
# Perform Singular Value Decomposition
#
xsvd <- svd(X)
#
# The Two Lines Below Put the Singular Values in a
# Diagonal Matrix -- The first one creates an
# identity matrix and the second command puts
# the singular values on the diagonal
#
Lambda <- diag(ncol)
diag(Lambda) <- xsvd$d
#
# Compute U*LAMBDA*V' for check below
#
XX <- xsvd$u %*% Lambda %*% t(xsvd$v)
#
# Do basic plot with two line title
#
#
plot(X[,1],X[,2],asp=1,xlab="First Dimension",ylab="Second Dimension",xlim=c(-3,3),ylim=c(-3,3),col="blue")
mtext(side=3,line=1.5,"Bivariate Normal Random Draw: Correlation = .9 \nEigenvectors Shown in Red",font=2)
#
# Draw arrows for the Singular Vectors -- To make the
# plot look good, set the length of the first
# singular vector = 2.5, and the second singular
# vector = (LAMBDA_2/LAMBDA_1)*2.5. This preserves
# the ratio of the first two singular values and
# graphs the singular vectors in terms of their
# importance.
#
rat11 <- 2.5
rat22 <- (sqrt(xsvd$d[2])/sqrt(xsvd$d[1]))*rat11
sval1 <- xsvd$v[1,1]*rat11
sval2 <- xsvd$v[2,1]*rat11
arrows(0.0, 0.0,sval1,sval2,length=0.1,lwd=3,col="red")
sval1 <- xsvd$v[1,2]*rat22
sval2 <- xsvd$v[2,2]*rat22
arrows(0.0, 0.0, sval1, sval2,length=0.1,lwd=3,col="red")
#
# Rotate Matrix to its Principle Axes and Decompose it Again
#
YY <- XX %*% xsvd$v
ysvd <- svd(YY)
#
# These Two Lines draw arrows for the Singular Vectors --
# They will point along the first and second axes
#
arrows(0.0, 0.0, ysvd$v[1,1], ysvd$v[2,1],length=0.1,lwd=3,col="black")
arrows(0.0, 0.0, ysvd$v[1,2], ysvd$v[2,2],length=0.1,lwd=3,col="black")
#
#
# Compute Fit of SVD -- This is just the sum of squared
# error -- Note that ssesvd should be zero!
#
#
i <- 0
j <- 0
ssesvd <- 0
while (i < nrow) {
i <- i + 1
j <- 0
while (j < ncol) {
j <- j + 1
ssesvd <- ssesvd + (X[i,j] - XX[i,j])**2
}
}

plot(YY[,1],YY[,2],asp=1,xlab="Rotated First Dimension",ylab="Rotated Second Dimension",xlim=c(-3,3),ylim=c(-3,3),pch=16,col="blue") mtext(side=3,line=1.5,"Bivariate Normal Random Draw: Correlation = .9 \nConfiguration Rotated to Principle Axes",font=2)Turn in these graphs as well with the appropriate titles.
1681 0 10 1 1 1 1 1 63 4 4 5 7 1 2 2 3 7 1 1
1124 0 10 1 0 0 0 1 82 1 1 4 4 1 1 1 1 4 1 5
78 5 10 1 0 1 1 1 78 2 1 5 7 4 5 5 6 6 7 5
The variables, in order, are:
RESPONDENT ID = unique 4 digit number
PARTY ID = 0 to 6 -- 0 = Strong Democrat
1 = Weak Democrat
2 = Lean Democrat
3 = Independent
4 = Lean Republican
5 = Weak Republican
6 = Strong Republican
RAW INCOME = **do not use**
FAMILY INCOME = income quintile 1 - 5
SEX = 0 Man, 1 Woman
RACE = 0 White, 1 Black
SOUTH = 0 North, 1 South
EDUCATION = 1 High School or less, 2 Some College, 3 College
AGE = In Years
URBAN UNREST SCALE = Johnson, Humphrey, Nixon, Wallace, Self-Placement
VIETNAM SCALE = Johnson, Humphrey, Nixon, Wallace, Self-Placement
VOTED = 1 Voted, 5 Did Not Vote
LINE # CASE # R POS ALPHA BETA SCALED POS RSQ
1 1681 1.0 -2.6243 0.5249 -2.0994 0.9994 0.9997
2 1124 1.0 -0.8831 0.3532 -0.5298 0.6790 0.8240
3 78 4.0 -0.9588 0.2557 0.0639 0.8992 0.9483
4 553 4.0 -1.2302 0.2895 -0.0724 0.6460 0.8037
The second column is the respondent ID number. Use the respondent ID number
to match OLS68A.DAT with the output file and insert the party ID code into
the output file. After you have inserted the party code you can delete all the
columns except the party code, BETA (you will need that for graphing),
and the Scaled Position. If you have done everything correctly the first few
lines of your file should look like this:
0 0.5249 -2.0994
0 0.3532 -0.5298
5 0.2557 0.0639
1 0.2895 -0.0724
1 0.2763 0.6907
1 1.3930 -0.3482
0 1.0597 -0.5298
0 0.2322 -0.3482
0 0.4371 -0.8742
5 0.3901 0.6827
0 0.0033 0.0050
0 0.2119 -0.5298
1 0.2624 0.2624
2 0.2938 -0.5142
1 0.2745 0.1373
etc.
etc.
etc.
Turn in the Epsilon macro you write that inserts
the party variable into the coordinate file. In the macro, use a split
screen and place the coordinate file in the top screen and
OLS68A.DAT in the bottom screen.
strong.democrat <- T[T[,1]==0 & T[,2] > 0,3] This selects only the strong
Democrats with Postive Betas From
the 3rd column of T
plot(density(strong.democrat),xlab="Scaled Value Urban Unrest",main="Strong Democrat",xlim=c(-3,3),type="l",lwd=3)
This does the smoothed histogram.
Note that type="l" is "l" as in Line not
the number "1"!!!
To make a similar plot for all Democrats with Positive Betas you have to
define the variable with:
all.democrat <- T[(T[,1]==0 | T[,1]==1 | T[,1]==2) & T[,2] > 0,3]
If party is 0 or 1 or 2 AND Beta is Positive,
Select column 3
To make the smoothed histogram simply change the arguments in the plot command above.
plot(density(strong.democrat),xlab="Scaled Value Urban Unrest",main="Strong Party Identifiers",xlim=c(-3,3),type="l",lwd=3,col="red")
lines(density(strong.republican),lwd=3,col="blue") This is the trick to get
a second line in the plot!
text(1,1.3,"Red = Strong Democrats\nBlue = Strong Republicans") This puts in
a simple legend
Make smoothed histograms showing strong Democrats and strong Republicans,
all Democrats and all Republicans, Independents by themselves, and strong Democrats,
strong Republicans, and Independents. Also report the number, mean, and standard
deviation for each variable you graph (put them in a nice, neat table!!).
To get the number simply use the
"length" command; for example:
SEN107KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 107TH SENATE
1 633 20 36 10
(36A1,3900I1)
(I5,1X,36A1,2I5,50F8.3)
The first line is the roll call data file, the second line is the title, the third
line, in order, is the number of dimensions, the number of roll calls, the
number of iterations (leave as is), the number of "characters" to read off
the front of each legislator's record (leave as is), and the record number of a
Senator that you believe is on the "Left" (this just sets "liberal" on the left --
lower ranks or negative values on the first dimension if the scaling is in
two or more dimensions -- and
"conservative" on the right). The last line is the format for
the coordinate output file.
21 FEBRUARY 2003 13.59.33.79.
RANDOM NUMBER SEED 116300
SEN107KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 107TH SENATE
1 633 20 36 10
(36A1,3900I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 1 4102 48745 0.08415 0.91585 0.73739
2 LEGISLATORS 1 4044 48745 0.08296 0.91704 0.74110 0.99787
3 ROLL CALLS 1 3995 48745 0.08196 0.91804 0.74424
4 LEGISLATORS 1 3981 48745 0.08167 0.91833 0.74513 0.99903
5 ROLL CALLS 1 3953 48745 0.08110 0.91890 0.74693
6 LEGISLATORS 1 3946 48745 0.08095 0.91905 0.74738 0.99922
7 ROLL CALLS 1 3935 48745 0.08073 0.91927 0.74808
8 LEGISLATORS 1 3935 48745 0.08073 0.91927 0.74808 0.99996
9 ROLL CALLS 1 3931 48745 0.08064 0.91936 0.74834
10 LEGISLATORS 1 3932 48745 0.08066 0.91934 0.74827 0.99999
13.59.33.92.
ELAPSED TIME OF JOB 13.59.36.06.
The sixth column reports the correct classification. For the 107th
Senate the algorithm quickly converges to 0.91934 (91.9%).
21 FEBRUARY 2003 14.05.20.15.
1 1074930925 0WISCONS 10001FEINGOLD 75 498 0.849 1.000
2 1074010133 0MINNESO 10001DAYTON 19 486 0.961 2.000
3 1074010412 0NEW JER 10001CORZINE 33 495 0.933 3.000
4 1074910133 0MINNESO 10011WELLSTONE 6 484 0.988 4.000
5 1071501171 0CALIFOR 10001BOXER 23 481 0.952 5.000
6 1071303952 0MARYLAN 10001SARBANES 14 498 0.972 6.500
7 10729142 5 0RHODE I 10001REED 22 497 0.956 6.500
8 10710808 3 0MASSACH 10001KENNEDY, ED 25 480 0.948 8.000
9 1071502121 0ILLINOI 10001DURBIN 24 496 0.952 9.000
10 1071423031 0IOWA 10001HARKIN 29 488 0.941 10.000
11 10714213 1 0CONNECT 10001DODD 25 483 0.948 11.000
etc.
etc.
etc.
100 1071490853 0OKLAHOM 20001NICKLES, DO 30 495 0.939 100.000
101 1071462849 0TEXAS 20001GRAMM, PHIL 24 480 0.950 101.000
102 1071542961 0ARIZONA 20001KYL 25 495 0.949 102.000
******************************************************************************
1 1079991099 0USA 200 BUSH 2 63 0.968 70.500
2 1074970041 0ALABAMA 20001SESSIONS 54 481 0.888 65.000
3 1079465941 0ALABAMA 20001SHELBY 55 488 0.887 63.500
4 1071490781 0ALASKA 20001MURKOWSKI 26 459 0.943 70.500
5 1071210981 0ALASKA 20001STEVENS 27 452 0.940 62.000
6 1071542961 0ARIZONA 20001KYL 25 495 0.949 102.000
7 1071503961 0ARIZONA 20001MCCAIN 124 478 0.741 57.000
8 1072930642 0ARKANSA 20001HUTCHINSON, 47 484 0.903 66.000
9 1072930542 0ARKANSA 10001LINCOLN 33 495 0.933 47.000
10 1071501171 0CALIFOR 10001BOXER 23 481 0.952 5.000
etc.
etc.
Use Epsilon to combine the rank ordering
of the Senators with the W-NOMINATE coordinates
and the KYST coordinates you
estimated for Homework 3 Q.2. Read the combined
file into R and compute the correlations between
the rank ordering with the first dimension of W-NOMINATE,
the first dimension of the KYST two dimensional coordinates, and the KYST one
dimensional coordinates. Graph the rank ordering (horizontal axis) against
the first dimension of W-NOMINATE with the
appropriate titles and labels and graph the rank ordering (horizontal axis)
against the first dimension of the two-dimensional
KYST coordinates with the appropriate titles
and labels.
etc.
etc.
PERFORMANCE INDEX EIGENVALUE/VECTOR ROUTINE= 1 102 0 0
1 9.1091 70.3239 70.3239 11.4572 57.7452 57.7452
2 0.2922 2.2557 72.5796 0.7181 3.6190 61.3642
3 0.2525 1.9493 74.5289 0.4965 2.5023 63.8665
4 0.1772 1.3683 75.8973 0.3837 1.9337 65.8002
5 0.1557 1.2019 77.0992 0.3225 1.6256 67.4258
6 0.1430 1.1040 78.2032 0.2785 1.4038 68.8296
7 0.1300 1.0040 79.2071 0.2561 1.2906 70.1203
8 0.1068 0.8245 80.0317 0.2247 1.1327 71.2530
9 0.0918 0.7090 80.7407 0.1989 1.0023 72.2553
10 0.0783 0.6041 81.3448 0.1915 0.9650 73.2203
11 0.0752 0.5806 81.9254 0.1813 0.9138 74.1341
12 0.0684 0.5280 82.4534 0.1648 0.8306 74.9647
13 0.0654 0.5049 82.9582 0.1565 0.7889 75.7537
14 0.0583 0.4503 83.4086 0.1531 0.7714 76.5251
15 0.0562 0.4336 83.8422 0.1435 0.7232 77.2483
16 0.0500 0.3857 84.2279 0.1412 0.7118 77.9600
17 0.0468 0.3614 84.5893 0.1402 0.7066 78.6667
18 0.0441 0.3405 84.9298 0.1333 0.6719 79.3385
19 0.0407 0.3146 85.2444 0.1293 0.6519 79.9904
20 0.0388 0.2999 85.5443 0.1216 0.6129 80.6032
etc.
etc.
The second column with a first entry of 9.1091 are the eigenvalues. Make
a graph of the eigenvalues like the one you did for
Homework 3 Q.2.
21 FEBRUARY 2003 14.54.27.96.
RANDOM NUMBER SEED 87600
SEN107KH.ORD
NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 107TH SENATE
2 633 20 36 10
(36A1,3900I1)
(I5,1X,36A1,2I5,50F8.3)
******************************************************************************
1 ROLL CALLS 2 3723 48745 0.07638 0.92362 0.76165 0.00000
LEGISLATORS 2 3654 48745 0.07496 0.92504 0.76607 0.00000
2 ROLL CALLS 2 3594 48745 0.07373 0.92627 0.76991 0.99244
LEGISLATORS 2 3574 48745 0.07332 0.92668 0.77119 0.99799
3 ROLL CALLS 2 3552 48745 0.07287 0.92713 0.77260 0.99702
LEGISLATORS 2 3546 48745 0.07275 0.92725 0.77298 0.99808
4 ROLL CALLS 2 3540 48745 0.07262 0.92738 0.77337 0.99653
LEGISLATORS 2 3539 48745 0.07260 0.92740 0.77343 0.99991
5 ROLL CALLS 2 3534 48745 0.07250 0.92750 0.77375 0.99849
LEGISLATORS 2 3533 48745 0.07248 0.92752 0.77382 0.99995
6 ROLL CALLS 2 3531 48745 0.07244 0.92756 0.77394 0.99821
LEGISLATORS 2 3531 48745 0.07244 0.92756 0.77394 0.99985
7 ROLL CALLS 2 3528 48745 0.07238 0.92762 0.77414 0.99989
LEGISLATORS 2 3527 48745 0.07236 0.92764 0.77420 0.99955
8 ROLL CALLS 2 3525 48745 0.07232 0.92768 0.77433 0.99918
LEGISLATORS 2 3525 48745 0.07232 0.92768 0.77433 0.99999
9 ROLL CALLS 2 3524 48745 0.07229 0.92771 0.77439 0.99882
LEGISLATORS 2 3524 48745 0.07229 0.92771 0.77439 0.99999
10 ROLL CALLS 2 3524 48745 0.07229 0.92771 0.77439 0.99867
LEGISLATORS 2 3524 48745 0.07229 0.92771 0.77439 1.00000
11 ROLL CALLS 2 3523 48745 0.07227 0.92773 0.77446 0.99994
LEGISLATORS 2 3523 48745 0.07227 0.92773 0.77446 1.00000
12 ROLL CALLS 2 3523 48745 0.07227 0.92773 0.77446 0.99910
LEGISLATORS 2 3523 48745 0.07227 0.92773 0.77446 1.00000
13 ROLL CALLS 2 3522 48745 0.07225 0.92775 0.77452 0.99808
LEGISLATORS 2 3522 48745 0.07225 0.92775 0.77452 1.00000
14 ROLL CALLS 2 3521 48745 0.07223 0.92777 0.77458 0.99778
LEGISLATORS 2 3521 48745 0.07223 0.92777 0.77458 1.00000
15 ROLL CALLS 2 3521 48745 0.07223 0.92777 0.77458 0.99715
LEGISLATORS 2 3524 48745 0.07229 0.92771 0.77439 1.00000
16 ROLL CALLS 2 3521 48745 0.07223 0.92777 0.77458 0.99721
LEGISLATORS 2 3521 48745 0.07223 0.92777 0.77458 1.00000
17 ROLL CALLS 2 3521 48745 0.07223 0.92777 0.77458 0.99768
LEGISLATORS 2 3521 48745 0.07223 0.92777 0.77458 1.00000
18 ROLL CALLS 2 3521 48745 0.07223 0.92777 0.77458 0.99691
LEGISLATORS 2 3520 48745 0.07221 0.92779 0.77465 0.99999
19 ROLL CALLS 2 3520 48745 0.07221 0.92779 0.77465 0.99727
LEGISLATORS 2 3520 48745 0.07221 0.92779 0.77465 0.99999
20 ROLL CALLS 2 3434 48745 0.07045 0.92955 0.78015 0.99401
LEGISLATORS 2 3431 48745 0.07039 0.92961 0.78035 0.99948
MEAN VOLUME LEG. 0.0047 0.0068 3.6961 6.7777 -0.0264
MACHINE PREC. 2 3430 48745 0.07037 0.92963
14.54.28.01.
ELAPSED TIME OF JOB 14.56.17.40.
The correct classification is now 92.963% (0.92963). The first few lines of
PERF25.DAT should look very similar to this:
21 FEBRUARY 2003 14.54.27.96.
1 1079991099 0USA 200 BUSH 1 63 0.984 0.063 0.208 0.280
2 1074970041 0ALABAMA 20001SESSIONS 50 481 0.896 0.002 0.224 -0.178
3 1079465941 0ALABAMA 20001SHELBY 44 488 0.910 0.007 0.181 0.200
4 1071490781 0ALASKA 20001MURKOWSKI 21 459 0.954 0.004 0.198 0.140
5 1071210981 0ALASKA 20001STEVENS 32 452 0.929 0.003 0.177 0.157
6 1071542961 0ARIZONA 20001KYL 23 495 0.954 0.002 0.285 -0.359
7 1071503961 0ARIZONA 20001MCCAIN 62 478 0.870 0.002 0.130 -0.665
8 1072930642 0ARKANSA 20001HUTCHINSON, 45 484 0.907 0.012 0.199 0.187
9 1072930542 0ARKANSA 10001LINCOLN 27 495 0.945 0.002 -0.112 0.327
10 1071501171 0CALIFOR 10001BOXER 22 481 0.954 0.012 -0.275 -0.115
The last two columns are the two-dimensional coordinates. Make a two-dimensional plot
of the coordinates using R. Use the names of the
Senators and label the axes and the plot appropriately.