Simulation Study of Linking using mirt

This simulation study is to show how to do IRT Linking Process using mirt R Package. The simulation data includes 2 forms - Form A and Form B. These 2 forms are simulated based on 2 groups of individual, one group has 0 mean trait, another has 0.25 mean trait. Both groups have same sd.

Introduction

The means and sds of simulated Form A and B are like:

$$ \theta_{A} = \theta_{B} - 0.25 \ \sigma_A^2 = \sigma_B^2 $$

Calibration of Form A

The mean of \(\theta\) for individuals administrated with form A is 0, the standard deviation (SD=1). In the dataset, X is ID, V1 is true trait ($\theta$), V3 to V52 is unique items, V54 to V63 are common items.

Look at the data

First of all, have a look at the data

library(mirt)
library(tidyverse)
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'readr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
library(knitr)
## Warning: package 'knitr' was built under R version 4.1.2
### Read in Raw data from Form A:
dat <- read.csv(file="files/FormA.csv")
glimpse(dat)
## Rows: 5,000
## Columns: 64
## $ X   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,…
## $ V1  <dbl> -0.79498, -0.05589, 0.62650, -1.26832, 0.74921, 1.00922, -0.19147,…
## $ V2  <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V3  <int> 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, …
## $ V4  <int> 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ V5  <int> 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, …
## $ V6  <int> 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, …
## $ V7  <int> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, …
## $ V8  <int> 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, …
## $ V9  <int> 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, …
## $ V10 <int> 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, …
## $ V11 <int> 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, …
## $ V12 <int> 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, …
## $ V13 <int> 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, …
## $ V14 <int> 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, …
## $ V15 <int> 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, …
## $ V16 <int> 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, …
## $ V17 <int> 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ V18 <int> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, …
## $ V19 <int> 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, …
## $ V20 <int> 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, …
## $ V21 <int> 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ V22 <int> 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, …
## $ V23 <int> 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, …
## $ V24 <int> 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, …
## $ V25 <int> 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, …
## $ V26 <int> 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, …
## $ V27 <int> 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, …
## $ V28 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, …
## $ V29 <int> 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, …
## $ V30 <int> 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, …
## $ V31 <int> 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, …
## $ V32 <int> 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, …
## $ V33 <int> 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, …
## $ V34 <int> 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, …
## $ V35 <int> 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, …
## $ V36 <int> 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, …
## $ V37 <int> 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, …
## $ V38 <int> 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, …
## $ V39 <int> 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, …
## $ V40 <int> 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, …
## $ V41 <int> 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, …
## $ V42 <int> 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, …
## $ V43 <int> 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, …
## $ V44 <int> 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, …
## $ V45 <int> 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, …
## $ V46 <int> 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, …
## $ V47 <int> 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, …
## $ V48 <int> 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, …
## $ V49 <int> 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, …
## $ V50 <int> 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, …
## $ V51 <int> 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, …
## $ V52 <int> 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, …
## $ V53 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V54 <int> 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, …
## $ V55 <int> 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, …
## $ V56 <int> 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, …
## $ V57 <int> 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, …
## $ V58 <int> 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, …
## $ V59 <int> 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, …
## $ V60 <int> 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, …
## $ V61 <int> 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, …
## $ V62 <int> 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, …
## $ V63 <int> 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, …

Plot the density of true \(\theta\) of Group A

From the density function, \(mu_{\theta}\) is 0, \(sd_{\theta}\) is 1.

plot(density(dat$V1), main = "Group A True Trait Density", 
     xlab=expression(theta) )

CTT Table

CTT table could provide a brief description of table. 2 key valables in the table is item difficulty (item.diff) calculated by item means \(P(y=1)\) and item discrimination (item.disc), which is item-total correlation.

Clean data

By cleaning, data has 60 items including 50 unique item (from item1 to item50) and 10 common items (from item51 to item60). The sample size is 5000.

dat_cali <- dat %>% select(V3:V52, V54:V63)
colnames(dat_cali) <- paste0("item",1:60)
N <- nrow(dat_cali)
n <- ncol(dat_cali)

Classical Test Theory

Then calculate the CTT table for Form A. The item discrimination and difficulty could be compared between Form A and Form B. Because the relationship between trait and total score is non-linear, so there is effect of shrinkage.

# item stats
## item discrimnation
item.disc <- apply(dat_cali, 2, function(x) cor(x, rowSums(dat_cali, na.rm = TRUE)))

## item difficulty
item.diff <- colMeans(dat_cali)

## item response frequency
item.freq <- reduce(lapply(dat_cali, table), bind_rows)

CTT <- cbind(item.disc, item.diff, item.freq)

kable(CTT, digits = 3, caption = "CTT Table for Form A")

Table: Table 1: CTT Table for Form A

item.discitem.diff01
item10.3650.76111973803
item20.4640.8338354165
item30.4570.53923052695
item40.4710.52423802620
item50.3850.22138931107
item60.4000.56321832817
item70.2610.45727172283
item80.2640.36531761824
item90.3310.56021982802
item100.5050.43128432157
item110.3060.35432281772
item120.5280.56021992801
item130.3620.55022512749
item140.4270.76111963804
item150.4330.24137931207
item160.3770.45827102290
item170.5240.54522752725
item180.3510.57821122888
item190.4290.50924552545
item200.3300.30734651535
item210.3990.40329842016
item220.4180.60919573043
item230.3560.35432321768
item240.4600.67016513349
item250.3880.50724652535
item260.2090.31034511549
item270.2420.47626182382
item280.4950.50224922508
item290.4410.48125932407
item300.4500.73113443656
item310.6010.39930061994
item320.3960.68215883412
item330.5840.61319333067
item340.4890.55722162784
item350.4670.75912033797
item360.3120.36731661834
item370.4100.77811093891
item380.4430.57121432857
item390.5190.75512243776
item400.2800.32733651635
item410.4130.58220882912
item420.3380.39530241976
item430.4300.63418303170
item440.3480.38230881912
item450.3700.53023512649
item460.2860.33733141686
item470.4560.40329852015
item480.3400.39730141986
item490.3100.29135441456
item500.4350.35732141786
item510.2830.8218954105
item520.2720.32433821618
item530.4740.53123472653
item540.2980.33633181682
item550.4270.54622712729
item560.4710.66716673333
item570.4100.47026522348
item580.4830.43328342166
item590.3630.23238411159
item600.3760.66616703330

Final Calibration of Form A

Model Specification

SPECS <- mirt.model('F = 1-60
                    PRIOR = (1-60, a1, lnorm, 0, 1),
                    (1-60,  d,  norm, 0, 1),
                    (1-60,  g,  norm, -1.39,1)')
mod_A3PL <- mirt(data=dat_cali, model=SPECS, itemtype='3PL')
parms_a <- coef(mod_A3PL, simplify=TRUE, IRTpars = TRUE)$items
a_A <- parms_a[,1] 
b_A <- parms_a[,2] 
c_A <- parms_a[,3] 
theta_A <- fscores(mod_A3PL,method="EAP",
                   full.scores=TRUE, full.scores.SE=TRUE,
                   scores.only=TRUE)

Using 3-PL for irt model of form A. Extracting the cofficients (a, b, c) of IRT. The model-implied theta was outputed, whose mean is 0.1314943

The plot below suggest that SE is low when theta is close to mean, but low theta and high theta has large SE.

plot(theta_A[,1],theta_A[,2])

Calibration of Form B

The structure of Form B is as same as Form A.

dat_b <- read.csv(file="files/FormB.csv")
glimpse(dat_b)
## Rows: 5,000
## Columns: 64
## $ X   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,…
## $ V1  <dbl> 1.12513, 0.29724, -0.54719, 1.18477, 0.37517, 0.05821, 1.11918, 1.…
## $ V2  <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V3  <int> 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, …
## $ V4  <int> 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, …
## $ V5  <int> 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, …
## $ V6  <int> 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, …
## $ V7  <int> 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, …
## $ V8  <int> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, …
## $ V9  <int> 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, …
## $ V10 <int> 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, …
## $ V11 <int> 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, …
## $ V12 <int> 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, …
## $ V13 <int> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, …
## $ V14 <int> 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, …
## $ V15 <int> 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, …
## $ V16 <int> 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, …
## $ V17 <int> 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, …
## $ V18 <int> 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, …
## $ V19 <int> 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, …
## $ V20 <int> 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, …
## $ V21 <int> 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, …
## $ V22 <int> 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, …
## $ V23 <int> 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, …
## $ V24 <int> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, …
## $ V25 <int> 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, …
## $ V26 <int> 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, …
## $ V27 <int> 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, …
## $ V28 <int> 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, …
## $ V29 <int> 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, …
## $ V30 <int> 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, …
## $ V31 <int> 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, …
## $ V32 <int> 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, …
## $ V33 <int> 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, …
## $ V34 <int> 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, …
## $ V35 <int> 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, …
## $ V36 <int> 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, …
## $ V37 <int> 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ V38 <int> 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, …
## $ V39 <int> 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, …
## $ V40 <int> 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, …
## $ V41 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, …
## $ V42 <int> 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, …
## $ V43 <int> 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, …
## $ V44 <int> 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, …
## $ V45 <int> 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ V46 <int> 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, …
## $ V47 <int> 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, …
## $ V48 <int> 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, …
## $ V49 <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, …
## $ V50 <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, …
## $ V51 <int> 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, …
## $ V52 <int> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, …
## $ V53 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V54 <int> 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, …
## $ V55 <int> 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, …
## $ V56 <int> 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, …
## $ V57 <int> 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, …
## $ V58 <int> 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, …
## $ V59 <int> 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, …
## $ V60 <int> 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, …
## $ V61 <int> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, …
## $ V62 <int> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ V63 <int> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, …
library(ggplot2)

ggplot(data = dat_b) +
  ggtitle("Group B True Trait Density") +
  geom_density(aes(x = V1), fill = "skyblue", col = "skyblue", alpha=0.8) +
  xlab("True Latent Trait")
dat_cali2 <- dat_b %>% select(V3:V52, V54:V63)
colnames(dat_cali2) <- paste0("item",1:60)
N <- nrow(dat_cali2)
n <- ncol(dat_cali2)


# item stats
## item discrimnation
item.disc <- apply(dat_cali2, 2, function(x) cor(x, rowSums(dat_cali, na.rm = TRUE)))

## item difficulty
item.diff <- colMeans(dat_cali2)

## item response frequency
item.freq <- reduce(lapply(dat_cali2, table), bind_rows)

CTT2 <- cbind(item.disc, item.diff, item.freq)

kable(CTT2, digits = 3, caption = "CTT Table for Form A")

Table: Table 2: CTT Table for Form A

item.discitem.diff01
item1-0.0290.51724152585
item2-0.0050.57321372863
item3-0.0370.55222422758
item40.0080.51924052595
item50.0130.70914553545
item6-0.0120.8049794021
item70.0170.68715673433
item80.0030.64117943206
item90.0000.62618693131
item10-0.0210.39230381962
item110.0210.8218934107
item12-0.0120.47426292371
item13-0.0260.68715643436
item140.0080.51724142586
item150.0040.47526262374
item16-0.0120.52523752625
item17-0.0250.36331851815
item180.0010.39330361964
item190.0030.49025502450
item200.0010.50524772523
item210.0060.71014513549
item220.0150.24837591241
item230.0070.59420302970
item24-0.0100.36032001800
item250.0110.66416813319
item26-0.0020.79010503950
item270.0110.39530261974
item280.0090.35332361764
item29-0.0030.33533251675
item30-0.0150.34632681732
item310.0050.34033011699
item320.0070.61319353065
item33-0.0260.37731171883
item340.0030.50025022498
item35-0.0070.43628212179
item36-0.0090.76711663834
item370.0030.42828612139
item380.0020.59320332967
item390.0110.66316833317
item400.0200.60619683032
item41-0.0100.8407994201
item420.0150.55422322768
item430.0120.8248804120
item44-0.0180.43328372163
item45-0.0050.42928552145
item46-0.0190.31634211579
item47-0.0070.8278674133
item48-0.0230.76012013799
item490.0090.38930551945
item50-0.0130.31734131587
item51-0.0020.8487624238
item52-0.0180.36231911809
item530.0070.58021012899
item54-0.0130.36331841816
item55-0.0230.58120962904
item56-0.0340.73113443656
item57-0.0070.52423792621
item58-0.0170.49125472453
item590.0040.27536271373
item600.0220.71514233577

Final Calibration of Form A

Model Specification of B

SPECS2 <- mirt.model('F = 1-60
                    PRIOR = (1-60, a1, lnorm, 0, 1),
                    (1-60,  d,  norm, 0, 1),
                    (1-60,  g,  norm, -1.39,1)')
mod_B3PL <- mirt(data=dat_cali2, model=SPECS, itemtype='3PL')
parms_b <- coef(mod_B3PL, simplify=TRUE, IRTpars = TRUE)$items
a_B <- parms_b[,1] 
b_B <- parms_b[,2] 
c_B <- parms_b[,3] 
theta_B <- fscores(mod_B3PL,method="EAP",
                   full.scores=TRUE, full.scores.SE=TRUE,
                   scores.only=TRUE)
head(theta_B, 20) %>% kable(digits = 3, caption = "Model-implied Theta of B")

Table: Table 3: Model-implied Theta of B

FSE_F
0.9330.214
0.1930.220
-0.7830.293
0.8650.208
-0.2330.261
-0.2170.239
0.8900.215
1.3690.232
-0.8750.306
-0.4110.266
1.0910.223
-0.6380.261
-1.2530.321
-0.7840.261
0.4360.228
0.1350.211
-1.8410.491
-0.4910.259
0.0650.225
-0.3390.240

b-plot

The relationship between b parameters of A and B reflect the latent traits of A and B:

$$ \theta_{A} = \theta_{B} - 0.25 \ \sigma_A^2 = \sigma_B^2 $$ thus, \(b_B -b_A\) should also be -0.25. the estimated difference of b is calculate by the mean of b parametes of Form A’s common items and that of Form B’s common items, which is -0.2756464. Thus, it is very close to difference of true traits.

### b-plot
plot(b_A[51:60],b_B[51:60],
     main=paste0("r =", round(cor(b_A[51:60],b_B[51:60]),5)), 
     xlab = "b_A",
     ylab = "b_B"
     )
# mean(b_B[51:60])-mean(b_A[51:60])

a-plot

Because \(\theta_B - \theta_A = 0.25\), so a parametes are: $$ a_A / a_B = 1 $$

The true ratio of (means of) a parameters is 1.016284, which is very close to 1. SD of A and B are both close to 1.

### a-plot
plot(a_A[51:60],a_B[51:60],
     main=round(cor(a_A[51:60],a_B[51:60]),5)
     )
# mean(a_A[51:60]) / mean(a_B[51:60]) 


#SDs of b-values across forms
SD_bA <- sd(b_A[51:60])
SD_bB <- sd(b_B[51:60])

Mean_bA <- mean(b_A[51:60])
Mean_bB <- mean(b_B[51:60])

Linking

###
### Run one or the other, NOT BOTH!
###

### MS linking: place item parameters from
### Form B on the scale of Form A
slope <- SD_bA / SD_bB
inter <- Mean_bA - slope*Mean_bB

### MM linking: place item parameters from
### Form B on the scale of Form A
# slope <- Mean_aA / Mean_aB
# inter <- Mean_bA - slope*Mean_bB


###
### Perform the Linking
###

LINKED_items <- matrix(0,50,3)

#Column 3 is c, it stays the same
LINKED_items[,3] <- c_B[1:50]

#Column 2 is b, it is linked:
LINKED_items[,2] <- b_B[1:50]*slope + inter

#Column 1 is a, it is also linked:
LINKED_items[,1] <- a_B[1:50] / slope

### Now the ITEM BANK has 110 items all
### linked to a common metric!
rownames(LINKED_items) <- paste0("item_B", 1:50)
ITEM.BANK <- rbind(parms_a[,-4], LINKED_items)
# write.csv(ITEM.BANK,file="item_bank.csv")

#Link the theta estimates
LINKED_theta <- theta_B[,1]*slope + inter
LINKED_se <- theta_B[,2]/slope

# write.csv(cbind(LINKED_theta,LINKED_se),file="LINKED_FormB_theta_est.csv")
paste0("Mean of Theta of B is ",mean(theta_B[,1]) %>% round(3))
## [1] "Mean of Theta of B is -0.013"
paste0("SD of Theta of B is ", sd(theta_B[,1]) %>% round(3))
## [1] "SD of Theta of B is 0.97"
# after Linked
print("After Linking:")
## [1] "After Linking:"
paste0("Mean of Theta of B is ",mean(LINKED_theta) %>% round(3))
## [1] "Mean of Theta of B is 0.269"
paste0("SD of Theta of B is ",sd(LINKED_theta) %>% round(3))
## [1] "SD of Theta of B is 0.935"
# Theta of A is
paste0("Mean of Theta of A is ",mean(theta_A[,1]) %>% round(3))
## [1] "Mean of Theta of A is -0.021"
paste0("SD of Theta of A is ", sd(theta_A[,1]) %>% round(3))
## [1] "SD of Theta of A is 0.965"
Jihong Zhang, M.S.
Jihong Zhang, M.S.
Ph.D. Candidate

My research interests mainly focus on the Bayesian Diagnostic Classification Models (DCMs) - a special kind of Item Response Model and the model checking method, as applied in the psychological, educational, and social sciences. I seek to improve the utility of advanced psychometric modeling and provide easy-to-use tools or software for researchers.