---
title: "Lecture 04"
subtitle: "Linear Regression Model with Stan II"
author: "Jihong Zhang"
institute: "Educational Statistics and Research Methods"
title-slide-attributes:
data-background-image: ../Images/title_image.png
data-background-size: contain
data-background-opacity: "0.9"
execute:
echo: true
warning: false
format:
html:
code-tools: true
code-line-numbers: false
code-fold: true
code-summary: 'Click here to see R code'
number-offset: 1
fig.width: 10
fig-align: center
message: false
revealjs:
logo: ../Images/UA_Logo_Horizontal.png
incremental: true # choose "false "if want to show all together
theme: [serif, ../pp.scss]
footer: <https://jihongzhang.org/posts/2024-01-12-syllabus-adv-multivariate-esrm-6553>
transition: slide
background-transition: fade
slide-number: true
chalkboard: true
number-sections: false
code-line-numbers: true
code-link: true
code-annotations: hover
code-copy: true
highlight-style: arrow
code-block-border-left: true
code-block-background: "#b22222"
output-file: slides-index.html
---
## Today's Lecture Objectives
1. Explain Stan Syntax for linear regression model
2. Computing Functions of Model Parameters
**Download R file [DietDataExample2.R]{.underline}**
## In previous class...
::::: columns
::: {.column width="50%"}
```{r}
#| output: true
#| code-fold: true
#| code-summary: "R code for data read in"
library(cmdstanr)
library(bayesplot)
library(tidyr)
library(dplyr)
library(kableExtra)
color_scheme_set('brightblue')
dat <- read.csv(here::here("teaching", "2024-01-12-syllabus-adv-multivariate-esrm-6553", "Lecture03", "Code", "DietData.csv"))
dat$DietGroup <- factor(dat$DietGroup, levels = 1:3)
dat$HeightIN60 <- dat$HeightIN - 60
kable( rbind(head(dat), tail(dat)) ) |>
kable_classic_2() |>
kable_styling(full_width = F, font_size = 15)
```
:::
::: {.column width="50%"}
1. Introduce the empty model
2. Example: Post-Diet Weights
- WeightLB (*Dependent Variable*): The respondents' weight in pounds
- HeightIN: The respondents' height in inches
- DietGroup: 1, 2, 3 representing the group to which a respondent was assigned
3. The empty model has two parameters to be estimated: (1) $\beta_0$, (2) $\sigma_e$
4. The posterior mean/median of $\beta_0$ should be mean of WeightLB
5. The posterior mean/median of $\sigma_e$ should be sd of WeightLB
:::
:::::
## Making `Stan` Code Short and Efficient
The Stan syntax from our previous model was lengthy:
- A declared variable for each parameter
- The linear combination of coefficients by multiplying predictors
Stan has built-in features to shorten syntax:
- Matrices/Vectors
- Matrix products
- Multivariate distribution (initially for prior distributions)
- Built-in Functions (`sum()` better than `+=`)
Note: if you are interested in Efficiency tuning in Stan, look at this [Charpter](https://mc-stan.org/docs/stan-users-guide/efficiency-tuning.html) for more details.
------------------------------------------------------------------------
## Linear Models without Matrices
The linear model from our example was:
$$
\text{WeightLB}_p = \beta_0 + \beta_1 \text{HeightIN}_p + \beta_2 \text{Group2}_p + \beta_3\text{Group3}_p \\
+\beta_4 \text{HeightIN}_p\text{Group2}_p \\
+\beta_5 \text{HeightIN}_p\text{Group3}_p \\
+ e_p
$$
with:
- $\text{Group2}_p$ the binary indicator of person $p$ being in group 2
- $\text{Group}3_p$ the binary indicator of person $p$ being in group 3
- $e_p \sim N(0, \sigma_e)$
------------------------------------------------------------------------
### Path Diagram of the Full Model
```{mermaid}
%%| echo: false
%%| label: fig-diagram
graph LR;
HeightIN60 --> WeightLB;
DietGroup2 --> WeightLB;
DietGroup3 --> WeightLB;
HeightIN60xDietGroup2 --> WeightLB;
HeightIN60xDietGroup3 --> WeightLB;
```
------------------------------------------------------------------------
## Linear Models with Matrices
::::: columns
::: {.column width="50%"}
Model (predictor) matrix with the size 30 (rows) $\times$ 6 (columns)
$$
\mathbf{X} = \begin{bmatrix}1 & -4 & 0 & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 12 & 0 & 1 & 0 & 12 \end{bmatrix}
$$
:::
::: {.column width="50%"}
Coefficients vectors with the size 6 (rows) $\times$ 1 (column):
$$
\mathbf{\beta} =
\begin{bmatrix}
\beta_0 \\
\beta_1 \\
\beta_2 \\
\beta_3 \\
\beta_4 \\
\beta_5 \\
\end{bmatrix}
$$
:::
:::::
`model.matrix` creates a design (or model) matrix ($X$), e.g., by expanding factors to a set of dummy variables (depending on the contrasts) and expanding interactions similarly.
```{r}
FullModelFormula = as.formula("WeightLB ~ HeightIN60 + DietGroup + HeightIN60*DietGroup")
model.matrix(FullModelFormula, data = dat) |> unique()
```
## Linear Models with Matrices (Cont.)
We then rewrite the equation from
$$
\text{WeightLB}_p = \beta_0 + \beta_1 \text{HeightIN}_p + \beta_2 \text{Group2}_p + \beta_3\text{Group3}_p \\
+\beta_4 \text{HeightIN}_p\text{Group2}_p \\
+\beta_5 \text{HeightIN}_p\text{Group3}_p \\
+ e_p
$$
to:
$$
\mathbf{WeightLB} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}
$$
Where:
- $\mathbf{WeightLB}$ is the vector of outcome (N $\times$ 1)
- $\mathbf{X}$ is the model (predictor) matrix (N $\times$ P for P - 1 predictors)
- $\boldsymbol{\beta}$ is the coefficients vector (P $\times$ 1)
- $\mathbf{e}$ is the vector for residuals (N $\times$ 1)
------------------------------------------------------------------------
### Example: Predicted Values and $\text{R}^2$
::::: columns
::: {.column width="50%"}
**Similar to Monte Carlo Simulation, given** matrices $P$ **and** $\boldsymbol{\beta}$
```{r}
#| code-fold: true
set.seed(1234)
fit_lm <- lm(formula = FullModelFormula, data = dat)
beta = coefficients(fit_lm)
P = length(beta)
X = model.matrix(FullModelFormula, data = dat)
head(X%*%beta)
```
:::
::: {.column width="50%"}
**Calculating** $R^2$ **and adjusted** $R^2$**:**
```{r}
rss = crossprod(dat$WeightLB - X%*%beta) # residual sum of squares
tss = crossprod(dat$WeightLB - mean(dat$WeightLB)) # total sum of squares
R2 = 1 - rss / tss
R2.adjust = 1 - (rss/(nrow(dat)-P)) / (tss/((nrow(dat)-1)))
data.frame(
R2, # r-square
R2.adjust # adjusted. r-square
)
```
**`lm` function:** $R^2$ **and adjusted** $R^2$:
```{r}
summary_lm <- summary(fit_lm)
summary_lm$r.squared
summary_lm$adj.r.squared
```
:::
:::::
## Vectorize prior distributions
Previously, we defined a normal distribution for each regression coefficient
$$
\beta_0 \sim normal(0, 1) \\
\vdots \\
\beta_p \sim normal(0, 1)
$$
- They are all univariate normal distribution
- Issue: Each parameter had a prior that was independent of the other parameter; then the correlation between betas is low and cannot be changed.
For example, the code shows two betas with univariate normal distribution have low correlation (r = -0.025)
```{r}
set.seed(1234)
beta0 = rnorm(100, 0, 1)
beta1 = rnorm(100, 0, 1)
cor(beta0, beta1)
```
------------------------------------------------------------------------
## Vectorize prior distributions (Cont.)
When combining all parameters into a vector, a natural extension is a multivariate normal distribution, so that the betas have a pre-defined correlation strength
- The syntax shows the two betas generated by the multivariate normal distribution with correlation of .5
```{r}
set.seed(1234)
sigma_of_betas = matrix(c(1, 0.5, 0.5, 1), ncol = 2)
betas = mvtnorm::rmvnorm(100, mean = c(0, 0), sigma = sigma_of_betas)
beta0 = betas[,1]
beta1 = betas[,2]
cor(beta0, beta1)
```
Back to the `stan` code, we need to specify:
- Mean vector of betas (`meanBeta`; size P $\times$ 1)
- Put all prior means for those coefficients into a vector
- Covariance matrix for betas (`covBeta`; size P $\times$ P)
- Put all prior variances into the diagonal; zeros for off diagonal; 'cause we are not sure the potential correlation between betas
## Syntax Changes: Data Section
::::::: columns
:::: {.column width="50%"}
::: nonincremental
- **Old syntax without matrix:**
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| eval: false
data{
int<lower=0> N;
vector[N] weightLB;
vector[N] height60IN;
vector[N] group2;
vector[N] group3;
vector[N] heightXgroup2;
vector[N] heightXgroup3;
}
```
:::
::::
:::: {.column width="50%"}
::: nonincremental
- **New syntax with matrix:**
:::
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| eval: false
data{
int<lower=0> N; // number of observations
int<lower=0> P; // number of predictors (plus column for intercept)
matrix[N, P] X; // model.matrix() from R
vector[N] weightLB; // outcome
real sigmaRate; // hyperparameter: rate parameter for residual standard deviation
}
```
::::
:::::::
## Syntax Changes: Parameters Section
::::::: columns
:::: {.column width="50%"}
::: nonincremental
- **Old syntax without matrix:**
:::
```{stan output.var='display', eval = FALSE, tidy = FALSE}
parameters {
real beta0;
real betaHeight;
real betaGroup2;
real betaGroup3;
real betaHxG2;
real betaHxG3;
real<lower=0> sigma;
}
```
::::
:::: {.column width="50%"}
::: nonincremental
- **New syntax with matrix:**
:::
```{stan output.var='display', eval = FALSE, tidy = FALSE}
parameters {
vector[P] beta; // vector of coefficients for Beta
real<lower=0> sigma; // residual standard deviation
}
```
::::
:::::::
## Syntax Changes: Prior Distributions Definition
::::::: columns
:::: {.column width="50%"}
::: nonincremental
- **Old syntax without matrix:**
:::
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| code-line-numbers: "|2-8|9-12"
model {
beta0 ~ normal(0,100);
betaHeight ~ normal(0,100);
betaGroup2 ~ normal(0,100);
betaGroup3 ~ normal(0,100);
betaHxG2 ~ normal(0,100);
betaHxG3 ~ normal(0,100);
sigma ~ exponential(.1); // prior for sigma
weightLB ~ normal(
beta0 + betaHeight * height60IN + betaGroup2 * group2 +
betaGroup3*group3 + betaHxG2*heightXgroup2 +
betaHxG3*heightXgroup3, sigma);
}
```
::::
:::: {.column width="50%"}
**New syntax with matrix:**
::: nonincremental
- `multi_normal()` is the multivariate normal sampling in Stan, similar to `rmvnorm()` in R; For uninformative, we did not need to specify
- `exponential()` is the exponential distribution sampling in Stan, similar to `rexp()` in R
:::
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| code-line-numbers: "|3"
model {
sigma ~ exponential(sigmaRate); // prior for sigma
weightLB ~ normal(X*beta, sigma); // linear model
}
```
::::
:::::::
------------------------------------------------------------------------
### A little more about exponential distribution
- The mean of the exp. distribution is $\frac{1}{\lambda}$, where $\lambda$ is called **rate parameter**
- The variance of the exp. distribution is $\frac{1}{\lambda^2}$
- It is typically positive skewed (skewness is 2)
- Question: which hyperparameter rate $\lambda$ is most informative/uninformative
```{r}
#| echo: true
#| output-location: slide
#| label: fig-pdf-exp
#| fig-cap: PDF for the exponential distribution by varied rate parameters
#| fig-cap-location: top
#| code-line-numbers: "1-3|4|5|7-16"
library(tidyr)
library(dplyr)
library(ggplot2)
rate_list = seq(0.1, 1, 0.2)
pdf_points = sapply(rate_list, \(x) dexp(seq(0, 20, 0.01), x)) |> as.data.frame()
colnames(pdf_points) <- rate_list
pdf_points$x = seq(0, 20, 0.01)
pdf_points %>%
pivot_longer(-x, values_to = 'y') %>%
mutate(
sigmaRate = factor(name, levels = rate_list)
) %>%
ggplot() +
geom_path(aes(x = x, y = y, color = sigmaRate, group = sigmaRate), size = 1.2) +
scale_x_continuous(limits = c(0, 20)) +
labs(x = "Sigma")
```
------------------------------------------------------------------------
### Since we talked about Exponential distribution...
::::: columns
::: {.column width="50%"}
Let's dive deeper into Laplace distribution. It is sometimes called double-exponential distribution. Exponential distribution is positive part of Laplace distribution.
$$
\text{PDF}_{exp.} = \lambda e^{-\lambda x}
$$
$$
\text{PDF}_{laplace} = \frac{1}{2b} e^{-\frac{|x - u|}{b}}
$$
- Thus, we know that for x \> 0, exponential distribution is a special case of Laplace distribution with scale parameter $b$ as $\frac{1}{\lambda}$ and location parameter as 0.
- Laplace-based distribution, Cauchy, and Horseshoe distribution all belong to so-called "**shrinkage**" priors.
:::
::: {.column width="50%"}
- Shrinkage priors will be very useful for high-dimensional data (say P = 1000) and variable selection
```{r}
library(LaplacesDemon)
b_list = 1 / rate_list * 2
pdf_points = sapply(b_list, \(x) dlaplace(seq(-20, 20, 0.01), scale = x, location = 0)) |> as.data.frame()
colnames(pdf_points) <- round(b_list, 2)
pdf_points$x = seq(-20, 20, 0.01)
pdf_points %>%
pivot_longer(-x, values_to = 'y') %>%
mutate(
scale = factor(name, levels = round(b_list, 2))
) %>%
ggplot() +
geom_path(aes(x = x, y = y, color = scale, group = scale), size = 1.2) +
scale_x_continuous(limits = c(-20, 20)) +
labs(x = "")
```
:::
:::::
------------------------------------------------------------------------
## Compare results and computational time
::::: columns
::: {.column width="50%"}
```{r}
#| code-fold: true
#| results: hide
mod_full_old <- cmdstan_model("Code/FullModel_Old.stan")
data_full_old <- list(
N = nrow(dat),
weightLB = dat$WeightLB,
height60IN = dat$HeightIN60,
group2 = as.numeric(dat$DietGroup == 2),
group3 = as.numeric(dat$DietGroup == 3),
heightXgroup2 = as.numeric(dat$DietGroup == 2) * dat$HeightIN60,
heightXgroup3 = as.numeric(dat$DietGroup == 3) * dat$HeightIN60
)
fit_full_old <- mod_full_old$sample(
data = data_full_old,
seed = 1234,
chains = 4,
parallel_chains = 4,
refresh = 0
)
```
```{r}
fit_full_old$summary()[, -c(9, 10)]
```
:::
::: {.column width="50%"}
```{r}
#| code-fold: true
#| results: hide
mod_full_new <- cmdstan_model("Code/FullModel_New.stan")
FullModelFormula = as.formula("WeightLB ~ HeightIN60 + DietGroup + HeightIN60*DietGroup")
X = model.matrix(FullModelFormula, data = dat)
data_full_new <- list(
N = nrow(dat),
P = ncol(X),
X = X,
weightLB = dat$WeightLB,
sigmaRate = 0.1
)
fit_full_new <- mod_full_new$sample(
data = data_full_new,
seed = 1234,
chains = 4,
parallel_chains = 4
)
```
```{r}
fit_full_new$summary()[, -c(9, 10)]
```
:::
:::::
The differences between two method:
- `betaGroup3` has the largest differences between two methods
```{r}
cbind(fit_full_old$summary()[,1], fit_full_old$summary()[, -c(1, 9, 10)] - fit_full_new$summary()[, -c(1, 9, 10)])
```
## Compare computational time
- The Stan code with matrix has faster computation:
::::: columns
::: {.column width="50%"}
```{r}
fit_full_old$time()
```
:::
::: {.column width="50%"}
```{r}
fit_full_new$time()
```
:::
:::::
- Pros: With matrices, there is less syntax to write
- Model is equivalent
- More efficient for sampling (sample from matrix space)
- More flexible: modify matrix elements in R instead of individual parameters in Stan
- Cons: Output, however, is not labeled with respect to parameters
- May have to label output
## Computing Functions of Parameters
- Often, we need to compute some linear or non-linear function of parameters in a linear model
- Missing effects - beta for diet group 2 and 3
- Model fit indices: $R^2$
- Transformed effects - residual variance $\sigma^2$
- In non-Bayesian (frequentist) analyses, there are often formed with the point estimates of parameters (with standard errors - second derivative of likelihood function)
- For Bayesian analyses, however, we seek to build the posterior distribution for any function of parameters
- This means applying the function to all posterior samples
- It is especially useful when you want to propose your new statistic
------------------------------------------------------------------------
### Example: Need Slope for Diet Group 2
Recall our model:
$$
\text{WeightLB}_p = \beta_0 + \beta_1 \text{HeightIN}_p + \beta_2 \text{Group2}_p + \beta_3\text{Group3}_p \\
+\beta_4 \text{HeightIN}_p\text{Group2}_p \\
+\beta_5 \text{HeightIN}_p\text{Group3}_p \\
+ e_p
$$
Here, $\beta_1$ denotes the average change in $\text{WeightLB}$ with one-unit increase in $\text{HeightIN}$ for members in the reference group— Diet Group 1.
Question: What about the slope for members in Diet Group 2.
- Typically, we can calculate by hand by assign $\text{Group2}$ as 1 and all effects regarding $\text{HeightIN}$:
$$
\beta_{\text{group2}}*\text{HeightIN} = (\beta_1 + \beta_4*1 + \beta_5*0)*\text{HeightIN}
$$
$$
\beta_{\text{group2}}= \beta_1 +\beta_4
$$
- Similarly, the intercept for Group2 - the average mean of $\text{WeightLB}$ is $\beta_0 + \beta_2$.
------------------------------------------------------------------------
### Computing slope for Diet Group 2
Our task: Create posterior distribution for Diet Group 2
- We must do so for each iteration we've kept from our MCMC chain
- A somewhat tedious way to do this is after using Stan
```{r}
fit_full_new$summary()
beta_group2 <- fit_full_new$draws("beta[2]") + fit_full_new$draws("beta[5]")
summary(beta_group2)
```
------------------------------------------------------------------------
### Computing slope within Stan
Stan can compute these values for us-with the "generated quantities" section of the syntax
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| code-line-numbers: "16-20"
#| code-fold: true
#| code-summary: "Stan code"
data{
int<lower=0> N; // number of observations
int<lower=0> P; // number of predictors (plus column for intercept)
matrix[N, P] X; // model.matrix() from R
vector[N] weightLB; // outcome
real sigmaRate; // hyperparameter: prior rate parameter for residual standard deviation
}
parameters {
vector[P] beta; // vector of coefficients for Beta
real<lower=0> sigma; // residual standard deviation
}
model {
sigma ~ exponential(sigmaRate); // prior for sigma
weightLB ~ normal(X*beta, sigma); // linear model
}
generated quantities{
real slopeG2;
slopeG2 = beta[2] + beta[5];
}
```
The generated quantities block computes values that do not affect the posterior distributions of the parameters–they are computed after the sampling from each iteration
- The values are then added to the Stan object and can be seen in the summary
- They can also be plotted using `bayesplot` package
```{r}
#| results: hide
mod_full_compute <- cmdstan_model("Code/FullModel_compute.stan")
fit_full_compute <- mod_full_compute$sample(
data = data_full_new,
seed = 1234,
chains = 4,
parallel_chains = 4,
refresh = 0
)
```
```{r}
fit_full_compute$summary('slopeG2')
```
------------------------------------------------------------------------
```{r}
bayesplot::mcmc_dens_chains(fit_full_compute$draws('slopeG2'))
```
------------------------------------------------------------------------
### Alternative way of computing the slope with Matrix
This is a little more complicated but more flexible method.
That is, we can make use of matrix operation and form a contrast matrix
- Contrasts are linear combinations of parameters
- You may have used these in R using `glht` package
For use, we form a contrast matrix that is size of $C \times P$ where C is the number of contrasts:
- The entries of this matrix are the values that multiplying the coefficients
- For $(\beta_1 + \beta_2)$ this would be:
- a "1" in the corresponding entry for $\beta_1$
- a "1" in the corresponding entry for $\beta_4$
- "0"s elsewhere
- $$
\mathbf{C} = \begin{bmatrix}
0\ \mathbf{1}\ 0\ 0\ \mathbf{1}\ 0
\end{bmatrix}
$$
- Then, the contrast matrix is multiplied by the coefficients vector to form the values:
- $\mathbf{C} * \beta$
------------------------------------------------------------------------
### Contrasts in Stan
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| code-line-numbers: "7-8,18-21"
#| code-fold: true
#| code-summary: "Stan code"
data{
int<lower=0> N; // number of observations
int<lower=0> P; // number of predictors (plus column for intercept)
matrix[N, P] X; // model.matrix() from R
vector[N] weightLB; // outcome
real sigmaRate; // hyperparameter: prior rate parameter for residual standard deviation
int<lower=0> nContrasts;
matrix[nContrasts, P] contrast; // C matrix
}
parameters {
vector[P] beta; // vector of coefficients for Beta
real<lower=0> sigma; // residual standard deviation
}
model {
sigma ~ exponential(sigmaRate); // prior for sigma
weightLB ~ normal(X*beta, sigma); // linear model
}
generated quantities{
vector[nContrasts] computedEffects;
computedEffects = contrast*beta;
}
```
```{r}
#| code-line-numbers: "2-9,11"
#| code-fold: true
#| code-summary: "R code"
#| output: false
mod_full_contrast <- cmdstan_model("Code/FullModel_contrast.stan")
contrast_dat <- list(
nContrasts = 2,
contrast = matrix(
c(0,1,0,0,1,0, # slope for diet group2
1,0,1,0,0,0),# intercept for diet group 2
nrow = 2, byrow = TRUE
)
)
fit_full_contrast <- mod_full_contrast$sample(
data = c(data_full_new, contrast_dat),
seed = 1234,
chains = 4,
parallel_chains = 4,
refresh = 0
)
```
```{r}
fit_full_contrast$summary('computedEffects')[, -c(9, 10)]
```
```{r}
bayesplot::mcmc_hist(fit_full_contrast$draws('computedEffects'))
```
------------------------------------------------------------------------
### Computing $\text{R}^2$
We can use the `generated quantities` section to build a posterior distribution for $\text{R}^2$
There are several formulas for $\text{R}^2$, we will use the following:
$$
\text{R}^2 = 1-\frac{RSS}{TSS} = 1- \frac{\Sigma_{p=1}^{N}(y_p -\hat{y}_p)}{\Sigma_{p=1}^{N}(y_p -\bar{y}_p)}
$$ Where:
1. RSS is the residual sum of squares
2. TSS is the total sum of squares of dependent variable
3. $\hat{y}_p$ is the predicted values: $\hat{y}_p = \mathbf{X}\boldsymbol{\beta}$
4. $\bar{y}_p$ is the mean value of dependent variable: $\bar{y}_p = \frac{\Sigma_{p=1}^{N}y_p}{N}$
Notice: RSS depends on sampled parameters, so we will use this to build our posterior distribution for $\text{R}^2$
------------------------------------------------------------------------
For adjusted $\text{R}^2$, we use the following:
$$
\text{adj.R}^2 = 1-\frac{RSS/(N-P)}{TSS/(N-1)} = 1- \frac{\Sigma_{p=1}^{N}(y_p -\hat{y}_p)}{\Sigma_{p=1}^{N}(y_p -\bar{y}_p)}*\frac{N-P}{N-1}
$$
Then, we can calculate the how to calculate $\text{adj.R}^2$ by $\text{R}^2$:
$$
\text{adj.R}^2 = 1-(1-\text{R}^2)*\frac{N-P}{N-1} = \frac{(P-1)+(N-1)R^2}{N-P}
$$
------------------------------------------------------------------------
### `Stan` code for Computing $\text{R}^2$
```{stan output.var='display', eval = FALSE, tidy = FALSE}
#| code-fold: true
#| code-summary: "Stan code"
#| code-line-numbers: "4-15"
generated quantities{
vector[nContrasts] computedEffects;
computedEffects = contrast*beta;
// compute R2
real rss;
real tss;
real R2;
real R2adj;
{// anything in these brackets will not appear in summary table
vector[N] pred = X*beta;
rss = dot_self(weightLB-pred); // dot_self is stan function for matrix square
tss = dot_self(weightLB-mean(weightLB));
}
R2 = 1-rss/tss;
R2adj = 1-(rss/(N-P))/(tss/(N-1));
}
```
Recall that our `lm` function provides $\text{R}^2$ as 0.9787 and adjusted $\text{R}^2$ as 0.9742
```{r}
fit_full_contrast$summary(c('rss', 'tss', 'R2','R2adj'))[, -c(9, 10)]
bayesplot::mcmc_hist(fit_full_contrast$draws(c('R2', 'R2adj')))
```
------------------------------------------------------------------------
### Get posterior mode
```{r}
# Create the function.
getmode <- function(v) {
uniqv <- unique(v)
uniqv[which.max(tabulate(match(v, uniqv)))]
}
# Calculate the mode using the user function.
getmode(fit_full_contrast$draws('R2'))
getmode(fit_full_contrast$draws('R2adj'))
```
## Wrapping up
Today we further added generated quantities into our Bayesian toolset:
- How to make `Stan` use less syntax using matrices
- How to form posterior distributions for functions of parameters
We will use both of these features in psychometric models.
## Next Class
1. Bayesian Model fit
2. Bayesian Model Comparison