Stats blog 5 (Correlated data, Maximum likelihood and Linear mixed models)

Helen Li
2 min read · Mar 24, 2021

Vocabulary (related to correlated data)

  1. Fixed effects: These are non-random quantities. All the coefficients we estimated in STA302 are examples of fixed effects, since we were not treating beta_1 as a random variable.
  2. Random effects: These are random quantities. These model parameters are treated as random variables.
  3. Mixed effects model: A model that includes both fixed and random effects as its parameters. These are also called hierarchical models, or simply mixed models. They are not the same as mixed methods, which is a ‘mix’ of quantitative and qualitative research methodology.
  4. Nested/nesting design: Observational units are grouped within grouping units. There may be multiple levels of grouping.
  5. Crossed effect design: Every observational unit experiences every level of the treatment variable.
  6. Observational units: The person or thing on which our outcome of interest is measured. In an experiment, we might also call this the ‘experimental unit’ or some might say ‘statistical unit’.
  7. Grouping units: How our observational units are grouped together. Some are referred to as ‘level-two observational units’, but we could have even more levels of grouping. Groups within groups, etc.
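To make the nested-design vocabulary concrete, here is a minimal sketch using hypothetical data: students are the observational units, and the schools they attend are the grouping (level-two) units. The school names, student IDs, and scores are all invented for illustration.

```python
# Hypothetical nested design: students (observational units)
# grouped within schools (grouping units).
records = [
    {"school": "A", "student": 1, "score": 72},
    {"school": "A", "student": 2, "score": 68},
    {"school": "B", "student": 3, "score": 81},
    {"school": "B", "student": 4, "score": 77},
]

# Observations sharing a school are expected to be correlated,
# which is why the school becomes a grouping unit in the model.
by_school = {}
for r in records:
    by_school.setdefault(r["school"], []).append(r["score"])

for school, scores in by_school.items():
    print(school, scores)
```

With more levels of grouping (e.g. students within classrooms within schools), the same idea repeats: each grouping level collects the units below it.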

Maximum likelihood

Recall:

  1. Likelihood helps us understand how well our model fits our data.
  2. Maximizing the likelihood function finds the coefficient estimates for our model that make the data we actually observed the most likely.

Properties of maximum likelihood estimators:

For large sample sizes:

  1. Bias goes to 0 (as sample size increases)
  2. Approximately minimum variance
  3. Approximately normal distribution (usually)
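Property 1 can be checked by simulation. The sketch below (with assumed parameter values) uses the MLE of a normal variance, which divides by n rather than n − 1 and is therefore biased in small samples; the bias shrinks as the sample size grows.

```python
import numpy as np

# Assumed setup for illustration: data from N(0, true_var).
rng = np.random.default_rng(1)
true_var = 4.0

def mean_mle_var(n, reps=2000):
    """Average the variance MLE over many simulated samples of size n."""
    samples = rng.normal(0, np.sqrt(true_var), (reps, n))
    mle = samples.var(axis=1)  # numpy's default ddof=0 is the MLE
    return mle.mean()

bias_small = mean_mle_var(5) - true_var    # noticeably negative
bias_large = mean_mle_var(200) - true_var  # close to zero
```

Since E[MLE] = (n − 1)/n · σ², the bias at n = 5 is about −0.8 here, while at n = 200 it is nearly gone.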

Another nice feature of MLEs is that they are “invariant” under transformation: the MLE of a function of a parameter, g(theta), is just that function applied to the MLE, g(theta-hat).
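Both ideas above can be shown in a few lines. This sketch (with made-up data) computes the normal MLEs in closed form, checks that they do maximize the log-likelihood relative to another candidate value, and uses invariance to get the MLE of sigma from the MLE of sigma-squared.

```python
import math

# Made-up sample for illustration.
data = [4.1, 5.0, 3.8, 4.6, 4.5]
n = len(data)

mu_hat = sum(data) / n                                  # MLE of mu
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE of sigma^2

def log_lik(mu, sigma2):
    """Normal log-likelihood of the sample at (mu, sigma2)."""
    return sum(-0.5 * math.log(2 * math.pi * sigma2)
               - (x - mu) ** 2 / (2 * sigma2) for x in data)

# The MLE gives at least as high a log-likelihood as any other mu.
assert log_lik(mu_hat, sigma2_hat) >= log_lik(mu_hat + 0.1, sigma2_hat)

# Invariance: the MLE of sigma is the square root of the MLE of sigma^2.
sigma_hat = math.sqrt(sigma2_hat)
```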

Linear mixed models (LMMs)

Recall: Linear regression assumptions

  1. Errors are independent (observations are independent)
  2. Errors are identically distributed with expected value zero
  3. Constant variance (homoscedasticity)
  4. A straight-line relationship exists between the response and the predictors.

Linear mixed models assumptions:

  1. There is a continuous response variable
  2. We have modelled the dependency structure correctly (i.e. made correct choices about our random variables)
  3. Our units/subjects are independent, even though observations within each subject are not assumed to be
  4. Both the random effects and within-unit residual errors follow normal distributions
  5. The random effects errors and within-unit residual errors have constant variance

From the above list, we can see that linear mixed models do not assume all observations are independent: multiple responses from the same subject cannot be regarded as independent of each other.
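The within-subject dependency can be seen directly by simulating a random-intercept model, the simplest LMM. In the sketch below (all parameter values assumed for illustration), each group gets its own random intercept b_i, which makes two observations from the same group correlated even though the residual errors are independent.

```python
import numpy as np

# Random-intercept model: y_ij = beta_0 + b_i + e_ij, where b_i is the
# random effect for group i and e_ij is the within-group residual error.
rng = np.random.default_rng(0)
n_groups, per_group = 200, 5
beta_0 = 10.0
sigma_b, sigma_e = 2.0, 1.0   # random-effect and residual SDs (assumed)

b = rng.normal(0, sigma_b, n_groups)               # one intercept per group
e = rng.normal(0, sigma_e, (n_groups, per_group))  # residual errors
y = beta_0 + b[:, None] + e                        # responses, one row per group

# Theoretical within-group (intraclass) correlation:
# sigma_b^2 / (sigma_b^2 + sigma_e^2) = 4 / 5 = 0.8.
# Empirical check: correlate two observations from the same group.
icc_hat = np.corrcoef(y[:, 0], y[:, 1])[0, 1]
print(round(icc_hat, 2))  # should land near 0.8
```

Treating such data as independent, as ordinary linear regression would, ignores this correlation; the LMM instead models it through the random effect.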
