Classical Test Theory (CTT)

A scale development framework that assumes a participant's responses, or overall score, on a measure are a linear combination of their true score plus random error. The goal in CTT is to get as close to the true score as possible by minimizing noise.
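In standard CTT notation (a conventional formulation, not specific to this glossary), the decomposition and the resulting definition of reliability are:

```latex
% Observed score X decomposes into true score T plus random error E.
X = T + E
% Reliability is the proportion of observed-score variance
% attributable to true-score variance.
\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X}
```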

Construct

A construct refers to the unobserved (i.e., latent) attitude, cognition, or attribute that is the target of the study. Unobserved (or latent) in this context simply refers to a type of construct that exists in the mind of the participant and cannot easily be directly observed. The term construct can be used interchangeably with other terms such as domain and latent variable.

Concurrent Validity

Concurrent validity measures the degree to which performance on the current scale predicts performance on a criterion (gold-standard) measure. Typically, the two measures are administered at the same time or consecutively (hence “concurrent”). It is common, however, that no gold-standard measure exists, making evaluation of concurrent validity impossible; this is especially true in human–robot interaction.

Construct Validity

Construct validity refers to the extent to which the scale measures what it was developed to measure and how much it is associated with other factors within the domain.

Convergent Validity

Convergent validity refers to how well the new scale correlates with other variables that are designed to measure similar constructs.

Criterion Validity

Criterion validity refers to the degree to which there is a relationship between the construct on the current scale and construct on another similar measure or in another context that is of interest to the researcher.

Custom Scale

Any scale that has not been validated.

Dimension

A psychological variable that represents a component of the construct that is captured by the items within a scale. This term is also used interchangeably with factor.

Discriminant Validity

Discriminant validity refers to the extent to which the scale differs from other unrelated constructs. Discriminant validity is measured by analyzing correlations between the measure of interest and other measures that do not measure the same domain or concept [6], where weaker correlations are expected.
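As a sketch of how convergent and discriminant correlations are compared in practice, the following computes Pearson correlations between a new scale and two other measures. All scores and measure names here are hypothetical, invented for illustration:

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of total scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical total scores for 8 participants:
new_scale = [12, 15, 9, 20, 18, 11, 16, 14]   # the scale under development
related   = [13, 16, 10, 19, 17, 12, 15, 13]  # a measure of a similar construct
unrelated = [7, 3, 11, 5, 9, 6, 4, 10]        # a measure of an unrelated construct

r_convergent = pearson_r(new_scale, related)     # expected to be strong
r_discriminant = pearson_r(new_scale, unrelated)  # expected to be weaker
```

Evidence for convergent validity is a strong correlation with the related measure; evidence for discriminant validity is a markedly weaker correlation with the unrelated one.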

Domain

A domain refers to the unobserved (i.e., latent) attitude, cognition, or attribute that is the target of the study. It is often used interchangeably with “construct”.

Factor

A psychological variable that represents a component of the construct that is captured by the items within a scale. This term is also used interchangeably with dimension.

Item

An item refers to the direct questions, directives, or statements that make up a scale. Each item within a scale is intended to capture the construct (i.e., attitude or behavior) either in part or in full.

Item Response Theory (IRT)

Item response theory uses an item-level approach to determining item and person fit within the scale, modeling the probability of a given response as a function of both item properties (e.g., difficulty) and person ability.

Latent variable

The unobservable behavior, attitude, or attribute that is being measured. This term is often used interchangeably with construct and domain.

Predictive Validity

Predictive validity is a type of validation method. It measures the degree to which performance on the current scale predicts performance on another scale taken at a later time.

Psychometrics or Psychometric theory

The scientific study of testing, measurement, and assessment in the social and behavioral sciences.

Rasch

One of the more common IRT models is the Rasch model. The Rasch model prioritizes invariance in measurement and can be thought of as a theory for how the data should be structured, which can then be used to identify deviations in observed data. In other words, the Rasch model is a process for fitting data to a model.
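For the dichotomous case, the Rasch model takes a standard logistic form (conventional notation, not specific to this glossary):

```latex
% Probability that person p endorses (or answers correctly) item i,
% where \theta_p is person ability and b_i is item difficulty.
P(X_{pi} = 1 \mid \theta_p, b_i) = \frac{e^{\theta_p - b_i}}{1 + e^{\theta_p - b_i}}
```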

Reliability

Reliability refers to the principle that a measurement produces similar results under similar conditions.
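One common index of reliability is internal consistency as measured by Cronbach's alpha (an index not named in this glossary, added here for illustration). A minimal sketch, with hypothetical 5-point Likert responses:

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for rows of per-participant item scores."""
    k = len(scores[0])                  # number of items
    items = list(zip(*scores))          # transpose to per-item score columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 6 participants x 4 items, 1-5 Likert points.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
```

Higher values (conventionally above roughly 0.7) suggest the items consistently measure the same construct.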

Scale

The term “scale” refers to any instrument that measures a behavior, attitude, or other latent construct that isn’t directly observable.

Subscale

Subscales refer to complete sets of items that load onto one factor in an existing validated scale. For example, the competence subscale in the RoSAS consists of six items that are related to the intelligence or ability of the robot.