Respiratory status data set

In this data set, 111 patients were administered either a placebo or an active treatment. At randomization and at four visits during the treatment, their respiratory status was assessed as either “poor” or “good”; this status is the categorical output. Covariates such as center, sex and age were also recorded. The goal was to evaluate the effect of the treatment on the respiratory status.

The data originates from Davis, C. S. (1991). Semi-parametric and non-parametric methods for the analysis of repeated measurements with applications to clinical trials. Statistics in Medicine, 10(12), 1959–80. It can be downloaded here.

Below we show a snapshot of the data set:

Note that in the MonolixSuite the output categories must be coded as integers. This is why we have created the column statusInteger, where the respiratory status is coded as 0 for “poor” and 1 for “good”. For individual 1 on placebo, the respiratory status is poor at randomization and remains so during the four months. For individual 12 on treatment, the respiratory status is poor at randomization, improves to good during the first three months, and deteriorates again to poor at month 4.
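The recoding described above can be done with any scripting tool before loading the file. The following is a minimal Python sketch, not part of the MonolixSuite: the function name `encode_status` is hypothetical, and it assumes that randomization is recorded as month 0 and that the rows follow the profile of individual 12 described above.

```python
# Integer codes used in this data set: 0 for "poor", 1 for "good".
STATUS_CODES = {"poor": 0, "good": 1}


def encode_status(rows):
    """Return copies of the rows with an added integer 'statusInteger' field."""
    encoded = []
    for row in rows:
        status = row["status"].strip().lower()
        if status not in STATUS_CODES:
            raise ValueError(f"unknown status {row['status']!r}")
        encoded.append({**row, "statusInteger": STATUS_CODES[status]})
    return encoded


# Individual 12: poor at randomization (assumed to be month 0),
# good during months 1 to 3, poor again at month 4.
rows = [
    {"subject": 12, "month": month, "status": status}
    for month, status in enumerate(["poor", "good", "good", "good", "poor"])
]

encoded = encode_status(rows)
print([r["statusInteger"] for r in encoded])  # [0, 1, 1, 1, 0]
```

Raising an error on an unknown label, rather than silently assigning a code, makes typos in the status column visible before the data set is loaded.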

The definition of the columns is the following:

  • subject: patient identifier, column-type ID
  • month: month after treatment start, column-type TIME
  • statusInteger: respiratory status coded as integer, 0 for “poor” and 1 for “good”, column-type Y
  • status: respiratory status coded as strings “poor” and “good”, column to be ignored
  • treatment: type of treatment, active or placebo, column-type CAT
  • sex: sex of the patient, female or male, column-type CAT
  • age: age of the patient (in years), column-type COV
  • centre: index of the study center, 1 or 2, column-type CAT

Several points can be noticed:

  1. The categories must be coded as integers.
  2. Each individual has several respiratory status measurements; the month column specifies the time at which each measurement was taken.
  3. The ID and TIME columns are mandatory. Thus, even when there is only one measurement per individual, a TIME column must be added (filled with 0, for example).
  4. Covariates must be constant within subjects (or subject-occasions when occasions are defined).
  5. In this example two categories are present (“good” and “poor”), but any number of categories is possible.
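Points 3 and 4 can be checked or prepared outside the MonolixSuite before the file is loaded. The sketch below is one possible way to do so in plain Python; the helper names `find_varying_covariates` and `add_constant_time`, the `time` column name, and the example records are all hypothetical and are not taken from the respiratory data set.

```python
from collections import defaultdict


def find_varying_covariates(rows, id_col, covariate_cols):
    """Return {subject: [covariates that take more than one value]}."""
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for cov in covariate_cols:
            seen[row[id_col]][cov].add(row[cov])
    problems = {}
    for subject, covs in seen.items():
        varying = sorted(c for c, values in covs.items() if len(values) > 1)
        if varying:
            problems[subject] = varying
    return problems


def add_constant_time(rows, time_col="time"):
    """Add a time column filled with 0 (for one record per individual)."""
    return [{**row, time_col: 0} for row in rows]


# Hypothetical records: the age of subject 2 changes between rows,
# which violates the "constant within subject" rule.
rows = [
    {"subject": 1, "treatment": "placebo", "age": 46},
    {"subject": 1, "treatment": "placebo", "age": 46},
    {"subject": 2, "treatment": "active", "age": 28},
    {"subject": 2, "treatment": "active", "age": 29},
]

problems = find_varying_covariates(rows, "subject", ["treatment", "age"])
print(problems)  # {2: ['age']}
```

When occasions are defined, the same check would apply with the pair (subject, occasion) as the grouping key instead of the subject alone.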

When loading this data set into Datxplore, one can easily visualize the number of individuals with “poor” (coded as 0, in black) or “good” (coded as 1, in orange) respiratory status over time in the case of placebo (left) or active treatment (right):

The Datxplore project can be downloaded here.
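The counts shown in such a plot can also be tabulated by hand, which is a useful sanity check on the coding of the data. The following is a rough Python sketch and not a description of how Datxplore computes its plot; the function name `count_status` is hypothetical, randomization is assumed to be month 0, and only the two individuals described earlier (1 on placebo, 12 on active treatment) are included.

```python
from collections import Counter


def count_status(rows):
    """Count records per (treatment, month, statusInteger) combination."""
    return Counter(
        (r["treatment"], r["month"], r["statusInteger"]) for r in rows
    )


# Status codes per month (0 to 4) for the two individuals described in the text.
profiles = {
    (1, "placebo"): [0, 0, 0, 0, 0],
    (12, "active"): [0, 1, 1, 1, 0],
}
rows = [
    {"subject": sid, "treatment": trt, "month": month, "statusInteger": code}
    for (sid, trt), codes in profiles.items()
    for month, code in enumerate(codes)
]

counts = count_status(rows)
print(counts[("active", 1, 1)])   # 1: individual 12 is "good" at month 1
print(counts[("placebo", 1, 1)])  # 0: individual 1 is never "good"
```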