### OBSERVATION: response

The OBSERVATION column-type (formerly Y) can be used to record continuous, categorical, count or time-to-event data.

When there is no EVID or MDV column (which would “force” a line to be used as a measurement or as a dose), an observation is taken into account whenever its value is not ‘.’.

**Remarks**

**If a line contains both a non-null dose and a value in the response-column, it is treated as both a dose and a response.** In MonolixSuite versions prior to 2018R1, such a line was treated as a response only: when both a non-null amount and a measurement were defined, the measurement was favored. This is no longer the case. However, using two distinct lines, one for the dose and one for the response, is still possible and recommended.
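The rule above can be sketched as a small helper. This is only an illustration, not MonolixSuite code: the function name `classify_row` is hypothetical, it assumes that no EVID or MDV column is present, that values are read as raw strings, and that a "non-null" dose means a value that is neither ‘.’ nor zero.

```python
def classify_row(amt, y):
    """Classify one data-set line from its raw AMT and Y strings.

    Assumptions (not taken from the MonolixSuite implementation):
    - no EVID or MDV column is present;
    - "." marks a missing value;
    - a "non-null" dose is a value that is neither "." nor zero.
    """
    has_dose = amt != "." and float(amt) != 0.0
    has_observation = y != "."
    if has_dose and has_observation:
        return "dose and observation"
    if has_dose:
        return "dose"
    if has_observation:
        return "observation"
    return "ignored"


print(classify_row("50", "."))     # dose line
print(classify_row(".", "1.1"))    # observation line
print(classify_row("50", "1.1"))   # both (2018R1 and later behavior)
```

Keeping the dose and the response on two distinct lines avoids relying on this rule at all.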

#### For continuous data:

The value represents what has been measured (e.g., concentrations) and can be any double value.

*Examples:*

- Basic example:

| ID | TIME | AMT | Y |
|----|------|-----|-----|
| 1 | 0 | 50 | . |
| 1 | 0.5 | . | 1.1 |
| 1 | 1 | . | 9.2 |
| 1 | 1.5 | . | 8.5 |
| 1 | 2 | . | 6.3 |
| 1 | 2.5 | . | 5.5 |

- Full data set for continuous data: theophylline data set, the warfarin data set, and the HIV data set

#### For categorical data:

In case of categorical data, the observations at each time point can only take values in a fixed and finite set of nominal categories. In the data set, the **output categories must be coded as consecutive integers.**

*Examples:*

- Basic example:

| ID | TIME | Y |
|----|------|---|
| 1 | 0.5 | 3 |
| 1 | 1 | 0 |
| 1 | 1.5 | 2 |
| 1 | 2 | 2 |
| 1 | 2.5 | 3 |

- Full data set for joint continuous and categorical data: warfarin data set.
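If the raw categories are stored as text labels, they need to be recoded as consecutive integers before being used as observations. A minimal sketch of such a recoding is shown below. The function name `recode_categories` and the labels `none`, `low`, `mild` and `high` are hypothetical, and starting the codes at 0 is an assumption based on the basic example above.

```python
def recode_categories(labels, ordered_categories):
    """Map text labels to consecutive integer codes 0, 1, 2, ...

    `ordered_categories` lists every possible category in the desired
    order, including categories that a given individual never shows.
    """
    code = {label: index for index, label in enumerate(ordered_categories)}
    return [code[label] for label in labels]


# Hypothetical labels for the five observations of the basic example.
observed = ["high", "none", "mild", "mild", "high"]
print(recode_categories(observed, ["none", "low", "mild", "high"]))
```

Note that an individual does not have to exhibit every category: in the basic example, category 1 never appears for individual 1.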

#### For count data:

**Count data can take only non-negative integer values** that come from counting something, e.g., the number of trials required for completing a given task. The task can for instance be repeated several times and the individual's performance followed.

Count data can also represent the number of events happening in regularly spaced intervals, e.g., the number of seizures every week. If the time intervals are not regular, the data may be considered as repeated time-to-event interval censored, or the interval length can be given as a regressor to be used to define the probability distribution in the model.

*Examples:*

- Basic example: in the data set below, 10 trials are necessary the first day (t=0), 6 the second day (t=24), etc.

| ID | TIME | Y |
|----|------|----|
| 1 | 0 | 10 |
| 1 | 24 | 6 |
| 1 | 48 | 5 |
| 1 | 72 | 2 |
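Before using a column as count data, it can be useful to check the two properties discussed above: that every value is a non-negative integer, and (when counts represent events per interval) that the observation times are regularly spaced. The helpers below are an illustrative sketch with hypothetical names, not part of MonolixSuite.

```python
def is_valid_count(value):
    """Return True if `value` is a non-negative integer (booleans excluded)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def has_regular_spacing(times, tol=1e-9):
    """Return True if consecutive times are separated by the same step."""
    steps = [later - earlier for earlier, later in zip(times, times[1:])]
    return all(abs(step - steps[0]) <= tol for step in steps)


times = [0, 24, 48, 72]
counts = [10, 6, 5, 2]
print(all(is_valid_count(c) for c in counts))
print(has_regular_spacing(times))
```

If the spacing check fails, the alternatives mentioned above apply: treat the data as repeated interval-censored events, or pass the interval length as a regressor.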

#### For (repeated) time-to-event data:

In this case, the observations are the “times at which events occur”. An event may be one-off (e.g., death) or repeated (e.g., epileptic seizures, mechanical incidents, strikes). In addition, an event can be exactly observed, interval censored or right censored. The figure below summarizes the different situations:

##### For single events exactly observed:

One must indicate the start time of the observation period with Y=0, and then either the time of the event (Y=1) or, if no event has occurred, the time of the end of the observation period (Y=0).

*Examples:*

- Basic example: in the following data set, the observation period lasts from the starting time t=0 to the final time t=80. For individual 1, the event is observed at t=34. For individual 2, no event is observed during the period, so the final time (t=80) is recorded with Y=0.

| ID | TIME | Y |
|----|------|---|
| 1 | 0 | 0 |
| 1 | 34 | 1 |
| 2 | 0 | 0 |
| 2 | 80 | 0 |

- Full data sets for time-to-event data: PBC data set and Oropharynx data set

##### For repeated events exactly observed:

One must indicate the start time of the observation period (Y=0), the end time (Y=0) and the time of each event (Y=1).

*Examples:*

- Basic example: below, the observation period lasts from the starting time t=0 to the final time t=80. For individual 1, two events are observed at t=34 and t=76, and for individual 2, no event is observed during the period.

| ID | TIME | Y |
|----|------|---|
| 1 | 0 | 0 |
| 1 | 34 | 1 |
| 1 | 76 | 1 |
| 1 | 80 | 0 |
| 2 | 0 | 0 |
| 2 | 80 | 0 |
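The coding for exactly observed events can be read back as follows. This sketch only illustrates the convention described above (first record with Y=0 marks the start, records with Y=1 are events, a final record with Y=0 marks the end of the observation period). The function name `summarize_exact_events` is hypothetical.

```python
def summarize_exact_events(records):
    """Summarize the (time, y) records of one individual, sorted by time.

    Returns the start and end of the observation period and the list of
    exactly observed event times.
    """
    if not records or records[0][1] != 0:
        raise ValueError("the first record must mark the start of the observation period (Y=0)")
    start = records[0][0]
    end = records[-1][0]
    # Every later record with Y=1 is an exactly observed event.
    events = [time for time, y in records[1:] if y == 1]
    return {"start": start, "end": end, "events": events}


individual_1 = [(0, 0), (34, 1), (76, 1), (80, 0)]
individual_2 = [(0, 0), (80, 0)]
print(summarize_exact_events(individual_1))
print(summarize_exact_events(individual_2))
```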

##### For single events interval censored:

When the exact time of the event is not known, but only an interval can be given, the start time of this interval is given with Y=0, and the end time with Y=1. As before, the start time of the observation period must be given with Y=0.

*Examples:*

- Basic example: we only know that the event has happened between t=32 and t=35.

| ID | TIME | Y |
|----|------|---|
| 1 | 0 | 0 |
| 1 | 32 | 0 |
| 1 | 35 | 1 |

##### For repeated events interval censored:

In this case, we do not know the exact event times, but only the number of events that occurred for each individual in each interval of time. The column-type Y can now take values greater than 1, if several events occurred during an interval.

*Examples:*

| ID | TIME | Y |
|----|------|---|
| 1 | 0 | 0 |
| 1 | 32 | 0 |
| 1 | 35 | 1 |
| 1 | 50 | 1 |
| 1 | 56 | 0 |
| 1 | 78 | 2 |
| 1 | 80 | 1 |

No event occurred between t=0 and t=32, 1 event occurred between t=32 and t=35, 1 between t=35 and t=50, none between t=50 and t=56, 2 between t=56 and t=78 and finally 1 between t=78 and t=80.
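The reading of this data set can be made explicit with a short sketch that turns consecutive records into (start, end, number of events) triples. The function name `interval_event_counts` is hypothetical. It assumes that the records of one individual are sorted by time and that the first record marks the start of the observation period.

```python
def interval_event_counts(records):
    """Convert (time, y) records into (start, end, n_events) intervals.

    The Y value of each record is the number of events that occurred
    since the previous record.
    """
    intervals = []
    for (previous_time, _), (time, y) in zip(records, records[1:]):
        intervals.append((previous_time, time, y))
    return intervals


records = [(0, 0), (32, 0), (35, 1), (50, 1), (56, 0), (78, 2), (80, 1)]
for start, end, n_events in interval_event_counts(records):
    print(f"{n_events} event(s) between t={start} and t={end}")
```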

#### Format restrictions

- A data set shall not contain more than one column with column-type OBSERVATION.
- The response-column shall contain either a double value or the string “.”.
- If there is a non null double value in dose-column, there must be a non null double value in the response-column.

#### Warnings

- If a subject or a subject/occasion has no observations, a warning message is displayed listing the subjects or subject/occasions that have no measurements.

#### FAQ

**My data is not “over time”, what should I do?** You can arbitrarily set the time of each observation to 0.

### OBSERVATION ID: response type (former YTYPE)

If observations are recorded on several quantities (several concentrations, effects, etc.), the column-type OBSERVATION ID allows assigning names to the observations of the column-type Y, in order to map them to the quantities output by the model. **Notice that in case of a dose line, the value in the OBSERVATION ID column is not read**, so the user can set any value (‘.’, the same value as for a concentration, …).

Entries in the column-type YTYPE can be strings or integers; however, we strongly recommend using only alphanumeric characters. The underscore “_” character is allowed in the strings of your data set. The mapping of the YTYPE to the model output (in the OUTPUT block of the Mlxtran model file) follows alphabetical order (and not name matching). In the following data set:

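| TIME | DOSE | Y | Y_TYPE |
|------|------|-----|--------|
| 0 | . | 12 | conc |
| 5 | . | 6 | conc |
| 10 | . | 4 | effect |
| 15 | . | 3 | effect |
| 20 | . | 2.1 | conc |
| 25 | . | 2 | conc |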

with the following OUTPUT block in the Mlxtran model file:

OUTPUT: output = {E, Cc}

the observations tagged with “conc” will be mapped to the first output “E”, and those tagged with “effect” will be mapped to the second output “Cc”, because in alphabetical order “conc” comes before “effect”. To avoid confusion, **we recommend using integers in the OBSERVATION ID column-type**, with “1” corresponding to the first output, “2” to the second, etc. If you have 10 or more types of observations, notice that in alphabetical order “10” comes before “2”.
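The alphabetical mapping can be reproduced with a few lines of Python. This is only a sketch of the ordering rule described above, not the MonolixSuite implementation. The function name `map_observation_ids` is hypothetical, and plain string comparison (as done by Python's `sorted`) is assumed to match the alphabetical order used by the software for simple alphanumeric tags.

```python
def map_observation_ids(observation_ids, model_outputs):
    """Pair the distinct observation ids, sorted as strings, with the
    model outputs taken in the order of the OUTPUT block."""
    ordered_ids = sorted(set(str(obs_id) for obs_id in observation_ids))
    return dict(zip(ordered_ids, model_outputs))


tags = ["conc", "conc", "effect", "effect", "conc", "conc"]
print(map_observation_ids(tags, ["E", "Cc"]))

# With string ordering, "10" is sorted before "2".
print(sorted(["1", "2", "10"]))
```

In the first call, “conc” ends up paired with “E”, which is exactly the pitfall described above: the order of the tags, not their names, decides the mapping.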

If you use strings, note that “.” is not interpreted as a repetition of the previous line but as the name of a response. For instance, the following data set creates three different types of responses: “type1”, “.”, and “type2”:

| TIME | DOSE | Y | Y_TYPE |
|------|------|-----|--------|
| 0 | . | 12 | type1 |
| 5 | . | 6 | type1 |
| 10 | . | 4 | . |
| 15 | . | 3 | . |
| 20 | . | 2.1 | type2 |
| 25 | . | 2 | type2 |

*Format restrictions (an exception will be thrown otherwise):*

- A data set shall not contain more than one column with column-type OBSERVATION ID.

### CENSORED: censored response

- CENSORED = 1 means that the value in the response-column ($y_{obs}$, the content of the column with column-type Y) is an upper limit: the true observation $y$ verifies $y < y_{obs}$.
- CENSORED = 0 means the value in response-column corresponds to a valid observation (no interval associated).
- CENSORED = -1 means that the value in the response-column ($y_{obs}$) is a lower bound: the true observation $y$ verifies $y > y_{obs}$.

*Format restrictions (an exception will be thrown otherwise):*

- A data set shall not contain more than one column with column-type CENSORED.
- There are only three possible values : -1, 0, and 1.
- String “.” is interpreted as 0.

### LIMIT: limit for censored values

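When the LIMIT column contains a value and CENS is different from 0, the value in the LIMIT column is interpreted as the second bound of the observation interval. Writing $y_{obs}$ for the value in the response-column and $y_{limit}$ for the value in the LIMIT column, this implies that $y \in [y_{limit}, y_{obs}]$ when CENS = 1 and $y \in [y_{obs}, y_{limit}]$ when CENS = -1.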

*Format restrictions (an exception will be thrown otherwise):*

- A data set shall not contain more than one column with column-type LIMIT.
- A data set shall not contain any column with column-type LIMIT if no column with column-type CENS is present.
- Column LIMIT shall contain either a string that can be converted to a double or “.”.

### Example of censored data definition

The proposed example illustrates the case of an upper and a lower bound on a data set for a classical PK model (first order absorption and linear elimination). From the measurement point of view:

- There is a lower bound on what can be measured at 0.5, because the sensor is not able to measure lower concentrations. This corresponds to the CENS=1 case. Moreover, the concentration cannot be lower than 0, thus LIMIT=0.
- There is an upper bound on what can be measured at 5, because the sensor is not able to measure higher concentrations. This corresponds to the CENS=-1 case. Moreover, from the experimental/modeler point of view, the concentration cannot be higher than 6, thus LIMIT=6.

The measurements are represented in the following figure.

The measurements correspond to the blue stars; the real values, when censoring arises, are in red and green. The corresponding data set is:

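| ID | Time | Y | CENS | LIMIT |
|----|------|-----|------|-------|
| 1 | 0 | 0.5 | 1 | 0 |
| 1 | 1 | 0.5 | 1 | 0 |
| 1 | 2 | 4.7 | 0 | 0 |
| 1 | 3 | 5.0 | -1 | 6 |
| 1 | 4 | 5.0 | -1 | 6 |
| 1 | 5 | 4.5 | 0 | 0 |
| 1 | 6 | 3.8 | 0 | 0 |
| … | … | … | … | … |
| 1 | 15 | 0.6 | 0 | 0 |
| 1 | 16 | 0.5 | 0 | 0 |
| 1 | 17 | 0.5 | 1 | 0 |
| 1 | 18 | 0.5 | 1 | 0 |
| … | … | … | … | … |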
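To make the meaning of each line concrete, the sketch below computes the interval in which the true observation lies, given the Y, CENS and LIMIT values of a line. It is an illustration only: the function name `observation_interval` is hypothetical, the use of an infinite bound when no LIMIT is given is an assumption, and whether the bounds are included or excluded is not addressed here.

```python
import math


def observation_interval(y, cens, limit=None):
    """Return (lower, upper) bounds for the true observation.

    cens = 0  : y is a valid observation, the interval reduces to y.
    cens = 1  : y is an upper limit, LIMIT (if any) is the lower bound.
    cens = -1 : y is a lower bound, LIMIT (if any) is the upper bound.
    A missing LIMIT is represented by an infinite bound (assumption).
    """
    if cens == 0:
        return (y, y)  # LIMIT is ignored for uncensored lines
    if cens == 1:
        return (-math.inf if limit is None else limit, y)
    if cens == -1:
        return (y, math.inf if limit is None else limit)
    raise ValueError("CENS must be -1, 0 or 1")


# Three lines taken from the data set above: (Y, CENS, LIMIT).
print(observation_interval(0.5, 1, 0))    # below the lower measurable bound
print(observation_interval(4.7, 0, 0))    # regular observation
print(observation_interval(5.0, -1, 6))   # above the upper measurable bound
```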

The mathematical handling of censored data is described here.