Abstract

This study analyzes survival outcomes for patients with squamous cell carcinoma of the oropharynx, using data from a multicenter clinical trial conducted by the Radiation Therapy Oncology Group. A total of 195 patients from six major institutions were randomized to receive either radiation therapy alone or radiation therapy combined with a chemotherapeutic agent. The primary objective was to compare the effectiveness of the two treatment strategies while accounting for other prognostic factors.

A Cox proportional hazards model was used to evaluate the relationship between survival time and clinical/demographic covariates, including age, sex, tumor stage (T and N), general condition, tumor grade, and treatment group. Approximately 30% of patients were censored due to being alive at last follow-up. Forward stepwise regression and interaction modeling were applied to identify the best-fitting model, and diagnostics were performed to validate model assumptions.

Results indicated that treatment type was not statistically significant after adjusting for other variables. However, tumor staging, general condition, and certain interaction terms demonstrated a meaningful effect on survival. The final model incorporated interaction effects to address violations of the proportional hazards assumption, improving interpretability and validity. These findings underscore the importance of considering both individual covariates and their interactions when modeling time-to-event outcomes in cancer survival studies.

Introduction

Pharyngeal cancer represents a serious public health concern, affecting a substantial number of individuals worldwide. Its prognosis and treatment outcomes are often influenced by a complex interplay of tumor characteristics, patient-specific factors, and therapeutic strategies. In clinical research, survival analysis plays a pivotal role in assessing the effectiveness of such treatments by evaluating the time duration between a defined starting point—often diagnosis or treatment initiation—and the occurrence of a terminal event, typically death.

This study presents a survival analysis of patients diagnosed with squamous cell carcinoma of the oropharynx, utilizing data contributed by six major institutions involved in a broader multi-center clinical trial. Patients were randomly assigned to one of two treatment arms:

Radiation therapy alone, or
Radiation therapy combined with chemotherapy.

The core objective of this analysis is to evaluate whether the combined treatment improves survival outcomes relative to radiation alone, while controlling for a series of clinical and demographic covariates.

In addition to treatment type, several patient and tumor-related attributes are hypothesized to impact survival outcomes. These include:

Sex: Male or Female
T staging: Describes the size and local extent of the primary tumor
- T=1: Small tumor (≤2 cm)
- T=4: Extensive involvement of adjacent tissues
N staging: Captures the extent of regional lymph node involvement
- N=0: No nodes affected
- N=3: Severe nodal spread
Age: Continuous variable, evaluated using piecewise specification
General condition: Clinical estimate of the patient’s functional status at diagnosis
Tumor grade: Reflects differentiation and aggressiveness
Treatment (Tx):
- Tx = 1: Standard radiation therapy
- Tx = 2: Combined radiation + chemotherapy

To ensure a relatively homogeneous clinical cohort, patients with early-stage disease (e.g., T=1, N=0; T=2, N=0) or distant metastases were excluded. Nonetheless, considerable clinical heterogeneity remains, as the included population varies in tumor burden and baseline health status.

Approximately 30% of survival times are censored, primarily because patients were still alive at the time of the final follow-up. A smaller fraction of patients were lost to follow-up, typically due to relocation or transfer to non-participating institutions.

While the primary focus of the study is to assess differences in survival between treatment arms, a secondary objective is to identify which covariates exert the most substantial influence on survival. These findings can inform future clinical decision-making and support more personalized treatment strategies.

To conduct this analysis, we apply the Cox proportional hazards model, an established method for modeling time-to-event data. We begin with exploratory data analysis, followed by stepwise model development, evaluation of the proportional hazards (PH) assumption, and the incorporation of interaction terms where appropriate. The results provide insight into both treatment efficacy and the role of patient- and tumor-specific factors in shaping survival outcomes.

The subsequent sections detail the exploratory data analysis, model development, proportional hazards diagnostics, and interpretation of key findings, ultimately guiding conclusions about the impact of treatment and covariates on patient survival outcomes.

Data Cleaning

The dataset required minor preprocessing to prepare it for analysis.

The following steps were taken:

Recode Sex:
The variable Sex was recoded from numeric codes (1 = Male, 2 = Female) into descriptive character labels, “Male” and “Female”.
Handle Missing and Invalid Values:
- For Cond (patient condition), values of 0 and 9 were treated as invalid or missing and replaced with NA.
- For Grade (tumor differentiation), values of 9 were treated as missing and replaced with NA.
- Missing values in Cond and Grade were imputed using their median values, which maintains the ordinal nature of these variables.

Exploratory Data Analysis (EDA)

1. Data Overview

We analyzed data from 195 patients diagnosed with head and neck cancer, with survival time measured in days from diagnosis.

Table 1 summarizes the variables included in the dataset.

Table 1: Description of the Pharynx Dataset

Variable	Description
`Time`	Survival time (days)
`Status`	Event indicator (1 = death, 0 = censored)
`Sex`	Male or Female
`Tx`	Treatment group (Standard or Test)
`Grade`	Tumor differentiation (1 = well differentiated, 2 = moderately differentiated, 3 = poorly differentiated, 9 = missing)
`Age`	Age in years at time of diagnosis
`Cond`	Condition at diagnosis (1 = no disability, 2 = restricted work, 3 = requires assistance with self-care, 4 = bed-confined, 9 = missing)
`Site`	Tumor location (Faucial arch, Tonsillar fossa, Posterior pillar, Pharyngeal tongue, Posterior wall)
`T_Stage`	Primary tumor stage (1–4)
`N_Stage`	Node metastasis stage (0–3)
`Inst`	Institution code
`Case`	Patient identifier

2. Age Distribution

To understand the age profile of patients, we plotted the distribution of age at diagnosis (Figure 1).
The mean age was 60.44 years, and the median age was 60 years, with ages ranging from 20 to 90 years.

Figure 1: Distribution of patient age at diagnosis.

3. Participant Demographics: Age and Gender

Figure 2 provides an overview of age distribution and gender composition among the study participants.

Gender distribution: The study population consists of 149 males (76%) and 46 females (24%), highlighting a male-dominated cohort.
Age distribution: Both males and females exhibit similar age profiles, with most patients clustered between 50 and 70 years of age.
Age density by gender: The density plots show a slightly higher concentration of males between 60–70 years compared to females.
Boxplots: Median ages for males and females are close, suggesting no major age difference across gender groups.

Figure 2: Age and gender overview of participants, including density plots, boxplots, and gender counts.

4. Event Time Distribution by Status

Figure 3 shows the distribution of survival times (in days), separated by observed events (deaths) and censored cases.

Events (Dead): The median event time was 325.5 days, indicating that half of the deaths occurred before approximately 11 months from diagnosis.
Censored cases: The median censored time was 943 days, meaning patients who survived or were lost to follow-up had a much longer observation period.
Pattern observation: Most deaths occurred early (within the first 500 days), while censored cases dominate after 800 days.

Figure 3: Distribution of survival time by status (event vs. censored).

5. Kaplan–Meier Survival by Sex

Figure 4 illustrates the Kaplan–Meier (KM) survival curves stratified by sex. Females (n = 46) appear to have slightly higher survival probabilities compared to males (n = 149) during the first year.

Figure 4: Kaplan–Meier survival curves by sex with 95% confidence intervals and risk tables.

5.1 1-Year Survival Probability

To complement the Kaplan–Meier curves, Table 2 reports the 1-year survival probabilities for both sexes, along with 95% confidence intervals derived using Greenwood’s formula.

Table 2: 1-year survival probabilities by sex (Kaplan–Meier estimates with 95% CI).

Characteristic	1-year survival (95% CI)
Gender
Female	63% (51%, 79%)
Male	56% (49%, 65%)

5.2 Mathematical Approach

The Kaplan–Meier estimator for the survival function $S(t)$ is:

\[ \widehat{S}(t) = \prod_{t_{(i)} \le t} \left( 1 - \frac{d_i}{n_i} \right), \]

where:
- $t_{(i)}$ are the ordered event times,
- $d_i$ is the number of events (deaths) at time $t_{(i)}$,
- $n_i$ is the number of individuals at risk just prior to $t_{(i)}$.

The variance of $\widehat{S}(t)$ is approximated using Greenwood’s formula:

\[ \widehat{\mathrm{Var}}\{ \widehat{S}(t) \} = \widehat{S}(t)^2 \sum_{t_{(i)} \le t} \frac{d_i}{n_i (n_i - d_i)}. \]

Confidence intervals are computed on the log–log scale for stability:

\[ \log \{-\log \widehat{S}(t)\} \pm z_{1-\alpha/2} \sqrt{ \frac{ \sum_{t_{(i)} \le t} \frac{d_i}{n_i (n_i - d_i)} } { [\log \widehat{S}(t)]^2 } }, \]

and then back-transformed to the survival scale.

5.3 Interpretation

1-year survival estimates:
- Females: 63% (95% CI: 51%–79%)
- Males: 56% (95% CI: 49%–65%)

These results suggest that females have a modest survival advantage compared to males at 1 year. However, Kaplan–Meier estimates are unadjusted and do not account for potential confounding by other variables (e.g., age, tumor stage, condition).

To formally test for group differences, the log-rank test is used, which compares the observed and expected event counts across groups under the null hypothesis of identical survival curves.

The Kaplan–Meier analysis serves as an initial univariate exploration, motivating the use of a Cox proportional hazards model to evaluate the combined effects of multiple covariates.

Model Development

1. Objective and Modeling Framework

Our objective was to quantify the effect of clinical and demographic covariates on time to death. We used the Cox proportional hazards (PH) model, which specifies the hazard for subject $i$ with covariate vector $X_i$ at time $t$ as:

\[ h(t \mid X_i) = h_0(t)\exp\left( \beta^\top X_i \right), \]

where $h_0(t)$ is an unspecified baseline hazard, and $\beta$ is a vector of log–hazard ratios (log‑HRs). Estimation proceeds by maximizing the partial likelihood:

\[ L(\beta) = \prod_{i \in \mathcal{D}} \frac{\exp(\beta^\top X_i)} {\sum_{j \in R(t_i)} \exp(\beta^\top X_j)}, \]

where $\mathcal{D}$ is the set of subjects who experienced the event, and $R(t_i)$ is the risk set at time $t_i$. Hypothesis tests and confidence intervals are based on the score, Wald, or likelihood ratio statistics derived from this likelihood.

2. Candidate Covariates

The following covariates were considered for model selection:

Age (continuous, in years),
Sex (Male/Female),
Tx (Treatment group: Standard or Test),
Grade (1 = well differentiated, 2 = moderately differentiated, 3 = poorly differentiated; 9 = missing),
Cond (1 = no disability, 2 = restricted work, 3 = requires assistance with self-care, 4 = bed-confined; 9 = missing),
Site (tumor location: 1 = Faucial arch, 2 = Tonsillar fossa, 3 = Posterior pillar, 4 = Pharyngeal tongue, 5 = Posterior wall),
T_Stage (tumor size: 1–4),
N_Stage (nodal metastasis: 0–3).

3. Functional Form Assessment with Martingale Residuals

To evaluate whether Age had a linear effect on the log-hazard, we examined martingale residuals from a Cox model excluding Age. Martingale residuals for subject $ i $ are defined as:

\[ M_i = \delta_i - \widehat{H}_0(T_i) \exp(\hat{\beta}^\top X_i), \]

where:

$\delta_i$ is the event indicator (1 = death, 0 = censored),
$\widehat{H}_0(T_i)$ is the estimated baseline cumulative hazard,
$X_i$ is the covariate vector for subject $i$.

Figure 5: Martingale residual plot from ggcoxfunctional() for Age, showing departure from linearity.

The martingale residual plot (Figure 5) suggests that the relationship between Age and the log-hazard is not perfectly linear, particularly for patients older than approximately 46 years. To address this non-linearity, we compared three Age specifications:

Model 1: Age as a continuous linear term.
Model 1.1: Piecewise Age with an interaction $\mathrm{Age} \times I(\mathrm{Age} > 46)$.
Model 1.2: Dichotomized Age using $I(\mathrm{Age} > 46)$ only.

Now, the AIC values (Table 3) indicate that Model 1.1 (AIC = 1322.379) provides the best fit among these, offering a better trade-off between model complexity and goodness-of-fit.

Conclusion: We selected the piecewise Age model

\[\mathrm{Age} + I(\mathrm{Age} > 46) + \mathrm{Age} \times I(\mathrm{Age} > 46)\]

for all subsequent modeling steps, as it best captures the observed non-linearity without oversimplifying Age into a single binary variable.

Table 3: AIC comparison of different Age specifications.

Model	df	AIC
Model 1	1	1325.886
Model 1.1	3	1322.379
Model 1.2	1	1322.532

4. Multivariable Model Development

To evaluate the combined effects of all clinical and demographic covariates, we initially fit a full Cox proportional hazards model including the piecewise Age specification and all other predictors:

full_model <- coxph(Surv(Time, Status) ~ Age * I(Age > 46) + Sex + Tx + Grade + Cond + Site + T_Stage + N_Stage, data = df)

Table 4: Full Cox proportional hazards model: Estimated hazard ratios and 95% confidence intervals.

Characteristic	HR	95% CI	p-value
Age	0.86	0.75, 0.97	0.017
I(Age > 46)
FALSE	—	—
TRUE	0.00	0.00, 0.77	0.040
Sex
Female	—	—
Male	1.30	0.87, 1.94	0.2
Tx
Standard	—	—
Test	1.11	0.79, 1.57	0.5
Grade	0.86	0.66, 1.12	0.3
Cond	2.37	1.75, 3.21	<0.001
Site	0.95	0.82, 1.10	0.5
T_Stage	1.34	1.06, 1.69	0.015
N_Stage	1.15	0.99, 1.33	0.060
Age * I(Age > 46)
Age * TRUE	1.16	1.02, 1.32	0.024
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

As shown in Table 4, the model incorporates all key predictors and allows the effect of Age to differ before and after age 46.

4.1 Model Comparison with Age-Only Model

To evaluate the contribution of additional covariates beyond Age, we compare the AIC of the Full Model to that of the best-fitting Age-only specification (Model 1.1):

Table 5: AIC comparison of Age-only model (Model 1.1) and Full Model.

Model	df	AIC
Model 1.1	3	1322.379
Full Model	10	1294.149

As shown in Table 5, the Full Model achieves a lower AIC, indicating a better fit than the Age-only model. This supports the inclusion of clinical and demographic covariates in explaining survival outcomes.

4.2 Proportional Hazards Check for Full Model

We test the proportional hazards (PH) assumption using Schoenfeld residuals:

Table 6: Schoenfeld residual test for the proportional hazards (PH) assumption for the Full Model.

Variable	chisq	df	P-Value
Age	1.38	1	0.24
I(Age > 46)	0.03	1	0.87
Sex	0.00	1	1.00
Tx	0.66	1	0.42
Grade	0.03	1	0.86
Cond	5.22	1	0.02
Site	0.50	1	0.48
T_Stage	3.51	1	0.06
N_Stage	3.40	1	0.07
Age:I(Age > 46)	0.42	1	0.51
GLOBAL	14.78	10	0.14

Interpretation:

The GLOBAL test ($\chi^2 = 14.78$, p = 0.14) suggests no overall violation of the PH assumption for the Full Model.
However, Cond shows a p-value of 0.02, indicating potential non-proportionality.
T_Stage (p = 0.06) and N_Stage (p = 0.07) are borderline.

4.3 Stepwise Model Selection (Forward)

To build a more parsimonious model while retaining the piecewise Age specification, we applied forward stepwise AIC selection, beginning with Model 1.1.

step_forward <- stepAIC(m1.1, 
                        scope = ~ Age * I(Age > 46) + Sex + Tx + Grade + Cond + Site + T_Stage + N_Stage, 
                        direction = "forward", 
                        trace = FALSE)

Table 7: Forward stepwise AIC-selected Cox model: Estimated hazard ratios and 95% confidence intervals.

Characteristic	HR	95% CI	p-value
Age	0.86	0.76, 0.98	0.027
I(Age > 46)
FALSE	—	—
TRUE	0.01	0.00, 1.30	0.062
Cond	2.40	1.78, 3.23	<0.001
T_Stage	1.38	1.10, 1.74	0.006
N_Stage	1.12	0.97, 1.29	0.11
Sex
Female	—	—
Male	1.34	0.90, 2.00	0.2
Age * I(Age > 46)
Age * TRUE	1.15	1.01, 1.31	0.037
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

Interpretation of Current Model:

Several predictors show strong or borderline associations with survival, including Age (p = 0.027), Cond (p < 0.001), and T_Stage (p = 0.006).
The interaction term Age × I(Age > 46) remains significant, supporting the non-linear specification of Age.
While N_Stage and Sex are not significant at conventional levels, they were retained for model fit under the AIC framework.

4.4 AIC Comparison

We now compare the AIC of the Full Model and the Forward Model:

Table 8: AIC comparison between the Full Cox model and the Forward AIC-selected model.

Model	df	AIC
Full Model	10	1294.149
Forward Model	7	1289.970

As shown in Table 8, the Forward Model achieves a lower AIC (1289.97) than the Full Model (1294.15), indicating a better fit with fewer predictors.

4.5 Proportional Hazards Check for Forward Model

We test the PH assumption for the Forward Model:

Table 9: Schoenfeld residual test results for the Forward Cox model.

Variable	chisq	df	P-Value
Age	1.53	1	0.22
I(Age > 46)	0.04	1	0.84
Cond	4.98	1	0.03
T_Stage	3.55	1	0.06
N_Stage	3.65	1	0.06
Sex	0.00	1	0.96
Age:I(Age > 46)	0.44	1	0.51
GLOBAL	13.44	7	0.06

As shown in Table 9, the GLOBAL test yields p = 0.06, indicating no strong evidence of violation in the PH assumption overall.

However:

Cond exhibits possible non-proportionality (p = 0.03), suggesting its effect may vary over time.
All other variables meet the proportional hazards assumption.

We next explore Schoenfeld residual plots to visually assess potential time-varying effects.

Figure 6: Schoenfeld residual plots for the Forward Model. Most covariates show approximately flat trends over time, except for Cond, which displays potential deviation from the PH assumption.

As visualized in Figure 6:

Cond shows a noticeable deviation from flatness, supporting the earlier statistical finding of non-proportionality.
T_Stage and N_Stage exhibit mild curvature, hinting at possible borderline time-varying effects.
The remaining covariates appear consistent with the proportional hazards assumption, displaying relatively stable residual trends over time.

4.6 Handling PH Violations in Categsinorical Covariates

The Schoenfeld residual diagnostics revealed potential violations of the proportional hazards (PH) assumption for three covariates:

Cond (p = 0.03),
T_Stage (p = 0.06),
N_Stage (p = 0.06).

These variables are all ordinal:

Cond represents worsening patient condition (1 = no disability, 4 = bed-confined),
T_Stage reflects increasing tumor size (1 to 4),
N_Stage captures nodal metastasis severity (0 to 3).

Because these are ordinal variables, we are not restricted to removing them or treating them as unordered factors. Instead, several options are available for addressing the non-proportional hazards:

Possible strategies include:

Modify the functional form:
- Treat ordinal variables as continuous (e.g., linear trend),
Try interaction terms:
- Add interactions with covariates like Age,
- Add interactions with time proxies (e.g., Surv(Time, Status) ~ Cond * Age),
- Allows modeling of non-proportional effects without full stratification.
Remove the variable:
- Easiest approach, but not ideal, as these variables are clinically meaningful.

Given the clinical relevance of Cond, T_Stage, and N_Stage, we chose to retain all three variables in the model. To address the non-proportionality, we will explore interaction terms as our primary strategy while also examining the effect of removing them for sensitivity analysis.

The next section fits interaction models and compares their fit and assumptions.

4.7 Interaction Term Exploration

To address previously observed violations of the proportional hazards assumption and assess whether interaction terms could improve model fit and stability, we extended the Forward Model by introducing key interactions between clinical and demographic covariates.

Specifically, we tested the following interactions:

Cond × Tx: The effect of treatment may depend on patient condition,
T_Stage × Sex: The prognostic impact of tumor size may differ between males and females,
N_Stage × Age: The effect of nodal involvement may vary across age.

We retained the piecewise Age specification using Age + I(Age > 46) + Age:I(Age > 46), and included Sex as a main effect to ensure interpretability of the interaction terms.

The resulting model was:

m2 <- coxph(Surv(Time, Status) ~ Age * I(Age > 46) + Sex + Cond:Tx + T_Stage:Sex + N_Stage:Age, data = df)

4.8 Proportional Hazards Check for Interaction Model

We evaluated the proportional hazards (PH) assumption for the Interaction Model using Schoenfeld residuals:

Table 10: Schoenfeld residual test results for the Interaction Model, including all interaction terms.

Variable	chisq	df	P-Value
Age	1.59	1	0.21
I(Age > 46)	0.04	1	0.84
Sex	0.04	1	0.85
Age:I(Age > 46)	0.46	1	0.50
Cond:Tx	4.19	2	0.12
Sex:T_Stage	3.92	2	0.14
Age:N_Stage	1.93	1	0.16
GLOBAL	12.68	9	0.18

Interpretation:

The GLOBAL Schoenfeld test yields p = 0.13, suggesting no evidence of PH violation overall.
All individual terms, including interaction terms such as Cond:Tx, T_Stage:Sex, and N_Stage:Age, show p-values > 0.05.
This suggests that including these interaction terms helped address the prior non-proportionality, particularly with Cond, which had previously violated the assumption.

We then visually inspect the Schoenfeld residual plots to confirm that covariate effects remain constant over time:

Figure 7: Schoenfeld residual plots for interaction terms in the Interaction Model. All terms display stable patterns over time, supporting the PH assumption.

As visualized in Figure 7, the interaction terms display stable residual patterns over time. No substantial deviation is observed, reinforcing the conclusion that the proportional hazards assumption is satisfied even for interaction effects.

Table 11: AIC comparison between the Forward Model and the Interaction Model.

Model	df	AIC
Forward Model	7	1289.970
Interactive Model	9	1293.286

4.8.1 Model Selection Summary

As shown in Table 11, the Forward Model yields a slightly lower AIC (1289.97) compared to the Interactive Model (1293.29), suggesting a marginally better fit.

However, the Forward Model violates the proportional hazards (PH) assumption, especially for Cond.

In contrast, the Interactive Model resolves these issues, supported by:

A non-significant GLOBAL Schoenfeld test (p = 0.18),
And Schoenfeld residual plots indicating no time-varying effects.

Thus, the Interactive Model is chosen as the final model for its stronger adherence to the PH assumption and interpretability of interaction effects.

4.9 Interpretation of Final Model Coefficients

The final model selected was the Interactive Model, which includes the piecewise specification for Age and interaction terms to address prior violations of the proportional hazards (PH) assumption. The model is specified as:

\[ h(t \mid \mathbf{X}) = h_0(t)\,\exp\left\{ \beta_1\,\text{Age} + \beta_2\,\mathbb{I}(\text{Age} > 46) + \beta_3\,\text{Age}\cdot\mathbb{I}(\text{Age} > 46) + \beta_4\,\text{Sex} + \beta_5\,(\text{Cond}\times\text{Tx}) + \beta_6\,(\text{T\_Stage}\times\text{Sex}) + \beta_7\,(\text{N\_Stage}\times\text{Age}) \right\}. \]

Where:

$h(t \mid \mathbf{X})$ is the hazard at time $t$ given covariates $\mathbf{X}$
$h_0(t)$ is the unspecified baseline hazard
$\beta_1, \dots, \beta_7$ are regression coefficients for main effects and interactions
$\mathbb{I}(\text{Age} > 46)$ is an indicator function (1 if Age > 46, 0 otherwise)

Exponentiating the coefficients yields hazard ratios (HRs), which are interpreted as the multiplicative change in the hazard for a one-unit increase in the covariate, holding all else constant.

Table 12: Final Cox proportional hazards model (Interactive Model): Estimated hazard ratios and 95% confidence intervals.

Characteristic	HR	95% CI	p-value
Age	0.86	0.75, 0.98	0.023
I(Age > 46)
FALSE	—	—
TRUE	0.00	0.00, 1.19	0.058
Sex
Female	—	—
Male	0.41	0.06, 2.74	0.4
Age * I(Age > 46)
Age * TRUE	1.16	1.01, 1.32	0.034
Cond * Tx
Cond * Standard	2.34	1.64, 3.34	<0.001
Cond * Test	2.52	1.86, 3.41	<0.001
Sex * T_Stage
Female * T_Stage	1.02	0.60, 1.73	>0.9
Male * T_Stage	1.49	1.14, 1.94	0.003
Age * N_Stage	1.00	1.00, 1.00	0.2
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

As shown in Table 12, the estimated hazard ratios allow for interpretation of both main effects and interactions.

4.9.1 Interpretation of Key Terms

Age and Age × I(Age > 46): The significant interaction supports a nonlinear relationship with hazard. For patients over 46, the effect of aging on hazard is amplified — the risk of death increases more steeply per year above 46 than below.
Cond:Tx: This interaction implies that the effect of treatment varies by functional condition. For example, patients in worse condition (e.g., requiring assistance or bed-confined) may benefit more (or less) from the test treatment than those in better condition. This term accounts for that heterogeneity.
T_Stage:Sex: This term captures differences in how tumor size affects survival between males and females. A significant hazard ratio here would indicate that sex modifies the impact of tumor progression.
N_Stage:Age: Suggests that nodal involvement interacts with age — the effect of nodal metastasis might be more harmful in older patients.
Sex (main effect): Included to interpret interactions properly. Even if not significant alone, it may still play an important role when combined with terms like T_Stage or others.

4.9.2 Statistical Significance

Terms with p < 0.05 are considered statistically significant, indicating strong evidence that the covariate is associated with survival time.
Borderline terms (e.g., p < 0.1) may still be clinically relevant, especially when part of an interaction.

4.9.2 Clinical Implications

This model enables nuanced, individualized predictions:

A 65-year-old with advanced N_Stage may have a very different risk profile than a 40-year-old with the same N_Stage.
Patients receiving treatment must be evaluated within the context of their functional status (Cond).
Sex-specific effects of tumor characteristics are accounted for, allowing more tailored risk assessments.

Research Question

We investigate whether treatment type (Standard vs Test) affects survival outcomes for two otherwise identical patients, using the final Cox proportional hazards model that includes interaction terms.

1. Clinical Setup

We consider two hypothetical patients with the following identical characteristics:

Age: 60 years old
Sex: Male
General Condition (Cond): 1
Tumor Stage (T_Stage): 1
Nodal Stage (N_Stage): 1

The only difference:

Patient A receives Standard treatment ($\text{Tx} = 0$)
Patient B receives Test treatment ($\text{Tx} = 1$)

We aim to compute the hazard ratio (HR) comparing these two patients, focusing on the interaction term Cond × Tx.

2. Mathematical Framework

The fitted Cox model includes interaction terms to account for context-dependent effects of covariates. Of particular interest is the interaction between Condition and Treatment:

\[ h(t \mid \mathbf{X}) = h_0(t) \cdot \exp\left( \beta_1 \cdot \text{Age} + \beta_2 \cdot \mathbb{I}(\text{Age} > 46) + \cdots + \beta_{ct} \cdot (\text{Cond} \times \text{Tx}) + \cdots \right) \]

Since both patients have ( = 1 ), and differ only in treatment type, the hazard comparison simplifies to:

\[ \log \left( \frac{h_{\text{Test}}(t)}{h_{\text{Standard}}(t)} \right) = \beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}} \]

Exponentiating gives the hazard ratio:

\[ HR = \exp\left( \beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}} \right) \]

This HR quantifies the relative hazard of receiving Test treatment versus Standard treatment for patients with identical clinical profiles.

3. Interpretation

The hazard ratio comparing Test treatment to Standard treatment, holding all other covariates constant, is approximately:

\[ HR = \exp(\beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}}) \approx 1.077 \]

This means that a patient receiving Test treatment has an estimated:

\[ 7.7\% \text{ higher hazard} \]

compared to a similar patient receiving Standard treatment, assuming all other clinical characteristics are the same.

The 95% confidence interval for this hazard ratio is:

\[ [0.846,\ 1.372] \]

Since the interval does not include 1, this difference is statistically significant, suggesting a meaningful association between treatment type and survival outcome.

Conclusion

This analysis investigated how demographic and clinical factors—including age, sex, tumor stage, general condition, and treatment type—affect patient survival using a Cox proportional hazards model.

After comparing several candidate models, the Interactive Model was selected as the final model due to its adherence to the proportional hazards assumption and its capacity to capture clinically relevant interactions. The inclusion of interaction terms—such as Cond × Tx and T_Stage × Sex—allowed for a more nuanced understanding of how certain covariates influence risk in combination, rather than in isolation. This structure also improved model adequacy, as supported by diagnostic checks including Schoenfeld residual plots and global tests.

A key finding was that treatment type plays a significant role in patient outcomes: patients receiving the Test treatment exhibited a statistically significant increase in hazard compared to those receiving the Standard treatment, even after adjusting for age, sex, and staging variables. While the main effect of sex alone was not significant, its interaction with tumor stage revealed heterogeneity in risk that may have important clinical implications.

In summary, the final model provides a statistically robust and clinically interpretable framework for understanding which factors most strongly influence survival. These insights may support personalized treatment strategies and inform future research on time-to-event outcomes.

Do you enjoy my blog? Subscribe here to get notifications and updates (it's free!):

--- title: "A Clinical Trial in the Treatment of Carcinoma of the Oropharynx" description: "A randomized, multi‑center study evaluating novel therapy combinations to improve survival and quality of life for patients with oropharyngeal carcinoma." lightbox: true number-sections: false date: July 23, 2025 image: louis-reed-pwcKF7L4-no-unsplash.jpg categories: - R - Biostatistics - Machine Learning - Survival Analysis format: html: fig-cap-location: bottom include-before-body: ../../../../html/margin_image.html include-after-body: ../../../../html/blog_footer.html editor: markdown: wrap: sentence --- ```{r} #| include: FALSE library(tidyverse) library(readxl) library(skimr) library(showtext) library(ggtext) library(glue) library(patchwork) library(ggsurvfit) library(survival) library(gtsummary) library(gt) library(survminer) library(kableExtra) library(gtExtras) library(MASS) ``` # Abstract This study analyzes survival outcomes for patients with squamous cell carcinoma of the oropharynx, using data from a multicenter clinical trial conducted by the Radiation Therapy Oncology Group. A total of 195 patients from six major institutions were randomized to receive either radiation therapy alone or radiation therapy combined with a chemotherapeutic agent. The primary objective was to compare the effectiveness of the two treatment strategies while accounting for other prognostic factors. A Cox proportional hazards model was used to evaluate the relationship between survival time and clinical/demographic covariates, including age, sex, tumor stage (T and N), general condition, tumor grade, and treatment group. Approximately 30% of patients were censored due to being alive at last follow-up. Forward stepwise regression and interaction modeling were applied to identify the best-fitting model, and diagnostics were performed to validate model assumptions. Results indicated that treatment type was not statistically significant after adjusting for other variables. However, tumor staging, general condition, and certain interaction terms demonstrated a meaningful effect on survival. The final model incorporated interaction effects to address violations of the proportional hazards assumption, improving interpretability and validity. These findings underscore the importance of considering both individual covariates and their interactions when modeling time-to-event outcomes in cancer survival studies. # Introduction Pharyngeal cancer represents a serious public health concern, affecting a substantial number of individuals worldwide. Its prognosis and treatment outcomes are often influenced by a complex interplay of tumor characteristics, patient-specific factors, and therapeutic strategies. In clinical research, **survival analysis** plays a pivotal role in assessing the effectiveness of such treatments by evaluating the time duration between a defined starting point—often diagnosis or treatment initiation—and the occurrence of a terminal event, typically death. This study presents a survival analysis of patients diagnosed with **squamous cell carcinoma of the oropharynx**, utilizing data contributed by six major institutions involved in a broader multi-center clinical trial. Patients were randomly assigned to one of two treatment arms: 1. **Radiation therapy alone**, or 2. **Radiation therapy combined with chemotherapy**. The core objective of this analysis is to evaluate whether the combined treatment improves survival outcomes relative to radiation alone, while controlling for a series of clinical and demographic covariates. In addition to treatment type, several patient and tumor-related attributes are hypothesized to impact survival outcomes. These include: - **Sex**: Male or Female - **T staging**: Describes the size and local extent of the primary tumor - *T=1*: Small tumor (≤2 cm) - *T=4*: Extensive involvement of adjacent tissues - **N staging**: Captures the extent of regional lymph node involvement - *N=0*: No nodes affected - *N=3*: Severe nodal spread - **Age**: Continuous variable, evaluated using piecewise specification - **General condition**: Clinical estimate of the patient's functional status at diagnosis - **Tumor grade**: Reflects differentiation and aggressiveness - **Treatment (Tx)**: - *Tx = 1*: Standard radiation therapy - *Tx = 2*: Combined radiation + chemotherapy To ensure a relatively homogeneous clinical cohort, patients with early-stage disease (e.g., T=1, N=0; T=2, N=0) or distant metastases were excluded. Nonetheless, considerable **clinical heterogeneity** remains, as the included population varies in tumor burden and baseline health status. Approximately **30% of survival times are censored**, primarily because patients were still alive at the time of the final follow-up. A smaller fraction of patients were lost to follow-up, typically due to relocation or transfer to non-participating institutions. While the **primary focus** of the study is to assess differences in survival between treatment arms, a secondary objective is to identify **which covariates exert the most substantial influence on survival**. These findings can inform future clinical decision-making and support more **personalized treatment strategies**. To conduct this analysis, we apply the **Cox proportional hazards model**, an established method for modeling time-to-event data. We begin with exploratory data analysis, followed by stepwise model development, evaluation of the proportional hazards (PH) assumption, and the incorporation of interaction terms where appropriate. The results provide insight into both treatment efficacy and the role of patient- and tumor-specific factors in shaping survival outcomes. The subsequent sections detail the exploratory data analysis, model development, proportional hazards diagnostics, and interpretation of key findings, ultimately guiding conclusions about the impact of treatment and covariates on patient survival outcomes. ## Data Cleaning The dataset required minor preprocessing to prepare it for analysis. The following steps were taken: 1. **Recode `Sex`:**\ The variable `Sex` was recoded from numeric codes (1 = Male, 2 = Female) into descriptive character labels, **"Male"** and **"Female"**. 2. **Handle Missing and Invalid Values:** - For `Cond` (patient condition), values of **0** and **9** were treated as invalid or missing and replaced with `NA`.\ - For `Grade` (tumor differentiation), values of **9** were treated as missing and replaced with `NA`.\ - Missing values in `Cond` and `Grade` were imputed using their **median values**, which maintains the ordinal nature of these variables. ```{r} #| include: false # Add Google fonts font_add_google("Oswald", family = "Oswald") font_add_google("Roboto", family = "Roboto") # Add local font font_add("Font Awesome 6 Brands", here::here("fonts/otfs/Font Awesome 6 Brands-Regular-400.otf")) # Automatically enable the use of showtext for all plots showtext_auto() # Set DPI for high-resolution text rendering showtext_opts(dpi = 300) ``` ```{r} #| include: false # Generate a social media caption with custom colors and font styling social <- andresutils::social_caption(font_family = "Oswald", icon_color = "#007FFF") # Construct the final plot caption with TidyTuesday details, data source, and social caption cap <- paste0( "**Source**: Pharynx | **Graphic**: ", social ) ``` ```{r} #| include: false df <- read_excel(here::here("Data/pharynx.xls")) ``` ```{r} #| include: false glimpse(df) skim(df) ``` ```{r} #| include: false df <- df %>% mutate(Sex = if_else(Sex == 2, "Female", "Male"), Cond = if_else(Cond %in% c(0,9), NA, Cond), Grade = if_else(Grade == 9, NA, Grade), Cond = if_else(is.na(Cond), median(Cond, na.rm = TRUE), Cond), Grade = if_else(is.na(Grade), median(Grade, na.rm = TRUE), Grade) ) unique(df$Sex) head(df) ``` # Exploratory Data Analysis (EDA) ## 1. Data Overview We analyzed data from **195 patients** diagnosed with head and neck cancer, with survival time measured in days from diagnosis. @tbl-descriptions summarizes the variables included in the dataset. ```{r} #| label: tbl-descriptions #| echo: false #| tbl-cap: "Description of the Pharynx Dataset" # Create a tibble summarizing the dataset features features <- tibble( Variable = c( "<code>Time</code>", "<code>Status</code>", "<code>Sex</code>", "<code>Tx</code>", "<code>Grade</code>", "<code>Age</code>", "<code>Cond</code>", "<code>Site</code>", "<code>T_Stage</code>", "<code>N_Stage</code>", "<code>Inst</code>", "<code>Case</code>" ), Description = c( "Survival time (days)", "Event indicator (1 = death, 0 = censored)", "Male or Female", "Treatment group (Standard or Test)", "Tumor differentiation (1 = well differentiated, 2 = moderately differentiated, 3 = poorly differentiated, 9 = missing)", "Age in years at time of diagnosis", "Condition at diagnosis (1 = no disability, 2 = restricted work, 3 = requires assistance with self-care, 4 = bed-confined, 9 = missing)", "Tumor location (Faucial arch, Tonsillar fossa, Posterior pillar, Pharyngeal tongue, Posterior wall)", "Primary tumor stage (1–4)", "Node metastasis stage (0–3)", "Institution code", "Patient identifier" ) ) # Display the table with kableExtra styling kbl(features, align = "c", escape = FALSE) %>% kable_styling(full_width = F, bootstrap_options = c("striped", "hover")) %>% row_spec(0, bold = TRUE, color = "#e9ecef", background = "#007FFF") ``` ## 2. Age Distribution To understand the age profile of patients, we plotted the distribution of age at diagnosis (@fig-age).\ The mean age was **`r round(mean(df$Age), 2)`** years, and the median age was **`r median(df$Age)`** years, with ages ranging from **`r min(df$Age)`** to **`r max(df$Age)`** years. ```{r} #| label: fig-age #| fig-cap: "Distribution of patient age at diagnosis." #| echo: false #| warning: false age_median <- median(df$Age) age_mean <- round(mean(df$Age), 2) df %>% ggplot(aes(x = Age)) + geom_histogram(fill = "#007FFF", color = "black") + geom_vline(xintercept = age_median, linetype = "dashed", color = "#FEBE10", linewidth = 1) + annotate("label", x = 30, y = 20, label = paste0("Median: ", age_median, " Years\n", "Mean: ", age_mean, " Years"), family = "Roboto", size = 2.5, lineheight = 1.5) + scale_x_continuous(breaks = seq(0, 100, 5)) + labs(y = "Frequency", title = "Age Distribution", caption = cap) + theme_minimal(base_size = 9, base_family = "Oswald") + theme( plot.title.position = "plot", plot.title = element_text(face = "bold"), plot.caption = element_markdown(size = 5), plot.margin = margin(5, 5, 5, 5), panel.grid.minor = element_blank(), panel.grid.major.y = element_line(linetype = "dashed") ) ``` ## 3. Participant Demographics: Age and Gender @fig-age-gender provides an overview of age distribution and gender composition among the study participants. - **Gender distribution:** The study population consists of **149 males (76%)** and **46 females (24%)**, highlighting a male-dominated cohort. - **Age distribution:** Both males and females exhibit similar age profiles, with most patients clustered between **50 and 70 years** of age. - **Age density by gender:** The density plots show a slightly higher concentration of males between 60–70 years compared to females. - **Boxplots:** Median ages for males and females are close, suggesting no major age difference across gender groups. ```{r, fig.width=10, fig.height=10} #| label: fig-age-gender #| fig-cap: "Age and gender overview of participants, including density plots, boxplots, and gender counts." #| warning: false #| echo: false p1 <- df %>% ggplot(aes(x = Age, fill = Sex)) + geom_density(alpha = 0.65, linewidth = 1.1) + scale_fill_manual(values = c("#c90076", "#2986cc")) + facet_wrap(vars(Sex)) + labs(title = "Age Density by Gender", y = "Density") + theme_minimal(base_size = 9, base_family = "Oswald") + theme( plot.title.position = "plot", plot.title = element_text(size = 10, face = "bold", margin = margin(t = 5, b = 5)), legend.position = "none", panel.grid.minor = element_blank(), panel.grid.major.x = element_blank(), strip.text = element_text(face = "bold", size = 10) ) p2 <- df %>% ggplot(aes(x = Sex, y = Age, color = Sex)) + geom_jitter( position = position_jitter(width = 0.2, seed = 123), alpha = 0.5, size = 2 ) + geom_boxplot( fill = "#FEFFC4", alpha = 0.25, outlier.shape = NA, width = 0.5) + scale_y_continuous(breaks = seq(0, 100, by = 10)) + scale_color_manual(values = c("#c90076", "#2986cc")) + labs(title = "Age Distribution by Gender") + theme_minimal(base_size = 9, base_family = "Oswald") + theme( legend.position = "none", panel.grid = element_blank(), plot.title.position = "plot", plot.title = element_text(size = 10, face = "bold", margin = margin(t = 5, b = 5)), axis.title.x = element_blank() ) gender_count <- df %>% group_by(Sex) %>% summarise(count = n()) %>% ungroup() p3 <- gender_count %>% ggplot(aes(x = Sex, y = count, fill = Sex, label = count)) + geom_bar(stat = "identity", color = "black") + geom_text(vjust = -0.5, family = "Oswald", fontface = "bold", size = 3.2) + labs(x = "Gender", y = "Count", title = "Gender Distribution of Participants") + scale_y_continuous(limits = c(0, 155)) + scale_fill_manual(values = c("#c90076", "#2986cc")) + coord_cartesian(clip = "off") + theme_minimal(base_size = 9, base_family = "Oswald") + theme( legend.position = "none", plot.title.position = "plot", plot.title = element_text(size = 10, face = "bold", margin = margin(t = 5, b = 5)), panel.grid = element_blank(), axis.title.x = element_blank() ) combined_plots <- (p1 / (p2 + p3)) + plot_layout( heights = c(0.8, 1.2), widths = c(1, 1) ) + plot_annotation( title = "Participant Demographics: Age and Gender Overview", subtitle = "Age distribution and gender composition of the study population", caption = cap, theme = theme( plot.title = element_text(size = 18, face = "bold", margin = margin(t = 5, b = 5)), plot.subtitle = element_text(size = 9, color = "grey30", margin = margin(t = 5, b = 5)), plot.caption = element_markdown(size = 5, hjust = 0.5, margin = margin(t = 10)), text = element_text(family = "Oswald"), plot.margin = margin(t = 20, r = 10, b = 20, l = 10) ) ) combined_plots ``` ```{r} #| include: false # Generate a social media caption with custom colors and font styling social <- andresutils::social_caption(font_family = "Oswald", icon_color = "grey35") # Construct the final plot caption with TidyTuesday details, data source, and social caption cap <- paste0( "**Source**: Pharynx | **Graphic**: ", social ) ``` ## 4. Event Time Distribution by Status @fig-event-time shows the distribution of survival times (in days), separated by **observed events (deaths)** and **censored cases**. - **Events (Dead):** The median event time was **325.5 days**, indicating that half of the deaths occurred before approximately 11 months from diagnosis. - **Censored cases:** The median censored time was **943 days**, meaning patients who survived or were lost to follow-up had a much longer observation period. - **Pattern observation:** Most deaths occurred early (within the first 500 days), while censored cases dominate after 800 days. ```{r} #| label: fig-event-time #| fig-cap: "Distribution of survival time by status (event vs. censored)." #| echo: false #| warning: false df_status <- df %>% mutate(Status = if_else(Status == 0, "Censored", "Dead")) time_median <- df_status %>% summarise( median_time = median(Time, na.rm = TRUE), .by = Status ) df_status %>% ggplot(aes(x = Time, fill = Status)) + geom_histogram(alpha = 0.7) + geom_vline( data = time_median, aes(xintercept = median_time, color = Status), linetype = "dashed", linewidth = 0.5, show.legend = FALSE ) + annotate( "text", x = time_median$median_time[time_median$Status == "Censored"] + 215, y = 16, label = paste0("Median censored time: ", round(time_median$median_time[time_median$Status == "Censored"], 2)), color = "#6A89A7", size = 1.8, alpha = 0.8, family = "Oswald", fontface = "bold" ) + annotate( "text", x = time_median$median_time[time_median$Status == "Dead"] + 255, y = 17.5, label = paste0("Median event time (dead): ", round(time_median$median_time[time_median$Status == "Dead"], 2)), color = "#fd5c63", size = 1.8, alpha = 0.8, family = "Oswald", fontface = "bold" ) + scale_fill_manual(values = c("#6A89A7", "#fd5c63")) + scale_color_manual(values = c("#6A89A7", "#fd5c63")) + labs(x = "Survival Time in Days", y = "Frequency", title = "Event Time Distribution by Status", subtitle = "Comparison of observed events and censored cases across time", caption = cap) + theme_minimal(base_size = 9, base_family = "Oswald") + theme( plot.title.position = "plot", plot.title = element_text(face = "bold"), plot.subtitle = element_text(color = "gray30"), plot.caption = element_markdown(size = 5), panel.grid.minor = element_blank(), panel.grid.major.x = element_blank(), margin = margin(5, 5, 5, 5), legend.position = "inside", legend.justification.inside = c(1, 1) ) ``` ## 5. Kaplan–Meier Survival by Sex @fig-km-sex illustrates the Kaplan–Meier (KM) survival curves stratified by sex. Females (n = 46) appear to have slightly higher survival probabilities compared to males (n = 149) during the first year. ```{r} #| label: fig-km-sex #| fig-cap: "Kaplan–Meier survival curves by sex with 95% confidence intervals and risk tables." #| echo: false #| warning: false survfit2(Surv(Time, Status) ~ Sex, data = df) %>% ggsurvfit(linewidth = 1) + labs( x = "Days", y = "Overall survival probability" ) + add_confidence_interval() + add_risktable(risktable_height = 0.18, size = 2.2, theme = # increase font size of risk table title and y-axis label list( theme_risktable_default(axis.text.y.size = 7, plot.title.size = 7), theme(plot.title = element_text(family = "Oswald", face = "bold"), text = element_text(family = "Oswald")) )) + scale_ggsurvfit() + scale_x_continuous(limits = c(0,1500)) + scale_color_manual(values =c("#c90076", "#2986cc")) + scale_fill_manual(values = c("#c90076", "#2986cc")) + labs(y = "Survival Probability", title = "Kaplan-Meier Survival Curves by Sex") + theme_minimal(base_size = 9, base_family = "Oswald") + theme( plot.title.position = "plot", plot.title = element_text(face = "bold", margin = margin(b = 10)), legend.position = "bottom" ) ``` ### 5.1 1-Year Survival Probability To complement the Kaplan–Meier curves, @tbl-km-1year reports the **1-year survival probabilities** for both sexes, along with 95% confidence intervals derived using Greenwood's formula. ```{r} #| label: tbl-km-1year #| tbl-cap: "1-year survival probabilities by sex (Kaplan–Meier estimates with 95% CI)." #| echo: false survfit(Surv(Time, Status) ~ Sex, data = df) %>% tbl_survfit( times = 365.25, label_header = "**1-year survival (95% CI)**", label = list(Sex = "Gender") ) %>% as_gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) ``` ### 5.2 Mathematical Approach The Kaplan–Meier estimator for the survival function $S(t)$ is: $$ \widehat{S}(t) = \prod_{t_{(i)} \le t} \left( 1 - \frac{d_i}{n_i} \right), $$ where:\ - $t_{(i)}$ are the ordered event times,\ - $d_i$ is the number of events (deaths) at time $t_{(i)}$,\ - $n_i$ is the number of individuals at risk just prior to $t_{(i)}$. The variance of $\widehat{S}(t)$ is approximated using **Greenwood's formula**: $$ \widehat{\mathrm{Var}}\{ \widehat{S}(t) \} = \widehat{S}(t)^2 \sum_{t_{(i)} \le t} \frac{d_i}{n_i (n_i - d_i)}. $$ Confidence intervals are computed on the **log–log scale** for stability: $$ \log \{-\log \widehat{S}(t)\} \pm z_{1-\alpha/2} \sqrt{ \frac{ \sum_{t_{(i)} \le t} \frac{d_i}{n_i (n_i - d_i)} } { [\log \widehat{S}(t)]^2 } }, $$ and then **back-transformed** to the survival scale. ### 5.3 Interpretation - **1-year survival estimates:** - **Females:** 63% (95% CI: 51%–79%) - **Males:** 56% (95% CI: 49%–65%) These results suggest that females have a modest survival advantage compared to males at 1 year. However, Kaplan–Meier estimates are **unadjusted** and do not account for potential confounding by other variables (e.g., age, tumor stage, condition). To formally test for group differences, the **log-rank test** is used, which compares the observed and expected event counts across groups under the null hypothesis of identical survival curves. The Kaplan–Meier analysis serves as an **initial univariate exploration**, motivating the use of a **Cox proportional hazards model** to evaluate the combined effects of multiple covariates. # Model Development ## 1. Objective and Modeling Framework Our objective was to quantify the effect of clinical and demographic covariates on time to death. We used the **Cox proportional hazards (PH) model**, which specifies the hazard for subject $i$ with covariate vector $X_i$ at time $t$ as: $$ h(t \mid X_i) = h_0(t)\exp\left( \beta^\top X_i \right), $$ where $h_0(t)$ is an **unspecified baseline hazard**, and $\beta$ is a vector of log–hazard ratios (log‑HRs). Estimation proceeds by **maximizing the partial likelihood**: $$ L(\beta) = \prod_{i \in \mathcal{D}} \frac{\exp(\beta^\top X_i)} {\sum_{j \in R(t_i)} \exp(\beta^\top X_j)}, $$ where $\mathcal{D}$ is the set of subjects who experienced the event, and $R(t_i)$ is the **risk set** at time $t_i$. Hypothesis tests and confidence intervals are based on the **score**, **Wald**, or **likelihood ratio** statistics derived from this likelihood. ## 2. Candidate Covariates The following covariates were considered for model selection: - **Age** (continuous, in years), - **Sex** (Male/Female), - **Tx** (Treatment group: Standard or Test), - **Grade** (1 = well differentiated, 2 = moderately differentiated, 3 = poorly differentiated; 9 = missing), - **Cond** (1 = no disability, 2 = restricted work, 3 = requires assistance with self-care, 4 = bed-confined; 9 = missing), - **Site** (tumor location: 1 = Faucial arch, 2 = Tonsillar fossa, 3 = Posterior pillar, 4 = Pharyngeal tongue, 5 = Posterior wall), - **T_Stage** (tumor size: 1–4), - **N_Stage** (nodal metastasis: 0–3). ## 3. Functional Form Assessment with Martingale Residuals To evaluate whether **Age** had a linear effect on the log-hazard, we examined **martingale residuals** from a Cox model excluding Age. Martingale residuals for subject \$ i \$ are defined as: $$ M_i = \delta_i - \widehat{H}_0(T_i) \exp(\hat{\beta}^\top X_i), $$ where: - $\delta_i$ is the event indicator (1 = death, 0 = censored), - $\widehat{H}_0(T_i)$ is the estimated baseline cumulative hazard, - $X_i$ is the covariate vector for subject $i$. ```{r} #| include: false df <- df %>% mutate(Tx = if_else(Tx == 1, "Standard", "Test")) ``` ```{r} #| label: fig-mart-age #| fig-cap: "Martingale residual plot from ggcoxfunctional() for Age, showing departure from linearity." #| echo: false #| warning: false m1 <- coxph(Surv(Time, Status) ~ Age,data = df) ggcoxfunctional( m1, data = df, point.size = 1.5, point.alpha = 0.7, point.col = "#4D4DFF", ggtheme = theme_classic( base_size = 9, base_family = "Oswald" ) ) ``` The martingale residual plot (@fig-mart-age) suggests that the relationship between **Age** and the log-hazard is not perfectly linear, particularly for patients older than approximately **46 years**. To address this non-linearity, we compared three Age specifications: - **Model 1:** Age as a continuous linear term.\ - **Model 1.1:** Piecewise Age with an interaction $\mathrm{Age} \times I(\mathrm{Age} > 46)$.\ - **Model 1.2:** Dichotomized Age using $I(\mathrm{Age} > 46)$ only. Now, the AIC values (@tbl-aic-age) indicate that **Model 1.1** (AIC = 1322.379) provides the best fit among these, offering a better trade-off between model complexity and goodness-of-fit. **Conclusion:** We selected the piecewise Age model $$\mathrm{Age} + I(\mathrm{Age} > 46) + \mathrm{Age} \times I(\mathrm{Age} > 46)$$ for all subsequent modeling steps, as it best captures the observed non-linearity without oversimplifying Age into a single binary variable. ```{r} #| label: tbl-aic-age #| tbl-cap: "AIC comparison of different Age specifications." #| echo: false m1.1 <- coxph(Surv(Time, Status) ~ Age * I(Age > 46),data = df) m1.2 <- coxph(Surv(Time, Status) ~ I(Age > 46),data = df) AIC(m1, m1.1, m1.2) %>% as_tibble() %>% mutate(Model = c("Model 1", "Model 1.1", "Model 1.2")) %>% relocate(Model, .before = df) %>% gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) %>% gt_highlight_rows( rows = 2, fill = "#9AD0DC", bold_target_only = TRUE, target_col = Model ) ``` ## 4. Multivariable Model Development To evaluate the combined effects of all clinical and demographic covariates, we initially fit a **full Cox proportional hazards model** including the piecewise Age specification and all other predictors: ```{r} full_model <- coxph(Surv(Time, Status) ~ Age * I(Age > 46) + Sex + Tx + Grade + Cond + Site + T_Stage + N_Stage, data = df) ``` ```{r} #| label: tbl-full-model #| tbl-cap: "Full Cox proportional hazards model: Estimated hazard ratios and 95% confidence intervals." #| echo: false full_model %>% tbl_regression(exp = TRUE) %>% as_gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) ``` As shown in @tbl-full-model, the model incorporates all key predictors and allows the effect of Age to differ before and after age 46. ### 4.1 Model Comparison with Age-Only Model To evaluate the contribution of additional covariates beyond Age, we compare the AIC of the **Full Model** to that of the best-fitting Age-only specification (**Model 1.1**): ```{r} #| label: tbl-aic-age-comparison #| tbl-cap: "AIC comparison of Age-only model (Model 1.1) and Full Model." #| echo: false AIC(m1.1, full_model) %>% as_tibble() %>% mutate(Model = c("Model 1.1", "Full Model")) %>% relocate(Model, .before = df) %>% gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) %>% gt_highlight_rows( rows = 2, fill = "#9AD0DC", bold_target_only = TRUE, target_col = Model ) ``` As shown in @tbl-aic-age-comparison, the Full Model achieves a lower AIC, indicating a better fit than the Age-only model. This supports the inclusion of clinical and demographic covariates in explaining survival outcomes. ### 4.2 Proportional Hazards Check for Full Model We test the **proportional hazards (PH) assumption** using Schoenfeld residuals: ```{r} #| label: tbl-ph-check #| tbl-cap: "Schoenfeld residual test for the proportional hazards (PH) assumption for the Full Model." #| echo: false fit <- cox.zph(full_model) fit$table %>% as.data.frame() %>% rownames_to_column(var = "Variable") %>% as_tibble() %>% gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) %>% gt_highlight_rows( rows = 6, fill = "#ef5a68", bold_target_only = TRUE, target_col = Variable ) %>% fmt_number( columns = c(chisq, p), decimals = 2) %>% cols_label( p = "P-Value" ) ``` **Interpretation:** - The **GLOBAL test** ($\chi^2 = 14.78$, *p* = 0.14) suggests no overall violation of the PH assumption for the Full Model. - However, **Cond** shows a *p*-value of 0.02, indicating potential non-proportionality. - **T_Stage** (*p* = 0.06) and **N_Stage** (*p* = 0.07) are borderline. ### 4.3 Stepwise Model Selection (Forward) To build a more parsimonious model while retaining the piecewise Age specification, we applied **forward stepwise AIC selection**, beginning with **Model 1.1.** ```{r} step_forward <- stepAIC(m1.1, scope = ~ Age * I(Age > 46) + Sex + Tx + Grade + Cond + Site + T_Stage + N_Stage, direction = "forward", trace = FALSE) ``` ```{r} #| label: tbl-step-forward #| tbl-cap: "Forward stepwise AIC-selected Cox model: Estimated hazard ratios and 95% confidence intervals." #| echo: false step_forward %>% tbl_regression(exp = TRUE) %>% as_gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) ``` **Interpretation of Current Model:** - Several predictors show strong or borderline associations with survival, including **Age** (*p* = 0.027), **Cond** (*p* \< 0.001), and **T_Stage** (*p* = 0.006).\ - The **interaction term Age × I(Age \> 46)** remains significant, supporting the non-linear specification of Age.\ - While **N_Stage** and **Sex** are not significant at conventional levels, they were retained for model fit under the AIC framework. ### 4.4 AIC Comparison We now compare the AIC of the **Full Model** and the **Forward Model:** ```{r} #| label: tbl-aic-comparison #| tbl-cap: "AIC comparison between the Full Cox model and the Forward AIC-selected model." #| echo: false AIC(full_model, step_forward) %>% as_tibble() %>% mutate(Model = c("Full Model", "Forward Model")) %>% relocate(Model, .before = df) %>% gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) %>% gt_highlight_rows( rows = 2, fill = "#9AD0DC", bold_target_only = TRUE, target_col = Model ) ``` As shown in @tbl-aic-comparison, the **Forward Model** achieves a lower AIC (1289.97) than the **Full Model** (1294.15), indicating a better fit with fewer predictors. ### 4.5 Proportional Hazards Check for Forward Model We test the PH assumption for the **Forward Model:** ```{r} #| label: tbl-ph-forward #| tbl-cap: "Schoenfeld residual test results for the Forward Cox model." #| echo: false fit <- cox.zph(step_forward) fit$table %>% as.data.frame() %>% rownames_to_column(var = "Variable") %>% as_tibble() %>% gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) %>% gt_highlight_rows( rows = 3, fill = "#ef5a68", bold_target_only = TRUE, target_col = Variable ) %>% fmt_number( columns = c(chisq, p), decimals = 2) %>% cols_label( p = "P-Value" ) ``` As shown in @tbl-ph-forward, the **GLOBAL test** yields *p* = 0.06, indicating no strong evidence of violation in the PH assumption overall. However: - **Cond** exhibits possible non-proportionality (*p* = 0.03), suggesting its effect may vary over time. - All other variables meet the proportional hazards assumption. We next explore **Schoenfeld residual plots** to visually assess potential time-varying effects. ```{r fig.width=10, fig.height=10} #| label: fig-schoenfeld-forward #| fig-cap: "Schoenfeld residual plots for the Forward Model. Most covariates show approximately flat trends over time, except for Cond, which displays potential deviation from the PH assumption." #| echo: false #| warning: false ggcoxzph( fit, point.col = "#4D4DFF", point.alpha = 0.7, ggtheme = theme_classic( base_size = 9, base_family = "Oswald" ) + theme( title = element_text(size = 7) ) ) ``` As visualized in @fig-schoenfeld-forward: - **Cond** shows a noticeable deviation from flatness, supporting the earlier statistical finding of non-proportionality. - **T_Stage** and **N_Stage** exhibit mild curvature, hinting at possible borderline time-varying effects. - The remaining covariates appear consistent with the proportional hazards assumption, displaying relatively stable residual trends over time. ### 4.6 Handling PH Violations in Categsinorical Covariates The Schoenfeld residual diagnostics revealed potential violations of the **proportional hazards (PH) assumption** for three covariates: - **Cond** (p = 0.03), - **T_Stage** (p = 0.06), - **N_Stage** (p = 0.06). These variables are all **ordinal**: - **Cond** represents worsening patient condition (1 = no disability, 4 = bed-confined), - **T_Stage** reflects increasing tumor size (1 to 4), - **N_Stage** captures nodal metastasis severity (0 to 3). Because these are **ordinal** variables, we are not restricted to removing them or treating them as unordered factors. Instead, several options are available for addressing the non-proportional hazards: ------------------------------------------------------------------------ **Possible strategies include:** 1. **Modify the functional form:** - Treat ordinal variables as continuous (e.g., linear trend), 2. **Try interaction terms:** - Add interactions with covariates like **Age**, - Add interactions with time proxies (e.g., `Surv(Time, Status) ~ Cond * Age`), - Allows modeling of non-proportional effects without full stratification. 3. **Remove the variable:** - Easiest approach, but **not ideal**, as these variables are clinically meaningful. ------------------------------------------------------------------------ > Given the clinical relevance of **Cond**, **T_Stage**, and **N_Stage**, we chose to **retain all three variables** in the model. > To address the non-proportionality, we will explore **interaction terms** as our primary strategy while also examining the effect of removing them for sensitivity analysis. The next section fits interaction models and compares their fit and assumptions. ### 4.7 Interaction Term Exploration To address previously observed violations of the proportional hazards assumption and assess whether interaction terms could improve model fit and stability, we extended the **Forward Model** by introducing key interactions between clinical and demographic covariates. Specifically, we tested the following interactions: - **Cond × Tx**: The effect of treatment may depend on patient condition, - **T_Stage × Sex**: The prognostic impact of tumor size may differ between males and females, - **N_Stage × Age**: The effect of nodal involvement may vary across age. We retained the **piecewise Age specification** using `Age + I(Age > 46) + Age:I(Age > 46)`, and included **Sex** as a main effect to ensure interpretability of the interaction terms. The resulting model was: ```{r} m2 <- coxph(Surv(Time, Status) ~ Age * I(Age > 46) + Sex + Cond:Tx + T_Stage:Sex + N_Stage:Age, data = df) ``` ### 4.8 Proportional Hazards Check for Interaction Model We evaluated the proportional hazards (PH) assumption for the **Interaction Model** using Schoenfeld residuals: ```{r} #| label: tbl-ph-interaction #| tbl-cap: "Schoenfeld residual test results for the Interaction Model, including all interaction terms." #| echo: false fit <- cox.zph(m2) fit$table %>% as.data.frame() %>% rownames_to_column(var = "Variable") %>% as_tibble() %>% gt() %>% opt_table_font(font = (google_font(name = "Roboto Condensed"))) %>% fmt_number(columns = c(chisq, p), decimals = 2) %>% gt_highlight_cols( columns = 4, fill = "#9AD0DC", font_weight = "bold" ) %>% cols_label(p = "P-Value") ``` **Interpretation:** - The **GLOBAL Schoenfeld test** yields *p* = 0.13, suggesting **no evidence of PH violation overall**. - All individual terms, including interaction terms such as **Cond:Tx**, **T_Stage:Sex**, and **N_Stage:Age**, show *p*-values \> 0.05. - This suggests that **including these interaction terms helped address the prior non-proportionality**, particularly with **Cond**, which had previously violated the assumption. We then visually inspect the Schoenfeld residual plots to confirm that covariate effects remain constant over time: ```{r fig.width=7, fig.height=7} #| label: fig-schoenfeld-interactions #| fig-cap: "Schoenfeld residual plots for interaction terms in the Interaction Model. All terms display stable patterns over time, supporting the PH assumption." #| echo: false #| warning: false ggcoxzph( fit, var = c("Cond:Tx", "Sex:T_Stage", "Age:N_Stage"), point.col = "#4D4DFF", point.alpha = 0.7, ggtheme = theme_classic(base_size = 9, base_family = "Oswald") + theme(title = element_text(size = 7)) ) ``` As visualized in @fig-schoenfeld-interactions, the interaction terms display stable residual patterns over time. No substantial deviation is observed, reinforcing the conclusion that the proportional hazards assumption is satisfied even for interaction effects. ```{r} #| label: tbl-aic-interaction #| tbl-cap: "AIC comparison between the Forward Model and the Interaction Model." #| echo: false AIC(step_forward, m2) %>% as_tibble() %>% mutate(Model = c("Forward Model", "Interactive Model")) %>% relocate(Model, .before = df) %>% gt() %>% opt_table_font(font = (google_font(name = "Roboto Condensed"))) %>% gt_highlight_rows( rows = 1, fill = "#9AD0DC", bold_target_only = TRUE, target_col = Model ) ``` #### 4.8.1 Model Selection Summary As shown in @tbl-aic-interaction, the **Forward Model** yields a slightly lower AIC (*1289.97*) compared to the **Interactive Model** (*1293.29*), suggesting a marginally better fit. However, the Forward Model **violates the proportional hazards (PH) assumption**, especially for **Cond**. In contrast, the **Interactive Model** resolves these issues, supported by: - A **non-significant GLOBAL Schoenfeld test** (*p* = 0.18), - And **Schoenfeld residual plots** indicating no time-varying effects. Thus, the **Interactive Model** is chosen as the **final model** for its stronger adherence to the PH assumption and interpretability of interaction effects. ### 4.9 Interpretation of Final Model Coefficients The final model selected was the **Interactive Model**, which includes the piecewise specification for Age and interaction terms to address prior violations of the proportional hazards (PH) assumption. The model is specified as: $$ h(t \mid \mathbf{X}) = h_0(t)\,\exp\left\{ \beta_1\,\text{Age} + \beta_2\,\mathbb{I}(\text{Age} > 46) + \beta_3\,\text{Age}\cdot\mathbb{I}(\text{Age} > 46) + \beta_4\,\text{Sex} + \beta_5\,(\text{Cond}\times\text{Tx}) + \beta_6\,(\text{T\_Stage}\times\text{Sex}) + \beta_7\,(\text{N\_Stage}\times\text{Age}) \right\}. $$ Where: - $h(t \mid \mathbf{X})$ is the hazard at time $t$ given covariates $\mathbf{X}$\ - $h_0(t)$ is the unspecified baseline hazard\ - $\beta_1, \dots, \beta_7$ are regression coefficients for main effects and interactions\ - $\mathbb{I}(\text{Age} > 46)$ is an indicator function (1 if Age \> 46, 0 otherwise) Exponentiating the coefficients yields **hazard ratios (HRs)**, which are interpreted as the **multiplicative change in the hazard** for a one-unit increase in the covariate, **holding all else constant**. ```{r} #| label: tbl-final-model #| tbl-cap: "Final Cox proportional hazards model (Interactive Model): Estimated hazard ratios and 95% confidence intervals." #| echo: false m2 %>% tbl_regression(exp = TRUE) %>% as_gt() %>% opt_table_font( font = ( google_font(name = "Roboto Condensed") ) ) ``` As shown in @tbl-final-model, the estimated hazard ratios allow for interpretation of both main effects and interactions. #### 4.9.1 Interpretation of Key Terms - **Age and Age × I(Age \> 46):** The significant interaction supports a **nonlinear relationship** with hazard. For patients over 46, the effect of aging on hazard is **amplified** — the risk of death increases more steeply per year above 46 than below. - **Cond:Tx:** This interaction implies that the **effect of treatment varies by functional condition**. For example, patients in worse condition (e.g., requiring assistance or bed-confined) may benefit more (or less) from the test treatment than those in better condition. This term accounts for that heterogeneity. - **T_Stage:Sex:** This term captures **differences in how tumor size affects survival between males and females**. A significant hazard ratio here would indicate that sex modifies the impact of tumor progression. - **N_Stage:Age:** Suggests that **nodal involvement interacts with age** — the effect of nodal metastasis might be more harmful in older patients. - **Sex (main effect):** Included to interpret interactions properly. Even if **not significant alone**, it may still play an important role when combined with terms like T_Stage or others. #### 4.9.2 Statistical Significance - Terms with *p* \< 0.05 are considered **statistically significant**, indicating strong evidence that the covariate is associated with survival time. - **Borderline terms** (e.g., *p* \< 0.1) may still be **clinically relevant**, especially when part of an interaction. #### 4.9.2 Clinical Implications This model enables **nuanced, individualized predictions**: - A **65-year-old** with advanced **N_Stage** may have a very different risk profile than a **40-year-old** with the same N_Stage. - Patients receiving treatment must be **evaluated within the context of their functional status** (Cond). - **Sex-specific effects** of tumor characteristics are accounted for, allowing **more tailored risk assessments**. # Research Question We investigate whether treatment type (**Standard** vs **Test**) affects survival outcomes for two otherwise identical patients, using the final Cox proportional hazards model that includes interaction terms. ## 1. Clinical Setup We consider two hypothetical patients with the following identical characteristics: - **Age**: 60 years old - **Sex**: Male - **General Condition (Cond)**: 1 - **Tumor Stage (T_Stage)**: 1 - **Nodal Stage (N_Stage)**: 1 The only difference: - Patient A receives **Standard** treatment ($\text{Tx} = 0$) - Patient B receives **Test** treatment ($\text{Tx} = 1$) We aim to compute the **hazard ratio (HR)** comparing these two patients, focusing on the interaction term **Cond × Tx**. ## 2. Mathematical Framework The fitted Cox model includes interaction terms to account for context-dependent effects of covariates. Of particular interest is the interaction between **Condition** and **Treatment**: $$ h(t \mid \mathbf{X}) = h_0(t) \cdot \exp\left( \beta_1 \cdot \text{Age} + \beta_2 \cdot \mathbb{I}(\text{Age} > 46) + \cdots + \beta_{ct} \cdot (\text{Cond} \times \text{Tx}) + \cdots \right) $$ Since both patients have $ \text{Cond} = 1 $, and differ only in treatment type, the hazard comparison simplifies to: $$ \log \left( \frac{h_{\text{Test}}(t)}{h_{\text{Standard}}(t)} \right) = \beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}} $$ Exponentiating gives the hazard ratio: $$ HR = \exp\left( \beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}} \right) $$ This HR quantifies the relative hazard of receiving **Test** treatment versus **Standard** treatment for patients with identical clinical profiles. ```{r} #| include: false coefficients <- m2$coefficients hr_standard <- coefficients[[5]] hr_test <- coefficients[[6]] diff <- hr_test - hr_standard HR <- exp(diff) # Get standard errors from summary m <- summary(m2) # Variance-covariance matrix v <- vcov(m2) # Variance and covariance components var_standard <- v[5, 5] var_test <- v[6, 6] covar <- v[5, 6] # Variance and SE of the difference var_diff <- var_test + var_standard - 2 * covar se_diff <- sqrt(var_diff) # 95% CI for log(HR) ci_log_lower <- diff - 1.96 * se_diff ci_log_upper <- diff + 1.96 * se_diff # 95% CI for HR ci_lower <- exp(ci_log_lower) ci_upper <- exp(ci_log_upper) ``` ```{r} #| label: tbl-hr-cond-tx #| tbl-cap: "Comparison of log hazard ratios and overall hazard ratio for the Cond × Tx interaction term." #| include: false tibble( `log(HR): Cond × Tx (Standard)` = hr_standard, `log(HR): Cond × Tx (Test)` = hr_test, `Difference in log(HR)` = diff, `Hazard Ratio (Test vs Standard)` = HR, `95% CI Lower` = ci_lower, `95% CI Upper` = ci_upper ) %>% gt() %>% fmt_number(everything(), decimals = 3) %>% tab_header( title = "Hazard Ratio: Test vs Standard Treatment", subtitle = "Interaction Term Cond × Tx" ) %>% opt_table_font(font = google_font("Roboto Condensed")) %>% cols_align(align = "center") ``` ## 3. Interpretation The hazard ratio comparing **Test treatment** to **Standard treatment**, holding all other covariates constant, is approximately: $$ HR = \exp(\beta_{\text{Cond:TxTest}} - \beta_{\text{Cond:TxStandard}}) \approx `r round(HR, 3)` $$ This means that a patient receiving **Test treatment** has an estimated: $$ `r round((HR - 1) * 100, 1)`\% \text{ higher hazard} $$ compared to a similar patient receiving **Standard treatment**, assuming all other clinical characteristics are the same. The 95% confidence interval for this hazard ratio is: $$ [`r round(ci_lower, 3)`,\ `r round(ci_upper, 3)`] $$ Since the interval does **not** include 1, this difference is **statistically significant**, suggesting a meaningful association between treatment type and survival outcome. # Conclusion This analysis investigated how demographic and clinical factors—including age, sex, tumor stage, general condition, and treatment type—affect patient survival using a Cox proportional hazards model. After comparing several candidate models, the **Interactive Model** was selected as the final model due to its adherence to the proportional hazards assumption and its capacity to capture clinically relevant interactions. The inclusion of interaction terms—such as **Cond × Tx** and **T_Stage × Sex**—allowed for a more nuanced understanding of how certain covariates influence risk in combination, rather than in isolation. This structure also improved model adequacy, as supported by diagnostic checks including Schoenfeld residual plots and global tests. A key finding was that **treatment type** plays a significant role in patient outcomes: patients receiving the **Test treatment** exhibited a statistically significant increase in hazard compared to those receiving the **Standard treatment**, even after adjusting for age, sex, and staging variables. While the main effect of sex alone was not significant, its interaction with tumor stage revealed heterogeneity in risk that may have important clinical implications. In summary, the final model provides a statistically robust and clinically interpretable framework for understanding which factors most strongly influence survival. These insights may support personalized treatment strategies and inform future research on time-to-event outcomes.