name: multi-endpoint-sim
description: >
  Guide users through multi-endpoint group sequential trial simulation with multiplicity-controlled testing. Use this skill when the user asks about: simulating trials with OS, PFS, and ORR endpoints, illness-death model simulation with gsDesign bounds, sequential p-values in simulation loops, combining graphicalMCP with gsDesign for simulation-based operating characteristics, cumulative rejection probabilities, or building a full pipeline from design through simulation to multiplicity-adjusted testing.
Multi-Endpoint Group Sequential Trial Simulation
This cross-package skill covers the full pipeline for simulating multi-endpoint trials with group sequential bounds and graphical multiplicity control.
Required packages
library(gsDesign) # Group sequential design and sequential p-values
library(gMCPLite) # Illness-death model (sim_illness_death, cut_illness_death)
library(graphicalMCP) # Multiplicity graph and graph_test_shortcut
library(simtrial) # wlr() for logrank Z-statistics
library(parallel) # mclapply for simulation
When to use this skill
- Designing a trial with multiple endpoints (OS, PFS, ORR) tested at different analyses
- Simulating correlated endpoints via the illness-death model
- Computing simulation-based operating characteristics (power, rejection probabilities)
- Applying graphical multiplicity testing per simulated trial
- Using sequential p-values from gsDesign::sequentialPValue() in simulation loops
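As a minimal sketch of the last use case, the following computes one sequential p-value for a single hypothesis after two analyses. The design parameters, event counts, and Z-values are hypothetical placeholders, and the argument names (gsD, n.I, Z) are assumed to match the installed gsDesign version, so check them against ?sequentialPValue before relying on this.

```r
library(gsDesign)

# Hypothetical 3-analysis design; every design parameter here is a placeholder.
design <- gsDesign(
  k = 3, test.type = 4, alpha = 0.025, beta = 0.1,
  timing = c(0.5, 0.75, 1), sfu = sfLDOF, n.fix = 300
)

# Hypothetical observed event counts and Z-statistics at the first two analyses
# (positive Z = experimental arm favorable).
events_observed <- c(160, 240)
z_observed <- c(1.5, 2.4)

# Sequential p-value given the data through the second analysis (assumed call signature).
p_seq <- sequentialPValue(gsD = design, n.I = events_observed, Z = z_observed)
p_seq
```

In a simulation loop, a call like this would be repeated per simulated trial, per hypothesis, and per analysis, with the resulting values passed on to the graphical test.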
Pipeline overview
- Design: Use gsDesign::gsSurvCalendar() for the sample-size-driving endpoint (OS), gsDesign::gsSurvPower() for secondary time-to-event endpoints (PFS), and gsDesign::nBinomial() for binary endpoints (ORR).
- Multiplicity graph: Build with graphicalMCP::graph_create(), allocating alpha across hypotheses.
- Transition rates: Calibrate with build_transition_rates(), then modify for piecewise hazards.
- Simulation: Use sim_illness_death() and cut_illness_death() to generate ADTTE data at each analysis time.
- Test statistics: Use simtrial::wlr() for TTE endpoints and gsDesign::testBinomial() for binary endpoints.
- Sequential testing: Compute sequential p-values with gsDesign::sequentialPValue(), then test with graphicalMCP::graph_test_shortcut() at each analysis.
- Operating characteristics: Track the first rejection analysis per hypothesis, then compute cumulative rejection probabilities.
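The multiplicity graph, sequential testing, and tracking steps above can be sketched for one simulated trial as follows. The graph weights, transition matrix, and sequential p-values are hypothetical numbers chosen for illustration, not output of a real design or simulation. The result field res$outputs$rejected is assumed to be the named logical vector returned by graph_test_shortcut().

```r
library(graphicalMCP)

# Hypothetical graph: initial weights and transitions are illustrative only.
hyp_names <- c("OS", "PFS", "ORR")
weights <- c(0.6, 0.3, 0.1)
transitions <- rbind(
  c(0, 0.5, 0.5),  # OS passes its alpha equally to PFS and ORR
  c(1, 0,   0),    # PFS passes its alpha to OS
  c(1, 0,   0)     # ORR passes its alpha to OS
)
g <- graph_create(weights, transitions, hyp_names)

# Hypothetical sequential p-values, one row per analysis.
# ORR is tested only from IA2 onward, so its p-value is 1 at IA1.
p_seq <- rbind(
  IA1 = c(0.200, 0.010, 1.000),
  IA2 = c(0.030, 0.004, 0.001),
  FA  = c(0.012, 0.004, 0.001)
)

# Record the first analysis at which each hypothesis is rejected.
first_reject <- setNames(rep(NA_integer_, length(hyp_names)), hyp_names)
for (k in seq_len(nrow(p_seq))) {
  # Same initial graph at every analysis, with updated sequential p-values.
  res <- graph_test_shortcut(g, p = unname(p_seq[k, ]), alpha = 0.025)
  newly_rejected <- res$outputs$rejected & is.na(first_reject)
  first_reject[newly_rejected] <- k
}
first_reject
```

Repeating this loop over simulated trials gives one first-rejection vector per trial, which is the input for the operating characteristics step.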
Key code patterns
For detailed code templates, read references/code_patterns.md.
Topics covered:
- Test statistic functions (logrank via wlr, binomial via testBinomial)
- Sign conventions for Z-statistics across packages
- Simulation loop structure (parallel processing)
- Spending time computation from actual vs planned events
- Sequential p-value computation at each analysis
- Per-trial graphical testing loop
- Cumulative rejection probability table
- Correlation matrix of test statistics
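As an illustration of the cumulative rejection probability table, the base R sketch below summarizes a first-rejection matrix. The four "simulated trials" are hypothetical values typed in by hand. This is one possible layout and is not taken from references/code_patterns.md.

```r
# Hypothetical results: one row per simulated trial, one column per hypothesis.
# Each entry is the first analysis (1 = IA1, 2 = IA2, 3 = FA) at which the
# hypothesis was rejected, or NA if it was never rejected.
first_reject <- rbind(
  c(OS = 3L, PFS = 2L, ORR = 2L),
  c(OS = NA, PFS = 1L, ORR = 2L),
  c(OS = 2L, PFS = 2L, ORR = NA),
  c(OS = NA, PFS = NA, ORR = NA)
)

n_analyses <- 3

# For each analysis k, the proportion of trials rejecting at or before k.
cum_reject <- sapply(seq_len(n_analyses), function(k) {
  colMeans(!is.na(first_reject) & first_reject <= k)
})
colnames(cum_reject) <- c("IA1", "IA2", "FA")
cum_reject
```

Rows are hypotheses and columns are analyses. Each row is non-decreasing from left to right, and the final column is the overall simulated power for that hypothesis.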
Related skills
- gsDesign: Design and spending functions (gsSurvCalendar, sfExtremeValue2, testLower, sequentialPValue). Sequential p-value theory: Liu & Anderson (2008).
- illness-death: Illness-death model calibration and piecewise rate modification
- simtrial: wlr() for logrank Z-statistics
- graphicalMCP: graph_create() and graph_test_shortcut() for multiplicity testing
- graphicalMCP-gsDesign2: Similar workflow using gsDesign2 instead of gsDesign (observed data, not simulation). Includes theoretical basis for the Maurer-Bretz (2013) framework.
Important design considerations
Theoretical basis: The per-trial sequential testing loop implements Algorithm 1 of Maurer & Bretz (2013). Sequential p-values from sequentialPValue() (Liu & Anderson, 2008) are passed to graph_test_shortcut() at each analysis to control FWER.
Sign conventions: wlr() returns positive Z when experimental is better. testBinomial(x1=exp, x2=ctrl) returns positive Z when experimental is better. Both conventions must match for sequentialPValue() (which expects positive Z = favorable).
Spending time in simulation: Use pmin(planned_events, actual_events) / planned_final_events at interim analyses and spending time = 1 at final analyses. This prevents over-spending when simulated events exceed planned events.
ORR is not group sequential: ORR is tested at a single analysis (e.g., IA2). Use the nominal p-value pnorm(-z_orr) directly rather than a sequential p-value.
Separate design vs simulation effects: Use the design HR (e.g., 0.75) for sample size and bounds, but a weaker simulation HR (e.g., 0.80) for realistic operating characteristics.
Graph carries forward: Use the same graph_test_shortcut() call at each analysis with updated sequential p-values. The graph handles alpha reallocation from rejected hypotheses.
Track first rejection: Record the first analysis at which each hypothesis is rejected. Use this to compute cumulative rejection probabilities by analysis.
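The spending-time and ORR rules above can be sketched in base R as follows. The event counts and the ORR Z-statistic are hypothetical values for a single simulated trial. The variable names follow the expressions quoted in the text.

```r
# Hypothetical planned and simulated OS event counts at IA1, IA2, and FA.
planned_events <- c(150, 250, 350)
actual_events <- c(162, 241, 371)
planned_final_events <- planned_events[length(planned_events)]

# Interim analyses: cap actual events at planned events so alpha is never over-spent.
spending_time <- pmin(planned_events, actual_events) / planned_final_events

# Final analysis: spending time is set to 1 regardless of the event count.
spending_time[length(spending_time)] <- 1
spending_time

# ORR is tested once, so use the nominal one-sided p-value
# (positive Z = experimental arm favorable).
z_orr <- 2.4  # hypothetical Z-statistic from the ORR analysis
p_orr <- pnorm(-z_orr)
p_orr
```

With these numbers, IA1 uses the planned count (150) because simulated events ran ahead of plan, while IA2 uses the actual count (241) because they lagged behind it.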