Flu Fighters workflow

Predicting LAIV Response

This example workflow demonstrates the application of PANDORA to investigate predictors of immune response to the Live Attenuated Influenza Vaccine (LAIV), using the "Flu Fighters" dataset. The overarching goal is to identify baseline immune features capable of classifying participants into "high" or "low" vaccine responder categories.

Workflow phases:

Phase 1: data configuration & initial inspection

Prepare the dataset for analysis by uploading it into PANDORA, examining its structure, identifying missing data patterns, and visualizing initial variable distributions and correlations.

PANDORA Tools Utilized: Workspace, Data Overview, Correlation.

Outcome: A foundational understanding of the dataset's characteristics and quality.
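
For readers who want to sanity-check the same quantities outside PANDORA, the following is a minimal sketch of two of the checks described above: the fraction of missing values in a column and the Pearson correlation between two variables. The column names (`cd4_baseline`, `iga_baseline`) and the toy rows are hypothetical and are not taken from the Flu Fighters dataset. Missing values are assumed to be encoded as `None`.

```python
import math

def missing_fraction(rows, column):
    """Fraction of rows in which `column` is missing (None)."""
    return sum(1 for r in rows if r.get(column) is None) / len(rows)

def pearson(rows, col_x, col_y):
    """Pearson correlation, using only rows where both values are present."""
    pairs = [(r[col_x], r[col_y]) for r in rows
             if r.get(col_x) is not None and r.get(col_y) is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: one missing value in `iga_baseline`.
rows = [
    {"cd4_baseline": 1.0, "iga_baseline": 2.0},
    {"cd4_baseline": 2.0, "iga_baseline": 4.0},
    {"cd4_baseline": 3.0, "iga_baseline": None},
    {"cd4_baseline": 4.0, "iga_baseline": 8.0},
]

print(missing_fraction(rows, "iga_baseline"))
print(pearson(rows, "cd4_baseline", "iga_baseline"))
```

Dropping incomplete pairs before computing the correlation is one simple choice. PANDORA's own handling of missing values may differ.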

Phase 2: defining vaccine responders

Categorize participants into distinct immune response groups (e.g., "high" vs. "low" responders) based on post-vaccination outcome variables. This establishes the target variable for subsequent predictive modeling.

PANDORA Tools/Methods Utilized: t-SNE Analysis (for data-driven clustering) or manual definition based on external biological thresholds.

Outcome: A new 'ResponderStatus' variable classifying each participant.
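
The manual, threshold-based route can be sketched as follows. This is an illustration only: it assumes the outcome is summarized as a per-participant fold change (post-vaccination value divided by baseline), and the cutoff of 2.0 is a placeholder, not a recommended biological threshold. The participant IDs and the function name are hypothetical.

```python
def assign_responder_status(fold_changes, threshold):
    """Label each participant 'high' or 'low' by comparing fold change to a cutoff."""
    return {
        participant: ("high" if fold_change >= threshold else "low")
        for participant, fold_change in fold_changes.items()
    }

# Hypothetical fold changes keyed by participant ID.
fold_changes = {"P01": 4.5, "P02": 1.2, "P03": 2.0}

# The threshold is a placeholder; in practice it comes from an external biological criterion.
responder_status = assign_responder_status(fold_changes, threshold=2.0)
print(responder_status)
```

The resulting mapping plays the role of the 'ResponderStatus' variable; participants exactly at the cutoff are counted as "high" in this sketch.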

Phase 3: confounding variable assessment

Evaluate whether potential confounding variables (e.g., age, sex, study year) are differentially distributed across the defined responder groups, which could bias downstream analyses.

PANDORA Tools Utilized: t-SNE Analysis (visualizing group distributions).

Outcome: Assessment of potential confounding to ensure the robustness of predictive findings.
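
As a complement to the visual check, a simple statistical test can indicate whether a categorical confounder is unevenly distributed between responder groups. The sketch below is not a PANDORA feature; it is a plain Pearson chi-square test on a 2x2 table (for example, sex by responder status), written under the assumption that all expected cell counts are nonzero and large enough for the approximation to hold. The counts are invented for illustration.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 count table.

    Returns (statistic, p_value). With 1 degree of freedom, the p-value
    is erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = sex (female, male), columns = (high, low) responders.
table = [[20, 5], [5, 20]]
statistic, p_value = chi_square_2x2(table)
print(statistic, p_value)
```

A small p-value suggests the variable is associated with responder status and may need to be accounted for before interpreting predictors. Continuous variables such as age would need a different test.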

Phase 4: predictive modeling setup

Configure and initiate machine learning models within PANDORA, using baseline immune measurements as predictors for the 'ResponderStatus' outcome defined in Phase 2.

PANDORA Tools Utilized: Predictive (SIMON interface).

Outcome: Trained predictive models ready for evaluation.

Phase 5: predictive model evaluation & interpretation

Assess the performance of the trained models using appropriate metrics (e.g., AUC) and identify the most influential baseline features driving the predictions using explainable AI techniques.

PANDORA Tools Utilized: Predictive (Exploration: Metrics, ROC Curve Analysis, Variable Importance, Model Interpretation).

Outcome: Identification of the optimal predictive model(s) and key predictive biomarkers.
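
To make the AUC metric concrete, here is a minimal sketch of how it can be computed from predicted scores, using its interpretation as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case (ties count as one half). This is not PANDORA's implementation; the labels (1 for "high", 0 for "low") and scores are made up.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0
            elif p == n:
                correctly_ranked += 0.5  # ties contribute half a pair
    return correctly_ranked / (len(positives) * len(negatives))

# Hypothetical predictions: 1 = "high" responder, 0 = "low" responder.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of participants, which is fine for small cohorts.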

Phase 6: synthesis of findings

Consolidate all analytical results, interpret the biological significance of the top predictors, and formulate a comprehensive report on the model's performance and findings.

PANDORA Tools/External Analysis: Review of PANDORA outputs, optional pathway enrichment analysis (external), and biological literature review.

Outcome: A complete analytical report with actionable insights.
