3. COVID Pitch workflow

Exploring post SARS-CoV-2 infection immune trajectories and predicting durable immunity

This example workflow will answer the key immunological questions using the SARS-CoV-2 dataset and the PANDORA software. The following six phases will walk you through the analysis for this example:

Phase 1: Data overview

Prepare the dataset for downstream analysis by uploading it to PANDORA to assess the different data types within the dataset, view their distributions for preliminary exploration, and inspect missing values, which are especially common in longitudinal studies.

PANDORA Tools Utilized: Workspace, Data Overview (Distribution Plot, Table Plot)

Outcome: A foundational understanding of the dataset's characteristics and quality.
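The missing-value inspection described in this phase is done in PANDORA's Data Overview. As a rough offline cross-check, the same per-column count can be sketched in plain Python. The column names, the sample rows, and the set of tokens treated as missing are all assumptions made for illustration; they are not taken from the actual dataset.

```python
import csv
import io

# Hypothetical long-format extract: one row per donor per timepoint.
# Column names and values are invented for illustration only.
SAMPLE_CSV = """donor_id,day_pso,anti_N_Ab,anti_S_Ab
D01,28,2.1,310
D01,180,1.6,
D02,28,,120
D02,180,0.9,95
"""

# Tokens treated as missing (an assumption); adjust to the real file's encoding.
MISSING_TOKENS = {"", "NA", "NaN"}


def missing_counts(csv_text):
    """Return {column: number of missing cells} for a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # DictReader yields None for cells absent from a short row.
            if value is None or value.strip() in MISSING_TOKENS:
                counts[name] += 1
    return counts


counts = missing_counts(SAMPLE_CSV)
for column, n_missing in counts.items():
    print(f"{column}: {n_missing} missing")
```

Columns with many gaps at later timepoints are the ones to watch before any modelling, since longitudinal dropout tends to concentrate there.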

Phase 2: Multivariate exploratory analysis

In this phase, the analysis addresses the first objective, "Visualize the trajectories of diverse immune responses over 6 months". Principal Component Analysis (PCA) allows investigation of how individuals cluster based on their overall immune profile and whether this relates to features such as disease severity, changes over time, or responder status. Correlation analysis helps understand the relationships between different immune measurements across all samples and timepoints.

PANDORA Tools Utilized: PCA Analysis, Correlation

Outcome: Patterns in the data that reveal the trajectory of immune responses over 6 months and the relationships between immune parameters over time as they relate to disease severity and responder status.
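To make the correlation step concrete, here is a minimal sketch of a pairwise correlation matrix between immune measurements. It is a stand-in for illustration, not PANDORA's implementation: it assumes Pearson correlation (the workflow does not say which coefficient is used), drops a sample from a pair when either value is missing, and uses invented measurement names and values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present.

    Returns None when fewer than two complete pairs exist or when either
    variable is constant, since the coefficient is undefined in those cases.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements, one value per sample; None marks a missing value.
measurements = {
    "anti_N_Ab": [2.1, 1.6, 0.7, 0.9, 3.0],
    "anti_S_Ab": [310.0, 250.0, None, 95.0, 420.0],
    "t_cell_response": [40.0, 35.0, 80.0, 60.0, 20.0],
}

names = list(measurements)
matrix = {
    a: {b: pearson(measurements[a], measurements[b]) for b in names}
    for a in names
}

for a in names:
    cells = ", ".join(f"{b}={matrix[a][b]:+.2f}" for b in names)
    print(f"{a}: {cells}")
```

A rank-based coefficient such as Spearman may suit skewed titer data better; swapping it in only changes the `pearson` function.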

Phase 3: Data pre-processing

This phase isolates the specific data needed for the supervised task: predicting the 6-month outcome from early data.

Tools Utilized: Python, R, or Excel

Outcome: A filtered dataset in a form the predictive ML models can use to predict durability and to identify early immune signatures that predict the durability of a person's immune response to SARS-CoV-2 infection.
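Since this phase is carried out outside PANDORA, a short Python sketch may help show the shape of the filtering. It assumes a long-format table with one row per donor per timepoint, a `day_pso` column in which 28 marks the early visit and 180 marks the 6-month visit, and a `responder` column that is already filled in at the 6-month visit. All of these names and encodings, as well as the feature list, are hypothetical.

```python
# Hypothetical long-format records; all names and values are illustrative.
records = [
    {"donor_id": "D01", "day_pso": 28, "anti_S_Ab": 310.0, "t_cell_response": 40.0, "responder": None},
    {"donor_id": "D01", "day_pso": 180, "anti_S_Ab": 250.0, "t_cell_response": 35.0, "responder": "High"},
    {"donor_id": "D02", "day_pso": 28, "anti_S_Ab": 120.0, "t_cell_response": 80.0, "responder": None},
    {"donor_id": "D02", "day_pso": 180, "anti_S_Ab": 95.0, "t_cell_response": 60.0, "responder": "Low"},
    {"donor_id": "D03", "day_pso": 28, "anti_S_Ab": 200.0, "t_cell_response": 55.0, "responder": None},
]

EARLY_DAY = 28     # early timepoint used for predictors
OUTCOME_DAY = 180  # assumed encoding of the 6-month visit
FEATURES = ["anti_S_Ab", "t_cell_response"]  # hypothetical predictor columns


def build_model_table(rows):
    """Return one row per donor: early-timepoint features plus the 6-month outcome.

    Donors lacking either the early visit or a recorded outcome are dropped,
    because a supervised model needs both predictors and a label.
    """
    early = {r["donor_id"]: r for r in rows if r["day_pso"] == EARLY_DAY}
    outcome = {
        r["donor_id"]: r["responder"]
        for r in rows
        if r["day_pso"] == OUTCOME_DAY and r["responder"] is not None
    }
    table = []
    for donor_id in sorted(early.keys() & outcome.keys()):
        row = {"donor_id": donor_id}
        row.update({name: early[donor_id][name] for name in FEATURES})
        row["responder"] = outcome[donor_id]
        table.append(row)
    return table


model_table = build_model_table(records)
for row in model_table:
    print(row)
```

Keeping only early-visit measurements as predictors prevents later measurements from leaking information about the 6-month outcome into the model.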

Phase 4: Predictive modelling

Configure and initiate machine learning models within PANDORA, using early immune measurements (28 days post symptom onset, pso) as predictors for the Responder outcome, where a high responder (seropositive) is defined as having an anti-N Ab titer ≥ 1.4.

Tools Utilized: Predictive (SIMON) Interface

Outcome: Trained predictive models ready for evaluation.
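The responder definition used as the outcome can be written as a one-line rule. The sketch below applies the 1.4 cutoff from the text; the label strings `"High"` and `"Low"`, the function name, and the handling of missing titers are assumptions for illustration and may not match the values stored in the dataset.

```python
HIGH_RESPONDER_THRESHOLD = 1.4  # anti-N Ab titer cutoff stated in the workflow


def responder_label(anti_n_titer):
    """Map an anti-N Ab titer to a responder class.

    Titers at or above the threshold count as high responders (seropositive).
    A missing titer yields None so the sample can be excluded, not guessed.
    """
    if anti_n_titer is None:
        return None
    return "High" if anti_n_titer >= HIGH_RESPONDER_THRESHOLD else "Low"


# Hypothetical titers, including one missing measurement.
titers = [2.3, 1.4, 1.39, None]
labels = [responder_label(t) for t in titers]
print(labels)
```

Note that the comparison is inclusive, so a titer of exactly 1.4 is classed as a high responder.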

Phase 5: Model evaluation

Assess the performance of the trained models using appropriate metrics (e.g., AUC), and utilize explainable AI techniques to identify the most influential early immunological features driving the predictions.

PANDORA Tools Utilized: Predictive (Exploration: Metrics, ROC Curve Analysis, Variable Importance)

Outcome: Identification of the optimal predictive model(s) and key predictive immunological parameters.
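For readers who want to see what the AUC metric measures, here is a minimal sketch that computes it directly from its pairwise definition: the fraction of (high responder, low responder) pairs in which the model scores the high responder higher, with ties counted as half. PANDORA reports this metric itself, so the function name, labels, and scores below are illustrative assumptions and not its implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. high responder), 0 otherwise.
    scores: model scores, where higher means more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out labels and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why it is a convenient single number for comparing models trained on the same outcome.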

Phase 6: Results

Consolidate all analytical results, interpret the biological significance of the top predictors, and formulate a comprehensive report on the model's performance and findings as they relate to the key objectives.

PANDORA Tools: Review PANDORA Outputs from prior analysis

Outcome: A complete report that interprets analytical results for real immunological insights.
