Phase 2: Define responders

In this phase of the workflow, you define the outcome variable used later in predictive analysis.

Classify participants into immune response groups using unsupervised clustering (Example 1) or predefined biological thresholds (Example 2). The resulting responder labels will guide further analysis and visualization of immune response patterns.

This phase presents two distinct methods for creating the responder label: Example 1 adds a pandora_clusters column, and Example 2 adds a ResponderStatus column. Choose one path, or run both and compare the results.

Example 1: Multivariate Clustering (Using the integrated immunaut package)

Navigate to Discovery -> t-SNE Analysis

Expand Column Selection
- Select all *fold_change variables

Expand Cluster Settings
- Set Target Clusters Range to between 2 and 4

Experimental Options

Feel free to experiment and observe the effects of other t-SNE side panel settings, such as:

Section Column Selection
- Grouping Variable, Color Variable
Section Clustering Settings
- Clustering Algorithm, K, Pick 'Best Cluster' Method
Section t-SNE Settings
- Perplexity, Exaggeration Factor, Theta, Maximum Iterations, Learning Rate (Eta)
Section Dataset Settings
- Dataset analysis type
Section Theme Setting
- Theme, Color, Legend position, Font size, Point size, Ratio, Plot size

Click the Plot Image button

Navigate to Clustered t-SNE analysis to visualize clusters

Navigate to Dataset Analysis
- Based on the heatmap, note the distinguishing features between clusters
- In this case:
  - Cluster 1: Upregulated cellular response and IVPM binding
  - Cluster 2: Upregulated antibody response

Click Actions -> Save to workspace
- Enter a file name for the new dataset and click OK
- This saves a new dataset to your dashboard with an added column, pandora_clusters. Continue the workflow using this newly created dataset.
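The cluster differences noted from the heatmap can be cross-checked outside PANDORA by averaging each fold-change variable per cluster. The following is a minimal sketch, not part of PANDORA or immunaut. It assumes the saved dataset has been exported as a CSV file that contains the pandora_clusters column and variables whose names end in fold_change. The helper name cluster_means and the sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def cluster_means(rows, cluster_col="pandora_clusters", suffix="fold_change"):
    """Return {cluster: {column: mean}} for every column ending in `suffix`."""
    feature_cols = [col for col in rows[0] if col.endswith(suffix)]
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in feature_cols:
            # Skip blank cells so missing measurements do not break the average.
            if row[col] not in ("", None):
                grouped[row[cluster_col]][col].append(float(row[col]))
    return {
        cluster: {col: mean(values) for col, values in cols.items()}
        for cluster, cols in grouped.items()
    }


# Hypothetical sample standing in for the exported dataset.
sample_csv = """participant_id,h1_hai_gmt_fold_change,h3_hai_gmt_fold_change,pandora_clusters
P01,8,2,1
P02,4,4,1
P03,1,1,2
P04,2,3,2
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
means = cluster_means(rows)

for cluster in sorted(means):
    print(f"Cluster {cluster}:")
    for col, value in means[cluster].items():
        print(f"  {col}: {value:.2f}")
```

To run this on a real export, replace io.StringIO(sample_csv) with an opened file handle. Variables whose means differ most between clusters are the ones to compare against the heatmap.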

Example 2: Manual Definition Based on Biological Thresholds (Requires manual pre-processing)

Define Responder Status Rule

Define "High Responders" as anyone with h1_hai_gmt_fold_change >= 4 OR h3_hai_gmt_fold_change >= 4
1. This rule follows a commonly accepted threshold in immunology, which treats an antibody titer increase of fourfold or more as a high response¹.

Implement the Rule

Apply the rule to the dataset with any tool, such as Python, R, or Excel. This example uses Excel.
Create a new column called ResponderStatus

Search for variable h1_hai_gmt_fold_change in the Excel sheet

Filter by h1_hai_gmt_fold_change ≥ 4

Define high responders
1. Set filtered rows under ResponderStatus to 1 to indicate high responders.

Remove filter

Repeat the search, filter, label, and remove-filter steps for h3_hai_gmt_fold_change

Filter ResponderStatus column to view rows not equal to 1

Define low responders
1. Set ResponderStatus to 0 for the filtered rows to indicate low responders.

Save the .csv file under a new name
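Because any tool can be used, the Excel steps above can also be scripted. The following is a minimal Python sketch that uses only the standard library and mirrors the same rule: ResponderStatus is 1 when either fold-change column is at least 4, and 0 otherwise. The sample rows, the participant_id column, and the helper names are hypothetical. Treating blank cells as not meeting the threshold is an assumption that matches how an Excel filter would skip them.

```python
import csv
import io

THRESHOLD = 4.0
FOLD_CHANGE_COLUMNS = ("h1_hai_gmt_fold_change", "h3_hai_gmt_fold_change")


def meets_threshold(value, threshold=THRESHOLD):
    """Return True if value parses as a number >= threshold; blanks count as False."""
    try:
        return float(value) >= threshold
    except (TypeError, ValueError):
        return False


def add_responder_status(rows):
    """Return copies of rows with ResponderStatus set to 1 (high) or 0 (low)."""
    labelled = []
    for row in rows:
        is_high = any(meets_threshold(row.get(col)) for col in FOLD_CHANGE_COLUMNS)
        labelled.append({**row, "ResponderStatus": 1 if is_high else 0})
    return labelled


# Hypothetical sample standing in for the dataset.
sample_csv = """participant_id,h1_hai_gmt_fold_change,h3_hai_gmt_fold_change
P01,8,1
P02,2,4
P03,1.5,2
P04,,3.9
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
labelled_rows = add_responder_status(rows)

# Write the result as CSV text. To save a file instead, replace io.StringIO()
# with open("new_name.csv", "w", newline="").
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(labelled_rows[0].keys()))
writer.writeheader()
writer.writerows(labelled_rows)
print(output.getvalue())
```

In the sample, P01 and P02 are labelled 1 because one of their fold changes reaches 4, while P03 and P04 are labelled 0.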

Verify Definition

Launch PANDORA
Upload your new .csv file with the added ResponderStatus column to the Workspace

Select the file and navigate to Discovery -> Data Overview
Expand Column Selection
1. Select the ResponderStatus column and one other column of your choice
2. Click the Plot Image button

Check the distribution plot to see the counts of "High Responders" and "Low Responders"
1. In this example, the two groups are roughly equal in size, so the classes are balanced enough for use in further analysis.
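The same check can be made numerically before uploading the file. The following is a minimal sketch that counts each class and reports its share of the total. The helper name class_balance and the sample values are hypothetical; it assumes ResponderStatus holds 1 for high responders and 0 for low responders, as defined above.

```python
from collections import Counter


def class_balance(statuses):
    """Return {label: (count, share)} for labels 1 (high) and 0 (low)."""
    # int(float(...)) accepts values read from CSV as "1", "0", "1.0", or numbers.
    counts = Counter(int(float(s)) for s in statuses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ResponderStatus values to summarise")
    return {label: (counts[label], counts[label] / total) for label in (1, 0)}


# Hypothetical ResponderStatus column values.
statuses = [1, 0, 1, 1, 0, 0, 1, 0]
balance = class_balance(statuses)

for label, (count, share) in balance.items():
    name = "High Responder" if label == 1 else "Low Responder"
    print(f"{name}: {count} ({share:.0%})")
```

Shares close to 50% for each group correspond to the roughly even distribution described above.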

References

1. Centers for Disease Control and Prevention. 2013. Prevention and control of seasonal influenza with vaccines. Recommendations of the Advisory Committee on Immunization Practices - United States, 2013-2014. [Published erratum appears in 2013 MMWR Recomm. Rep. 62: 906.] MMWR Recomm. Rep. 62: 1–43.

You’ve now defined the responder variable, which classifies individuals based on immune response. This classification will guide the predictive models developed later.
