Phase 4: Predictive modeling
In this phase, you will create models to predict response classification from baseline immune measurements.
Prepare your dataset for predictive analysis by removing outcome variables that could bias results, ensuring that only baseline predictor variables remain. Then, configure and run predictive models in PANDORA using the cleaned dataset.
1. Process dataset
To ensure unbiased predictions, remove every outcome variable other than the designated responder column; leaving them in would leak outcome information into the predictors. Various tools can be used for this step, but Excel is used in the example below.
Open your Flu Fighter dataset with responder columns in Excel
Search for and remove all undesired outcome variables. A few examples:
ch6_titer_v21, h3_v2_shed, h1_hai_gmt_fold_change
Helpful search terms:
fold
v2
v7
Select and delete every column containing these terms.

Save as a new predictive processed .csv file
Upload the new file to PANDORA
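If you would rather script this step than use Excel, the same filtering can be sketched in Python with only the standard library. This is a minimal illustration, not part of PANDORA: the function names, the `subject_id` and `age` columns, and the sample row are hypothetical, and the responder column is assumed to be named `ResponderStatus`. Note that substring matching is broad (`v2` also matches `v21`), so review the dropped columns before uploading.

```python
import csv
import io

# Search terms from the step above; the responder column name is an assumption.
OUTCOME_TERMS = ("fold", "v2", "v7")
RESPONSE_COLUMN = "ResponderStatus"


def columns_to_keep(header, terms=OUTCOME_TERMS, keep=(RESPONSE_COLUMN,)):
    """Return the column names that contain none of the search terms.

    Columns listed in `keep` are always retained, even if they match a term.
    """
    kept = []
    for name in header:
        lowered = name.lower()
        if name in keep or not any(term in lowered for term in terms):
            kept.append(name)
    return kept


def filter_csv(src, dst, terms=OUTCOME_TERMS, keep=(RESPONSE_COLUMN,)):
    """Copy CSV rows from `src` to `dst`, dropping the outcome columns."""
    reader = csv.DictReader(src)
    kept = columns_to_keep(reader.fieldnames, terms, keep)
    writer = csv.DictWriter(
        dst, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept


# Hypothetical one-row dataset used only to demonstrate the filter.
sample = io.StringIO(
    "subject_id,age,h3_v2_shed,h1_hai_gmt_fold_change,ch6_titer_v21,ResponderStatus\n"
    "S1,34,0,4.0,80,high\n"
)
cleaned = io.StringIO()
kept_columns = filter_csv(sample, cleaned)
print(kept_columns)
```

To process a real file, pass open file handles instead of the `io.StringIO` objects, for example `open("input.csv", newline="")` and `open("output.csv", "w", newline="")`, where both file names are placeholders.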
2. Setup prediction task
Navigate to Workspace
Select the processed Flu Fighters dataset with the added ResponderStatus or Cluster column

Navigate to Predictive -> Start

Configure Analysis Properties
Select all columns as Predictor variables
Use PANDORA's Exclude predictors for *fold_change, v2, v7, v21 or any other accidental outcome variables. There should be none if the predictive processing was completed correctly in step 1.
Select ResponderStatus or pandora_cluster column for Response
Select Preprocessing options center, scale, medianimpute, corr, zv, and nzv
Set Training/Testing dataset partition to 75% training and 25% testing
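To give a feel for what some of these preprocessing options do, here is a small Python sketch of median imputation, centering and scaling, and a zero-variance check. It is an illustration only: the option names resemble preprocessing methods in R's caret package, but how PANDORA applies them internally (including their order) is an assumption here, and the helper names and sample values are hypothetical.

```python
import statistics


def median_impute(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def center_and_scale(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def is_zero_variance(values):
    """True when every observed value is identical (the idea behind zv)."""
    return len({v for v in values if v is not None}) <= 1


raw = [2.0, None, 4.0, 6.0, 8.0]  # hypothetical baseline measurement
imputed = median_impute(raw)       # the missing value becomes the median
scaled = center_and_scale(imputed)  # mean 0, standard deviation 1
print(imputed)
print(is_zero_variance([1.0, 1.0, 1.0]))
```

In a real modeling workflow, the median, mean, and standard deviation are normally estimated on the training partition and then reused on the testing partition, so that no information from the test rows reaches the model.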

Select packages for your predictive models
For this example, select rf, nb, glm, mlp, and C5.0

Experimental Options
When creating your own predictive models, you can experiment with the following:
Training/Testing dataset partition: Different models perform better with different partitions, so experimenting with this parameter can help you find the best-performing model.
Packages: PANDORA has 200+ packages for predictive models, and you can even select a whole family of models with similar features.
Caution: Running too many models simultaneously on a personal computer may significantly increase processing time, and computationally intensive models may fail due to Timeout.
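When experimenting with the partition, it helps to see how many samples of each class land in the test set, since a small test set makes performance estimates noisy. The sketch below shows a stratified split in plain Python for a few ratios. It is a hedged illustration: the function name and the label counts are hypothetical, and whether PANDORA stratifies its partition in the same way is an assumption.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split row indices into train/test lists, preserving class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical responder labels: 12 "high" and 28 "low" subjects.
labels = ["high"] * 12 + ["low"] * 28
for fraction in (0.6, 0.75, 0.9):
    train, test = stratified_split(labels, fraction)
    test_high = sum(1 for i in test if labels[i] == "high")
    print(fraction, len(train), len(test), test_high)
```

With 40 subjects, a 75/25 split leaves only 10 test rows, so a single misclassification shifts test accuracy by 10 percentage points; this is worth keeping in mind when comparing partitions.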
3. Run analysis
Click the Validate data button
Click the Process button on the pop-up that appears

Monitor Progress on your PANDORA Dashboard

You’ve successfully processed your dataset to remove bias-inducing outcome variables and configured predictive models using PANDORA. Once your models have completed processing, you're ready to interpret the results and evaluate model performance in the next phase.