# Phase 6: Results

Report the best model and its test set performance (e.g., AUC). List the top predictors identified via **Variable Importance**. Describe insights from confounder analysis (Phase 3) and **Model Interpretation** (if applicable). Discuss the biological relevance of the top predictors.

<details>

<summary>1. Combine Findings</summary>

1. Identify the top model from Phase 4 by considering:
   * Model performance metrics
   * ROC curves
   * Biological relevance of top predictors in Variable Importance
   * Confounder check (Phase 3)
2. Pull together all your findings, including:

   * Clustered t-SNE plots for responder classification, if applicable (Phase 2)

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FWpaHPrQKvT5BcuclGA71%2FFF_Phase6_Clustered%20tSNE%20Plot.png?alt=media&#x26;token=00412e13-b85c-4302-821c-ef903e936785" alt="" width="375"><figcaption></figcaption></figure>

   * t-SNE plots and analysis from Confounder check (Phase 3)

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FR5M6O7vdUt9wM7xaS027%2FFF_Phase%20%203_Age%20vs%20HAI%20Responder.png?alt=media&#x26;token=e0bc1a18-b91d-46f5-9a0f-330063004656" alt="" width="563"><figcaption></figcaption></figure>

   * Model performance metrics (Phase 4)

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FzBCHd03BPCdW6EiHZ5pZ%2FFF_Phase%205_Training%20Summary%20Box%20Plots.png?alt=media&#x26;token=4ed63f7c-ccb6-444a-a6c4-8694931b111c" alt="" width="375"><figcaption></figcaption></figure>

   * Training and Testing ROC Curves

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FkZBDAiEQ5lPVU5RS7zWb%2FFF_Phase%205_Combined%20ROC%20Curves%20RF.png?alt=media&#x26;token=5ffd58b2-fe0b-4563-853d-8048e912af48" alt="" width="563"><figcaption></figcaption></figure>

   * Model Interpretation plots, if applicable

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2Fps37yrLxEzKHVZ47KAY4%2FFF_Phase%205_Model%20Interp%20Heatmap%20RF.png?alt=media&#x26;token=2dcf93eb-bdf2-422b-9c5a-e9f80cfe8f08" alt="" width="375"><figcaption></figcaption></figure>

   * Variable Importance bar plot

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FF1uB7a1VURXL8Cnb2XXR%2FFF_%20Phase%205_Exploration_Variable%20Importance%20Plot_white%20background.png?alt=media&#x26;token=225a2ec0-6567-480d-8546-c51a5e2456a3" alt="" width="375"><figcaption></figcaption></figure>

   * Features across dataset dot plots for top predictive features

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FxQXBIy1FZlMrcaordXqB%2FFF_%20Phase%205_Exploration_Features%20Across%20Dataset%20Plot.png?alt=media&#x26;token=0b291037-cef4-4c00-93f5-590eee0840fa" alt="" width="375"><figcaption></figcaption></figure>
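
If you export the performance metrics to a table, the comparison in step 1 can be sketched in a few lines outside PANDORA. This is a minimal illustration, not a PANDORA feature: the model names and AUC values are made-up placeholders, and the 0.10 train/test gap used to flag possible overfitting is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    train_auc: float
    test_auc: float


def rank_models(results, max_gap=0.10):
    """Sort models by test AUC (descending) and flag large train/test gaps.

    max_gap is an assumed threshold, not a recommended value.
    """
    ranked = sorted(results, key=lambda r: r.test_auc, reverse=True)
    return [
        (r.name, r.test_auc, (r.train_auc - r.test_auc) > max_gap)
        for r in ranked
    ]


# Placeholder values for illustration only; substitute your own metrics.
results = [
    ModelResult("model_a", train_auc=0.95, test_auc=0.70),
    ModelResult("model_b", train_auc=0.85, test_auc=0.80),
    ModelResult("model_c", train_auc=0.78, test_auc=0.75),
]

for name, test_auc, gap_flag in rank_models(results):
    note = " (large train/test gap)" if gap_flag else ""
    print(f"{name}: test AUC = {test_auc:.2f}{note}")
```

A ranking like this is only a starting point. The biological relevance of the top predictors and the confounder check still decide which model you report.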

</details>

<details>

<summary>2. Analyze GO Terms &#x26; Biological Themes</summary>

The GO terms in your dataset come from pathway enrichment analysis, a technique performed outside PANDORA that identifies biological themes in gene expression data. Look these terms up in the databases below to uncover the biological themes that characterize your responder groups and drive model prediction.

* Pathway Enrichment Analysis Tools:
  * clusterProfiler in R
  * DAVID
  * Metascape
  * Enrichr
* Ontology and pathway databases:

  * GO
  * KEGG
  * Reactome

Use GO terms together with your predictive variables to identify biological themes with the following workflow:

1. Identify GO terms from your top predictors
   1. Open the Gene Ontology Resource [webpage](https://geneontology.org/)
   2. Search for each of your top predictive GO terms by ID, in the form `GO:` followed by seven digits (e.g., `GO:0070206`, `GO:1903214`)
   3. Click term history to see the ancestor chart, child terms, and co-occurring terms
   4. Create a list of all biological processes and themes related to your GO terms
2. Check the expression levels of baseline terms in responder groups
   1. Select your predictive processed dataset from the Workspace. This dataset should contain only baseline features and your responder columns.
   2. Navigate to **Discovery** -> **Start** -> **Hierarchical Clustering**
   3. Configure Clustering **Column Selection**

      1. Select your responder column for **Columns**
      2. Set **First (n) rows** to a value larger than the total number of baseline features

      <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2F7Jgd2f8fxSEhiPXKe1Ao%2FFF_Phase%206_Clustering%20Column%20Selection.png?alt=media&#x26;token=9baa7144-1c2f-4df9-b7a5-56b3c28f098e" alt="" width="375"><figcaption></figcaption></figure>
   4. Configure Clustering **Display Options**

      1. Enable **Grouped display**
      2. Select the responder column for **Grouped column**

      <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2FTDE3YxFP8tOmtI2Jy9IM%2FFF_Phase%206_Clustering%20Display%20Options.png?alt=media&#x26;token=ec1c7ff0-e304-439a-a91a-6d7a92056cc4" alt="" width="375"><figcaption></figcaption></figure>
   5. Click **Plot image**
3. Analyze the resultant heatmap

   1. Note how the expression of the top predictive variables varies among the responder classes.
   2. Using the themes you listed from the predictive variables and top GO terms, consider which biological themes distinguish the responder classes.

   <figure><img src="https://1845146574-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZMrkCA3Bqd62Gp0kAk79%2Fuploads%2F1LSaT27XLesYfZWUcQOp%2FBaseline%20Feature%20Responder%20Group%20Heatmap.png?alt=media&#x26;token=a16fc293-e861-45cb-ae6c-1f1f0b34d98c" alt="" width="375"><figcaption></figcaption></figure>
4. Make plots reflecting biological themes (optional)
   1. Outside of PANDORA, you may create additional plots, such as radar plots, reflecting the different immune profiles of responder classes based on the baseline or fold change expression levels of features in each class.
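
As one way to approach the optional plots in step 4, the sketch below computes a per-class profile (the mean of each feature within each responder class) and draws it as a radar plot. It is a rough illustration under several assumptions: the column name `responder`, the feature names, and the values are hypothetical, the per-feature min-max scaling is one choice among many for making axes comparable, and plotting requires matplotlib to be installed.

```python
import math
from collections import defaultdict


def class_profiles(rows, class_key, features):
    """Mean of each feature per class, min-max scaled to [0, 1] per feature."""
    sums = defaultdict(lambda: [0.0] * len(features))
    counts = defaultdict(int)
    for row in rows:
        cls = row[class_key]
        counts[cls] += 1
        for i, feature in enumerate(features):
            sums[cls][i] += row[feature]
    means = {cls: [s / counts[cls] for s in sums[cls]] for cls in sums}

    scaled = {cls: [] for cls in means}
    for i in range(len(features)):
        column = [means[cls][i] for cls in means]
        lo, span = min(column), max(column) - min(column)
        for cls in means:
            # If all classes share the same mean, place the point mid-axis.
            scaled[cls].append((means[cls][i] - lo) / span if span else 0.5)
    return scaled


def plot_radar(profiles, features, path="radar.png"):
    """Draw one closed polygon per class on polar axes and save it to path."""
    import matplotlib  # imported here so class_profiles works without it
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    angles = [2 * math.pi * i / len(features) for i in range(len(features))]
    closed = angles + angles[:1]  # repeat the first angle to close the shape
    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    for cls, values in profiles.items():
        ax.plot(closed, values + values[:1], label=str(cls))
        ax.fill(closed, values + values[:1], alpha=0.2)
    ax.set_xticks(angles)
    ax.set_xticklabels(features)
    ax.legend(loc="upper right")
    fig.savefig(path, dpi=150)
    plt.close(fig)


# Hypothetical data: replace with your exported baseline or fold change values.
features = ["feat_a", "feat_b", "feat_c"]
rows = [
    {"responder": "high", "feat_a": 2.0, "feat_b": 1.0, "feat_c": 4.0},
    {"responder": "high", "feat_a": 4.0, "feat_b": 3.0, "feat_c": 6.0},
    {"responder": "low", "feat_a": 1.0, "feat_b": 5.0, "feat_c": 5.0},
    {"responder": "low", "feat_a": 1.0, "feat_b": 7.0, "feat_c": 5.0},
]
profiles = class_profiles(rows, "responder", features)
# plot_radar(profiles, features)  # uncomment to write radar.png
```

With only two responder classes, min-max scaling places every feature at 0 for one class and 1 for the other (or 0.5 for both when the means are equal), so consider z-scores or raw means if you need to show the size of the difference.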

</details>

You've now identified and analyzed your strongest model by weighing model performance, biological interpretation, and confounder analysis. Together, these analyses give you a comprehensive picture of your model from which to draw biologically relevant insights.
