Dataset Description
This dataset contains a mixture of categorical clinical parameters (clinical symptoms, disease severity) and numerical immunological parameters (immunoglobulins, T cells, memory B cells, antibodies) measured over a period of 6 months, indexed by timepoint and by days post onset of symptoms, along with basic demographics (age, sex) for each donor.
For more details about each variable, see the variable legend for this dataset.

Questions to answer
The goal of this workflow is to use this COVID Pitch dataset to answer two important questions:
Are there certain immune parameters that can explain the disease severity experienced by individuals and that are dependent on time post SARS-CoV-2 infection?
Can we utilize certain immune parameters measured early after infection to predict whether an individual builds a durable immune response to SARS-CoV-2?
As this study was conducted at the beginning of the COVID-19 pandemic, these questions were critical for determining correlates of protection for this disease, understanding immune profiles across different degrees of infection, and developing measures for future cases. Below, we explain the broad groups of variables in this dataset that allowed us to answer these questions.
Clinical Parameters
These variables consist of the clinical symptoms most commonly associated with SARS-CoV-2 infection, along with the severity of disease experienced by the donors.
📋Clinical Symptoms
Common symptoms for SARS-CoV-2 infection:
Fever
Cough
Change or loss of taste
Anosmia: Complete loss of sense of smell
Fatigue
Shortness of breath
Nasal congestion
Sore throat
Myalgia: Muscle pain or soreness
Arthralgia: Joint pain
Headache
Diarrhoea
Vomiting
Nausea
Chest pain
Anorexia: Excessive weight loss
Asthma
😷Disease Severity
Asymptomatic: Donor presents no symptoms while infected
Mild: Donor presents with moderate symptoms upon infection
Severe: Donor presents with more extreme symptoms upon infection
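Because the three severity categories have a natural order, it can help to encode them as ordered ranks before modelling. The following is a minimal sketch, assuming the labels appear in the data as the strings "Asymptomatic", "Mild" and "Severe"; the names `SEVERITY_ORDER` and `encode_severity` are hypothetical and not part of the dataset.

```python
# Ordered severity levels, from least to most severe.
# Assumption: the dataset stores severity using these three labels.
SEVERITY_ORDER = ["Asymptomatic", "Mild", "Severe"]
SEVERITY_RANK = {name: rank for rank, name in enumerate(SEVERITY_ORDER)}


def encode_severity(label: str) -> int:
    """Map a severity label to an ordinal rank (0 = Asymptomatic, 2 = Severe)."""
    # Normalise whitespace and capitalisation before the lookup.
    key = label.strip().capitalize()
    if key not in SEVERITY_RANK:
        raise ValueError(f"Unknown severity label: {label!r}")
    return SEVERITY_RANK[key]


# Hypothetical example values, not taken from the dataset.
example_labels = ["Mild", "severe", " Asymptomatic "]
example_ranks = [encode_severity(label) for label in example_labels]
```

An ordinal encoding preserves the ordering Asymptomatic < Mild < Severe, which a one-hot encoding would discard; whether that ordering should be treated as evenly spaced is a modelling choice.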
Immunological Parameters
These variables comprise various immunological assays that quantify immune parameters such as T cells, memory B cells, antibodies and associated processes, giving a comprehensive view of the cellular and humoral adaptive immune responses.
🛡️Antibodies
Pseudo-Neutralizing Antibodies (pseudoNA Abs): The concentration of neutralizing antibodies that inhibit infection by a SARS-CoV-2 pseudovirus indicates the efficacy of the donor's antibodies
Antibody-Dependent Effector Functions: Effector functions such as monocyte phagocytosis (ADMP) and NK cell activation (ADNKA) give insight into how antibodies are neutralizing the virus
Immunoglobulin Assays: Provide measurements of antibodies specific to SARS-CoV-2 proteins, both in the mucous membranes (S-IgA) and circulating in the blood (N-IgG, S-IgG)
Meso Scale Discovery (MSD) Assays: Give insight into whether the donor's antibodies provide protection against various coronavirus strains (e.g. 229e, NL63) and other respiratory viruses (e.g. MERS)
🦠T cells
SARS-CoV-2 protein-specific T cell responses: Show which SARS-CoV-2 proteins the donor's T cells react to most strongly, by measuring the concentration of T cells responding specifically to each protein (e.g. ORF3, nsp3b, S1)
T cell proliferative responses: Provides insight into which SARS-CoV-2 proteins are most targeted by the CD4 and CD8 cells by measuring their proliferation for each protein
🧪B cells
Spike-specific IgG+ Memory B cells: The concentration of memory B cells specific to seasonal coronavirus strains gives insight into the durable immunity the body has to the various viruses
Responder Status
The seropositivity of each donor at 6 months post symptom onset determined whether the donor was a low or high responder. Seropositivity was determined from the titer of nucleocapsid-specific antibodies: a titer greater than or equal to 1.4 indicated seropositivity. We use the corresponding variable, Responder, as the target that the machine learning algorithms predict from early-timepoint immune parameters.
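The thresholding rule described above can be sketched as follows. This is an illustration only, assuming that seropositive donors correspond to high responders and seronegative donors to low responders; the function names and the example titers are hypothetical and do not come from the dataset.

```python
# Titer cut-off for seropositivity, as stated in the text.
SEROPOSITIVITY_THRESHOLD = 1.4


def is_seropositive(titer: float) -> bool:
    """Return True when the 6-month nucleocapsid-specific titer is >= 1.4."""
    return titer >= SEROPOSITIVITY_THRESHOLD


def responder_label(titer_6m: float) -> str:
    """Derive a responder label from the 6-month titer.

    Assumption: seropositive donors are labelled "high" and
    seronegative donors "low".
    """
    return "high" if is_seropositive(titer_6m) else "low"


# Hypothetical 6-month titers for three made-up donors.
example_titers = {"donor_a": 0.6, "donor_b": 1.4, "donor_c": 3.2}
example_labels = {donor: responder_label(t) for donor, t in example_titers.items()}
```

Note that the comparison is inclusive, so a donor with a titer of exactly 1.4 counts as seropositive.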
Demographic and Time variables