A machine-learning approach to identify asthma subtypes based on molecular signatures

A new study published in the Journal of Allergy and Clinical Immunology examines whether protein expression in nasal fluid samples could assist in the diagnosis of asthma subtypes.

Study: Multidimensional endotyping using nasal proteomics predicts molecular phenotypes in the asthmatic airways. Image Credit: Ljupco Smokovski/Shutterstock

Asthma subtypes

Asthma is a heterogeneous disease that is commonly divided into several phenotypes; however, the distinction between these subtypes is often descriptive rather than based on the presence of specific biomarkers.

The diagnosis of non-type 2 (non-T2) asthma, for example, remains a challenge due to the lack of readily accessible and measurable biomarkers. As a result, no biologics or small-molecule treatments are currently available for non-T2 asthma. Thus, there remains an urgent need for non-invasive approaches that can identify molecular asthma phenotypes and ultimately improve treatment outcomes for these patients.

Advances in genetic sequencing technologies have led to the use of data-driven statistical and machine learning (ML) approaches to categorize different diseases, including asthma. In fact, unsupervised clustering has successfully identified asthma phenotypes and underlying patterns of clinical variables.

About the study

The current study involved 60 adults with severe asthma who were enrolled in the endotype of non-eosinophilic asthma (ENDANA) study. Nasal fluid was collected from all study participants using an absorbent fibrous matrix procedure followed by proteomic analysis.

Statistical testing, Shapley values, and unsupervised clustering were used to stratify patients and identify biomarkers. Finally, the ENDANA clusters derived from nasal fluid proteomics were validated using transcriptomic data from the Unbiased Biomarkers for the Prediction of Respiratory Disease Outcomes (U-BIOPRED) cohort.
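
To illustrate what unsupervised clustering into two groups looks like in practice, here is a minimal sketch. It uses plain k-means with k=2 as a stand-in, since this article does not say which clustering algorithm the researchers used. The data are synthetic, and every name (`two_means`, `make_synthetic_cohort`, the group sizes, and the shift between groups) is an assumption made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    """Column-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def two_means(samples, iterations=50):
    """Split samples into clusters 0 and 1 with plain k-means (k=2).

    Seeding is deterministic: the first centroid is sample 0 and the
    second is the sample farthest from it.
    """
    first = samples[0]
    second = max(samples, key=lambda s: squared_distance(s, first))
    centroids = [first, second]
    labels = None
    for _ in range(iterations):
        new_labels = [
            0 if squared_distance(s, centroids[0]) <= squared_distance(s, centroids[1]) else 1
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = mean_vector(members)
    return labels, centroids


def make_synthetic_cohort(n_per_group=30, n_proteins=5, shift=6.0, seed=0):
    """Synthetic 'protein abundance' rows: two well-separated groups (not study data)."""
    rng = random.Random(seed)
    low = [[rng.gauss(0.0, 1.0) for _ in range(n_proteins)] for _ in range(n_per_group)]
    high = [[rng.gauss(shift, 1.0) for _ in range(n_proteins)] for _ in range(n_per_group)]
    return low + high


if __name__ == "__main__":
    cohort = make_synthetic_cohort()
    labels, _ = two_means(cohort)
    print("cluster 0 size:", labels.count(0))
    print("cluster 1 size:", labels.count(1))
```

The algorithm never sees which group a row came from; it recovers the split from the abundance values alone, which is the sense in which the clustering is unsupervised.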

Study findings

About 70% of the study participants had late-onset asthma, 41.67% exhibited fast lung function decline and fixed airway obstruction, 50% were atopic, and 25% experienced near-fatal asthma. The induced sputum cellular profile revealed that 17 participants had eosinophilic asthma, three had paucigranulocytic asthma, four had mixed eosinophilic/neutrophilic asthma, and 36 had neutrophilic asthma.
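
As a quick consistency check, the figures above can be converted back into participant counts for a cohort of 60. The short sketch below only re-derives numbers quoted in this article; the variable names and labels are illustrative.

```python
# Cohort size and sputum-profile counts as reported in the article.
cohort_size = 60
sputum_counts = {
    "eosinophilic": 17,
    "paucigranulocytic": 3,
    "mixed eosinophilic/neutrophilic": 4,
    "neutrophilic": 36,
}

# The four inflammatory profiles should account for every participant.
total = sum(sputum_counts.values())

# Convert the reported percentages back into whole participants.
reported_percentages = {
    "late-onset asthma (approximate)": 70.0,
    "fast decline and fixed obstruction": 41.67,
    "atopic": 50.0,
    "near-fatal asthma": 25.0,
}
implied_participants = {
    label: round(cohort_size * pct / 100)
    for label, pct in reported_percentages.items()
}

if __name__ == "__main__":
    print("sputum profiles sum to", total)
    for label, n in implied_participants.items():
        print(f"{label}: {n} of {cohort_size}")
```

The four sputum profiles add up to the full cohort, and 41.67% corresponds to 25 of 60 participants.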

The 20 most important proteins according to previous publications were then screened using Shapley values, and the nasal proteomics data were clustered into two groups denoted “0” and “1.” The most significant hit was the scaffold protein PPP1R9B, which has been associated with many regulatory pathways but not with asthma.
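
Shapley values attribute a model's output to its inputs by averaging each input's marginal contribution over all subsets of the other inputs. The minimal sketch below computes exact Shapley values for a made-up scoring function over three hypothetical proteins. Neither `toy_score` nor the labels `protein_A`, `protein_B`, and `protein_C` come from the study, and this article does not describe how the researchers implemented the calculation.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function `value(frozenset) -> float`."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight given to coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def toy_score(present):
    """Hypothetical model score for a set of available proteins (illustration only)."""
    score = 0.0
    if "protein_A" in present:
        score += 4.0
    if "protein_B" in present:
        score += 1.0
    # An interaction term: protein_C only matters when protein_A is present.
    if {"protein_A", "protein_C"} <= present:
        score += 2.0
    return score


if __name__ == "__main__":
    proteins = ["protein_A", "protein_B", "protein_C"]
    for name, phi in shapley_values(proteins, toy_score).items():
        print(f"{name}: {phi:.2f}")
```

In this toy case the interaction term is split evenly between `protein_A` and `protein_C`, and the three values sum to the score of the full set. Exact enumeration grows exponentially with the number of features, so practical tools approximate it.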

Cluster 1 patients had fewer nasal polyps, significant airway obstruction, small airway disease, increased oxidative status levels, and decreased diffusing capacity; they were also more likely to be current smokers. The presence of airway thickening did not differ between clusters 1 and 0; however, this may be due to the greater degree of airway resistance, air trapping, and small airway disease observed in cluster 1.

Moreover, 41 differentially expressed proteins allowed the researchers to further distinguish the two clusters. The most efficient discriminative protein was BACH1.
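
One common way to find proteins that differ between two clusters is to compare each protein's abundance across the clusters and rank the proteins by the strength of the difference. The sketch below uses Welch's t statistic as a stand-in, since this article does not say which statistical test the researchers applied. The abundance values, cluster labels, and protein names are all made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_proteins(expression, labels):
    """Rank proteins by |t| between cluster 1 and cluster 0.

    `expression` maps a protein name to one value per patient;
    `labels` holds the cluster (0 or 1) of each patient, in the same order.
    """
    scores = {}
    for protein, values in expression.items():
        cluster0 = [v for v, lab in zip(values, labels) if lab == 0]
        cluster1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[protein] = welch_t(cluster1, cluster0)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up abundances for eight hypothetical patients (not study data).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expression = {
    "protein_A": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
    "protein_B": [2.0, 2.5, 1.5, 2.2, 2.1, 2.4, 1.6, 2.3],
    "protein_C": [5.0, 4.8, 5.2, 5.1, 4.0, 4.3, 3.9, 4.1],
}
ranking = rank_proteins(expression, labels)

if __name__ == "__main__":
    for protein, t in ranking:
        print(f"{protein}: t = {t:.2f}")
```

A positive statistic means higher abundance in cluster 1 and a negative one means lower abundance. A real analysis would also convert the statistics to p-values and correct for testing many proteins at once.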

Two severe asthma clusters, X1 and X2, were also identified in the U-BIOPRED cohort. These clusters were derived from transcriptomic data to closely match the ENDANA clusters, which were based on proteomic data.

Cluster X2 patients had a different profile of clinical variables as compared to cluster X1 patients. Comparatively, patients in cluster X1 exhibited similar characteristics to those in ENDANA cluster 1, including fewer nasal polyps, reduced lung function, significant airway obstruction, small airway disease, increased airway resistance, and air trapping. A total of 32 pathways were differentially enriched between clusters X2 and X1.

Taken together, this cross-validation approach confirmed that the same ML model could be used to identify a similar cluster of asthma patients in both nasal fluid proteomic data and nasal brushing transcriptomic data.

The 41 differentially expressed ENDANA proteins were associated with pathways involved in intracellular signal transduction and cytokine receptor binding. Various immune responses to infection were also identified in this analysis, including those involving interferon (IFN), tumor necrosis factor (TNF), toll-like receptor (TLR), and interleukin 10 (IL-10).

Of these 41 proteins, 28 were similarly dysregulated in both the U-BIOPRED X1 and X2 clusters. This suggests that the molecular features in both the ENDANA and U-BIOPRED clusters were closely related.

Notably, five of the 41 identified proteins overlapped those identified in previous studies, all of which were highly enriched in U-BIOPRED patients diagnosed with severe neutrophilic asthma. Interestingly, these proteins were also highly enriched in patients with mixed granulocytic, neutrophilic, and inflammasome-driven phenotypes. Although further studies are needed, the identification and measurement of these five proteins in nasal samples may offer a non-invasive approach to diagnosing severe neutrophilic asthma.


Conclusions

The ML approach used in the current study successfully identified patients with severe asthma from both proteomic and transcriptomic data, with overlapping features observed in both cohorts. This proof-of-concept study extends previous findings by showing that differentially expressed proteins or genes in nasal samples can be used to identify distinct asthma subtypes, which could support both diagnosis and the development of targeted treatments.


Limitations

The sample size of the current study was small. Furthermore, the study findings cannot be generalized, as both the ENDANA and U-BIOPRED data sets mostly comprised individuals of White European descent.

Journal reference:

  • Agache, I., Shamji, M. H., Kermani, N. Z., et al. (2023). Multidimensional endotyping using nasal proteomics predicts molecular phenotypes in the asthmatic airways. Journal of Allergy and Clinical Immunology. doi:10.1016/j.jaci.2022.06.028.
