Authors: Zhang, Haocheng; Ai, Jing-Wen; Yang, Wenjiao; Zhou, Xian; He, Fusheng; Xie, Shumei; Zeng, Weiqi; Li, Yang; Yu, Yiqi; Gou, Xuejing; Li, Yongjun; Wang, Xiaorui; Su, Hang; Xu, Teng; Zhang, Wenhong

Link to paper:

Journal/ Pre-Print: Clinical Infectious Diseases

Tags: Bioinformatics, Immunology/Immunity, Microbiology

Research Highlights

1. Compared with SARS-CoV-2 negative pneumonia patients, the sputum microbiota of SARS-CoV-2 positive patients shows reduced alpha diversity.

2. A host transcriptional signature shows 36 differentially expressed genes in nasal swabs and sputum specimens of SARS-CoV-2 positive vs SARS-CoV-2 negative pneumonia patients.

3. Integration of this host transcriptional signature into a classifier leads to improved COVID-19 diagnosis vs assessing SARS-CoV-2 positivity alone.


This study characterises the transcriptional profile of the microbiota and host responses in either nasopharyngeal or sputum samples (but not both) from COVID-19 positive vs negative pneumonia patients. Microbiome analysis determined that SARS-CoV-2 positivity was associated with reduced alpha diversity in sputum, but not nasopharyngeal, samples. Analysis of the host response led to the identification of 36 genes differentially expressed in both the nasopharynx and sputum samples. Regression of this signature against infection status led to the production of a classifier with improved ability to predict SARS-CoV-2 infection and disease severity. The robustness of these results is brought into doubt by failure to adequately report coverage statistics for the host and metatranscriptome.

Impact for SARS-CoV2/COVID19 research efforts

· Understand the role of the nasal or sputum microbiota and simultaneous host response to SARS-CoV2/COVID19

· Develop diagnostic tools for SARS-CoV2/COVID19

Study Type

· In silico study / bioinformatics study

· Patient Case study

Strengths and limitations of the paper


The first study characterizing the microbiome and host response simultaneously by applying metatranscriptomic analysis on nasal and sputum samples.

Standing in the field:

This study provides evidence of a distinct sputum microbiome in SARS-CoV-2 positive vs. negative pneumonia patients, which is accompanied by changes in the host transcriptional signature. However, sequence coverage of the microbiome appears to be highly variable and is not adequately reported in the text, making it very difficult to assess the robustness and accuracy of the microbial taxa detected. Changes in the lung microbiome of BAL samples from pneumonia patients vs healthy controls have been reported previously, although no characteristics separating SARS-CoV-2 pneumonia from other (in this case community-acquired) pneumonia were identified (

Appropriate statistics:

The number of sequence reads mapping to the (non-viral) microbiome is not reported. The variation in the number of reads mapping to the SARS-CoV-2 reference genome (from 2 to 19 million) suggests that microbiome coverage could also be highly variable. The reported differences in alpha diversity could therefore be an artefact of inadequate microbiome sampling that varied between individuals.

Similarly, no reporting of the number of reads mapping to the host genome means it is difficult to assess the robustness of the detection of genes selected for inclusion in the elastic-net regression. 
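The concern about alpha diversity and uneven read depth can be illustrated with a minimal sketch. It is not the authors' pipeline: the taxon names, abundances, and read depths below are hypothetical, and rarefaction to a common depth is just one commonly used way to make samples comparable. Two samples are drawn from the same simulated community at different depths, and the deeper one is subsampled to the shallower depth before the Shannon index is computed.

```python
import math
import random
from collections import Counter


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a taxon count table."""
    pool = [taxon for taxon, c in counts.items() for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(pool, depth))


# Hypothetical community: a few dominant taxa and a tail of rare ones.
rng = random.Random(0)
weights = [500, 200, 100, 50, 20, 10, 5, 2, 1, 1]
taxa = [f"taxon_{i}" for i in range(len(weights))]

# The same community sequenced deeply and shallowly (depths are assumptions).
deep = Counter(rng.choices(taxa, weights=weights, k=20000))
shallow = Counter(rng.choices(taxa, weights=weights, k=200))

# Compare the raw samples, then the deep sample rarefied to the shallow depth.
deep_rarefied = rarefy(deep, sum(shallow.values()), rng)
for name, sample in [("deep", deep), ("shallow", shallow),
                     ("deep, rarefied", deep_rarefied)]:
    print(name, observed_richness(sample), round(shannon(sample), 3))
```

Rare taxa tend to be missed in the shallow sample, so unequal depth alone can lower observed richness even when the underlying community is identical. Reporting per-sample microbial read counts would let readers judge whether this applies to the study.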

Viral model used: -

Human SARS-CoV-2 vs non-Covid-19 pneumonia patients. Suspected Covid-19 cases defined based on clinical and epidemiological data. Confirmed Covid-19 cases defined by detection of SARS-CoV-2 RNA by RT-PCR or next generation sequencing.


(If verified) the host genetic signature harbours potential to be used, in combination with the detection of SARS-CoV-2 RNA, as a diagnostic tool with high sensitivity and specificity.

Main limitations:

· The sample sizes actually analysed are small:

o 38 SARS-CoV-2 positive and 75 SARS-CoV-2 negative patients

o After further subdivision into nasal swabs (NS) and sputum samples (SP), only 24 NS and 14 SP samples from SARS-CoV-2 positive patients were analysed, compared with 36 NS and 39 SP samples from SARS-CoV-2 negative patients.

· Healthy age-matched controls, which are vital for developing diagnostic tools, are missing.

· It is unclear if the classifier created applying the 36 DEGs relies on the variables of age and sex, as excluding both variables (supplementary graph) shows a similar diagnostic performance.

· Reporting of ‘genes expressed’ by SARS-CoV-2 takes no account of the replication strategies employed by coronaviruses (i.e. positive-sense RNA viruses).

· Sequencing of the metatranscriptome appears to be too shallow (10 million single-end 75 bp reads) for robust detection of the nasopharyngeal microbiome. This is potentially exacerbated by the fact that the sequenced libraries may also contain low-complexity carrier RNA added during the viral RNA extraction protocol.
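
To make the depth concern in the last point concrete, the sketch below estimates how many reads a single taxon would be expected to receive in a library of 10 million reads. Only the read count and read length come from the review; the microbial fraction, relative abundance, and detection threshold are hypothetical placeholders, and the Poisson model of read sampling is a simplifying assumption.

```python
import math


def expected_taxon_reads(total_reads, microbial_fraction, relative_abundance):
    """Expected reads for one taxon, given the share of reads that are microbial."""
    return total_reads * microbial_fraction * relative_abundance


def prob_at_least(k, expected):
    """P(X >= k) for X ~ Poisson(expected)."""
    below = sum(math.exp(-expected) * expected ** i / math.factorial(i)
                for i in range(k))
    return 1.0 - below


TOTAL_READS = 10_000_000   # from the review: 10 million single-end reads
READ_LENGTH = 75           # from the review: 75 bp
print("total bases:", TOTAL_READS * READ_LENGTH)

# Hypothetical scenarios: (microbial fraction of reads, taxon relative abundance).
MIN_READS = 10             # hypothetical threshold for calling a taxon present
for fraction, abundance in [(0.01, 1e-2), (0.01, 1e-4), (0.001, 1e-4)]:
    lam = expected_taxon_reads(TOTAL_READS, fraction, abundance)
    print(fraction, abundance, round(lam, 2),
          round(prob_at_least(MIN_READS, lam), 4))
```

Under these assumed values the expected count for a rare taxon drops to a handful of reads or fewer once host and carrier RNA take up most of the library. This is why the number of reads mapping to the microbiome needs to be reported for each sample.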