Comparative genomic analysis of Fusobacterium nucleatum reveals high intra-species diversity and cgmlst marker construction

Abstract

Background

Fusobacterium nucleatum is one of the most important anaerobic opportunistic pathogens in the oral and intestinal tracts of humans and animals. It can cause various diseases, including infections, Lemierre's syndrome, oral cancer and colorectal cancer. Comparative genomic studies of this species at the population level have not been reported.

Results

We analyzed all publicly available Fusobacterium nucleatum genomic data in a comparative genomic study, focusing on pan-genomic features, virulence genes and plasmid genomes, and developed cgMLST molecular markers. We found that the pan-genome shows a clear open tendency and that most plasmids in Fusobacterium nucleatum are transmitted mainly intraspecifically.

Conclusions

Our comparative analysis of Fusobacterium nucleatum systematically revealed the open pan-genomic features of the species and produced a phylogenetic tree based on cgMLST molecular markers. In addition, we identified the common plasmid types among the genomes. We hope that our study will provide a theoretical basis for subsequent functional studies.

Introduction

Fusobacterium nucleatum is a Gram-negative bacterium and one of the most important anaerobic opportunistic pathogens in humans and animals; it is mainly found in the oral and intestinal tracts [1]. It can cause periodontal disease, acute necrotizing gingivitis, oral cancer, ulcerative colitis, Crohn's disease and colorectal cancer, and can even change the local inflammatory environment [2]. It can lead to overgrowth of non-functional tissues, hence the name "oncobacterium" [3]. It is highly toxic because it produces lipopolysaccharides (LPS), endotoxins and haemolysins [4]. Although it is part of the normal microbiota of human tissues, it can invade tissues following surgical or accidental trauma, oedema, hypoxia and/or tissue destruction, and is highly pathogenic [5].

The biological functions of Fusobacterium nucleatum are currently being studied in depth. It is one of the few non-spore-forming anaerobic species that obtain energy from amino acid catabolism, using glutamate, histidine and aspartate [6, 7]. By consuming amino acids and releasing ammonia, its metabolism raises the pH of its local environment, thereby enabling the growth of acid-sensitive bacteria such as Porphyromonas gingivalis [8]. Fusobacterium nucleatum has an outer membrane with a large number of proteins on its outer cell surface, and specific interactions occur between the bacterium and various complementary structures on the surface of the host cell [9]. This adhesion, which is mediated by adhesion factors, is important for colonisation and the establishment of infection in susceptible hosts. Fusobacterium adhesin A (FadA) is an adhesion protein that has been shown to be required for bacterial attachment to and invasion of the gingival epithelium and endothelium [10]. It is conserved among Fusobacterium species that inhabit the oral mucosa and is important for cell binding [11]. Fusobacterium nucleatum has also been shown to be an important contributor to oral biofilm development [11]. In addition to oral diseases, Fusobacterium nucleatum has been reported to be associated with a variety of intestinal diseases [3]. A meta-analysis showed that Fusobacterium nucleatum DNA was more likely to be detected in colorectal tumour tissue than in adjacent healthy tissue or control tissue [12]. Its DNA levels were also higher in colorectal polyp tissue than in healthy tissue from the control group [12]. In another study, Fusobacterium nucleatum was shown to mediate the development of colon cancer and the concomitant metastasis of the tumour [13].

In summary, studies of Fusobacterium nucleatum have focused on biological mechanisms, whereas comparative genomic studies of this species, particularly at the population level, have not been reported. Moreover, because the PubMLST database does not contain MLST gene markers for Fusobacterium nucleatum, high-resolution cgMLST molecular markers need to be developed for this species.

In this study, we collected all publicly available Fusobacterium nucleatum genomic data for a comparative genomic study. We focused on the pan-genomic features, virulence genes and plasmid genomes of the species and developed cgMLST molecular markers for it, with the aim of providing a theoretical basis for subsequent identification and in-depth functional studies of Fusobacterium nucleatum.

Materials and methods

Public data acquisition and quality control

The Fusobacterium nucleatum genomes analyzed in this study were all downloaded from the NCBI database (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Fusobacterium_nucleatum/). Phenotype information was also obtained from the NCBI database. Assembly quality and core gene content were evaluated using QUAST (version 5.2.0) [14] and BUSCO (version 5.4.3) [15], respectively, both with default parameters. Genomes with > 90% completeness, < 5% contamination and < 500 scaffolds were retained.
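The three thresholds above amount to a simple per-genome filter. The following is a minimal sketch, assuming the QUAST and BUSCO metrics have already been collected into one record per genome; the record type, its field names and the example values are hypothetical and are not taken from the authors' files.

```python
from dataclasses import dataclass


@dataclass
class GenomeQC:
    accession: str        # hypothetical genome identifier
    completeness: float   # percent of expected core genes found
    contamination: float  # percent
    scaffolds: int        # number of scaffolds in the assembly


def passes_qc(g, min_completeness=90.0, max_contamination=5.0, max_scaffolds=500):
    """Apply the three strict thresholds stated in the text."""
    return (
        g.completeness > min_completeness
        and g.contamination < max_contamination
        and g.scaffolds < max_scaffolds
    )


def filter_genomes(genomes):
    """Keep only the genomes that pass every threshold."""
    return [g for g in genomes if passes_qc(g)]


# Invented example records, for illustration only.
example = [
    GenomeQC("genome_A", 98.5, 0.4, 12),
    GenomeQC("genome_B", 88.0, 0.2, 30),    # fails completeness
    GenomeQC("genome_C", 99.1, 1.0, 650),   # fails scaffold count
]
print([g.accession for g in filter_genomes(example)])
```

The comparisons are strict, so a genome sitting exactly on a threshold (for example 90% completeness) is excluded.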

Genome annotation and pan-genome construction

The filtered genomes were used to construct a pan-genome of Fusobacterium nucleatum. The genomes were first annotated using Prokka (version 1.14.6) [16], and the pan-genome was then constructed using Roary (version 3.10.2) [17] from the genome annotation files (gff3 files). Both Prokka and Roary were run with default parameters; for Roary, this means 95% identity for blastp and a gene must be present in 99% of isolates to be classified as core. Core and unique genes were classified with Roary, and the pan-genome plot was drawn using PanGP [18]. The core gene set and the cgMLST molecular markers were derived from the gene_presence_absence.csv file in the Roary output. Functional annotation of the core gene set was performed using KAAS [19] (https://www.genome.jp/tools/kaas/). The phylogenetic tree was built from the ‘core_gene_alignment.aln’ file produced by Roary, using FastTree with 1000 replicates [20], and was visualized using iTOL [21].
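The core/unique classification can be illustrated on a presence-absence table. This is a minimal sketch under stated assumptions: the table is simplified to a mapping from gene family to the set of isolates carrying it (not the actual column layout of gene_presence_absence.csv), the three labels are a simplification of Roary's categories, and the gene and isolate names are hypothetical.

```python
def classify_gene_families(presence, n_isolates, core_fraction=0.99):
    """Label each gene family as core, unique or accessory.

    presence: dict mapping gene family name -> set of isolate names.
    core_fraction: minimum fraction of isolates for a core gene
    (0.99 mirrors the 99% threshold stated in the text).
    """
    classes = {}
    for gene, isolates in presence.items():
        count = len(isolates)
        if count >= core_fraction * n_isolates:
            classes[gene] = "core"
        elif count == 1:
            classes[gene] = "unique"
        else:
            classes[gene] = "accessory"
    return classes


# Toy table with four hypothetical isolates.
toy_presence = {
    "geneA": {"s1", "s2", "s3", "s4"},  # in all isolates
    "geneB": {"s1", "s2", "s3"},        # in 3 of 4 isolates
    "geneC": {"s4"},                    # in a single isolate
}
toy_classes = classify_gene_families(toy_presence, n_isolates=4)
print(toy_classes)
```

With only four isolates, the 99% threshold requires presence in all of them; with 93 isolates it tolerates no missing isolate either, since 0.99 × 93 is above 92.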

Key genes prediction and evolutionary tree construction

Virulence genes were predicted using blastp against the VFDB database [22] (http://www.mgc.ac.cn/VFs/). Resistance genes were predicted using the CARD database [23] (http://arpcard.mcmaster.ca). The blastp thresholds were an E-value of 1e-5, similarity of 60%, query coverage (qcov) of 60% and target coverage (tcov) of 60%. The FadA protein sequence was extracted based on the Prokka annotation results. Multiple sequence alignment of the FadA protein sequences was performed using MAFFT, the evolutionary tree was constructed using MEGA [24], and the tree was annotated using iTOL.
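The blastp thresholds can be applied to tabular BLAST output as a post-filter. The sketch below assumes the search was written with the custom tabular format `6 qseqid sseqid pident length qlen slen qstart qend sstart send evalue bitscore`; this column choice is an assumption, as is the use of percent identity (`pident`) as a stand-in for the "similarity" threshold, and the example hits are invented.

```python
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "qlen", "slen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_FIELDS = ("length", "qlen", "slen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_hit(line):
    """Turn one tab-separated BLAST line into a typed dictionary."""
    hit = dict(zip(FIELDS, line.rstrip("\n").split("\t")))
    for key in INT_FIELDS:
        hit[key] = int(hit[key])
    for key in FLOAT_FIELDS:
        hit[key] = float(hit[key])
    return hit


def keep_hit(hit, max_evalue=1e-5, min_ident=60.0, min_qcov=60.0, min_tcov=60.0):
    """Return True if the hit meets all four thresholds."""
    qcov = 100.0 * (hit["qend"] - hit["qstart"] + 1) / hit["qlen"]
    tcov = 100.0 * (hit["send"] - hit["sstart"] + 1) / hit["slen"]
    return (
        hit["evalue"] <= max_evalue
        and hit["pident"] >= min_ident
        and qcov >= min_qcov
        and tcov >= min_tcov
    )


# Invented example hits: only the first one passes every threshold.
lines = [
    "prot1\tVF_1\t85.0\t500\t540\t545\t10\t509\t12\t511\t1e-120\t900",
    "prot2\tVF_2\t72.0\t100\t400\t380\t1\t100\t5\t104\t1e-20\t150",   # low coverage
    "prot3\tVF_3\t45.0\t300\t310\t320\t1\t300\t1\t300\t1e-30\t200",   # low identity
]
kept = [h["qseqid"] for h in map(parse_hit, lines) if keep_hit(h)]
print(kept)
```

Requiring coverage on both the query and the target guards against short local matches to a single domain of a longer virulence factor.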

Plasmid genome prediction and genomic analysis

Plasmer (https://github.com/nekokoe/plasmer) and Platon [25] were used to predict plasmids in the whole genome sequences of Fusobacterium nucleatum that passed quality control. The predicted sequences were verified by BLAST against the NCBI non-redundant nucleotide database. Circular maps of the plasmids were drawn using CGView [26].

Results

Summary of genomic data

A total of 105 Fusobacterium nucleatum genome sequences were downloaded from NCBI, quality controlled and evaluated for core gene content. After filtering, 93 genomes were selected for further analysis (Additional file 5: Table S1). The number of scaffolds ranged from 1 to 379, N50 values ranged from 8,680 bp to 2,653,055 bp, and the average genome size was 2,369,555 bp. We collected the metadata of the downloaded strains (Additional file 6: Table S2) and found that, leaving aside the majority of strains with unknown phenotypes, Fusobacterium nucleatum was isolated mainly from the oral cavity (N = 33) and the intestine (N = 16).
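For readers unfamiliar with the N50 statistic reported above, the sketch below shows the standard definition: the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the total assembly length. The scaffold lengths in the example are invented.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths (in bp)."""
    total = sum(lengths)
    cumulative = 0
    for length in sorted(lengths, reverse=True):
        cumulative += length
        # Stop at the first scaffold that brings the running sum
        # to at least half of the total assembly length.
        if 2 * cumulative >= total:
            return length
    raise ValueError("lengths must contain at least one scaffold")


toy_lengths = [100, 80, 50, 30, 20]   # total 280, half is 140
print(n50(toy_lengths))               # 100 + 80 = 180 >= 140, so N50 is 80
```

An assembly consisting of a single scaffold has an N50 equal to its own length, which is why highly contiguous assemblies show N50 values close to the genome size.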

Pan-genomic characterization of Fusobacterium nucleatum

A total of 93 Fusobacterium nucleatum strains were included in the pan-genome analysis. The Fusobacterium nucleatum pan-genome contains a total of 21,139 gene families, of which 516 are core (present in more than 95% of the genomes) and 20,623 are variable. As shown in Fig. 1, the pan-genome shows a clear open tendency: new gene families keep emerging as genomes are added, so the size of the pan-genome continues to increase with the number of genomes included in the analysis. The heatmap of the gene presence-absence matrix showed two distinct clades in Fusobacterium nucleatum (Additional file 1: Figure S1).
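The accumulation curves behind such a plot can be sketched as follows. This is an illustration of the general idea rather than the PanGP algorithm: genomes are added in random order, the pan-genome is the running union of gene families and the core genome is the running intersection (strict presence in every genome, a simplification of the thresholds used in the study). The toy genomes and gene names are hypothetical.

```python
import random


def accumulation(genomes, order):
    """Pan- and core-genome sizes as genomes are added in the given order."""
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for name in order:
        genes = genomes[name]
        pan |= genes
        core = set(genes) if core is None else core & genes
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


def mean_curves(genomes, n_permutations=100, seed=0):
    """Average the curves over random genome orderings."""
    rng = random.Random(seed)
    names = sorted(genomes)
    pan_total = [0] * len(names)
    core_total = [0] * len(names)
    for _ in range(n_permutations):
        order = names[:]
        rng.shuffle(order)
        pan_sizes, core_sizes = accumulation(genomes, order)
        for i, (p, c) in enumerate(zip(pan_sizes, core_sizes)):
            pan_total[i] += p
            core_total[i] += c
    return ([p / n_permutations for p in pan_total],
            [c / n_permutations for c in core_total])


toy_genomes = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "e", "f"},
}
print(accumulation(toy_genomes, ["s1", "s2", "s3"]))
```

A pan-genome curve that is still rising at the last genome, as in this toy example, is what is meant by an open pan-genome; a closed pan-genome would flatten out.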

Fig. 1

The pan genome plot of Fusobacterium nucleatum. A. Conserved genes and Total genes. B. New genes and Unique genes

Based on the core gene set, we constructed a set of cgMLST molecular markers (N = 298) for Fusobacterium nucleatum (Additional file 7: Table S3A) and a phylogenetic tree of the 93 strains based on these markers (Additional file 2: Figure S2). The phylogenetic tree showed no obvious clades of Fusobacterium nucleatum and, based on the known metadata, the strains from the oral cavity and from the intestine were scattered across the tree and did not cluster significantly. Functional enrichment of these 298 genes showed that they were mainly derived from the Ribosome and ABC transporters pathways (Additional file 3: Figure S3). Notably, we also attempted to construct separate cgMLST molecular markers for the oral cavity and intestine isolates (Additional file 7: Table S3B, C). The Venn diagram shows that the two marker sets share 384 genes, while the oral cavity (N = 16) and intestine (N = 161) sets each retain a number of specific cgMLST genes (Additional file 4: Figure S4).
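The Venn diagram comparison reduces to set operations on the two marker lists. A minimal sketch, assuming each marker set is available as a list of gene identifiers; the identifiers below are hypothetical placeholders, not genes from Table S3.

```python
def venn_counts(oral_markers, gut_markers):
    """Count shared and set-specific markers for a two-set Venn diagram."""
    oral, gut = set(oral_markers), set(gut_markers)
    return {
        "shared": len(oral & gut),
        "oral_only": len(oral - gut),
        "gut_only": len(gut - oral),
    }


# Hypothetical marker identifiers for illustration.
toy_oral = ["locus_1", "locus_2", "locus_3"]
toy_gut = ["locus_2", "locus_3", "locus_4", "locus_5"]
print(venn_counts(toy_oral, toy_gut))
```

Applied to the two marker sets in Table S3B and S3C, the three counts would correspond to the 384 shared, 16 oral-specific and 161 intestine-specific genes reported in the text.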

Bioinformatic analysis of virulence genes and FadA gene

We examined the virulence genes in the genomes of the 93 Fusobacterium nucleatum strains using the VFDB database (Fig. 2). A total of 11 virulence genes were found. Notably, groEL, clpP and acpXL were present in all 93 strains with a copy number of 1. tufA was present in most strains, while other virulence factors such as cap8E, neuB and wbtE were present in only a small number of strains. We also predicted drug-resistance genes for these strains and found that the majority of Fusobacterium nucleatum strains did not carry them; such genes were present in only a few strains.

Fig. 2

Heatmap of virulence related genes in Fusobacterium nucleatum

We also analyzed the Fusobacterium nucleatum genomes for the FadA gene, which encodes an adhesion protein that is important for cell binding. We found that 90 of the strains contained the FadA gene. A phylogenetic tree constructed from the FadA protein sequences showed three distinct clades, each of which contained strains from both the oral and intestinal tracts (Fig. 3A). We also investigated the regions upstream and downstream of the FadA gene and found that they are relatively conserved across Fusobacterium nucleatum genomes: the FadA gene is flanked by genes encoding an ABC transporter permease and a peptidylprolyl isomerase, with further neighbouring genes such as EnvC and NAD kinase (Fig. 3B).

Fig. 3

Genomic analysis of FadA genes in Fusobacterium nucleatum. A. The phylogenetic tree of FadA genes. B. The genomic structure of FadA genes

Plasmid prediction and genomic analysis of Fusobacterium nucleatum

We used the recently developed plasmid prediction tool Plasmer to predict plasmid sequences in the Fusobacterium nucleatum genomes. In total, we found plasmid sequences in the genomes of 42 strains. We then selected plasmid sequences from high-quality genomes (number of contigs < 3) for subsequent analysis and validated them against the NCBI non-redundant nucleotide database. This yielded 17 strains with relatively complete plasmid sequences. Of these plasmids, 13 are already known; the remaining four are unreported sequences of around 15 kb, which we speculate are newly discovered plasmids (Table 1). Among the known plasmids, five strains carry plasmid type 7–1, while other plasmid types include 4–8, pFN3 and pCT15E1. The 7–1 plasmid is 6.3 kb in size and contains seven protein-coding genes, most of which encode putative proteins; no resistance or virulence genes were identified (Fig. 4).

Table 1 The predicted plasmids of Fusobacterium nucleatum

Fig. 4

Circular map of the 7–1 plasmid

Discussion

Most studies of Fusobacterium nucleatum have focused on its biological functions and on the genomes of individual strains, but the population genomics of this species has not been reported. In this study, the pan-genome of Fusobacterium nucleatum was characterized for the first time, based on the publicly available genomic data of about 100 strains, to provide a panoramic view from a population perspective. The species has 516 core gene families, accounting on average for 23% of the genes in each strain. Given this low proportion of core gene families, and because neither the total number of genes nor the number of unique genes levels off in Fig. 1, the genome appears to be very plastic, and the number of strains currently available may not allow a complete assessment of the overall pan-genomic trend of Fusobacterium nucleatum.

Developing a set of molecular markers for identifying Fusobacterium nucleatum is important, as several studies have reported the association of this species with diseases such as infections, Lemierre's syndrome, oral cancer and colorectal cancer [3]. Because the Fusobacterium nucleatum genome has not been well studied, no previous study has designed MLST molecular markers for this species, and the PubMLST database includes no sequence typing (ST) scheme for it. With the development of sequencing technology, more and more genomic data for Fusobacterium nucleatum will become available. This study is the first attempt to construct cgMLST molecular markers for this species. Compared with traditional MLST markers, cgMLST markers have the advantages of good universality and high resolution. Since the available genomic data and phenotypic information for Fusobacterium nucleatum are currently limited, studies of the clades and their phenotypic associations need to be strengthened.

In this study, the virulence genes of Fusobacterium nucleatum were studied in detail. Three virulence genes, groEL, clpP and acpXL, were found in every strain. groEL has been shown to be involved in the adhesion to or invasion of various target cells or tissues [27]. clpP encodes a serine protease involved in proteolysis [28], while acpXL encodes an acyl carrier protein [29]. They play a role in the adhesion and invasion of Fusobacterium nucleatum. In addition, virulence genes encoding elongation factor Tu, glucose-1-phosphate thymidylyltransferase and type 8 capsular polysaccharide synthesis protein were present in some of the strains. Fusobacterium nucleatum has previously been reported to produce β-lactamases [30], which were not found in our study; this may be related to differences among individual strains and to the number of strains analyzed.

In this study, we used Plasmer, which is available on GitHub (https://github.com/nekokoe/plasmer), to predict plasmids in the Fusobacterium nucleatum genomic data. The results showed that 13 of the high-quality plasmid predictions were identical to plasmids in public databases, indicating that the software is highly accurate. Overall, the plasmid genomes of Fusobacterium nucleatum were on average smaller than 20 kb, with most plasmids coming from this species and few from other bacteria, which may indicate that the plasmids are mainly transmitted intraspecifically. In addition, no resistance or virulence genes were detected in these plasmids.

However, there are some shortcomings in this study. Firstly, only the genomic functions of Fusobacterium nucleatum at strain level are explored, without combining metagenomic data to reveal the abundance of this species in the microbial community and its interactions with other species. In addition, the transcriptional expression of key genes of this species, such as the FadA gene, has not been demonstrated. These issues will be elucidated in subsequent studies.

Conclusion

Our comparative analysis of Fusobacterium nucleatum, based on publicly available data, reveals a distinctly open pan-genome and identifies cgMLST molecular markers for this species. We systematically analyzed the virulence gene profile and focused on the upstream and downstream structure and the evolutionary relationships of the FadA gene. In addition, we predicted the plasmid sequences in Fusobacterium nucleatum and identified the common plasmid types among them. We hope that our study will provide a theoretical basis for subsequent functional studies and clinical applications of Fusobacterium nucleatum.

Availability of data and materials

The data sets analysed during the current study are available in the NCBI database.

References

  1. McIlvanna E, et al. Fusobacterium nucleatum and oral cancer: a critical review. BMC Cancer. 2021;21(1):1212.

  2. Han YW. Fusobacterium nucleatum: a commensal-turned pathogen. Curr Opin Microbiol. 2015;23:141–7.

  3. Brennan CA, Garrett WS. Fusobacterium nucleatum - symbiont, opportunist and oncobacterium. Nat Rev Microbiol. 2019;17(3):156–66.

  4. Wilson M. Biological activities of lipopolysaccharides from oral bacteria and their relevance to the pathogenesis of chronic periodontitis. Sci Prog. 1995;78(Pt 1):19–34.

  5. Fan Z, et al. Fusobacterium nucleatum and its associated systemic diseases: epidemiologic studies and possible mechanisms. J Oral Microbiol. 2023;15(1):2145729.

  6. Boiangiu CD, et al. Sodium ion pumps and hydrogen production in glutamate fermenting anaerobic bacteria. J Mol Microbiol Biotechnol. 2005;10(2–4):105–19.

  7. Bakken V, Hogh BT, Jensen HB. Utilization of amino acids and peptides by Fusobacterium nucleatum. Scand J Dent Res. 1989;97(1):43–53.

  8. Takahashi N, et al. Acid tolerance and acid-neutralizing activity of Porphyromonas gingivalis, Prevotella intermedia and Fusobacterium nucleatum. Oral Microbiol Immunol. 1997;12(6):323–8.

  9. Zhang Z, et al. Porphyromonas gingivalis outer membrane vesicles inhibit the invasion of Fusobacterium nucleatum into oral epithelial cells by downregulating FadA and FomA. J Periodontol. 2022;93(4):515–25.

  10. Rubinstein MR, et al. Fusobacterium nucleatum promotes colorectal carcinogenesis by modulating E-cadherin/beta-catenin signaling via its FadA adhesin. Cell Host Microbe. 2013;14(2):195–206.

  11. Meng Q, et al. Fusobacterium nucleatum secretes amyloid-like FadA to enhance pathogenicity. EMBO Rep. 2021;22(7):e52891.

  12. Mima K, et al. Fusobacterium nucleatum in colorectal carcinoma tissue and patient prognosis. Gut. 2016;65(12):1973–80.

  13. Rubinstein MR, et al. Fusobacterium nucleatum promotes colorectal cancer by inducing Wnt/beta-catenin modulator Annexin A1. EMBO Rep. 2019. https://doi.org/10.15252/embr.201847638.

  14. Gurevich A, et al. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29(8):1072–5.

  15. Seppey M, Manni M, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness. Methods Mol Biol. 2019;1962:227–45.

  16. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30(14):2068–9.

  17. Page AJ, et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015;31(22):3691–3.

  18. Zhao Y, et al. PanGP: a tool for quickly analyzing bacterial pan-genome profile. Bioinformatics. 2014;30(9):1297–9.

  19. Moriya Y, et al. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35:W182-5.

  20. Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE. 2010;5(3):e9490.

  21. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293–6.

  22. Liu B, et al. VFDB 2022: a general classification scheme for bacterial virulence factors. Nucleic Acids Res. 2022;50(D1):D912–7.

  23. Alcock BP, et al. CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic Acids Res. 2020;48(D1):D517–25.

  24. Kumar S, et al. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9.

  25. Schwengers O, et al. Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein sequence-based replicon distribution scores. Microb Genom. 2020. https://doi.org/10.1099/mgen.0.000398.

  26. Grant JR, Stothard P. The CGView Server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 2008;36:W181-4.

  27. Lee HR, et al. Fusobacterium nucleatum GroEL induces risk factors of atherosclerosis in human microvascular endothelial cells and ApoE(-/-) mice. Mol Oral Microbiol. 2012;27(2):109–23.

  28. Gaillot O, et al. The ClpP serine protease is essential for the intracellular parasitism and virulence of Listeria monocytogenes. Mol Microbiol. 2000;35(6):1286–94.

  29. Bourassa DV, et al. The lipopolysaccharide lipid A long-chain fatty acid is important for Rhizobium leguminosarum growth and stress adaptation in free-living and nodule environments. Mol Plant Microbe Interact. 2017;30(2):161–75.

  30. Nord CE. Mechanisms of beta-lactam resistance in anaerobic bacteria. Rev Infect Dis. 1986;8(Suppl 5):S543–8.

Funding

This study received no funding.

Author information

Contributions

QHZ, AD, CS, KXL, SNH, and ZLH performed the experiments, analyzed the data and wrote the manuscript. ZLH conceptualized and designed the study. AD, CS, KXL, SNH provided material and samples. All authors reviewed, edited and approved the manuscript.

Corresponding author

Correspondence to Zilong He.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

The heatmap of gene presence-absence matrix in Fusobacterium nucleatum.

Additional file 2: Figure S2.

The phylogenetic tree of Fusobacterium nucleatum based on cgMLST markers. Red represents mouth isolates, dark green represents gut isolates, blue represents other isolates and green represents unknown isolates.

Additional file 3: Figure S3.

Functional enrichment of cgMLST marker genes.

Additional file 4: Figure S4.

Venn diagram of cgMLST markers from mouth and gut isolates of Fusobacterium nucleatum.

Additional file 5: Table S1.

The basic statistics of Fusobacterium nucleatum genomes.

Additional file 6: Table S2.

Metadata of Fusobacterium nucleatum genomes.

Additional file 7: Table S3.

(A) cgMLST markers of Fusobacterium nucleatum genomes. (B) cgMLST markers of Fusobacterium nucleatum genomes isolated from the mouth. (C) cgMLST markers of Fusobacterium nucleatum genomes isolated from the gut.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhu, Q., Dovletgeldiyev, A., Shen, C. et al. Comparative genomic analysis of Fusobacterium nucleatum reveals high intra-species diversity and cgmlst marker construction. Gut Pathog 15, 43 (2023). https://doi.org/10.1186/s13099-023-00570-z

  • DOI: https://doi.org/10.1186/s13099-023-00570-z