New methods for the discovery of natural products from understudied and uncultivated bacterial phyla
Natural products and their derivatives account for the majority of approved drugs. New antibiotics are urgently needed to combat rising antibiotic resistance and the deaths it causes. Traditional cultivation-based approaches, like the Waksman platform, led to the discovery of several antibiotics and other drugs but were ultimately hampered by rediscovery of the same molecules. The marine environment, and in particular marine sponges, has long been a source of natural products, including approved drugs and drugs currently undergoing clinical development. Bacteria associated with marine sponges, rather than the eukaryotic host itself, are often responsible for producing these natural products. This work used metagenomic sequencing and metagenomic libraries to identify, analyse and target putative natural products encoded in the genomes of bacteria, mainly those associated with marine sponges. Initially, a bioinformatic workflow (MetaSing) was developed to analyse metagenomic sequencing datasets, with the main aim of attributing biosynthetic gene clusters (BGCs) to individual metagenome-assembled genomes (MAGs). MetaSing was used to analyse 16 sponge metagenomes from various geographic locations, resulting in the identification of 643 MAGs and 2,670 BGCs, 70.79% of which were attributed to a MAG. Phyla identified as encoding significant biosynthetic potential include Proteobacteria and Nitrospirota as well as lesser-studied phyla that are difficult to cultivate, such as the Candidatus Latescibacterota, Verrucomicrobia and Acidobacteriota. A metagenomic cosmid library derived from the marine sponge M. hentscheli was then sequenced using Illumina short reads and Nanopore long reads to quantify how much of the biosynthetic potential it captured and to develop a method for efficient recovery of BGCs.
This analysis demonstrated that while close to half of the BGCs are completely captured in the cosmid library, several are incompletely captured or not captured at all. Conversely, novel BGCs not identified from metagenomic sequencing were found in the cosmid library sequencing data. The BGC recovery method was then applied to recover three acidobacterial BGCs from two marine sponge cosmid libraries and 11 lasso peptide BGCs from New Zealand soil libraries. Isolated cosmids were retrofitted and conjugated into heterologous expression hosts, but only one BGC elicited putative production of non-native compounds.
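
The attribution step summarised above, in which each BGC is assigned to a MAG and the attributed fraction is reported, can be illustrated with a minimal sketch. It assumes that attribution is decided by contig membership, that is, a BGC belongs to a MAG when the contig carrying it was binned into that MAG. This is an illustrative assumption and not a description of how MetaSing is implemented. The function names, identifiers and toy data are all hypothetical.

```python
from collections import Counter


def attribute_bgcs(bgc_to_contig, contig_to_mag):
    """Assign each BGC to the MAG containing its contig (None if unbinned).

    Assumption: attribution is decided purely by contig membership.
    """
    return {bgc: contig_to_mag.get(contig) for bgc, contig in bgc_to_contig.items()}


def attribution_rate(attribution):
    """Percentage of BGCs that were attributed to some MAG."""
    if not attribution:
        return 0.0
    attributed = sum(1 for mag in attribution.values() if mag is not None)
    return 100.0 * attributed / len(attribution)


def bgcs_per_mag(attribution):
    """Count attributed BGCs for each MAG, ignoring unattributed ones."""
    return Counter(mag for mag in attribution.values() if mag is not None)


# Hypothetical toy data: four BGCs, one of them on an unbinned contig.
bgc_to_contig = {
    "bgc_1": "contig_a",
    "bgc_2": "contig_b",
    "bgc_3": "contig_c",
    "bgc_4": "contig_d",
}
contig_to_mag = {
    "contig_a": "MAG_1",
    "contig_b": "MAG_1",
    "contig_c": "MAG_2",
}

attribution = attribute_bgcs(bgc_to_contig, contig_to_mag)
rate = attribution_rate(attribution)
counts = bgcs_per_mag(attribution)
print(f"{rate:.2f}% of BGCs attributed to a MAG")
```

Under the same assumption that the percentage is attributed BGCs divided by total BGCs, the reported 70.79% of 2,670 BGCs corresponds to roughly 1,890 attributed BGCs.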