Serratus is a large-scale viroinformatics platform for uncovering the total genetic diversity of Earth's virome. The project began with the goal of uncovering novel coronaviruses[1] that may have been incidentally sequenced by other researchers, and later expanded to encompass all RNA viruses, that is, those that encode a viral RNA-dependent RNA polymerase (RdRp).
Stable release | v210110 / January 10th 2023 |
---|---|
Operating system | Linux, web-based |
Type | Bioinformatics |
License | Code: GPLv3; data: CC0 |
Website | serratus |
By the end of 2020, approximately 15,000 distinct RNA virus sequences were known from public databases, counted as distinct RdRp sequences (two RdRp are treated as distinct when their amino acid sequences differ by more than 10%). Using a bioinformatics workflow optimized for large-scale cloud computing, the research team analyzed 5.7 million freely available sequencing datasets (20.4 petabytes of raw data) in the Sequence Read Archive (SRA) in only 11 days, at a computing cost of US$23,900.[2] This analysis yielded 132,000 novel viral RdRp sequences, representing nearly an order of magnitude increase in the known genetic diversity of RNA viruses.[3]
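The scale of these figures is easier to appreciate with some back-of-the-envelope arithmetic. The Python sketch below uses only the numbers quoted above; the variable names are illustrative, and treating "11 days" as exactly 11 × 24 hours of continuous processing is an assumption made for the estimate.

```python
# Figures quoted in the article text.
datasets = 5_700_000        # sequencing datasets analyzed
raw_petabytes = 20.4        # raw data volume
days = 11                   # wall-clock duration (assumed to be 11 full days)
cost_usd = 23_900           # total computing cost
known_rdrp = 15_000         # approximate distinct RdRp known by end of 2020
novel_rdrp = 132_000        # novel RdRp reported by the analysis

# Average cost of processing one dataset.
cost_per_dataset = cost_usd / datasets

# Average throughput, assuming continuous processing for the whole period.
datasets_per_second = datasets / (days * 24 * 60 * 60)
petabytes_per_day = raw_petabytes / days

# Ratio of known diversity after the analysis to known diversity before it.
fold_increase = (known_rdrp + novel_rdrp) / known_rdrp

print(f"Cost per dataset:  ${cost_per_dataset:.4f}")
print(f"Throughput:        {datasets_per_second:.1f} datasets/s, "
      f"{petabytes_per_day:.2f} PB/day")
print(f"Fold increase:     {fold_increase:.1f}x")
```

Under these assumptions the average cost comes to well under one US cent per dataset, and the roughly 9.8-fold increase is what the text describes as nearly an order of magnitude.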
Within the database, RNA viruses are classified according to their RdRp palmprint,[4] a type of molecular barcode. The palmprint serves as a computationally efficient index for identifying which SRA sequencing runs contain a particular RNA virus. Such an index allows targeted analysis of the raw sequencing datasets, from which novel RNA viruses can then be characterized.[5]
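The idea of using a barcode as an index can be illustrated with a simple inverted index that maps each palmprint to the set of sequencing runs in which it was detected. The following Python sketch is only a conceptual illustration and not Serratus's actual data model or API; the function names, run identifiers, and palmprint identifiers are all hypothetical placeholders.

```python
from collections import defaultdict


def build_palmprint_index(hits):
    """Build an inverted index: palmprint ID -> set of run IDs.

    `hits` is an iterable of (run_id, palmprint_id) pairs, one pair per
    detection of a palmprint in a sequencing run.
    """
    index = defaultdict(set)
    for run_id, palmprint_id in hits:
        index[palmprint_id].add(run_id)
    return index


def runs_containing(index, palmprint_id):
    """Return the sorted run IDs in which the given palmprint was detected."""
    return sorted(index.get(palmprint_id, ()))


# Hypothetical detections; these identifiers are made up for illustration.
hits = [
    ("RUN_A", "pp_1"),
    ("RUN_B", "pp_1"),
    ("RUN_B", "pp_2"),
    ("RUN_C", "pp_3"),
]

index = build_palmprint_index(hits)
print(runs_containing(index, "pp_1"))  # ['RUN_A', 'RUN_B']
```

With such an index, a lookup by palmprint returns only the runs worth re-analyzing, so the raw data for every other run never needs to be touched.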
All Serratus data are freely available under the INSDC release policy.
References
1. Pennisi, Elizabeth. "New dangers? Computers uncover 100,000 novel viruses in old genetic data". www.science.org. Science. Retrieved 13 January 2023.
2. Pelley, Lauren. "Supercomputer helps Canadian researcher uncover thousands of viruses that could cause human diseases". CBC. Retrieved 13 January 2023.
3. Edgar RC, Taylor J, Lin V, Altman T, Barbera P, Meleshko D, et al. (2022). "Petabase-scale sequence alignment catalyses viral discovery". Nature. 602 (7895): 142–147. Bibcode:2022Natur.602..142E. doi:10.1038/s41586-021-04332-2. PMID 35082445. S2CID 221141152.
4. Babaian, Artem; Edgar, Robert (13 October 2022). "Ribovirus classification by a polymerase barcode sequence". PeerJ. 10: e14055. doi:10.7717/peerj.14055. ISSN 2167-8359. PMC 9573346. PMID 36258794.
5. Cabrera Mederos, Dariel; Debat, Humberto; Torres, Carolina; Portal, Orelvis; Jaramillo Zapata, Margarita; Trucco, Verónica; Flores, Ceferino; Ortiz, Claudio; Badaracco, Alejandra; Acuña, Luis; Nome, Claudia; Quito-Avila, Diego; Bejerman, Nicolas; Castellanos Collazo, Onias; Sánchez-Rodríguez, Aminael; Giolitti, Fabián (October 2022). "An Unwanted Association: The Threat to Papaya Crops by a Novel Potexvirus in Northwest Argentina". Viruses. 14 (10): 2297. doi:10.3390/v14102297. ISSN 1999-4915. PMC 9610017. PMID 36298852.