Carbohydrate Structure Database
Carbohydrate Structure Database (CSDB) is a free curated database and service platform in glycoinformatics, launched in 2005[2] by a group of Russian scientists from the N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences. CSDB stores published structural, taxonomic, bibliographic and NMR-spectroscopic data on natural carbohydrates and carbohydrate-related molecules.
Content | |
---|---|
Description | Natural carbohydrate structures with NMR, bibliographic and biological annotations. |
Data types captured | carbohydrate structures and related data |
Organisms | |
Contact | |
Research center | Zelinsky Institute of Organic Chemistry |
Authors | Philip V. Toukach, Ksenia S. Egorova, Yuri A. Knirel, et al. |
Primary citation | Carbohydrate Structure Database [1] |
Release date | 2005 |
Access | |
Website | http://csdb.glycoscience.ru/ |
Download URL | export feature in web-interface |
Tools | |
Web | |
Miscellaneous | |
Versioning | yes |
Data release frequency | annual |
Version | 1 (merged) |
Curation policy | yes (manual and automatic) |
Overview
The main data stored in CSDB are carbohydrate structures of bacterial, fungal, and plant origin. Each structure is assigned to an organism and is provided with the link(s) to the corresponding scientific publication(s), in which it was described. Apart from structural data, CSDB also stores NMR spectra, information on methods used to decipher a particular structure, and some other data.[1][3] CSDB provides access to several carbohydrate-related research tools:
- Simulation of 1D and 2D NMR spectra of carbohydrates (GODDESS: glycan-oriented database-driven empirical spectrum simulation).[4][5][6]
- Automated NMR-based structure elucidation (GRASS: generation, ranking and assignment of saccharide structures).[7]
- Statistical analysis of structural feature distribution in glycomes of living organisms[8][9]
- Generation of optimized atomic coordinates for an arbitrary saccharide[10] and a subdatabase of conformation maps.
- Taxon clustering based on similarities of glycomes (carbohydrate-based tree of life)[8]
- Glycosyltransferase subdatabase (GT-explorer)[11][12]
History and funding
Until 2015, the Bacterial Carbohydrate Structure Database (BCSDB) and the Plant&Fungal Carbohydrate Structure Database (PFCSDB) existed in parallel. In 2015, they were merged into a single Carbohydrate Structure Database (CSDB).[1] The development and maintenance of CSDB have been funded by the International Science and Technology Center (2005-2007), the Russian Federation President grant program (2005-2006), the Russian Foundation for Basic Research (2005-2007, 2012-2014, 2015-2017, 2018-2020), Deutsches Krebsforschungszentrum (short-term in 2006-2010), and the Russian Science Foundation (2018-2020).
Data sources and coverage
The main sources of CSDB data are:
- Scientific publications indexed in dedicated citation databases, including NCBI PubMed and Thomson Reuters Web of Science (approx. 18000 records).
- The CCSD (CarbBank[13]) database (approx. 3000 records).
The data are selected and added to CSDB manually by browsing original scientific publications. The data originating from other databases are subject to error-correction and approval procedures.[14] As of 2017, the coverage of bacteria and archaea is ca. 80% of the carbohydrate structures published in the scientific literature.[1] The time lag between the publication of relevant data and their deposition into CSDB is about 18 months. Plants are covered up to 1997, and fungi up to 2012.[15] CSDB does not cover data from the animalia domain, except unicellular metazoa. There are a number of dedicated databases on animal carbohydrates, e.g. UniCarbKB[16] or GLYCOSCIENCES.de.[17]
CSDB is reported as one of the biggest projects in glycoinformatics.[18][19][20][21][22][23][24] It is employed in structural studies of natural carbohydrates[25][26][27] and in glyco-profiling.[28] The content of CSDB has been used as a data source in other glycoinformatics projects.[29][30][31][32]
Deposited objects
- Molecular structures of glycans, glycopolymers and glycoconjugates: primary structure, aglycon information, polymerization degree and class of molecule. Structural scope includes molecules composed of residues (monosaccharides, alditols, amino acids, fatty acids etc.) linked by glycosidic, ester, amidic, ketal, phospho- or sulpho-diester bonds, in which at least one residue is a monosaccharide or its derivative.
- Bibliography associated with structures: imprint data, keywords, abstracts, IDs in bibliographic databases
- Biological context of structures: associated taxon, strain, serogroup, host organism, disease information. The covered domains are: prokaryotes, plants, fungi and selected pathogenic unicellular metazoa. The database contains only glycans originating from these domains or obtained by chemical modification of such glycans.
- Assigned NMR spectra and experimental conditions.
- Glycosyltransferases associated with taxa: gene and enzyme identifiers, full structures, donor and substrates, methods used to prove enzymatic activity, trustworthiness level.
- References to other databases
- Other data collected from original publications
- Conformation maps of disaccharides derived from molecular dynamics simulations.
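The structural scope stated in the first item of this list can be read as a simple inclusion test. The following Python sketch is only an illustration of that rule as worded above. The names `Residue`, `ALLOWED_BOND_TYPES` and `in_structural_scope` are hypothetical and are not part of CSDB or its software, and the check ignores the separate requirement on biological origin.

```python
from dataclasses import dataclass

# Bond types named in the structural scope; the string labels are illustrative.
ALLOWED_BOND_TYPES = frozenset({
    "glycosidic", "ester", "amidic", "ketal",
    "phosphodiester", "sulphodiester",
})


@dataclass(frozen=True)
class Residue:
    """Hypothetical residue record (not a CSDB data type)."""
    name: str
    is_sugar: bool  # True for a monosaccharide or its derivative


def in_structural_scope(residues, bond_types):
    """Return True if a molecule matches the stated structural scope.

    The molecule must contain at least one monosaccharide (or derivative),
    and every bond between residues must be of a listed type.
    """
    has_sugar = any(r.is_sugar for r in residues)
    bonds_allowed = all(b in ALLOWED_BOND_TYPES for b in bond_types)
    return has_sugar and bonds_allowed


# A sugar esterified with a fatty acid falls inside the scope.
glycolipid = [Residue("Glc", True), Residue("palmitic acid", False)]
print(in_structural_scope(glycolipid, ["ester"]))

# A dipeptide has no monosaccharide residue, so it falls outside the scope.
dipeptide = [Residue("Ala", False), Residue("Gly", False)]
print(in_structural_scope(dipeptide, ["amidic"]))
```

Under these assumptions, the first call reports that the molecule is in scope and the second reports that it is not, because the dipeptide contains no monosaccharide residue.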
Interrelation with other databases
CSDB is cross-linked to other glycomics databases,[33][34] such as MonosaccharideDB, GLYCOSCIENCES.de, NCBI PubMed, NCBI Taxonomy, NLM catalog, International Classification of Diseases 11, etc. Besides a native notation, CSDB Linear,[35] structures are presented in multiple carbohydrate notations (SNFG,[36] SweetDB,[37] GlycoCT,[38] WURCS,[39] GLYCAM,[40] etc.). CSDB is exportable as a Resource Description Framework (RDF) feed according to the GlycoRDF ontology.[41][42]
External links
- CSDB web site
- CSDB usage examples
- CSDB technical documentation
- CSDB Linear (structure encoding notation)
- Carbohydrate databases registered in NAR collection
- Carbohydrate databases in the recent decade (lecture)
References
- ^ a b c d Toukach Ph.V.; Egorova K.S. (2016). "Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts". Nucleic Acids Research. 44 (D1): D1229–D1236. doi:10.1093/nar/gkv840. PMC 4702937. PMID 26286194.
- ^ Toukach F.V.; Knirel Y.A. (2005). "New database of bacterial carbohydrate structures". Glycoconjugate Journal. 22 (4–6): 216–217.
- ^ Harvey D.J. (2015). "Analysis of carbohydrates and glycoconjugates by matrix-assisted laser desorption/ionization mass spectrometry: An update for 2011-2012". Mass Spectrometry Reviews. 36 (3): 255–422. doi:10.1002/mas.21471. PMID 26270629.
- ^ Kapaev R.R.; Egorova K.S.; Toukach Ph.V. (2014). "Carbohydrate structure generalization scheme for database-driven simulation of experimental observables, such as NMR chemical shifts". Journal of Chemical Information and Modeling. 54 (9): 2594–2611. doi:10.1021/ci500267u. PMID 25020143.
- ^ Kapaev R.R.; Toukach Ph.V. (2015). "Improved carbohydrate structure generalization scheme for 1H and 13C NMR simulations". Analytical Chemistry. 87 (14): 7006–7010. doi:10.1021/acs.analchem.5b01413. PMID 26087011.
- ^ Kapaev R.R.; Toukach Ph.V. (2016). "Simulation of 2D NMR Spectra of Carbohydrates Using GODDESS Software". Journal of Chemical Information and Modeling. 56 (6): 1100–1104. doi:10.1021/acs.jcim.6b00083. PMID 27227420.
- ^ Kapaev R.R.; Toukach Ph.V. (2018). "GRASS: semi-automated NMR-based structure elucidation of saccharides". Bioinformatics. 34 (6): 957–963. doi:10.1093/bioinformatics/btx696. PMID 29092007.
- ^ a b Egorova K.S.; Kondakova A.N.; Toukach Ph.V. (2015). "Carbohydrate structure database: tools for statistical analysis of bacterial, plant and fungal glycomes". Database. 2015: ID bav073. doi:10.1093/database/bav073. PMC 4559136. PMID 26337239.
- ^ Herget S.; Toukach Ph.V.; Ranzinger R.; Hull W.E.; Knirel Y.; von der Lieth C.-W. (2008). "Statistical analysis of the Bacterial Carbohydrate Structure Data Base (BCSDB): Characteristics and diversity of bacterial carbohydrates in comparison with mammalian glycans". BMC Structural Biology. 8: ID 35. doi:10.1186/1472-6807-8-35. PMC 2543016. PMID 18694500.
- ^ Chernyshov I.Y.; Toukach Ph.V. (2018). "REStLESS: Automated Translation of Glycan Sequences from Residue-Based Notation to SMILES and Atomic Coordinates". Bioinformatics. 34 (15): 2679–2681. doi:10.1093/bioinformatics/bty168. PMID 29547883.
- ^ Toukach Ph.V.; Egorova K.S. (2017). "CSDB_GT: a new curated database on glycosyltransferases". Glycobiology. 27 (4): 285–290. doi:10.1093/glycob/cww137. PMID 28011601.
- ^ Egorova K.S.; Knirel Y.A.; Toukach Ph.V. (2019). "Expanding CSDB_GT glycosyltransferase database with Escherichia coli". Glycobiology. 29 (4): 285–287. doi:10.1093/glycob/cwz006. PMID 30759212.
- ^ Doubet S.; Albersheim P. (1992). "CarbBank". Glycobiology. 2 (6): 505–507. doi:10.1093/glycob/2.6.505. PMID 1472756.
- ^ Egorova K.S.; Toukach Ph.V. (2012). "Critical analysis of CCSD data quality". Journal of Chemical Information and Modeling. 52 (11): 2812–2814. doi:10.1021/ci3002815. PMID 23025661.
- ^ Egorova K.S.; Toukach Ph.V. (2013). "Expansion of coverage of Carbohydrate Structure Database (CSDB)". Carbohydrate Research. 389: 112–114. doi:10.1016/j.carres.2013.10.009. PMID 24680503.
- ^ Campbell M.P.; Packer N.H. (2016). "UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations". Biochimica et Biophysica Acta (BBA) - General Subjects. 1860 (8): 1669–1675. doi:10.1016/j.bbagen.2016.02.016. PMID 26940363.
- ^ Lütteke T.; Bohne-Lang A.; Loss A.; Goetz T.; Frank M.; von der Lieth C.-W. (2006). "GLYCOSCIENCES.de: an Internet portal to support glycomics and glycobiology research". Glycobiology. 16 (5): 71R–81R. doi:10.1093/glycob/cwj049. PMID 16239495.
- ^ Rigden D.J.; Fernández-Suárez X.M.; Galperin M.Y. (2016). "The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection". Nucleic Acids Research. 44 (D1): D1–D6. doi:10.1093/nar/gkv1356. PMC 4702933. PMID 26740669.
- ^ Aoki-Kinoshita K.F. (2013). "Using databases and web resources for glycomics research". Molecular & Cellular Proteomics. 12 (4): 1036–1045. doi:10.1074/mcp.R112.026252. PMC 3617328. PMID 23325765.
- ^ Frank M.; Schloissnig S. (2010). "Bioinformatics and molecular modeling in glycobiology". Cellular and Molecular Life Sciences. 67 (16): 2749–2772. doi:10.1007/s00018-010-0352-4. PMC 2912727. PMID 20364395.
- ^ Artemenko N.V.; McDonald A.G.; Davey G.P.; Rudd P.M. (2012). "Databases and Tools in Glycobiology". Therapeutic Proteins. Methods in Molecular Biology. Vol. 899. pp. 325–350. doi:10.1007/978-1-61779-921-1_21. ISBN 978-1-61779-920-4. PMID 22735963.
- ^ Lütteke T. (2012). "The use of glycoinformatics in glycochemistry". Beilstein Journal of Organic Chemistry. 8: 915–929. doi:10.3762/bjoc.8.104. PMC 3388882. PMID 23015842.
- ^ Zhulin I.B. (2015). "Databases for Microbiologists". Journal of Bacteriology. 197 (15): 2458–2467. doi:10.1128/JB.00330-15. PMC 4505447. PMID 26013493.
- ^ Yamada K.; Kakehi K. (2011). "Recent advances in the analysis of carbohydrates for biomedical use". Journal of Pharmaceutical and Biomedical Analysis. 55 (4): 702–727. doi:10.1016/j.jpba.2011.02.003. PMID 21382683.
- ^ Fontana C.; Zaccheus M.; Weintraub A.; Ansaruzzaman M.; Widmalm G. (2016). "Structural studies of a polysaccharide from Vibrio parahaemolyticus strain AN-16000". Carbohydrate Research. 432: 41–49. doi:10.1016/j.carres.2016.06.004. PMID 27392309. S2CID 23129802.
- ^ Potekhina N.V.; Shashkov A.S.; Senchenkova S.N.; Dorofeeva L.V.; Evtushenko L.I. (2012). "Structure of hexasaccharide 1-phosphate polymer from Arthrobacter uratoxydans VKM Ac-1979(T) cell wall". Biochemistry (Moscow). 77 (11): 1294–1302. doi:10.1134/S0006297912110089. PMID 23240567. S2CID 9699031.
- ^ Chapot-Chartier M.P.; Vinogradov E.; Sadovskaya I.; Andre G.; Mistou M.Y.; Trieu-Cuot P.; Furlan S.; Bidnenko E.; Courtin P.; Péchoux C.; Hols P.; Dufrêne Y.F.; Kulakauskas S. (2010). "Cell surface of Lactococcus lactis is covered by a protective polysaccharide pellicle". Journal of Biological Chemistry. 285 (14): 10464–10471. doi:10.1074/jbc.M109.082958. PMC 2856253. PMID 20106971.
- ^ Walsh I.; Zhao S.; Campbell M.; Taron C.H.; Rudd P.M. (2016). "Quantitative profiling of glycans and glycopeptides: an informatics' perspective". Current Opinion in Structural Biology. 40: 70–80. doi:10.1016/j.sbi.2016.07.022. PMID 27522273.
- ^ Ranzinger R.; York W.S. (2015). "GlycomeDB". Glycoinformatics. Methods in Molecular Biology. Vol. 1273. pp. 109–124. doi:10.1007/978-1-4939-2343-4_8. ISBN 978-1-4939-2342-7. PMID 25753706.
- ^ Ranzinger R.; Herget S.; von der Lieth C.-W.; Frank M. (2011). "GlycomeDB - a unified database for carbohydrate structures". Nucleic Acids Research. 39 (Database issue): D373-376. doi:10.1093/nar/gkq1014. PMC 3013643. PMID 21045056.
- ^ Aoki-Kinoshita K.F.; et al. (2016). "GlyTouCan 1.0 - The international glycan structure repository". Nucleic Acids Research. 44 (D1): D1237-1242. doi:10.1093/nar/gkv1041. PMC 4702779. PMID 26476458.
- ^ Campbell M.P.; Ranzinger R.; Lütteke T.; Mariethoz J.; Hayes CA.; Zhang J.; Akune Y.; Aoki-Kinoshita K.F.; Damerell D.; Carta G.; York W.S.; Haslam S.M.; Narimatsu H.; Rudd P.M.; Karlsson N.G.; Packer N.H.; Lisacek F. (2014). "Toolboxes for a standardised and systematic study of glycans". BMC Bioinformatics. 15 (Suppl 1): Suppl 1:S9. doi:10.1186/1471-2105-15-S1-S9. PMC 4016020. PMID 24564482.
- ^ Ranzinger R.; Herget S.; Wetter T.; von der Lieth C.-W. (2008). "GlycomeDB - integration of open-access carbohydrate structure databases". BMC Bioinformatics. 9: ID 384. doi:10.1186/1471-2105-9-384. PMC 2567997. PMID 18803830.
- ^ Toukach Ph.V.; Joshi H.; Ranzinger R.; Knirel Y.; von der Lieth C.-W. (2007). "Sharing of worldwide distributed carbohydrate-related digital resources: online connection of the Bacterial Carbohydrate Structure DataBase and GLYCOSCIENCES.de". Nucleic Acids Research. 35 (Database issue): D280–D286. doi:10.1093/nar/gkl883. PMC 1899093. PMID 17202164.
- ^ Toukach Ph.V.; Egorova K.S. (2020). "New features of CSDB Linear, as compared to other carbohydrate notations". Journal of Chemical Information and Modeling. 60 (3): 1276–1289. doi:10.1021/acs.jcim.9b00744. PMID 31790229. S2CID 226214957.
- ^ Varki A.; et al. (2015). "Symbol Nomenclature for Graphical Representations of Glycans". Glycobiology. 25 (12): 1323–1324. doi:10.1093/glycob/cwv091. PMC 4643639. PMID 26543186.
- ^ Loss A.; Bunsmann P.; Bohne A.; Loss A.; Schwarzer E.; Lang E.; von der Lieth C.-W. (2002). "SWEET-DB: an attempt to create annotated data collections for carbohydrates". Nucleic Acids Research. 30 (1): 405–408. doi:10.1093/nar/30.1.405. PMC 99123. PMID 11752350.
- ^ Herget S.; Ranzinger R.; Maass K.; von der Lieth C.-W. (2008). "GlycoCT - a unifying sequence format for carbohydrates". Carbohydrate Research. 343 (12): 2162–2171. doi:10.1016/j.carres.2008.03.011. PMID 18436199.
- ^ Tanaka K.; Aoki-Kinoshita K.F.; Kotera M.; Sawaki H.; Tsuchiya S.; Fujita N.; Shikanai T.; Kato M.; Kawano S.; Yamada I.; Narimatsu H. (2014). "WURCS: the Web3 unique representation of carbohydrate structures". Journal of Chemical Information and Modeling. 54 (6): 1558–1566. doi:10.1021/ci400571e. PMID 24897372.
- ^ Kirschner K.N.; Yongye A.B.; Tschampel S.M.; González-Outeiriño J.; Daniels C.R.; Foley B.L.; Woods R.J. (2008). "GLYCAM06: a generalizable biomolecular force field. Carbohydrates". Journal of Computational Chemistry. 29 (4): 622–655. doi:10.1002/jcc.20820. PMC 4423547. PMID 17849372.
- ^ Ranzinger R.; Aoki-Kinoshita K.F.; Campbell M.P.; Kawano S.; Lütteke T.; Okuda S.; Shinmachi D.; Shikanai T.; Sawaki H.; Toukach Ph.V.; Matsubara M.; Yamada I.; Narimatsu H. (2015). "GlycoRDF: An ontology to standardize Glycomics data in RDF". Bioinformatics. 31 (6): 919–925. doi:10.1093/bioinformatics/btu732. PMC 4380026. PMID 25388145.
- ^ Aoki-Kinoshita K.F.; Bolleman J.; Campbell M.P.; Kawano S.; Kim J.; Lütteke T.; Matsubara M.; Okuda S.; Ranzinger R.; Sawaki H.; Shikanai T.; Shinmachi D.; Suzuki Y.; Toukach Ph.V.; Yamada I.; Packer N.H.; Narimatsu H. (2013). "Introducing glycomics data into the Semantic Web". Journal of Biomedical Semantics. 4 (1): ID 39. doi:10.1186/2041-1480-4-39. PMC 4177142. PMID 24280648.