Chemical database
A chemical database is a
Types of chemical databases
Bioactivity database
Bioactivity databases correlate structures or other chemical information to bioactivity results taken from
Name | Developer(s) | Initial release |
---|---|---|
ScrubChem | Jason Bret Harris | 2016[1][2] |
PubChem-BioAssay | NIH | 2004[3][4] |
ChEMBL | EMBL-EBI | 2009[5] |
Chemical structures
Literature database
Chemical literature databases correlate structures or other chemical information to relevant references such as academic papers or patents. Examples of this type of database include STN, SciFinder, and Reaxys. Many databases that focus on chemical characterization also include links to the literature.
Crystallographic database
NMR spectra database
Reactions database
Most chemical databases store information on stable molecules, but reaction databases also store intermediates and transient, unstable molecules. Reaction databases contain information about products, educts (reactants), and reaction mechanisms.
Thermophysical database
Thermophysical data are information about:
- phase equilibria, including vapor–liquid equilibrium, solubility of gases in liquids, liquids in solids (SLE), heats of mixing, vaporization, and fusion;
- caloric data, such as heats of formation and combustion;
- transport properties, such as thermal conductivity.
Chemical structure representation
There are two principal techniques for representing chemical structures in digital databases:
- As connection tables / adjacency matrices / lists with additional information on bond (edges) and atom attributes (nodes), such as:
- As a linear string notation based on breadth first traversal, such as:
- WLN, InChI
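As a rough illustration of the first technique, the sketch below stores the heavy atoms of ethanol as a small connection table and derives an adjacency matrix and an adjacency list from it. The names `atoms`, `bonds`, `adjacency_matrix`, and `adjacency_list` are hypothetical and chosen only for this example; the SMILES string `CCO` is included for comparison with a linear notation.

```python
# Ethanol with implicit hydrogens: C-C-O
atoms = ["C", "C", "O"]            # node attributes: element symbols
bonds = [(0, 1, 1), (1, 2, 1)]     # edges: (atom index, atom index, bond order)


def adjacency_matrix(n_atoms, bonds):
    """Return an n x n matrix whose entries hold bond orders (0 means no bond)."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for a, b, order in bonds:
        matrix[a][b] = order
        matrix[b][a] = order       # bonds are undirected
    return matrix


def adjacency_list(n_atoms, bonds):
    """Return {atom index: [(neighbour index, bond order), ...]}."""
    neighbours = {i: [] for i in range(n_atoms)}
    for a, b, order in bonds:
        neighbours[a].append((b, order))
        neighbours[b].append((a, order))
    return neighbours


matrix = adjacency_matrix(len(atoms), bonds)
neighbours = adjacency_list(len(atoms), bonds)
smiles = "CCO"                     # the same molecule as a linear string
```

The matrix form makes bond lookups trivial but grows with the square of the atom count, whereas the list form stays compact for the sparse graphs typical of molecules.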
These approaches have been refined to allow representation of
Search
Substructure
Chemists can search databases using parts of structures, parts of their
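Searching by part of a structure can be viewed as a subgraph matching problem on labelled graphs. The following is a minimal backtracking sketch under simplifying assumptions: atoms are compared by element symbol only, bond orders and hydrogens are ignored, and the function name `find_substructure` and the tiny `database` are invented for illustration. Production systems typically add screening steps and more refined matching rules.

```python
def _neighbours(n_atoms, bonds):
    """Build {atom index: set of bonded atom indices}."""
    adj = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_substructure(query, target):
    """Return one mapping {query atom: target atom}, or None if there is no match."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    q_adj = _neighbours(len(q_atoms), q_bonds)
    t_adj = _neighbours(len(t_atoms), t_bonds)

    def extend(mapping):
        if len(mapping) == len(q_atoms):
            return dict(mapping)
        q = len(mapping)               # query atoms are placed in index order
        for t in range(len(t_atoms)):
            if t in mapping.values() or q_atoms[q] != t_atoms[t]:
                continue
            # Every already-placed neighbour of q must be bonded to t.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[q]         # backtrack
        return None

    return extend({})


# Query: a carbon bonded to two oxygens (carboxyl skeleton, bond orders ignored).
query = (["C", "O", "O"], [(0, 1), (0, 2)])
database = {
    "acetic acid": (["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
}
hits = [name for name, mol in database.items()
        if find_substructure(query, mol) is not None]
```

Here `hits` contains only acetic acid, because ethanol has a single oxygen and cannot host the two-oxygen query.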
Conformation
Searching by matching the 3D conformation of molecules, or by specifying spatial constraints, is another feature that is particularly useful in drug design. Searches of this kind can be computationally very expensive. Many approximate methods have been proposed, including BCUTS, special function representations, moments of inertia, ray-tracing histograms, maximum distance histograms, and shape multipoles.[9][10][11][12][13]
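To give a feel for why histogram-style approximations are attractive, the sketch below compares conformations through a histogram of all pairwise interatomic distances. This is a deliberately simplified stand-in and not an implementation of any of the cited methods; the function names, bin width, and coordinates are assumptions made for the example.

```python
import math
from itertools import combinations


def distance_histogram(coords, bin_width=1.0, n_bins=8):
    """Histogram of all pairwise distances; the last bin collects overflow."""
    hist = [0] * n_bins
    for a, b in combinations(coords, 2):
        index = min(int(math.dist(a, b) / bin_width), n_bins - 1)
        hist[index] += 1
    return hist


def histogram_distance(h1, h2):
    """L1 distance between two histograms (0 means identical histograms)."""
    return sum(abs(x - y) for x, y in zip(h1, h2))


bent = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
# The same bent shape translated in space.
bent_shifted = [(2.0, 3.0, 4.0), (3.5, 3.0, 4.0), (3.5, 4.5, 4.0)]
# A fully extended (linear) arrangement of three atoms.
linear = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]

hist_bent = distance_histogram(bent)
hist_shifted = distance_histogram(bent_shifted)
hist_linear = distance_histogram(linear)
```

Because only interatomic distances enter the histogram, the descriptor does not change when a molecule is translated or rotated, so no alignment step is needed before comparison. The price is lost information: different shapes can share a histogram.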
Giga Search
Databases of synthesizable and virtual chemicals grow larger each year, so the ability to mine them efficiently is critical for drug discovery projects. MolSoft's MolCart Giga Search (http://www.molsoft.com/giga-search.html) is presented by its developer as the first method designed for substructure search of billions of chemicals.
Descriptors
All properties of molecules beyond their structure can be split up into either physico-chemical or
Similarity
There is no single definition of molecular similarity, however the concept may be defined according to the application and is often described as an
Chemicals in the databases may be
In
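Since similarity has no single definition, any code example has to pick one. The sketch below uses the Tanimoto (Jaccard) coefficient on sets of "on" bits of a fingerprint, which is one widely used choice. The fingerprints are made-up integer sets, and the names `tanimoto` and `rank_by_similarity` are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 1.0   # convention chosen for this sketch; other choices exist
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def rank_by_similarity(query_fp, library):
    """Return (score, name) pairs, most similar first."""
    scored = ((tanimoto(query_fp, fp), name) for name, fp in library.items())
    return sorted(scored, reverse=True)


query_fp = {1, 2, 3, 4}
library = {
    "a": {1, 2, 3, 5},   # shares 3 of 5 distinct bits with the query
    "b": {7, 8},         # shares nothing
    "c": {1, 2, 3, 4},   # identical fingerprint
}
ranking = rank_by_similarity(query_fp, library)
```

A score of 1.0 means the fingerprints are identical, not necessarily the molecules, because distinct structures can map to the same fingerprint.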
Registration systems
Database systems for maintaining unique records on chemical compounds are termed registration systems. They are often used for chemical indexing, patent systems, and industrial databases.
Registration systems usually enforce uniqueness of the chemical represented in the database through the use of unique representations. By applying rules of precedence for the generation of stringified notations, one can obtain unique/'
A key difference between a registration system and a simple chemical database is the ability to accurately represent that which is known, unknown, and partially known. For example, a chemical database might store a molecule with
Registration systems also preprocess molecules so that trivial differences, such as differences in halogen ions, are not treated as distinct chemicals.
An example is the
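The two ideas described in this section, preprocessing away trivial differences and enforcing uniqueness through a single key, can be sketched as follows. Everything here is an assumption made for illustration: the `Registry` class, the `REG-` identifier format, and especially `strip_halide_ions`, which only drops halide fragments from a dot-separated SMILES string and is not a real canonicalizer.

```python
HALIDE_IONS = {"[F-]", "[Cl-]", "[Br-]", "[I-]"}


def strip_halide_ions(smiles):
    """Drop halide counter-ion fragments from a dot-separated SMILES string."""
    parts = [p for p in smiles.split(".") if p not in HALIDE_IONS]
    return ".".join(parts)


class Registry:
    """Assigns one identifier per unique key produced by `canonicalize`."""

    def __init__(self, canonicalize=strip_halide_ions):
        self._canonicalize = canonicalize
        self._ids = {}

    def register(self, smiles):
        key = self._canonicalize(smiles)
        if key not in self._ids:
            # New compound: issue the next sequential identifier.
            self._ids[key] = f"REG-{len(self._ids) + 1:04d}"
        return self._ids[key]


registry = Registry()
id_chloride = registry.register("C[NH3+].[Cl-]")   # methylammonium chloride
id_bromide = registry.register("C[NH3+].[Br-]")    # methylammonium bromide
id_ethanol = registry.register("CCO")
```

Both salts receive the same identifier because they differ only in the halide ion. A real system would pass a proper canonicalization routine as `canonicalize`; with this toy key, `CCO` and `OCC` would wrongly be registered as two compounds.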
List of Chemical Cartridges
List of Chemical Registration Systems
Web-based
Name | Developer(s) | Initial release |
---|---|---|
CDD Vault | Collaborative Drug Discovery | 2018[26][27][28] |
Adroit Repository[29] | Adroit DI[30] | 2023[31][32] |
Tools
The computational representations are usually made transparent to chemists by graphical display of the data. Data entry is also simplified through the use of chemical structure editors. These editors internally convert the graphical data into computational representations.
There are also numerous algorithms for the interconversion of various formats of representation. An open-source utility for conversion is
SELECT * FROM CHEMTABLE WHERE SMILESCOL.CONTAINS('c1ccccc1')
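The `SMILESCOL.CONTAINS(...)` syntax in the query above relies on a database extension that supplies chemistry-aware data types. As a loose, runnable analogy, the sketch below registers a user-defined SQL function in SQLite through Python's standard `sqlite3` module. The function `SMILES_CONTAINS` is hypothetical, and its body is only a naive text comparison standing in for a real substructure matcher: it would miss an equivalent ring written as `c1ccc(O)cc1`, for example.

```python
import sqlite3


def smiles_contains(smiles, fragment):
    """Placeholder check: plain text containment, NOT a chemical substructure test."""
    return 1 if fragment in smiles else 0


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name SMILES_CONTAINS (2 arguments).
conn.create_function("SMILES_CONTAINS", 2, smiles_contains)

conn.execute("CREATE TABLE CHEMTABLE (NAME TEXT, SMILESCOL TEXT)")
conn.executemany(
    "INSERT INTO CHEMTABLE VALUES (?, ?)",
    [("toluene", "Cc1ccccc1"), ("phenol", "Oc1ccccc1"), ("ethanol", "CCO")],
)

rows = conn.execute(
    "SELECT NAME FROM CHEMTABLE "
    "WHERE SMILES_CONTAINS(SMILESCOL, ?) ORDER BY NAME",
    ("c1ccccc1",),
).fetchall()
hits = [name for (name,) in rows]
conn.close()
```

The point of the sketch is the plumbing: once a chemical predicate is callable from SQL, structure conditions can be combined with ordinary relational filters in a single query.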
Algorithms for the conversion of
See also
- Biological database
- Beilstein database and Dortmund Data Bank
- BindingDB
- ChEBI
- ChEMBL
- Chemisches Zentralblatt Structural Database
- ChemSpider
- Collaborative Drug Discovery
- Comparative Toxicogenomics Database
- Computational Chemistry List
- DrugBank
- List of chemical databases
- List of software for molecular mechanics modeling
- LOLI Database
- NMR spectra database
- PubChem
- SPRESI database
- Colocalization Benchmark Source
References
- http://www.scrubchem.org
- S2CID 73493315.
- "PubChem". pubchem.ncbi.nlm.nih.gov.
- PMID 27899599.
- "ChEMBL Database".
- S2CID 17268751.
- PMID 20298518.
- PMID 17266630.
- .
- PMID 16045295.
- PMID 16997139.
- S2CID 96794688.
- S2CID 12540483.
- PMID 20298518.
- .
- "BIOVIA Direct - BIOVIA - Dassault Systèmes®".
- "JChem Engines | ChemAxon".
- "Chemistry – Oracle Cartridge | Inside Informatics".
- PMC 2867114.
- "Small Molecule Drug Discovery Software". Small Molecule Drug Discovery Software.
- "BIOVIA Chemical Registration - BIOVIA - Dassault Systèmes®". www.3ds.com.
- "Register". Archived from the original on 2021-12-10. Retrieved 2021-03-13.
- "Scilligence RegMol | Scilligence". 6 June 2016.
- "Compound Registration". chemaxon.com.
- "Signals Notebook - PerkinElmer Informatics". perkinelmerinformatics.com.
- "CDD Vault Update: CDD Vault is Now an ELN". 16 February 2018.
- "CDD Electronic Lab Notebook (ELN)". 14 August 2019.
- "Electronic Lab Notebooks: What they are (And why you need one)". 4 August 2019.
- "Review of SDF Pro from Adroit DI. June 2023 – Macs in Chemistry". 2023-11-05. Retrieved 2024-03-11.
- "Adroit DI main page". adroitdi.com. Retrieved 2024-03-10.
- "Adroit DI's SDF Pro: The Fast and Affordable Solution to Storing, Sorting and Wrangling 10 Million Molecules in Seconds". www.businesswire.com. 2023-05-16. Retrieved 2024-03-10.
- "Best of the Best Entity Registration". 20Visioneers15. Retrieved 2024-03-10.