Home to the Chemical Reaction Database
The Chemical Reaction Database (CRD) is a collection of chemical reactions drawn from the scientific and patent literature. It is a work in progress; for now the emphasis is on organic reactions.
It includes search options for catalysts and ligands, and all data are normalized, with calculated ratios for each reaction component.
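As an illustration of what a per-component ratio can look like, here is a minimal sketch that expresses each component's amount relative to the limiting reactant. The `Component` class, the role labels, and the choice of the limiting reactant as the reference are assumptions made for this example; the CRD's actual normalization scheme is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    role: str    # hypothetical labels: "reactant", "catalyst", "ligand", "reagent", "solvent"
    mmol: float  # amount in millimoles


def molar_ratios(components):
    """Return each component's amount divided by that of the limiting reactant.

    Assumption: the reference is the reactant present in the smallest molar
    amount. A real pipeline may pick its reference differently.
    """
    reactants = [c for c in components if c.role == "reactant"]
    if not reactants:
        raise ValueError("at least one reactant is required")
    reference = min(c.mmol for c in reactants)
    if reference <= 0:
        raise ValueError("reactant amounts must be positive")
    return {c.name: round(c.mmol / reference, 4) for c in components}


# Illustrative amounts for a generic cross-coupling, not taken from the database.
example = [
    Component("aryl bromide", "reactant", 1.0),
    Component("boronic acid", "reactant", 1.5),
    Component("Pd(PPh3)4", "catalyst", 0.05),
    Component("K2CO3", "reagent", 2.0),
]
ratios = molar_ratios(example)
```

With ratios in this form, a catalyst loading of 0.05 equivalents can be compared directly across reactions run at different scales.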
The database currently holds over 1.37 million reaction records, over 1.5 million compounds and 396 reaction types (827K reactions have a type assigned). The virtual stockroom has 1922 common and less common reagents and 90 solvents.
2024: added a dataset covering USPTO 2023 only (137K entries), including reagents and solvents. 10.6084/m9.figshare.22491730.v1
2025: added the full dataset (1.37M entries), including reagents and solvents. m9.figshare.28230053.v1
New datasets
- Greensporone C, S. Narendra et al. 2024 reaction data | DOI
- (15S)-Prostaglandin A2, J. Lackner et al. 2024 reaction data | DOI
- Ethyl Plakortide Z, N. Jamey and L. Ferrié 2024 reaction data | DOI
- Hypersampsone M, A. E. Samkian, S. C. Virgil, and B. M. Stoltz 2024 reaction data | DOI
- Asprenol B, M. C. Benda et al. 2024 reaction data | DOI
- Hyperfirin, J. A. König, B. Morgenstern, and J. Jauch 2024 reaction data | DOI
- Oridamycin A, M. Murmu et al. 2024 reaction data | DOI
- Borrecapine, R. Lavernhe, Q. Wang, J. Zhu 2024 reaction data | DOI
- Poison Dart-Frog Alkaloid (-)-209D, Kuei-Chen Chang et al. 2024 reaction data | DOI
- Formicamycin H, G. Hu et al. 2024 reaction data | DOI
- Rhodocoranes I and J, C. A. Vincent et al. 2024 reaction data | DOI
- (+)-dehydrodeoxybrevianamide E, M. Nandy et al. 2024 reaction data | DOI
- Preaustinoid A, X. Li et al. 2024 reaction data | DOI
- 6,12-Guaianolide C1 epimers, K. Mazaraki et al. 2024 reaction data | DOI
- β-Levantenolide, S. Ghosh et al. 2024 reaction data | DOI
Example organic reaction
The blog
Recently in the blog: The year in review and alkyne protiodesilylation by the numbers
Organic reactions by year
Organic reactions in the database by year
Main datasets
Currently the database contains four main datasets. The first is the USPTO dataset (1976-2016) as compiled by Daniel Lowe, with additional data enhancement. The second is also mined from USPTO (2017 to present), using custom code with the aid of the OSCAR4 software or ChatGPT and the OPSIN service. The third is derived from the academic literature (anything with a DOI) and progresses at a snail's pace because it is manual labour, with occasional use of DECIMER and the Clipboard-To-SMILES Converter. The fourth is the CJHIF dataset (also academic literature), which totals 3.2 million records, of which only a fraction has been included thus far. Additional SMILES-to-IUPAC conversion is done with STOUT, reaction images are drawn with SmilesDrawer, and reaction types are calculated with RDKit.
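Reaction records of this kind are commonly exchanged as reaction SMILES, written as `reactants>agents>products` with molecules separated by dots. The sketch below splits such a string into its three roles using only the standard library. It assumes plain reaction SMILES without extensions; the function name `split_reaction_smiles` is hypothetical, and the export format of the CRD datasets themselves is not specified here.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into lists of molecule SMILES.

    Assumptions: exactly two '>' separators and no trailing extension block.
    Dot-separated fragments are treated as separate molecules, so salts such
    as [Na+].[Cl-] come out as two entries.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")

    def molecules(field):
        # Drop empty strings so an empty agents field yields an empty list.
        return [m for m in field.split(".") if m]

    reactants, agents, products = (molecules(p) for p in parts)
    return {"reactants": reactants, "agents": agents, "products": products}


# Fischer esterification: acetic acid + ethanol, sulfuric acid as the agent.
record = split_reaction_smiles("CC(=O)O.OCC>OS(=O)(=O)O>CC(=O)OCC.O")
```

A string split this way can then be passed molecule by molecule to a cheminformatics toolkit such as RDKit for validation or further classification.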