Transform chemical structure images into machine-readable SMILES with state-of-the-art AI
DECIMER (Deep lEarning for Chemical IMagE Recognition) is an open-source platform for extracting chemical structures from scientific literature. Built on transformer-based deep learning, it automatically identifies and segments chemical structure depictions and converts them into SMILES representations.
graph LR
A[PDF/Images] --> B[Segmentation]
B --> C[Detection]
C --> D[Recognition]
D --> E[SMILES]
style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px
style C fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
style D fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
style E fill:#d4edda,stroke:#155724,stroke-width:3px
| Requirement | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16 GB+ |
| Storage | 10 GB | 20 GB+ |
| Docker | Latest | Latest |
| Browser | Chrome 90+ | Chrome/Edge Latest |
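Before building the containers, it can help to check the host against the minimums in the table above. The following is a minimal sketch, not part of DECIMER.ai: the function names are made up for illustration, the thresholds are copied from the "Minimum" column, and RAM detection relies on `os.sysconf`, which is not available on Windows (the RAM check is skipped there).

```python
import os
import shutil

# Thresholds taken from the "Minimum" column of the requirements table.
MIN_RAM_GB = 8
MIN_STORAGE_GB = 10


def check_requirements(ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means the minimums are met.

    ram_gb may be None when the amount of RAM could not be detected,
    in which case the RAM check is skipped.
    """
    problems = []
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB found, {MIN_RAM_GB} GB required")
    if free_disk_gb < MIN_STORAGE_GB:
        problems.append(
            f"Storage: {free_disk_gb:.1f} GB free, {MIN_STORAGE_GB} GB required"
        )
    return problems


def detect_ram_gb():
    """Total physical RAM in GB, or None where os.sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


def detect_free_disk_gb(path="."):
    """Free disk space in GB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    issues = check_requirements(detect_ram_gb(), detect_free_disk_gb())
    print("\n".join(issues) if issues else "Minimum requirements met")
```

Note that Docker Desktop may limit containers to less memory than the host has, so passing this check does not guarantee that the containers see the same resources.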
Linux / macOS
# Clone the repository
git clone https://github.com/Steinbeck-Lab/DECIMER.ai
cd DECIMER.ai/
cp .env.example .env # Creates an environment file
# IMPORTANT: For systems with less than 32 GB RAM
# Edit docker/app/supervisor.conf to reduce resource allocation
# See https://github.com/Steinbeck-Lab/DECIMER.ai/wiki for details
# Build and launch
docker compose build --no-cache
docker compose up -d
# Monitor startup (optional)
docker compose logs -f supervisor
For Apple Silicon (M1/M2/M3):
docker compose -f docker-compose.apple_silicon.yml build --no-cache
docker compose -f docker-compose.apple_silicon.yml up -d
Windows
- Install Docker Desktop
- Configure resources in Docker Desktop settings (4+ CPU cores, 8+ GB RAM)
- Run as Administrator:
git clone https://github.com/Steinbeck-Lab/DECIMER.ai
cd DECIMER.ai\
cp .env.example .env
# Run the automated build script
build-windows.bat
Alternative manual approach:
docker-compose -f docker-compose.windows.yml build --no-cache
docker-compose -f docker-compose.windows.yml up -d
Pro Tip: For better performance, consider using WSL2.
- Open your browser to http://localhost:80
- Wait 5-10 minutes for model initialization
- Upload a PDF or image containing chemical structures
- Download your results as SMILES strings and mol files
First-Time Setup: The initial startup loads several large neural network models. Subsequent starts will be much faster.
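Instead of reloading the page by hand during the 5-10 minute startup, you can poll the address from the steps above until it answers. This is a rough sketch using only the Python standard library; the function names, the 15-minute ceiling and the 10-second interval are assumptions chosen for illustration. A successful response only shows that the web server is reachable and may not mean that every model has finished loading.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """True only if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


def wait_until_up(url, max_wait=900.0, interval=10.0, probe=is_up, sleep=time.sleep):
    """Poll probe(url) until it succeeds or max_wait seconds of sleeping have passed."""
    waited = 0.0
    while True:
        if probe(url):
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    ready = wait_until_up("http://localhost:80")
    print("Service is reachable" if ready else "Gave up waiting")
```

The `probe` and `sleep` parameters exist so the loop can be exercised without a running instance.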
- DECIMER Segmentation: Detects and extracts chemical structures from documents using Mask R-CNN
- DECIMER Transformer: Converts structure images to SMILES using Vision Transformers
- DECIMER Classifier: Distinguishes chemical structures from other images with CNNs
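The way these components fit together can be sketched as a simple three-stage loop: segment each page into candidate crops, keep only the crops classified as chemical structures, and convert those to SMILES. The code below is an illustration of that flow only. `run_pipeline`, `Result` and the stand-in stage functions are hypothetical and do not reflect the actual DECIMER APIs; real models would be passed in where the lambdas are.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Result:
    page: int      # 1-based page number the structure came from
    smiles: str    # SMILES string returned by the recognition stage


def run_pipeline(
    pages: Iterable[object],
    segment: Callable[[object], List[object]],
    is_structure: Callable[[object], bool],
    recognize: Callable[[object], str],
) -> List[Result]:
    """Segment each page, keep crops classified as structures, convert to SMILES."""
    results = []
    for page_number, page in enumerate(pages, start=1):
        for crop in segment(page):
            if is_structure(crop):
                results.append(Result(page_number, recognize(crop)))
    return results


# Stand-in stages for demonstration only: pages are lists of labels,
# and "recognition" is a dictionary lookup.
demo_pages = [["benzene", "logo"], ["ethanol"]]
fake_smiles = {"benzene": "c1ccccc1", "ethanol": "CCO"}
demo = run_pipeline(
    demo_pages,
    segment=lambda page: page,
    is_structure=lambda crop: crop in fake_smiles,
    recognize=lambda crop: fake_smiles[crop],
)
print(demo)
```

Passing the stages in as callables keeps the control flow separate from the models, so each stage can be swapped or tested on its own.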
| Metric | Value | Details |
|---|---|---|
| Accuracy | >95% | On printed structures |
| Speed | ~5 s/structure | Including segmentation |
| Scalability | 1000s/day | With proper hardware |
| Formats | PDF, PNG, JPEG, WebP, HEIC | Multiple input types |
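The speed and scalability figures in the table can be related with simple arithmetic: at roughly 5 seconds per structure, one worker running without pause has a ceiling of 86,400 / 5 = 17,280 structures per day. The sketch below computes that upper bound. It assumes continuous, perfectly parallel processing and ignores startup time, queueing and PDF rendering, so real throughput will be lower; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def structures_per_day(seconds_per_structure: float, workers: int = 1) -> int:
    """Upper bound on daily throughput for continuous, perfectly parallel work."""
    if seconds_per_structure <= 0 or workers < 1:
        raise ValueError("need a positive time per structure and at least one worker")
    return int(SECONDS_PER_DAY * workers / seconds_per_structure)


print(structures_per_day(5))      # 17280
print(structures_per_day(5, 2))   # 34560
```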
| Resource | Description |
|---|---|
| Installation Guide | Detailed setup instructions for all platforms |
| Configuration | Customizing your DECIMER instance |
| Troubleshooting | Common issues and solutions |
| API Reference | Programmatic access guide |
| Best Practices | Optimization tips and tricks |
If DECIMER.ai powers your research, please cite our work:
Primary Citation
@article{rajan2023decimer,
title = {DECIMER.ai: An open platform for automated optical chemical
structure identification, segmentation and recognition in
scientific publications},
author = {Rajan, Kohulan and Brinkhaus, Henning Otto and
Agea, Maria Inmaculada and Zielesny, Achim and
Steinbeck, Christoph},
journal = {Nature Communications},
volume = {14},
number = {1},
pages = {5045},
year = {2023},
publisher = {Nature Publishing Group},
doi = {10.1038/s41467-023-40782-0}
}
Additional Publications
@article{rajan2024advancements,
title = {Advancements in hand-drawn chemical structure recognition through
an enhanced DECIMER architecture},
author = {Rajan, Kohulan and Brinkhaus, Henning Otto and
Zielesny, Achim and Steinbeck, Christoph},
journal = {Journal of Cheminformatics},
volume = {16},
number = {1},
pages = {78},
year = {2024},
doi = {10.1186/s13321-024-00872-7}
}

@article{rajan2021segmentation,
title = {DECIMER-Segmentation: Automated extraction of chemical structure
depictions from scientific literature},
author = {Rajan, Kohulan and Brinkhaus, Henning Otto and
Sorokina, Maria and Zielesny, Achim and Steinbeck, Christoph},
journal = {Journal of Cheminformatics},
volume = {13},
number = {1},
pages = {20},
year = {2021},
doi = {10.1186/s13321-021-00496-1}
}

@article{rajan2021transformer,
title = {DECIMER 1.0: deep learning for chemical image recognition
using transformers},
author = {Rajan, Kohulan and Zielesny, Achim and Steinbeck, Christoph},
journal = {Journal of Cheminformatics},
volume = {13},
number = {1},
pages = {61},
year = {2021},
doi = {10.1186/s13321-021-00538-8}
}

@article{rajan2020decimer,
title = {DECIMER: towards deep learning for chemical image recognition},
author = {Rajan, Kohulan and Zielesny, Achim and Steinbeck, Christoph},
journal = {Journal of Cheminformatics},
volume = {12},
number = {1},
pages = {65},
year = {2020},
doi = {10.1186/s13321-020-00469-w}
}
We welcome contributions from the community! Whether you're fixing bugs, adding features, or improving documentation, your help is appreciated.
- Report Bugs: Open an issue
- Suggest Features: Start a discussion
- Improve Docs: Submit pull requests for documentation
- Fix Issues: Check out our good first issues
- Star the Project: Show your support!
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with clear, descriptive commits
- Test thoroughly
- Push to your fork (`git push origin feature/amazing-feature`)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
- Discussions: For questions, ideas, and community interaction
- Issues: For bug reports and feature requests
- Email: For direct support and collaboration inquiries
This project is licensed under the MIT License, making it free for both academic and commercial use.
MIT License
Copyright (c) 2025 Kohulan @ Steinbeck Lab
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
[Full license text in LICENSE file]
Maintained by the Kohulan @ Steinbeck Group
Natural Products Cheminformatics Research Group
Institute for Inorganic and Analytical Chemistry
Friedrich Schiller University Jena, Germany
| Project | Description |
|---|---|
| COCONUT | Open Natural Products Database |
| DECIMER Segmentation | Structure Detection Library |
| DECIMER Transformer | Image-to-SMILES Model |
| DECIMER Classifier | Chemical Image Classification |
Funded by the Carl Zeiss Foundation and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under ChemBioSys (Project INF), project number 239748522, SFB 1127.
Made with ❤️ and ☕ for the global chemistry community
Democratizing access to chemical knowledge, one structure at a time
© 2025 Steinbeck Lab, Friedrich Schiller University Jena


