The DevDigitizer project aims to build state-of-the-art Optical Character Recognition (OCR) software for Sanskrit (Samskritam) written in the Devanagari script. The project is committed to developing novel document analysis, computer vision, deep learning, and search algorithms through sustained research in order to build a robust and highly accurate Sanskrit OCR system.
The vision of the DevDigitizer project is to facilitate the digitization and preservation of ancient Indian texts on science, mathematics, literature, poetry, and other subjects written in Sanskrit (Devanagari script). Digitizing ancient manuscripts will make these documents easier to access for further research and study.
The software is currently being refactored for public use and will be made available soon.
DevDigitizer relies on the following libraries:
- NumPy
- TensorFlow
- Keras
- OpenCV
- Flask
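As a rough aid while the software is being prepared for release, the sketch below checks whether the libraries listed above can be found in the current Python environment. It is not part of DevDigitizer: the `REQUIRED_MODULES` mapping and the `missing_dependencies` function are hypothetical names introduced here, and the import names (for example `cv2` for OpenCV) are assumptions based on how these packages are commonly imported. No version requirements are checked, since none are stated.

```python
from importlib.util import find_spec

# Hypothetical mapping from the listed library names to their usual import names.
# These import names are assumptions; the project itself does not state them.
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "TensorFlow": "tensorflow",
    "Keras": "keras",
    "OpenCV": "cv2",
    "Flask": "flask",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of libraries whose import name cannot be located."""
    # find_spec returns None for a top-level module that is not installed,
    # and it does not import the module while looking it up.
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries can be located.")
```

Running the script prints which of the listed libraries, if any, still need to be installed.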
The dataset used for this work is available in the following GitHub repository: https://github.com/avadesh02/Sanskrit-letter-dataset/blob/master/README.md
If you want to cite DevDigitizer in your papers, please use the following reference:
Avadesh, Meduri, and Navneet Goyal. "Optical Character Recognition for Sanskrit Using Convolution Neural Networks." In 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 447-452. IEEE, 2018.