Sowmik Sarker

Automating Birth Registration Certificate Scanning with Machine Learning

In a recent project, I had the opportunity to work on an exciting solution designed to streamline the registration process for individuals under the age of 18. The goal was to automatically scan and extract data from birth registration certificates so that new customers could register quickly and easily via a mobile app. Here’s a closer look at the technologies and approaches we used to bring this system to life.

Figure: High-level design (HLD) of the OCR system

The Core Problem

Birth registration certificates are essential documents, but manually inputting data from these certificates is both time-consuming and prone to errors. The challenge was to create an automated solution that could accurately scan certificates in both Bangla and English, extract relevant data, and pass that information seamlessly to a backend system for customer registration.

Key Technologies

To implement this solution, we relied on a set of powerful tools and frameworks, each playing a critical role in different stages of the project.

  • C++: Used to build the core components of the machine learning client.
  • NVIDIA Triton Inference Server: Managed the deployment and inference of our machine learning models, ensuring efficient and scalable processing.
  • OpenCV: Used for image processing and preparation, essential for Optical Character Recognition (OCR).
  • JNI (Java Native Interface): Enabled communication between our native C++ code and Java-based middleware.
  • Reactive Spring Boot: Powered the backend middleware, which handled customer app requests and managed the data flow between the machine learning client and the app.
  • Tesseract: Performed the OCR itself, reading text from the certificate images on the ML server.
  • MinIO: Stored the scanned documents for later use.

OCR and Text Classification

One of the most critical parts of the project was Optical Character Recognition (OCR). We used Tesseract, an open-source OCR engine, for extracting text from images of the birth certificates. The certificates contained text in two languages: Bangla and English.

To ensure accurate text recognition and data extraction, we implemented two separate OCR models.

  1. Bangla Text Model: Specialized for handling documents in the Bangla language.
  2. English Text Model: Focused on English text recognition.

The system first classified the text type based on the detected script and then processed the data using the appropriate OCR model.

Figure: Tesseract OCR process flow

Client-Middleware Communication

The machine learning client was responsible for processing the scanned certificate and extracting relevant data such as the customer’s name, date of birth, and other necessary details. It used the NVIDIA Triton client library to communicate with the Triton Inference Server, where the models ran.
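To make the extraction step concrete, here is a minimal sketch of turning raw OCR text into named fields. It is in Python for brevity rather than the project's C++, and everything specific in it is an assumption: the "Label: value" line shape, the label names, the DD-MM-YYYY date format, and the function name `parse_certificate_text`. A real certificate layout would need its own rules.

```python
from datetime import datetime

# Hypothetical label-to-key mapping; real certificates may use other labels.
LABEL_TO_KEY = {
    "name": "name",
    "date of birth": "date_of_birth",
    "registration no": "registration_no",
}


def parse_certificate_text(ocr_text: str) -> dict:
    """Collect 'Label: value' lines from OCR output into a dict."""
    fields = {}
    for line in ocr_text.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # line does not have a 'Label: value' shape
        key = LABEL_TO_KEY.get(label.strip().lower())
        if key and value.strip():
            fields[key] = value.strip()

    raw_dob = fields.get("date_of_birth")
    if raw_dob is not None:
        try:
            # Assumed format DD-MM-YYYY; adjust to the real layout.
            fields["date_of_birth"] = datetime.strptime(raw_dob, "%d-%m-%Y").date()
        except ValueError:
            # An unreadable date is treated as missing rather than guessed.
            del fields["date_of_birth"]
    return fields
```

Dropping a date that fails to parse, instead of passing the raw string along, keeps OCR misreads from flowing silently into the registration data.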

Once the text data was extracted, it was passed to a middleware layer built using Reactive Spring Boot. This middleware acted as the interface between the customer app and the backend services, efficiently handling requests and sending responses based on the OCR results.
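As a rough illustration of "sending responses based on the OCR results", the sketch below builds a response from a dictionary of extracted fields. It is written in Python for brevity; the actual middleware was Reactive Spring Boot, and its response format is not described in this post. The required field names, the status values, and the function name `build_response` are all hypothetical.

```python
# Hypothetical set of fields the app needs before registration can proceed.
REQUIRED_FIELDS = ("name", "date_of_birth")


def build_response(fields: dict) -> dict:
    """Return an 'ok' payload, or list what the OCR step failed to read."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        # The app could use this to ask the user to rescan the certificate.
        return {"status": "incomplete", "missing": missing}
    # Values are converted to strings so the payload is easy to serialize.
    return {
        "status": "ok",
        "fields": {key: str(fields[key]) for key in REQUIRED_FIELDS},
    }
```

Reporting the missing fields by name, rather than a bare failure, lets the app tell the user which part of the certificate could not be read.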

Project Outcome

By combining machine learning, OCR, and efficient client-server communication, we built a system that automates scanning and extracting information from birth registration certificates. It has made customer registration faster by reducing manual data entry and the errors that come with it.


This project was a great learning experience, especially in integrating multiple technologies to create a cohesive system. If you're interested in more technical details or have any questions about the implementation, feel free to reach out!


