Arsalan Younus.

Handwriting Localization (DBNET)

Automated detection of handwritten text regions in mixed print-and-handwriting documents, feeding accurate regions to downstream OCR instead of whole-page guesses.

The Business Problem

Scanned documents mix printed forms with handwritten notes, and downstream OCR needs the handwritten regions located accurately. Feeding whole pages to OCR, or relying on rule-based layout analysis, failed on varied handwriting styles, skew, and overlap with printed content.

The client needed reliable handwritten region detection across diverse form layouts without flooding the pipeline with false positives.

The Technical Solution

I built a deep learning pipeline using DBNET and EAST to detect and localize handwritten text regions. The pipeline ingests scanned page images, outputs bounding regions for handwritten zones, and handles varying orientations and overlap with printed content.
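To illustrate the "outputs bounding regions" step: segmentation-style detectors such as DBNET produce a per-pixel text probability map, which is then binarized and grouped into regions. The sketch below shows that grouping in plain Python under simplifying assumptions. The function name `extract_regions`, the `threshold` and `min_area` values, and the axis-aligned box format are all hypothetical and not taken from the project; a production pipeline would more likely use OpenCV contour routines and rotated boxes to handle skew.

```python
from collections import deque


def extract_regions(prob_map, threshold=0.5, min_area=4):
    """Group pixels with probability >= threshold into 4-connected components
    and return one (x_min, y_min, x_max, y_max) box per component.

    Components smaller than min_area pixels are dropped as likely noise.
    All parameter values here are illustrative assumptions.
    """
    height = len(prob_map)
    width = len(prob_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob_map[y][x] < threshold:
                continue

            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            area = 0

            while queue:
                cx, cy = queue.popleft()
                area += 1
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)

                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and prob_map[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))

            if area >= min_area:
                boxes.append((x_min, y_min, x_max, y_max))

    return boxes
```

Raising `min_area` suppresses small isolated blobs at the cost of missing short marks such as initials or checkbox ticks, which is one of the places where the recall and precision balance is decided.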

Models were tuned per document type, with DBNET or EAST chosen based on layout characteristics, to balance recall against precision for downstream recognition.
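One common way to make the recall and precision trade-off concrete is to match predicted boxes against labeled boxes by intersection over union (IoU) and sweep the detector's confidence threshold. The sketch below is a minimal version of that idea, not the project's actual evaluation code: the function names, the greedy matching strategy, the 0.5 IoU cutoff, and the "highest precision subject to a recall floor" rule are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def precision_recall(predicted, ground_truth, iou_threshold=0.5):
    """Greedily match each predicted box to its best unmatched labeled box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def pick_threshold(scored_boxes, ground_truth, candidates, min_recall):
    """Return (threshold, precision, recall) with the highest precision among
    candidate thresholds whose recall is at least min_recall, or None."""
    best = None
    for threshold in candidates:
        kept = [box for box, score in scored_boxes if score >= threshold]
        precision, recall = precision_recall(kept, ground_truth)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (threshold, precision, recall)
    return best
```

Running the same sweep separately for each document type, and for each candidate model, would give a like-for-like basis for choosing between detectors on a given layout.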

The Scalability Factor

The detector is deployed as a Docker container on Kubernetes in AWS, as one stage of the document intelligence pipeline. Because it is deployed independently, model retraining and rollout do not affect other pipeline stages.

Business Impact

Handwritten regions are detected at 99% accuracy and fed reliably into the recognition step, improving end-to-end extraction.

The detection stage runs at production volume and delivers its output consistently to the rest of the extraction pipeline.

Built with

DBNET
EAST
PyTorch
OpenCV
Deep Learning
AWS
Docker