Explore our successful data science projects and solutions
Using image embeddings to identify duplicate articles across digitized newspaper archives.
Improved AWS Lambda OCR processing speed by 100x using computational geometry techniques.
Advanced line segmentation for improved OCR accuracy on degraded historical texts.
Automatically detecting newspaper columns using spectral analysis to improve OCR layout understanding.
Grouping millions of historical portrait photos by identity using deep learning embeddings.
Restoring faded, scratched, and stained historical photographs using deep learning-based inpainting and colorization.
Automatically extracting page boundaries from warped, unevenly lit historical book scans.
Detecting and isolating individual articles in dense, multi-column historical newspaper layouts.