The existing OCR pipeline was checking word containment in article boxes using inefficient polygon operations, costing $325K annually in AWS Lambda costs with slow processing times.
Implemented a ray-tracing algorithm from computer vision to quickly determine word-to-article containment relationships, reducing computational complexity from O(n²) to O(n log n).