Historical book scans often contained curved pages, shadows, or bleed-through text from opposing pages, and traditional edge-detection methods failed on them with a 30% error rate. Manual cropping was prohibitively slow.
Trained a U-Net model to predict page masks, combined with gradient-based preprocessing (for edge hints) and geometric post-processing (for smooth quadrilateral fitting). Added special handling for gutter shadows and folded corners.
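
To illustrate the geometric post-processing idea, here is a minimal sketch of turning a predicted binary page mask into four corner points. It is not the project's actual fitting code, which is not described here; the function name `fit_quad` and the extreme-point heuristic are assumptions chosen for brevity, and the mask is assumed to be an already thresholded grid of 0/1 values.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in pixel coordinates


def fit_quad(mask: List[List[int]]) -> List[Point]:
    """Return corners (top-left, top-right, bottom-right, bottom-left)
    of the foreground region in a binary mask.

    Hypothetical heuristic, not the original method: the top-left corner
    minimizes x + y and the bottom-right maximizes it; the top-right
    maximizes x - y and the bottom-left minimizes it.
    """
    # Collect the coordinates of every foreground pixel.
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("mask has no foreground pixels")

    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = max(pts, key=lambda p: p[0] - p[1])
    bl = min(pts, key=lambda p: p[0] - p[1])
    return [tl, tr, br, bl]


# Toy 6x8 mask with a page occupying columns 2..5 and rows 1..4.
toy_mask = [[1 if 2 <= x <= 5 and 1 <= y <= 4 else 0 for x in range(8)]
            for y in range(6)]
corners = fit_quad(toy_mask)
print(corners)  # [(2, 1), (5, 1), (5, 4), (2, 4)]
```

A heuristic like this only works when the page is roughly upright and the mask is a single clean blob; a production pipeline would more plausibly extract the largest contour and simplify it to four vertices, and would need extra logic for the folded corners and gutter shadows mentioned above.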