Engineered an automated pipeline to process unstructured data from hundreds of pages of police reports and legal filings, reducing manual document review time by approximately 80%.
Used Tesseract OCR to extract text from scanned PDFs and images, and developed a search algorithm that locates specific names and returns the relevant surrounding text.
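A minimal sketch of how such a name search might look, assuming the OCR step has already produced plain text (for example via a Tesseract wrapper). The function name `find_name_context` and the `window` parameter are hypothetical and used only for illustration; the original pipeline's actual algorithm is not described here.

```python
import re


def find_name_context(text, name, window=40):
    """Return a snippet of surrounding text for each match of `name`.

    Hypothetical helper: matching is case-insensitive and tolerates
    line breaks between name parts, which are common in OCR output.
    """
    parts = [re.escape(p) for p in name.split()]
    if not parts:
        return []
    # Allow any run of whitespace (including newlines) between name parts.
    pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Collapse whitespace so each snippet reads as a single line.
        snippets.append(" ".join(text[start:end].split()))
    return snippets


if __name__ == "__main__":
    sample = "Officer report: the witness Jane\nDoe stated that she saw the vehicle."
    for snippet in find_name_context(sample, "Jane Doe", window=12):
        print(snippet)
```

A fixed character window is the simplest choice; a sentence- or paragraph-based window would likely give cleaner excerpts on well-structured filings.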
Built with