
Human or LLM, the trick with messy inputs from scanned sources is having robust sanity checks that comb for obvious fubars, plus a means by which end users of the data can review the asserted values alongside the original raw image sources (and flag them for review or alteration).

At least, that's been my past experience with large volumes of transcribed data for applications that are picky about accuracy.
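
A rough sketch of what that can look like, in Python. Everything here is made up for illustration: the `TranscribedField` record, the two checks, the field names, the date format and the plausible ranges are assumptions, not anything from a real pipeline. The idea is only that each asserted value keeps a pointer to its raw scan, cheap checks attach flags, and anything flagged lands in a queue a person can review against the image.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TranscribedField:
    """One asserted value plus a pointer back to the raw scan (hypothetical record)."""
    name: str
    value: str          # value asserted by the transcriber, human or LLM
    source_image: str   # path to the original scan, kept for side-by-side review
    flags: list[str] = field(default_factory=list)


def check_date(f: TranscribedField) -> None:
    """Flag dates that do not parse or that fall outside an assumed plausible range."""
    try:
        parsed = datetime.strptime(f.value, "%Y-%m-%d")
    except ValueError:
        f.flags.append("unparseable date")
        return
    if not 1900 <= parsed.year <= 2100:  # assumed range, adjust per dataset
        f.flags.append("implausible year")


def check_amount(f: TranscribedField, lo: float = 0.0, hi: float = 1_000_000.0) -> None:
    """Flag amounts that are not numeric or fall outside assumed bounds."""
    try:
        amount = float(f.value)
    except ValueError:
        f.flags.append("not a number")
        return
    if not lo <= amount <= hi:  # NaN also fails this comparison and gets flagged
        f.flags.append("amount out of range")


def review_queue(fields: list[TranscribedField]) -> list[TranscribedField]:
    """Return only the fields a human should compare against the source image."""
    return [f for f in fields if f.flags]


records = [
    TranscribedField("invoice_date", "2023-13-01", "scan_001.png"),  # month 13
    TranscribedField("total", "1O5.00", "scan_001.png"),             # letter O, not zero
    TranscribedField("total", "42.50", "scan_002.png"),              # looks fine
]
for r in records:
    if r.name == "invoice_date":
        check_date(r)
    else:
        check_amount(r)

for r in review_queue(records):
    print(r.source_image, r.name, repr(r.value), r.flags)
```

The checks never overwrite the asserted value. They only flag it, so the reviewer sees both what was transcribed and where to look in the original.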


