Hacker News

The article says this misses important details, e.g. data that might be in the image.



Very bad take. With most modern multimodal models you get way better performance than going to text first.

It's a cost/latency trade-off in production, and very use-case dependent.


