
It is trivial to tell a spider apart from human traffic based on requests alone. Lying about the UA would just be bad press for them.


If it were really as trivial as you say, Google's reCAPTCHA and similar products like hCaptcha would have no reason to exist.


A bot intentionally trying to look human != a spider.

A spider will generally have a pretty predictable route through a web site.
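
As a rough illustration of what "predictable route" could mean, here is a minimal sketch that compares a session's request order with the order a plain breadth-first crawler would produce. The link graph, the function names, and the breadth-first assumption are all hypothetical; real spiders use many traversal strategies, so treat this as one possible signal and not a detector.

```python
from collections import deque


def bfs_order(links, start):
    """Return pages in the order a simple breadth-first crawler would fetch them."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


def crawl_likeness(links, session):
    """Fraction of the session's requests that match breadth-first order, position by position."""
    if not session:
        return 0.0
    expected = bfs_order(links, session[0])
    matches = sum(1 for got, want in zip(session, expected) if got == want)
    return matches / len(session)


# Hypothetical site structure: page -> pages it links to.
SITE = {"/": ["/a", "/b"], "/a": ["/a1"], "/b": []}

spider_session = ["/", "/a", "/b", "/a1"]
human_session = ["/", "/b", "/", "/a", "/a"]
```

With this toy graph, the spider-like session follows the breadth-first order exactly, while the human-like session backtracks and revisits pages, so it scores much lower.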


The various CAPTCHA implementations are primarily designed to prevent bot submissions, not spiders.


Some of them, yes, but not all. For example, try browsing a Cloudflare-protected site from Tor and you will be hit with a constant barrage of CAPTCHAs even though you are only making GET requests.


Yes, heuristically, a Tor browser is more likely to be nefarious than a regular browser user. Note the use of heuristics, such as IP address, that are not related to the user agent.
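
To make the point concrete, here is a small sketch of a challenge decision that looks only at the source IP and ignores the user agent entirely. The network list is made up (it uses documentation-only address ranges) and the function name is hypothetical; this is not how Cloudflare or any specific product decides, just an example of a heuristic that is independent of the UA.

```python
import ipaddress

# Hypothetical list of flagged networks, e.g. known anonymizer exit nodes.
# These are reserved documentation ranges, not real exit nodes.
FLAGGED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def should_challenge(remote_ip, user_agent):
    """Decide whether to show a CAPTCHA based on the source IP alone.

    user_agent is accepted but deliberately unused: the heuristic does not
    depend on what the client claims to be.
    """
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in FLAGGED_NETWORKS)
```

Under these assumptions, a request from 192.0.2.10 gets challenged no matter what UA it sends, and a request from an unlisted address does not, even if its UA claims to be a crawler.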



