Using proxies for crawling to avoid getting blocked

Hi All,

We are running into an issue where our instance gets blocked and is served a reCAPTCHA challenge. Does YaCy support using proxies for its crawlers, and can I parse blocked requests that end in a reCAPTCHA to trigger a change of the proxy IP?

The way I think this would work: if a page fails with a certain failure response, I could set up a script that updates the proxy IP through the CLI/API, so the crawler uses a different IP address from then on.
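
A rough sketch of the detection and rotation logic I have in mind, in Python. Everything here is an assumption for illustration: the marker strings, the status codes treated as "blocked", and the names `looks_blocked`, `ProxyRotator` and `apply_proxy` are made up. In particular, `apply_proxy` is a placeholder for whatever call actually changes the proxy setting in YaCy, which I have not identified yet.

```python
from itertools import cycle
from typing import Callable, Iterable

# Assumption: substrings that typically appear in a reCAPTCHA challenge page.
CAPTCHA_MARKERS = ("g-recaptcha", "google.com/recaptcha", "recaptcha/api.js")
# Assumption: status codes a site commonly returns when it blocks a crawler.
BLOCK_STATUS_CODES = {403, 429}


def looks_blocked(status_code: int, body: str) -> bool:
    """Return True if a response looks like a block or a reCAPTCHA challenge."""
    if status_code in BLOCK_STATUS_CODES:
        return True
    lowered = body.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


class ProxyRotator:
    """Cycles through a fixed proxy pool and calls `apply_proxy` on every switch."""

    def __init__(self, proxies: Iterable[str], apply_proxy: Callable[[str], None]):
        pool = list(proxies)
        if not pool:
            raise ValueError("proxy pool must not be empty")
        self._pool = cycle(pool)
        # Hypothetical hook: this is where the crawler's proxy would be updated.
        self._apply_proxy = apply_proxy
        self.current = next(self._pool)
        self._apply_proxy(self.current)

    def report(self, status_code: int, body: str) -> bool:
        """Inspect one crawl result; switch to the next proxy if it looks blocked."""
        if not looks_blocked(status_code, body):
            return False
        self.current = next(self._pool)
        self._apply_proxy(self.current)
        return True


if __name__ == "__main__":
    # Stand-in for the real update call: just record which proxy was requested.
    applied = []
    rotator = ProxyRotator(["10.0.0.1:3128", "10.0.0.2:3128"], applied.append)
    rotator.report(200, "<html>regular page</html>")
    rotator.report(200, '<div class="g-recaptcha"></div>')
    print(rotator.current, applied)
```

The idea is that the script feeds each failed response into `report`, and only the `apply_proxy` callback needs to know how the proxy is actually changed.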

Yes, you can use a remote proxy and crawl through it.


Would it be awfully evil to set the crawler’s user-agent to one of Google’s?