Is there a way to change the limit of 120 pages per minute per host when I crawl a machine on an intranet or on localhost?
I forgot to mention that in my use case I want to crawl both localhost and internet URLs, and I have millions of pages to crawl on localhost. Using file:// or smb:// is not an option, because I want the local files to be viewable in a browser.
In case anyone has a similar issue: I ended up using “webportal” as the use case and made the following changes in the file defaults/yacy.network.webportal.unit:
network.unit.domain = any
network.unit.remotecrawl.speed = 600
This seems to solve my issue. On the other hand, I am not sure whether YaCy will behave “well” toward remote sites that do not set a reasonable crawl speed in their robots.txt.
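To make the politeness concern concrete, here is a minimal Python sketch of how a crawler could pick a delay for a remote site. It is not YaCy's code and I do not know how YaCy handles this internally. The function name polite_delay and the 0.5 second fallback (which is 120 pages per minute) are my own assumptions. It uses the Crawl-delay value from robots.txt when one is declared and never goes faster than the fallback.

```python
from urllib.robotparser import RobotFileParser

# Assumed fallback: 0.5 s between requests, i.e. 120 pages per minute per host.
DEFAULT_DELAY_SECONDS = 0.5


def polite_delay(robots_txt: str, user_agent: str,
                 default: float = DEFAULT_DELAY_SECONDS) -> float:
    """Return the number of seconds to wait between requests to one host.

    Hypothetical helper: uses the Crawl-delay declared in robots.txt if
    there is one, but never returns less than the default delay.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    declared = parser.crawl_delay(user_agent)  # None if not declared
    if declared is None:
        return default
    return max(float(declared), default)
```

With something like this, a site that declares no Crawl-delay would still be crawled at the conservative default, even if the global speed setting is raised for local hosts.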
If anyone has a better idea, I would like to hear it.
In particular, is it possible to set different crawl speeds for intranet, localhost and internet hosts within the same profile? I am not sure that setting different agents (Yacy intranet and Yacy internet) has much effect.
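To illustrate what I mean by different speeds per kind of host, here is a rough Python sketch of the behaviour I am after. Again, this is not YaCy code or a YaCy setting. The names classify_host and min_delay_seconds and the pages-per-minute values are made up for illustration, and host names other than "localhost" are simply treated as internet hosts because the sketch does no DNS lookup.

```python
import ipaddress

# Hypothetical limits in pages per minute for each class of host.
PAGES_PER_MINUTE = {"localhost": 6000, "intranet": 600, "internet": 120}


def classify_host(host: str) -> str:
    """Classify a host name or IP literal as localhost, intranet or internet."""
    if host.lower() == "localhost":
        return "localhost"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; without DNS resolution assume an internet host.
        return "internet"
    if addr.is_loopback:
        return "localhost"
    if addr.is_private:
        return "intranet"
    return "internet"


def min_delay_seconds(host: str) -> float:
    """Minimum pause between two requests to the same host."""
    return 60.0 / PAGES_PER_MINUTE[classify_host(host)]
```

So 120 pages per minute corresponds to a 0.5 second pause per request, while 600 pages per minute corresponds to roughly 0.1 seconds.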