Hi. I tried YaCy (again). Here’s a short review. I thought you might be interested.
Strange, used YaCy for several months. What “spam”?
I guess I haven’t booted it up lately as we switched internet providers.
I was using a Sprint wireless hotspot but recently switched to Spectrum cable, and I have not yet opened a port on the router to run YaCy.
Previously though, not too long ago, running YaCy, there were no “spam” search results whatsoever. Nothing I saw as, or considered, spam anyway.
No advertising, no clickbait or fake results. I don’t know what the reviewer or the commenter is talking about. Could one or the other, or someone, post a screenshot of such spam search results?
The results a person gets using YaCy will be far different and, in my opinion and personal experience, generally better, more varied and more comprehensive than what can be had with a generic online search engine. But it depends a great deal on how a person configures it, what settings they use or don’t use, and whether they have taken the time to send YaCy out to spider or crawl websites on the subjects they want indexed.
For example, I like Stirling engines and have a few forums I participate on that have been around for many years. By spidering a few sites YaCy will index those entire sites and all the links to other Stirling Engine sites people have posted since those forums opened on the internet many years ago… And all the sites those sites are connected to, etc.
Then when I do another search I get back a treasure trove of links to information on the subject that I would never get with any other search engine.
I’ve even spidered the Internet Archive or “Wayback Machine” on the subject and so can find information on websites that don’t exist anymore.
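For anyone curious what "spidering a few sites and everything they link to" amounts to, here is a rough sketch of a depth-limited, breadth-first crawl. It is not YaCy's crawler code: the `crawl` function, the `get_links` callback and the toy link table are all made up for illustration, and nothing here touches the network.

```python
from collections import deque

def crawl(seeds, get_links, max_depth=2):
    """Return every URL reachable from the seeds within max_depth link hops.

    get_links is a hypothetical callback that returns the outgoing links
    of a page; a real crawler would fetch and parse the page instead.
    """
    seeds = list(dict.fromkeys(seeds))      # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue                        # do not follow links any deeper
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited

# Toy stand-in for the web: a forum page linking out to other sites.
TOY_WEB = {
    "forum.example/stirling": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": ["a.example"],
    "c.example": ["d.example"],
}

pages = crawl(["forum.example/stirling"], lambda u: TOY_WEB.get(u, []), max_depth=2)
print(pages)
```

With a depth of 2 the crawl reaches the forum page, the two sites it links to, and one site beyond those; raising the depth is what pulls in "all the sites those sites are connected to".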
IMO if someone gets “spam” results with YaCy, I suspect they don’t know how to use the program, but to be fair, as mentioned, I haven’t fired it up lately so perhaps something has changed.
OK, I just fired up YaCy on my Windows laptop.
Searched for Stirling engines. Here are the top results from the first page:
Almost all highly relevant results about Stirling Engines. No advertising, no spam, no artificially elevated results from marketing firms trying to sell junk, just page after page of mostly good search results.
Also I can search for some obscure aspect of the Stirling Engine, like the “displacer”.
An online search engine would have no idea what that is and would return all kinds of garbage, but because I have spidered Stirling Engine sites, my local YaCy index is filled with relevant information about Stirling Engine displacers.
Searching for displacer on Google, to be fair, I found one result on the third page about the Stirling Engine displacer. But I guess there is something in Dungeons and Dragons and the World of Warcraft video game called “the displacer” so, all in all, crap, as far as my intended search is concerned.
One thing I see no signs of whatsoever while using YaCy is a lot of (or any) “spam”.
The review alleges: “The common index is highly vulnerable to be flooded with spam, adverts (think e. g. CNAME trackers) and exploits of all kinds. It’s hard to imagine how independent peers could even try to maintain a meaningfull common index.”
Nonsense. Deleting irrelevant or spam results is easily done, virtually effortless. URLs of garbage websites can be deleted from the index.
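To show the idea only, here is a minimal sketch of dropping every URL from an unwanted host out of a small search index. This is not how YaCy stores its index, and `remove_hosts` and the sample data are hypothetical; in YaCy itself you do this from the admin pages rather than in code.

```python
from urllib.parse import urlsplit

def remove_hosts(index, blocked_hosts):
    """Return a copy of the index without any URL served from a blocked host.

    index maps a search term to a list of URLs (an assumed, simplified layout).
    """
    blocked = {host.lower() for host in blocked_hosts}
    cleaned = {}
    for term, urls in index.items():
        kept = [u for u in urls if (urlsplit(u).hostname or "") not in blocked]
        if kept:                            # drop terms left with no results
            cleaned[term] = kept
    return cleaned

index = {
    "displacer": [
        "https://stirling.example/displacer",
        "https://junk.example/buy-now",
    ],
    "deals": ["https://junk.example/deals"],
}

cleaned = remove_hosts(index, ["junk.example"])
print(cleaned)
```

Filtering by host rather than by single URL means one entry clears out everything a garbage site contributed.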
What you get from YaCy’s “common index” is mostly hand-picked sites that other users have personally evaluated and bookmarked.
The reader’s comment: “Then I connected it to the rest of the internet and pulled in search results from the YaCy network, and instantly the exact same search terms started pulling up spam.”
That is also nonsense. I used YaCy for a long while without indexing anything locally and found the results from the network quite delightful and refreshing. As I said, other users’ personal “bookmarks”, so to speak. No advertising, no “spam”, but a lot of good websites on my personal topics of interest that I never saw on any online search engine, because they all mostly cater to advertisers and “affiliates” or the most visible websites: YouTube, Amazon, eBay, etc.
I don’t even know how anyone using YaCy could get SPAM into the index, shared or otherwise. I don’t think that is even possible.
Spam is generated by programs for advertisers. YaCy contains no such spam-generating programs. Using YaCy more or less continually for the past six months or so, I have never encountered any spam.
“The common index is highly vulnerable to be flooded with spam, adverts (think e. g. CNAME trackers) and exploits of all kinds.” - No.
The YaCy common index is maintained mostly by people who use the program, care about it, maintain it themselves and certainly don’t profit from any online advertising, which is what generates the majority of such exploits, tracking cookies and spyware.
Eliminating that sort of thing (advertising and what goes along with it) conflicts with the goals of the big online search portals that depend on tracking for advertising purposes. YaCy has none of that.
CNAME trackers are mentioned as a prominent issue or problem. What?
CNAME trackers are third-party trackers that some big websites with lots of advertising allow advertisers to put on their websites, disguised through a DNS CNAME record as coming from the site’s own domain to fool ad blockers. CNAME tracking is a browser-side issue.
What does that have to do with YaCy? Nothing!
It has to do with advertisers on big major websites like walmart.com, weather.com, cnn.com, etc. You are much more likely to encounter this with the big major web portals and search engines that draw their revenue from advertising.
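For reference, the CNAME disguise described above can be sketched as a simple check: a request looks first-party by name, but its CNAME record points at somebody else's domain. This is only an illustration with made-up host names. The `looks_cloaked` and `base_domain` helpers are hypothetical, the CNAME target is passed in rather than looked up, and taking the last two labels as the "base domain" is a naive assumption (real tools consult the Public Suffix List).

```python
def base_domain(host):
    """Naive base domain: the last two labels of the host name (an assumption)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])

def looks_cloaked(page_host, request_host, cname_target):
    """True if request_host appears first-party but is an alias for another domain.

    cname_target is the CNAME record of request_host, or None if it has none.
    """
    first_party = base_domain(request_host) == base_domain(page_host)
    if not first_party or cname_target is None:
        return False                        # openly third-party, or no alias at all
    return base_domain(cname_target) != base_domain(page_host)

# A subdomain of the shop that is really an alias for a tracking company.
print(looks_cloaked("www.shop.example", "metrics.shop.example", "collector.tracker.example"))
# A subdomain of the shop that points at the shop's own infrastructure.
print(looks_cloaked("www.shop.example", "cdn.shop.example", "edge.shop.example"))
```

Note that all of this happens between a browser and the website being visited, which is the point: it is something a page does to its visitors, not something a search index does.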
This is interesting. Came across the OP’s YaCy instance, apparently, on https://yacy.myops.de/
I guess it must be anyway.
Did a search for Stirling Engines there and got these first-page results:
Now I did, over time, contribute lots of Stirling Engine websites to the shared index but not from BitChute. Again, some new interesting, relevant, and varied results. And where’s all that horrible spam??? I still don’t see any.
This “short review” appears to me to be completely misleading and completely false. The basic YaCy interface could not be simpler. True, there are tons of features available in the administration area for people who want to dive deeper into customizing, configuring and optimizing their search engine and potentially much more, but how is that a bad thing?
Having said all that, more detailed documentation would be helpful. More video tutorials on how to use all those features would be nice. There is a whole lot to learn about the inner workings of YaCy, and it doesn’t do a lot of hand-holding for people who don’t know how a modern search engine works.
What would make YaCy and the YaCy shared index better would mostly involve just bringing more people up to speed on how to use it effectively.