Criticism and solutions

1: Bad results.

For “teen titan” I keep getting “teen” results and, separately, “titan” results. YaCy works fine with one-word queries like “spirou”, so try to make it better with multi-word ones. And not only for static phrases: the words can vary in form and position, but they remain “related”.

E.g. for form variation, I also need to be served the plural “teen titans” results (and vice versa). I think the “related positions” part is something more complex than a dictionary. Inspect what other engines do.
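To illustrate the request, here is a minimal Python sketch of ranking by form and position. It is not how YaCy works: the function names are made up for this example, the plural handling is a deliberately naive "strip a trailing s" rule, and the score is simply the number of query terms divided by the smallest window of words that contains all of them.

```python
import re

def normalize(word: str) -> str:
    """Lowercase and strip a trailing plural 's' (naive, for illustration only)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def tokens(text: str) -> list[str]:
    """Split text into normalized word tokens."""
    return [normalize(w) for w in re.findall(r"\w+", text)]

def proximity_score(query: str, document: str) -> float:
    """Return 0.0 if any query term is missing; otherwise a value in (0, 1]
    that is 1.0 when all query terms sit next to each other."""
    wanted = set(tokens(query))
    doc = tokens(document)
    if not wanted or not wanted.issubset(doc):
        return 0.0
    best = len(doc)
    last_seen = {}  # most recent position of each query term
    for pos, tok in enumerate(doc):
        if tok in wanted:
            last_seen[tok] = pos
            if len(last_seen) == len(wanted):
                # smallest window ending at pos that covers every term
                span = pos - min(last_seen.values()) + 1
                best = min(best, span)
    return len(wanted) / best
```

With this, “Teen Titans Go” scores higher for the query “teen titan” than a page where “teen” and “titan” are several words apart, and a page containing only “teen” scores zero.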

Until that is solved, I’d rather consider YaCy useless. I can’t recommend it to anyone because of the terrible search results.

2: Implement spelling correction. That means query-replacement.
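As a rough idea of what query replacement could look like, the sketch below swaps each unknown query word for its closest match in a word list, using `difflib.get_close_matches` from the Python standard library. The `vocabulary` list and the `correct_query` function are assumptions for this example; a real engine would draw the vocabulary from its index and would likely weight candidates by frequency.

```python
import difflib

def correct_query(query: str, vocabulary: list[str], cutoff: float = 0.8) -> str:
    """Replace each word not found in the vocabulary with its closest known word.
    Words with no sufficiently similar candidate are left unchanged."""
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # n=1: keep only the best candidate; cutoff: minimum similarity ratio
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)

# Hypothetical vocabulary, standing in for the words known to the index.
vocabulary = ["teen", "titans", "spirou"]
print(correct_query("teen titns", vocabulary))
```

The corrected query can then be run instead of the original one, or offered as a “did you mean” link.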

3: Instead of the modifiers in front of the search query, better make simple checkboxes. These modifiers do change the results I get, but only partially (as far as I have tested). That’s why I consider them not working. And all those query filters under “Ranking and Heuristics” (too complex for me to bother with, and insufficiently described): make them user-friendly. Place a drop-down menu with these profiles on the query page.

4: Crawlers. They need the following settings:

  • Use N cores
  • Set task priority
  • Pause/Continue (also between sessions)
  • Start on PC inactivity (if battery level above N%)
  • Maybe automatically distribute a global schedule between users. The 1st user gets to cover and update 100% of the WWW. The 2nd and 3rd get 50% (overlapping with the 1st). The 4th to 7th get 25%, and so on. That’s my idea for having maximum indexing available. And the autocrawler needs to be on by default (probably only on inactivity, but also ask the user to donate, say, 1 core for constant crawling). Then, based on user activity, more active users should get a bigger % and vice versa. Then implement a reputation scheme based on the %. And maybe a toggle for letting the lower users share indexes with higher ones, just to save them time. This whole thing could be a bad idea, I’m not sure, or it could be a great one.
  • And I haven’t seen an option to “download sitemaps from other peers”, which should be followed by filters.
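The share scheme from the schedule idea above (1st user 100%, 2nd and 3rd 50%, 4th to 7th 25%) can be written as a small formula. The sketch below assumes that “and so on” means each following tier is twice as large and gets half the share, and that every user within a tier gets the same share; `crawl_share` is a made-up name.

```python
def crawl_share(rank: int) -> float:
    """Fraction of the web assigned to the user at a given rank (1 = first).
    Tier sizes double (1, 2, 4, 8, ...) while the share halves each tier."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    tier = rank.bit_length() - 1  # integer floor(log2(rank))
    return 1.0 / (2 ** tier)

for rank in (1, 2, 3, 4, 7, 8):
    print(rank, crawl_share(rank))
```

Under this assumption each tier as a whole adds up to one full pass over the web, so every doubling of the user count adds one more overlapping copy of the index.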

5: Add checkboxes to query in: title, description, title-attributes, < header >, < body >

6: Implement a fallback: a button to query the big engines with the same phrase. And maybe display the results in YaCy’s interface?

7: If you are not willing to make drastic changes to search results and user-friendliness, I’d recommend you abandon YaCy.

Sorry for being harsh, but just a few steps are left for it to become a great engine. I’d be happy to spread the word one day.

  1. Use quotation marks to search for an exact phrase:
    “word 1 word 2”
  2. You can pause the crawler
  3. You can fork the project and submit your changes for evaluation

Search and matching indeed have room for improvement, but do you have a better search engine?

Most available open-source engines only provide a tiny subset of what YaCy offers.