Non-English characters

Hello All,

Thank you for this great software. I like it :)

I want to use it to search Turkish documents.

When I search for a keyword like İstanbul, it returns no results because of the İ.

Is there a solution for this?

Thank you in advance.

Hi ufukayyildiz,

Good question. I tried it and at first I could not find anything either. However, I am pretty sure that YaCy is able to process all UTF-8 characters. So I looked for web pages that actually contain İstanbul as you spell it, found one, and put it into my crawler to verify that the page can be found again with the word İstanbul. And it worked!

So, there is no problem with YaCy and such letters. There is simply not enough content to find. Just put in the web pages that you want to be found!

Hi @Orbiter

Thank you so much.

I am trying to search PDF files. Could the problem be related to PDF files?

Could you please check the video on the link?

Thanks again.

Are these characters interchangeable?
E.g., I would expect the same results for both İstanbul (with the Turkish İ) and Istanbul (as everyone outside Turkey would type it).
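
For context on why the two spellings may not match: at the Unicode level they are different strings. The snippet below is only a minimal Python sketch of that fact and says nothing about how YaCy itself tokenizes or compares words; `describe` is a hypothetical helper name used for illustration.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


turkish = "İstanbul"     # starts with U+0130, capital I with dot above
ascii_form = "Istanbul"  # starts with U+0049, plain capital I

print(describe(turkish[0]))
print(describe(ascii_form[0]))

# The strings differ as typed, and plain lowercasing does not make them
# equal either: lowercasing U+0130 yields "i" followed by a combining
# dot above (U+0307), not a bare "i".
print(turkish == ascii_form)
print(turkish.lower() == ascii_form.lower())
```

So any search that compares exact (or merely lowercased) strings would treat the two spellings as different words unless some extra folding step is applied.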


Based on my experience, they are not interchangeable.
E.g., try searching for ‘Árvíztűrő tükörfúrógép’ vs ‘Arvizturo tukorfurogep’.
The first returns an exact match, among others, but the second returns 0 results.

Is there any way to change this?
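
One general technique for making such spellings match is accent folding: decompose each character (Unicode NFD), drop the combining marks, and case-fold the result, applying the same step to both indexed text and queries. Whether YaCy offers or applies anything like this is not established in this thread. The sketch below is plain Python, and `fold` is a hypothetical helper, not a YaCy function.

```python
import unicodedata


def fold(text):
    """Fold text to an accent-insensitive, case-insensitive form.

    Assumption: illustrative only, not how YaCy normalizes terms.
    """
    # NFD splits e.g. "İ" into "I" + combining dot above (U+0307).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop nonspacing combining marks (Unicode category "Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # casefold() is a more aggressive lower() intended for comparisons.
    return stripped.casefold()


print(fold("İstanbul") == fold("Istanbul"))
print(fold("Árvíztűrő tükörfúrógép") == fold("Arvizturo tukorfurogep"))
```

Both comparisons come out equal after folding. The trade-off is that folding also merges words that differ only by accents, which may be unwanted in some languages, and letters without a canonical decomposition (such as the Turkish dotless ı) are not changed by this approach.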