In 2016 I started developing a database structure to store statistics of YaCy peers, since the original project yacystats.de had shut down. But then I had to stop for personal reasons.
From time to time I ran a YaCy instance on my private server. Unfortunately, it kept killing my private internet connection, so I had to shut it down again (damn Fritzbox).
About a month ago I started developing a new version of the database structure and had a really good start with the scripts that fetch the data, import it into the database, create statistics, and so on.
The new site I’ve put together has now been online for two weeks.
Here is what I have:
- The network stats that can be collected from a single peer installation, showing current values like PPM, QPH, links, words, etc.
- The “seedlist” with all “public” peers
- And… index browser pages…
The first two should be self-explanatory: some values plus the official names of the peers. I collect these pages every hour.
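For anyone curious how such an hourly collector might look, here is a minimal sketch in Python. It is an assumption on my part, not the author's actual script: the endpoint path and the XML tag names (`<peer>`, `<ppm>`, `<qph>`, etc.) are placeholders and would need to be adapted to whatever the real peer pages return.

```python
# Hedged sketch: fetch a stats page from a peer and turn it into rows
# ready for a database insert. All field/tag names are assumptions.
import urllib.request
import xml.etree.ElementTree as ET

def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Fetch one stats page; scheduling (e.g. an hourly cron job) is done outside."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def parse_peer_stats(xml_text: str) -> list[dict]:
    """Parse a hypothetical Network.xml-style document into stat rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for peer in root.iter("peer"):  # assumed element name
        rows.append({
            "name": peer.findtext("name", default=""),
            "ppm": int(peer.findtext("ppm", default="0")),
            "qph": float(peer.findtext("qph", default="0")),
            "links": int(peer.findtext("links", default="0")),
            "words": int(peer.findtext("words", default="0")),
        })
    return rows

# Example input shaped like the assumed schema:
sample = """<peers>
  <peer><name>alpha</name><ppm>120</ppm><qph>3.5</qph>
        <links>100000</links><words>50000</words></peer>
</peers>"""
rows = parse_peer_stats(sample)
```

From there, each row would go into the statistics tables once per hour.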
But what really interested me over the last two days was the index browser page. It shows all web pages that have been indexed by that peer. If you collect this page from every peer, you can build an overall index of every indexed website.
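The merging step described above could be sketched like this. To be clear, this is my own illustration, not the author's code: it assumes each peer's index browser page has already been reduced to a plain list of indexed hosts, and simply combines those lists into one overall map of host to the peers that hold it.

```python
# Hedged sketch: merge per-peer index browser listings into one overall
# index. How the host lists are obtained (HTML scrape, API) is left out.
from collections import defaultdict

def merge_host_lists(per_peer_hosts: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each indexed host to the set of peers that have pages for it."""
    overall: dict[str, set[str]] = defaultdict(set)
    for peer_name, hosts in per_peer_hosts.items():
        for host in hosts:
            overall[host].add(peer_name)
    return dict(overall)

merged = merge_host_lists({
    "alpha": ["example.org", "yacy.net"],
    "beta": ["yacy.net"],
})
# merged["yacy.net"] == {"alpha", "beta"}
```

Knowing which peers hold which hosts would also make it possible to see how redundantly a site is indexed across the network.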
I could write a lot more about it, but for now I’d like to hear your thoughts and maybe some ideas!?
Thanks & greetings