A Distributed Search Engine for the Distributed Web

By David Hawig. Last updated May 23, 2019.

While search neutrality might be open for discussion, it is pretty clear that a centralized search engine like Google's, with a market share above 90% and quarterly earnings above 30 billion dollars, is far from ideal. Monopolies are not only economically inefficient but also increase the risk of censorship and search bias.

When it comes to finding information on the distributed web, a centralized search engine seems counterintuitive because it goes against the principles underlying the distributed web. That is why we are currently working hard to create the first fully functional, completely distributed search engine for our project Dweb.page.

Problem

Despite the downsides of current search engines mentioned above, we believe there are several reasons why the existing model has been hard to change. A distributed and fully transparent search engine for the Dweb comes with its own set of challenges:

Speed: The distributed search engine needs to be at least as fast as current solutions, yet transaction times on distributed ledgers are often a problem.
Device independence: More and more people use mobile phones, so the distributed search engine needs to run on both PCs and mobile phones without any centralized backend.
Indexing: How can data be collected, parsed, and stored in a distributed way that enables fast and accurate information retrieval while still preventing people from creating fake search entries?
Availability: How can we ensure that distributed data is still available when requested, especially since the data can be hosted locally and may therefore only be available at certain times?
Monetization and incentives: How can the storage and continuous development of the tool be financed? Without a monetization model, it will be difficult for decentralized solutions to compete with existing centralized ones, for example for talent or for partnerships and integrations.

A potential solution

To ensure high speed and feeless transactions, it was clear from the beginning that distributed ledger technologies limited by either slow transaction times or transaction fees were not an option. Therefore, we chose the combination of IPFS and IOTA. IPFS fulfills the obvious role of a fast and distributed way to share and host files, whereas IOTA provides the necessary distributed database layer. It is important to note that the database only uses a part of the IOTA technology that is already fully functional and independent of future research work (e.g., regarding the coordinator).
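
The division of labor described above can be illustrated with a minimal sketch. It does not use the real IPFS or IOTA APIs: ContentStore and MetadataLog are hypothetical in-memory stand-ins for the file layer and the database layer, and the record fields (title, cid) as well as the functions publish_file and search are assumptions made only for illustration.

```python
import hashlib
from collections import defaultdict


class ContentStore:
    """Hypothetical stand-in for the file layer (the role IPFS plays).

    Files are addressed by a hash of their content. A plain SHA-256 hex
    digest is used here; this is NOT a real IPFS content identifier.
    """

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        cid = hashlib.sha256(data).hexdigest()
        self._blobs[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        return self._blobs[cid]


class MetadataLog:
    """Hypothetical stand-in for the database layer (the role IOTA plays).

    Records are only ever appended and can be looked up by a tag.
    """

    def __init__(self):
        self._records = defaultdict(list)

    def publish(self, tag: str, record: dict) -> None:
        self._records[tag].append(dict(record))

    def find(self, tag: str) -> list:
        return list(self._records[tag])


def publish_file(store, log, data, title, keywords):
    """Store the file, then publish one metadata record per keyword."""
    cid = store.add(data)
    for keyword in keywords:
        log.publish(keyword.lower(), {"title": title, "cid": cid})
    return cid


def search(store, log, keyword):
    """Look up records by keyword and fetch each referenced file."""
    return [(r["title"], store.get(r["cid"])) for r in log.find(keyword.lower())]
```

In this sketch, the file itself never touches the database layer; only a small record pointing at the content hash does. That keeps the searchable records small while the file layer handles the actual data.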

This combination allows us to provide an experience that works on all kinds of devices. We even had a prototype running in Internet Explorer. The unique feature is that we can deliver a fully distributed experience without installing any additional software, since all the code runs inside a simple, completely open source web page, which is itself distributed on IPFS. It also means every single user will run their own search engine, which is the ultimate form of distribution.

Inspired by this distributed interface, we are working on the following concept for a distributed search engine:

The distributed and personalized search engine

We assume two types of users, whom we call Authors and Consumers (although one person can fill both roles).