Query Processing: Peer to Peer Networks

Peer-to-peer (P2P) networks are an important emerging technology in distributed computing. While the commercial viability of P2P networks is still in doubt, there is no question that they are phenomenally successful as a mechanism for file sharing. Despite this popularity, the technologies and applications behind today's P2P networks are quite primitive. Today's popular P2P networks display two major weaknesses: inefficient network protocols and impoverished query languages.
The first of these problems has been the subject of intense research in the last few years. To overcome the scaling problems of unstructured P2P systems, a number of groups have proposed structured P2P designs. These proposals support a Distributed Hash Table (DHT) functionality in which lookups can be resolved in log n (or n^x for small x) overlay routing hops for an overlay network of n hosts. These schemes are also robust to the unpredictable nature of the P2P environment, tolerating dynamic failures and additions of nodes to the network.
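To make the logarithmic lookup bound concrete, the following is a minimal single-process sketch of one well-known style of structured overlay: a ring with power-of-two "finger" pointers, in the spirit of Chord. It is an illustration under stated assumptions, not any specific proposal's protocol: the 16-bit identifier space, the SHA-1 based `ident` function, and all function names are hypothetical choices made here, and failures and node joins are not modeled.

```python
import hashlib
from bisect import bisect_left

M = 16            # identifier bits (assumption chosen for this sketch)
RING = 2 ** M     # size of the circular identifier space


def ident(name: str) -> int:
    """Map a node name or key onto the ring (hypothetical hash choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def successor(node_ids, x):
    """First node id >= x on the ring, wrapping around; node_ids is sorted."""
    return node_ids[bisect_left(node_ids, x) % len(node_ids)]


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b   # interval wraps; a == b covers the whole ring


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def build_fingers(node_ids):
    """Finger i of node n points at the successor of n + 2**i."""
    return {n: [successor(node_ids, (n + 2 ** i) % RING) for i in range(M)]
            for n in node_ids}


def lookup(fingers, start, key_id):
    """Route from `start` to the node responsible for key_id.

    Returns (owner, hops), where each forwarding step counts as one hop.
    """
    node, hops = start, 0
    while True:
        succ = fingers[node][0]
        if in_half_open(key_id, node, succ):
            return succ, hops + 1
        # Forward to the farthest finger that still precedes the key.
        nxt = node
        for f in reversed(fingers[node]):
            if in_open(f, node, key_id):
                nxt = f
                break
        if nxt == node:          # defensive: no closer finger known
            return succ, hops + 1
        node, hops = nxt, hops + 1


node_ids = sorted({ident(f"node-{i}") for i in range(64)})
fingers = build_fingers(node_ids)
owner, hops = lookup(fingers, node_ids[0], ident("some-file.mp3"))
print(f"owner={owner} reached in {hops} hops across {len(node_ids)} nodes")
```

Each forwarding step at least halves the remaining ring distance to the key's predecessor, which is why the hop count grows with log n rather than n in this style of overlay.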
DHTs promise robustness and scalability for P2P networks. However, as hash tables, DHTs support only exact match lookups. This is fine for fetching files or resolving domain names, but presents an even more impoverished query language than the original, unscalable P2P systems, which supported substring search. Hence in solving the first weakness above, DHTs have aggravated the second.
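The exact-match limitation can be seen in a toy, single-process stand-in for a DHT. This is a hedged sketch only: `ToyDHT`, `key_of`, and `scan_substring` are hypothetical names introduced here, and the full scan merely stands in for the flooding-style search of unstructured systems.

```python
import hashlib


def key_of(name: str) -> str:
    """Hash a full file name to a DHT key (hypothetical choice of SHA-1)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


class ToyDHT:
    """In-memory stand-in for a DHT: values are reachable only by hashed key."""

    def __init__(self):
        self._store = {}

    def put(self, name, value):
        self._store[key_of(name)] = (name, value)

    def get(self, name):
        """Exact-match lookup: hashes the whole name, so fragments miss."""
        entry = self._store.get(key_of(name))
        return None if entry is None else entry[1]

    def scan_substring(self, fragment):
        """Substring search by visiting every entry, as flooding would."""
        return sorted(name for name, _ in self._store.values()
                      if fragment in name)


dht = ToyDHT()
dht.put("annual_report_2001.pdf", "host-a")
dht.put("report_draft.txt", "host-b")
dht.put("holiday_photo.jpg", "host-c")

print(dht.get("report_draft.txt"))    # exact name: found
print(dht.get("report"))              # fragment hashes elsewhere: None
print(dht.scan_substring("report"))   # works, but touches every entry
```

Because the hash of a fragment bears no relation to the hash of the full name, the DHT's routing cannot help with substring queries; answering them requires either visiting every node or building additional index structures on top of the hash table.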
We propose to enhance the limited query functionality in P2P networks by studying the design and implementation of complex query facilities over DHTs. Our goals are twofold. First, we wish to bring the traditional functionality of P2P systems - file sharing - to a scalable, robust DHT implementation. Second, we hope to push query functionality well beyond current file sharing search, while still maintaining the scalability of the DHT infrastructure. We believe that this agenda can be advanced through file sharing applications, but we also foresee more powerful and perhaps more commercially viable applications of rich P2P query processing.