A novel data centric information retrieval protocol for queries in delay tolerant networks
Information Retrieval (IR) systems aim to retrieve data that satisfies certain requirements, and they constitute an important service in many types of networks, including Delay/Disruption Tolerant Networks (DTNs). Current DTN-based IR systems assume that the data satisfying a query is stored on a single node. Therefore, once a query reaches a node that holds the corresponding data, the query can be resolved completely. However, in scenarios where a query requires data from multiple nodes, these IR systems may fail. Hence, in this paper, we propose Distributed Data-Centric Information Retrieval (DDC-IR), a data-centric IR system that supports all query types, including continuous and complex queries. More importantly, it is designed specifically to operate in DTNs. It also incorporates a new packet, called the Query Reply Packet, that includes both a query and one or more replies. We show how this packet facilitates efficient query resolution and enables data-centric routing. In addition, DDC-IR uses caching: nodes store popular queries, which speeds up query resolution. We have conducted an extensive simulation study to compare DDC-IR with state-of-the-art IR systems using the popular Random Waypoint model and a trace file containing student movements on a campus. The results show that DDC-IR is able to resolve 50 % more queries and has an 80 % lower buffer occupancy level than existing IR systems. We also tested DDC-IR in networks of varying sizes. In networks with 100 nodes, DDC-IR is able to resolve queries, whereas current IR systems fail to resolve any. In particular, as the number of nodes increases, current IR systems fail to resolve any queries, whilst DDC-IR is able to resolve complex and continuous queries. The influence of the number of sub-queries on query resolution time is also studied. Specifically, when the number of sub-queries in a complex query increases from five to nine, DDC-IR takes 50 % more time to resolve a query. In comparison, prior IR systems fail to resolve any queries.
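To illustrate the idea of a single packet that carries both a query and the replies gathered so far, the following is a minimal Python sketch. It is not the DDC-IR packet format or routing logic: the class name `QueryReplyPacket`, its fields, the methods `pending`, `collect_from` and `is_resolved`, and the sample node data are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class QueryReplyPacket:
    """Hypothetical packet holding a query and the replies collected so far."""
    query_id: str
    sub_queries: list[str]  # data items the (complex) query asks for
    replies: dict[str, str] = field(default_factory=dict)  # sub-query -> data

    def pending(self) -> list[str]:
        """Return the sub-queries that no visited node has answered yet."""
        return [s for s in self.sub_queries if s not in self.replies]

    def collect_from(self, node_store: dict[str, str]) -> int:
        """Attach every pending reply this node can supply; return the count."""
        added = 0
        for key in self.pending():
            if key in node_store:
                self.replies[key] = node_store[key]
                added += 1
        return added

    def is_resolved(self) -> bool:
        """A query counts as resolved once every sub-query has a reply."""
        return not self.pending()


# Illustrative data only: two nodes that each hold part of the answer.
packet = QueryReplyPacket("q1", ["temperature", "humidity"])
node_a = {"temperature": "21C"}
node_b = {"humidity": "40%", "pressure": "1013hPa"}

packet.collect_from(node_a)
partially_resolved = packet.is_resolved()  # only one of two sub-queries answered
packet.collect_from(node_b)
fully_resolved = packet.is_resolved()      # both sub-queries now answered
```

In this sketch, a query that needs data from two nodes remains pending after the first encounter and is completed at the second, which is the multi-node case that single-node IR systems cannot handle.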