Dynamic query scheduling for online integration of semistructured data
In a data integration system, a user request issued at a central site is decomposed into a number of sub-requests, which are then processed at remote sites. The results are sent back to the central site for integration, and the integrated results are returned to the user. Data integration systems often fail to achieve their best performance because data arrival rates are unpredictable. Traditionally, data integration requires the complete results from the remote sites to be available at the central site before the final computations begin. An online integration system instead starts, and then continues, the computations at the central site shortly after each piece of data is received from the remote sites. Executing an online integration plan under a static scheduling strategy degrades the performance of the data integration system, because unnecessary computations are executed in some circumstances. This paper proposes a dynamic scheduling strategy for online integration plans. It employs a data increment monitoring system that can change the data integration plan dynamically whenever necessary.
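
The idea of online integration driven by a data increment monitor can be illustrated with a minimal sketch. It is not the scheduling algorithm of this paper. It assumes two remote sites, "A" and "B", whose increments are joined on a shared key, and a purely hypothetical switching rule: an increment is integrated immediately ("eager") when the recent mean increment size from its site reaches a threshold, and buffered ("deferred") otherwise. The names `IncrementMonitor`, `choose_plan`, `integrate_online`, and the `threshold` parameter are all illustrative assumptions.

```python
from collections import defaultdict, deque


class IncrementMonitor:
    """Records the sizes of recent data increments per remote site (hypothetical)."""

    def __init__(self, window=3):
        self._sizes = defaultdict(lambda: deque(maxlen=window))

    def record(self, site, increment):
        self._sizes[site].append(len(increment))

    def mean_size(self, site):
        sizes = self._sizes[site]
        return sum(sizes) / len(sizes) if sizes else 0.0


def choose_plan(monitor, site, threshold):
    # Assumed rule, not taken from the paper: integrate at once when recent
    # increments from this site are large, otherwise buffer them.
    return "eager" if monitor.mean_size(site) >= threshold else "deferred"


def integrate_online(arrivals, threshold=2.0):
    """Join increments from sites 'A' and 'B' on their key as they arrive."""
    other = {"A": "B", "B": "A"}
    tables = {"A": defaultdict(list), "B": defaultdict(list)}
    pending = []  # buffered (site, key, value) rows not yet integrated
    results, plan_log = [], []
    monitor = IncrementMonitor()

    def process(site, key, value):
        # Probe the other site's table, then store the row for later probes.
        for match in tables[other[site]][key]:
            pair = (value, match) if site == "A" else (match, value)
            results.append((key, *pair))
        tables[site][key].append(value)

    for site, increment in arrivals:
        monitor.record(site, increment)
        plan = choose_plan(monitor, site, threshold)  # plan may change per increment
        plan_log.append(plan)
        pending.extend((site, k, v) for k, v in increment)
        if plan == "eager":
            for row in pending:
                process(*row)
            pending.clear()

    for row in pending:  # integrate whatever is still buffered at end of input
        process(*row)
    return results, plan_log


arrivals = [
    ("A", [(1, "a1")]),
    ("B", [(1, "b1"), (2, "b2")]),
    ("A", [(2, "a2")]),
]
results, plan_log = integrate_online(arrivals)
```

In this sketch the joined output is the same whichever plan is chosen at each step; only the moment at which the central site does the work changes, which is the degree of freedom a dynamic scheduler exploits.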