Publication Details

This article was originally published as: Getta, JR, Optimization of online data integration, 7th International Baltic Conference on Databases and Information Systems 2006, 3-6 July 2006, 91-97. Copyright 2006 IEEE.


Online data integration is the continuous consolidation of data transmitted over wide area networks with data already stored at the central site of a multidatabase system. Because the process is continuous, the data integration procedure must be activated each time a new portion of data arrives at the central site. An efficient implementation of online data integration needs a new system of elementary operations on the increments and/or decrements of data and on the intermediate results of integration. This work shows how to derive such a system of elementary operations from a system of base operations on data containers. In particular, we define a new system of online operations based on the binary operations of relational algebra. The paper analyses the properties of the new system and describes the transformation of global data integration expressions into collections of online data integration plans. It then shows how the system can be used for the comprehensive analysis and optimization of online data integration plans. The optimization techniques described in the paper include reduction of input data increments, identification and elimination of intermediate materializations, and reduction of fixed size arguments in online data integration plans.
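As a rough illustration of what an operation on increments and decrements can look like, the sketch below maintains a materialized join when one argument changes. It is not the operator system defined in the paper: it assumes set semantics, relations modelled as Python sets of `(key, value)` pairs, and an equi-join on the first component. The function names `join`, `online_join_insert`, and `online_join_delete` are hypothetical and chosen only for this example.

```python
def join(r, s):
    """Equi-join on the first component: (k, a) and (k, b) give (k, a, b)."""
    return {(k1, a, b) for (k1, a) in r for (k2, b) in s if k1 == k2}


def online_join_insert(result, delta_r, s):
    """Update a stored join(r, s) after the increment delta_r is added to r.

    Relies on the identity join(r | delta, s) == join(r, s) | join(delta, s),
    so only the increment is joined with s (assumption: set semantics).
    """
    return result | join(delta_r, s)


def online_join_delete(result, delta_r, s):
    """Update a stored join(r, s) after the decrement delta_r is removed from r.

    Relies on join(r - delta, s) == join(r, s) - join(delta, s), which holds
    here because every output tuple carries its whole r-tuple.
    """
    return result - join(delta_r, s)


# Data already stored at the central site, and the intermediate result.
r = {(1, "a"), (2, "b")}
s = {(1, "x"), (3, "y")}
result = join(r, s)

# A new portion of data arrives for r: process only the increment.
increment = {(3, "c")}
after_insert = online_join_insert(result, increment, s)

# Later a decrement arrives for r: again only the change is processed.
decrement = {(1, "a")}
after_delete = online_join_delete(after_insert, decrement, s)
```

In both steps the stored intermediate result is reused and only the arriving change is joined with `s`, which is the kind of saving an online operation is meant to provide over recomputing the full expression.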


