Static optimization of data integration plans in global information systems
Global information systems provide their users with a centralized and transparent view of many heterogeneous and distributed sources of data. Requests to access data submitted at a central site are decomposed and processed at the remote sites, and the results are returned to the central site. A data integration component of the system processes the data retrieved and transmitted from the remote sites according to previously prepared data integration plans. This work addresses the problem of static optimization of data integration plans in a global information system. Static optimization means that a data integration plan is transformed into a more efficient form before it is used for data integration. We adopt an online approach to data integration, in which the packets of data transmitted over a wide area network are integrated into the final result as soon as they arrive at the central site. We show how a data integration expression obtained from a user request can be transformed into a collection of data integration plans, one for each argument of the data integration expression. This work proposes a number of static optimization techniques that change the order of operations and eliminate materializations and constant arguments from data integration plans implemented as relational algebra expressions.
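The online approach with one plan per argument can be illustrated with a minimal sketch. It assumes, purely for illustration, a data integration expression that is a single equijoin of two remote arguments R and S. The names `join`, `OnlineJoin`, `on_r_packet` and `on_s_packet` are hypothetical and do not come from this work; the plans actually derived here may differ.

```python
from typing import Dict, List

Row = Dict[str, object]


def join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Equijoin of two lists of rows on a shared attribute (hash join)."""
    index: Dict[object, List[Row]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


class OnlineJoin:
    """Hypothetical integrator for the expression R JOIN S.

    Each argument has its own plan, run whenever a packet of that
    argument arrives at the central site:
      plan for R: result := result UNION (delta_R JOIN S); R := R UNION delta_R
      plan for S: result := result UNION (R JOIN delta_S); S := S UNION delta_S
    """

    def __init__(self, key: str) -> None:
        self.key = key
        self.r: List[Row] = []       # rows of R received so far
        self.s: List[Row] = []       # rows of S received so far
        self.result: List[Row] = []  # final result, built incrementally

    def on_r_packet(self, delta: List[Row]) -> None:
        # Join only the new packet with what has arrived from S.
        self.result.extend(join(delta, self.s, self.key))
        self.r.extend(delta)

    def on_s_packet(self, delta: List[Row]) -> None:
        # Join what has arrived from R with only the new packet.
        self.result.extend(join(self.r, delta, self.key))
        self.s.extend(delta)
```

In this sketch each packet is joined only with the data already received from the other argument, so the result is never recomputed from scratch. Once all packets have arrived, the accumulated result contains the same rows as a single join over the complete arguments, regardless of the order in which the packets arrived.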