Degree Name

Doctor of Philosophy


School of Computing and Information Technology


It is very common for large, multinational business organizations to distribute their operational database systems over wide area networks. Integrating data from these distributed and highly autonomous operational database systems requires overcoming many organizational and technical problems. A materialized view is a permanently stored relational table that contains integrated data from other data sources. Materialized views are an important technique in implementing the data warehousing approach to data integration. In this approach, the data warehouse system extracts, transforms, and loads data from heterogeneous sources into a single materialized view, so that data residing in different sources are combined into a unified view.
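To make the idea concrete, the following is a minimal sketch of a materialized view as a stored table populated from two sources. It is an illustration only, not the system described in this thesis: the table names (`src_eu`, `src_us`, `mv_orders`), the column names, and the sample rows are all hypothetical, and a single in-memory SQLite database stands in for what would be separate remote operational databases.

```python
import sqlite3


def build_materialized_view(conn):
    """Extract rows from two hypothetical sources and store them in one table."""
    cur = conn.cursor()

    # Two stand-in "operational" sources with different schemas (assumed for illustration).
    cur.execute("CREATE TABLE src_eu (order_id INTEGER, amount_eur REAL)")
    cur.execute("CREATE TABLE src_us (id INTEGER, amount_usd REAL)")
    cur.executemany("INSERT INTO src_eu VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    cur.executemany("INSERT INTO src_us VALUES (?, ?)", [(7, 5.0)])

    # The materialized view is an ordinary, permanently stored table
    # with a unified schema, not a query evaluated on demand.
    cur.execute(
        "CREATE TABLE mv_orders (source TEXT, order_id INTEGER, amount REAL)"
    )

    # Extract from each source, transform to the unified schema, and load.
    cur.execute("INSERT INTO mv_orders SELECT 'eu', order_id, amount_eur FROM src_eu")
    cur.execute("INSERT INTO mv_orders SELECT 'us', id, amount_usd FROM src_us")
    conn.commit()

    return cur.execute(
        "SELECT source, order_id, amount FROM mv_orders ORDER BY source, order_id"
    ).fetchall()
```

Because the integrated rows are stored rather than recomputed, later changes at the sources are not visible in `mv_orders` until the view is synchronized again.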

As future work, we also consider integrating sampling techniques into the group strategies. With this enhancement, firstly, the data warehouse is not required to maintain statistics about the remote data source. More importantly, it provides the flexibility to run the synchronization at arbitrary frequencies as desired. Moreover, considering the various pricing models of the emerging cloud computing environment, it would be helpful to design a mechanism that balances the trade-off between data processing, measured by CPU time and RAM usage, and communication costs, measured by the traffic allotment. According to the pricing model, a set of control parameters should be identified and defined, so that the process of delta extraction can be configured for different settings according to the underlying pricing model. It also seems promising to extend our current optimization techniques to support NoSQL databases.
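One possible shape for such a mechanism is sketched below. It is an assumption-laden illustration, not a design from this thesis: the `PricingModel` and `ExtractionPlan` types, their fields, and the linear cost formula are hypothetical, chosen only to show how control parameters taken from a pricing model could select between delta extraction plans that trade processing for traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingModel:
    """Hypothetical control parameters derived from a cloud pricing model."""
    cpu_second: float     # price per CPU second
    ram_gb_second: float  # price per GB of RAM held for one second
    traffic_gb: float     # price per GB transferred


@dataclass(frozen=True)
class ExtractionPlan:
    """Hypothetical resource estimate for one delta extraction plan."""
    name: str
    cpu_seconds: float
    ram_gb: float
    traffic_gb: float


def plan_cost(plan: ExtractionPlan, pricing: PricingModel) -> float:
    """Estimated cost: processing (CPU time and RAM usage) plus communication."""
    processing = (
        plan.cpu_seconds * pricing.cpu_second
        + plan.ram_gb * plan.cpu_seconds * pricing.ram_gb_second
    )
    communication = plan.traffic_gb * pricing.traffic_gb
    return processing + communication


def cheapest_plan(plans, pricing: PricingModel) -> ExtractionPlan:
    """Pick the plan with the lowest estimated cost under the given pricing."""
    return min(plans, key=lambda p: plan_cost(p, pricing))
```

Under this sketch, a plan that filters heavily at the source would be preferred when traffic is expensive, while a plan that ships more raw data would be preferred when CPU time dominates the bill; a real mechanism would need measured rather than assumed resource estimates.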