Chapter Objectives
In this chapter, you will learn to:
• identify the key challenges and approaches for data and process integration;
• understand the basic mechanisms of searching unstructured data within an organization and across the World Wide Web;
• define data quality as a multidimensional concept and understand how master data management (MDM) can contribute to it;
• understand different frameworks and standards for data governance;
• highlight more recent approaches in data warehousing, data integration, and governance.
Opening Scenario
Things are going well at Sober. The company has set up a solid data environment built on a relational database management system (RDBMS) that supports the bulk of its operations. Sober's mobile app development team has been using MongoDB as a scalable NoSQL DBMS to handle the increased workload coming from mobile app users and to provide back-end support for experimental features the team wants to test in new versions of the app. Sober's development and database team is already paying attention to various data quality and governance aspects: the RDBMS is the central source of truth, with a strong focus on sound schema design and regular quality checks on the data. The NoSQL database is an additional support system that handles large query volumes from mobile users in real time and in a scalable manner, but all data changes are still propagated to the central RDBMS. This propagation is done manually, which sometimes leads to the two data sources disagreeing with each other. Sober's team therefore wants to consider better data quality approaches, implement more robust quality checks on the data, and make sure that changes to the NoSQL database are propagated to the RDBMS in a timely and correct manner. Sober also wants to understand how its data flows can be better integrated with its business processes.
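To make the disagreement between the two data sources concrete, the following minimal sketch shows one way such a reconciliation check could look. It is an illustration under stated assumptions, not a description of Sober's actual setup: the `ride` table, the `ride_id` and `fee` fields, and the `find_discrepancies` function are hypothetical names, an in-memory SQLite database stands in for the central RDBMS, and a plain list of dictionaries stands in for documents read from the NoSQL store.

```python
import sqlite3


def find_discrepancies(documents, conn):
    """Compare NoSQL-style documents with RDBMS rows, keyed by ride_id.

    All table and field names are hypothetical and chosen for illustration.
    """
    rows = conn.execute("SELECT ride_id, fee FROM ride").fetchall()
    rdbms = {ride_id: fee for ride_id, fee in rows}
    nosql = {doc["ride_id"]: doc["fee"] for doc in documents}

    shared_keys = set(nosql) & set(rdbms)
    return {
        # Present in the NoSQL store but never propagated to the RDBMS.
        "missing_in_rdbms": sorted(set(nosql) - set(rdbms)),
        # Present in the RDBMS but absent from the NoSQL store.
        "missing_in_nosql": sorted(set(rdbms) - set(nosql)),
        # Present in both stores, but with different values.
        "mismatched": sorted(k for k in shared_keys if nosql[k] != rdbms[k]),
    }


def build_demo():
    """Create made-up sample data for both stores."""
    conn = sqlite3.connect(":memory:")  # stand-in for the central RDBMS
    conn.execute("CREATE TABLE ride (ride_id INTEGER PRIMARY KEY, fee REAL)")
    conn.executemany(
        "INSERT INTO ride VALUES (?, ?)",
        [(1, 12.5), (2, 8.0), (4, 20.0)],
    )
    # Stand-in for documents fetched from the NoSQL database.
    documents = [
        {"ride_id": 1, "fee": 12.5},
        {"ride_id": 2, "fee": 9.0},
        {"ride_id": 3, "fee": 15.0},
    ]
    return documents, conn


documents, conn = build_demo()
report = find_discrepancies(documents, conn)
print(report)
```

A check of this kind only detects disagreement after the fact; it does not replace a reliable propagation mechanism, which is the underlying problem Sober's team wants to address.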
In this chapter we will look at some managerial and technical aspects of data integration. We will zoom in on data integration techniques, data quality, and data governance. As companies often end up with many information systems and databases over time, data integration becomes increasingly important for consolidating a company's data and providing a single, unified view to applications and users.