Data governance and data quality in a big data context

An opinion piece by Pankaj Upadhyay, Vice President - Data Science, BI & Analytics, Maveric Systems

The banking industry is a prime example of how data controls the world. The volume of data generated is enormous and continues to grow with the advent of digital technologies. As customers and businesses increasingly gravitate towards digital banking, the dependency on Big Data grows. According to a study by IDC, worldwide revenue for big data and analytics solutions is expected to reach $260 billion by 2022. The benefits of Big Data in banking are clear. From improving customer experience and delivering superior analytics to streamlining internal processes and enhancing cybersecurity, it has become an integral part of growing banks.

While we understand the explosion of data, the banking sector has its own set of roadblocks and challenges to big data implementation.

  • Working with a legacy infrastructure

Time and again, modern banking solutions are hindered due to their reliance on legacy infrastructure. The monoliths are not built to process the constant influx of data that comes with big data use cases. Modernization or overhauling the legacy infrastructure is an expensive affair that many banks are not ready to dive into.

  • Connecting data silos

Banking products and services are spread across departments, and so is their data. Collating data from siloed banking operations is an uphill task that results in redundant processes and inaccuracies in data aggregation.

  • Data usability issues

With so many different types of data, banks are struggling to collate reliable and usable data. According to a whitepaper by Deutsche Bank, 62 percent of banks agree that big data is critical for success. Yet only 29 percent reported getting enough business value from their data. The challenge lies in sorting through volumes of irrelevant data in data lakes and in incorporating new types of data into an already overburdened system.

  • Data privacy concerns

Over the years, big data has come under scrutiny for how it handles privacy. Despite the use of anonymized data, it has been shown repeatedly that such data can be re-identified. Customers can be identified this way and left vulnerable to malicious actors.
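As a rough illustration of why anonymized data can still be re-identified, the sketch below counts how many records remain unique once a few indirect attributes are combined. The records, field names, and the choice of attributes are all hypothetical and are not drawn from any real dataset; this is a simple uniqueness check under those assumptions, not a full privacy assessment.

```python
from collections import Counter

# Hypothetical "anonymized" records: names are removed, but indirect
# attributes (quasi-identifiers) such as postcode and birth year remain.
records = [
    {"postcode": "560001", "birth_year": 1984, "gender": "F", "balance_band": "high"},
    {"postcode": "560001", "birth_year": 1984, "gender": "M", "balance_band": "low"},
    {"postcode": "560001", "birth_year": 1990, "gender": "F", "balance_band": "mid"},
    {"postcode": "560002", "birth_year": 1990, "gender": "F", "balance_band": "mid"},
    {"postcode": "560002", "birth_year": 1990, "gender": "F", "balance_band": "low"},
]

# Assumed set of attributes an outsider could plausibly know about a customer.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def unique_record_share(rows, keys):
    """Return the fraction of rows whose combination of `keys` is unique."""
    if not rows:
        return 0.0
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


share = unique_record_share(records, QUASI_IDENTIFIERS)
print(f"{share:.0%} of records are unique on {QUASI_IDENTIFIERS}")
```

Any record that is unique on these attributes could be matched to a person by someone who already knows those attributes from another source, even though no name appears in the data.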

Integrating data quality into your data governance strategy

In recent years, the explosion of digital banking solutions has placed data quality and data governance at the core of banks' data strategy. Although the terms are used interchangeably, data quality and data governance are two different aspects of data management. Data quality refers to the accuracy, consistency, completeness, integrity, and timeliness of data; data governance is the systematic management of data produced via services or business processes through a well-defined, designed, and structured framework. Its rules and policies govern data ownership, data processes, and the data technologies used by the business. In simple terms, data governance provides a framework for managing data quality.
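To make the quality dimensions more concrete, here is a minimal sketch that scores three of them (completeness, consistency, and timeliness) on a handful of records. The sample data, the allowed country codes, and the one-year freshness rule are assumptions made purely for illustration; a real program would define these rules through its governance framework.

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
customers = [
    {"id": 1, "email": "a@example.com", "country": "IN", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "country": "IN", "updated": date(2023, 1, 10)},
    {"id": 3, "email": "c@example.com", "country": "India", "updated": date(2022, 1, 15)},
    {"id": 4, "email": "d@example.com", "country": "GB", "updated": date(2024, 3, 30)},
]

ALLOWED_COUNTRIES = {"IN", "GB", "US"}  # assumed reference list
MAX_AGE_DAYS = 365                      # assumed freshness rule


def completeness(rows, field):
    """Share of rows where `field` is populated."""
    return sum(row[field] is not None for row in rows) / len(rows)


def consistency(rows, field, allowed):
    """Share of rows where `field` uses one of the agreed reference values."""
    return sum(row[field] in allowed for row in rows) / len(rows)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows updated within `max_age_days` of `as_of`."""
    return sum((as_of - row[field]).days <= max_age_days for row in rows) / len(rows)


as_of = date(2024, 6, 1)
report = {
    "completeness(email)": completeness(customers, "email"),
    "consistency(country)": consistency(customers, "country", ALLOWED_COUNTRIES),
    "timeliness(updated)": timeliness(customers, "updated", as_of, MAX_AGE_DAYS),
}
for name, score in report.items():
    print(f"{name}: {score:.0%}")
```

The checks themselves are the data quality part; deciding which values are allowed, how fresh data must be, and who owns each rule is the governance part.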

Strong data governance is becoming crucial for banks. With proper data governance practices, an organization can improve performance, reduce costs, alleviate data issues, and mitigate data breaches. However, effectively deploying a data governance framework is a challenging task, and it is complicated further in the BFSI sector by the nature of the data held.

  • Understand regulations and compliance.

Ensuring compliance with all regulations is an essential component of a data governance program. A major aim for regulatory and legal bodies is the ability to monitor, control, store, search, retrieve, and analyze structured and unstructured data. While Europe’s General Data Protection Regulation (GDPR) emphasizes consumer data privacy, open banking initiatives and the Payment Services Directive 2 (PSD2) raise questions of data ownership and accountability. The Markets in Financial Instruments Directive II (MiFID II), in turn, emphasizes establishing data lineage procedures for reliable reporting. Without a robust enterprise-wide data governance framework, it would be impossible to meet all of the evolving regulatory and legal requirements.

  • Monitor key metrics

Integrating a data governance strategy is not a one-time effort. Rather, it relies on data inputs to create a long-term, sustainable governance program. To this end, every data governance strategy should include a quantitative assessment of how efficient it is, from implementation, to process changes, to improving corporate mandates and standards. For financial institutions, this boils down to two key metrics: data quality and policy adherence. Policy adherence is required to mitigate error-based data losses caused by employees, vendors, or users handling data.
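One way to picture this ongoing assessment is a small scorecard that compares each period's readings for the two metrics against agreed targets. The sketch below is illustrative only: the target values, the quarterly readings, and the names `MetricReading` and `breaches` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricReading:
    period: str
    data_quality: float      # share of records passing quality checks
    policy_adherence: float  # share of audited data-handling events that followed policy


# Assumed targets; a real program would set these with business and IT leaders.
TARGETS = {"data_quality": 0.95, "policy_adherence": 0.98}


def breaches(reading, targets):
    """Return the names of metrics that fall below their target."""
    return [name for name, target in targets.items() if getattr(reading, name) < target]


# Hypothetical quarterly readings.
readings = [
    MetricReading("2024-Q1", data_quality=0.93, policy_adherence=0.99),
    MetricReading("2024-Q2", data_quality=0.96, policy_adherence=0.97),
    MetricReading("2024-Q3", data_quality=0.97, policy_adherence=0.99),
]

for reading in readings:
    missed = breaches(reading, TARGETS)
    status = "on target" if not missed else "below target: " + ", ".join(missed)
    print(f"{reading.period}: {status}")
```

Tracking the same two numbers period after period is what turns governance from a one-time project into a program that can show whether it is improving.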

Sustaining a governance program

To make a data governance program easier to run, one can split the strategy into three main pieces: rules, enforcement, and management of data. The core objectives of a data governance program should be closely aligned to the people, processes, and technology of an organization, enabling them to leverage data as an organizational asset. To achieve this, business and IT leaders must work together to design a governance program focusing on:

  • Building governance infrastructure and technology
  • Defining the processes and business rules
  • Developing common and standard data domain definitions
  • Developing architecture practices and standards
  • Clearly defining responsibility and accountability
  • Monitoring and improving data quality with a metrics-based strategy

Enforcement of data governance initiatives is a top-down approach in which management guides the organization towards efficient information management. Ensure that all stakeholders are on board and understand the impact of governance policies. As the data explosion continues, the priorities for BFSI leaders driving next-generation data governance and data quality would include:

  • Governing the use of data, access, and privacy

For any data governance plan, the roles of data owners and stewards should be well-defined. Leaders must define who in their organization should have access to their data, across which software tools, and regularly audit access to data. The main aspects to keep in mind are:

  • Who: gets to access, touch, and create data in an organization?
  • What: type of data is allowed in your operations? (keeping in mind the compliance and regulatory requirements)
  • When: does the data become obsolete? Understand the data retention policies as per defined guidelines.
  • Where: should you store the data?
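The four questions above can be captured as an explicit policy record that tools and audits can check against. The following is a minimal sketch, assuming a hypothetical `DataPolicy` structure with made-up roles, fields, retention period, and storage region; it is not a reference to any particular product or regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataPolicy:
    dataset: str
    allowed_roles: frozenset   # Who may access the data
    allowed_fields: frozenset  # What data may be used
    retention_days: int        # When the data becomes obsolete
    storage_region: str        # Where the data is stored


# Hypothetical policy for one dataset; every value here is an assumption.
POLICY = DataPolicy(
    dataset="customer_transactions",
    allowed_roles=frozenset({"data_steward", "fraud_analyst"}),
    allowed_fields=frozenset({"account_id", "amount", "timestamp"}),
    retention_days=7 * 365,
    storage_region="eu-west",
)


def can_access(policy, role, fields):
    """Allow access only for permitted roles requesting permitted fields."""
    return role in policy.allowed_roles and set(fields) <= policy.allowed_fields


def is_expired(policy, created_on, today):
    """Flag data that has outlived the retention period."""
    return today - created_on > timedelta(days=policy.retention_days)


print(can_access(POLICY, "fraud_analyst", ["account_id", "amount"]))
print(can_access(POLICY, "marketing", ["amount"]))
print(is_expired(POLICY, date(2015, 1, 1), date(2024, 1, 1)))
```

Writing the policy down in this form also gives regular access audits something concrete to compare actual usage against.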


  • Extending data governance to advanced analytics and technologies

The rapid evolution of digital technology, artificial intelligence, blockchain, APIs, and other emerging technologies is changing the way consumers interact with a product or service and how data is generated. The availability of new types of unstructured data from advanced analytics and technologies raises questions about how organizations use the data, what they use, and whether they can or should use it at all. For instance, data from facial recognition software can range from selfies to CCTV footage. How much of the data that a user puts out is valuable? Can it be used for creating customer profiles? Should organizations use the data at all? While the regulations around these technologies are murky, organizations must prepare to adapt their data governance strategy to the potential use cases.

  • Building a robust framework

Data governance is a broad category, and its implementation varies by organization and industry. As a start, however, one must include policies, rules, procedures, and structures for data management. It is important to remember that good data governance provides clarity and improves data quality. To build a robust framework, Gartner lays out seven principles:

  • Clearly define accountability and decision rights
  • Incorporate a trust model of governance
  • Inculcate a culture of collaboration
  • Ensure transparency, ethics, and open decision-making processes
  • Keep risk and security always in mind
  • Invest in training data owners and employees
  • Align your data governance program with business value and desired outcomes