Data quality issues are not always easy to differentiate from outliers that are attributable to a specific business reason. That’s why it’s important to have solid background knowledge of the business before attempting any data quality management exercise. In our experience, data quality issues tend to fall into the following categories:
- Missing data, i.e., data gaps whose causes may be missing completely at random, missing at random or missing not at random
- Global duplicates, i.e., double entries for what should be one distinct entry
- Local (field-related) quasi-duplicates, i.e., unintentionally near-identical input in free text fields
- Local (field-related) outliers, i.e., values which deviate significantly from what we can reasonably expect
- Global outliers/anomalies, i.e., atypical data points other than those with an underlying business cause
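The first three categories above can be illustrated with a few lines of code. The sketch below is a minimal, stdlib-only example on hypothetical client records (all field names and values are invented for illustration); it flags missing required fields, exact global duplicates, and a local outlier using a robust median-based deviation check rather than a trained model.

```python
import statistics

# Hypothetical client records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Acme Ltd", "country": "LU", "balance": 1_200.0},
    {"id": 2, "name": "Acme Ltd", "country": "LU", "balance": 1_200.0},    # global duplicate
    {"id": 3, "name": "Beta SA", "country": None, "balance": 980.0},       # missing field
    {"id": 4, "name": "Gamma AG", "country": "DE", "balance": 950_000.0},  # local outlier
    {"id": 5, "name": "Delta BV", "country": "NL", "balance": 1_100.0},
]

# Missing data: required fields left empty.
missing = [r["id"] for r in records if any(v is None for v in r.values())]

# Global duplicates: identical entries apart from the technical key.
seen, duplicates = {}, []
for r in records:
    key = tuple(sorted((k, v) for k, v in r.items() if k != "id"))
    if key in seen:
        duplicates.append(r["id"])
    seen.setdefault(key, r["id"])

# Local outliers: balances far from the field's typical range, measured
# against the median absolute deviation (robust to the outlier itself).
balances = [r["balance"] for r in records]
med = statistics.median(balances)
mad = statistics.median(abs(b - med) for b in balances)
outliers = [r["id"] for r in records if abs(r["balance"] - med) > 5 * mad]
```

In practice these checks would run against real reference data and tuned thresholds; the point is that each issue category maps to a distinct, automatable test.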
Artificial intelligence (AI) promises exciting potential not only in mining data to deliver valuable insights but also in addressing the quality of data. One subset of AI is machine learning (ML), an algorithmic approach that recognizes patterns and learns from data without being explicitly programmed. Machine learning is a useful way to address data quality problems and support active data quality management at every stage of the data lifecycle.
At the creation phase, for example, data quality machine learning (DQ-ML) methods may be applied during onboarding, when the basic data (e.g., client name, client type, client gender, address, country, client-specific features) is collected and entered into the financial institution’s IT system. Even where the process should not normally allow required fields to be left empty, machine learning methods can detect inconsistencies in the client profile (e.g., between the provided addresses and countries), thereby improving the quality of data for any subsequent aspect of customer relationship management and flagging suspicious or erroneous client data.
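One way such an address/country consistency check can work is by learning which combinations are plausible from historically validated records. The sketch below is a deliberately simplified, stdlib-only stand-in for a trained model: it counts city/country pairs seen in (hypothetical) clean onboarding history and flags any new profile whose combination was never observed.

```python
from collections import Counter

# Historical, validated onboarding records (hypothetical training data).
history = [
    ("Luxembourg", "LU"), ("Luxembourg", "LU"), ("Paris", "FR"),
    ("Frankfurt", "DE"), ("Paris", "FR"), ("Amsterdam", "NL"),
]

# "Learn" which city/country combinations are plausible from past data.
pair_counts = Counter(history)

def flag_profile(city: str, country: str) -> bool:
    """Return True if the address/country combination looks inconsistent."""
    return pair_counts[(city, country)] == 0

flag_profile("Paris", "FR")  # consistent with history, not flagged
flag_profile("Paris", "DE")  # never observed together, flagged
```

A production DQ-ML system would replace the raw frequency lookup with a model that generalizes to unseen but valid combinations, but the principle is the same: patterns learned from clean data define what "consistent" means.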
Later in the data lifecycle, DQ-ML methods, using various unsupervised and supervised techniques as well as natural language processing, may be developed in order to detect inconsistencies, duplicates and missing values in transaction data. This enables more accurate monitoring of suspicious transactions, for example, and can help financial institutions flag potential compliance issues in areas like money laundering, market manipulation or insider trading.
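For the quasi-duplicate case in free-text fields, a common baseline is string similarity after normalization. The sketch below uses the standard library's `difflib.SequenceMatcher` on hypothetical counterparty names (the names and the 0.9 threshold are illustrative assumptions, not values from the text):

```python
from difflib import SequenceMatcher

# Hypothetical counterparty names from free-text transaction fields.
names = ["ACME Holdings S.A.", "Acme Holdings SA", "Beta Trading GmbH"]

def similarity(a: str, b: str) -> float:
    # Normalize case and strip punctuation/whitespace before comparing.
    clean = lambda s: "".join(c for c in s.lower() if c.isalnum())
    return SequenceMatcher(None, clean(a), clean(b)).ratio()

# Flag pairs above a similarity threshold as quasi-duplicates.
pairs = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if similarity(a, b) > 0.9
]
```

Here the two ACME variants collapse to the same normalized string and are flagged as one entity; more sophisticated NLP techniques (tokenization, embeddings) extend the same idea to messier text.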
Another example is the area of credit risk models, where the discriminatory power of a model computing the probabilities of default can be improved by applying DQ-ML methods. By detecting (and potentially also remediating) data quality issues, organizations benefit from significantly improved model performance and can also, for example, calculate their regulatory capital more accurately.
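Discriminatory power is commonly measured with the area under the ROC curve (AUC). The stdlib sketch below computes AUC via the Mann-Whitney statistic and uses hand-made, illustrative numbers to show how a single data quality issue, a defaulter's score distorted by a bad input, can degrade it:

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the probability that a
    randomly chosen defaulter (label 1) scores above a randomly
    chosen non-defaulter (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0]
clean_scores = [0.9, 0.8, 0.3, 0.2]  # scores from clean input data
corrupted = [0.9, 0.1, 0.3, 0.2]     # one defaulter's score distorted
                                     # by a bad input (e.g., income = 0)

auc(labels, clean_scores)  # 1.0 — perfect ranking
auc(labels, corrupted)     # 0.5 — no better than chance
```

The example is deliberately tiny, but it makes the mechanism concrete: remediating the corrupted input restores the ranking and, with it, the model's discriminatory power.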
AI enables computers to perceive, learn and reason. The prerequisite of good data quality remains, but the possibility of addressing quality issues directly at source using AI is growing.