AML Model Governance: Bridging the Domain Gap
September 11th, 2020
Financial crime and compliance teams are continuously trying to improve money laundering detection while aligning more closely with regulator expectations. One persistent obstacle is the need for transparency toward, and partnership with, model governance teams. About ten years ago, when the OCC 2011-12 supervisory guidance on “Model Risk Management” was issued to ensure model risk policies, procedures, and practices were effective, AML programs had it all sorted out: any time a threshold was changed, it was documented. But with AML teams now leveraging machine learning to supplement rules, the game has changed and a domain gap challenge has surfaced.
AML Model Governance: What it Looks Like
Typically, those involved in the governance process belong to teams dedicated to model governance, compliance, and operations. Each team requires a different set of artifacts to play its part, especially when providing evidence to regulators in the event of an audit.
- Model Governance teams require comprehensive documentation to authorize the deployment of the model and support regulatory model documentation requirements.
- Compliance teams require tools for monitoring the performance of the models as well as for justifying the validity of changes.
- Operations teams require tools for monitoring models’ inputs (data quality) and models’ outputs (volumes impacting operational readiness).
There are two broad categories of AML models and it’s important to distinguish between them.
Engineered models are defined by subject matter experts using complex logic and turned into executable code by software engineers. Examples include Know Your Customer (KYC) risk evaluation rules and transaction monitoring detection rules. The data used to develop and test these models is often mocked-up data created to exercise all possible use cases.
Machine learning models are built by applying machine learning techniques on a data set that represents real-world activity. While this data set can be a sample, it has to be real, and in most cases comes directly from the source systems.
The main difference between engineered models and machine learning models is in the nature of the data used for building, testing and enhancing the model. This has a fundamental impact on governance and model validation approaches.
While governance and control take place throughout the software lifecycle, here we’ll focus specifically on the model development phase. During this phase, careful attention must be given to preparing for model validation. Once the model is developed, teams must measure whether it is fit for purpose. The goal is to provide all the documentation and evidence that governance teams require, but with a different approach for each type of model.
- For engineered models with out-of-the-box rules, scoring factors and other models with predefined logic, test data must cover all use cases, and a test is created to ensure the outputs match the expected results. Both functional and performance aspects should be tested, with large amounts of data covering all use cases. Ideally, once the model is deployed into production, a set of dashboards should be created to monitor performance. They’ll show the evolution of performance over time (e.g., alert volumes and true-positive rates month by month), along with a breakdown by rule to understand the trends at the rule level and where to focus calibration efforts.
Traditionally, model governance around these models is structured around a number of best practices that are risk-oriented and answer basic business questions such as “Am I covering all my customers?”, “Is anything interesting happening under the line?” or “Will applying these new thresholds suppress true-positives?”.
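As a rough illustration of the monitoring and threshold questions above, the sketch below computes monthly alert volumes and true-positive rates per rule, then counts the true positives a proposed higher threshold would suppress. The record layout, rule IDs, amounts and the helper names `monthly_breakdown` and `suppressed_true_positives` are all hypothetical, made up for this example rather than taken from any particular AML product.

```python
from collections import defaultdict

# Hypothetical alert records: (month, rule_id, triggering_amount, is_true_positive).
ALERTS = [
    ("2020-01", "R1", 12000, True),
    ("2020-01", "R1", 10500, False),
    ("2020-01", "R2", 9000, False),
    ("2020-02", "R1", 15000, True),
    ("2020-02", "R1", 10200, True),
    ("2020-02", "R2", 9500, True),
]

def monthly_breakdown(alerts):
    """Return {(month, rule): (alert_volume, true_positive_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # [volume, true positives]
    for month, rule, _amount, is_tp in alerts:
        counts[(month, rule)][0] += 1
        counts[(month, rule)][1] += int(is_tp)
    return {key: (vol, tp / vol) for key, (vol, tp) in counts.items()}

def suppressed_true_positives(alerts, rule, new_threshold):
    """Count true positives for `rule` that fall below a proposed threshold."""
    return sum(
        1
        for _month, rule_id, amount, is_tp in alerts
        if rule_id == rule and is_tp and amount < new_threshold
    )

breakdown = monthly_breakdown(ALERTS)
lost = suppressed_true_positives(ALERTS, "R1", 11000)

for (month, rule), (volume, tp_rate) in sorted(breakdown.items()):
    print(f"{month} {rule}: {volume} alerts, true-positive rate {tp_rate:.0%}")
print(f"Raising R1 to 11000 would suppress {lost} known true positive(s)")
```

A breakdown like this only covers alerts that were actually generated; answering “Is anything interesting happening under the line?” would additionally require sampling activity below the current thresholds.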
- For machine learning models, development requires a wide range of controls and validation metrics to justify the choices made at each step. By using a technology that integrates the code with reports showing how the model is performing, the team can understand the validity of its choices at each step. Once the model is deployed to production, performance should be measured against different metrics, with the ability to compare the current ‘champion’ model to ‘challenger’ models, verifying that the current model is still the best performer. As with engineered models, dashboards should be created both to monitor effectiveness and to give the governance team a way to perform ongoing model review, triggering model calibration when necessary.
Model governance for these models is about providing proof they’re working in an optimum fashion, using widely accepted statistical measures. Data scientists working with these models are fully aware of these measures and use them to compare one version of a model to another, to choose the optimal one.
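One way such a version-to-version comparison might look is sketched below, using ROC AUC as the statistical measure. The article does not name a specific metric, so AUC is an assumption here, as are the toy labels and scores, the `min_gain` margin and the function names `roc_auc` and `select_model`.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of AUC.
    """
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def select_model(labels, champion_scores, challenger_scores, min_gain=0.01):
    """Keep the champion unless the challenger beats it by at least min_gain."""
    gain = roc_auc(labels, challenger_scores) - roc_auc(labels, champion_scores)
    return "challenger" if gain >= min_gain else "champion"

# Hypothetical hold-out set: 1 = confirmed suspicious, 0 = cleared.
labels = [1, 0, 1, 0, 0, 1]
champion = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7]
challenger = [0.9, 0.3, 0.6, 0.4, 0.1, 0.8]

print("champion AUC:  ", round(roc_auc(labels, champion), 3))
print("challenger AUC:", round(roc_auc(labels, challenger), 3))
print("selected:", select_model(labels, champion, challenger))
```

Requiring a minimum gain before promoting a challenger is a design choice in this sketch, intended to avoid swapping models on differences too small to justify in governance documentation.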
AML Model Governance and Machine Learning
Because machine learning is still new to the AML space and understanding of its application there is limited, challenges emerge. Data scientists often have no background in the compliance function and are unfamiliar with the business aspects and goals of the AML program, while those with a compliance background, who do have that understanding, often lack the data science background needed to interpret complex statistical metrics. This means a data scientist can optimize a model to improve standard statistical metrics yet miss basic business goals, while compliance analysts see the machine learning model as a black box, regardless of how much “industry standard proof” the advanced statistics provide.
Bridging the domain gap requires a solution that combines the two types of measurement, statistical and business, into a model governance flow that makes clear how the machine learning model affects the basic AML business questions and goals. The business goals should be translated into simple metrics, such as ‘precision’ and ‘recall’, so that machine learning model development can improve these metrics while still meeting the basic statistical quality metrics.
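A minimal sketch of that translation follows, assuming alerts are raised when a model score reaches a cutoff. Reading precision as “how many of the alerts investigators receive are productive” and recall as “how much known suspicious activity still gets alerted” is one possible mapping, not the only one. The cutoff, the target values and the function names are hypothetical.

```python
def precision_recall(labels, scores, cutoff):
    """Precision and recall when every score >= cutoff raises an alert."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y)
    fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and not y)
    fn = sum(1 for y, s in zip(labels, scores) if s < cutoff and y)
    # Assumption: report 0.0 when a ratio is undefined (no alerts / no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def meets_business_goals(labels, scores, cutoff, min_precision, min_recall):
    """Check a model against business targets expressed as metric floors."""
    precision, recall = precision_recall(labels, scores, cutoff)
    return precision >= min_precision and recall >= min_recall

# Hypothetical hold-out set: 1 = confirmed suspicious, 0 = cleared.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7]

precision, recall = precision_recall(labels, scores, cutoff=0.5)
print(f"precision={precision:.2f} recall={recall:.2f}")
print("meets goals:", meets_business_goals(labels, scores, 0.5, 0.6, 0.6))
```

Reporting these two numbers next to the statistical quality metrics gives compliance reviewers figures they can relate directly to alert workload and coverage.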
Applying this approach instills confidence across all teams that the models are meeting the overall objectives of the AML program and that all the evidence needed to justify choices exists. When undertaking this for the first time, there are three key takeaways to consider:
- Be clear about your ultimate business objectives when introducing new models and convert them into calculable metrics.
- Ensure the introduction of new or updated models is measured against these ultimate business objective metrics, in addition to standard statistical metrics.
- Ensure all model governance documentation includes a mix of statistical and business metrics so that whoever reviews it will find justifications pertaining to their specific area of domain expertise and will be able to validate your approach.