Data science techniques are being developed and adopted faster than insurers can build their own understanding of the risk governance and ethics those techniques require.
To make matters more challenging, two distinct groups operate on the frontline of data science within most insurers, often in conflict rather than in harmony: data science teams using cutting-edge techniques without the necessary understanding of their organisation’s risk frameworks, and insurance leaders who have limited experience of the latest advanced analytics. This internal disconnect leaves insurers, and the individuals who work for them, exposed to risk.
Finding the right balance between governance and control on the one hand, and advancing the adoption of data science and the value it creates on the other, has become the magic middle ground upon which insurers have set their sights.
Bias
As increasingly complex models are used, a key risk for insurers to consider is bias, an issue not yet fully understood or appreciated by many firms, and how best to address the problems it creates. When individuals or groups of individuals are differentiated from others based on particular characteristics, insurers need to understand why. Is the bias due to the data collected not representing the entire population? Is it caused by potentially flawed human decision-making that is reflected in the data collected? Was the bias introduced by the artificial intelligence (AI) and machine learning models trained on the data? Or is the model form itself responsible for reinforcing existing bias, or even creating new biases?
The ability to detect hidden biases is essential to enabling appropriate strategies to measure, monitor and manage bias. Bias should be considered at every stage of the model-building process: when an insurer first explores its data, when it builds a model and when model outputs are used to inform a business decision. All too often, however, data scientists consider these risks only as an afterthought.
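As one illustration of what measuring bias in model outputs might look like, the sketch below compares the rate of favourable decisions across groups and reports the ratio of the lowest rate to the highest (sometimes called a disparate impact ratio). It is a minimal example under stated assumptions, not a recommended methodology: the group labels, the sample decisions and the function names are all hypothetical, and a single ratio cannot by itself explain where a bias comes from.

```python
from collections import defaultdict


def favourable_rates(records):
    """Return the share of favourable outcomes for each group.

    `records` is an iterable of (group, outcome) pairs, where outcome is
    True for a favourable decision (for example, a quote being offered).
    """
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest group rate to the highest (1.0 means parity)."""
    rates = favourable_rates(records)
    highest = max(rates.values())
    if highest == 0:
        # No group received a favourable outcome, so the rates do not differ.
        return 1.0
    return min(rates.values()) / highest


# Hypothetical decisions: group "A" is favoured 8 times in 10, group "B" 5 in 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)

rates = favourable_rates(decisions)
ratio = disparate_impact_ratio(decisions)
print(rates)
print(round(ratio, 3))
```

Running the same check when the data is first explored, when the model is built and again on live decisions is one simple way to move bias monitoring from an afterthought to a routine step. Which ratio would trigger a review is a governance decision for each insurer, not something the code can settle.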
Choosing an algorithm that helps an insurer find the optimum balance between interpretability, transparency and predictive power is another essential capability. A number of custom algorithms are currently being developed in the market. For example, Layered Gradient Boosting Machines achieve the same predictive accuracy as a gradient boosting machine (GBM), whilst providing a much greater level of transparency and interpretability.
Open source risk
In recent years, open source adoption has seen unprecedented growth. While open source allows incredible flexibility and innovation, it also exposes an insurer to more risk, particularly relating to governance and security. Besides the potential for malicious code to hide in open source packages, key person dependency is another risk, created when just one individual or a small team is responsible for building and maintaining code.
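One simple control against tampered packages is to record a checksum when a package is reviewed and to refuse anything that no longer matches it. The sketch below shows the idea using Python's standard `hashlib` and `hmac` modules. It is illustrative only: the function names and the sample bytes are hypothetical, and a checksum confirms only that an artefact is unchanged since review, not that the reviewed code was safe in the first place.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, pinned_digest: str) -> bool:
    """Check an artefact against the digest recorded when it was reviewed."""
    return hmac.compare_digest(sha256_of(data), pinned_digest.lower())


# Hypothetical stand-ins for a package file as reviewed and as later received.
reviewed_package = b"contents of the package version that was reviewed"
received_package = reviewed_package + b" plus an unexpected change"

pinned = sha256_of(reviewed_package)  # recorded at review time

print(matches_pinned_digest(reviewed_package, pinned))
print(matches_pinned_digest(received_package, pinned))
```

Keeping the pinned digests in a shared, version-controlled record rather than in one person's head also goes some way towards reducing key person dependency.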



