Opening the black box: how ‘explainable AI’ can help us understand how algorithms work

When you visit a hospital, artificial intelligence (AI) models can assist doctors by analysing medical images or predicting patient outcomes based on historical data. If you apply for a job, AI algorithms can be used to screen resumés, rank job candidates and even conduct initial interviews. When you want to watch a movie on Netflix, a recommendation algorithm predicts which movies you’re likely to enjoy based on your viewing habits. Even when you are driving, predictive algorithms are at work in navigation apps like Waze and Google Maps, optimising routes and predicting traffic patterns to ensure faster travel.

In the workplace, AI-powered tools like ChatGPT and GitHub Copilot are used to draft e-mails, write code and automate repetitive tasks, with studies suggesting that AI could automate up to 30% of hours worked by 2030.

But a common issue with these AI systems is that their inner workings are often hard to understand, not only for the general public but also for experts. This limits how we can use AI tools in practice. To address this problem and to align with growing regulatory demands, a field of research known as “explainable AI” has emerged.

AI and machine learning: what’s in a name?

With AI increasingly being integrated into organisations and its potential widely covered in the media, it is easy to get confused, especially with so many terms used to describe AI systems, including machine learning, deep learning and large language models, to name but a few.

In simple terms, AI refers to the development of computer systems that perform tasks that would typically require human intelligence, such as problem-solving, decision-making and language understanding. It encompasses various subfields like robotics, computer vision and natural language understanding.

One important subset of AI is machine learning, which enables computers to learn from data instead of being explicitly programmed for every task. Essentially, the machine looks at patterns in the data and uses those patterns to make predictions or decisions. For example, think about an e-mail spam filter. The system is trained with thousands of examples of both spam and non-spam e-mails. Over time, it learns patterns such as specific words, phrases or sender details that are common in spam.
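To make the spam filter idea concrete, here is a minimal sketch in Python. It is not how a production filter works: the four training messages, the labels and the simple word-counting rule are all assumptions chosen for illustration. It only shows the principle of learning patterns from labelled examples rather than writing rules by hand.

```python
from collections import Counter

# Hypothetical training set; a real filter learns from thousands of e-mails.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

def train(examples):
    """Count how often each word appears in spam and in non-spam ('ham')."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(text, counts):
    """Label a message by the class in which its words were seen more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

counts = train(TRAINING)
print(classify("claim your free prize", counts))
print(classify("team meeting on monday", counts))
```

Nobody told the program that “free” or “prize” are suspicious words. It picked that up from the examples, which is the essence of machine learning.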

[Figure: Different expressions used to designate a wide range of AI systems.]

Deep learning, a further subset of machine learning, uses complex neural networks with multiple layers to learn even more sophisticated patterns. Deep learning has proved exceptionally valuable when working with image or textual data and is the core technology behind various image recognition tools and large language models such as those that power ChatGPT.

Regulating AI

The examples above demonstrate the broad application of AI across different industries. Several of these scenarios, such as suggesting movies on Netflix, seem relatively low-risk. However, others, such as recruitment, credit scoring or medical diagnosis, can have a large impact on someone’s life, making it crucial that they happen in a manner that is aligned with our ethical objectives.

Recognising this, the European Union proposed the AI Act, which its parliament approved in March 2024. This regulatory framework categorises AI applications into four different risk levels: unacceptable, high, limited and minimal, depending on their potential impact on society and individuals. Each level is subject to different degrees of regulation and different requirements.

Unacceptable risk AI systems, such as systems used for social scoring or predictive policing, are prohibited in the EU, as they pose significant threats to human rights.

High-risk AI systems are allowed but they are subject to the strictest regulation, as they have the potential to cause significant harm if they fail or are misused, including in settings such as law enforcement, recruitment and education.

Limited risk AI systems, such as chatbots or emotion recognition systems, carry some risk of manipulation or deceit. Here it is important that humans are informed about their interaction with the AI system.

Minimal risk AI systems include all other AI systems, such as spam filters, which can be deployed without additional restrictions.

The need for explainability

Many consumers are no longer willing to accept companies blaming their decisions on black-box algorithms. Take the Apple Card incident, where a man was granted a significantly higher credit limit than his wife, despite their shared assets. This sparked public outrage, as Apple was not able to explain the reasoning behind the decision of its algorithm. This example highlights the growing need for explainability in AI-driven decisions, not only to ensure customer satisfaction but also to prevent negative public perception.

For high-risk AI systems, Article 86 of the AI Act establishes the right to request an explanation of decisions made by AI systems, which is a significant step toward ensuring algorithmic transparency.

However, beyond legal compliance, transparent AI systems offer several other benefits for both model owners and those impacted by the systems’ decisions.

Transparent AI

First, transparency builds trust: when users understand how an AI system works, they are more likely to engage with it. Secondly, it can prevent biased outcomes, allowing regulators to verify whether a model unfairly favours specific groups. Finally, transparency enables the continuous improvement of AI systems by revealing mistakes or unexpected patterns.

But how can we achieve transparency in AI?

In general, there are two main approaches to making AI models more transparent.

First, one could use simple models like decision trees or linear models to make predictions. These models are easy to understand because their decision-making process is straightforward. For example, a linear regression model could be used to predict house prices based on features like the number of bedrooms, square footage and location. The simplicity lies in the fact that each feature is assigned a weight, and the prediction is simply the sum of these weighted features. This means one can clearly see how each feature contributes to the final house price prediction.
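The house price example can be sketched in a few lines of Python. The weights, the base price and the example house below are invented for illustration and do not come from a real model. The base price (the intercept) is an added assumption on top of the weighted sum described above.

```python
# Hypothetical weights: how much each feature adds to the predicted price.
WEIGHTS = {"bedrooms": 15_000, "square_feet": 150, "city_centre": 50_000}
BASE_PRICE = 40_000  # assumed starting price before any feature is counted

def contributions(house):
    """Return how much each feature contributes to the prediction."""
    return {name: WEIGHTS[name] * house[name] for name in WEIGHTS}

def predict_price(house):
    """The prediction is the base price plus the sum of weighted features."""
    return BASE_PRICE + sum(contributions(house).values())

house = {"bedrooms": 3, "square_feet": 1_200, "city_centre": 1}

for name, amount in contributions(house).items():
    print(f"{name}: +{amount}")
print("predicted price:", predict_price(house))
```

Because every feature's contribution is printed separately, anyone can check why this house is priced the way it is. That is what makes such a model transparent.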

However, as data becomes more complex, these simple models may no longer perform well enough.

This is why developers often turn to more advanced “black-box models” like deep neural networks, which can handle larger and more complex data but are difficult to interpret. For example, a deep neural network with millions of parameters can achieve very high performance, but the way it reaches its decisions is not understandable to humans, because its decision-making process is too large and complex to follow.

Explainable AI

The second approach is to use these powerful black-box models alongside a separate explanation algorithm to clarify the model or its decisions. This approach, known as “explainable AI”, allows us to benefit from the power of complex models while still offering some level of transparency.

One well-known method is counterfactual explanation. A counterfactual explanation explains the decision of a model by identifying minimal changes to the input features that would lead to a different decision.

For instance, if an AI system denies someone a loan, a counterfactual explanation might inform the applicant: “If your income had been $5,000 higher, your loan would have been approved”. This makes the decision more understandable, even though the underlying machine learning model can still be very complex. However, one downside is that these explanations are approximations, and there may be multiple ways to explain the same decision.
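The loan example can be sketched in Python as follows. The function `loan_model` is a hypothetical stand-in for a black-box model, and its scoring rule, the applicant's figures and the step size are all assumptions. Real counterfactual methods search over many features at once; this sketch changes one feature at a time until the decision flips.

```python
def loan_model(applicant):
    """Hypothetical black box: returns True if the loan is approved."""
    score = 5 * applicant["income"] - 3 * applicant["debt"]
    return score >= 200_000

def counterfactual(applicant, feature, step, max_steps=50):
    """Find the smallest change to one feature, in multiples of `step`,
    that turns a rejection into an approval. Returns None if none is found."""
    if loan_model(applicant):
        return 0  # already approved, nothing to change
    for n in range(1, max_steps + 1):
        changed = dict(applicant)
        changed[feature] += n * step
        if loan_model(changed):
            return n * step
    return None

applicant = {"income": 41_000, "debt": 10_000}

print("approved:", loan_model(applicant))
print("income change needed:", counterfactual(applicant, "income", 1_000))
print("debt change needed:", counterfactual(applicant, "debt", -1_000))
```

With these made-up numbers, the applicant would be approved either with $5,000 more income or with $9,000 less debt. Both are valid counterfactuals, which illustrates why the same decision can be explained in more than one way.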

The road ahead

As AI models become increasingly complex, their potential for transformative impact grows – but so does their capacity to make mistakes. For AI to be truly effective and trusted, users need to understand how these models reach their decisions.

Transparency is not only a matter of building trust but is also crucial for detecting errors and ensuring fairness. For instance, in self-driving cars, explainable AI can help engineers understand why the car misinterpreted a stop sign or failed to recognise a pedestrian. Similarly, in hiring, understanding how an AI system ranks job candidates can help employers avoid biased selections and promote diversity.

By focusing on transparent and ethical AI systems, we can ensure that technology serves both individuals and society in a positive and equitable way.