Notice there’s no question whether bias exists; it does. Rather, the question is how it gets introduced and why IT should care.
The hype around AI could not be higher right now. Interest is piqued, demand is overwhelming, and everyone is scrambling to find “the killer app” for their market.
But under the hype, there are concerns being voiced, with good reason. It is fairly simple to introduce bias into AI, and that bias is raising alarms in some circles.
To understand how bias is introduced into AI, it’s necessary to have a basic understanding of how AI models are trained.
Depending on who you ask and how pedantic they want to be, you’ll get different answers on how many learning methods there are. And indeed, the methods, algorithms, and models used today are extensive and, in many cases, beyond the comprehension of those not deeply steeped in the field. But it’s important to understand, at a high level, how models are trained, because ultimately that is how bias is introduced. Keeping that in mind, there are three basic ways to train AI models:

1. Supervised learning, in which the model learns from data that has been labeled, usually by humans.
2. Unsupervised learning, in which the model finds patterns and groupings in unlabeled data on its own.
3. Reinforcement learning, in which the model learns through feedback, being rewarded for “right” answers or moves and penalized for “wrong” ones.
Okay, now to the real topic—how bias can be introduced into these systems.
The answer, and I’m sure you’ve already figured it out, is that humans are often involved in the training process.
The easiest way to bias supervised learning is to poison the data, as it were, by mislabeling it. For example, if I’m classifying animals, mislabeling a “dog” as a “cat” can, at high enough scale, result in misidentification. One risk with labeling is intentional mislabeling with the goal of corrupting the output. But some mislabeling is merely the product of human judgment, such as deciding whether a panther is a cat, or whether a statue of a cat counts as a cat. With reinforcement learning, positively rewarding the “wrong” answer or move in a game could result in a system that intentionally gives the wrong answers or always loses.
Which for some folks might be an appealing option.
Now obviously this has implications for conversational (generative) AI such as ChatGPT, which was, according to their site, fine-tuned using “supervised learning as well as reinforcement learning” that “used human trainers to improve the model’s performance.” When you choose the “up” or “down” option to rate responses, that data can potentially be used to further fine-tune the model. You, dear reader, I assume are human. Ergo, the potential for further biasing the system exists. The reality is that ChatGPT is often flat-out wrong in its answers. Feedback is necessary to further train the system so it can generate the right answer more often.
Now that’s interesting—and we could have a fascinating conversation about the ways in which we could manipulate those systems and the consequences—but the real reason I wanted to explore this topic is because the problem of bias extends to telemetry, the operational data we all want to use to drive automation of the systems and services that deliver and secure digital services.
You may recall I’ve written on the topic of data bias as it relates to telemetry and the insights 98% of organizations are missing.
In most cases related to analyzing telemetry, models are trained using data that’s been labeled. Bias can be introduced into that system by (a) mislabeling the data, (b) not having enough diversity of data in a specific category, or (c) the method used to introduce new data. The reason mislabeling data is problematic should be obvious: in large enough quantities, it can result in misidentification. The issue with diversity is that data falling outside a narrow training set will inevitably be misclassified.
A classic example of this was an AI model trained to recognize tanks versus other types of transportation. It turned out that all the tanks were photographed in daylight, but the other vehicles were not. As a result, the AI appeared to do a great job at tank versus not-tank, but was actually distinguishing day from night. The lack of diversity in the input set caused a biased correlation.
Even if an operational AI is relying on reinforcement learning, the lack of diversity of data is problematic because the system does not have all the “variables” necessary to determine the “next move,” as it were.
The reason an AI might not have a diverse set of data or all the variables it needs is, you guessed it, data bias. Specifically, the data bias introduced by selective monitoring, in which only *some* telemetry is ingested for analysis. For example, the impact of DNS performance on the user experience is well understood. But if a model is trained to analyze application performance without telemetry from DNS, it may claim that performance is just fine even if there’s an issue with DNS, because it has no idea that DNS is in any way related to the end-to-end performance of the app. If the “next move” is to alert someone of a performance degradation, the system will fail due to bias in data selection.
It won’t surprise you if I tell you our annual research discovered that over half of all organizations cite “missing data” as a top challenge to uncovering the insights they need.
Thus, even if organizations were all in on leveraging AI to drive operational decisions, missing data would present a challenge. Without a diverse data set on which to train such a system, the potential for bias creeps in.
A third way bias can be introduced is in the methods used to introduce data to the model. The most common operational example of this is using the results of synthetic testing to determine the average performance of an application, and then using the resulting model to analyze real traffic. Depending on the breadth of locations, devices, network congestion, etc. that form the dataset from synthetic testing, perfectly acceptable performance for real users might be identified as failure, or vice versa.
The risk is an erosion of trust in technology to act as a force multiplier and enable the scale and efficiency needed for organizations to operate as a digital business. Because if the AI keeps giving the ‘wrong’ answers, or suggesting the ‘wrong’ solutions, well, no one’s going to trust it.
This is why full-stack observability is not just important, but one of the six key technical capabilities needed for organizations to progress to the third phase of digital transformation: AI-assisted business.
Missing data, whether due to selective monitoring or the opinionated curation of metrics, has the potential to bias AI models used to drive operational decisions.
Careful attention to the sources and types of data, coupled with a comprehensive data and observability strategy, will go a long way toward eliminating bias and producing more accurate—and trustworthy—results.