Artificial intelligence is rapidly reshaping the pharmaceutical landscape. What once seemed futuristic is now at our doorstep: clinical trial monitoring tools, digital QC assistants, generative models for documentation, predictive maintenance in manufacturing, and decision support in pharmacovigilance. Internal research covering 174 GxP software vendors found that almost half already advertise an AI offering. The flood is coming, and with it a fundamental challenge: how can we use models that may be inherently imperfect, probabilistic, and opaque, yet still uphold the standards of quality and compliance that define our industry?
For many quality leaders, this feels like a contradiction. Regulators expect processes that are consistent, traceable, and well controlled. Machine learning models, by design, reason probabilistically from patterns in data. Probabilistic performance is not an anomaly but a feature of the system: if the desired behavior could be written down as a fixed set of deterministic rules, we would not need machine learning at all. So how do we reconcile this contradiction?
In early conversations about GxP AI, it was common to hear simple slogans like “just trust the math” or “the model is smarter than us.” These sound reassuring but do little to help us validate models against real-world performance. Trust without verification has never been the basis of quality management. Trust is earned through demonstrated evidence, not granted blindly.
Instead, we need frameworks that allow us to measure what the model contributes, where it adds risk, and how to mitigate these risks in a controlled way. Fortunately, regulators are beginning to articulate such frameworks.
In its 2025 draft guidance Considerations for the Use of Artificial Intelligence To Support Regulatory Decision-Making for Drug and Biological Products, the FDA proposed a risk-based credibility assessment framework built on three pillars:
This comparative framing is key. A model does not need to be perfect. It needs to demonstrably improve the performance of the regulated process when paired with human oversight, while keeping risks acceptable and controlled.
The new ISPE GAMP® Artificial Intelligence Guide echoes this view. It stresses that AI-enabled computerized systems must still align with core GAMP 5 principles: product and process understanding, life cycle approach, scalable validation activities, science-based quality risk management, and supplier involvement.
Additional AI-specific expectations are layered on top: fit for purpose data, data and model governance, and knowledge management for AI literacy. Training data quality is critical, since biased or poorly curated data will directly shape model outputs. Governance provides standardized approaches for implementing, monitoring, and updating models so that decisions remain traceable and compliant. Finally, shared vocabularies and cross-functional knowledge management build AI literacy across Quality, IT, and Operations, reducing silos and enabling consistent oversight.
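To make the monitoring side of governance concrete, here is a minimal sketch of what a periodic performance review of a deployed model could look like. Everything in it (the metric names, the 0.75 and 0.30 control limits, the escalation status) is an illustrative assumption, not something prescribed by GAMP or any regulator:

```python
from datetime import date

# Hypothetical, pre-approved control limits for a deployed categorization model.
APPROVED_THRESHOLDS = {
    "min_agreement_with_reviewer": 0.75,  # model suggestion vs. final human decision
    "max_override_rate": 0.30,            # how often reviewers overrule the model
}

def periodic_model_review(period_label, agreement_rate, override_rate):
    """Evaluate one monitoring period and produce a traceable review record.

    Returns a plain dict so the result can be stored in the quality system
    alongside the data it was computed from.
    """
    findings = []
    if agreement_rate < APPROVED_THRESHOLDS["min_agreement_with_reviewer"]:
        findings.append("Agreement with reviewers below approved threshold")
    if override_rate > APPROVED_THRESHOLDS["max_override_rate"]:
        findings.append("Reviewer override rate above approved threshold")

    return {
        "period": period_label,
        "review_date": date.today().isoformat(),
        "agreement_rate": agreement_rate,
        "override_rate": override_rate,
        "status": "ESCALATE_TO_CHANGE_CONTROL" if findings else "WITHIN_LIMITS",
        "findings": findings,
    }

# Example: a quarterly check that would trigger escalation and a documented
# decision on whether the model needs retraining or its scope needs narrowing.
print(periodic_model_review("2025-Q2", agreement_rate=0.71, override_rate=0.34))
```

The specific numbers matter less than the pattern: the limits are approved before deployment, every review produces a traceable record, and breaching a limit routes the model back into change control rather than being silently tolerated.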
We can accept statistical imperfection as long as it is controlled, well-understood and explainable:
“Explainability is the degree to which a basis for a decision or action can be explained or how an output or result was reached, in a way that a person can understand”
Let’s ground this with an example. A vendor provides a model designed to assist in quality deviation categorization. The model is correct about 80% of the time. If viewed in isolation, that sounds like a failing grade. But compared to the human-only process, where deviation categorization consistency was closer to 65% across sites with heavy delays in turnaround, the human+AI process delivered better categorization, faster triage, and a measurable reduction in backlog.
The “imperfect” AI added value because it improved the system outcome without taking away human accountability. Importantly, the comparative evaluation gave management confidence that this wasn’t a gamble but a controlled, explainable, evidence-based improvement.
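As an illustration of how such a comparative evaluation could be captured, the sketch below computes categorization accuracy against an adjudicated “ground truth” sample for both the human-only baseline and the human+AI process. The record structure, category names, and the five-point acceptance criterion are hypothetical choices for illustration, not taken from any specific vendor tool:

```python
from dataclasses import dataclass

@dataclass
class DeviationRecord:
    """One deviation from the evaluation sample (hypothetical structure)."""
    record_id: str
    adjudicated_category: str   # agreed "ground truth" category from an expert panel
    human_only_category: str    # category assigned by the legacy human-only process
    human_ai_category: str      # final category after AI suggestion plus human review

def accuracy(records, pick):
    """Share of records whose picked category matches the adjudicated one."""
    correct = sum(1 for r in records if pick(r) == r.adjudicated_category)
    return correct / len(records)

def compare_processes(records, min_uplift=0.05):
    """Compare baseline vs. human+AI accuracy against a pre-defined
    acceptance criterion (hypothetical: at least a 5-point uplift)."""
    baseline = accuracy(records, lambda r: r.human_only_category)
    combined = accuracy(records, lambda r: r.human_ai_category)
    return {
        "human_only_accuracy": round(baseline, 3),
        "human_ai_accuracy": round(combined, 3),
        "uplift": round(combined - baseline, 3),
        "meets_acceptance_criterion": (combined - baseline) >= min_uplift,
    }

# Tiny illustrative sample; a real evaluation would use a statistically
# justified sample size and a documented adjudication process.
sample = [
    DeviationRecord("DEV-001", "labeling", "labeling", "labeling"),
    DeviationRecord("DEV-002", "equipment", "procedure", "equipment"),
    DeviationRecord("DEV-003", "procedure", "procedure", "procedure"),
    DeviationRecord("DEV-004", "material", "equipment", "material"),
    DeviationRecord("DEV-005", "procedure", "procedure", "equipment"),
]
print(compare_processes(sample))
```

What turns an “80% correct” model into evidence rather than a gamble is exactly this: the baseline, the acceptance criterion, and the comparison are defined up front, reproducible, and documented.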
So how should quality leaders think about this? A few principles stand out:
Quality in pharma has always been about evidence, not promises. The same applies to AI. The models may be probabilistic, but our approach to adopting them must remain structured, transparent, and documented. If we can prove and explain that a human+AI process outperforms a human-only process, then imperfection is not a deal-breaker; it is progress.
The focus for GxP AI in pharma is whether the human+AI process delivers better outcomes for patient safety, product quality, and data integrity while remaining compliant with GxP expectations. If it does, then even an imperfect model can be justified, documented, and implemented.
Once we recognize that our current baseline is already imperfect, structured evaluation allows AI to move that baseline forward.
Q1: Where is AI already being used in pharmaceutical operations?
A1: AI is increasingly embedded in pharmaceutical operations, including clinical trial monitoring, digital quality control assistants, generative documentation tools, predictive maintenance, and pharmacovigilance decision support. Internal research shows that nearly half of 174 GxP software vendors already advertise AI capabilities, signaling a rapid transformation.
Q2: Can AI be used in a GxP-compliant way?
A2: Yes, AI can be GxP-compliant if used within structured, risk-based frameworks. Regulatory bodies like the FDA emphasize that models do not need to be perfect. Instead, their performance should be evaluated in the context of use, with risks understood and mitigated. A human+AI process that improves outcomes while maintaining control, traceability, and compliance is acceptable, even if the model itself is not flawless.
Q3: What does the FDA’s 2025 draft guidance propose?
A3: The FDA’s 2025 draft guidance outlines a three-part framework:
Q4: What does the ISPE GAMP® Artificial Intelligence Guide add?
A4: The ISPE GAMP® Artificial Intelligence Guide reinforces that AI systems must align with GAMP 5 principles such as scalable validation, science-based risk management, and supplier oversight. It adds AI-specific guidance around fit for purpose data, data and model governance, and knowledge management for AI literacy.
Q5: What does explainability mean, and why does it matter?
A5: Explainability means being able to clearly describe how an AI-driven decision was made. This is essential for GxP compliance and trust. Imperfect models are acceptable if their impact is well-understood and beneficial. For example, a model improving deviation categorization from 65% to 80% accuracy, with faster turnaround, demonstrates explainable, measurable value that enhances rather than compromises quality oversight.