
Ensuring Data Integrity for Statistical Software in GxP Environments: Top 10 Audit Requests

Written by Andy O'Connor | Jan 21, 2025 5:33:24 PM

The life sciences industry relies on statistical programming software like RStudio, Posit, SAS, and Minitab for clinical trials and regulatory submissions such as new drug applications. These tools enable pharmacometricians to conduct complex modeling, generate reproducible reports, and comply with regulatory requirements. 

Statistical analysis is often a crucial component of clinical trials and of the data presented in regulatory submissions. However, statistical software is often overlooked as a GxP computerized system. This neglect can have significant implications, given the increasing focus on data integrity in good clinical practice.

Statistical programming tools in which models are custom coded offer more flexibility and are becoming increasingly popular, displacing software that provides models ‘out of the box’. With these tools, a programming platform generally needs to be installed (and qualified) before a statistical model can be developed and validated for its intended use with GxP data. For example, GAMP 5 [1] indicates that statistical programming tools can be classified as category 1 (infrastructure software), but that this classification excludes the business applications (such as statistical models) developed using these packages.

In addition, many statistical programming tools are moving to the cloud and leveraging more complex modeling approaches such as Artificial Intelligence / Machine Learning (AI/ML).

To assist you in ensuring data integrity and reliability for the statistical analysis of GxP data, we will guide you through the top 10 data integrity requests an auditor might make during an audit of a statistical software system.

Top 10 Data Integrity Requests for Statistical Software

  1. GxP Computerized Systems Inventory List
    An inventory list will immediately show whether the statistical programming software is within GxP scope, give some common details about the system, and indicate whether there is a programming platform independent of the business application/statistical model.
  2. Governance SOPs for Data, Validation, and Risk
    Standard operating procedures (SOPs) can be reviewed for the following:
  • Control and management of records, demonstrating ALCOA+.
  • GxP data review processes.
  • GxP categorization for computerized systems and how this is documented.
  • Overall risk management of GxP computerized systems.
  • Validation policy.
  3. SOPs for Statistical Model Generation and Validation
    Key areas of interest include:
  • Data governance practices, from collection to controlled storage and maintenance.
  • Procedures for managing and reviewing statistical models.
  • Validation processes for the statistical accuracy of model calculations.
  • For AI/ML models, additional considerations around training data and the prevention of issues such as overfitting or bias.
  4. Validation Report for Statistical Models
    A validation report details the model's intended use, testing outcomes, and any referenced risk assessments. Where the software provides statistical models ‘out of the box’ as standard, the validation of the software and of the model may be covered in a single report. Where the model is custom programmed by the business on statistical software, there may be separate validation reports.
  5. Requirements Traceability Matrix (RTM)
    Demonstrates how user requirements for statistical models were captured and tested, linking them to validation steps and ensuring traceability.
  6. Evidence of Model and Code Review
    Where statistical models are built on statistical programming software, code review records help determine whether:
  • The statistical basis is sound.
  • Code is version-controlled and maintained.
  • Unit tests are performed as required (particularly for independently developed / open-source components).
  7. SOP(s) for Software Administration
    These cover software administration practices and controls (e.g., software release management, vendor management, patient data security and the protection of human subjects, user access controls, backup/restore procedures, business continuity of data, and periodic reviews).
  8. Evidence of Backup and Restore Capabilities
    Evidence of successful backup and restore testing is crucial to show that the system is not vulnerable to data loss.
  9. Quality Management System (QMS) Review
    A review of quality records related to the statistical software, including deviations, corrective actions, and change management.
  10. Last Completed Periodic Review
    Evidence of regular evaluations of your statistical software, confirming it remains compliant and fit for purpose. Periodic reviews will inform whether risk categorization is still correct or if additional controls now need to be applied.
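
The Requirements Traceability Matrix item above describes linking user requirements to tests. As a rough illustration only, the sketch below shows one way such a matrix could be checked automatically for gaps before an audit. The `Requirement` class, the `find_traceability_gaps` function, and all IDs and descriptions are hypothetical and invented for this example; they do not come from GAMP 5 or any particular validation tool, and a real RTM would live in a controlled document or system.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """One user requirement row in a hypothetical RTM."""
    req_id: str
    description: str
    test_ids: list[str] = field(default_factory=list)


def find_traceability_gaps(requirements, test_results):
    """Return (untested, failing) lists of requirement IDs.

    untested: no linked test, or a linked test has no recorded result.
    failing:  every linked test has a result, but at least one is not "pass".
    """
    untested, failing = [], []
    for req in requirements:
        results = [test_results.get(test_id) for test_id in req.test_ids]
        if not results or any(r is None for r in results):
            untested.append(req.req_id)
        elif any(r != "pass" for r in results):
            failing.append(req.req_id)
    return untested, failing


# Illustrative data only (assumed, not taken from a real system).
rtm = [
    Requirement("URS-001", "Model reproduces the reference fit", ["TC-01", "TC-02"]),
    Requirement("URS-002", "Outputs follow the reporting convention", ["TC-03"]),
    Requirement("URS-003", "Input data set version is recorded", []),
]
results = {"TC-01": "pass", "TC-02": "pass", "TC-03": "fail"}

untested, failing = find_traceability_gaps(rtm, results)
print("Untested requirements:", untested)
print("Requirements with failing tests:", failing)
```

A check like this does not replace the reviewed and approved RTM; it is simply a quick way to spot requirements that have no test evidence, or failing test evidence, before an auditor does.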

Conclusion

The number one FDA finding is a lack of procedures, so when preparing for a data integrity audit, make sure you have the requisite procedures in place to keep your data controlled.

Preparing for a statistical software audit can be daunting, but a systematic approach ensures compliance while supporting the reliability of your drug development processes. 

ERA Sciences, with its expertise in digital GxP compliance, offers tools like Phanero to streamline inventory management and simplify audit preparation.

References:

  1. ISPE. GAMP 5: A Risk-Based Approach to Compliant GxP Computerized Systems, Second Edition.