AIPCH14 — Testable & Versioned

“Version-Controlled and Behaviorally Testable”


What AIPCH14 is really asserting

AIPCH14 is not asserting that:

“The AI Product has versions or some test cases.”

It is asserting that:

The AI Product evolves through explicit, versioned states where each version is protected by reproducible, behavior-focused tests that validate its intended functionality, constraints, and outcomes — ensuring safe, controlled, and auditable evolution over time.

Versioning is not labeling.
Testing is not one-time validation.

This is about controlled evolution of intelligence.


The Essence (HDIP + AIPS Interpretation)

An AI Product is testable and versioned if and only if:

  1. Every change results in a new, explicitly versioned product state
  2. Each version is validated through reproducible, behavior-driven tests
  3. Evolution preserves:
    • intent (AIPCH13)
    • semantics (AIPCH12)
    • constraints (AIPCH10)

If changes:

  • are implicit
  • cannot be reproduced
  • are not validated against expected behavior

then AIPCH14 is not met, even if version numbers exist.


What Must Be Versioned

Versioning must apply to the entire AI Product, including:


1. Behavior

  • decision logic
  • output patterns
  • response characteristics

2. Specification

  • AIPROD (intent, semantics, constraints)
  • AIPDS (deployment realization)

3. Dependencies

  • underlying models
  • data sources
  • composed AI/Data Products

👉 Versioning is product-level, not component-level.
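As an illustration of product-level versioning, the three areas above can be captured in a single manifest whose content hash changes whenever any of them changes. This is a minimal sketch, not an AIPS-defined format: the `ProductManifest` class, its field names, and the use of a SHA-256 fingerprint are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ProductManifest:
    """Hypothetical manifest describing one product state."""
    version: str                                        # explicit product-level version label
    behavior: dict = field(default_factory=dict)        # e.g. decision thresholds
    specification: dict = field(default_factory=dict)   # e.g. AIPROD and AIPDS revisions
    dependencies: dict = field(default_factory=dict)    # e.g. models, data sources, composed products

    def fingerprint(self) -> str:
        """Content hash over everything except the version label."""
        content = asdict(self)
        content.pop("version")
        canonical = json.dumps(content, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_implicit_change(previous: ProductManifest, candidate: ProductManifest) -> bool:
    """True when the content changed but the product version label did not."""
    return (
        previous.fingerprint() != candidate.fingerprint()
        and previous.version == candidate.version
    )


# Illustrative values only: a model swap that kept the old product version.
v1 = ProductManifest("1.4.0", {"approve_threshold": 0.8},
                     {"AIPROD": "r7", "AIPDS": "r3"}, {"model": "m-2024-01"})
silent_swap = ProductManifest("1.4.0", {"approve_threshold": 0.8},
                              {"AIPROD": "r7", "AIPDS": "r3"}, {"model": "m-2024-06"})
print(is_implicit_change(v1, silent_swap))
```

A check like `is_implicit_change` flags the "model updated without version change" situation: a dependency moved, yet the product still claims to be the same version.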


What Must Be Tested

Testing must validate:


1. Behavioral correctness

  • expected outputs for given inputs
  • consistency of decisions
  • adherence to intent

2. Constraints and policies

  • compliance rules
  • safety boundaries
  • risk thresholds

3. Quality expectations

  • accuracy or performance thresholds
  • latency expectations (AIPCH09)
  • reliability

4. Regression safety

  • no unintended degradation
  • no semantic drift
  • no violation of prior guarantees

👉 Testing is behavioral, not just technical.
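The categories above could translate into tests along the following lines. This is a sketch under stated assumptions: `decide` is a hypothetical stand-in for the product's public behavior, and the minimum-age policy, the 0.8 threshold, and the golden cases are invented for illustration.

```python
def decide(application: dict) -> dict:
    """Hypothetical stand-in for the AI Product's observable behavior."""
    if application["age"] < 18:
        # Policy constraint takes precedence over the score.
        return {"decision": "reject", "reason": "policy:minimum_age"}
    decision = "approve" if application["score"] >= 0.8 else "review"
    return {"decision": decision, "reason": "score"}


# Golden cases: behavior recorded for the previous product version (illustrative).
GOLDEN_CASES = [
    ({"score": 0.92, "age": 40}, "approve"),
    ({"score": 0.55, "age": 40}, "review"),
    ({"score": 0.99, "age": 16}, "reject"),
]


def test_behavioral_correctness():
    # Expected output for a given input.
    assert decide({"score": 0.92, "age": 40})["decision"] == "approve"


def test_constraint_minimum_age():
    # The constraint must hold no matter how high the score is.
    for score in (0.0, 0.5, 1.0):
        assert decide({"score": score, "age": 17})["decision"] == "reject"


def test_regression_against_golden_cases():
    # No prior guarantee may silently change in the new version.
    for application, expected in GOLDEN_CASES:
        assert decide(application)["decision"] == expected


if __name__ == "__main__":
    test_behavioral_correctness()
    test_constraint_minimum_age()
    test_regression_against_golden_cases()
    print("all behavioral tests passed")
```

Note that none of these tests inspects model internals or pipeline steps; each asserts only on what the product decides.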


Positive Criteria — When AIPCH14 is met

AIPCH14 is met when all of the following are true:


1. Versions are explicit and traceable

Each version:

  • is uniquely identifiable
  • is linked to:
    • specification (AIPROD)
    • deployment (AIPDS)
  • includes change history

There is full provenance of evolution.


2. Tests are reproducible and automated

The system:

  • executes tests automatically
  • produces consistent results
  • does not rely on manual validation

Tests are:

  • versioned
  • repeatable
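
Reproducibility can itself be checked mechanically. The sketch below assumes a suite whose only source of randomness is input sampling; `sample_inputs`, `run_suite`, and the fixed seed are hypothetical names and choices, not part of AIPS.

```python
import random


def sample_inputs(seed: int, n: int = 5) -> list:
    """Draw test inputs from a local, seeded generator (no global state)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def run_suite(seed: int) -> list:
    """Hypothetical suite: returns the decision observed for each sampled input."""
    return ["approve" if x >= 0.8 else "review" for x in sample_inputs(seed)]


def is_reproducible(seed: int, runs: int = 3) -> bool:
    """True when repeated runs with the same seed yield identical results."""
    first = run_suite(seed)
    return all(run_suite(seed) == first for _ in range(runs - 1))
```

Storing the seed together with the test definitions is one way to keep the tests themselves versioned alongside the product.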

3. Tests validate behavior, not just components

Tests focus on:

  • what the AI Product does
  • not just:
    • model accuracy
    • pipeline correctness

This ensures:

product-level validation


4. Version changes are gated by validation

New versions:

  • must pass defined tests
  • cannot be promoted without validation
  • are evaluated against prior versions

This ensures:

safe evolution
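
A promotion gate of this kind might look like the following sketch. The `EvalReport` fields and the tolerance values are assumptions chosen for the example; real thresholds would come from the product's own specification.

```python
from dataclasses import dataclass


@dataclass
class EvalReport:
    """Hypothetical summary of one version's validation run."""
    version: str
    tests_passed: bool       # all behavioral and constraint tests passed
    accuracy: float          # product-level quality metric
    p95_latency_ms: float


def can_promote(candidate: EvalReport, baseline: EvalReport,
                max_accuracy_drop: float = 0.01,
                max_latency_increase_ms: float = 50.0):
    """Return (allowed, reasons); promotion is blocked if any reason is recorded."""
    reasons = []
    if not candidate.tests_passed:
        reasons.append("behavioral tests failed")
    if baseline.accuracy - candidate.accuracy > max_accuracy_drop:
        reasons.append("accuracy regressed beyond tolerance")
    if candidate.p95_latency_ms - baseline.p95_latency_ms > max_latency_increase_ms:
        reasons.append("latency regressed beyond tolerance")
    return (not reasons, reasons)
```

Returning the reasons along with the verdict leaves an auditable record of why a version was or was not promoted.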


5. Consumers are protected from breaking changes

Versioning ensures:

  • backward compatibility (where required)
  • clear communication of changes
  • ability to select or remain on versions

This supports:

stable consumption (AIPCH11)
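
One way to let consumers select or remain on versions is major-version pinning. The sketch below assumes a `MAJOR.MINOR.PATCH` scheme in which a major bump signals a breaking change; that convention and the `resolve` helper are illustrative assumptions, since the text above does not prescribe a numbering scheme.

```python
from typing import List, Optional, Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def resolve(published: List[str], pinned_major: int) -> Optional[str]:
    """Newest published version within the consumer's pinned major, if any."""
    candidates = [v for v in published if parse(v)[0] == pinned_major]
    return max(candidates, key=parse) if candidates else None


published = ["1.2.0", "1.10.1", "2.0.0"]
print(resolve(published, pinned_major=1))  # prints 1.10.1
```

A consumer pinned to major version 1 keeps receiving compatible updates and moves to 2.0.0 only by choosing to.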


Negative Criteria — When AIPCH14 is not met

AIPCH14 is not met if any of the following are true:


❌ Changes are implicit or uncontrolled

Examples:

  • model updated without version change
  • behavior changes without tracking
  • no clear history

This creates unpredictability.


❌ Testing is manual or ad hoc

Examples:

  • manual validation before release
  • no automated test suite
  • inconsistent testing

This is unreliable.


❌ Tests focus only on components

Examples:

  • model accuracy only
  • pipeline execution only
  • no validation of:
    • decisions
    • outcomes
    • constraints

This misses product behavior.


❌ No regression validation exists

Examples:

  • no comparison with previous versions
  • degradation unnoticed
  • behavior drift untracked

This creates risk.


❌ Consumers are exposed to breaking changes

Examples:

  • interfaces change without versioning
  • outputs change unexpectedly
  • no control over version adoption

This breaks trust.


Edge Cases (Important Guidance for Agents)


Case 1: “Model versioned but product not versioned”

Not met

Rationale:

  • component-level versioning only
  • product behavior not tracked

Case 2: “Tests exist but not reproducible”

⚠️ Partial

Rationale:

  • validation exists
  • but it is not reliable or consistent

Case 3: “Full product versioning + automated behavioral tests”

Met

Rationale:

  • safe, controlled evolution
  • reproducibility ensured

Case 4: “Continuous learning without version control”

Not met

Rationale:

  • behavior changes without traceability
  • unsafe and ungovernable

Evidence Signals an Agent Should Look For


Authoritative evidence:

  • version identifiers linked to AIPROD/AIPDS
  • automated test suites
  • test results per version

Supporting evidence:

  • change logs
  • regression reports
  • version selection capability for consumers

Red flags:

  • lack of version history
  • manual testing processes
  • unexplained behavior changes
  • no regression validation

How an Agent Should Decide

Decision rule (simplified):

If the AI Product cannot evolve through clearly versioned states validated by reproducible, behavior-focused tests that protect against unintended changes, AIPCH14 is not met.
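
The decision rule, combined with the edge cases above, could be encoded roughly as follows. This is one possible reading, not a normative algorithm: the `Evidence` fields are hypothetical, and the "Partial" outcome is taken from Case 2.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical evidence flags an assessing agent has collected."""
    product_versioned: bool         # product-level, not only component-level
    behavioral_tests_exist: bool
    tests_reproducible: bool        # automated, with consistent results
    promotion_gated: bool           # new versions must pass tests to be promoted
    regression_checked: bool        # compared against prior versions
    consumers_protected: bool       # breaking changes communicated, versions selectable


def assess(e: Evidence) -> str:
    """Map evidence to 'Met', 'Partial', or 'Not met'."""
    if not e.product_versioned or not e.behavioral_tests_exist:
        return "Not met"            # Cases 1 and 4: no traceable product state
    if not e.tests_reproducible:
        return "Partial"            # Case 2: validation exists but is unreliable
    if e.promotion_gated and e.regression_checked and e.consumers_protected:
        return "Met"                # Case 3
    return "Not met"
```

For example, a product with versioned models but no product-level version yields "Not met", whatever its test coverage.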


Why AIPCH14 Is Non-Negotiable

Without AIPCH14:

  • behavior becomes unpredictable
  • trust (AIPCH07) degrades
  • governance (AIPCH10) cannot be enforced reliably
  • reuse (AIPCH08) becomes unsafe

AIPCH14 enables:

  • safe and controlled evolution of AI Products
  • reproducibility and auditability
  • consumer trust in stability
  • continuous improvement with controlled risk

Canonical Statement (for AIPS)

AIPCH14 is satisfied only when an AI Product evolves through explicitly versioned states, each validated by reproducible, behavior-focused tests that ensure consistency, constraint adherence, and safe progression without unintended degradation or breaking changes.