Steven's Knowledge

Governance Frameworks

NIST AI RMF, ISO 42001, review boards, and the organizational machinery for responsible AI

A regulation tells you what you must do. A governance framework tells you how to do it systematically. Without one, compliance is ad hoc — every team invents its own process, nothing is consistent, and auditors find gaps everywhere.

The frameworks worth knowing about fall into two buckets: external standards you can certify against, and internal structures you build to operationalize them.

NIST AI Risk Management Framework (AI RMF)

The NIST AI RMF is the US government's voluntary framework for managing AI risk. It's organized around four functions:

  1. Govern — establish policies, roles, and a culture of responsible AI. Who makes decisions? How are they documented?
  2. Map — understand the context: what does the system do, who is affected, what could go wrong?
  3. Measure — assess and track risks with quantitative and qualitative methods. Bias metrics, performance degradation, failure rates.
  4. Manage — allocate resources to treat, transfer, or accept identified risks.
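One way to make the four functions concrete is to tag every open risk action with the function it belongs to, so gaps show up per function. The sketch below is illustrative only: `RiskItem`, `open_items_by_function`, and the sample entries are hypothetical and are not part of the NIST framework or its Playbook.

```python
from dataclasses import dataclass
from enum import Enum


class RmfFunction(Enum):
    GOVERN = "govern"
    MAP = "map"
    MEASURE = "measure"
    MANAGE = "manage"


@dataclass
class RiskItem:
    """One row in a hypothetical risk register."""
    system: str
    description: str
    function: RmfFunction  # which AI RMF function the action falls under
    owner: str
    done: bool = False


def open_items_by_function(items):
    """Group unfinished items by RMF function."""
    grouped = {fn: [] for fn in RmfFunction}
    for item in items:
        if not item.done:
            grouped[item.function].append(item)
    return grouped


# Made-up entries for an imaginary resume-screening system.
register = [
    RiskItem("resume-screener", "Name an accountable owner for launch decisions",
             RmfFunction.GOVERN, "eng-lead", done=True),
    RiskItem("resume-screener", "Document who is affected by a false rejection",
             RmfFunction.MAP, "pm"),
    RiskItem("resume-screener", "Track selection-rate gap across groups",
             RmfFunction.MEASURE, "ml-eng"),
    RiskItem("resume-screener", "Decide whether to accept or mitigate the residual gap",
             RmfFunction.MANAGE, "review-board"),
]

for fn, items in open_items_by_function(register).items():
    print(f"{fn.value}: {len(items)} open")
```

A flat list like this is enough to start with. The point is that every action has an owner and a function, not that the schema is right.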

Why engineers should care

NIST AI RMF is increasingly referenced in US government procurement. If your company sells AI to federal agencies, expect RFPs to ask about your AI RMF alignment. Even outside government, it's a solid structure for thinking about risk.

The companion NIST AI RMF Playbook provides suggested actions for each subcategory, which is useful when you're translating framework language into backlog items.

ISO/IEC 42001

ISO/IEC 42001 (published in 2023) is the international standard for AI management systems. Think of it as ISO/IEC 27001 (information security) but for AI. It specifies requirements for establishing, implementing, maintaining, and continually improving an AI management system.

Key elements:

  • AI policy — documented organizational commitment to responsible AI.
  • Risk assessment — systematic identification and evaluation of AI-specific risks.
  • AI system impact assessment — evaluate impacts on individuals, groups, and society.
  • Data management — controls for data quality, provenance, and appropriateness.
  • Third-party management — oversight of AI components from vendors.

ISO 42001 is certifiable, meaning you can get audited and receive a certificate. This matters for enterprise sales where customers ask "are you ISO 42001 certified?"

Internal AI Governance Structures

External frameworks are useless without internal machinery to execute them. The pieces that work:

AI Review Board

A cross-functional body that reviews AI use cases before they ship. Typical composition:

  • Engineering lead (technical risk)
  • Product manager (user impact)
  • Legal/compliance (regulatory risk)
  • Ethics/policy (societal impact)
  • Domain expert (context-specific risks)

What makes review boards fail: meeting too infrequently, having no authority to block launches, or reviewing at the wrong stage (too early = vague, too late = sunk cost).

What makes them work: tiered review (lightweight for low-risk, deep for high-risk), clear criteria for what triggers review, and fast turnaround so teams don't route around them.

Use-Case Approval Process

Not every AI application needs the same scrutiny. A practical tiered system:

  • Tier 1 (self-serve) — internal tools, non-customer-facing, low-stakes. Team fills out a short form, auto-approved unless flagged.
  • Tier 2 (lightweight review) — customer-facing but low-risk. One reviewer, decision within a week.
  • Tier 3 (full review) — high-risk, sensitive data, vulnerable populations. Full board review, may require external audit.

The classification criteria should be explicit: does it affect hiring? Credit? Health? Does it use biometric data? Does it make autonomous decisions?
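Explicit criteria can be written down as a small function, which keeps triage consistent across teams. The sketch below assumes that any single high-risk trigger escalates to Tier 3 and that customer-facing use with no trigger lands in Tier 2. That rule, the `UseCaseIntake` fields, and `classify_tier` are all illustrative assumptions, not a prescribed policy.

```python
from dataclasses import dataclass


@dataclass
class UseCaseIntake:
    """Answers from a hypothetical intake form; field names are illustrative."""
    customer_facing: bool
    affects_hiring_credit_or_health: bool
    uses_biometric_data: bool
    makes_autonomous_decisions: bool
    involves_vulnerable_populations: bool = False


def classify_tier(intake: UseCaseIntake) -> int:
    """Return 1, 2, or 3. Any high-risk trigger escalates straight to Tier 3."""
    high_risk_triggers = (
        intake.affects_hiring_credit_or_health,
        intake.uses_biometric_data,
        intake.makes_autonomous_decisions,
        intake.involves_vulnerable_populations,
    )
    if any(high_risk_triggers):
        return 3
    if intake.customer_facing:
        return 2
    return 1


internal_summarizer = UseCaseIntake(
    customer_facing=False, affects_hiring_credit_or_health=False,
    uses_biometric_data=False, makes_autonomous_decisions=False)
support_chatbot = UseCaseIntake(
    customer_facing=True, affects_hiring_credit_or_health=False,
    uses_biometric_data=False, makes_autonomous_decisions=False)
resume_screener = UseCaseIntake(
    customer_facing=False, affects_hiring_credit_or_health=True,
    uses_biometric_data=False, makes_autonomous_decisions=True)

print(classify_tier(internal_summarizer),
      classify_tier(support_chatbot),
      classify_tier(resume_screener))  # prints: 1 2 3
```

Encoding the rule this way also gives the review board something concrete to argue about when a team disputes its tier.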

Model Risk Management

Financial services have done model risk management since long before the AI hype. SR 11-7 (the Federal Reserve's 2011 supervisory guidance on model risk management) lays out a framework any industry can learn from:

  1. Model development — sound theory, robust methodology, documented assumptions.
  2. Model validation — independent review by people who didn't build the model. Testing against holdout data, stress testing, sensitivity analysis.
  3. Model use — ongoing monitoring once deployed. Performance tracking, drift detection, periodic revalidation.
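For the drift detection mentioned under model use, one common choice in credit-risk practice is the Population Stability Index (PSI), which compares the distribution of a score or feature at validation time with its recent distribution. SR 11-7 does not prescribe PSI. The sketch below is one possible check, with equal-width bins and a small floor for empty bins as simplifying assumptions.

```python
import math


def population_stability_index(expected, actual, n_bins=10):
    """PSI between a baseline sample and a recent sample of one numeric value."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]  # scores seen at validation time
shifted = [v + 0.3 for v in baseline]     # recent scores, shifted upward

print(population_stability_index(baseline, baseline))
print(population_stability_index(baseline, shifted))
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those cutoffs are conventions, so any threshold that triggers revalidation should be agreed per model.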

Translating MRM to ML systems

  • Model inventory — a registry of every model in production, its purpose, owner, risk tier, last validation date, and dependencies.
  • Challenger models — maintain alternatives that can replace the primary model if it degrades.
  • Change management — retraining, fine-tuning, or prompt changes should go through a defined review process proportional to the risk tier.
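A model inventory can start as a typed record plus one query that answers "what is overdue for validation?". In the sketch below the field names mirror the list above. The revalidation windows in `MAX_VALIDATION_AGE` are a made-up policy and the sample models are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class ModelRecord:
    name: str
    purpose: str
    owner: str
    risk_tier: int        # 1, 2, or 3
    last_validated: date
    dependencies: list = field(default_factory=list)


# Hypothetical policy: higher tiers must be revalidated more often.
MAX_VALIDATION_AGE = {
    1: timedelta(days=365),
    2: timedelta(days=180),
    3: timedelta(days=90),
}


def overdue_for_validation(inventory, today):
    """Return records whose last validation is older than their tier allows."""
    return [
        m for m in inventory
        if today - m.last_validated > MAX_VALIDATION_AGE[m.risk_tier]
    ]


inventory = [
    ModelRecord("credit-scorer", "loan pre-approval", "risk-team", 3,
                date(2025, 1, 15), ["bureau-data-feed"]),
    ModelRecord("ticket-router", "internal support triage", "it-ops", 1,
                date(2024, 9, 1)),
]

overdue = overdue_for_validation(inventory, today=date(2025, 6, 1))
print([m.name for m in overdue])  # prints: ['credit-scorer']
```

Passing `today` in explicitly keeps the check deterministic, which makes it easy to run in a scheduled job and in tests.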

Putting It Together

A minimal viable governance stack for a mid-size AI team:

  1. Model registry — every model tracked with metadata, risk tier, owner.
  2. Use-case intake form — triages new AI applications into risk tiers.
  3. Review process — self-serve for Tier 1, single-reviewer for Tier 2, full board for Tier 3.
  4. Evaluation suite — automated bias, safety, and performance checks that run before any model ships.
  5. Audit log — who deployed what, when, with what evaluation results.
  6. Periodic review cadence — quarterly re-review of high-risk systems.
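Items 4 and 5 fit together naturally: the evaluation suite produces metrics, a gate compares them to thresholds, and the outcome is written to the audit log. The following is a minimal sketch of that hand-off. The metric names, the limits in `THRESHOLDS`, and the entry layout are assumptions for illustration only.

```python
import json
from datetime import datetime, timezone

# Hypothetical thresholds; real values depend on the system and its risk tier.
THRESHOLDS = {
    "accuracy": ("min", 0.90),
    "selection_rate_gap": ("max", 0.05),
    "unsafe_output_rate": ("max", 0.01),
}


def failed_checks(results):
    """Return the names of metrics that are missing or outside their limit."""
    failures = []
    for metric, (kind, limit) in THRESHOLDS.items():
        value = results.get(metric)
        if value is None:
            failures.append(metric)  # a missing metric counts as a failure
        elif kind == "min" and value < limit:
            failures.append(metric)
        elif kind == "max" and value > limit:
            failures.append(metric)
    return failures


def audit_entry(model, version, deployer, results, now=None):
    """Build one audit log record: who, what, when, and with what results."""
    failures = failed_checks(results)
    return {
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "model": model,
        "version": version,
        "deployer": deployer,
        "results": results,
        "failed_checks": failures,
        "approved": not failures,
    }


entry = audit_entry(
    "support-bot", "1.4.0", "alice",
    {"accuracy": 0.93, "selection_rate_gap": 0.08, "unsafe_output_rate": 0.002},
)
print(json.dumps(entry, indent=2))
```

Recording rejected attempts as well as approved ones is a deliberate choice here, since blocked launches are often what an auditor asks about first.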

Start with the registry and the intake form. Everything else builds on knowing what you have and how risky it is.
