How We Think About AI Risk: A Procurement-Grade Architecture
Most of the AI marketing aimed at enterprises right now is built for the early adopter. Move fast. Replace headcount. Ship the agent. The buyers we work with - regulated financial services firms, healthcare providers, councils, large professional services partnerships - sit in a different conversation. They are responsible for outcomes the hype cycle does not account for. When a drug review goes wrong, a clinical decision goes wrong, a planning consent goes wrong, or a customer communication goes wrong, the consequence is not a slipped sprint. It is a regulator visit, a press cycle, or a court file.
Calling those buyers laggards is lazy. They are exactly the customers AI needs to earn trust with, because the value they generate when AI works for them is the largest in the market. The architecture has to meet them where they are. Here is how we think about the risk model when we build for those clients.
Bounded Autonomy as a First Principle
We classify every action an AI system can take in a client deployment on a five-level autonomy scale. Level zero is read-only - the AI can observe but cannot act. Level one is suggestion-only - it drafts, a human approves before anything ships. Level two is bounded action with mandatory review queues - the AI can act on items inside a tightly defined envelope, but everything is logged and reviewed before it lands externally. Level three is autonomous within a watched lane, with anomaly detection that escalates to a human the moment behaviour drifts. Level four is unsupervised - reserved for the lowest-stakes, highest-volume internal tasks where the cost of any individual error is negligible.
The default for anything brand-attached, customer-facing, or regulator-visible is capped at level two. Forever. We do not believe responsible AI deployment for regulated enterprises will ever move that ceiling for those classes of action. The reason is not that the model could not do it. The reason is that the cost of being wrong is asymmetric. A single bad customer message, a single fabricated clinical note, a single drift in regulatory filing tone is worth more in damage than years of efficiency gains. Bounded autonomy is not a temporary phase until the AI gets better. It is the right model for the work.
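The scale and the ceiling can be sketched in a few lines. This is an illustration, not the platform's actual API - the names, the action classes, and the `effective_autonomy` helper are all assumptions made for the example. The one property it demonstrates is the real one: for brand-attached, customer-facing, or regulator-visible action classes, no tenant setting can raise autonomy above level two.

```python
from enum import IntEnum

# Hypothetical rendering of the five-level autonomy scale described above.
class Autonomy(IntEnum):
    READ_ONLY = 0     # observe but cannot act
    SUGGEST = 1       # drafts; a human approves before anything ships
    BOUNDED = 2       # acts inside an envelope; reviewed before landing externally
    WATCHED = 3       # autonomous in a lane; anomaly detection escalates drift
    UNSUPERVISED = 4  # lowest-stakes, highest-volume internal tasks only

# Hard platform ceilings per action class (class names are illustrative).
CEILING = {
    "brand_attached": Autonomy.BOUNDED,
    "customer_facing": Autonomy.BOUNDED,
    "regulator_visible": Autonomy.BOUNDED,
    "internal_bulk": Autonomy.UNSUPERVISED,
}

def effective_autonomy(action_class: str, tenant_setting: Autonomy) -> Autonomy:
    """A tenant's chosen level never exceeds the platform ceiling for the class."""
    return min(tenant_setting, CEILING[action_class])
```

Even a tenant that configures `Autonomy.UNSUPERVISED` for customer-facing work gets `Autonomy.BOUNDED` back - the ceiling is enforced by the platform, not by policy documents.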
Brand Integrity as a Hard Boundary
For client-facing output, we treat the client's brand voice as a non-negotiable input rather than something the model is free to interpret. The voice, tone, vocabulary, and hard-rule list (banned phrases, mandatory framings, regulatory wording) sit in a calibrated knowledge layer. Every generation passes through that layer. Drift is detected and blocked before output reaches the review queue. When a senior reviewer corrects something, the correction flows back into calibration immediately, so the same drift cannot recur.
This matters because most enterprise content failures are not factual hallucinations. They are tonal failures - language that is technically accurate but wrong for the audience, the brand, or the regulator. A fund manager telling a retail client about volatility in the same language they would use with a wholesale counterparty has produced a compliance issue, not a typo. Tonal alignment requires the same engineering rigour as factual alignment. We treat it that way.
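The hard-rule portion of that calibration layer can be sketched as data plus a gate. Everything below is a simplified assumption - the real layer also handles tone and vocabulary, which a phrase list cannot - but it shows the two behaviours described above: drift is blocked before the review queue, and a reviewer correction updates the rules immediately.

```python
from dataclasses import dataclass, field

# Illustrative hard-rule voice layer: banned phrases and mandatory framings.
@dataclass
class VoiceRules:
    banned: set[str] = field(default_factory=set)
    mandatory: set[str] = field(default_factory=set)

    def check(self, text: str) -> list[str]:
        """Return the list of violations; an empty list passes to review."""
        issues = []
        low = text.lower()
        for phrase in self.banned:
            if phrase.lower() in low:
                issues.append(f"banned phrase: {phrase!r}")
        for framing in self.mandatory:
            if framing.lower() not in low:
                issues.append(f"missing mandatory framing: {framing!r}")
        return issues

    def learn_correction(self, bad_phrase: str) -> None:
        # A reviewer correction flows straight back into calibration,
        # so the same drift is blocked on the next generation.
        self.banned.add(bad_phrase)
```

The fund-manager example above falls out naturally: wholesale-counterparty vocabulary sits on the retail-audience banned list, so the technically accurate but tonally wrong sentence never reaches the customer.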
Sandboxed Execution and No Subprocess Transports
When AI needs to take an action - hit an API, run a calculation, file a record - we do not let the model execute commands directly. Execution runs inside isolated V8 sandboxes with strict resource and network limits. External APIs are called through credential proxies that the model itself never sees the keys for. Tenant data is partitioned by row-level security at the database. The model can request that an action be performed; it cannot perform the action itself.
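The request-not-execute pattern looks roughly like this. All names here are hypothetical - the point of the sketch is the shape: the model emits a structured action request, a separate executor validates it against an allowlist, and the credential is looked up outside the model's context, so the key never appears in anything the model reads or writes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionRequest:
    tenant_id: str
    action: str   # a symbolic action name, never a command string
    params: dict

# Per-tenant envelope of permitted actions (illustrative).
ALLOWED_ACTIONS = {"fetch_rates", "file_record"}

def execute(request: ActionRequest, vault: dict) -> str:
    """Executor-side handling: validate, then act with proxied credentials."""
    if request.action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action outside envelope: {request.action}")
    # Credential lookup happens here, server-side. The model never sees it.
    token = vault[request.tenant_id]
    # A real executor would call the upstream API through the credential
    # proxy; this stub just shows the key never travels with the request.
    return f"performed {request.action} with a proxied credential ({len(token)} chars)"
```

Because `ActionRequest` carries only a symbolic name and parameters, there is nothing in the model's output channel that could ever contain or leak a key.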
This architecture also means there are no subprocess transports anywhere in the executor layer. That phrasing matters to procurement because of a class of vulnerabilities that has surfaced in tool-and-plugin frameworks across the industry over the last twelve months: arbitrary subprocess execution exposed as a configuration surface. We do not have that surface. Tenant configuration is declarative manifest data with strict schema validation, never a path to spawning arbitrary processes. When enterprise procurement teams ask about MCP-class risk, that is the one-line answer. The property was chosen for security, multi-tenancy, and cost reasons before any of the recent disclosures; it simply carries through to procurement.
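A closed-schema manifest validator makes the claim concrete. The schema below is an assumption for illustration, not our actual manifest format; what it demonstrates is the structural property: every key is checked against a finite allowlist of typed fields, so there is no field that could ever name a binary, a shell command, or a process to spawn.

```python
# Illustrative closed schema: only these keys, only these types.
ALLOWED_KEYS = {
    "autonomy_ceiling": int,
    "review_depth": int,
    "retention_years": int,
    "integrations": list,  # symbolic integration names, resolved server-side
}

def validate_manifest(manifest: dict) -> dict:
    """Reject anything outside the declared schema - no escape hatch."""
    for key, value in manifest.items():
        if key not in ALLOWED_KEYS:
            # Unknown keys fail outright, rather than being passed through
            # to some downstream interpreter.
            raise ValueError(f"unknown manifest key: {key}")
        if not isinstance(value, ALLOWED_KEYS[key]):
            raise ValueError(f"bad type for {key}: {type(value).__name__}")
    return manifest
```

Rejecting unknown keys rather than ignoring them is the load-bearing choice: a permissive validator that drops unrecognised fields can silently become an execution surface when a downstream component starts reading them.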
Audit Trail by Default
Every action the AI proposes, every input it considered, every reviewer decision, every override, every calibration event is recorded in an immutable audit trail. Seven-year retention for regulated clients as standard. The audit data is structured, queryable, and exportable. When a regulator, board, or internal compliance function needs to reconstruct why a particular AI-influenced decision was made, the answer is on file. We do not believe AI has earned trust in regulated industries until the auditability of its decisions matches the auditability of its human equivalents. We build to that bar from day one.
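One common way to make such a trail tamper-evident is hash chaining: each record commits to the digest of the previous one, so any after-the-fact edit breaks the chain on verification. The sketch below is a minimal illustration of that general technique, not the shipped implementation.

```python
import hashlib
import json
import time

class AuditTrail:
    """Append-only trail where each record commits to its predecessor."""

    def __init__(self):
        self.records = []
        self._prev_hash = "0" * 64  # genesis sentinel

    def append(self, event: dict) -> str:
        record = {"ts": time.time(), "prev": self._prev_hash, "event": event}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self.records.append((digest, record))
        self._prev_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute every digest; any edit anywhere breaks the chain."""
        prev = "0" * 64
        for digest, record in self.records:
            if record["prev"] != prev:
                return False
            recomputed = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()
            ).hexdigest()
            if recomputed != digest:
                return False
            prev = digest
        return True
```

On export, a regulator or compliance function can re-run the verification independently - the trail proves its own integrity without trusting the exporter.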
Why We Do Not Sell Replacement
A lot of AI vendors lead with headcount reduction. We do not. The deployments we have shipped that compound real value are ones where the AI made the existing team materially more productive - took the assembly out of senior work, took the wait time out of decision flow, took the institutional knowledge out of someone's head and into a system. The headcount that gets affected is usually back-office assembly, low-leverage work. The senior judgment becomes more valuable, not less, because it is now the input that powers the system rather than the rate-limiting step.
There is also a more practical reason. Enterprises that lay off staff to deploy AI tend to discover six to twelve months later that the institutional knowledge they let go was load-bearing. The AI that was supposed to replace it cannot, because the symbolic layer never got captured. We have seen this happen in several segments already. We do not build for it.
Working with the Tenant's Risk Posture
Different enterprises sit in different positions on the risk curve, and the architecture has to accommodate that. A clinical trials operator and a marketing agency are not the same client. We let tenants choose the autonomy ceiling for each class of action, the review depth, the escalation rules, the retention policies, and the integration surface. Tenants that need to start at suggestion-only across the board can do that. Tenants that have already built strong human-in-the-loop processes and want to delegate more aggressively can do that, with the audit trail catching anything that goes off-pattern. There is no single right answer. There is the answer that fits the tenant's actual obligation to their customers, regulators, and shareholders. The platform adapts to that, not the other way around.
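Concretely, two tenant postures on the same platform might look like this. The field names and values are hypothetical illustrations of the knobs listed above, not a real configuration format.

```python
# A conservative posture: suggestion-only everywhere, deep review, long retention.
clinical_trials_operator = {
    "default_autonomy": 1,            # suggestion-only across the board
    "review_depth": "dual_reviewer",
    "retention_years": 7,
    "escalate_on": ["any_anomaly"],
}

# A more delegating posture: bounded action with single review, within ceilings.
marketing_agency = {
    "default_autonomy": 2,            # bounded action, mandatory review queues
    "review_depth": "single_reviewer",
    "retention_years": 3,
    "escalate_on": ["tone_drift", "unsourced_claim"],
}
```

Both are valid answers for their respective obligations; neither could raise a customer-facing action class above the platform ceiling even if it tried.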
What This Means for Procurement
When an enterprise procurement team is evaluating AI vendors, the questions they should be asking go beyond model capability. Where does the data sit. Who controls the credentials. How is execution isolated. What is the autonomy ceiling and is it adjustable per action class. Is there an audit trail and how long is it retained. What happens when the model is wrong - is the failure caught before the customer sees it. What is the architectural answer on subprocess and tool-execution risk. Can a regulator reconstruct why a specific decision was made. We have a clear answer to each of these. The answer is not slideware - it is the architecture we ship.
AI deployment in regulated industries is going to compound through the second half of this decade. The vendors that earn the trust of responsible buyers in 2026 will hold those relationships for years. The ones that lead with replacement, unbounded autonomy, and configuration-as-execution will not. We have made our bet on the responsible side of that line.
If you are inside a regulated enterprise evaluating AI platforms and want to walk through the architecture in detail, we are happy to have that conversation under NDA.
Get in touch