Stop Treating AI Risk Like a Prompt Engineering Problem

Felix-Sebastian Cosma
3 days ago
5 min read

Updated: 2 days ago

Most companies treat AI risk like a prompt engineering problem. They write longer system prompts, add more warnings, tighten the wording, and tell the model to behave like a responsible employee. This is not useless. A better prompt can reduce confusion and give the model better context. But it does not solve the deeper problem.

AI risk is not mainly a writing problem. It is a decision problem. The moment an AI system is allowed to access data, call tools, send messages, approve actions, or trigger workflows, the prompt stops being the real boundary. The real boundary is what the system allows the AI to do.

A prompt is a request. Governance is a system. That difference matters.

Why prompts are not enough

A prompt can say, "Do not send confidential information." Governance says, "This agent cannot access confidential information unless a policy allows it." A prompt can say, "Ask a human before taking an important action." Governance says, "This action cannot execute until the right person approves it." One is an instruction. The other is a control.

This is the same reason companies do not run on motivational speeches alone. You can tell employees to be careful with money, contracts, production systems, and customer data. Serious companies still define permissions, approval chains, audit trails, and escalation rules. They do not rely on good intentions as the final protection layer.

Think about a junior employee. If he makes a serious decision he was never authorized to make, the solution is not simply to tell him to be more careful next time. The solution is to define his authority. What can he access? What can he approve? What must he escalate? What is he never allowed to do alone?

Why would we treat AI agents differently?

The risk changes when AI moves from answering to acting

A chatbot that gives a bad answer is one category of risk. An agent that sends that answer to a customer is another. An agent that updates a CRM record, approves a refund, changes a security setting, publishes a public statement, or triggers an internal workflow is operating in a different world.

This is where many AI discussions become confused. People keep asking whether the model is accurate enough, aligned enough, or smart enough. Those questions matter, but they are incomplete. Once the system can act, the more important question is whether the agent has the authority to make that decision in the first place.

A wrong answer can be corrected. A wrong action can create consequences. That is why the risk boundary should not be the model. It should be the decision layer around the model.

Prompt injection proves the point

Prompt injection is usually described as an attack on the model. That is only partially true. Prompt injection becomes dangerous because the model is connected to tools, permissions, private data, and actions. The attacker is not merely trying to make the model say something strange. He is trying to make the system do something it should not do.

If an agent reads malicious text and writes a bad summary, that is annoying. If the same agent reads malicious text and then emails confidential data, deletes a record, approves an action, or changes a workflow, that is not just a model failure. It is a governance failure.

The problem is not only that the model was tricked. The problem is that being tricked was enough to make something happen.

No serious system should depend on the inner discipline of a language model as its final line of defense. Models can misunderstand context. Models can be manipulated. Models can be overconfident. A production system has to assume that failure is possible and still prevent unauthorized consequences.

What governance should actually control

AI governance should not be reduced to a PDF policy document that nobody reads. It should control real operating questions. Who is allowed to create an agent? What data can it access? What tools can it use? What decisions can it make alone? What decisions require approval? Who approves them? What gets logged? What happens when a rule is violated?

These are not abstract compliance questions. They are practical questions. They decide whether an AI workflow is safe enough to run inside a real company.

The temptation is to avoid this work because it slows down the demo. That is exactly why it matters. Demos reward freedom. Production rewards control. A demo looks better when the agent does ten things in a row without asking for help. A real company needs to know when the agent must stop, ask, escalate, or refuse.

Control is not the enemy of AI adoption. Control is what makes adoption possible.

Approval is about authority, not stupidity

Many people talk about human approval as if it is a temporary weakness. They imagine a future where models become so intelligent that approval disappears. I do not think that is how serious organizations work.

Approval is not only about intelligence. Approval is about authority. A human does not approve an action because the AI is stupid. A human approves an action because someone must own the consequences.

Even in companies full of competent people, not every employee can sign contracts, approve payments, access payroll data, publish official statements, or change production systems. Not because everyone is incompetent. Because authority must be controlled.

The same principle applies to AI. The more capable an agent becomes, the more important its boundaries become. Capability without authority control is not maturity. It is risk with better marketing.

The enterprise does not buy magic

The AI industry likes magic because magic sells demos. A prospect sees an agent plan a task, call a tool, write a message, and complete a workflow. It feels powerful. It feels like the future. And in some ways it is.

But enterprises do not buy magic for long. They buy trust. They ask boring questions because boring questions are what keep companies alive. Who approved this? Can we see the audit trail? Can legal block it? Can security limit access? Can compliance prove that the policy was followed? Can a manager override the decision? Can we stop the agent before it causes damage?

Those questions are not obstacles. They are the market. The companies that answer them will survive the move from AI demos to AI production. The companies that ignore them will keep building impressive toys that nobody serious wants to trust.

The principle

Better prompts are useful. They are not governance. They can shape behavior, but they should not be treated as the final control layer for systems that can act.

AI risk should be treated as a decision problem. What is the AI allowed to do? Under what conditions? With what approval? With what audit trail? With whose ownership?

The question is not only what the model says. The question is what the system allows the model to do.

Autonomy requires accountability. Every decision needs an owner.