Stop Writing Defensive Prompt Boilerplate
Every AI engineering team ends up writing some version of the same thing: ad-hoc input sanitization, hand-tuned system prompt prefixes warning the model not to follow injected instructions, regex filters for known attack patterns. This code is scattered across prompt templates, duplicated between services, and tested against whatever attacks the team happened to think of.
It also does not work very well. Hand-written defenses are brittle. They break when you update your prompt. They miss attack variants. They create false confidence.
Platemail replaces all of it with a single, tested security layer. Your application sends requests to the proxy instead of directly to the model provider. The proxy applies prompt hardening (nonce-based role markers, attention dilution, tag defanging) and forwards the request. Your prompt templates stay clean. Your defensive logic lives in one place: a YAML policy file.
Tested Against Real Attacks, Not Toy Examples
The defensive templates in Platemail are tested against 2,600+ real attack cases across 13 attack categories, including direct injection, indirect injection, multi-turn attacks, encoding-based evasion, role impersonation, and payload smuggling.
This testing is continuous. When new attack techniques surface, they get added to the test suite, and the defensive templates are tuned until they hold. The result is prompt hardening language that has been refined through thousands of adversarial iterations, not a set of instructions someone guessed might work.
The Cost of Getting It Wrong
Prompt injection is not a theoretical risk. Companies have already paid real money and taken real reputational damage from it.
Both incidents share the same root cause: the model followed instructions that overrode the business logic it was supposed to enforce. Platemail's prompt hardening prevents this by ensuring injected instructions have no authority over the model's behavior, regardless of how they are phrased.
Exfiltration Prevention
Prompt injection is not only about making the model say the wrong thing. A more dangerous class of attacks uses injection to extract secrets, PII, or proprietary data from the model's context.
Platemail scans model responses for sensitive data being smuggled out. The detection is encoding-aware: it catches secrets hidden in base64, hexadecimal, ROT13, and layered combinations of these encodings. An attacker who tricks the model into encoding your API keys before outputting them still gets caught.
This runs as part of the same proxy pipeline. No additional service, no additional latency budget.
Drop-In Deployment
Platemail is a proxy. You point your API calls at it instead of directly at the model provider. That is the entire deployment.
- Works with OpenAI, Anthropic, and any OpenAI-compatible API
- No SDK changes, no code changes, no prompt template changes
- Context window advertisement: the proxy reports a slightly smaller context window to clients, so your application self-limits and the proxy always has room for its defensive tokens
- Runs anywhere you can run a Ruby process: bare metal, containers, Kubernetes
Works With Your Existing Stack
If you already use an LLM gateway or observability platform, Platemail does not replace it. The proxy handles one job: the security transform layer. Routing, caching, cost tracking, and observability stay where they are.
Platemail sits in front of these tools (or behind them, depending on your architecture) and adds the security layer they do not provide. It also emits OTEL trace annotations, so your existing observability pipeline gets visibility into what the proxy did to each request.
Read the technical details on how prompt hardening prevents injection attacks.
How It Works → GitHub