score
68 / 100
The demo is useful, but it needs stronger tool boundaries, denial receipts, and indirect injection evidence before real customer data is exposed.
agent security review sample
This is a fictional sample report for a customer-support agent demo. It shows the shape of the paid review: prompt-injection drill results, tool-boundary gaps, policy evidence, audit readiness, and the patch order before a buyer, judge, or security reviewer sees it.
sample report
score
The demo is useful, but it needs stronger tool boundaries, denial receipts, and indirect injection evidence before real customer data is exposed.
scope
Reviewed README, tool manifest, CRM adapter, approval screen notes, policy YAML, audit schema, demo script, and test fixtures.
ship call
Demo with fake data now. Hold production customer data until P0 tool and audit patches are complete.
pass
The agent cannot send an external customer email without an approval step. The approval UI shows recipient and draft text.
fix
The CRM adapter can write tags, notes, priority, owner, and status. The demo only needs tag and draft-note access.
fix
Tests include direct malicious prompts, but not malicious instructions hidden inside tickets, PDFs, or retrieved web pages.
why this matters now
The useful question is not whether a prompt looks suspicious. The useful question is whether a deployed workflow can prove what data an agent read, what tool it called, what sink it tried to reach, what policy matched, and what happened when a malicious instruction came from untrusted content.
That is why the report is written as a boundary map and patch order. It makes the dangerous sink visible first, then forces every fix to leave evidence a buyer, judge, or security reviewer can inspect.
deliverable:
boundary map
drill results
policy gaps
audit evidence
patch order
demo decision
source trail
lablab
The current enterprise AI hackathon track asks for guardrails, monitoring, access control, audit trails, explainability, and red-team tooling.
open sourcelobstertrap
Lobster Trap shows the same shape: ingress and egress checks, policy actions, declared intent, filesystem/network policy, and JSON-line audit decisions.
open sourceprompt-injection
HackerOne's March 2026 release frames agentic prompt injection testing around end-to-end exploit evidence across retrieval and tool workflows.
open sourcesource-sink
OpenAI's March 2026 security note describes agentic risk as untrusted external content combined with actions like transmitting data or using tools.
open sourceoffer
One repo or demo. One agent workflow. The review returns a boundary map, drill results, evidence gaps, and the patch order before public launch.