tp tate@programs field note
tate@programs ~/notes/agent-security-evidence-2026 may 2026

field note / agent security

Agent demos need evidence, not vibes.

The 2026 agent-security question is no longer just whether the model can answer. It is whether a reviewer can inspect what the agent was allowed to read, write, call, send, spend, deny, and escalate.

surface: agent demos
risk: tools / data / spend
output: evidence pack
updated: May 2026

practical read

The trust layer has to be visible.

Agent teams are moving from chat demos into workflows that touch private files, SaaS APIs, MCP servers, payment flows, browsers, databases, and other agents. That creates a different launch bar. A useful demo is not enough if the reviewer cannot see the boundaries.

The public TechEx / lablab enterprise AI hackathon tracks point at the same gap: agent security, AI governance, Gemini-based agents, and deployable enterprise workflows. The interesting part is not the phrase "agent security." It is the evidence buyers and judges can inspect.

What an evidence pack should show

  • What prompt-injection and exfiltration drills were run.
  • Which policy rule matched the risky step.
  • Whether the action was allowed, denied, rate-limited, quarantined, or sent to human review.
  • Which tool scope was active: read, write, network, filesystem, payment, or destructive action.
  • What the agent claimed it intended to do versus what the policy layer detected.
  • Which audit event proves the decision after the demo is over.
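One way to make that checklist concrete is a structured decision record. A minimal Python sketch, assuming illustrative field names and verdict labels taken from the list above (nothing here is a standard schema):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from uuid import uuid4

# Verdicts from the checklist: allowed, denied, rate-limited,
# quarantined, or sent to human review. Field names are assumptions.
VERDICTS = {"allow", "deny", "rate_limit", "quarantine", "human_review"}

@dataclass
class PolicyDecision:
    rule_id: str          # which policy rule matched the risky step
    verdict: str          # what happened to the action
    tool_scope: str       # read / write / network / filesystem / payment / destructive
    claimed_intent: str   # what the agent said it was doing
    detected_action: str  # what the policy layer actually observed
    event_id: str = field(default_factory=lambda: str(uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")

# One record covers every bullet: rule, verdict, scope, claim vs. reality,
# and an event id that proves the decision after the demo is over.
decision = PolicyDecision(
    rule_id="no-secrets-egress",
    verdict="deny",
    tool_scope="network",
    claimed_intent="summarize the config file",
    detected_action="POST .env contents to external URL",
)
print(asdict(decision)["event_id"])
```

The point of the `event_id` and `timestamp` defaults is that every decision is citable later without extra work during the demo.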

A2A makes identity part of security

The Agent2Agent project frames A2A around agent discovery, capability declaration, task collaboration, and secure interoperability. That means an agent-security review should not stop at prompt text. It should also ask: who is this agent, what skills does it declare, what provider owns it, and what authentication is expected before another agent delegates work to it?
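Those identity questions can be read straight off an A2A-style Agent Card. A hedged sketch: the field names follow the public A2A examples loosely, the exact schema should be checked against the spec, and `invoice-review-agent` plus its URL are made up for illustration:

```python
import json

# Hypothetical Agent Card; field names approximate the A2A examples.
agent_card = {
    "name": "invoice-review-agent",
    "description": "Reviews vendor invoices and flags anomalies",
    "url": "https://agents.example.com/invoice-review",  # made-up endpoint
    "provider": {"organization": "Example Corp"},        # who owns this agent
    "version": "0.1.0",
    "authentication": {"schemes": ["bearer"]},           # expected before delegation
    "skills": [
        {
            "id": "flag-anomalies",
            "name": "Flag invoice anomalies",
            "description": "Scans invoice line items for outliers",
        }
    ],
}

def identity_questions(card: dict) -> dict:
    """The four review questions from the text, answered from the card."""
    return {
        "who": card["name"],
        "skills": [s["id"] for s in card["skills"]],
        "provider": card["provider"]["organization"],
        "auth_expected": card["authentication"]["schemes"],
    }

print(json.dumps(identity_questions(agent_card), indent=2))
```

A review that stops at prompt text never touches any of these fields; a review that starts from the card has something to verify before delegation happens.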

Gemini belongs in the test loop, not in the leak path

Gemini and AI Studio are useful for generating adversarial drill cases, summarizing evidence, and testing long-context agent workflows. The operational mistake is putting a model key where the browser bundle, a screenshot, or a public repo can expose it. A safe demo uses AI Studio directly or keeps API keys behind a server-side route.
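The server-side route is small. A sketch assuming the public Gemini REST endpoint and the `x-goog-api-key` header; the model name and payload shape should be verified against the current API reference. The key lives in the server environment and never appears in the browser payload or the URL:

```python
import json
import os
from urllib import request

# Endpoint shape per the public Gemini REST docs; verify before relying on it.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)

def build_server_request(user_prompt: str) -> request.Request:
    """Attach the key server-side; the browser only ever sends the prompt."""
    api_key = os.environ["GEMINI_API_KEY"]  # server env, never the JS bundle
    body = json.dumps(
        {"contents": [{"parts": [{"text": user_prompt}]}]}
    ).encode()
    return request.Request(
        GEMINI_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # header, so the key is not in the URL
        },
        method="POST",
    )
```

The browser calls your own route with the prompt; only the server builds this request. A screenshot of the client, the bundle, or the network tab then shows no key at all.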

Policy without audit is hard to sell

A policy rule that blocks a bad action is useful during the demo. An audit event that explains the block is useful after the demo. The second part is what makes the difference for regulated workflows, enterprise buyers, and judges looking for measurable risk reduction.
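The difference between the two is roughly one append-only log. A sketch with an assumed JSONL layout, not a prescribed format:

```python
import json
from pathlib import Path

AUDIT_LOG = Path("audit.jsonl")  # append-only; survives the demo session

def record_block(rule_id: str, reason: str, evidence: str) -> None:
    """During the demo: append a reviewable explanation for every block."""
    event = {
        "rule": rule_id,
        "decision": "deny",
        "reason": reason,
        "evidence": evidence,
    }
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(event) + "\n")

def explain(rule_id: str) -> list[dict]:
    """After the demo: pull every event that justifies the block."""
    with AUDIT_LOG.open() as f:
        events = [json.loads(line) for line in f]
    return [e for e in events if e["rule"] == rule_id]

record_block(
    "no-secrets-egress",
    "outbound request body contained a credential pattern",
    "matched rule on AWS_SECRET_ACCESS_KEY substring",
)
print(len(explain("no-secrets-egress")))
```

The `record_block` half is the policy working; the `explain` half is what a buyer or judge can actually inspect a week later.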

The small workflow that helps most

  1. Load the project or paste the architecture notes.
  2. Run prompt-injection, exfiltration, tool-boundary, and human-review checks.
  3. Generate the fix queue.
  4. Export a policy starter, A2A Agent Card, audit schema, and Gemini drill prompt.
  5. Keep the evidence pack with the submission or buyer handoff.
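The five steps compress into a small driver. The check names come from the list above; the check logic here is a placeholder substring match, not real detection, and the exported pack contents are illustrative:

```python
# Step 2 checks as placeholder predicates over the architecture notes.
CHECKS = {
    "prompt_injection": lambda notes: "ignore previous instructions" in notes.lower(),
    "exfiltration": lambda notes: "post .env" in notes.lower(),
    "tool_boundary": lambda notes: "unrestricted shell" in notes.lower(),
    "human_review": lambda notes: "auto-approve payments" in notes.lower(),
}

def run_drills(architecture_notes: str) -> dict:
    """Steps 2-3: run the checks, collect the failures into a fix queue."""
    findings = {name: check(architecture_notes) for name, check in CHECKS.items()}
    return {
        "findings": findings,
        "fix_queue": [name for name, risky in findings.items() if risky],
    }

def export_pack(result: dict) -> dict:
    """Step 4: bundle the exports. Contents are illustrative placeholders."""
    return {
        "policy_starter": {"rules": [f"deny:{i}" for i in result["fix_queue"]]},
        "audit_schema": ["rule_id", "verdict", "tool_scope", "timestamp"],
        "fix_queue": result["fix_queue"],
    }

notes = "Agent has unrestricted shell access and will auto-approve payments."
pack = export_pack(run_drills(notes))
print(pack["fix_queue"])  # ['tool_boundary', 'human_review']
```

Step 5 is just keeping `pack` (serialized however you like) next to the submission, so the evidence outlives the demo session.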

Why I built the drill kit

Agent Security Drill Kit is a browser-only version of that workflow. It is intentionally local, because early-stage teams should be able to inspect a project without uploading source code to a third-party scanner. The current version exports a review pack that can be used for hackathon submissions, launch reviews, and product demos that need a concrete trust story.

open https://tateprograms.com/agent-security-drill.html

source trail

Signals this page tracks.

techex

Agent Security & AI Governance

The May 2026 TechEx / lablab page centers enterprise agent security on guardrails, observability, access control, audit, and red-team tooling.


lobster

Policy actions

Lobster Trap positions prompt inspection around policy actions, metadata, audit logs, and deployable enforcement for agent workflows.


a2a

Agent identity

The A2A project describes agent discovery, capability negotiation, task collaboration, and security expectations between independent agents.


gemini

Model testing

Google's Gemini API and AI Studio docs are the integration path for teams using Gemini in agent workflows and test generation.
