AI Security · 2024

AI Security and Agentic Workflow Automation

Performed adversarial review of LLM-enabled automation agents and built Python-based multi-agent security workflows for alert triage.

Surfaced prompt-injection and unsafe tool-use paths while improving throughput across a 1,000-plus-alert-per-day triage workload.

Architecture Diagram

How the system fits together

This diagram shows the shape of the system at a glance: where untrusted input enters, where policy decisions happen, and what the useful outputs are.

Scope: AI Security
Signals: 1,000+
Technical threat-model diagram for an agentic security workflow showing untrusted input, policy gateway, agent runtime, tool mediation, and audit outcomes.

Snapshot

What matters most in this project

1,000+: Alerts in repetitive triage flow
Policy-first: Agent permission model
Auditable: Tool-use and workflow reasoning

Challenge

The hard part was not just making automation work. It was making agentic workflows useful without leaving prompt, tool-use, and permission boundaries dangerously loose.

Result

The result was a security automation flow that improved triage throughput while making agent permissions, tool-use boundaries, and auditability easier to reason about.

Approach

  • Reviewed LLM-enabled agents as systems, not just models, with attention to prompt injection, unsafe retrieval, and untrusted tool invocation.
  • Designed Python-based multi-agent workflows to automate repetitive triage and enrichment tasks in a way that reduced manual fatigue.
  • Mapped failure paths early so the workflow could stay observable and easier to constrain.
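The first step above, screening untrusted input before it reaches an agent, can be sketched as a simple pre-triage filter. This is a minimal illustration, not the project's actual code: the pattern list, the `Alert` shape, and the routing labels are all hypothetical.

```python
import re
from dataclasses import dataclass

# Illustrative markers of prompt-injection attempts embedded in alert text.
# A real deployment would use a broader, maintained detection set.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"you are now",
    r"system prompt",
    r"disregard .*polic",
]

@dataclass
class Alert:
    alert_id: str
    raw_text: str

def screen_alert(alert: Alert) -> dict:
    """Route suspicious alert text to human review instead of the agent."""
    hits = [p for p in INJECTION_PATTERNS
            if re.search(p, alert.raw_text, re.IGNORECASE)]
    return {
        "alert_id": alert.alert_id,
        "suspicious": bool(hits),
        "matched_patterns": hits,
        "route": "human_review" if hits else "auto_enrichment",
    }
```

The point of the sketch is the routing decision: text that looks like an instruction to the model never becomes agent context without an analyst seeing it first.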

Architecture

  • Separated model reasoning, policy decisions, and tool execution so the workflow could be constrained instead of trusting the agent as a monolith.
  • Used Python agents and enrichment steps to automate repetitive triage while preserving explicit control over inputs, permissions, and outputs.
  • Kept logging and workflow state visible so decisions could be reconstructed during review or after failure.
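The separation described above can be sketched as a gateway that sits between agent reasoning and tool execution. This is a hedged illustration under assumed names (`PolicyGateway`, `ip_lookup`); the real policy model and audit schema were richer.

```python
from datetime import datetime, timezone

class PolicyGateway:
    """Mediate agent tool calls: allowlist check plus an append-only
    audit trail so every allow/deny decision can be reconstructed."""

    def __init__(self, allowlist):
        # Maps tool name -> set of permitted argument names.
        self.allowlist = allowlist
        self.audit_log = []

    def invoke(self, agent_id, tool_name, tool_fn, **kwargs):
        allowed = (tool_name in self.allowlist
                   and set(kwargs) <= self.allowlist[tool_name])
        self.audit_log.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool_name,
            "args": sorted(kwargs),
            "decision": "allow" if allowed else "deny",
        })
        if not allowed:
            raise PermissionError(f"{agent_id} denied for {tool_name}")
        return tool_fn(**kwargs)

# Usage: the agent proposes a call; the gateway decides and records it.
gw = PolicyGateway({"ip_lookup": {"ip"}})
result = gw.invoke("triage-agent", "ip_lookup",
                   lambda ip: {"ip": ip, "reputation": "clean"},
                   ip="203.0.113.7")
```

Because denial is logged before the exception is raised, a blocked call leaves the same audit evidence as a permitted one.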

Impact

The adversarial review surfaced concrete prompt-injection and unsafe tool-use paths, and the automated workflow improved throughput on a triage workload exceeding 1,000 alerts per day.

  • Modeled failure paths for agentic systems that interact with tools and external context.
  • Reduced human triage load by automating repetitive detection and enrichment steps.

Tradeoffs and Decisions

  • Accepted some automation overhead in exchange for safer tool mediation and clearer audit trails.
  • Focused on constrained usefulness rather than maximum autonomy because the failure modes mattered more than the demo value.
  • Designed for human override and review instead of treating automation as a full replacement for analyst judgment.
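The human-override principle in the last bullet reduces to a small routing rule: only high-confidence, reversible actions execute automatically, and everything else lands in an analyst queue. The action names and threshold below are illustrative assumptions, not the project's real values.

```python
def route_action(action: str, confidence: float, threshold: float = 0.9) -> str:
    """Auto-execute only reversible actions with high model confidence;
    queue everything else for analyst review."""
    reversible = {"tag_alert", "enrich_context"}  # assumed safe-to-undo set
    if action in reversible and confidence >= threshold:
        return "auto_execute"
    return "analyst_queue"
```

Irreversible actions (closing or escalating an alert) always route to a human regardless of confidence, which keeps the automation useful without letting it replace analyst judgment.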

Stack

Tools and technologies behind the work

Python · LLM Security · Automation · Detection Engineering