
July 30, 2025

From Hunt to Verdict: Automating IOC Sweeps with the File Hash Search Agent

This is blog #6 in a series of 6 discussing AI Agents, the Query Security Data Mesh, and why normalized data is the differentiator in AI for Security Operations. As part of this blog series, we’re introducing the release of six mission-specific AI Agents now available in preview to Query customers. These agents are designed to assist with core SOC workflows, bringing targeted automation to key areas like triage, investigation, and response.

Background: The “Why” – The Sisyphean Task of the IOC Sweep

In cybersecurity, speed is the ultimate advantage. When a new threat emerges, validated by a file hash Indicator of Compromise (IOC), a race against time begins. The question is simple, but the answer is profoundly complex: “Are we affected?” For most Security Operations Centers (SOCs), this question triggers a laborious, manual process known as the “IOC sweep.”

An analyst must take this single piece of data—a hash—and meticulously search for it across a dozen or more security tools. They query the EDR, pivot to the SIEM, check the cloud logs, search the data lake, and consult the threat intelligence platform. Each tool has its own query language, its own interface, and its own data retention policies. This “swivel-chair” investigation is slow, prone to human error, and creates critical gaps in visibility. A missed query or a forgotten data source can be the difference between containment and catastrophe. This is the “why”: the manual IOC sweep is a foundational security task that is fundamentally broken at scale.

This is where the paradigm of mission-specific AI agents operating on a security data mesh becomes essential. Instead of moving petabytes of data into a central repository, a security data mesh—like the one powered by Query Federated Search—creates a unified semantic layer to access data where it lives. When you give a purpose-built AI agent secure access to this mesh, you empower it to perform complex analytical tasks with the speed and completeness that humans simply cannot match. The mesh provides the data access; the agent provides the automated expertise.

The “What”: Deconstructing the File Hash Search Agent

The File Hash Search Agent is a specialized AI analyst built for a single, critical purpose: to be an expert in finding file hashes across your entire digital estate. It eliminates the need for analysts to know the specific query syntax or data schema of every security tool. Instead, an analyst can simply state their intent in natural language, and the agent translates it into a precise, optimized, and comprehensive search.

Core Capabilities and Architecture

The agent’s intelligence is derived from a sophisticated architecture that combines a powerful reasoning engine with a suite of specialized tools and deep, contextual knowledge of security data.

LLM Engine: At its heart, the agent uses a state-of-the-art Large Language Model (LLM) to understand user requests, reason through complex problems, and formulate a plan of action.
Specialized Tools: The agent isn’t just a language model; it’s a practitioner equipped with the right tools for the job:
- Search Schema – (search_fsql_schema): Before writing a query, the agent can consult the underlying data schema. It knows exactly where to look for hash values, whether it’s file_activity.file.hashes.value or network_file_activity.file.hashes.value, and understands the different hash algorithms (MD5, SHA-1, SHA-256, etc.).
- Validate Query – (validate_fsql_query): Every query the agent constructs is validated for syntactic correctness before it is run. This prevents searches from failing on syntax errors and avoids the time wasted rerunning them.
- Execute Query – (execute_fsql_query): The agent can autonomously execute the validated query against the security data mesh, retrieving the results for analysis.
Federated Search Integration: This is the agent’s force multiplier. It doesn’t search a siloed database. It leverages Query Federated Search to run a single, unified search across all connected security tools simultaneously. Whether the hash exists in an EDR alert on a laptop, a cloud storage log in AWS, or a proxy log from three weeks ago, the agent can find it. This architecture grounds the agent’s analysis in the complete reality of the user’s environment.
Embedded Knowledge: The agent is pre-loaded with the entire FSQL (Federated Search Query Language) documentation. It possesses an expert-level, intrinsic understanding of the query language’s syntax, functions, and best practices, ensuring the queries it builds are not only correct but also highly efficient.
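To make the schema, validate, execute sequence concrete, here is a minimal Python sketch. The three function names come from the tool list above, but everything else is an assumption for illustration: their signatures, the in-memory `SCHEMA_PATHS` and `EVENTS` data, and the `HashQuery` object are hypothetical stand-ins, and the query object is deliberately not FSQL syntax.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for the three tools named above.
# The real tools' signatures and return shapes are not described in this post.
SCHEMA_PATHS = [
    "file_activity.file.hashes.value",
    "network_file_activity.file.hashes.value",
]

# Toy event store: each record says which schema path held the hash.
EVENTS = [
    {"path": "file_activity.file.hashes.value", "value": "abc123", "device": "laptop-01"},
    {"path": "network_file_activity.file.hashes.value", "value": "def456", "device": "proxy-02"},
]


@dataclass(frozen=True)
class HashQuery:
    """Illustrative query object; this is NOT FSQL syntax."""
    paths: tuple
    hash_value: str


def search_fsql_schema(keyword):
    """Return the schema paths that contain the keyword."""
    return [path for path in SCHEMA_PATHS if keyword in path]


def validate_fsql_query(query):
    """Accept only queries with a non-empty hash and known schema paths."""
    known = all(path in SCHEMA_PATHS for path in query.paths)
    return bool(query.hash_value) and bool(query.paths) and known


def execute_fsql_query(query):
    """Return events whose hash matches on any of the queried paths."""
    return [
        event for event in EVENTS
        if event["path"] in query.paths and event["value"] == query.hash_value
    ]


def sweep(hash_value):
    """Look up the schema, build the query, validate it, then execute it."""
    paths = tuple(search_fsql_schema("hashes.value"))
    query = HashQuery(paths=paths, hash_value=hash_value.strip().lower())
    if not validate_fsql_query(query):
        raise ValueError("query failed validation; not executed")
    return execute_fsql_query(query)
```

The point of the ordering is that an invalid query never reaches execution: `sweep` raises before any search runs, which mirrors the validate-before-run behavior described above.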

The “So What”: Strategic, Operational, and Tactical Value

Understanding the agent’s mechanics is one thing; understanding its impact on security operations is another.

Drastic Reduction in Dwell Time: The agent collapses the IOC sweep from a lengthy manual process into under a minute of automated analysis. The sooner a sighting is confirmed, the sooner a threat can be contained before it propagates.
Guaranteed Comprehensive Hunts: By querying the security data mesh, the agent searches every connected data source on every request. It removes the risk of an analyst forgetting to check a specific source, so coverage is limited only by which tools are connected to the mesh.
Democratization of Expertise: The agent empowers every analyst, regardless of experience level, to perform expert-level threat hunts. It encapsulates the knowledge of a senior threat hunter and makes it available on demand, elevating the capabilities of the entire team.
Unwavering Consistency and Accuracy: The agent follows its best-practice workflow for every single request. It never gets tired, never cuts corners, and never makes a typo in a query. This brings an unprecedented level of standardization and reliability to a critical security function.

The “Now What”: The Agent in Action

Let’s explore how the File Hash Search Agent transforms daily SOC workflows:

Use Case 1: Rapid CTI Triage

Scenario:
A threat intelligence alert provides a SHA-256 hash for a newly identified piece of ransomware.
Traditional Workflow:
An analyst copies the hash. They log into the EDR console and search. They log into the SIEM and search. They check the data lake. This takes 15-30 minutes, assuming they know the correct query syntax for each platform.
Agent Workflow (less than 1 minute):
The analyst asks the agent: “Search for the file hash 123abcde… across all systems for the last 48 hours.”
Outcome:
The agent immediately generates and validates a federated query, searches all connected sources, and returns a definitive “found” or “not found.” If found, it provides the analyst with the event details, affected host, user, and a timeline.
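The fan-out behind that outcome can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SOURCES` dictionary, the `search_source` helper, and the event fields are hypothetical, and a real deployment would go through Query Federated Search rather than local lists. The clock is fixed so the example is reproducible.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

# Fixed "now" so the example gives the same result on every run.
NOW = datetime(2025, 7, 30, 12, 0, tzinfo=timezone.utc)

# Hypothetical per-source events; real connectors would call each tool's API.
SOURCES = {
    "edr": [{"hash": "aa11", "host": "laptop-01", "user": "jdoe",
             "time": NOW - timedelta(hours=5)}],
    "siem": [{"hash": "aa11", "host": "srv-02", "user": "svc-backup",
              "time": NOW - timedelta(hours=72)}],
    "cloud": [],
}


def search_source(name, hash_value, since):
    """Return matching events from one source, tagged with the source name."""
    return [
        {**event, "source": name}
        for event in SOURCES[name]
        if event["hash"] == hash_value and event["time"] >= since
    ]


def federated_verdict(hash_value, lookback_hours, now=NOW):
    """Search every source concurrently and return a verdict plus a timeline."""
    since = now - timedelta(hours=lookback_hours)
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(search_source, name, hash_value, since)
            for name in SOURCES
        ]
        hits = [event for future in futures for event in future.result()]
    hits.sort(key=lambda event: event["time"])  # oldest sighting first
    return {"verdict": "found" if hits else "not found", "timeline": hits}
```

Calling `federated_verdict("aa11", 48)` on this toy data reports `found` with a single sighting on `laptop-01`, because the older `srv-02` event falls outside the 48-hour window.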

Use Case 2: Incident Response Pivot

Scenario:
During an investigation, a malicious file is found on a server. The IR team needs to know if that file exists anywhere else.
Traditional Workflow:
The IR lead dispatches analysts to search for the file’s MD5 hash in their respective toolsets. The process is slow and requires manual correlation of results.
Agent Workflow (less than 1 minute):
The IR lead asks the agent: “Execute a search for the MD5 hash 456defgh… and show me every device and user it’s associated with.”
Outcome:
The agent provides a complete list of all sightings of the malicious file, allowing the IR team to instantly understand the scope of the compromise and move to containment.
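Turning raw sightings into a scope summary is a simple grouping step. The sketch below assumes each sighting is a dictionary with hypothetical `device` and `user` keys; the actual result format returned by the agent is not described here.

```python
from collections import defaultdict


def summarize_scope(sightings):
    """Map each device to the sorted list of users seen with the file on it."""
    users_by_device = defaultdict(set)
    for sighting in sightings:
        users_by_device[sighting["device"]].add(sighting["user"])
    return {
        device: sorted(users)
        for device, users in sorted(users_by_device.items())
    }


# Hypothetical sightings, including a duplicate that should be collapsed.
sightings = [
    {"device": "srv-02", "user": "svc-backup"},
    {"device": "laptop-01", "user": "jdoe"},
    {"device": "srv-02", "user": "admin"},
    {"device": "srv-02", "user": "svc-backup"},
]
scope = summarize_scope(sightings)
```

Deduplicating by device and user gives the IR lead one line per affected host instead of one line per log event, which is usually the more useful view when deciding what to contain first.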

Use Case 3: Proactive Hunting with Ambiguous Data

Scenario:
A threat hunter finds a hash in a technical blog post but the algorithm isn’t specified.
Traditional Workflow:
The hunter would have to run multiple searches, treating the hash as an MD5, then a SHA-1, then a SHA-256, hoping to get a hit.
Agent Workflow (less than 1 minute):
The hunter asks the agent: “I have this hash string: 789ghijk…. I don’t know the algorithm. Can you find it?”
Outcome:
The agent, understanding its own capabilities, constructs a query that searches the raw hash value field across all algorithms, instantly providing an answer without the need for repetitive manual searches.
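Searching the raw value field means the algorithm never has to be known up front. Still, the length of a hex digest narrows the possibilities, which is handy for labeling results. The helper below is a hypothetical pre-check, not the agent's own method, and it covers only the common cases: 32 hex characters for MD5, 40 for SHA-1, 64 for SHA-256, and 128 for SHA-512. Length alone is not proof, since other algorithms produce digests of the same size.

```python
import re

# Hex digest length mapped to the most common algorithm of that size.
# Other algorithms share these lengths (for example SHA3-256 is also 64),
# so treat the result as a hint rather than an identification.
HEX_LENGTH_TO_ALGORITHMS = {
    32: ["MD5"],
    40: ["SHA-1"],
    64: ["SHA-256"],
    128: ["SHA-512"],
}


def candidate_algorithms(hash_string):
    """Return likely algorithms for a hex digest, or [] if it is not one."""
    value = hash_string.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", value):
        return []  # not hexadecimal, so not a plain hex digest
    return HEX_LENGTH_TO_ALGORITHMS.get(len(value), [])
```

An empty result is a useful signal in itself: it suggests the string was truncated, mistyped, or encoded in something other than hex, and is worth checking before a sweep is run.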

The Future is a Team of Specialists

The File Hash Search Agent is a powerful demonstration of the future of AI in cybersecurity. It is not a general-purpose AI, but a focused, specialized expert designed to do one job exceptionally well. The future of the modern SOC lies in building a team of these digital specialists—a Vulnerability Intelligence Agent, a Detection Triage Agent, an Asset Info Agent, and more—all working in concert on a unified foundation of federated data. This is how we move beyond simply managing data and start delivering automated, intelligent, and proactive security.

Contributed by:

Neal Bridges

CISO, Query