Project Hail Query, Part 3 of 4
Security operations has a data problem that predates AI and gets worse because of it. Most SOCs are working with less than half their security data. The rest is sitting in tools and storage tiers that were never built to feed a centralized analytics platform, inaccessible when an alert fires. When AI agents run on top of that fragmented foundation, they reason about what they can see. The coverage gap doesn’t move. The AI makes the gap run faster.
The teams that will win with AI in security are the ones that solved the data problem before they deployed it. That means broad access to security data wherever it lives, a normalized data model that resolves inconsistencies before the AI touches them, and complete context. Context that exists but isn’t accessible is, for investigation purposes, context that doesn’t exist.
That’s the foundation we built. This post is about what we built on top of it.
In Project Hail Mary, Ryland Grace doesn’t save the world by finding a better version of what he already had. He survives because he encountered a completely different kind of intelligence in Rocky: one with capabilities he didn’t have, that needed his judgment in ways he didn’t expect, and that worked with him rather than for him or instead of him. What makes it work isn’t just that Rocky is smart. It’s that Rocky is constructed differently. He perceives in wavelengths Grace can’t see. He processes information at speeds Grace can’t match. And Grace brings things Rocky fundamentally lacks: context, judgment, the ability to reason about situations neither of them has encountered. They don’t do the same job at different speeds. They do different jobs that only produce a solution when combined.
That’s the design principle behind Query Workers. Not an autonomous SOC that removes your team from the picture. Not a chatbot that summarizes logs. It’s AI-powered automation that is constructed differently: it queries across every data source, correlates across system boundaries, and runs down every IOC, then hands the findings to your analysts for the judgment calls that matter most. Different jobs that only produce a solution when combined. That’s the model.
Introducing Query Workers
Query Workers is a set of automated investigation capabilities that run on the Query Security Data Mesh. A Worker receives an alert or a directive, runs a structured multi-stage investigation across your entire data environment, and produces evidence-backed findings with recommended actions and disposition. The grunt work is complete, with evidence sourced, cited, and documented, before a human analyst touches the queue.
The investigation isn’t a summary of what the alert said. It’s a full reconstruction of what happened: which systems were involved, which users or entities are implicated, what the sequence of events looks like across every relevant source, what the threat intelligence says, how severe it is, and what the evidence chain supports. Your analyst’s job is the judgment call, not starting the investigation from scratch.
The Investigation Worker handles alert triage and analysis across the full mesh. The Threat Hunting Worker takes a hypothesis and runs a four-phase hunt: data mapping, investigation, pattern discovery, detection logic generation. The Identity Threat Assessment Worker sweeps your identity infrastructure across all identity providers simultaneously, testing eight categories of identity-based attack patterns. Each produces auditable evidence, confidence levels, and recommended actions.
Here’s what that looks like against real data.
The Mesh Unlocks the Value
The thing that makes these investigations possible isn’t the AI. It’s what the AI reasons against. Your analysts are good at this work. The problem isn’t skill, it’s physics. A senior analyst investigating a single alert across 4 tools spends the first 30 to 60 minutes just accessing the data: logging into consoles, writing vendor-specific queries, normalizing fields mentally, pivoting between tabs. Multiply that by 50 to 200 alerts per shift, and most alerts never get investigated at all.
Workers change the math. Every example below required data from sources that don’t talk to each other natively, correlated through the mesh in a single session.
Tracking a Kill Chain from Inbox to Cloud
A Worker received a malware hash and hunted it across the full mesh, from initial delivery in email, to endpoint compromise in EDR, to lateral spread through collaboration platforms.
What the Worker delivered: Active multi-vector campaign — 41 compromised hosts (24 Windows, 17 Linux), 38% email bypass rate, 9 MITRE ATT&CK techniques mapped, lateral spread via SharePoint and Teams, kill chain reconstructed from delivery to persistence.
How: 24 queries across 10+ connectors (Proofpoint, CrowdStrike, CarbonBlack, O365 ATP, VirusTotal, AlienVault OTX, MISP, and 3 additional threat intel sources).
Correlated insights: 74 email events linked to 42 EDR detections linked to 44 O365 ATP alerts linked to 6 sender domains — none of which appeared together in any single tool.
Manual equivalent: Open Proofpoint, query the hash. Open CrowdStrike, query the hash. Open O365 admin, search for the hash. Open VirusTotal, AlienVault, MISP — one hash at a time. Export, normalize, correlate. That’s one IOC. This investigation mapped 9 TTPs across 6 sender domains and 41 hosts.
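The fan-out pattern behind that hunt can be sketched in a few lines. This is a hypothetical illustration, not the product’s actual API: the connector names, the stubbed `query_connector` function, and the canned results are all assumptions standing in for real federated queries.

```python
# Hypothetical sketch: fan one IOC out across multiple connectors and merge
# the hits into a single correlated view. A real mesh would dispatch
# vendor-specific queries and normalize the responses; here they are stubbed.

IOC = "44d88612fea8a8f36de82e1278abb02f"  # illustrative file hash

CANNED_RESULTS = {
    "proofpoint":  [{"type": "email", "sender_domain": "bad.example"}],
    "crowdstrike": [{"type": "edr_detection", "host": "win-01"}],
    "virustotal":  [{"type": "intel", "verdict": "malicious"}],
}

def query_connector(connector: str, ioc: str) -> list[dict]:
    """Return normalized hits for one IOC from one connector (stubbed)."""
    return CANNED_RESULTS.get(connector, [])

def hunt_ioc(ioc: str, connectors: list[str]) -> dict[str, list[dict]]:
    """Run the same IOC against every connector; keep non-empty results."""
    hits = {}
    for c in connectors:
        results = query_connector(c, ioc)
        if results:
            hits[c] = results
    return hits

findings = hunt_ioc(IOC, ["proofpoint", "crowdstrike", "o365", "virustotal"])
```

The point is the shape of the work, not the stub: one indicator, one loop, every source, with empty sources dropping out of the merged view automatically, which is exactly the step that takes a human analyst a console login per tool.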
Finding a Compromised Identity Across Providers
An Identity Assessment Worker found that a service account which normally authenticates every 15 minutes from a server had logged in once from a laptop via Chrome. It pivoted through DHCP to identify the laptop, cross-referenced the IP against other authentication events, and identified the human user active on that device minutes before and after the anomalous login.
What the Worker delivered: Service account compromise attributed to a specific employee, with the full reasoning chain: anomalous login → device type mismatch → DHCP pivot → user correlation → behavioral timeline.
How: 39 queries across 6 identity connectors (Okta, Entra ID, JumpCloud, Auth0, DHCP, device inventory), testing 8 identity attack patterns.
Correlated insights: 1 anomalous service account event correlated with 13 user authentication events on the same device, DHCP lease records, and device inventory — resolving an identity no single IdP could surface.
Manual equivalent: Open each IdP console, export sign-in logs, normalize timestamps and IP formats, cross-reference by hand. Most organizations run identity assessments quarterly because of the cost. Workers make it continuous.
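The DHCP pivot itself is mechanically simple once the data is in one place. The sketch below is illustrative only: the field names, hostnames, and time window are assumptions, not the Worker’s actual logic.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the DHCP pivot: resolve the anomalous login's source
# IP to a device via DHCP leases, then attribute it to whichever user
# authenticated from that device near the same time.

anomalous_login = {"account": "svc-backup", "ip": "10.20.30.40",
                   "time": datetime(2025, 4, 28, 9, 17)}

dhcp_leases = [
    {"ip": "10.20.30.40", "hostname": "LAPTOP-7F3K",
     "start": datetime(2025, 4, 28, 8, 0), "end": datetime(2025, 4, 28, 12, 0)},
]

user_logins = [
    {"user": "jdoe",   "hostname": "LAPTOP-7F3K",  "time": datetime(2025, 4, 28, 9, 12)},
    {"user": "asmith", "hostname": "DESKTOP-9Q2Z", "time": datetime(2025, 4, 28, 9, 15)},
]

def pivot_to_device(login, leases):
    """Find the DHCP lease that covered the login's source IP at login time."""
    for lease in leases:
        if lease["ip"] == login["ip"] and lease["start"] <= login["time"] <= lease["end"]:
            return lease["hostname"]
    return None

def correlate_user(hostname, logins, at, window=timedelta(minutes=15)):
    """Return users active on the device within +/- window of the event."""
    return [l["user"] for l in logins
            if l["hostname"] == hostname and abs(l["time"] - at) <= window]

device = pivot_to_device(anomalous_login, dhcp_leases)
suspects = correlate_user(device, user_logins, anomalous_login["time"])
```

Each step is a trivial join; the hard part is that the lease records, sign-in logs, and device inventory normally live in three different systems with three different timestamp and IP formats.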
Detecting Cross-Environment Spread
An overnight triage surfaced 120+ new alerts across 50+ hosts. The Worker ran 7-day lookbacks on the 5 most suspicious and found multi-stage kill chains — the same attack progression on corporate endpoints and an AWS EC2 instance on a different subnet.
What the Worker delivered: 5 confirmed kill chains across corporate and cloud, 11 MITRE ATT&CK techniques across 5 tactics, cross-environment spread identified between 172.16.16.x corporate hosts and a 10.100.6.x cloud instance.
How: 20 queries across endpoint detection and identity sources, with per-host 7-day deep dives on priority hosts and the EC2 instance.
Correlated insights: 120+ alerts reduced to 5 confirmed compromises with structural evidence — sign-in anomaly → persistence → discovery → lateral movement — and a cloud-to-corporate correlation visible only through the mesh.
Manual equivalent: 120 alerts means an analyst either triages superficially (missing the pattern) or picks a few to investigate deeply (missing the scope). The Worker did both — breadth-first discovery, then depth on the hosts that mattered.
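The breadth-then-depth pattern can be sketched as a two-pass triage. This is a simplified illustration under assumed data shapes; real prioritization would weigh severity, asset criticality, and detection confidence, not just alert volume.

```python
from collections import Counter

# Hypothetical sketch of breadth-then-depth triage: a cheap breadth pass
# counts alerts per host, then only the noisiest hosts get the expensive
# 7-day deep-dive. Alert shape and threshold are illustrative.

alerts = (
    [{"host": "corp-01",  "severity": 8}] * 12 +
    [{"host": "corp-02",  "severity": 5}] * 3 +
    [{"host": "ec2-web",  "severity": 9}] * 7 +
    [{"host": "corp-03",  "severity": 2}] * 1
)

def prioritize(alerts: list[dict], top_n: int = 2) -> list[str]:
    """Breadth pass: rank hosts by alert volume, keep the top N for depth."""
    counts = Counter(a["host"] for a in alerts)
    return [host for host, _ in counts.most_common(top_n)]

priority_hosts = prioritize(alerts)  # these hosts get the deep lookback
```

The design point is that neither pass alone is enough: the breadth pass sees the scope but not the kill chain, and the depth pass sees the kill chain but only on the hosts the breadth pass surfaced.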
The judgment call is the same. The hours of prep work before it are not.
Trust Is a First-Class Primitive
We made a deliberate design choice that runs counter to most of what you’ll hear at RSAC this week.
When being wrong has real consequences, the right design isn’t AI that acts autonomously. It’s AI that investigates thoroughly and produces output that makes the humans who decide better at deciding. Workers don’t take actions. They produce findings, confidence levels, and recommendations. Your analysts review them and make the call.
Every investigation produces a structured evidence package: a full report with findings and disposition, a query log documenting every query the Worker ran and what it returned, an IOC ledger tracking every indicator and its enrichment status, and on high-severity findings, a senior analyst review that pressure-tests the Worker’s reasoning before it reaches your team. The audit trail isn’t a summary. It’s the complete investigative record. Your analyst can inspect the full chain of reasoning. When they make a call, they’re making an informed decision, not ratifying a black box.
The normalization layer underneath Workers matters here in ways that go beyond consistency. As we showed in Post 2, AI models presented with inconsistent or contradictory field representations across sources don’t flag the ambiguity. They resolve it probabilistically and produce confident-sounding output that may reflect incorrect correlations. OCSF normalization eliminates that failure mode before the Worker touches the data. Okta authentication events, Entra ID sign-in logs, and JumpCloud directory events all resolve to the same field paths at query time. The Worker reasons against consistent structure across inconsistent platforms, which means its context window is spent on investigative reasoning rather than schema reconciliation. The result is more accurate findings, better token efficiency, and output your team can actually audit. That’s not a data engineering nicety. In an AI investigation context, it’s a reliability guarantee.
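To make the normalization claim concrete, here is a minimal sketch of per-source field mapping. The vendor field names below are illustrative approximations of each platform’s event shape, not exact API schemas, and the OCSF-style target paths (`actor.user.name`, `src_endpoint.ip`) are used loosely for illustration.

```python
# Hypothetical sketch: three IdPs report the same login with different field
# names; a per-source mapping projects them onto one common shape before any
# AI reasoning happens, so one query matches all sources.

FIELD_MAPS = {
    "okta":      {"user": "actor.alternateId",     "ip": "client.ipAddress"},
    "entra_id":  {"user": "userPrincipalName",     "ip": "ipAddress"},
    "jumpcloud": {"user": "initiated_by.username", "ip": "client_ip"},
}

def get_path(event: dict, dotted: str):
    """Walk a dotted path (e.g. 'actor.alternateId') through nested dicts."""
    for key in dotted.split("."):
        event = event[key]
    return event

def normalize(source: str, event: dict) -> dict:
    """Project a vendor event onto common OCSF-style field paths."""
    m = FIELD_MAPS[source]
    return {"actor.user.name": get_path(event, m["user"]),
            "src_endpoint.ip": get_path(event, m["ip"]),
            "metadata.product": source}

okta_evt  = {"actor": {"alternateId": "jdoe@corp.example"},
             "client": {"ipAddress": "10.0.0.5"}}
entra_evt = {"userPrincipalName": "jdoe@corp.example", "ipAddress": "10.0.0.5"}

a = normalize("okta", okta_evt)
b = normalize("entra_id", entra_evt)
# Same user and IP now live at identical field paths in both events.
```

Resolving the mapping deterministically at query time is what prevents the failure mode described above: the model never sees two competing names for the same fact, so it has nothing to guess about.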
Workers are also tunable. The investigation logic ships with proven workflows tested against real environments. Your organization modifies those workflows to match your threat model, your escalation standards, and the way your team actually works.
One Architecture That Supports Any Domain
The architecture that powers Query Workers supports any security workflow where the answers live in your data. The pattern is the same: discover what’s relevant, run domain-specific analysis, correlate across sources, produce actionable output.
We’re building a Vulnerability Prioritization Worker that produces a risk-ordered remediation picture using context from across your environment, a Phishing Investigation Worker that identifies the full recipient population across email platforms and correlates with endpoint and identity data, an Asset-to-Identity Correlation Worker that answers who was on this IP at this time by correlating across DHCP, VPN, EDR, and directory services simultaneously, and an Incident Timeline Builder that assembles a complete forensic timeline across every relevant connector.
Each of those is better because of the Query Security Data Mesh. Some of them are only possible because of it. Cross-boundary kill chain detection. Real-time blast radius calculation when a credential is compromised. Detection coverage mapping across your full stack. These aren’t features that are better with federation. They’re outcomes that are only achievable because of it.
The same is true for AI agents you’ve already built. If your team has developed custom investigation workflows, or you’re working with an AI SOC vendor whose agents you want to keep running, those agents can connect to the Query Security Data Mesh and query across your full environment with OCSF normalization underneath. You don’t have to choose between the AI capability you’ve already invested in and the data foundation that makes it more effective.
Available in Private Preview
We’ve opened a cohort of design partners for Query Workers, including BYO Agent access for teams running their own AI agents on the mesh. You run Workers or your own agents in your environment, against your actual data, and help shape what ships, with incentives for your investment.
Your data stays where it is and every investigation is fully auditable. Your team reviews every finding before any action is taken.
If you’re at RSAC this week, come find us. If you’re not, this link gets you to the right conversation.
The final installment of the series publishes tomorrow.
