This is part II of a series exploring the concepts and potential of Federated Security.

Introduction

In the trenches of cybersecurity, investigations and threat hunting are where the rubber meets the road: the combination of your people, processes, and technology is matched against a threat or adversary. Both investigations (including incident response) and threat hunting are games of hide and seek, and there is a timer running. Investigations tend to be reactive, prompted by an alert or report, while threat hunting is usually more proactive: there is a threat in the wild, so let’s see if it’s present in our environment. In both cases, the work is too often hampered by fragmented data and inefficient tooling. This isn’t due to a lack of talent or willpower; it’s the very structure of modern security ecosystems that sets operators up to fail.

In This Blog:

  • Why investigations fail – or fail to happen
  • Using federated security tools to enable and empower investigations or threat hunting
  • Example: SolarWinds
  • Practical steps toward investigations and threat hunting in a Federated secops environment

Let’s unpack this.

The Current Reality

Here’s the typical scenario: You’ve hired talented analysts, equipped them with a powerful Security Information and Event Management (SIEM) platform, usually Splunk, and set them loose. The more junior team members are typically investigating alerts or perhaps an employee report. Your more senior analysts build clever hypotheses informed by Cyber Threat Intelligence (CTI), past incidents, or creative intuition. Then comes the investigation phase, and suddenly everything slows to a crawl. Why? Because your Splunk instance, despite its power and flexibility, can’t feasibly ingest and store every data source.

Data onboarding is slow, expensive, and hindered by license constraints. Consequently, your analysts run queries against incomplete data sets, or against data sets that lag behind because they depend on upstream data pipelines. Worse yet, analysts start pivoting to external tools to chase down missing contextual data. This pivoting introduces inefficiencies, context-switching, and frustration. The net result is slow, ineffectual hunts.

Consider a real-world example: Your analysts suspect a particular variant of obfuscated malware execution using mixed-case PowerShell commands, a common technique for bypassing detection rules that match on exact strings. Using Splunk to analyze Windows Event Logs (Event ID 4688) is straightforward enough. However, crucial endpoint logs from your Endpoint Detection & Response (EDR) system, or network telemetry stored externally in an AWS data lake, aren’t available in Splunk. Analysts spend valuable time pivoting between consoles, manually correlating disparate logs, copying data into spreadsheets, or scripting painful API calls. By the time they confirm their suspicions, the threat actors are already embedded deeper or, worse, have moved on to exfiltrate sensitive data.
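To make the mixed-case trick concrete, here is a minimal Python sketch of how a hunt might cope with it. The keyword list, function names, and the threshold of four case flips are all hypothetical choices for illustration, not taken from any product. The idea is to search case-insensitively so the keyword cannot hide, then flag matches whose casing looks deliberately scrambled.

```python
import re

# Hypothetical keywords a hunt might care about; adjust to your own hypothesis.
KEYWORDS = ("powershell", "invoke-expression", "downloadstring")


def is_mixed_case(token: str) -> bool:
    """True when letters switch case more often than normal typing produces.

    "powershell", "POWERSHELL" and "PowerShell" pass; "pOwErShElL" is flagged.
    """
    letters = [c for c in token if c.isalpha()]
    # Count lower->upper or upper->lower transitions between adjacent letters.
    flips = sum(1 for a, b in zip(letters, letters[1:]) if a.islower() != b.islower())
    return flips >= 4  # assumed threshold, tune against your own data


def suspicious_keywords(command_line: str) -> list[str]:
    """Return keywords that appear in the command line with scrambled casing."""
    hits = []
    for keyword in KEYWORDS:
        # Case-insensitive search so the obfuscation cannot hide the keyword.
        for match in re.finditer(re.escape(keyword), command_line, re.IGNORECASE):
            if is_mixed_case(match.group(0)):
                hits.append(keyword)
                break
    return hits
```

A rule that matches only the literal string "powershell" would miss "pOwErShElL"; the case-insensitive search finds it, and the flip count separates it from ordinary spellings such as "PowerShell".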

This fragmented approach also significantly limits retrospective hunting. When intelligence emerges about a threat that was previously undetected (e.g., a SolarWinds-style supply chain compromise), analysts must comb through months of historical logs. Yet historical data rarely resides in expensive SIEM storage; it lives in archival storage like Amazon S3 buckets, Snowflake warehouses, or Azure Data Lakes. Hunting these sources directly is slow, complex, and requires specialized query skills. The result? Many security teams abandon thorough historical hunts altogether.

Federated search allows security operations teams to overcome the challenge of distributed, heterogeneous data when conducting investigations and threat hunts. Analysts can query multiple data sources simultaneously without having to centralize all security data into a single SIEM or data lake. Federated search queries data where it resides, in real time and at scale.

Federated Search isn’t new – Splunk, for instance, offers a limited federated search capability. But most implementations are partial at best, either restricted to a single vendor ecosystem (like Splunk-to-Splunk federation), or requiring extensive pre-processing, normalization, and data ingestion strategies.

Query was designed specifically for security operations and takes a more radically open and inclusive approach: it connects directly to various data sources via APIs, without moving or duplicating data, and executes queries natively at each data source. To ensure immediate usability, all results are normalized to a common data model, the Query Data Model (QDM), which is based on the Open Cybersecurity Schema Framework (OCSF).
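To illustrate what normalizing to a common data model involves, here is a minimal Python sketch. It is an assumption-laden illustration, not the actual QDM or Query's implementation: the raw field names for each source and the common field names (loosely styled after OCSF attributes) are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical per-source field maps: raw field name -> common field name.
# The common names are loosely modeled on OCSF-style attributes; they are
# illustrative and are not the actual QDM schema.
FIELD_MAPS = {
    "splunk": {"host": "device.hostname", "CommandLine": "process.cmd_line", "_time": "time"},
    "edr": {"device_name": "device.hostname", "process_cmdline": "process.cmd_line",
            "event_timestamp": "time"},
}


def to_epoch_ms(value) -> int:
    """Accept epoch seconds or an ISO 8601 string and return epoch milliseconds."""
    if isinstance(value, (int, float)):
        return int(value * 1000)
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC when unspecified
    return int(parsed.timestamp() * 1000)


def normalize(source: str, raw: dict) -> dict:
    """Rename known fields to the common names and tag the record with its origin."""
    record = {"source": source}
    for raw_name, common_name in FIELD_MAPS[source].items():
        if raw_name in raw:
            record[common_name] = raw[raw_name]
    if "time" in record:
        record["time"] = to_epoch_ms(record["time"])
    return record
```

Once every source is mapped this way, an analyst can filter or sort on `process.cmd_line` and `time` without caring which platform produced the record.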

This approach makes Federated Search a natural fit for both security investigations and the threat hunts they often turn into. Both benefit greatly from a federated search model.

Combining access to distributed data via Federated Search with security-normalized results makes for more than a search tool: it is a Federated Security solution, the first of its kind.

Imagine revisiting our previous malware hunting scenario with a federated approach: Your analysts build their hypothesis around suspicious PowerShell executions and craft a single search. Instead of manually pivoting across five different tools, the query simultaneously searches Splunk for Windows Event Logs, Carbon Black for endpoint process data, AWS CloudTrail for access logs, and Azure Sentinel for cloud telemetry. In seconds, analysts receive aggregated, normalized, collated, and correlated results directly in their interfaces. No more manual pivoting, spreadsheet gymnastics, or lost context.
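The fan-out described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product works: the three connector functions are stubs standing in for real API clients, and their names and record fields are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


# Stub connectors standing in for real API clients. Each one would translate
# the search into the platform's native query and return normalized records.
def search_siem(term: str) -> list[dict]:
    return [{"source": "siem", "time": 3, "host": "ws-01", "cmd_line": f"{term} -enc AAAA"}]


def search_edr(term: str) -> list[dict]:
    return [{"source": "edr", "time": 1, "host": "ws-01", "cmd_line": f"{term} -nop"}]


def search_cloud_logs(term: str) -> list[dict]:
    return []  # a source with no hits still participates in the search


CONNECTORS = [search_siem, search_edr, search_cloud_logs]


def federated_search(term: str) -> list[dict]:
    """Run every connector in parallel and merge the results into one timeline."""
    with ThreadPoolExecutor(max_workers=len(CONNECTORS)) as pool:
        per_source = list(pool.map(lambda connector: connector(term), CONNECTORS))
    merged = [record for records in per_source for record in records]
    return sorted(merged, key=lambda record: record["time"])
```

Calling `federated_search("powershell")` returns the EDR and SIEM records as a single list ordered by time, which is the "one search, one result set" experience the scenario describes. A production version would also need timeouts and per-source error handling so that one slow platform does not stall the whole search.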

So What: The Technical, Operational, and Business Impact

Federated Security isn’t just a technical convenience; its effect on time-to-answer is large enough to meaningfully impact the business.

Technical Benefits: Federated Search dramatically enhances data accessibility and visibility. Analysts can run queries across diverse environments, reducing the manual effort required to correlate data from multiple isolated systems. Consider the SolarWinds incident as a real-world example. Traditionally, analysts had to execute separate queries against multiple data sources and reconcile the results:

Splunk Query Example:

index=win_events EventCode=4688 CommandLine="*SolarWinds.Orion.Core.BusinessLayer.dll*"

AWS Athena Query Example (for logs stored in AWS S3 buckets):

SELECT * FROM security_datalake.cloudtrail_logs
WHERE eventName = 'InvokeFunction' 
AND requestParameters LIKE '%SolarWinds%'

Carbon Black Endpoint Query Example:

curl -X POST -H 'X-Auth-Token: YOUR_TOKEN' \
-H 'Content-Type: application/json' \
-d '{"query": "process_name:SolarWinds.Orion.Core*"}' \
https://cb-instance/api/investigate/v2/orgs/org_key/processes/search_jobs

Each query has to be written, run, and managed separately, which takes time and platform-specific knowledge. With federated search, analysts craft a single unified query that returns immediate, correlated insights from Splunk, AWS CloudTrail, and Carbon Black endpoints, significantly accelerating identification and remediation. It also shortens the window threat hunters and investigators need to test a hypothesis and validate what their hunt found.

Although this example was framed as a threat hunt, it could just as easily have been an investigation triggered by late-breaking news, much like when WannaCry hit hundreds of thousands of computers in May 2017. For a manufacturing enterprise with around 250,000 endpoints spread across 170 countries, the complexity was immense. A federated search solution would have drastically simplified and accelerated response efforts.

Traditionally, analysts had to manually run individual queries like:

Splunk Endpoint Event Query:

index=endpoint_logs event_code=4663 process_name="*wannacry*"

Network Flow Query (via AWS Athena):

SELECT src_ip, dst_ip, dst_port FROM network_logs
WHERE dst_port IN (445, 139) AND event_time BETWEEN '2017-05-12' AND '2017-05-13';

Carbon Black Endpoint Detection Query:

curl -X POST -H 'X-Auth-Token: YOUR_TOKEN' \
-H 'Content-Type: application/json' \
-d '{"query": "filemod:wnry.exe OR process_name:tasksche.exe"}' \
https://cb-instance/api/investigate/v2/orgs/org_key/processes/search_jobs

Federated search transforms this complex multi-step process into a single, streamlined query:

SELECT events FROM ALL_SOURCES
WHERE (process_name LIKE '%wannacry%' OR filemod = 'wnry.exe' OR dst_port IN (445, 139))
AND event_time BETWEEN '2017-05-12' AND '2017-05-13';

Instantly, analysts would have visibility into compromised endpoints, lateral movement attempts via SMB protocol, and infected files, enabling rapid containment and mitigation.

In both cases, whether an analyst is hunting across a complex environment or investigating an emerging threat, Federated Search for security greatly reduces the complexity of working in enterprise environments.

Operational Benefits: Federated Search optimizes analyst efficiency and reduces operational complexity. Analysts no longer spend hours pivoting across platforms, navigating multiple interfaces, and reconciling results manually. The reduction in investigative friction directly correlates with a lower mean time to detect (MTTD) and mean time to respond (MTTR), crucial metrics for security operations effectiveness.

Business Benefits: Implementing Federated Security presents substantial economic and strategic advantages by reducing reliance on costly SIEM solutions. Traditional SIEM systems often operate on data ingestion-based pricing, quickly escalating expenses as data volumes increase. For example, industry-standard solutions like Splunk typically cost around $3,780 per TB per year (assuming 10GB/day index volume and 1-year Splunk Cloud license), while Elastic’s SIEM solutions hover around $9,500 per TB per year (assuming 120GB/year Standard license).

Federated search significantly mitigates these expenses, potentially reducing data storage and access costs by over 80%, according to industry insights. By querying data directly in-place, federated search eliminates the need for expensive centralization and duplication, allowing organizations to utilize more economical storage options like cloud-based object storage.
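As a back-of-the-envelope illustration of the figures above, the following Python sketch applies the quoted Splunk price and an 80% reduction to a hypothetical 50 TB per year of security data. The volume is an assumption chosen purely for illustration, and treating the 80% figure as a flat reduction on the SIEM price is a simplification; real pricing depends on license tiers, retention, and query costs.

```python
SPLUNK_COST_PER_TB_YEAR = 3_780   # USD, figure quoted above
REDUCTION_PERCENT = 80            # "over 80%" claim quoted above, applied as a flat rate


def annual_siem_cost(volume_tb: int, cost_per_tb: int = SPLUNK_COST_PER_TB_YEAR) -> int:
    """Yearly cost of centralizing the given volume at a per-TB price."""
    return volume_tb * cost_per_tb


def cost_after_reduction(baseline: int, reduction_percent: int = REDUCTION_PERCENT) -> float:
    """Yearly cost once the stated percentage reduction is applied."""
    return baseline * (100 - reduction_percent) / 100


baseline = annual_siem_cost(50)             # hypothetical 50 TB per year
reduced = cost_after_reduction(baseline)
print(f"Centralized: ${baseline:,}  Federated estimate: ${reduced:,.0f}")
```

Under these assumptions, 50 TB costs $189,000 per year when centralized, against an estimate of roughly $37,800 after the reduction. Substitute your own volumes and contract prices before drawing conclusions.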

Now What: Practical Steps Toward Federated Threat Hunting

To prepare your organization for federated security, consider taking these actionable steps today:

  1. Assess Your Data Landscape:
    • Inventory all relevant security data sources, documenting their current location, format, and accessibility.
    • Identify data sources not currently centralized in your SIEM due to cost or complexity, and evaluate their value for investigative activities.
  2. Standardize Query Capabilities:
    • Train analysts in SQL or other standardized query languages to ensure they can interact effectively with data stored in cloud environments, data lakes, and APIs.
    • Develop reusable query templates that align with common threat scenarios and can be adapted for federated searches.
  3. Implement API Access and Integrations:
    • Verify and enable API access for critical security tools (EDR, cloud services, etc.) to ensure readiness for federated queries.
    • Explore integration opportunities with open federated search platforms like Query.ai, which enable seamless querying across distributed datasets.
  4. Conduct a Federated Search Proof-of-Concept (PoC):
    • Choose one or two high-value, non-centralized data sources to test federated search capabilities.
    • Utilize federated search tools to demonstrate improved analyst efficiency, reduced query times, and enhanced threat detection outcomes.
  5. Evaluate Operational and Cost Efficiencies:
    • Track metrics such as time saved per investigation, number of threats detected, and reductions in data centralization costs.
    • Present these findings to senior leadership, emphasizing improved security outcomes, operational efficiencies, and cost savings.
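Steps 1 and 4 above can start as something as small as a structured inventory. The Python sketch below is one possible shape for it; the field names, the 1-to-5 value score, and the sample sources are hypothetical and should be replaced with your own.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    location: str        # where the data lives today
    in_siem: bool        # already centralized in the SIEM?
    api_enabled: bool    # reachable by API for federated queries?
    hunt_value: int      # analyst-assigned value for investigations, 1 (low) to 5 (high)


def poc_candidates(inventory: list[DataSource], limit: int = 2) -> list[DataSource]:
    """Pick the highest-value sources that are outside the SIEM but API-reachable."""
    eligible = [s for s in inventory if not s.in_siem and s.api_enabled]
    return sorted(eligible, key=lambda s: s.hunt_value, reverse=True)[:limit]


# Hypothetical inventory for illustration only.
INVENTORY = [
    DataSource("Windows event logs", "SIEM", in_siem=True, api_enabled=True, hunt_value=4),
    DataSource("EDR process telemetry", "vendor cloud", in_siem=False, api_enabled=True, hunt_value=5),
    DataSource("CloudTrail archive", "S3 bucket", in_siem=False, api_enabled=True, hunt_value=4),
    DataSource("Legacy proxy logs", "file share", in_siem=False, api_enabled=False, hunt_value=3),
]
```

With this sample data, `poc_candidates(INVENTORY)` selects the EDR telemetry and the CloudTrail archive: both are valuable, sit outside the SIEM, and are already reachable by API, which makes them reasonable first targets for a proof-of-concept.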

By undertaking these steps, you’ll lay the critical groundwork for adopting federated threat hunting methodologies. As you progress, explore the federated search capabilities from Query to further enhance your threat hunting effectiveness and operational agility.