April 1, 2025

Security Data Strategy: Federated Search vs. Security Data Management

Introduction

As we have discussed in previous blog posts, and as is apparent to everyone anyway, there is so much damn data. The jury is still out on which security data strategy will rule the day, be it centralization, decentralization, or federation, but teams still need to access their data ASAP. To get a better handle on all of the data security organizations need to operate effectively, a new sector has emerged: Security Data Management (SDM).

SDM tools are managed, security-focused data mobility solutions that handle Extract, Transform, and Load (ETL) and the orchestration of ETL workflows from security source tooling into destination platforms such as SIEMs and data lakehouses. The concept has long existed, as security teams have often had to build their own crude data engineering platforms to fill the same niche.

While the SDM space has exploded with a myriad of solutions (both open-source and COTS), there are many areas where security teams can deprioritize investment in SDM and instead use federated search to get the data they need, when they need it, without moving it and without having to normalize it themselves.

In this blog you will learn about the use cases, strengths, and challenges associated with commercial SDM, homegrown SDM, and federated search. You will learn how federated search can help reduce or replace the dependency on SDM, and finally you will learn about selection criteria for choosing among the three.

Centralize, Orchestrate, or Federate?

Table stakes for almost every security organization’s tool stack are an identity platform, cloud security or DevSecOps tooling (think CSPM, CNAPP, ASPM, SCA, SAST), an EDR, and any security-relevant telemetry emitted from productivity and workplace tools. The bigger the organization and the larger its AppDev tech stack and platforms, the more security tools there will be. This creates a “data sprawl” problem: getting complete visibility requires accessing dozens of siloed sources, often with inconsistent schemas and access methods.

To understand the tradeoffs between SDM pipelines, custom ETL solutions, and federated search, it’s helpful to compare their ability to fulfill common technical capabilities required across the security data lifecycle. These capabilities influence investigation speed, data architecture decisions, cost exposure, and team efficiency.

Historically, organizations have addressed data sprawl in three ways:

Security Data Management (SDM) Pipelines

SDMs are purpose-built commercial pipelines that collect, transform, normalize, and store security data, often into centralized object storage, lakehouses, and/or SIEMs. In 2025, there are at least two dozen SDM-specific startups, and if you widen the aperture beyond just security, there are hundreds of data mobility and orchestration platforms to choose from. There is a lot of noisy messaging in this space, but it centers on one primary buyer pain: current centralization products are too expensive.

Strengths:

Central control over schema, retention, and storage costs.
Strong integration with popular security tools and data platform destinations, e.g., out-of-the-box integration with the CrowdStrike Falcon API that sends the data to Snowflake.
Ability to provide in-transit enrichment (e.g., IOCs, MITRE ATT&CK correlation, asset enrichment, geolocation) and transformation into common security schemas such as Open Cybersecurity Schema Framework (OCSF) or Elastic Common Schema (ECS).
Advanced tools can potentially offer limited in-transit search, historical look-backs, replaying events, and more.

Challenges:

Largely a “black box”: you are stuck with whatever sources and destinations the SDM supports.
High ingest-to-query latency, which grows further when using enrichments or pre/post-processing.
Source schema evolution or API specification changes can break entire pipelines.
Storage and egress costs can balloon.
Default storage formats and partitioning schemes may lead to performance issues when querying data, e.g., data is written as JSON Lines instead of Parquet, data does not use Hive-like partitions in a data lake, or data isn’t flattened and is stuck in multiple layers of nesting.
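
To make the last challenge concrete, here is a minimal sketch of the two fixes it implies: flattening nested records and deriving a Hive-style partition path. The field names, the "edr" source label, and the partition layout are assumptions made up for illustration, not any vendor's format. Actually writing Parquet would additionally need a library such as pyarrow, which is left out to keep the sketch self-contained.

```python
from datetime import datetime, timezone


def flatten(record: dict, parent_key: str = "", sep: str = ".") -> dict:
    """Collapse nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def hive_partition_path(source: str, epoch_seconds: int) -> str:
    """Build a Hive-style key=value partition path from an event timestamp."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"source={source}/year={ts:%Y}/month={ts:%m}/day={ts:%d}"


# Hypothetical EDR event with multiple layers of nesting.
event = {
    "time": 1743465600,  # 2025-04-01T00:00:00Z
    "device": {"hostname": "laptop-01", "os": {"name": "Windows"}},
    "process": {"name": "powershell.exe"},
}

flat_event = flatten(event)
partition = hive_partition_path("edr", event["time"])
```

With flat columns and date-based partitions, a query engine can prune whole directories and read only the columns it needs instead of scanning every nested document.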

Custom SDM

Custom security data pipelines are not a new concept, and they are the standard mode of operation for any security organization that takes a SecDataOps approach, that is, enabling security outcomes with data. From the first time a Syslog was written or an IDS rule was authored, a security engineer (well, probably a SysAdmin) was moving that data somewhere.

Custom SDM runs the gamut from cron jobs and Bash scripts to sophisticated homegrown platforms that handle ETL, monitoring, orchestration, exponential backoff, and visualizations.

Strengths:

Tailored to internal architecture and priorities.
Cost-efficient if managed well.
Integrates well with internal data lakes and platforms.

Challenges:

Fragile and labor-intensive.
Source schema evolution or API specification changes can break entire pipelines.
High engineering overhead and skill requirements.
Contentious governance, access controls, privacy, and observability requirements.
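
As one example of the plumbing a homegrown pipeline has to own, here is a minimal sketch of the exponential backoff mentioned above, using random jitter. The RateLimited exception and the flaky_source function are hypothetical stand-ins for a real API client, and the delay values are arbitrary assumptions.

```python
import random
import time


class RateLimited(Exception):
    """Raised by a (hypothetical) source client when the API throttles us."""


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fetch(), retrying throttled calls with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the orchestrator
            # The ceiling doubles on every attempt (1s, 2s, 4s, ...) up to max_delay;
            # the random jitter keeps parallel workers from retrying in lockstep.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))


# Hypothetical source that is throttled twice before it succeeds.
calls = {"count": 0}


def flaky_source():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited("429 Too Many Requests")
    return [{"event_id": 1}]


waits = []  # record the delays instead of actually sleeping in this demo
events = fetch_with_backoff(flaky_source, sleep=waits.append)
```

Passing sleep in as a parameter keeps the retry logic testable without real waiting. Every source in a custom pipeline needs some version of this logic, which is part of the engineering overhead listed above.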

Federated Search (Just-in-Time Access)

As we have written about in countless whitepapers and blogs, as well as spoken about in several webinars, federated search flips the move-and-search paradigm on its head. Rather than moving data, federated search enables querying where the data lives, across on-premises, cloud, and SaaS systems. Instead of focusing on the plumbing, federated search focuses on delivering answers from data wherever it is.

Certain federated search solutions, such as Query Federated Search, take this further by providing just-in-time normalization, which eschews the need for ETL entirely unless a specific (custom or open-source) data model is desired.

Strengths:

Query data in place without moving it, whether it sits behind an API, in object storage, in a SIEM, or elsewhere.
Full query translation: free your team from having to learn several different query languages (SQL, SPL, KQL, CQL, etc.).
Centralized access to decentralized data: a single UI and API plane for accessing data across disparate teams and organizations.
Cost avoidance: keeping select data sources in place means you do not move or store long-term the data that you do not need.
Unified security mesh architecture: use federated search as an “integration portal” or “data bridge” to support federated detections and federated analytics, or to augment current in-house SDM solutions.
Augmented with AI: use copilots and agentic workflows to recommend actions or discover indicators across unified data.

Challenges:

Dependent on API performance and rate limits. A bad API cannot be made more performant by federated search.
Dependent on data sources being supported, or having an API available.
Some sources do not expose all of their data, e.g., not every audit log or asset API is available.
Supporting a security mesh architecture does not come out of the box, and requires some custom work.
Data is not stored. All data is retrieved just in time, so investigations will need to rely on federated detections or saved searches, or move the data to persist it.
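
To illustrate the general pattern (not how Query Federated Search or any other product is implemented), here is a minimal sketch of a federated lookup: one question is fanned out to several sources in parallel, and each result is normalized just in time into a shared shape. The connectors, the in-memory "sources", and the field names are all hypothetical, and the common schema here is made up rather than OCSF.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw records as two different tools might return them.
EDR_EVENTS = [{"DeviceName": "laptop-01", "RemoteIP": "203.0.113.7"}]
FIREWALL_LOGS = [
    {"src_host": "laptop-01", "dst_ip": "203.0.113.7"},
    {"src_host": "server-09", "dst_ip": "198.51.100.4"},
]

# Each connector knows how to search its own source in place and how to map
# that source's native field names onto one common schema.
CONNECTORS = {
    "edr": {
        "field_map": {"DeviceName": "hostname", "RemoteIP": "dst_ip"},
        "search": lambda ip: [e for e in EDR_EVENTS if e["RemoteIP"] == ip],
    },
    "firewall": {
        "field_map": {"src_host": "hostname", "dst_ip": "dst_ip"},
        "search": lambda ip: [e for e in FIREWALL_LOGS if e["dst_ip"] == ip],
    },
}


def query_source(name: str, ip: str) -> list[dict]:
    """Search one source and normalize its results at query time."""
    connector = CONNECTORS[name]
    normalized = []
    for raw in connector["search"](ip):
        row = {common: raw[native] for native, common in connector["field_map"].items()}
        row["source"] = name
        normalized.append(row)
    return normalized


def federated_search(ip: str) -> list[dict]:
    """Fan the same question out to every source and merge the answers."""
    names = list(CONNECTORS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        per_source = pool.map(lambda name: query_source(name, ip), names)
        return [row for rows in per_source for row in rows]


results = federated_search("203.0.113.7")
```

Nothing is copied or persisted along the way: the merged rows exist only for the lifetime of the query, which is both the cost advantage and the storage limitation described above. In a real deployment each search function would call a live API, so it would inherit that API's latency and rate limits.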

Whether it is a commercial or homegrown SDM solution, or a federated search solution, security leaders and other stakeholders should weigh the strengths and weaknesses and how they fit into their own Security Data Operations (SecDataOps) apparatus.

Real-World Scenarios

Theoretical advantages are only useful if they hold up under real-world operational pressures. Below are key examples of how organizations are deploying federated search in practice, not just as a replacement for data movement, but as a force multiplier that reshapes how teams approach Security Data Operations.

Replacing Data Movement for Investigations: Rather than duplicating data into centralized SIEMs or lakehouses, federated search enables teams to access relevant records in real-time, directly at the source. This reduces reliance on complex SDM pipelines and minimizes the operational drag of data mobility platforms.
Augmenting Lakehouse Storage Strategies: Federated search complements SDM platforms by querying cold or archived data (e.g., in Delta Lake or Iceberg) without rehydration or re-indexing, making hybrid data strategies more cost-effective and flexible.
Reducing Pipeline Overhead: Organizations with fragile or overburdened ETL jobs benefit from federated search by offloading time-sensitive use cases – like alert triage and threat correlation – away from orchestrators, allowing ETL resources to focus on long-term aggregation and enrichment.
Simplifying Access Control and Governance: With federated search operating across disparate environments via a common interface and schema, teams reduce the need for duplicated RBAC policies and access pipelines, enhancing both compliance and operational simplicity.
Supporting Elastic Workflows for Crisis Response: During high-severity incidents, federated search allows surge teams, contractors, or external partners to access data via unified, permissioned views without granting raw access to underlying infrastructure.
Delaying or Avoiding Costly Storage Decisions: In environments where it’s unclear whether specific telemetry needs to be retained, federated search allows for just-in-time access and evaluation, delaying investment in archival or hot-storage until it’s justified.

Each of these examples shows how federated search can de-risk traditional data strategies and deliver immediate value across detection, investigation, and compliance workflows. As the complexity and scale of security data continue to grow, these scenarios will only become more relevant.

Conclusion

In this blog you learned about the evolving security data strategy landscape, comparing Security Data Management (SDM) platforms, homegrown ETL pipelines, and federated search. We explored the benefits and challenges of each approach across technical, operational, and architectural dimensions. We examined real-world use cases where federated search either replaces or augments traditional SDM tools, and how it enables flexible, just-in-time access to data with minimal operational overhead.

Security teams need more than another pipeline. They need a strategy that matches the speed and sprawl of modern environments. Federated search solves the modern visibility problem by providing just-in-time access to decentralized data without costly centralization.

To drive your SecDataOps strategy forward, it’s critical to understand where each data access and transformation model fits. Whether you’re dealing with compliance-driven long-term storage, instant incident response, or unified search experiences across siloed systems, there’s a place for SDM, ETL, and federated search.

To help summarize, here is a high-level capability comparison that can guide your strategy and architecture decisions:

| Capability | SDM Pipelines | Home-Grown ETL | Federated Search |
| Long-term forensic analysis | ✅ | ✅ | ➖ (use with lake) |
| Live ad hoc querying | ❌ (index lag) | ➖ | ✅ |
| Cross-source data correlation | ❌ | ❌ | ✅ |
| Just-in-time schema normalization | ✅ | ❌ | ✅ |
| Alert triage without data moves | ❌ | ❌ | ✅ |
| Architecture cost avoidance | ➖ | ✅ | ✅✅ |
| Unlock value from distributed data without ETL | ❌ | ❌ | ✅ |

This comparative view highlights why federated search is becoming an essential pillar of modern security architectures rather than a wholesale replacement for SDM. It unlocks agility, reduces cost pressure, and complements SDM investments with just-in-time access and insights.

Whether you’re a security engineer tired of brittle ETL scripts, a SOC leader frustrated with alert lag, or a security architect scaling detection coverage, Query Federated Search is the data force multiplier your team needs. To find out more, reach out to our sales team to schedule a demo!

Stay Dangerous

Contributed by:

Jonathan Rau

VP/Distinguished Engineer, Query