
March 24, 2025

CrowdStrike and Query Federated Search: Better Together

Introduction

When you think of the leaders in the Endpoint Detection & Response (EDR) space, even if you do not personally use them, you cannot deny CrowdStrike’s leadership and innovation. However, to consider CrowdStrike as simply an EDR company is a mistake, as over the years they have expanded into everything from Vulnerability Management with their Spotlight product, to Identity Threat Detection & Response (ITDR) with their Identity Protection product for hybrid environments, and more.

While this is great news for organizations who utilize the CrowdStrike platform, it can prove difficult to bring all of this data together along with its downstream context and deliver it to the correct team. For instance, does the Enterprise Security team or the IT Shared Services organization have the final say on countermeasures or remediation for Identity Protection configurations? Is it the Security Architecture team or perhaps is it Cloud Security who should define Zero Trust Assessment scoring targets?

For organizations using CrowdStrike Falcon LogScale (formerly Humio) or Next-Gen Security Information & Event Management (SIEM), how can you ensure that SecOps, IT Ops, Architecture, and all other concerned parties have access to the data? What if you’re not using LogScale or Next-Gen SIEM, and instead wanted to ingest Falcon Data Replicator (FDR) telemetry into your own security data lakehouse?

In this blog you will learn how Query Federated Search unifies and simplifies access across disparate CrowdStrike data sources, be it from the Falcon API or from LogScale. You will learn about the features of Federated Search and why it is its own robust offering and not simply another poor simulacrum of a SIEM. Finally, you will learn about the reference architectures and use cases that can be fulfilled by using Query Federated Search, especially when bringing all of your security-relevant data to bear is a top priority.

It goes without saying that the CrowdStrike Falcon platform is very robust on its own, and you do not need to use Federated Search to best harness the data within it, but for any investigation, incident or threat hunt where you need not only CrowdStrike data but also to pivot to another tool or tools, Query will measurably enhance your experience.

Falcon Power

In this section you will learn a little bit more about CrowdStrike Falcon and the various capability sets it provides to SecOps teams. This will only be a brief introduction and will not cover all possible features and license tiers. For more information, consult the CrowdStrike website or contact their sales team.

The CrowdStrike Platform & Falcon EDR

At its core, CrowdStrike Falcon is a cloud-native security platform that provides endpoint detection and response (EDR), threat intelligence, and a range of advanced security features designed to protect modern organizations. The Falcon EDR agent is the cornerstone of the platform, offering real-time threat detection and response capabilities across many platforms and operating systems.


Unlike traditional antivirus solutions, Falcon operates on a behavioral analysis model, leveraging AI-driven detection to identify and interdict threats. That said, you can also use your own Indicators of Compromise (IOCs) and define your own Indicators of Attack (IOAs) to augment your detection and prevention capabilities with known true positive signatures and heuristics. In addition to agent-specific policies for hardening, you can further fine-tune your detection and response posture.

Every device that has the agent installed is known as a Host. Host data can be retrieved and searched to surface details about the asset, its operating system, hardware, and other observable metadata such as IPs, serial numbers, device IDs, last logged-on users, and timestamps for when the machine was last seen. Hosts are referenced in nearly every finding aggregation and raw event in the Falcon platform!

Alerts, Detections, and Incidents

So, you have a great EDR agent. How do the findings and signals get surfaced to responders? CrowdStrike structures its finding data into three core categories, which are built from raw “events”. These events are the raw telemetry from the Falcon platform that can be consumed from FDR, and they record everything from file downloads and process terminations to network connections and OS-specific setting updates.

Detections: Detections are the first categorization of “security findings” in CrowdStrike. These are made up of one or more events that are triggered as important or outright malicious by the combined capabilities and configurations of the Falcon sensor.
Incidents: Incidents are high-level aggregations which tie together Events and Detections into similar buckets. They contain information about implicated hosts, related downstream Detections, and specific Behaviors surfaced by them.
Alerts: These are notifications triggered by rules or integrations with third-party solutions, alerting responders to distinct behavior, Detections, or Incidents. The metadata they contain differs greatly across the various source inputs. It can capture everything from grandparent process information to the specific Security Identifier (SID) of the source of the activity, and much more!
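The way Detections are built from raw events, and Incidents from Detections, can be sketched as a small data model. This is only an illustration: the class and field names below are hypothetical and are not the Falcon API schema, and Alerts are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One raw telemetry record, e.g. a process start or a network connection."""
    event_id: str
    host_id: str
    name: str


@dataclass
class Detection:
    """One or more events flagged as important or outright malicious."""
    detection_id: str
    events: list[Event]

    @property
    def host_ids(self) -> set[str]:
        return {event.host_id for event in self.events}


@dataclass
class Incident:
    """A higher-level aggregation of related Detections."""
    incident_id: str
    detections: list[Detection] = field(default_factory=list)

    @property
    def host_ids(self) -> set[str]:
        # Implicated hosts are the union of hosts across all Detections.
        hosts: set[str] = set()
        for detection in self.detections:
            hosts |= detection.host_ids
        return hosts


events = [
    Event("e1", "host-a", "ProcessRollup2"),
    Event("e2", "host-a", "NetworkConnectIP4"),
    Event("e3", "host-b", "ProcessRollup2"),
]
incident = Incident(
    "inc-1",
    [Detection("d1", events[:2]), Detection("d2", events[2:])],
)
print(sorted(incident.host_ids))
```

Each level answers a different question: the Event says what happened, the Detection says why it matters, and the Incident says which hosts and Detections belong together.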

Understanding the interplay between these components is key to optimizing your security operations and accelerating response times. It is important to note that these Alerts, Detections, and Incidents are contributed across the various expanded capabilities in addition to the core capabilities of the Falcon platform. At first glance, they may seem duplicative, but each level offers different aggregation capabilities and information to ultimately Respond and Recover.

Expanded CrowdStrike Capabilities

Beyond EDR, CrowdStrike has expanded its platform to address different types of security responsibility areas with its own bevy of capability sets. For instance, CrowdStrike has several capabilities to address vulnerability management, identity security, security posture management, and more. Some of these capabilities include the following:

Spotlight: Falcon Spotlight provides real-time vulnerability management, integrating seamlessly with Falcon agents to deliver continuous visibility into software vulnerabilities without requiring additional scans.
Identity Protection: Tailored for hybrid Active Directory environments, this feature enhances security by detecting lateral movement, credential-based attacks, and privilege escalation attempts. Additionally, it offers inventory management capabilities and configuration information for risky settings such as using legacy authentication, e.g., NTLM.
Zero Trust Assessments: Part of the Identity Protection suite, Zero Trust Assessments (ZTA) are real-time security checks for endpoints that help organizations protect their devices, data, and networks. ZTA scores can be used to enforce conditional access and mitigate risks. ZTAs work by assessing agent-level, host-level, network-level, and other telemetry to produce scoring and teams can use that to inform hardening or mitigation work.
CrowdStrike Surface: An External Attack Surface Management (EASM) tool, Surface helps to, well, surface your entire internet edge. It helps to find known and unknown hosts, their configurations, and potential vulnerabilities, and it even works with Internet of Things (IoT) devices.
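As a rough illustration of how a ZTA score could feed conditional access or a hardening backlog, the sketch below gates hosts on a scoring target. The `ZtaScore` class, the 0 to 100 range, and the threshold of 70 are assumptions made for the example, not values taken from CrowdStrike documentation.

```python
from dataclasses import dataclass


@dataclass
class ZtaScore:
    host_id: str
    overall: int  # assumed to range from 0 (worst) to 100 (best)


def allow_access(score: ZtaScore, minimum: int = 70) -> bool:
    """Return True when the host meets the (hypothetical) scoring target."""
    return score.overall >= minimum


scores = [ZtaScore("host-a", 92), ZtaScore("host-b", 41)]

# Hosts below the target become candidates for hardening or mitigation work.
needs_hardening = [s.host_id for s in scores if not allow_access(s)]
print(needs_hardening)
```

Who owns the `minimum` value is exactly the organizational question raised in the introduction: the code is trivial, the agreement on the target is not.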

These are only some of the capability sets that you can use with the Falcon platform; CrowdStrike offers more to round out tooling for your security organization. Better yet, your Security Data Operations (SecDataOps) team can find new ways to harness the data from all of these capabilities to enhance your overall security posture and crush threats.

Falcon LogScale & Falcon Next-Gen SIEM

The Falcon platform generates an awesome (in the “Ahh! Woah!” sort of way) amount of data, but so do your other security tools. Wouldn’t you like somewhere to put it all? As luck would have it, CrowdStrike offers a way to do just that with Falcon LogScale, built off of the Humio acquisition.

LogScale is a high-performance log management solution that offers a robust set of Application Programming Interfaces (APIs) and access via the web console to analyze, ingest, query, and visualize your data–from the Falcon platform and outside of it–all using the Falcon Query Language (FQL) you’re likely familiar with already. LogScale is built with an optimization for search speed and scalability similar to other SIEM and Application Performance Management (APM) logging platforms your SecOps and IT Ops teams may be accustomed to.

Central to LogScale is the concept of a Repository that contains all of your data, parsing logic, and retention information. You can mix many types of source data into the same Repository and not have to worry about schema mutation issues, as your SecOps and SecDataOps teams can use FQL to pull out exactly what they need. Atop Repositories, you can create Views to unify pre-filtering and searching across them; in a way, they act like a Materialized View in a data warehouse.

CrowdStrike offers a further capability built on the same backend and interface: Next-Gen SIEM. Analogous to how Splunk Enterprise Security (ES) works atop Splunk Core, Next-Gen SIEM offers advanced and extensive security analytics and threat hunting capabilities. This allows specialized members of your SecOps organization such as threat hunters and security researchers to glean even more insights from data stored in LogScale.

As far as getting data into either solution, as mentioned above, the LogScale solution offers a robust Ingestion API and different mechanisms to route data into it. For instance, you can push directly via the API, use Ingest Tokens to route your source data to a proper parser in a Repository, or you can ingest from a data lake or archive built atop Amazon Simple Storage Service (Amazon S3) buckets.
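To make the direct-push option concrete, here is a minimal sketch that builds, but does not send, a structured ingest request using only the Python standard library. The endpoint path and payload shape reflect the LogScale structured ingest API as commonly documented, so verify them against the documentation for your deployment. The hostname, token, and event fields are placeholders.

```python
import json
import urllib.request


def build_ingest_request(base_url, ingest_token, events):
    """Build (but do not send) a structured-ingest request for LogScale."""
    payload = [
        {
            "tags": {"source": "example"},  # hypothetical tag
            "events": events,
        }
    ]
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/ingest/humio-structured",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The Ingest Token determines the Repository and parser.
            "Authorization": f"Bearer {ingest_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_ingest_request(
    "https://logscale.example.com",  # placeholder hostname
    "YOUR-INGEST-TOKEN",             # placeholder token
    [
        {
            "timestamp": "2025-03-24T12:00:00+00:00",
            "attributes": {"host": "host-a", "action": "login"},
        }
    ],
)
# To actually send it: urllib.request.urlopen(request)
print(request.full_url)
```

Because the Ingest Token is tied to a Repository and parser, the client only needs the token and the events; routing happens on the LogScale side.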

Likewise, you can natively ingest FDR data to get at your raw events across all of your Hosts, or you can use a third-party tool such as Cribl Stream to put data directly into either LogScale or Next-Gen SIEM. Cribl Stream is a popular choice, especially to integrate with other host data that can be sourced from Cribl Edge, or third-party tools and log sources.

Falcon Data Replicator

For organizations that want to build their own security data lake or lakehouse, FDR provides raw telemetry streaming directly from CrowdStrike’s platform. As noted earlier, the raw events that power every single behavior analysis and threat detection within the Falcon platform can be surfaced by FDR. Beyond the raw events, details on Hosts and other unmanaged assets can also be gleaned from FDR to enrich asset management efforts, such as getting up-to-date information on MAC addresses, IP addresses, hostnames, and interfaces to place in your lake, lakehouse, or a Configuration Management Database (CMDB).

FDR is not without its own challenges. By default, when you configure FDR you consume from an Amazon Simple Queue Service (SQS) queue, which is a long-polled message queue that needs to be actively monitored for new messages. These messages are organized into different categories, each of which requires its own careful metadata management to ensure the right data is saved to the right location.
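The routing problem can be sketched as follows. This assumes each SQS notification body is JSON with a `files` list whose `path` values begin with a category prefix such as `data` or `aidmaster`; treat that shape, and the `DESTINATIONS` mapping, as assumptions to check against your own FDR feed. In practice the body would come from an SQS long poll rather than a literal string.

```python
import json

# Hypothetical mapping from the leading path segment to a lake location.
DESTINATIONS = {
    "data": "lake/raw_events/",
    "aidmaster": "lake/hosts/",
    "notmanaged": "lake/unmanaged_assets/",
}


def route_files(message_body):
    """Group the file paths in one notification by destination prefix."""
    notification = json.loads(message_body)
    routed = {}
    for entry in notification.get("files", []):
        category = entry["path"].split("/", 1)[0]
        # Unknown categories are parked rather than silently dropped.
        destination = DESTINATIONS.get(category, "lake/unclassified/")
        routed.setdefault(destination, []).append(entry["path"])
    return routed


sample = json.dumps({
    "bucket": "example-fdr-bucket",  # placeholder bucket name
    "files": [
        {"path": "data/batch-1/part-00000.gz"},
        {"path": "aidmaster/batch-1/part-00000.gz"},
        {"path": "other/batch-1/part-00000.gz"},
    ],
})
routed = route_files(sample)
print(routed)
```

Sending unknown categories to an explicit "unclassified" location keeps new feed types visible instead of mixing them into the raw event tables.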

Further, the raw events in the FDR logs are commingled and require considerable effort to parse into their own full context. This is on purpose; they are raw signals, after all. CrowdStrike provides their own mappings into the Open Cybersecurity Schema Framework (OCSF), LogScale can ingest the data, and even Cribl offers parsers and ingestion for FDR data. OCSF is by far the best format for usage within a data lake or lakehouse for your SecDataOps teams.
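A first step toward untangling commingled events is to bucket them by type before applying any schema mapping. The sketch below assumes newline-delimited JSON in which each record carries an `event_simpleName` field; the sample records and their other fields are made up for illustration.

```python
import json
from collections import defaultdict


def split_by_event_type(lines):
    """Bucket newline-delimited JSON events by their event type field."""
    buckets = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        event = json.loads(line)
        buckets[event.get("event_simpleName", "unknown")].append(event)
    return dict(buckets)


raw = [
    '{"event_simpleName": "ProcessRollup2", "aid": "a1", "FileName": "cmd.exe"}',
    '{"event_simpleName": "DnsRequest", "aid": "a1", "DomainName": "example.com"}',
    '{"event_simpleName": "ProcessRollup2", "aid": "a2", "FileName": "bash"}',
]
buckets = split_by_event_type(raw)
for name, items in buckets.items():
    print(name, len(items))
```

Once events are separated by type, each bucket can be mapped to the appropriate OCSF class on its own terms rather than through one oversized parser.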

If this is your first time reading about the OCSF, or if you are coming back to it after an absence, consider reading our beginner and executive-friendly blog: Query Absolute Beginner’s Guide to OCSF. For a more detailed explanation of OCSF, see our other blog: Definitive Guide to Open Cybersecurity Schema Framework (OCSF) Mapping.

In this section you gained a high level understanding of the CrowdStrike Falcon platform. You learned about the capabilities of the sensor, configurations, and the concept of a Host. Furthermore, you learned about the different ways that security findings and signals are characterized and additional capability sets in the Falcon platform. You learned about how CrowdStrike approaches log management and SIEM with Falcon LogScale, Next-Gen SIEM, and FDR. Finally, you learned about different ways to ingest data into LogScale and Next-Gen SIEM, and how to consume FDR data.

In the next section, you will learn more about Query Federated Search and how it interoperates across all of the data from the various CrowdStrike capability sets including the core offerings, additional licenses, LogScale, FDR, and even Cribl.

Federated Search for CrowdStrike

Federated Search is the mechanism by which an operator can use a single, centralized tool to search across disparate data sources. The entire lifecycle of the search is handled on behalf of the operator: everything from capturing the intent of the search, to translating the query, to performantly retrieving the downstream data. Query Federated Search takes this further by also normalizing and standardizing the data into a unified format, the Query Data Model (QDM), which is derived from the Open Cybersecurity Schema Framework (OCSF). This allows two-way translation of your search intent and the results, which makes the data far easier to analyze, aggregate, filter, pivot, and ultimately utilize for SecOps.
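The lifecycle described above (capture intent, translate per source, retrieve, normalize) can be sketched in a few lines. This is a toy under stated assumptions and not how Query implements it: the `Connector` class, the in-memory stores, and the query strings are all hypothetical stand-ins for real API calls.

```python
from dataclasses import dataclass
from typing import Callable

# Stand-in data stores keyed by native query string. Real sources would be
# reached over their own APIs.
EDR_STORE = {"hostname:'web-01'": [{"device_id": "abc123", "hostname": "web-01"}]}
IDP_STORE = {"displayName eq 'web-01'": [{"id": "dev-9", "displayName": "web-01"}]}


@dataclass
class Connector:
    name: str
    translate: Callable[[str], str]    # search intent -> source-native query
    fetch: Callable[[str], list]       # native query -> raw records
    normalize: Callable[[dict], dict]  # raw record -> common shape


CONNECTORS = [
    Connector(
        name="edr",
        translate=lambda host: f"hostname:'{host}'",
        fetch=lambda query: EDR_STORE.get(query, []),
        normalize=lambda r: {"hostname": r["hostname"]},
    ),
    Connector(
        name="idp",
        translate=lambda host: f"displayName eq '{host}'",
        fetch=lambda query: IDP_STORE.get(query, []),
        normalize=lambda r: {"hostname": r["displayName"]},
    ),
]


def federated_search(hostname, connectors):
    """Translate one intent for each source, then normalize the results."""
    results = []
    for connector in connectors:
        native_query = connector.translate(hostname)
        for record in connector.fetch(native_query):
            results.append({"source": connector.name, **connector.normalize(record)})
    return results


results = federated_search("web-01", CONNECTORS)
print(results)
```

The operator expresses the intent once ("find this hostname"); each connector owns the translation into its own dialect and the mapping back into a shared shape.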

Note that federated search solutions should not be confused with federated query solutions, which rely on a shared query engine or language to reach out to another, similar source. For instance, federated queries are possible from databases or data warehouses such as Databricks because the common ground is SQL, allowing you to query other databases and warehouses from within Databricks. This does not work for data sources that do not use SQL, and there is no intervention on the performance tuning or safety of the query that is being dispatched.

Another key differentiating factor of Federated Search is the ability to search this data without having to duplicate or move it again. From a cost, data sovereignty, and performance perspective this is important because you are not subsidizing the persistent storage of a vendor, you’re utilizing the investments and vectors you already have access to. You do not need to worry about privileged or sensitive data being cloned into another source, and you do not need to wait for the extra roundtrip time of the data being duplicated. This just-in-time fulfilment of searches allows your SecOps team to concentrate on gathering evidence and making decisions and less on the cost, privacy and security impacts of the searches.

Federated Search provides teams flexibility and freedom of choice when it comes to their consumption models and security data architectures. For instance, there is absolutely nothing wrong with keeping your LogScale or Next-Gen SIEM investment, and for all intents and purposes, you should keep it!

Using Query Federated Search, you let our platform be the expert Falcon Query Language (FQL) author and data retriever atop your Falcon platform telemetry and external data from Cribl Stream or otherwise. Additionally, Query uses FalconPy to directly integrate with the various Falcon platform APIs. You can simultaneously get data across ZTAs, Identity Protection, Alerts, Incidents, Detections, Hosts, and LogScale Repositories right within Query Federated Search.

For data that is harder to normalize, or that is subject to API limitations, leave the data in place! While you could ingest it into LogScale, consider the case of a large Microsoft 365 (M365) environment. Within your M365 deployment, some of the most important audit, device, and identity data comes from Entra ID (formerly known as Azure Active Directory) and Microsoft Intune.

Instead of configuring another parser, another Cribl Stream, another Repository, or your own custom orchestrators and parsers–just leave the data in M365. You can continue to benefit from the increased context and visibility by collating and visualizing the data all in place. Now when you search for a Hybrid Active Directory (AD) device that is onboarded in Falcon Identity Protection you can also get information about it from Entra ID and/or Intune. You can view your CrowdStrike Detections data about it alongside the Entra ID Audit log data at the same exact time to improve your SecOps decision support.

By using Query Federated Search, you are guaranteed to have the most up to date data whenever you search for it. You do not need to worry about deduplication or pivoting, or making sense of the data, as our normalization into QDM puts all like data into the same like Attributes. Whether the identifier is an aid, agent_id, device_id, or an id, in Query it is simply a Resource UID to pivot from and visualize.
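The idea of collapsing several identifier names into one pivot attribute can be sketched like this. The alias list comes from the paragraph above; the function names and sample records are hypothetical, and a real mapping would need source-specific rules because a bare `id` field is ambiguous.

```python
ID_ALIASES = ("aid", "agent_id", "device_id", "id")


def resource_uid(record):
    """Return the first identifier alias present in the record, if any."""
    for key in ID_ALIASES:
        value = record.get(key)
        if value:
            return str(value)
    return None


def pivot_by_uid(records):
    """Group records from different sources under one Resource UID."""
    grouped = {}
    for record in records:
        uid = resource_uid(record)
        if uid is not None:
            grouped.setdefault(uid, []).append(record)
    return grouped


records = [
    {"aid": "abc123", "event_simpleName": "ProcessRollup2"},
    {"device_id": "abc123", "hostname": "web-01"},
    {"agent_id": "def456", "severity": "high"},
    {"id": "abc123", "score": 88},
]
grouped = pivot_by_uid(records)
for uid, items in grouped.items():
    print(uid, len(items))
```

With every source agreeing on one attribute name, a pivot becomes a dictionary lookup instead of a per-source field hunt.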

If you are using Falcon Fusion to provide Security Orchestration, Automation, and Response (SOAR) capabilities, after your search concludes with Query, you can use your same automation to close the result and append the search history or a saved search from Query into the upstream Incident. Even more impactful is that Query emulates search mechanisms and operators that FQL may not natively provide across the different APIs. For instance, Behaviors for Incidents are processed into the upstream Incidents at search time, allowing you to search for Incidents or Alerts by specific indicators or observable data.
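One way to picture search-time processing of Behaviors into Incidents is a join followed by a filter, as in the sketch below. It illustrates the concept only and is not Query's implementation; the record shapes and field names are invented for the example.

```python
def search_incidents_by_indicator(incidents, behaviors, indicator):
    """Attach behaviors to their incident, then keep incidents that match."""
    by_incident = {}
    for behavior in behaviors:
        by_incident.setdefault(behavior["incident_id"], []).append(behavior)

    matches = []
    for incident in incidents:
        related = by_incident.get(incident["incident_id"], [])
        # An incident matches if any attached behavior carries the indicator.
        if any(indicator in behavior.values() for behavior in related):
            matches.append({**incident, "behaviors": related})
    return matches


incidents = [
    {"incident_id": "inc-1", "status": "new"},
    {"incident_id": "inc-2", "status": "new"},
]
behaviors = [
    {"incident_id": "inc-1", "sha256": "aaa", "cmdline": "powershell.exe"},
    {"incident_id": "inc-2", "sha256": "bbb", "cmdline": "whoami"},
]
matches = search_incidents_by_indicator(incidents, behaviors, "bbb")
print([m["incident_id"] for m in matches])
```

The analyst searches by an observable such as a hash, and gets back the Incident with its Behaviors already attached, even though the two live behind different APIs.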

Query Federated Search & CrowdStrike Reference Architectures

Consider the following reference architectures, some of which reflect actual customers of ours, to show possible ways you can interoperate CrowdStrike data and external data with Query Federated Search.

Use Case: Host-Based & Identity Correlation

The first reference architecture is a typical “leave-in-place” use case, where Federated Search is used to unify disparate data sources that the Security Operations Center (SOC) may already be operating with.

In this case, the representative organization uses M365 as its Identity Provider (IdP) and Microsoft Intune for Mobile Device Management (MDM), but uses CrowdStrike Falcon instead of Microsoft Defender for Endpoint (MDE) to provide EDR, XDR, and other capabilities. As shown in the diagram, some of the log sources are highlighted above the target technologies that can be seamlessly queried and visualized from Query Federated Search.

Shown on the left side of the reference architecture are potential areas to consume the FDR data, be it directly from the source S3 bucket (using Amazon Athena and AWS Glue Data Catalog) or consuming FDR from a Falcon LogScale Repository configured to parse the FDR events. This offers the utmost flexibility and doesn’t require that data is replicated from M365 (or your own IdP and MDM solution) into LogScale or another data lake.

Remember, all data is normalized and collated, and all search operations are parallelized across their own workers to greatly lower the “round-trip time” compared with querying all of this data at once from a legacy SIEM or a data lake using expensive JOINs and UNIONs.
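The benefit of parallel workers can be sketched with a thread pool: each source is searched concurrently, so total latency approaches that of the slowest source rather than the sum of all of them. The source names, delays, and rows below are stand-ins, and this is not a description of Query's internal worker design.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_source(name, delay_seconds, rows):
    """Create a fake searchable source with simulated network latency."""
    def search(term):
        time.sleep(delay_seconds)
        return [{"source": name, **row} for row in rows if term in row.values()]
    return search


SOURCES = [
    make_source("falcon_hosts", 0.05, [{"hostname": "web-01", "os": "Windows"}]),
    make_source("entra_id", 0.05, [{"hostname": "web-01", "owner": "alice"}]),
    make_source("intune", 0.05, [{"hostname": "db-02", "compliant": True}]),
]


def parallel_search(term, sources):
    """Dispatch the same search to every source at once and merge results."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(source, term) for source in sources]
        results = []
        for future in futures:  # collect in submission order
            results.extend(future.result())
    return results


results = parallel_search("web-01", SOURCES)
print([r["source"] for r in results])
```

Threads suit this workload because the time is spent waiting on remote APIs, not on local computation.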

Use Case: Cribl + CrowdStrike + Query – One View of All the Data

The next reference architecture is an evolution of the first, where data that is best moved is moved and searched from the best source for it: Falcon LogScale. In this example, the organization has a very large Apple macOS and iOS ecosystem, which is not best managed via Microsoft Intune. Instead, Jamf Pro is used specifically for Apple MDM, and that data can also be searched using Query Federated Search.

While a lot of the setup remains the same, what is different is that the organization also uses Cribl Edge and Cribl Stream. Cribl Edge is used to collect data directly from hosts, and Cribl Stream can pull data or have data pushed to it, and then route that data to its log destinations. In this case, Edge and Stream are used to consume Windows Event Logs and logging data from iOS and macOS devices. Host-based data is not directly supported by Query Federated Search as we do not use agents or taps to grab logs from hosts.

However, LogScale is very well suited to the high-volume, disparate host-based logging data that can come from iOS and Windows environments. Now when SecOps analysts search for any new Alerts for the day, they can gain every single bit of information they need all from one spot. They get the raw telemetry from their host-based logging Repository in LogScale, and they get the sensor Events via FDR from LogScale as well. They get the Incidents, Detections, and Alerts to work from via Falcon, as well as the Host data. Additional metadata from Intune and Jamf provides extra insights into the configuration and posture of the devices.

With CrowdStrike capabilities together with best of breed tools such as Cribl Stream to move large volumes of data, and direct API integrations, Query Federated Search helps bridge the capability gaps without having to give up on any capability from the disparate tools.

Use Case: Tech Transitions, Legacy Tech & Archives

Finally, consider the last reference architecture, which demonstrates a migration from an incumbent EDR platform to a more full-featured EDR platform such as CrowdStrike Falcon. In this case, the representative customer moved from VMware Carbon Black to CrowdStrike Falcon but needed to ensure that the SOC did not lose visibility of archival and live data as the migration was underway.

The customer in this case had a large amount of their SOC and supporting security functions along with IT and Networking teams dedicated to the integration and wanted to greatly simplify their architecture and infrastructure. Migrations are rife with risk, chief among them being degraded capabilities due to fatigue, reduced workforce capital, or data loss. Using Query Federated Search, alerts could continue to be triaged on both sides of the house while the IT and Networking teams could also utilize Query for troubleshooting connectivity issues or undesirable behavior from the new Falcon sensors.

The workforce capital that could be expended towards managing the logging took full advantage of the various ingestion capabilities within LogScale to consume the archival logs from an Amazon S3 bucket that contained Carbon Black telemetry and old events. Additionally, an existing Windows Event Collector could also be integrated with LogScale to ensure that the raw Windows event logs and Sysmon data were not lost. To fully unify the endpoint and network telemetry, Zscaler Internet Access (ZIA) data was consumed using the ZIA LogScale plugin from the CrowdStrike Marketplace.

With Query at the center of this migration-led architecture, the disparate data sources were normalized into the familiar Query Data Model, so while the teams came up to speed with learning FQL as well as the various features and idiosyncrasies of the Falcon platform they could rely on Query to be the expert query translation and normalization layer as they progressed.

Whether your data is centralized, decentralized, or somewhere in between, Query Federated Search can greatly bolster decision support within your SOC and wider organization when it comes to CrowdStrike. In this section you learned about three reference architectures where Query Federated Search works better together with CrowdStrike Falcon. Whether it is centralizing a search interface across different platforms, utilizing your data and data repositories in the best way, or supporting a migration: Query Federated Search is flexible enough for all challenges.

Conclusion

In this blog you learned about the CrowdStrike Falcon platform, such as its key features, data categorization, and capability sets. Additionally, you learned about what federated search is, and how Query Federated Search can help your SOC access and utilize CrowdStrike data whether it is from the direct APIs, LogScale, and/or FDR.

While the reference architectures covered did not account for every possible variation of data architectures or log sources, hopefully this illustrates the various overlaps and consumption models that can be supported by Query Federated Search. Whether you only use the Falcon platform for all of your security and management use cases, interoperate with several platforms, need to support a migration, or a large amount of host-based data: Query Federated Search works better together with CrowdStrike Falcon and supporting tooling.

If your mind is not at rest, and if questions linger on, contact us to see how Query Federated Search works in real-time with CrowdStrike data! If you wanted to initiate a Proof of Value with our federated search platform, or were more interested in learning more about SecDataOps reference architectures and workshops: reach out to our sales team!

Operators are standing by to help you operate more efficiently with security data.

Stay Dangerous

Contributed by:

Jonathan Rau

VP/Distinguished Engineer, Query