Searching Historical CrowdStrike Data Stored in Amazon S3 Buckets
With over 70% of attacks originating at an endpoint, CrowdStrike, the leading Endpoint Detection and Response (EDR) tool, is a key control for strong security operations. CrowdStrike is optimized to detect attacks in real time, and it does so very well. However, a novel attack can occur without triggering a detection, leaving the organization exposed for days or weeks until a patch or update adds protection against the new threat.
CrowdStrike offers variable data retention periods, ranging from 15 to 90 days, that depend on the type of data and the specifics of your contract. After the retention period ends, the data is no longer available in the application. Understanding an incident can therefore require searching both the current CrowdStrike data that resides in the application and older, historical records that no longer do.
Leveraging historical data from CrowdStrike presents two major challenges:
Because CrowdStrike stores only recent telemetry data, historical data must be moved and stored elsewhere. Once it is moved, it is no longer available within the application, and the contextual and visual benefits the application provides are lost.
You have two options:
- CrowdStrike data can be stored directly in your SIEM so that it remains available to security operators. But given the volume of data created and the high cost of SIEM storage, this is usually cost prohibitive: typically an extra $400,000 a year per 10,000 employees.
- Alternatively, CrowdStrike data can be stored in a less expensive cloud storage solution such as Amazon S3 Buckets. CrowdStrike offers a separate paid subscription service, CrowdStrike Falcon Data Replicator, to offload data into S3 Buckets. This is much more cost effective, at around $20,000 annually, but it leads to the second problem.
While Amazon S3 Buckets make the most sense from a cost perspective, they make it much more difficult to use the data.
- Searching and retrieving the archived data is difficult. Analysts have to download the files and then rely on raw text searches (grep, sed, awk, etc.) to find results.
- Once results have been found, the analyst has to put them into context manually: determining relationships via network connections, timing, users, etc., and combining them with data from other sources such as the SIEM, HR systems, and threat intelligence.
Integrating the two data sets is manual and challenging, and it requires a different set of skills than those typically found in a security operator.
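To make the manual workflow above concrete, here is a minimal sketch of the kind of script an analyst might write after downloading archived files. It assumes the files are gzip-compressed, newline-delimited JSON already copied to a local directory; that format, the function name `search_archive`, and the field name `ComputerName` are illustrative assumptions, so check them against your own export before relying on them.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def search_archive(directory: str, field: str, value: str) -> Iterator[dict]:
    """Yield events whose `field` equals `value` from every .gz file under `directory`.

    Hypothetical helper: assumes each file holds one JSON object per line.
    """
    for path in sorted(Path(directory).rglob("*.gz")):
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines rather than abort the search
                # Compare as strings so numeric and text fields are handled alike.
                if isinstance(event, dict) and field in event and str(event[field]) == value:
                    yield event
```

A call such as `list(search_archive("./downloads", "ComputerName", "HOST-1"))` returns the matching events as dictionaries. Note what the sketch does not do: it scans every file in full, and it leaves correlation with users, network connections, and other data sources entirely to the analyst.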
Query provides a single search bar to simultaneously search current CrowdStrike data natively in the system, as well as historical data in S3 Buckets, or wherever it resides. Query is an open federated search solution for security.
Query:
- Allows you to decide where and how CrowdStrike data is stored, so you can reduce cost without compromising security response effectiveness or efficiency.
- Enriches search results with context from other distributed, security-relevant data, drawn from both security systems and relevant non-security systems, without needing to move or transform data ahead of time.
- Visualizes data linkage and context so operators can quickly orient and act, reducing alert fatigue and providing additional understanding and situational awareness.
- Enables operators to pivot quickly from one question to the next, reducing the time to investigate and respond from hours to minutes.
Managing CrowdStrike Data in Amazon S3
Setting Up Query with CrowdStrike and Amazon S3
Using Query to Manage CrowdStrike and Amazon S3
“Before Query, our security team only had 14 days of live EDR data. Accessing and searching historical data required provisioning access to Amazon S3 buckets and writing complex queries manually. Now the entire team can search all security data from one search box in seconds, without thinking about where the data is stored or how to write queries.”
– Director of Security Operations, Software Company