The indicators are there. Assessing historical data provides insights, uncovers patterns, and allows us to analyze our weaknesses to strengthen our defenses.

Historical Endpoint Detection and Response (EDR) data provides invaluable insight for security teams. Access to patterns and trends that would otherwise go unnoticed allows you to proactively identify potential threats and strengthen your defenses against future attacks. Unfortunately, thoroughly assessing this data can be challenging. In this post, we discuss the benefits and challenges of historical EDR analysis, and how to unlock this secret weapon of security investigations.

Note: EDR retention varies by product. For example, Carbon Black Cloud stores alerts generated from events for 180 days, but stores non-alert data for only 30 days. CrowdStrike stores telemetry data for 15 days; keeping it longer requires Falcon Data Replicator, a separate paid subscription service that offloads the data into a separate S3 bucket.

Benefits of analyzing historical EDR

Historical EDR empowers us to evaluate trends, respond effectively to incidents, proactively hunt for threats, and meet compliance obligations. 

  • Incident response — EDR data supports investigations by contributing to the record of what occurred and when, which helps determine the scope, impact, and root cause of an attack.
  • Data forensics & historical analysis — EDR data provides insight into the state of an endpoint at the time of a potential attack, and helps determine whether it was previously the victim of a cyberattack.
  • Threat hunting — EDR data helps identify indications of intrusion, or susceptibility to a theoretical attack before it happens.
  • Compliance — EDR data can be used to demonstrate regulatory and other compliance, support lawsuits, and contribute to ongoing risk analysis.

Challenges of incorporating historical EDR analysis

Data accessibility is arguably the biggest challenge in taking advantage of historical EDR data. Using that data also requires storage capacity, time, effort, and expertise, all of which come at a cost.

  • Data volume — EDR systems produce a lot of data, almost none of which will be used again. But if a data point is relevant to an investigation, it is extremely important that it is available. And the more thorough the data, the greater the volume.
  • Data storage costs and complexity — Storing EDR data is costly and complex. Keeping the data in the EDR system itself is very expensive and also temporary, because EDR systems have short retention windows. At least one additional storage solution is therefore necessary. Although that external storage is almost certainly less expensive, the volume of data and the need to keep data going back many months still drive up cost. And although cleverer, less costly storage solutions are available, implementing them is costly in terms of time and effort.
  • Context — When the data is needed, ancillary data from the EDR system may also be needed to provide important context. In addition, significant effort is needed to normalize the data so it can be used alongside data from other systems.
  • Urgency — In the event of an incident or investigation, time is of the essence. Because EDR data is relatively hard to access and needs significant preparation before it is usable, incident handlers and operators frequently lack the time or skill set to use it, and so it is ignored.
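
To make the normalization effort described under Context more concrete, here is a minimal sketch of mapping events from two EDR sources onto one common shape. Everything in it is an assumption for illustration: the two vendor event layouts, the field names (ts_ms, device_name, proc_name, and so on), and the target schema are hypothetical and do not reflect any real product's export format.

```python
from datetime import datetime, timezone


def normalize_vendor_a(event: dict) -> dict:
    """Hypothetical vendor A: epoch milliseconds and flat keys."""
    return {
        "timestamp": datetime.fromtimestamp(event["ts_ms"] / 1000, tz=timezone.utc),
        "hostname": event["device_name"].lower(),
        "process": event["proc_name"],
        "source": "vendor_a",
    }


def normalize_vendor_b(event: dict) -> dict:
    """Hypothetical vendor B: ISO 8601 timestamps and nested keys."""
    return {
        "timestamp": datetime.fromisoformat(event["time"]),
        "hostname": event["host"]["name"].lower(),
        "process": event["process"]["image"],
        "source": "vendor_b",
    }


if __name__ == "__main__":
    a = normalize_vendor_a(
        {"ts_ms": 1704067200000, "device_name": "WS-01", "proc_name": "powershell.exe"}
    )
    b = normalize_vendor_b(
        {
            "time": "2024-01-01T00:00:00+00:00",
            "host": {"name": "ws-01"},
            "process": {"image": "powershell.exe"},
        }
    )
    # Once normalized, events from both sources can be compared field by field.
    print(a["hostname"] == b["hostname"], a["timestamp"] == b["timestamp"])
```

Even this toy version needs a decision about time zones, hostname casing, and field naming for every source, which is the kind of preparation that rarely fits into the middle of an incident.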


Managing these challenges allows you to take advantage of the power in your historical EDR data. With one search bar to simultaneously search current and archived EDR data, Query can deliver answers from EDR data wherever it is already stored.

Example: A recent threat report from IBM indicated that the current average time to detect a breach is 169 days, with another 69 days to contain it. As that detection time suggests, the data required to identify and investigate a breach doesn’t always come from alert data. None of the mainstream EDR products on the market today retains EDR telemetry long enough to cover that detection window. It is therefore likely that the critical data required to threat hunt for, or investigate, a potential breach based on MITRE ATT&CK TTPs instead of alerts would no longer be retained by your EDR solution.
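
The gap in this example can be worked out with simple arithmetic. The sketch below uses only the figures quoted in this post (169 days to detect, and the 180-, 30-, and 15-day retention windows from the note above). The function names are illustrative, and it makes the simplifying assumption that the initial compromise happened exactly 169 days before detection.

```python
from datetime import date, timedelta

# Figures quoted in this post; adjust them to match your own products and contracts.
DETECTION_DAYS = 169
RETENTION_DAYS = {
    "Carbon Black Cloud (alerts)": 180,
    "Carbon Black Cloud (non-alert data)": 30,
    "CrowdStrike (telemetry)": 15,
}


def unretained_days(detection_days: int, retention_days: int) -> int:
    """Days between compromise and detection that have already aged out."""
    return max(0, detection_days - retention_days)


def earliest_retained(detected_on: date, retention_days: int) -> date:
    """Oldest day still covered by the retention window on the detection date."""
    return detected_on - timedelta(days=retention_days)


if __name__ == "__main__":
    for source, days in RETENTION_DAYS.items():
        gap = unretained_days(DETECTION_DAYS, days)
        print(f"{source}: {gap} of {DETECTION_DAYS} days no longer available")
```

Under these assumptions, 30-day retention leaves 139 of the 169 days uncovered and 15-day retention leaves 154, while only the 180-day alert retention spans the whole window.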

Making sense of it all

Storing this data in your SIEM is cost-prohibitive. You could use cloud storage options such as AWS S3, Azure Blob Storage, Google BigQuery, or Snowflake. However, you would have to build a customized in-house front end, or teach your security analysts yet another query language and give them access to yet another system. Either path further complicates their operations and delays breach detection, unless you have Query.

With Query, you could offload that data into much cheaper storage (S3, Snowflake, etc.), retain it for longer periods of time, and search it on demand without having to train your analysts on specialized SQL syntax. That saves you time and money, and ensures your analysts are mission-ready no matter where the data resides.
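
If you do offload telemetry to object storage, one common convention is to write it under date-partitioned prefixes so that a search over a time range only touches the relevant days. The sketch below assumes a hypothetical edr/<source>/dt=YYYY-MM-DD/ layout; the prefix scheme and function names are illustrative only, and nothing here is a layout that Query or any storage service requires.

```python
from datetime import date, timedelta


def partition_prefix(source: str, day: date) -> str:
    """Build the (hypothetical) object-key prefix holding one day of events."""
    return f"edr/{source}/dt={day.isoformat()}/"


def prefixes_for_range(source: str, start: date, end: date) -> list[str]:
    """List the daily prefixes covering an inclusive date range."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    day_count = (end - start).days + 1
    return [partition_prefix(source, start + timedelta(days=i)) for i in range(day_count)]


if __name__ == "__main__":
    # A hunt covering three days only needs to read three prefixes.
    for prefix in prefixes_for_range("vendor_a", date(2024, 2, 28), date(2024, 3, 1)):
        print(prefix)
```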

With one search bar to simultaneously search current and archived EDR data, Query can deliver answers from EDR data wherever it is already stored. Happy Querying!