Cybersecurity data storage involves real trade-offs between cost and usability, and those trade-offs have long-term consequences.
Storage costs increase as you move to more dedicated and structured applications. This is typically justified with the expectation that:
- Search response times, or query run times, are faster with indexed platforms, and
- The cybersecurity data schema and interface are more security analyst-friendly if the platform collects data and indexes it within itself.
These expectations were accurate in the past. Now, with Query, everything changes. This blog discusses how Query’s open federated search lets you keep data in inexpensive cloud storage and still access and understand it from a single search bar, reducing costs, cutting search time, and improving outcomes without moving your data.
Costs of Centralizing
Using SIEM and EDR to maintain visibility is a natural starting point for security teams, but the larger and more complex your infrastructure, the more data you need to analyze, and the more costly storing that data becomes. Costs vary, but let’s ballpark sizing and pricing from these references:
- Take endpoint data sizing estimates from the CrowdStrike Product FAQ. Based on this, we assume an endpoint generates 5MB of compressed data per day.
- Take SIEM ingestion-based pricing from Splunk’s SaaS pricing through the Marketplace: 100GB of daily ingestion costs $80,000 annually.
- Estimate the expansion from compressed CrowdStrike data to indexed data to be 10x, as per this Splunk help article.
If we assume a 10,000-employee company with one endpoint per employee, then based on the above we estimate the daily SIEM ingestion of EDR data to be about 488GB per day (10,000 endpoints x 5MB daily compressed data per endpoint x 10 times expansion for indexing / 1024).
Since 100GB daily costs $80,000 annually per the SIEM pricing reference above, and assuming the price scales linearly with volume, the total annual SIEM cost for the EDR data in our example scenario comes to $390,625 (($80,000 / 100GB) x 488GB).
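The SIEM estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the constants are the ballpark figures quoted above, the variable and function names are made up for illustration, and the linear price scaling is the same simplifying assumption the text makes.

```python
# Ballpark inputs quoted in the text above (not vendor quotes).
ENDPOINTS = 10_000
COMPRESSED_MB_PER_ENDPOINT_PER_DAY = 5
INDEX_EXPANSION_FACTOR = 10          # compressed -> indexed size
SIEM_ANNUAL_PRICE_USD = 80_000       # price for the tier below
SIEM_TIER_GB_PER_DAY = 100
MB_PER_GB = 1024


def siem_daily_ingest_gb(endpoints: int) -> float:
    """Daily indexed EDR volume in GB for the given number of endpoints."""
    daily_mb = endpoints * COMPRESSED_MB_PER_ENDPOINT_PER_DAY * INDEX_EXPANSION_FACTOR
    return daily_mb / MB_PER_GB


def siem_annual_cost_usd(daily_gb: float) -> float:
    """Annual SIEM cost, assuming the price scales linearly with daily GB."""
    return daily_gb * SIEM_ANNUAL_PRICE_USD / SIEM_TIER_GB_PER_DAY


daily_gb = siem_daily_ingest_gb(ENDPOINTS)
siem_cost = siem_annual_cost_usd(daily_gb)
print(f"{daily_gb:.0f} GB/day -> ${siem_cost:,.0f}/year")
```

Changing `ENDPOINTS` or the expansion factor shows how quickly the SIEM bill moves with data volume.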
Query reduces cost in a couple of ways. It enables you to store data in platforms with cheaper unit costs, i.e. store more data in the less expensive platforms vs. the more expensive dedicated application platforms. And it lets you search through multiple data sources residing in your blob storage without moving or duplicating the data. Since there is no “install” or data migration needed, adding a new data source can be done in minutes.
Yes, minutes. Check out this video demo to see.
Cost savings will vary based on your actual data volumes and technology costs, but as an example, with Query, you can use your cloud provider’s blob storage, such as S3 on AWS, instead of migrating all of your data to your SIEM. (This blog details how to use Query open federated search with S3 specifically.) Let’s calculate costs with these references:
- S3 storage pricing is under $0.025/GB per month, as per Amazon S3 Simple Storage Service Pricing
- Querying data in S3 with Amazon Athena costs $5 per TB of scanned data, as per Amazon Athena Pricing – Serverless Interactive Query Service
NOTE: The above prices are for AWS (Amazon S3 and Athena); look for equivalents from your cloud provider.
For our example scenario, one year of compressed data in S3 comes to about 17,822GB (5MB per endpoint per day x 10,000 endpoints x 365 days / 1024). Storing this costs only about $5,346 per year (17,822GB x $0.025/GB per month x 12 months).
Beyond storage, you do have to pay $5/TB to scan data. Analyst usage and query patterns are harder to estimate, and optimizations like caching can lower the bill. Nevertheless, let’s conservatively ballpark the query costs at $18,250 over the year, assuming ~10TB of data is scanned by analysts every day, even during weekends and holidays ($5/TB x 10TB x 365 days). This still leads to total S3 storage and query costs of only $23,596 every year!
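The storage and query figures can be checked the same way. The sketch below uses only the ballpark prices and volumes from this post, and the names are illustrative. It bills the full year of data for all 12 months, as the text does, which overstates storage cost because data actually accumulates over the year.

```python
ENDPOINTS = 10_000
COMPRESSED_MB_PER_ENDPOINT_PER_DAY = 5
DAYS_RETAINED = 365
MB_PER_GB = 1024

S3_USD_PER_GB_MONTH = 0.025   # upper bound quoted above
SCAN_USD_PER_TB = 5           # Athena price per TB scanned
TB_SCANNED_PER_DAY = 10       # assumed analyst usage, every day of the year

# One year of compressed EDR data, in GB.
stored_gb = ENDPOINTS * COMPRESSED_MB_PER_ENDPOINT_PER_DAY * DAYS_RETAINED / MB_PER_GB

# Treat the full year's volume as stored for all 12 months (an overestimate).
storage_cost = stored_gb * S3_USD_PER_GB_MONTH * 12

# Scanning cost for a year of daily analyst queries.
query_cost = SCAN_USD_PER_TB * TB_SCANNED_PER_DAY * 365

s3_total = storage_cost + query_cost
print(f"stored: {stored_gb:,.0f} GB")
print(f"storage: ${storage_cost:,.2f}  query: ${query_cost:,.2f}  total: ${s3_total:,.2f}")
```

The dollar amounts in the text are these values with the cents dropped.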
So our overall conservative cost reduction of using cloud blob storage vs SIEM is $390,625 – $23,596 = $367,029 per year (for a 10,000 employee organization). And this is just for the EDR data. The price difference can quickly add up as more data sources are added. Query licenses start with up to 5 data integrations, so in this example, you can still add four more to compound the savings (and visibility).
Overall, the cost savings are staggering. You can reduce incremental SIEM storage costs by ~80% by moving data into cloud blob storage. Query is licensed by the number of analysts and integrations, so it is a fixed cost independent of data size or volume: licensing starts at $5,000 per month ($60,000 per year) for up to 5 integrations and 5 users. With our conservative numbers, a 10,000-employee organization saves $307,029 per year ($390,625 – $60,000 – $23,596) while increasing visibility and decreasing mean time to respond (MTTR). Additionally, in this scenario, four more data integrations can be added to Query at no additional cost, further increasing analyst data visibility and cost savings potential.
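As a quick check on the net figure and the ~80% claim, the calculation can be written out as follows. The inputs are the rounded annual totals from this post and the variable names are illustrative.

```python
SIEM_ANNUAL_USD = 390_625        # SIEM cost for the EDR data
S3_ANNUAL_USD = 23_596           # S3 storage plus query cost
QUERY_LICENSE_MONTHLY_USD = 5_000

query_license_annual = QUERY_LICENSE_MONTHLY_USD * 12
net_savings = SIEM_ANNUAL_USD - query_license_annual - S3_ANNUAL_USD
savings_pct = 100 * net_savings / SIEM_ANNUAL_USD

print(f"net savings: ${net_savings:,}/year ({savings_pct:.1f}% of the SIEM cost)")
```

Because the license is a flat fee, the percentage saved grows as more data sources are moved out of the SIEM.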
Reducing Search Time
With federated search, time to query is less of a challenge: each source is accessed through its own platform’s API, so queries can run in parallel. An investigation that would normally have taken hours now takes minutes, because Query searches multiple platforms simultaneously.
In my previous blog, we created a formula for calculating Analysts’ Search per Investigation (ASPI), and determined a fivefold reduction in ASPI using Query. Query provides a focused and interactive UI for analysts to easily perform their security investigations. Open Federated Search doesn’t care where the data resides and can apply a common cybersecurity schema to correlate data across different platforms.
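To illustrate why parallel access matters, here is a minimal sketch of a concurrent fan-out using Python’s `asyncio`. It is not Query’s implementation or API: the source names, latencies, and the `search_source` function are hypothetical stand-ins, with a sleep simulating each platform’s response time.

```python
import asyncio
import time

# Hypothetical sources with simulated response times in seconds.
SOURCE_LATENCY_SECONDS = {"edr": 0.10, "siem": 0.20, "s3": 0.30}


async def search_source(name: str, term: str) -> dict:
    """Stand-in for one platform's search API call."""
    await asyncio.sleep(SOURCE_LATENCY_SECONDS[name])  # simulated network wait
    return {"source": name, "term": term, "hits": []}


async def federated_search(term: str) -> list:
    """Send the same search to every source concurrently and gather results."""
    calls = [search_source(name, term) for name in SOURCE_LATENCY_SECONDS]
    return await asyncio.gather(*calls)


start = time.perf_counter()
results = asyncio.run(federated_search("suspicious.exe"))
elapsed = time.perf_counter() - start

sequential = sum(SOURCE_LATENCY_SECONDS.values())
print(f"{len(results)} sources answered in {elapsed:.2f}s "
      f"(one after another: about {sequential:.2f}s)")
```

With concurrent calls, the wait is roughly that of the slowest source rather than the sum of all of them.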
Setting up Query is quick and painless. Point the solution to your storage, configure access, configure the data model, and you are ready to search, visualize, filter, investigate, and pivot. You get complete control of your data from one console regardless of its location or format.
The third area of improvement, analyst performance, is even more difficult to quantify.
We saw in the third blog of the series how federated search was much more efficient for analysts than the current process in our malware investigation use-case. With Query, your analysts instantly have visibility into the relevant data for an investigation.
This saves time, which means higher productivity per analyst. In most cases, it also means more comprehensive data contributes to the investigation. Searches are tedious, time-consuming manual tasks, so many analysts perform only the minimum amount of searching they deem necessary to find an answer. This results in partial and potentially incorrect answers. With Query, the additional data available results in more accurate and complete answers.
Summary of the Series
Based on the conservative examples discussed in this series, Query’s open federated search for security solution was five times more efficient in our malware investigation use-case, decreased security data storage and access costs by ~80 percent, and expanded visibility for more complete searches.
By providing a single place for all your results, Query delivers large cost savings because data does not need to be duplicated or moved around. Query’s open federated search lets you choose which data to keep in which platform while still giving you a single interface to search and investigate. With Query, we found that the security team gets the flexibility to keep data in its original platforms or in an intermediary at a much lower cost. Overall, Query’s open federated search provides visibility, saves analyst time, and reduces infrastructure and licensing costs.