January 21, 2025

Amazon Redshift Integrated Into Query Federated Search

Query announces the Amazon Redshift Connector!

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.

The Amazon Redshift (“Classic”) service manages all of the work of setting up, operating, and scaling a data warehouse. These tasks include provisioning capacity, monitoring and backing up the cluster, and applying patches and upgrades to the Amazon Redshift engine. An Amazon Redshift cluster is a set of nodes consisting of a leader node and one or more compute nodes. The type and number of compute nodes that you need depend on the size of your data, the number of queries you will run, and the query runtime performance that you need. Depending on your data warehousing needs, you can start with a small, single-node cluster and easily scale up to a larger, multi-node cluster as your requirements change. You can add compute nodes to the cluster, or remove them, without any interruption to the service.

Security teams use Redshift clusters for security analytics and aggregated use cases within a warehouse. For instance, they can combine data feeds from several downstream databases, SIEMs, XDRs, and flat files to create complex analytics datasets. From there, these datasets can be fed into Business Intelligence tools or other bespoke reporting. Some teams also process such high volumes of data that they benefit from the columnar storage orientation and speed of a data warehouse. In either case, Query Federated Search supports integrations with any table, view, materialized view, or even external dataset queried through Amazon Redshift Spectrum.

The Query normalization functionality is built around the Open Cybersecurity Schema Framework (OCSF), which Query implements as the Query Data Model (QDM). All search intents are expressed with OCSF/QDM concepts: Entities/Observables represent facts and indicators, whereas Event Classes represent things that have happened and are normalized against network, application, file system, identity, and first-party security findings.

With Query, you do not need to author any SQL, and you are also blocked from dispatching arbitrary SQL against your warehouse resources. Query handles the full end-to-end query translation, planning, execution, and normalization of results. Query provides a no-code workflow to map your source data into the OCSF/QDM format, so you do not need to craft additional ETL resources or views to get the same schema across your security data.
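To make the idea of mapping source data into OCSF/QDM more concrete, the sketch below shows conceptually what normalizing one warehouse row into an OCSF Network Activity event could look like. It is not Query's implementation, and Query's workflow requires no code. The column names (`event_ts`, `source_ip`, `dest_ip`, `dest_port`) and the helper functions are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone

# Hypothetical Redshift column names mapped to OCSF attribute paths.
# The column names are assumptions; the OCSF paths follow the
# Network Activity event class.
FIELD_MAP = {
    "event_ts": "time",
    "source_ip": "src_endpoint.ip",
    "dest_ip": "dst_endpoint.ip",
    "dest_port": "dst_endpoint.port",
}


def set_path(target, dotted_path, value):
    """Set a value in a nested dict using a dotted path such as 'a.b'."""
    keys = dotted_path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def to_ocsf_network_activity(row):
    """Map one source row (a dict) to an OCSF-style Network Activity event."""
    event = {
        "category_uid": 4,       # Network Activity category
        "class_uid": 4001,       # Network Activity event class
        "class_name": "Network Activity",
    }
    for column, ocsf_path in FIELD_MAP.items():
        value = row.get(column)
        if value is None:
            continue  # leave unmapped or missing attributes out of the event
        if isinstance(value, datetime):
            # OCSF timestamps are milliseconds since the Unix epoch.
            value = int(value.timestamp() * 1000)
        set_path(event, ocsf_path, value)
    return event
```

Calling `to_ocsf_network_activity` with a row such as `{"event_ts": datetime(2025, 1, 21, tzinfo=timezone.utc), "source_ip": "10.0.0.5", "dest_ip": "203.0.113.7", "dest_port": 443}` yields a nested event with `src_endpoint` and `dst_endpoint` objects, which is the kind of shared shape that lets different sources be searched with the same schema.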

Constraints and features

Query can only map one table/view per Connector. You will need to create a separate Connector and mapping for each table, view, or other object you want to search in your Redshift Clusters. Each Connector also generates a distinct IAM Role External ID; you can either create a separate Role for each Connector or define an array of External IDs in a single IAM Role Trust Policy.
Query requires an externally reachable Cluster endpoint hostname to connect to. This requires Public Access to be enabled on your Cluster and a cluster subnet group in your Public Subnets. Please contact your Query TAM or CSM for our IP ranges to use with a Security Group. The Security Group should allow access only from the specific Query IP address(es), and only on the port you configured for your Cluster.
Query can authenticate with Basic Authentication (username/password) or with IAM role-based authentication, which generates temporary credentials.
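The External ID and Security Group constraints above can be sketched as plain data structures. The example below is a minimal illustration, not Query's documented setup: the principal ARN, External IDs, and CIDR range are placeholders (the real values come from your Connector and your Query TAM or CSM), and the helper names are hypothetical. The rule dictionary follows the `IpPermissions` shape used by the EC2 `AuthorizeSecurityGroupIngress` API, and 5439 is the default Redshift port.

```python
import json


def build_trust_policy(principal_arn, external_ids):
    """Build an IAM Role Trust Policy that accepts several External IDs.

    A list under StringEquals matches if any one of the values matches,
    so one Role can serve multiple Connectors.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},  # placeholder principal
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": list(external_ids)}
                },
            }
        ],
    }


def build_ingress_rule(allowed_cidrs, port=5439):
    """Build one Security Group ingress rule limited to one TCP port."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": cidr, "Description": "Query Federated Search"}
            for cidr in allowed_cidrs
        ],
    }


def render(document):
    """Serialize a policy or rule as indented JSON for review."""
    return json.dumps(document, indent=2)
```

For example, `render(build_trust_policy("arn:aws:iam::111122223333:root", ["connector-a-id", "connector-b-id"]))` produces a trust policy JSON document with both placeholder External IDs in one condition, and `build_ingress_rule(["203.0.113.10/32"])` describes a rule that opens only the Cluster port to a single placeholder address.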

For more information see the docs here

Contributed by:

Query

Simplifying Search