Google BigQuery

Google BigQuery is a fully managed, serverless enterprise data warehouse (EDW) on Google Cloud Platform (GCP) that enables scalable analysis of large datasets. You can run SQL queries against massive datasets with fast execution times. BigQuery handles the infrastructure, providing you with an analytics engine that can pull insights from data with minimal management.

Within the Query Federated Search Platform, BigQuery is considered a dynamic schema platform that can contain any number of data sources. For that reason, Query provides a no-code Configure Schema workflow that lets users introspect tables, auto-discover schema mapping opportunities and time-based partitions, and map the source data into the Query Data Model (QDM). This allows you to model nearly any logging, event, or asset data stored in a BigQuery dataset and table.
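The idea of mapping source columns into a normalized model can be sketched in a few lines of Python. This is only an illustration of the concept, not how the Configure Schema workflow is implemented: the column names (src_ip, dst_ip, user_email, event_ts), the dotted target paths, and the helper apply_mapping are all hypothetical.

```python
from typing import Any

# Hypothetical mapping from BigQuery column names to dotted, QDM-style paths.
# Both the column names and the target paths are illustrative assumptions.
SCHEMA_MAPPING = {
    "src_ip": "src_endpoint.ip",
    "dst_ip": "dst_endpoint.ip",
    "user_email": "user.email_addr",
    "event_ts": "time",
}


def apply_mapping(row: dict[str, Any], mapping: dict[str, str]) -> dict[str, Any]:
    """Copy mapped columns from a flat row into a nested dictionary."""
    result: dict[str, Any] = {}
    for column, path in mapping.items():
        if row.get(column) is None:
            continue  # skip columns that are missing or NULL in this row
        *parents, leaf = path.split(".")
        node = result
        for key in parents:
            # Create intermediate objects (e.g. "src_endpoint") on demand.
            node = node.setdefault(key, {})
        node[leaf] = row[column]
    return result


# A made-up flow-log row as it might come back from a BigQuery table.
row = {
    "src_ip": "10.0.0.5",
    "dst_ip": "8.8.8.8",
    "user_email": None,
    "event_ts": "2024-01-01T00:00:00Z",
}
normalized = apply_mapping(row, SCHEMA_MAPPING)
```

Keeping the mapping as plain data rather than code mirrors the no-code approach: changing what a column maps to means editing one entry, not rewriting logic.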

Query’s integration with BigQuery allows analysts to search the following entities:

  • IP Addresses (IPv4 and IPv6)
  • Domains & Hostnames
  • URLs & URIs
  • Email Addresses
  • Usernames & User IDs
  • File Hashes (e.g., MD5, SHA1, SHA256)
  • File Names or Directories
  • Resource IDs (e.g., Agent or Device IDs, cloud resource IDs)
  • Process Names
  • MAC Addresses

The context an analyst can obtain depends on the data you choose to store in BigQuery:

  • Because Google BigQuery is considered a “green field” for data, security architects and engineers can decide which high-volume data sources to store there for searching.
  • Network flow logs, Windows event data, and DNS data are examples of large (noisy) telemetry. You could put any of these into a Google BigQuery dataset and then search any of the entities above with Query.
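To make the entity search idea concrete, the sketch below builds a parameterized BigQuery Standard SQL query that looks for one IP address in a flow-log table within a time window. It does not show the queries Query itself generates. The table my-project.security.flow_logs, the columns src_ip, dst_ip, and event_ts, and the helper build_entity_search are all hypothetical, and the code only builds the query text and parameters without contacting BigQuery.

```python
import re
from datetime import datetime, timezone

# Conservative patterns for identifiers, since names cannot be sent as parameters.
_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TABLE = re.compile(r"^[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+){1,2}$")


def build_entity_search(table, columns, time_column, value, start, end):
    """Return (sql, params) for an entity lookup bounded by a time window."""
    if not _TABLE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not columns:
        raise ValueError("at least one column to search is required")
    for name in [*columns, time_column]:
        if not _COLUMN.match(name):
            raise ValueError(f"unexpected column name: {name!r}")

    # Match the entity value against every candidate column.
    matches = " OR ".join(f"{col} = @entity" for col in columns)
    # Filtering on the time column keeps the search bounded; when that column
    # is the table's partitioning column, it also limits what gets scanned.
    sql = (
        f"SELECT * FROM `{table}` "
        f"WHERE {time_column} >= @start_time AND {time_column} < @end_time "
        f"AND ({matches})"
    )
    params = {"entity": value, "start_time": start, "end_time": end}
    return sql, params


sql, params = build_entity_search(
    table="my-project.security.flow_logs",  # hypothetical table
    columns=["src_ip", "dst_ip"],           # hypothetical columns
    time_column="event_ts",                 # hypothetical partition column
    value="10.0.0.5",
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 1, 2, tzinfo=timezone.utc),
)
```

The searched value travels as the named parameter @entity rather than being spliced into the SQL text. With the google-cloud-bigquery Python client, such values would typically be supplied as ScalarQueryParameter objects in a QueryJobConfig.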

To integrate Google BigQuery, see the integration documentation here.

The integration normalizes data pulled from BigQuery into Query’s OCSF-based QDM (Query Data Model), which enables cross-platform joins and compounds the analyst’s ability to investigate. Normalization follows the schema mapping defined by the analyst. Analysts can see key identity attributes such as name, email, credential UID, and account UID in the QDM User object. Additional authentication information from BigQuery, such as last login time and last login IP address, is extracted into the QDM Device object.

With the federated join capabilities, the analyst can also see context on that entity pulled from the other data sources Query is integrated with. Depending on the additional integrations in your environment, Query will show you:

  • The user’s devices.
  • Additional alerts correlated with the user or the device, such as alerts based on email, web, or file activity.
  • Relevant follow-up searches to get vulnerability, malware, and threat intelligence information associated with related entities such as files, processes, and applications on that device.
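The federated join described above amounts to grouping normalized records from different sources by a shared entity value. The sketch below shows that idea in memory and is not Query's implementation. The source names (bigquery_logins, edr_devices, email_alerts), the record fields, and the helper join_on_entity are all hypothetical.

```python
from collections import defaultdict


def join_on_entity(sources, key):
    """Group records from every source by the value of a shared entity key."""
    joined = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            value = record.get(key)
            if value is None:
                continue  # this record does not carry the entity
            # Compare case-insensitively, which suits email addresses.
            joined[str(value).lower()][source_name].append(record)
    return {entity: dict(by_source) for entity, by_source in joined.items()}


# Made-up, already-normalized records from three hypothetical integrations.
sources = {
    "bigquery_logins": [
        {"email": "Alice@example.com", "last_login_ip": "10.0.0.5"},
    ],
    "edr_devices": [
        {"email": "alice@example.com", "hostname": "alice-laptop"},
        {"email": "bob@example.com", "hostname": "bob-desktop"},
    ],
    "email_alerts": [
        {"email": "alice@example.com", "alert": "suspicious attachment"},
    ],
}
context = join_on_entity(sources, "email")
```

Here context["alice@example.com"] collects the login record, the device, and the alert for one user in a single place, which is the kind of combined view the list above describes.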