A snapshot of the tools data teams are picking up — and the ones holding their ground — across every stage of the modern data pipeline.
Open-source ELT platform with a huge connector catalog for syncing data into warehouses.
Managed, fully automated data pipelines that move source data into your warehouse.
SQL-first transformation framework that turns raw warehouse data into analytics-ready models.
Distributed event streaming platform powering real-time pipelines at internet scale.
Cloud data warehouse with separated storage and compute, a common default at large enterprises.
Lakehouse platform unifying data engineering, analytics, and AI on a single store.
Open table format that adds ACID transactions and versioning to data lakes.
Open table format for huge analytic datasets, increasingly the lakehouse standard.
Battle-tested workflow scheduler for authoring, scheduling, and monitoring data pipelines.
Modern Python-native orchestrator focused on dynamic, code-first workflow definitions.
Asset-aware orchestrator that treats data products as first-class, observable entities.
The de facto standard for in-warehouse SQL transformations, testing, and documentation.
Next-gen transformation framework with virtual environments and column-level lineage.
In-process analytical SQL engine that runs anywhere — laptop, browser, or pipeline.
Serverless DuckDB in the cloud, blending local and cloud analytics in one engine.
Unified analytics suite combining ingestion, storage, BI, and AI on the OneLake foundation.
The same pipeline, built entirely on Google Cloud's managed data tools — no open-source assembly required. Each layer maps to a first-party GCP service.
Global, serverless messaging for real-time event ingestion at any scale.
Serverless CDC that streams changes from operational databases straight into BigQuery.
Unified stream and batch processing built on Apache Beam, fully managed end-to-end.
Visual, code-free ETL service for building and operating data pipelines on GCP.
Serverless cloud data warehouse with petabyte-scale SQL and built-in ML.
Unified storage layer that brings warehouse governance to lake formats like Iceberg.
Durable, multi-region object storage that anchors most GCP-native data lakes.
Petabyte-scale, low-latency NoSQL store for time-series and operational analytics.
Fully managed Apache Airflow for authoring and scheduling pipelines on GCP.
Serverless orchestrator that chains Google Cloud APIs and HTTP services in YAML.
Managed cron for triggering jobs, functions, and pipelines on a fixed schedule.
SQL workflow framework for building, testing, and versioning ELT directly in BigQuery.
Train and serve ML models with plain SQL inside BigQuery — no data movement required.
Managed Spark and Hadoop, with serverless modes for elastic, large-scale transforms.
Enterprise BI platform with a semantic layer (LookML) that governs every metric.
Free, shareable dashboards with native BigQuery, Google Sheets, and Google Analytics connectors.
Intelligent data fabric for governance, lineage, and quality across your lakehouse.
Unified ML platform spanning training, serving, and generative AI on GCP data.