dbt: El Estandar para Transformaciones de Datos
dbt ha revolucionado como los Data Engineers construyen transformaciones. Permite escribir modelos en SQL puro con versionado en Git, tests automatizados y documentacion generada. En 2025, dbt es requisito en el 60% de ofertas de Data Engineering en LATAM.
# Instalar y crear proyecto
pip install dbt-bigquery
dbt init mi_warehouse
# Estructura de proyecto dbt
mi_warehouse/
dbt_project.yml
models/
staging/
stg_orders.sql
stg_customers.sql
intermediate/
int_daily_revenue.sql
marts/
dim_customers.sql
fct_orders.sql
tests/
macros/-- models/staging/stg_orders.sql
WITH source AS (
SELECT * FROM {{ source('raw', 'orders') }}
),
renamed AS (
SELECT
order_id::VARCHAR AS order_id,
user_id::VARCHAR AS customer_id,
LOWER(TRIM(status)) AS order_status,
total_amount::DECIMAL(12,2) AS order_total,
currency,
CASE country_code
WHEN 'AR' THEN 'Argentina'
WHEN 'MX' THEN 'Mexico'
WHEN 'CO' THEN 'Colombia'
WHEN 'BR' THEN 'Brasil'
END AS country_name,
created_at::TIMESTAMP AS ordered_at
FROM source
WHERE order_id IS NOT NULL AND total_amount > 0
)
SELECT * FROM renamed-- models/marts/fct_orders.sql (incremental)
{{ config(materialized='incremental', unique_key='order_id') }}
WITH orders AS (
SELECT * FROM {{ ref('stg_orders') }}
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
),
customers AS (SELECT * FROM {{ ref('dim_customers') }})
SELECT
o.order_id, o.customer_id, c.customer_segment,
o.order_total, o.currency, o.country_name, o.ordered_at,
ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.ordered_at) AS order_sequence
FROM orders o LEFT JOIN customers c ON o.customer_id = c.customer_id🚀 dbt en Produccion
Comandos esenciales: dbt run (ejecutar), dbt test (tests), dbt docs generate (documentacion). En CI/CD: dbt build --select state:modified+ solo ejecuta lo que cambio.
dbt conecta el mundo del SQL con las mejores practicas de software engineering.