Cloud Computing for Data Engineers
95% of modern pipelines run in the cloud. In LATAM, AWS leads with ~45% market share, followed by Azure (~25%) and GCP (~20%). GCP's edge is BigQuery, arguably the best data warehouse available.
| Service | AWS | GCP | Azure |
|---|---|---|---|
| Data Warehouse | Redshift | BigQuery | Synapse |
| Object Storage | S3 | GCS | Blob Storage |
| Managed Spark | EMR | Dataproc | HDInsight |
| Streaming | Kinesis | Pub/Sub + Dataflow | Event Hubs |
| Orchestration | MWAA (Airflow) | Cloud Composer (Airflow) | Data Factory |
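# Terraform - Data platform on AWS
# HCL strings must use double quotes; single quotes are not valid.
resource "aws_s3_bucket" "data_lake" {
  bucket = "empresa-data-lake-latam"
}

# AWS provider v4+ deprecates the inline lifecycle_rule block on aws_s3_bucket
# in favor of this separate resource.
resource "aws_s3_bucket_lifecycle_configuration" "data_lake" {
  bucket = aws_s3_bucket.data_lake.id

  rule {
    id     = "archive-to-glacier" # illustrative rule name
    status = "Enabled"

    filter {} # empty filter: the rule applies to every object

    # Move objects to Glacier 90 days after creation.
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
  }
}

resource "aws_redshift_cluster" "warehouse" {
  cluster_identifier = "warehouse-latam"
  database_name      = "analytics"
  node_type          = "ra3.xlplus"
  number_of_nodes    = 2
  # Creating a new cluster also needs master credentials
  # (for example master_username and master_password), omitted here.
}

🚀 Valued Certifications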
AWS: Data Analytics Specialty (exam fee USD 300). GCP: Professional Data Engineer (USD 200). Databricks: Data Engineer Associate (USD 200). A certification can raise your salary by 15-25% in LATAM.
| Size (data volume) | Stack | Monthly Cost |
|---|---|---|
| Startup (<10 GB) | BigQuery + Airbyte + dbt | USD 50-200 |
| Mid-size (10 GB-1 TB) | Redshift + Airflow + dbt + S3 | USD 500-2,000 |
| Large (1 TB+) | Databricks + Kafka + Delta Lake | USD 5,000-50,000+ |
Pick one cloud, master it in depth, and learn the concepts that transfer to the others.
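
To illustrate what "transferable" can mean in practice, the AWS data lake and warehouse defined earlier have a close counterpart in the GCP column of the service table. The following is a minimal sketch, not a production setup. The bucket name, dataset ID, and the `US` location are assumptions made for illustration, and `COLDLINE` is picked only as a rough analogue of Glacier (`ARCHIVE` would be another candidate).

```hcl
# Sketch only: GCP counterpart of the AWS data lake and warehouse.
# Names and locations below are illustrative assumptions.
resource "google_storage_bucket" "data_lake" {
  name     = "empresa-data-lake-latam" # bucket names are globally unique
  location = "US"

  # Counterpart of the S3 transition to GLACIER after 90 days.
  lifecycle_rule {
    condition {
      age = 90 # days since object creation
    }
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
  }
}

# BigQuery is serverless, so there is no node type or node count to declare.
resource "google_bigquery_dataset" "warehouse" {
  dataset_id = "analytics"
  location   = "US"
}
```

The shape is the same in both clouds: an object store with an age-based tiering rule, plus a warehouse that analysts query. What changes is the resource names and the fact that BigQuery has no cluster to size.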