Available for senior data engineering roles

Hi, I'm Sai Kiran.
I build cloud data platforms that scale.

Senior GCP Data Engineer with 9+ years designing enterprise-scale data platforms for Fortune 100 and healthcare clients — specializing in BigQuery, Airflow, PySpark, and cost-optimized ETL/ELT pipelines.

9+ Years Experience
20 TB+ Data Migrated
200+ Pipelines Built
5 Enterprise Clients
01 — About

A bit about me

I'm a Senior GCP Data Engineer based in Dallas, TX, with deep expertise in building enterprise-scale, cloud-native data platforms. Over the past 9+ years I've led multiple end-to-end Teradata-to-GCP migrations, automated UAT reconciliation workflows, and built cost-optimized, HIPAA-compliant pipelines processing multi-terabyte workloads daily.

My core strength is bringing strong software engineering practices to data: CI/CD, automated data quality, Infrastructure as Code (Terraform), and version-controlled pipeline development. I care about pipelines that are reliable, observable, and cheap to run at scale.

I've delivered for clients including Verizon, Tenet Healthcare, DriveWealth, and TJX — cutting query latency, slashing compute spend, and shrinking migration cycles from weeks to days.

Location: Dallas, TX
Experience: 9+ Years
Focus: GCP / BigQuery
Education: M.S. Information Science
Certified: GCP Pro Data Engineer
Email: kirane8989@gmail.com
02 — Skills

Technical toolkit

☁️ Cloud Platforms

BigQuery · Dataproc · Cloud Composer · GCS · Pub/Sub · Dataflow · Vertex AI · Azure ADF · Databricks

⚙️ Data Engineering

PySpark · Scala · dbt · Delta Lake · Kafka · Hive · Sqoop · Informatica

💻 Languages

Python · SQL · Scala · Spark SQL · Shell / Bash

🗄️ Databases & Warehouses

BigQuery · Snowflake · Teradata · PostgreSQL · SQL Server · Oracle · MongoDB

🔧 DevOps & IaC

Git / GitHub · GitHub Actions · Cloud Build · Terraform · Docker · CI/CD

📊 Quality & Viz

Great Expectations · Dataplex · Cloud Monitoring · Tableau · Looker · Power BI · Qlik
03 — Experience

Where I've worked

GCP Data Engineer
Verizon · Irving, TX
Mar 2025 – Present
  • Led Teradata → BigQuery migration of 20 TB across 250+ production tables with zero data loss.
  • Designed 65+ Cloud Composer (Airflow) DAGs processing 4 TB/day into BigQuery.
  • Built a Python reconciliation framework cutting UAT validation from 3 days to 4 hours (~80%).
  • Optimized BigQuery via partitioning & clustering — 45% lower latency, ~$18K/mo savings.
  • Automated migration of 150+ legacy scripts using Claude AI, cutting migration cycles from 6 weeks to under 2 weeks.
GCP Data Engineer
Tenet Healthcare · Dallas, TX
Mar 2023 – Feb 2025
  • Architected HIPAA-compliant GCP infra processing 1.5 TB/day across 55 hospital systems at 99.9%+ uptime.
  • Rebuilt 120+ Informatica ETL mappings as Composer DAGs & BigQuery SQL — 35% faster runtime.
  • Optimized Dataproc PySpark jobs (broadcast joins, caching) — 40% faster processing.
  • Built a GCS + BigQuery + Snowflake data lake enabling 10 teams to self-serve, cutting time-to-insight from 5 days to 4 hours.
GCP Data Engineer
DriveWealth · New York, NY
Jan 2021 – Feb 2023
  • Built financial ingestion pipelines processing 15M daily transactions from 6 sources at 99%+ SLA.
  • Developed PySpark jobs enabling real-time aggregations at 3+ billion-row scale.
  • Designed a Snowflake warehousing layer integrated with BigQuery for 12 downstream consumers.
Azure Data Engineer
TJX Companies · Framingham, MA
Jul 2019 – Dec 2020
  • Designed ADF pipelines migrating 8 TB of retail data from SQL Server to Azure Data Lake / SQL DWH.
  • Built Databricks PySpark + Delta Lake ETL for real-time inventory and sales analytics.
Hadoop Developer
Evoke Technologies · India
Oct 2016 – Oct 2018
  • Built MapReduce, Hive, and PySpark pipelines on Hadoop/HDFS; designed HBase schemas.
  • Automated data ingestion from MySQL via Sqoop.
04 — Projects

Featured work

📁

Teradata → BigQuery Migration Toolkit

Python + LLM-assisted SQL dialect converter that translates legacy Teradata scripts and stored procedures into optimized BigQuery SQL, with automated reconciliation.

Python · BigQuery · Claude AI
📁

Airflow ETL Pipeline Framework

Reusable Cloud Composer (Airflow) DAG framework for multi-stage ETL with built-in data quality checks, alerting, and CI/CD deployment via GitHub Actions.

Airflow · Python · GCP
📁

Data Quality & Reconciliation Engine

Automated validation framework using Great Expectations and custom Python checks to catch anomalies before they reach downstream BI reports.

Python · Great Expectations · SQL
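
To give a flavor of the "custom Python checks" mentioned in this card, here is a minimal sketch of a source-vs-target reconciliation. It is an illustration under stated assumptions, not the project's actual code: the names `table_fingerprint` and `reconcile` are hypothetical, rows are assumed to arrive as plain Python tuples (for example, fetched from a query), and the checksum scheme (summing per-row SHA-256 digests) is one simple choice among many.

```python
import hashlib
from typing import Dict, Iterable, Sequence, Tuple

_MODULUS = 2 ** 256  # keeps the running checksum at a fixed width


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, checksum) for a collection of rows.

    Per-row SHA-256 digests are added modulo 2**256, so the result does not
    depend on row order, and duplicate rows still change the checksum.
    """
    count = 0
    total = 0
    for row in rows:
        # repr() is type-sensitive: 1 and 1.0 serialize differently.
        encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
        digest = hashlib.sha256(encoded).digest()
        total = (total + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, f"{total:064x}"


def reconcile(
    source_rows: Iterable[Sequence[object]],
    target_rows: Iterable[Sequence[object]],
) -> Dict[str, object]:
    """Compare two row sets by row count and order-independent checksum."""
    src_count, src_sum = table_fingerprint(source_rows)
    tgt_count, tgt_sum = table_fingerprint(target_rows)
    return {
        "source_rows": src_count,
        "target_rows": tgt_count,
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
    }


if __name__ == "__main__":
    # Hypothetical sample data: same rows, different order.
    source = [(1, "alice", 10.5), (2, "bob", 7.0)]
    target = [(2, "bob", 7.0), (1, "alice", 10.5)]
    print(reconcile(source, target))
```

A check like this only flags that two tables differ, not where. In practice it would likely be paired with per-column or per-partition comparisons to localize a mismatch.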
05 — Contact

Let's build something together

I'm open to senior data engineering roles and consulting. Whether you have a question or just want to connect, my inbox is always open.