DATA ENGINEER • CLOUD • PIPELINES

Hi, I’m Piyusha
I build scalable data systems.

I design reliable ETL workflows, improve data quality, and ship analytics-ready datasets using SQL, Python, Spark, Airflow, and cloud services.

PySpark · Airflow · Snowflake · AWS · Databricks
4+ Years of Data Engineering
Entrepreneurship & Product Thinking
Marketing & Event Leadership

About

Who I am

I’m a Data Engineer who enjoys building data systems that teams can trust. My work sits at the intersection of engineering and analytics — turning raw, inconsistent inputs into clean, query-ready datasets that power dashboards, reporting, and product decisions.

I care about the details that make pipelines production-ready: data contracts, validation rules, monitoring, and recoverability. I also love simplifying complexity, designing workflows that are easy to maintain, easy to debug, and built to scale.

Beyond tools and pipelines, I value clarity and ownership. I take responsibility for the full lifecycle of data, from ingestion to consumption, ensuring definitions are aligned, assumptions are documented, and downstream users can rely on the data. My goal is to make data products feel boringly reliable.

40% faster delivery through pipeline optimization
200+ data quality issues fixed across client setups
30% less downtime via KPI monitoring
Data quality checks · SLA-first pipelines · Cost-aware design · Clear documentation · Ownership mindset · Stakeholder-ready outputs

How I work

01

Design for trust

Define clear inputs/outputs, add validation checks, and keep datasets reproducible (see the validation sketch after this list).

02

Make it observable

Monitor freshness, anomalies, failures, and SLA risks so issues are caught early (see the monitoring sketch below).

03

Keep it maintainable

Use simple patterns, reusable components, and docs so pipelines scale with the team.

04

Ship with impact

Work backward from the question and deliver clean tables that drive decisions.

05

Communicate clearly

Write crisp updates, align definitions, and make trade-offs visible to stakeholders.

06

Own it end-to-end

Take responsibility for delivery, quality, and long-term reliability — not just code.
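
To make step 01 concrete, here is a minimal sketch of the kind of validation gate I mean, written in PySpark. Everything specific is an assumption for illustration: the raw.orders and analytics.orders_clean table names, the required columns, and the row-count floor are placeholders, not a real client setup.

# Validation gate: refuse to publish data that breaks the contract.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_validation").getOrCreate()

df = spark.read.table("raw.orders")  # hypothetical source table

# 1. Schema contract: the columns downstream users depend on must exist.
required = {"order_id", "customer_id", "order_ts", "amount"}
missing = required - set(df.columns)
if missing:
    raise ValueError(f"Schema contract violated; missing columns: {missing}")

# 2. Key integrity: no null primary keys.
null_keys = df.filter(F.col("order_id").isNull()).count()
if null_keys:
    raise ValueError(f"{null_keys} rows have a null order_id")

# 3. Volume sanity: catch silently empty or truncated loads.
row_count = df.count()
if row_count < 1_000:  # illustrative floor; tuned per source in practice
    raise ValueError(f"Suspiciously low row count: {row_count}")

# Only data that passes every check is published for consumers.
df.write.mode("overwrite").saveAsTable("analytics.orders_clean")

Failing loudly before publishing keeps bad data out of dashboards, and reading from an immutable raw table keeps the run reproducible.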

What you’ll get
Trust
Validated datasets + clear definitions
Visibility
Monitoring for freshness, failures, and SLAs
Speed
Clean tables that power decisions faster
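
And to make the monitoring promise concrete, here is a sketch of a freshness check as an hourly Airflow DAG, assuming Airflow 2.4+ with the Snowflake provider installed. The connection ID, table, load_ts column, six-hour staleness threshold, and SLA are all placeholder assumptions.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def check_orders_freshness(**_):
    # Import inside the task so the provider isn't needed at DAG-parse time.
    from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook

    hook = SnowflakeHook(snowflake_conn_id="snowflake_default")  # assumed conn ID
    # Assumes load_ts is stored as UTC without a timezone (TIMESTAMP_NTZ).
    latest = hook.get_first("select max(load_ts) from analytics.orders_clean")[0]
    if latest is None or datetime.utcnow() - latest > timedelta(hours=6):
        raise RuntimeError(f"analytics.orders_clean looks stale (last load: {latest})")


with DAG(
    dag_id="orders_freshness_monitor",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="check_orders_freshness",
        python_callable=check_orders_freshness,
        sla=timedelta(minutes=30),  # flag the check itself if it runs late
    )

A failed run alerts before stakeholders notice a stale dashboard, which is the point: catch freshness and SLA risks upstream rather than in a meeting.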

Skills


Data Engineering

Pipelines, orchestration, reliability

SQL · Python · Airflow · Spark / PySpark · Batch + incremental loads · CDC patterns · Backfills & reprocessing · Data modeling (analytics-ready) · Data contracts · Schema evolution
SLAs · Alerts · RCA

Cloud & Warehousing

Scalable, cost-aware systems

Snowflake · AWS (S3, EC2, Glue, Lambda) · Databricks · CloudWatch monitoring · RDS · Cost optimization · Security basics (IAM) · Azure · GCP
Performance · Partitioning · Cost

Analytics & Delivery

From raw data → decisions

Power BI · Tableau · Analytics engineering · Metric definitions · Dashboard-ready tables · Data quality checks · Freshness / volume monitoring · REST APIs · Git · Jira · Agile · Stakeholder management
Stakeholders · Requirements · Shipping

Education

M.S. Information Systems

New Jersey Institute of Technology

GPA: 3.95 / 4.0
Focus: Data • Analytics • Cloud

B.E. Electronics & Communication

Punjab Engineering College

GPA: 3.70 / 4.0
Strengths: Systems • Problem-solving

Certifications

Hands-on credentials across data + analytics.

IBM Big Data
BCG GenAI
PwC Power BI
Accenture / KPMG Analytics

Let’s talk

Want to collaborate or have a role in mind? Send a message — I’ll reply soon.