DATA ENGINEER • CLOUD • PIPELINES

Hi, I’m Piyusha
I build scalable data systems.

I design reliable ETL workflows, improve data quality, and ship analytics-ready datasets using SQL, Python, Spark, Airflow, and cloud services.

PySpark • Airflow • Snowflake • AWS • Databricks
4+ Years of Data Engineering
Entrepreneurship & Product Thinking
Marketing & Event Leadership

About

Who I am

I’m a Data Engineer who enjoys building data systems that teams can trust. My work sits at the intersection of engineering and analytics — turning raw, inconsistent inputs into clean, query-ready datasets that power dashboards, reporting, and product decisions.

I care about the details that make pipelines production-ready: data contracts, validation rules, monitoring, and recoverability. I also love simplifying complexity: designing workflows that are easy to maintain, easy to debug, and built to scale.

Beyond tools and pipelines, I value clarity and ownership. I take responsibility for the full lifecycle of data, from ingestion to consumption, ensuring definitions are aligned, assumptions are documented, and downstream users can rely on the data. My goal is to make data products feel boringly reliable.

40%
Faster delivery
Pipeline optimization
200+
DQ issues fixed
Across client setups
30%
Less downtime
KPI monitoring
Data quality checks • SLA-first pipelines • Cost-aware design • Clear documentation • Ownership mindset • Stakeholder-ready outputs

How I work

01

Design for trust

Define clear inputs/outputs, add validation checks, and keep datasets reproducible.

02

Make it observable

Monitor freshness, anomalies, failures, and SLA risks so issues are caught early.

03

Keep it maintainable

Use simple patterns, reusable components, and docs so pipelines scale with the team.

04

Ship with impact

Work backward from the question and deliver clean tables that drive decisions.

05

Communicate clearly

Write crisp updates, align definitions, and make trade-offs visible to stakeholders.

06

Own it end-to-end

Take responsibility for delivery, quality, and long-term reliability — not just code.
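
A minimal Python sketch of the first two principles (validation checks and freshness monitoring), using only the standard library. The function names `validate_rows` and `is_stale`, the sample fields, and the one-hour freshness window are hypothetical, chosen for illustration rather than taken from any real pipeline.

```python
from datetime import datetime, timedelta, timezone


def validate_rows(rows, required_fields):
    """Return (row_index, problem) pairs for rows that break the contract."""
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            # A required field that is absent or None counts as a violation.
            if row.get(field) is None:
                problems.append((i, f"missing {field}"))
    return problems


def is_stale(last_loaded_at, max_age, now=None):
    """True when the newest load is older than the allowed freshness window."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at > max_age


# Hypothetical sample data: the second row is missing its amount.
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
]
problems = validate_rows(rows, ["order_id", "amount"])

# Hypothetical freshness check: last load was 3 hours ago, window is 1 hour.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stale = is_stale(now - timedelta(hours=3), timedelta(hours=1), now=now)
```

In an orchestrated pipeline, checks like these would typically run as their own task so that a failed contract or a stale table blocks downstream loads and raises an alert instead of silently publishing bad data.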

What you’ll get
Trust
Validated datasets + clear definitions
Visibility
Monitoring for freshness, failures, and SLAs
Speed
Clean tables that power decisions faster

Skills


Data Engineering

Pipelines, orchestration, reliability

SQL • Python • Airflow • Spark / PySpark • Batch + incremental loads • CDC patterns • Backfills & reprocessing • Data modeling (analytics-ready) • Data contracts • Schema evolution
SLAs • Alerts • RCA

Cloud & Warehousing

Scalable, cost-aware systems

Snowflake • AWS (S3, EC2, Glue, Lambda) • Databricks • CloudWatch monitoring • RDS • Cost optimization • Security basics (IAM) • Azure • GCP
Performance • Partitioning • Cost

Analytics & Delivery

From raw data → decisions

Power BI • Tableau • Analytics engineering • Metric definitions • Dashboard-ready tables • Data quality checks • Freshness / volume monitoring • REST APIs • Git • Jira • Agile • Stakeholder management
Stakeholders • Requirements • Shipping

Experience

Customer Data Engineer

Optimove · New York City, NY

Aug 2024 — Present
  • Optimized ETL workflows using SQL, Python, and Airflow, improving pipeline efficiency and reducing data delivery time by 40%.
  • Partnered with cross-functional teams to translate business requirements and resolve 200+ data quality issues across 10+ clients.
  • Built data-quality KPIs and monitoring to detect and resolve 50+ critical issues, driving a 30% reduction in downtime.

Business Development Analyst (Autonomous Trucking Model)

New Jersey Institute of Technology · Newark, NJ

Aug 2023 — May 2024
  • Conducted 20+ industry interviews to identify market trends and user pain points, informing product decisions.
  • Served as Entrepreneurial Lead for NSF I-Corps; led customer discovery and stakeholder analysis to uncover adoption barriers.
  • Delivered strategic recommendations to shift toward semi-autonomous capabilities aligned with market readiness.

Data Operations Analyst

New Jersey Institute of Technology · Newark, NJ

Jun 2023 — Aug 2023
  • Led an insurance provider transition and coordinated UAT across 3 implementation phases to ensure smooth rollout.
  • Created 50+ scripts for NJIT’s DegreeWorks platform, improving course enrollment efficiency and reducing scheduling errors.
  • Designed an ERD for the summer program student database, improving data retrieval speed by 25% and reducing redundancy.

Financial Data Analyst (CSR & Financial Performance)

New Jersey Institute of Technology · Newark, NJ

Mar 2023 — Jun 2023
  • Analyzed ESG datasets for 200+ companies using statistical methods to identify trends and generate CSR insights.
  • Curated and categorized 1,000+ keywords and assessed polarity to enable targeted narrative and performance analysis.
  • Built 40+ interactive dashboards in Power BI and Tableau to monitor KPIs for stakeholders.

Supply Chain Data Analyst

Faces Canada · India

Jan 2021 — Jul 2021
  • Analyzed ERP data for 500+ orders, identified inaccuracies, and improved fulfillment accuracy via a centralized inventory workflow.
  • Designed automation strategies that streamlined warehouse operations, saving 50+ hours per month.
  • Improved order fulfillment and inventory tracking, reducing holding costs by 15% through tighter packing and stock controls.

Education

M.S. Information Systems

New Jersey Institute of Technology

GPA: 3.95 / 4.0
Focus: Data • Analytics • Cloud

B.E. Electronics & Communication

Punjab Engineering College

GPA: 3.70 / 4.0
Strength: Systems • Problem-solving

Certifications

Hands-on credentials across data + analytics.

IBM Big Data
BCG GenAI
PwC Power BI
Accenture / KPMG Analytics

Projects


Task Management Application

User-focused task tracking application designed for clean workflows and usability.

UX • Product • Full-Stack

Makeup & Skincare E-Commerce

E-commerce experience showcasing product discovery, UI flows, and checkout journeys.

E-Commerce • UI • Demo

Supervised Machine Learning Implementation

Applied supervised learning models with feature analysis and performance evaluation.

ML • Python • Analytics

Brain Tumor Detection Using CNN

Deep learning pipeline using CNNs for medical image classification and detection.

CNN • Deep Learning • Computer Vision

FastAPI: Capture Day-to-Day Events

Backend API built with FastAPI to log and manage daily events via REST endpoints.

FastAPI • REST • Backend

Netflix Analytics Dashboard

Interactive Power BI dashboard analyzing Netflix content and viewing trends.

Power BI • Dashboard • Insights

Scalable Data Processing with PySpark

End-to-end big data pipeline for scalable processing and analytics using PySpark.

PySpark • ETL • Big Data

COVID-19 Data Analysis

Exploratory analysis and visualization of global COVID-19 datasets.

EDA • Python • Visualization

Customer Retention Dashboard

KPI-driven Power BI dashboard analyzing churn and customer retention patterns.

Power BI • KPIs • Business

Diversity Inclusion Dashboard

Power BI dashboard analyzing diversity, inclusion, and workforce equity metrics.

Power BI • Analytics • Visualization

Call Center Trends Dashboard

Interactive Power BI dashboard uncovering call volume, agent performance, and service trends.

Power BI • Operations • Analytics

Space-Alpha

Hackathon project focused on space data exploration and intelligent analytics solutions.

Hackathon • Data • Innovation

CoalSea

Sustainability-focused analytics project developed during a global innovation hackathon.

Hackathon • Sustainability • Analytics

Cricket Analysis Dashboard

Sports analytics dashboard analyzing player performance and match trends using Power BI.

Power BI • Analytics • Visualization

BCG: Generative AI Chatbot

Python-based GenAI chatbot developed as part of BCG’s virtual internship program.

Python • GenAI • NLP

Accenture: Data Analytics & Visualization

Business analytics and visualization project completed during Accenture virtual internship.

Analytics • Visualization • Consulting

Tata: Online Retail Store Dashboard

Data visualization project analyzing online retail performance and customer behavior.

Data Visualization • Retail • Analytics

Let’s talk

Want to collaborate or have a role in mind? Send a message — I’ll reply soon.