Operational Data Platforms
Project: Logicworks Cloud Compliance Dashboard & Data Protection Lakehouse
Tech Stack: Databricks, AWS APIs, Python, SQL
The Challenge
The native AWS Management Console has a critical blind spot: it displays backup status only for resources that are already configured for backup. It does not help identify resources that should be in the backup scope but were missed, often because of “shadow IT” or misconfigurations.
The Solution
I developed a Databricks Data Lakehouse solution to serve as the "Source of Truth" for data protection compliance:
Inventory Reconciliation: Built pipelines to ingest two distinct datasets: a real-time inventory of cloud resources (EC2, RDS, EBS) and AWS Backup logs.
Gap Analysis Logic: Engineered SQL logic to flag resources that existed in production environments but had no backup configuration.
Drift Detection: Created logic to compare Expected Backups vs. Actual Jobs to identify silent failures caused by degraded permissions or stuck jobs.
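The gap analysis and drift detection checks above can be sketched as two set differences. This is a minimal illustration in plain Python, not the production SQL: the function names, the BackupJob record, the sample ARNs, and the assumption that a job in the "COMPLETED" state counts as a successful backup are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupJob:
    """Hypothetical record for one row of the backup job log."""
    resource_arn: str
    state: str  # assumed values such as "COMPLETED", "FAILED", "RUNNING"


def find_unprotected(inventory_arns, protected_arns):
    """Gap analysis: resources in the inventory with no backup configuration.

    Equivalent to an anti-join of the inventory against the protected set.
    """
    return sorted(set(inventory_arns) - set(protected_arns))


def find_silent_failures(protected_arns, jobs):
    """Drift detection: resources expected to be backed up that have no
    completed job among the jobs supplied (e.g. the last 24 hours)."""
    completed = {job.resource_arn for job in jobs if job.state == "COMPLETED"}
    return sorted(set(protected_arns) - completed)


# Illustrative data only; these ARNs are placeholders, not real resources.
inventory = ["arn:example:ec2/a", "arn:example:rds/b", "arn:example:ebs/c"]
protected = ["arn:example:ec2/a", "arn:example:rds/b"]
recent_jobs = [
    BackupJob("arn:example:ec2/a", "COMPLETED"),
    BackupJob("arn:example:rds/b", "FAILED"),
]

unprotected = find_unprotected(inventory, protected)
silent_failures = find_silent_failures(protected, recent_jobs)
print("Missing backup configuration:", unprotected)
print("Configured but no completed job:", silent_failures)
```

In this sketch the EBS volume surfaces as a protection gap because it never appears in the protected set, while the RDS instance surfaces as a silent failure because it is configured for backup but has no completed job. In SQL, the same two checks would typically be written as anti-joins between the inventory, configuration, and job tables.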
The Impact
Risk Reduction: Eliminated "silent failures" where assets were assumed to be backed up but were not.
Automated Operations: Deployed triggers to alert the support ops team when protection gaps were detected, reducing the time-to-remediate from weeks to minutes.