Rodney Gitonga

Lab Challenges

A log of technical problems solved, CTFs completed, and practical experiments in data engineering and analytics.

Database Performance

Optimizing High-Volume Transaction Queries

View Write-up
The Problem

A critical daily report query was taking 45 minutes to execute, causing delays in morning briefings. The table contained over 10 million rows with multiple joins.

The Approach

Analyzed the execution plan and identified a full table scan on the transaction log table. Created a composite non-clustered index on transaction_date and branch_id. Refactored the query to replace a correlated subquery with a CTE.

Key Lessons

"Indexing strategies must align with query patterns. CTEs can improve readability and sometimes performance over subqueries."

Tools Used

SQL Server Management StudioSQL Execution PlanDatabase Engine Tuning Advisor
Power BI & Streaming

Real-time Fraud Detection Dashboard

View Write-up
The Problem

Fraud analysts needed to see suspicious transaction patterns in real-time, but the existing report was only updated once a day.

The Approach

Set up a Power BI streaming dataset connected to a Python script that monitored the transaction log API. Designed a dashboard with card visuals for immediate alerts and a line chart for trend analysis over the last hour.

Key Lessons

"Real-time dashboards require a different design philosophy than historical reports—focus on immediate outliers rather than deep aggregation."

Tools Used

Power BI ServicePython (Requests, Pandas)REST APIs
Python Automation

Automated Data Cleaning Pipeline

View Write-up
The Problem

Creating the monthly sales report involved manually cleaning and merging 15 different Excel files from various branches, taking 6 hours of manual work.

The Approach

Wrote a Python script using Pandas to iterate through the folder, read each Excel file, standardize column names, handle missing values, and merge them into a single master dataset ready for analysis.

Key Lessons

"Automation not only saves time but drastically reduces human error in data preparation tasks."

Tools Used

PythonPandasOpenPyXL