Projects

Six projects across the Python data and AI stack, from ETL pipelines to multi-agent systems.

Data Pipeline Automation Script

Planned

A CLI tool that ingests raw data from files or APIs, validates and transforms it with pandas, then loads it into a SQLite or PostgreSQL database. Full schema validation, logging, and pytest coverage.

Python · pandas · SQLAlchemy · Typer · SQLite · pytest
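
Since this project is still planned, the following is only a rough sketch of how the ingest, validate, and load steps described above might fit together. It uses the standard library (`csv` and `sqlite3`) as a stand-in for pandas and SQLAlchemy, and the schema, table name, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical schema: column name -> converter that raises on bad input.
SCHEMA = {"id": int, "name": str, "amount": float}


def validate(rows):
    """Split raw rows into type-converted valid rows and rejected rows."""
    valid, rejected = [], []
    for row in rows:
        try:
            valid.append({col: conv(row[col]) for col, conv in SCHEMA.items()})
        except (KeyError, TypeError, ValueError):
            rejected.append(row)
    return valid, rejected


def load(conn, rows):
    """Create the target table if needed and insert the validated rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO records VALUES (:id, :name, :amount)", rows)
    conn.commit()


# Illustrative input; a real run would read a file or an API response instead.
raw = io.StringIO("id,name,amount\n1,alpha,9.5\n2,beta,not-a-number\n3,gamma,4\n")
valid, rejected = validate(csv.DictReader(raw))

conn = sqlite3.connect(":memory:")
load(conn, valid)
```

Keeping rejected rows separate, rather than dropping them silently, leaves room for the logging the description mentions.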

FastAPI Data Service

Planned

A production-ready REST API built on top of the data pipeline database. Supports filtering, pagination, and aggregation endpoints with full Pydantic validation and Swagger docs. Dockerized for deployment.

Python · FastAPI · SQLAlchemy · Pydantic · Docker · REST API
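
As a hedged illustration of the filtering, pagination, and aggregation behaviour described above, the sketch below shows framework-agnostic query helpers that an endpoint could call. It is not the planned FastAPI code: the `records` table, the `min_amount` filter, and the page-size cap of 100 are assumptions made for the example.

```python
import sqlite3


def list_records(conn, min_amount=None, limit=10, offset=0):
    """Return one page of rows, optionally filtered by a minimum amount."""
    limit = max(1, min(limit, 100))  # clamp page size (assumed cap of 100)
    offset = max(0, offset)
    where, params = "", []
    if min_amount is not None:
        where = "WHERE amount >= ?"
        params.append(min_amount)
    sql = (
        f"SELECT id, name, amount FROM records {where} "
        "ORDER BY id LIMIT ? OFFSET ?"
    )
    # Values are always bound as parameters, never interpolated into the SQL.
    return conn.execute(sql, [*params, limit, offset]).fetchall()


def total_amount(conn):
    """Aggregate example: sum of all amounts, 0 for an empty table."""
    return conn.execute("SELECT COALESCE(SUM(amount), 0) FROM records").fetchone()[0]


# Illustrative data: seven rows with amounts 1.0 through 7.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(i, f"item{i}", float(i)) for i in range(1, 8)],
)

page = list_records(conn, min_amount=3, limit=2, offset=1)
```

A stable `ORDER BY` matters here: without it, `LIMIT`/`OFFSET` pages are not guaranteed to be consistent between requests.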

ML Model Training & Serving Pipeline

Planned

An end-to-end machine learning pipeline: data preprocessing, model training with experiment tracking via MLflow, and a FastAPI prediction endpoint. Reproducible and deployable.

Python · scikit-learn · MLflow · FastAPI · pandas · XGBoost
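
The train, track, and predict stages described above could be shaped roughly as follows. This is a toy sketch only: a hand-written least-squares line stands in for a scikit-learn or XGBoost model, a plain list stands in for MLflow run tracking, and `predict` stands in for what a prediction endpoint might call. All names and data are hypothetical.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def predict(model, x):
    """Apply a fitted (slope, intercept) model to one input."""
    slope, intercept = model
    return slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of the model on held-out data."""
    return fmean([(predict(model, x) - y) ** 2 for x, y in zip(xs, ys)])


runs = []  # stand-in for an experiment tracker: one dict per training run


def train_and_track(train, test, run_name):
    """Fit on the training split and record parameters and metrics."""
    model = fit_line(*train)
    runs.append(
        {"run": run_name, "n_train": len(train[0]), "test_mse": mse(model, *test)}
    )
    return model


# Illustrative data generated from y = 2x + 1.
model = train_and_track(([0, 1, 2, 3], [1, 3, 5, 7]), ([4, 5], [9, 11]), "baseline")
```

Recording the metric alongside the run name at training time is what makes later runs comparable.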

Agentic Data Analyst

Planned

An AI agent that accepts natural language questions about a dataset and autonomously writes SQL queries, runs pandas operations, generates charts, and explains the results. Built with LangGraph and Streamlit.

Python · LangGraph · LangChain · OpenAI · SQLite · Streamlit
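
One possible shape for the question-to-SQL-to-result step described above is sketched below. The language model call is replaced by a hard-coded stub (`fake_generate_sql`, a hypothetical name), and LangGraph is not used. The `sales` table and the SELECT-only guard are assumptions for the example, not confirmed parts of the planned design.

```python
import sqlite3


def fake_generate_sql(question):
    """Hypothetical stand-in for the LLM call that turns a question into SQL."""
    return (
        "SELECT region, SUM(amount) AS total FROM sales "
        "GROUP BY region ORDER BY total DESC"
    )


def answer(conn, question, generate_sql=fake_generate_sql):
    """Generate SQL for a question, run it, and return a structured result."""
    sql = generate_sql(question)
    # Assumed guard: only execute statements that start with SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only read-only SELECT statements are executed")
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    return {"question": question, "sql": sql, "columns": columns, "rows": cursor.fetchall()}


# Illustrative dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 25.0), ("north", 5.0)],
)

result = answer(conn, "Which region sold the most?")
```

Returning the generated SQL along with the rows gives a later explanation step something concrete to cite.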

Automated ML Pipeline with Orchestration

Planned

A scheduled ML pipeline that detects new data, retrains the model, compares it against production, and auto-promotes if performance improves. Orchestrated with Prefect.

Python · Prefect · scikit-learn · MLflow · PostgreSQL · MLOps
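
The detect, retrain, compare, and promote cycle described above can be reduced to a small decision function. The sketch below is plain Python without Prefect. The `ModelVersion` type, the accuracy metric, the row-count check for new data, and the `min_gain` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    name: str
    accuracy: float  # metric on a shared holdout set (assumed)


def choose_production(current, candidate, min_gain=0.0):
    """Promote the candidate only if it beats production by more than min_gain."""
    if candidate.accuracy - current.accuracy > min_gain:
        return candidate
    return current


def run_cycle(current, rows_seen, rows_available, retrain):
    """One scheduled run: skip if no new data, otherwise retrain and compare."""
    if rows_available <= rows_seen:
        return current  # nothing new since the last run
    candidate = retrain()
    return choose_production(current, candidate)


# Illustrative run: new rows arrived and the retrained model scores higher.
production = ModelVersion("v1", 0.90)
promoted = run_cycle(
    production,
    rows_seen=1000,
    rows_available=1200,
    retrain=lambda: ModelVersion("v2", 0.93),
)
```

A non-zero `min_gain` is one way to avoid promoting a model on noise-level improvements.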

Multi-Agent Research & Report System

Planned

A four-agent system where specialized AI agents collaborate to produce structured research reports. A Researcher, Analyst, Writer, and Reviewer work in sequence via LangGraph with a Streamlit interface.

Python · LangGraph · CrewAI · RAG · OpenAI · Streamlit · Chroma
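
The sequential hand-off between the four agents described above might look something like this in outline. Each agent here is a plain function over a shared state dict, with no LangGraph and no model calls. The state keys and the placeholder logic inside each agent are hypothetical.

```python
def researcher(state):
    """Gather raw notes on the topic (placeholder logic)."""
    return {**state, "notes": [f"finding about {state['topic']}"]}


def analyst(state):
    """Turn notes into insights (placeholder: uppercase them)."""
    return {**state, "insights": [note.upper() for note in state["notes"]]}


def writer(state):
    """Assemble a draft report from the insights."""
    draft = "Report on " + state["topic"] + ": " + "; ".join(state["insights"])
    return {**state, "draft": draft}


def reviewer(state):
    """Approve the draft if it is non-empty (placeholder check)."""
    return {**state, "approved": len(state["draft"]) > 0}


# Agents run in a fixed order; each receives the previous agent's state.
PIPELINE = [researcher, analyst, writer, reviewer]


def run(topic):
    state = {"topic": topic}
    for agent in PIPELINE:
        state = agent(state)
    return state


report = run("solar")
```

Because each agent returns a new dict rather than mutating its input, intermediate states can be logged or inspected between steps.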
