Modular ML Experiments Framework

A systematic framework for machine learning experiments with modular workflow patterns. Built from lessons learned through iterative experimentation to enable reproducible research.

Project Structure

experiments-framework/
├── notebooks/                                  # all jupyter notebooks
│   ├── scripts/                               # notebook utility scripts
│   ├── templates/                             # clean starting templates
│   │   ├── 01_preprocessing.working.ipynb     # data prep template
│   │   ├── 02_annotation.working.ipynb        # labeling template
│   │   ├── 03_training.working.ipynb          # model training template
│   │   └── systems.working.ipynb              # infrastructure template
│   ├── machine_learning/                      # ml development
│   │   ├── 01_preprocessing.dev.ipynb         # active preprocessing
│   │   ├── 02_annotation.dev.ipynb            # active annotation
│   │   ├── 03_training.dev.ipynb              # active training
│   │   ├── preprocessing/                     # preprocessing experiments
│   │   ├── annotation/                        # annotation experiments
│   │   └── training/                          # training experiments
│   └── systems/                               # infrastructure notebooks
│       └── systems.dev.ipynb                  # systems development
├── data/                                      # data organization
│   ├── raw/                                   # original recordings
│   ├── clips/                                 # extracted video clips
│   ├── frames/                                # extracted images
│   └── annotations/                           # labels and metadata
├── configs/                                   # workflow configurations
├── models/                                    # trained ml models
├── scripts/                                   # project setup scripts
│   ├── setup_experiments_structure.sh         # creates dirs/notebooks
│   ├── setup_provenance.sh                    # tracks repo evolution
│   └── setup_orcid.sh                         # citation setup
├── lib/                                       # reusable code modules
│   └── notebook_tools/                        # notebook utilities
├── references/                                # citations and refs
├── environment.yml                            # conda environment
└── PROVENANCE.md                              # repo history tracking
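
After running the setup script, the layout above can be sanity-checked from Python. This is a minimal sketch: the helper name `missing_dirs` and the `EXPECTED_DIRS` list are illustrative assumptions derived from the tree above, not part of the repo's scripts.

```python
from pathlib import Path

# Directories taken from the project tree above; adjust if your layout differs.
EXPECTED_DIRS = [
    "notebooks/scripts",
    "notebooks/templates",
    "notebooks/machine_learning/preprocessing",
    "notebooks/machine_learning/annotation",
    "notebooks/machine_learning/training",
    "notebooks/systems",
    "data/raw",
    "data/clips",
    "data/frames",
    "data/annotations",
    "configs",
    "models",
    "scripts",
    "lib/notebook_tools",
    "references",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root (hypothetical helper)."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  {d}")
    else:
        print("Project structure looks complete.")
```

Run it from the repository root; an empty result means every directory in the tree is present.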

Scope

Modular workflow patterns for preprocessing, annotation, and training
Clear separation of concerns across directories
Reproducible structure for researchers and students
Minimal dependencies with documented environment setup

Previous Work

Primary (ACTIVE) development repo → traffic-vision-v0.4
Prior (DEPRECATED) experimental repo → experiments-test
Achieved successful vehicle counting on 21 of 30 GDOT traffic camera feeds
Framework failed due to monolithic notebooks and environment conflicts
Individual notebook execution became a bottleneck

This repo rebuilds the workflow framework to be modular and scalable.

Quick Start

Clone and setup:

git clone https://github.com/iTrauco/experiments-framework.git
cd experiments-framework
chmod +x scripts/*.sh
./scripts/setup_experiments_structure.sh

Track your work:
```
./scripts/setup_provenance.sh
```
Start developing in the .dev notebooks or copy templates to begin new experiments.

Notebook Tools Installation

cd /path/to/notebook_tools
pip install -e .

This installs the library in "editable" mode - any changes you make to the code are immediately available without reinstalling.

⚠️ Development Status: All modules in lib/ are early-stage development prototypes. Functionality is still being worked out: some modules may be dead code, and others are spaghetti. I'm creating modular packages as I identify what's killing my bandwidth.

Reproducibility Framework

Environment Setup

This project uses a Conda environment to manage dependencies for reproducible analysis. Follow these steps to set up the environment:

Prerequisites

Anaconda or Miniconda installed on your system
Git for cloning the repository

Setup Instructions

Clone the repository:

git clone https://github.com/iTrauco/experiments-framework.git
cd experiments-framework

Create the Conda environment:

conda create -n traffic-vision-env python=3.11 -y

Activate the environment:
```
conda activate traffic-vision-env
```

Install baseline packages:

conda install -c conda-forge jupyter numpy pandas matplotlib seaborn scikit-learn opencv -y

Install deep learning and computer vision packages (the cu118 index URL selects PyTorch wheels built for CUDA 11.8):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install ultralytics supervision

Launch Jupyter Notebook:
```
jupyter notebook
```
Access the notebook in your browser via the URL displayed in the terminal.

Environment Details

The environment includes essential data science and computer vision packages:

Python 3.11
Jupyter Notebook
pandas & numpy for data manipulation
matplotlib & seaborn for visualization
scikit-learn for traditional ML algorithms
OpenCV for image and video processing
PyTorch for deep learning model development
Ultralytics for YOLO object detection
Supervision for object tracking utilities
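
To confirm the packages listed above are visible to the active environment, a small check like the following can be run inside it. This is a sketch under assumptions: the function name `check_environment` is hypothetical, and the import names (for example `sklearn` for scikit-learn and `cv2` for OpenCV) are the usual ones for these packages.

```python
import importlib.util

# Usual import names for the packages listed above.
IMPORT_NAMES = (
    "numpy",
    "pandas",
    "matplotlib",
    "seaborn",
    "sklearn",      # scikit-learn
    "cv2",          # OpenCV
    "torch",
    "torchvision",
    "ultralytics",
    "supervision",
)


def check_environment(names=IMPORT_NAMES):
    """Map each import name to True if it can be found in the current environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name:12s} {'ok' if ok else 'MISSING'}")
```

The check only locates the packages and does not import them, so it stays fast. If `torch` is present, `torch.cuda.is_available()` reports whether the CUDA build can see a GPU.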

Scripts

setup_experiments_structure.sh

Creates the complete project directory structure through an interactive menu. Features:

Creates the structure in the parent directory (../) by default
Remembers the last used location
Includes a rollback option to undo the setup
Installs GitHub Actions and pre-commit hooks

setup_provenance.sh

Documents your experimental evolution across repository rebuilds:

Links to previous repos and branches
Records what you tested and learned
Builds a timeline in PROVENANCE.md
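
The idea of an append-only timeline can be illustrated in Python. This is only a sketch: the entry layout, the function name `append_provenance_entry`, and its parameters are assumptions made for illustration, and the actual format written by setup_provenance.sh may differ.

```python
from datetime import date
from pathlib import Path


def append_provenance_entry(path, previous_repo, tested, learned, when=None):
    """Append one dated entry to a provenance file (hypothetical entry format)."""
    when = when or date.today()
    entry = (
        f"\n## {when.isoformat()}\n\n"
        f"- Previous repo: {previous_repo}\n"
        f"- Tested: {tested}\n"
        f"- Learned: {learned}\n"
    )
    # Open in append mode so earlier entries are never rewritten.
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry


if __name__ == "__main__":
    append_provenance_entry(
        "PROVENANCE.md",
        previous_repo="experiments-test",
        tested="vehicle counting on GDOT traffic camera feeds",
        learned="monolithic notebooks became a bottleneck",
    )
```

Appending rather than rewriting keeps the file a chronological record, with the oldest entries at the top.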

setup_orcid.sh

Adds ORCID identifier and citation infrastructure to your project.

Workflows

Development Flow

Copy templates from notebooks/templates/ to start new work
Develop in .dev notebooks at the machine_learning level
Create experimental variations in subdirectories
Track successful patterns in LESSONS_LEARNED.md files
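
The first and third steps above (copying a template, then creating an experimental variation in a subdirectory) can be sketched in Python. The function name `start_experiment` and the `<stage>.<name>.ipynb` naming scheme are assumptions for illustration; the template and directory names come from the project tree.

```python
import shutil
from pathlib import Path

# Template prefix -> experiments subdirectory, as shown in the project tree.
STAGES = {
    "01_preprocessing": "preprocessing",
    "02_annotation": "annotation",
    "03_training": "training",
}


def start_experiment(stage, name, root="."):
    """Copy a template notebook into its experiments subdirectory.

    stage: one of the keys in STAGES.
    name:  hypothetical experiment label used in the new file name.
    """
    if stage not in STAGES:
        raise ValueError(f"Unknown stage {stage!r}; expected one of {sorted(STAGES)}")

    root = Path(root)
    template = root / "notebooks" / "templates" / f"{stage}.working.ipynb"
    if not template.is_file():
        raise FileNotFoundError(f"No template found at {template}")

    target_dir = root / "notebooks" / "machine_learning" / STAGES[stage]
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / f"{stage}.{name}.ipynb"
    # Refuse to overwrite an existing experiment notebook.
    if target.exists():
        raise FileExistsError(f"{target} already exists; pick another name")
    shutil.copy2(template, target)
    return target
```

For example, `start_experiment("03_training", "baseline")` would create notebooks/machine_learning/training/03_training.baseline.ipynb from the training template.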

GitHub Actions

Automatically converts notebooks to markdown on push for better documentation and diffs.
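
The workflow file itself is not shown here; a common route for this conversion is `jupyter nbconvert --to markdown`. To illustrate what the conversion does, here is a standard-library sketch that renders the cells of an nbformat v4 notebook as markdown. The function name `notebook_to_markdown` is hypothetical, code cells are assumed to be Python, and cell outputs are ignored.

```python
import json

FENCE = "`" * 3  # markdown code fence, built here to keep this example readable


def notebook_to_markdown(notebook_json):
    """Render an nbformat v4 notebook (given as a JSON string) as markdown text."""
    nb = json.loads(notebook_json)
    parts = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            parts.append(f"{FENCE}python\n{source}\n{FENCE}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py notebook.ipynb > notebook.md
    with open(sys.argv[1], encoding="utf-8") as fh:
        sys.stdout.write(notebook_to_markdown(fh.read()))
```

Plain-text markdown diffs line by line, which is why converted notebooks are easier to review than raw .ipynb JSON.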

Pre-commit Hooks

Validates notebook metadata and cleans outputs before commits.
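
What "cleaning outputs" and "validating metadata" can mean in practice is sketched below with the standard library only. The repo's actual hooks are not shown in this README, so both function names and the choice of `kernelspec` as the validated field are assumptions; tools such as nbstripout handle this more thoroughly.

```python
import json


def strip_outputs(notebook_json):
    """Return the notebook JSON with code-cell outputs and execution counts cleared."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(nb, indent=1, ensure_ascii=False) + "\n"


def has_kernelspec(notebook_json):
    """Hypothetical metadata check: does the notebook record a kernelspec?"""
    nb = json.loads(notebook_json)
    return "kernelspec" in nb.get("metadata", {})
```

Clearing outputs before a commit keeps large images and run counters out of version control, so diffs show only changes to code and text.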

Environment Management

For collaborators who enhance the environment with additional packages:

# Export the updated environment
conda activate traffic-vision-env
conda env export > environment.yml

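This records every installed dependency and its version so collaborators can recreate the environment. A full export also pins platform-specific builds, so it reproduces most reliably on the same operating system.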

Author: Christopher Trauco | ORCID: 0009-0005-8113-6528
