Modular ML Experiments Framework

A systematic framework for machine learning experiments with modular workflow patterns. Built from lessons learned through iterative experimentation to enable reproducible research.


Project Structure

experiments-framework/
├── notebooks/                                  # all jupyter notebooks
│   ├── scripts/                               # notebook utility scripts
│   ├── templates/                             # clean starting templates
│   │   ├── 01_preprocessing.working.ipynb     # data prep template
│   │   ├── 02_annotation.working.ipynb        # labeling template
│   │   ├── 03_training.working.ipynb          # model training template
│   │   └── systems.working.ipynb              # infrastructure template
│   ├── machine_learning/                      # ml development
│   │   ├── 01_preprocessing.dev.ipynb         # active preprocessing
│   │   ├── 02_annotation.dev.ipynb            # active annotation
│   │   ├── 03_training.dev.ipynb              # active training
│   │   ├── preprocessing/                     # preprocessing experiments
│   │   ├── annotation/                        # annotation experiments
│   │   └── training/                          # training experiments
│   └── systems/                               # infrastructure notebooks
│       └── systems.dev.ipynb                  # systems development
├── data/                                      # data organization
│   ├── raw/                                   # original recordings
│   ├── clips/                                 # extracted video clips
│   ├── frames/                                # extracted images
│   └── annotations/                           # labels and metadata
├── configs/                                   # workflow configurations
├── models/                                    # trained ml models
├── scripts/                                   # project setup scripts
│   ├── setup_experiments_structure.sh         # creates dirs/notebooks
│   ├── setup_provenance.sh                    # tracks repo evolution
│   └── setup_orcid.sh                         # citation setup
├── lib/                                       # reusable code modules
│   └── notebook_tools/                        # notebook utilities
├── references/                                # citations and refs
├── environment.yml                            # conda environment
└── PROVENANCE.md                              # repo history tracking

Scope

  • Modular workflow patterns for preprocessing, annotation, and training
  • Clear separation of concerns across directories
  • Reproducible structure for researchers and students
  • Minimal dependencies with documented environment setup

Previous Work

  • Primary (ACTIVE) development repo → traffic-vision-v0.4
  • Prior (DEPRECATED) experimental repo → experiments-test
  • Achieved successful vehicle counting on 21 of 30 GDOT traffic camera feeds
  • The earlier framework failed due to monolithic notebooks and environment conflicts
  • Running notebooks one at a time became a bottleneck

This repo rebuilds the workflow framework to be modular and scalable.

Quick Start

  1. Clone and set up:

    git clone https://github.com/iTrauco/experiments-framework.git
    cd experiments-framework
    chmod +x scripts/*.sh
    ./scripts/setup_experiments_structure.sh
  2. Track your work:

    ./scripts/setup_provenance.sh
  3. Start developing in the .dev notebooks or copy templates to begin new experiments.


Notebook Tools Installation

cd /path/to/notebook_tools
pip install -e .

This installs the library in "editable" mode: any changes you make to the code are available immediately, without reinstalling.

⚠️ Development Status: All modules in lib/ are early-stage development prototypes. Functionality is still being worked out; some modules may be dead code, others are spaghetti. I'm creating modular packages as I identify what's killing my bandwidth.


Reproducibility Framework

Environment Setup

This project uses a Conda environment to manage dependencies for reproducible analysis. Follow these steps to set up the environment:

Prerequisites

  • Anaconda or Miniconda installed on your system
  • Git for cloning the repository

Setup Instructions

  1. Clone the repository:

    git clone https://github.com/iTrauco/experiments-framework.git
    cd experiments-framework
  2. Create the Conda environment:

    conda create -n traffic-vision-env python=3.11 -y
  3. Activate the environment:

    conda activate traffic-vision-env
  4. Install baseline packages:

    conda install -c conda-forge jupyter numpy pandas matplotlib seaborn scikit-learn opencv -y
  5. Install deep learning and computer vision packages (the index URL below selects PyTorch builds for CUDA 11.8):

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    pip install ultralytics supervision
  6. Launch Jupyter Notebook:

    jupyter notebook
  7. Access the notebook in your browser via the URL displayed in the terminal.
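
Once the environment is set up, a quick check that the packages resolve can save a confusing first notebook run. The following is a minimal sketch and not part of the repository: find_missing and EXPECTED_MODULES are hypothetical names, and the list simply mirrors the packages installed in steps 4 and 5.

```python
import importlib.util

# Import names for the packages installed in steps 4 and 5.
# Some import names differ from the install names:
# scikit-learn imports as "sklearn" and opencv as "cv2".
EXPECTED_MODULES = [
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn", "cv2",
    "torch", "torchvision", "torchaudio", "ultralytics", "supervision",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All expected packages are importable.")
```

Run it with traffic-vision-env active. The check only confirms that each package can be located; it does not confirm that the CUDA build can see a GPU, which torch.cuda.is_available() reports separately.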


Environment Details

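The environment includes essential data science and computer vision packages, as installed in steps 4 and 5 above:

  • From conda-forge: jupyter, numpy, pandas, matplotlib, seaborn, scikit-learn, opencv
  • From pip: torch, torchvision, torchaudio, ultralytics, supervision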


Scripts

setup_experiments_structure.sh

Creates the complete project directory structure through an interactive menu. Features:

  • Creates the structure in the parent directory (../) by default
  • Remembers the last used location
  • Provides a rollback option to undo changes
  • Installs GitHub Actions and pre-commit hooks
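
The script itself is a shell script whose internals are not reproduced here. As a rough illustration of the layout it produces, the Python sketch below creates the directories from the Project Structure listing. The names DIRECTORIES and create_structure are hypothetical, and the interactive menu, location memory, and rollback are left out.

```python
from pathlib import Path

# Directories taken from the Project Structure listing.
DIRECTORIES = [
    "notebooks/scripts",
    "notebooks/templates",
    "notebooks/machine_learning/preprocessing",
    "notebooks/machine_learning/annotation",
    "notebooks/machine_learning/training",
    "notebooks/systems",
    "data/raw",
    "data/clips",
    "data/frames",
    "data/annotations",
    "configs",
    "models",
    "scripts",
    "lib/notebook_tools",
    "references",
]


def create_structure(base_dir, project_name="experiments-framework"):
    """Create the directory tree under base_dir.

    Returns the listed directories that did not exist before the call.
    Intermediate parents are created as needed but are not reported.
    """
    root = Path(base_dir) / project_name
    created = []
    for relative in DIRECTORIES:
        path = root / relative
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    return created
```

Calling create_structure("..") mirrors the script's documented default of creating the project in the parent directory. A second call returns an empty list because nothing is overwritten.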

setup_provenance.sh

Documents your experimental evolution across repository rebuilds:

  • Links to previous repos and branches
  • Records what you tested and learned
  • Builds a timeline in PROVENANCE.md
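
To make the idea of a timeline concrete, here is a small sketch of appending one dated entry to a PROVENANCE.md-style file. The entry layout and the function name append_provenance_entry are assumptions made for illustration; the format setup_provenance.sh actually writes may differ.

```python
from datetime import date
from pathlib import Path


def append_provenance_entry(provenance_path, previous_repo, tested, learned, when=None):
    """Append one dated timeline entry (hypothetical format) and return its text."""
    when = when or date.today()
    entry = (
        f"\n## {when.isoformat()}\n\n"
        f"- Previous repo: {previous_repo}\n"
        f"- Tested: {tested}\n"
        f"- Learned: {learned}\n"
    )
    path = Path(provenance_path)
    # Append mode: earlier entries are never rewritten, so the file
    # grows into a chronological record.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

For example, append_provenance_entry("PROVENANCE.md", "experiments-test", "monolithic notebooks", "environment conflicts block reuse") adds one section dated today.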

setup_orcid.sh

Adds ORCID identifier and citation infrastructure to your project.


Workflows

Development Flow

  1. Copy templates from notebooks/templates/ to start new work
  2. Develop in .dev notebooks at the machine_learning level
  3. Create experimental variations in subdirectories
  4. Track successful patterns in LESSONS_LEARNED.md files
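
Steps 1 and 3 amount to copying a template into one of the experiment subdirectories. A minimal sketch of that copy, assuming the directory names from the Project Structure listing, is shown below. The helper start_experiment is hypothetical and is not shipped in lib/.

```python
import shutil
from pathlib import Path


def start_experiment(repo_root, template_name, stage, experiment_name):
    """Copy a template notebook into a stage's experiment subdirectory.

    stage is expected to be "preprocessing", "annotation", or "training".
    """
    root = Path(repo_root)
    source = root / "notebooks" / "templates" / template_name
    target_dir = root / "notebooks" / "machine_learning" / stage
    target = target_dir / f"{experiment_name}.ipynb"

    if not source.is_file():
        raise FileNotFoundError(f"Template not found: {source}")
    if target.exists():
        # Never clobber an experiment that is already in progress.
        raise FileExistsError(f"Refusing to overwrite: {target}")

    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target
```

From the repository root, start_experiment(".", "01_preprocessing.working.ipynb", "preprocessing", "resize_baseline") would create notebooks/machine_learning/preprocessing/resize_baseline.ipynb. The experiment name here is only an example.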

GitHub Actions

Automatically converts notebooks to Markdown on every push, which makes documentation and diffs easier to read.
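
The workflow file is not shown here, and a tool such as jupyter nbconvert --to markdown is the usual way to do this conversion. To show why the result diffs more cleanly than raw notebook JSON, the sketch below does a simplified conversion using only the standard library. The function name notebook_to_markdown is hypothetical, and outputs are deliberately ignored.

```python
import json

# Built at runtime so this example does not contain a literal code fence.
FENCE = "`" * 3


def notebook_to_markdown(notebook_json, code_language="python"):
    """Render the cells of an nbformat-4 notebook (a JSON string) as Markdown."""
    notebook = json.loads(notebook_json)
    parts = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            parts.append(f"{FENCE}{code_language}\n{source}\n{FENCE}")
    return "\n\n".join(parts) + "\n"
```

Markdown cells pass through unchanged and code cells become fenced blocks, so a one-line change in a notebook shows up as a one-line change in the converted file.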

Pre-commit Hooks

Validates notebook metadata and strips cell outputs before each commit.
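
The hook configuration is not reproduced here. As an illustration of what cleaning outputs and validating metadata can mean for an nbformat-4 notebook, consider the sketch below. Both function names are hypothetical, and checking for a kernelspec name is only one example of a metadata rule, not necessarily the one the hooks enforce.

```python
import json


def strip_outputs(notebook_json):
    """Return notebook JSON with code-cell outputs and execution counts cleared."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(notebook, indent=1) + "\n"


def has_kernelspec(notebook_json):
    """Example metadata check: is a kernelspec name recorded in the notebook?"""
    metadata = json.loads(notebook_json).get("metadata", {})
    return bool(metadata.get("kernelspec", {}).get("name"))
```

Clearing outputs before a commit keeps large images and run-specific counters out of version control, which is what makes notebook diffs reviewable.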


Environment Management

For collaborators who enhance the environment with additional packages:

# Export the updated environment
conda activate traffic-vision-env
conda env export > environment.yml

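The exported file records every dependency and its version, so others can recreate the environment with conda env create -f environment.yml. By default the export includes platform-specific build strings; conda env export --no-builds produces a file that is more portable across operating systems.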


Author: Christopher Trauco | ORCID: 0009-0005-8113-6528
