This repository is the official codebase for the following NINeS 2026 conference paper:
TURBO: Utility-Aware Bandwidth Allocation for Cloud-Augmented Autonomous Control. Peter Schafhalter∗, Alexander Krentsel∗, Hongbo Wei, Joseph E. Gonzalez, Sylvia Ratnasamy (UC Berkeley), Scott Shenker (UC Berkeley and ICSI), Ion Stoica (UC Berkeley).
For a talk about this research project, see this video presentation.
This repository contains a research prototype system that optimizes object detection accuracy for autonomous vehicles by dynamically allocating bandwidth and selecting model configurations across multiple perception services based on real-time network conditions.
Developed by students at UC Berkeley NetSys Lab.
An autonomous vehicle running multiple camera-based perception services faces a fundamental challenge: how to maximize detection accuracy when offloading inference to the cloud over a bandwidth-constrained network?
This system solves that problem through:
- High-performance QUIC transport: s2n-quic with BBR congestion control enables efficient, multiplexed data transfer between AV and cloud
- Utility-based bandwidth allocation: Linear programming solver runs every 500ms to optimally allocate bandwidth across services, maximizing total detection accuracy
- Adaptive model selection: Dynamically switches between EfficientDet variants (D1-D7x) and compression strategies based on real-time network conditions
- SLO-aware processing: LIFO queue management and timeout enforcement ensure only fresh, timely detection results are used for driving decisions
Key Features:
✅ Multi-camera support — Simultaneous perception from multiple USB cameras (FRONT, FRONT_LEFT, FRONT_RIGHT)
✅ LP-based bandwidth allocation — Utility optimization solver runs every 500ms to maximize detection accuracy
✅ High-performance QUIC transport — s2n-quic (Rust) with BBR congestion control for efficient network utilization
✅ LIFO queue management — Prioritizes fresh frames, dropping stale data to meet latency SLOs
✅ Zero-copy IPC — Shared memory + ZeroMQ for efficient data transfer between components
✅ Adaptive model selection — Dynamically switches between 5 EfficientDet variants (D1-D7x) and compression strategies
✅ Real-time monitoring — Web dashboard with bandwidth allocation, service status, and network utilization plots
✅ Comprehensive logging — Structured Parquet output for experiment analysis and reproducibility
TURBO is a distributed system with two main components:
Running on the AV's onboard computer (e.g., NVIDIA Jetson):
- Camera Streams — Capture frames from multiple USB cameras (FRONT, FRONT_LEFT, FRONT_RIGHT)
- Client Processes — One per camera, handles image preprocessing and compression based on allocated model configuration
- Bandwidth Allocator — Runs a linear programming solver every 500ms to determine optimal bandwidth allocation and model selection for each service
- QUIC Client — High-performance Rust binary that manages per-service bidirectional streams, enforces bandwidth limits, and implements LIFO queue management
- Ping Handler — Measures network RTT to the cloud server using ICMP pings
Running on a GPU-equipped cloud instance (e.g., H100):
- QUIC Server — Rust binary that receives image data over multiplexed QUIC streams
- Model Servers — One per service, runs EfficientDet inference on GPU and returns detection results
┌─────────────── AV (Client) ───────────────┐ ┌────── Cloud (Server) ──────┐
│ │ │ │
│ Camera → Client → QUIC Client │ │ QUIC Server → ModelServer │
│ Camera → Client → QUIC Client │──QUIC─│ QUIC Server → ModelServer │
│ Camera → Client → QUIC Client │ │ QUIC Server → ModelServer │
│ ↑ │ │ │
│ Bandwidth Allocator │ └────────────────────────────┘
│ (LP Solver + RTT) │
│ │
└────────────────────────────────────────────┘
Key workflow:
- Cameras continuously capture frames and place them in shared memory
- Each Client reads frames, applies preprocessing/compression according to its assigned model configuration, and sends to QUIC Client
- QUIC Client manages per-service streams with bandwidth enforcement and LIFO queuing, transmitting over QUIC to the cloud
- QUIC Server receives images and forwards to ModelServers for GPU inference
- ModelServers return detection results (bounding boxes, scores) back through QUIC
- Bandwidth Allocator monitors network conditions (bandwidth from QUIC, RTT from pings) and runs LP solver to update model configurations
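To make the LIFO/SLO behavior in this workflow concrete, here is a minimal Python sketch of a queue that always forwards the freshest frame and drops anything past its latency budget. It is illustrative only: the real queue lives in the Rust QUIC client, and the class name and 100 ms threshold below are assumptions made for the example.

```python
# Hypothetical sketch of LIFO queue management with SLO-based dropping.
# The real logic lives in the Rust QUIC client; the class name, fields, and
# 100 ms staleness threshold here are illustrative assumptions.
import time
from collections import deque


class LifoFrameQueue:
    def __init__(self, max_age_s: float = 0.1):
        self._frames = deque()       # newest frame sits at the right end
        self._max_age_s = max_age_s  # per-frame latency budget (SLO)

    def put(self, frame_bytes: bytes) -> None:
        self._frames.append((time.monotonic(), frame_bytes))

    def get_freshest(self):
        """Return the newest frame still within the SLO; drop everything else."""
        now = time.monotonic()
        while self._frames:
            ts, frame = self._frames.pop()  # newest first (LIFO)
            if now - ts <= self._max_age_s:
                self._frames.clear()        # older frames are superseded; drop them
                return frame
        return None                         # nothing fresh enough to send
```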
For detailed architecture, see docs/ARCHITECTURE.md.
Client (AV) side:
- Python 3.10, preferably managed via uv (alternatively via Anaconda, specifically the Miniconda3-py310_25.11.1-1 release version on this page)
- Rust 1.70+ (for QUIC transport)
- USB webcams (or video sources)
- Linux (tested on Ubuntu 20.04+)
Server (Cloud) side:
- Python 3.10, preferably managed via uv (alternatively via Anaconda, specifically the Miniconda3-py310_25.11.1-1 release version on this page)
- CUDA-capable GPU (tested on H100, A100)
- PyTorch 2.0+
- Rust 1.70+ (for QUIC transport)
- Fine-tuned EfficientDet model checkpoints (see Model Setup below)
- System dependencies needed for OpenCV (e.g. `sudo apt-get update && sudo apt-get install ffmpeg libsm6 libxext6`)
- Clone the repository and install dependencies:

  ```bash
  git clone https://github.com/NetSys/turbo.git
  cd turbo
  uv sync
  ```

  Alternative: using pip

  ```bash
  pip install .
  ```
- Download fine-tuned EfficientDet model checkpoints:

  The system uses custom EfficientDet models (D1, D2, D4, D6, D7x) fine-tuned on the Waymo Open Dataset for 5-class object detection (vehicle, pedestrian, cyclist, sign, unknown). Download and extract them:

  ```bash
  # Download the model archive
  wget https://storage.googleapis.com/turbo-nines-2026/av-models.zip
  # Extract to a location of your choice (e.g., ~/av-models in this example)
  unzip av-models.zip -d ~
  ```

  After extraction, update the checkpoint paths in your server configuration file (config/server_config_gcloud.yaml) and model config (src/python/model_server/model_config.yaml) to point to the extracted checkpoint files. See docs/MODELS.md for detailed model information and configuration.
- Generate SSL keys for QUIC:

  ```bash
  cd src/quic
  uv run generate_cert.py
  ```

  Alternative: using a pip-installed environment

  ```bash
  python generate_cert.py
  ```

  Copy the generated output files to both your client and server hosts.
- Build QUIC binaries:

  ```bash
  cd src/quic
  cargo build --release
  cd ..
  ```

  You may install the latest version of Rust here.
- Configure the system:

  - Edit config/client_config.yaml for client-side settings
  - Edit config/server_config_gcloud.yaml for server-side settings
  - Edit config/quic_config_client.yaml for QUIC transport settings

  See docs/CONFIGURATION.md for the detailed configuration guide.
On the server (cloud) side:
- Do the following pre-run steps:

  If previous runs were done:

  - Clear all ZeroMQ socket files left over from previous runs, if they exist. In this example, simply remove all contents of the directory containing the ZeroMQ files:

    ```bash
    rm ~/experiment2-out/zmq/*
    ```

  - Stash the log outputs from any previous runs, if they exist, and make sure the directories for storing the log outputs produced by all parts of this system are empty.

  If this is the first run:

  - Make output directories to store each of the log outputs for your current run.
  - Make output directories to store ZeroMQ IPC socket files.

  For reference, the author's output directory structure was created as follows:

  ```bash
  mkdir ~/experiment2-out
  mkdir ~/experiment2-out/zmq
  mkdir ~/experiment2-out/client
  mkdir ~/experiment2-out/server
  mkdir ~/experiment2-out/quic-client-out
  mkdir ~/experiment2-out/quic-server-out
  ```
- Start the QUIC server:

  ```bash
  cd src/quic
  RUST_LOG=info cargo run --release --bin server ../../config/quic_config_gcloud.yaml ${YOUR_SERVER_INTERNAL_IP}:12345
  ```

  (or, if debugging an error, use RUST_BACKTRACE=1 instead of RUST_LOG=...)
- Start the model servers (in a separate terminal):

  ```bash
  cd src/python
  uv run server_main.py -c ../../config/server_config_gcloud.yaml
  ```

  (or python server_main.py ... if using a pip-installed environment)
On the client (AV) side:
- Do the following pre-run steps:

  If previous runs were done:

  - Clear all ZeroMQ socket files left over from previous runs, if they exist. In this example, simply remove all contents of the directory containing the ZeroMQ files:

    ```bash
    rm ~/experiment2-out/zmq/*
    ```

  - Stash the log outputs from any previous runs, if they exist, and make sure the directories for storing the log outputs produced by all parts of this system are empty.

  If this is the first run:

  - Allow ping requests (our PingHandler module needs to send pings from user-land):

    ```bash
    sudo sysctl net.ipv4.ping_group_range='0 2147483647'
    ```

  - Make output directories to store each of the log outputs for your current run.
  - Make output directories to store ZeroMQ IPC socket files.
IMPORTANT: The ordering of the following steps matters due to a behavior in ZeroMQ socket binding. See docs/IPC.md for details.
- Start the client processes (in a separate terminal):

  ```bash
  cd src/python
  uv run client_main.py -c ../../config/client_config.yaml
  ```

  (or python client_main.py ... if using a pip-installed environment)
- Start the web dashboard for real-time monitoring:

  ```bash
  cd src/python/web_frontend
  uv run start_web_dashboard.py --config ../../../config/client_config.yaml
  ```

  (or python start_web_dashboard.py ... if using a pip-installed environment)

  Then open http://0.0.0.0:5000 in your browser.
- Wait 20 seconds (or until you see log messages of the form Client 2: Python waiting for Rust QUIC client handshake), then start the QUIC client:

  ```bash
  cd src/quic
  RUST_LOG=info cargo run --release --bin client ../../config/quic_config_client.yaml ${YOUR_SERVER_EXTERNAL_IP}:12345
  ```

  (or, if debugging an error, use RUST_BACKTRACE=1 instead of RUST_LOG=...)
Experiment output will be logged to Parquet files in the configured output directories (default: ~/experiment2-out/).
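For a quick look at the logs, they can be loaded with Polars. The file name and columns in the sketch below are hypothetical; see the Experiment Logging reference for the actual output formats.

```python
# Hypothetical sketch: inspect a TURBO experiment log with Polars.
# The file name is made up; see the Experiment Logging reference for the
# real file names and schemas.
from pathlib import Path

import polars as pl

log_path = Path("~/experiment2-out/client/example_log.parquet").expanduser()
df = pl.read_parquet(log_path)
print(df.schema)      # which columns were logged
print(df.describe())  # quick summary statistics
```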
- Model Setup & Reference - EfficientDet model download, configuration, and inference details
- System Architecture - Detailed technical architecture, problem setup, bandwidth solver, and end-to-end walkthrough
- Configuration Guide - Complete configuration file reference
- Experiment Logging - Parquet output file formats and logging reference
- IPC Reference - Inter-process communication protocols (ZMQ, shared memory)
Model Configurations:
Each configuration is identified by a string like edd4-imgcomp50-inpcompNone, specifying:
- EfficientDet variant (D1, D2, D4, D6, D7x)
- Image compression strategy (JPEG quality, PNG, or none)
- Input preprocessing compression
See docs/ARCHITECTURE.md#model-configurations for details.
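As an illustration, a configuration string can be split into its three fields as sketched below. This parser is hypothetical and written only to show the naming scheme; it is not the project's actual parsing code.

```python
# Hypothetical sketch: decompose a configuration string such as
# "edd4-imgcomp50-inpcompNone" into its three fields. For illustration only;
# not the project's actual parsing code.
def parse_model_config(name: str) -> dict:
    variant, imgcomp, inpcomp = name.split("-")
    return {
        "efficientdet_variant": variant.removeprefix("ed"),    # e.g. "d4"
        "image_compression": imgcomp.removeprefix("imgcomp"),  # e.g. "50" (JPEG quality)
        "input_compression": inpcomp.removeprefix("inpcomp"),  # e.g. "None"
    }


print(parse_model_config("edd4-imgcomp50-inpcompNone"))
# {'efficientdet_variant': 'd4', 'image_compression': '50', 'input_compression': 'None'}
```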
Utility Curves: The system pre-computes step functions mapping available bandwidth → achievable detection accuracy (mAP) for each model configuration under given network conditions.
Bandwidth Solver: An LP-based allocator runs every 500ms to select the optimal (model, compression) configuration for each service, maximizing total utility subject to bandwidth and SLO constraints.
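To make the formulation concrete, here is a minimal sketch of this kind of allocation problem using PuLP. The candidate configurations, bandwidth costs, and utility values below are made-up numbers; the actual allocator in src/python/bandwidth_allocator.py builds its inputs from measured utility curves and adds SLO constraints.

```python
# Minimal sketch of utility-maximizing allocation with PuLP.
# All numbers below (bandwidth costs in Mbps, utilities in mAP) are
# illustrative assumptions, not measured utility curves.
import pulp

services = ["FRONT", "FRONT_LEFT", "FRONT_RIGHT"]
# (configuration name, required bandwidth in Mbps, utility in mAP)
configs = [("edd1-imgcomp50", 5.0, 0.30),
           ("edd4-imgcomp50", 12.0, 0.42),
           ("edd7x-imgcompNone", 30.0, 0.48)]
available_bw = 40.0  # Mbps, e.g. estimated from the QUIC layer

prob = pulp.LpProblem("bandwidth_allocation", pulp.LpMaximize)
# x[s][c] == 1 if service s uses configuration c
x = {s: {c: pulp.LpVariable(f"x_{si}_{ci}", cat="Binary")
         for ci, (c, _, _) in enumerate(configs)}
     for si, s in enumerate(services)}

# Objective: maximize total detection accuracy across services.
prob += pulp.lpSum(u * x[s][c] for s in services for c, _, u in configs)

# Each service picks exactly one configuration.
for s in services:
    prob += pulp.lpSum(x[s][c] for c, _, _ in configs) == 1

# Total allocated bandwidth must fit within the current estimate.
prob += pulp.lpSum(bw * x[s][c] for s in services for c, bw, _ in configs) <= available_bw

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for s in services:
    chosen = next(c for c, _, _ in configs if x[s][c].value() > 0.5)
    print(f"{s}: {chosen}")
```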
turbo/
├── src/
│ ├── python/
│ │ ├── client_main.py # Client-side process orchestrator
│ │ ├── client.py # Per-service client (preprocessing, QUIC I/O)
│ │ ├── server_main.py # Server-side process orchestrator
│ │ ├── server.py # Per-service model server (EfficientDet inference)
│ │ ├── bandwidth_allocator.py # LP-based bandwidth allocation solver
│ │ ├── utility_curve_stream/ # Utility curve computation framework
│ │ ├── camera_stream/ # USB camera capture
│ │ ├── ping_handler/ # ICMP RTT measurement
│ │ ├── model_server/ # EfficientDet model loading
│ │ ├── util/ # Shared utilities (plotting, logging)
│ │ └── web_frontend/ # Real-time web dashboard
│ └── quic/ # QUIC transport layer (Rust)
│ ├── quic_client/ # Client binary
│ ├── quic_server/ # Server binary
│ └── quic_conn/ # Shared library (bandwidth management, logging)
├── config/ # YAML configuration files
└── docs/ # Detailed documentation
- QUIC Transport: s2n-quic (Rust) with BBR congestion control
- IPC: ZeroMQ for control messages; POSIX shared memory for image data (see the sketch after this list)
- Object Detection: EfficientDet (D1-D7x) trained on Waymo Open Dataset
- Optimization: PuLP linear programming solver
- Logging: Polars DataFrames with Parquet output
- Visualization: Flask + WebSocket dashboard with matplotlib
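The following sketch shows the shared-memory-plus-ZeroMQ pattern in miniature. It is not the project's actual IPC protocol (see docs/IPC.md); the segment name, endpoint, and message fields are assumptions made for the example.

```python
# Rough illustration of the shared-memory + ZeroMQ IPC pattern:
# pixels stay in a POSIX shared-memory segment, and only a small control
# message travels over the socket. Not the project's actual protocol
# (see docs/IPC.md); names and fields here are made up.
from multiprocessing import shared_memory

import numpy as np
import zmq

FRAME_SHAPE = (720, 1280, 3)  # assumed resolution for the example

# Producer: write a frame into shared memory...
shm = shared_memory.SharedMemory(name="example_frame", create=True,
                                 size=int(np.prod(FRAME_SHAPE)))
frame_view = np.ndarray(FRAME_SHAPE, dtype=np.uint8, buffer=shm.buf)
frame_view[:] = 0  # pretend this came from the camera

# ...then notify the consumer with a tiny ZeroMQ message (no pixel copy).
ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)
push.bind("ipc:///tmp/example-frames")
push.send_json({"shm_name": "example_frame", "frame_id": 0,
                "shape": FRAME_SHAPE, "dtype": "uint8"})

# Consumer (in another process) would attach by name and read without copying:
#   pull = ctx.socket(zmq.PULL); pull.connect("ipc:///tmp/example-frames")
#   meta = pull.recv_json()
#   shm = shared_memory.SharedMemory(name=meta["shm_name"])
#   frame = np.ndarray(meta["shape"], dtype=meta["dtype"], buffer=shm.buf)
```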
Planned features and improvements, in addition to accepted GitHub Issues/PRs:
- Docker deployment configuration
- Graceful termination of Python services
- Graceful handling of Ctrl-C in Rust processes (to clean up all ZeroMQ sockets and shared-memory files, and avoid Parquet data loss)
- Migration to a full Rust implementation with Python/NumPy bindings; eliminate ZeroMQ sockets in favor of more robust IPC
- Camera streams are sometimes laggy and unreliable; migrate from OpenCV to a lower-latency alternative
- Camera streams are often miscalibrated with respect to brightness/exposure; a fix is pending investigation
- Logging for some subprocesses is broken and/or unclear in both Rust and Python; a fix is pending investigation
We welcome contributions from the community! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Report bugs and request features via GitHub Issues
- Submit pull requests for bug fixes and enhancements
- Improve documentation and add tutorials
- Share your deployment experiences and use cases
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
If you use this system in your research, please cite:
```bibtex
@article{Schafhalter_Krentsel_Wei_Gonzalez_Ratnasamy_Shenker_Stoica_2026,
  title   = {TURBO: Utility-Aware Bandwidth Allocation for Cloud-Augmented Autonomous Control},
  journal = {New Ideas in Networked Systems Conference},
  author  = {Schafhalter, Peter and Krentsel, Alex and Wei, Hongbo and Gonzalez, Joseph E and Ratnasamy, Sylvia and Shenker, Scott and Stoica, Ion},
  year    = {2026}
}
```

For questions and feedback, open a GitHub Issue.
