- VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception (BeamNG.tech)
A modular Python project for autonomous driving research and prototyping, fully integrated with the BeamNG.tech simulator and Foxglove visualization. This system combines traditional computer vision and state-of-the-art deep learning (CNN, YOLO & YOLOP) with real-time sensor fusion and autonomous vehicle control to tackle:
- Lane Detection: YOLOP (unified), Traditional CV (multi-lane)
- Traffic Sign: Classification & detection (CNN, YOLO)
- Traffic Lights: Classification & detection (YOLO)
- Object Detection: Vehicles, pedestrians, cyclists and more (YOLO & YOLOP)
- Multi-Sensor Fusion: Camera, Lidar, Radar, GPS, IMU
- Microservices Architecture: Containerized multi-model inference (Docker), orchestrated via central aggregator
- Real-Time Control: PID steering, cruise control (CC), automatic emergency braking (AEB)
- Visualization: Real-time monitoring with Foxglove WebSocket + multiple CV windows
- Configuration System: YAML-based modular settings
- Drive Logging: Full telemetry and drive logs
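Since the steering control centers on a PID loop, here is a minimal sketch of such a controller. The gains, clamping range, and `dt` below are illustrative placeholders, not VisionPilot's tuned values:

```python
class PID:
    """Minimal PID controller sketch; gains are illustrative, not the project's tuned values."""

    def __init__(self, kp: float, ki: float, kd: float, output_limits=(-1.0, 1.0)):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lo, self.hi = output_limits
        self._integral = 0.0
        self._prev_error = None

    def update(self, error: float, dt: float) -> float:
        self._integral += error * dt
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        self._prev_error = error
        out = self.kp * error + self.ki * self._integral + self.kd * derivative
        return max(self.lo, min(self.hi, out))  # clamp to the steering range


# Example: steer toward lane center (error = lateral offset in metres, 50 Hz loop)
pid = PID(kp=0.8, ki=0.05, kd=0.2)
steering = pid.update(error=0.5, dt=0.02)
```

The same structure works for cruise control by feeding in a speed error instead of a lateral offset.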
Watch the Emergency Braking System (AEB) in action with real-time radar filtering and collision avoidance:
Extended Demo: Watch the full video here
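The radar filtering and braking decision shown in the demo can be illustrated with a time-to-collision check over filtered radar returns. The return format, field names, and thresholds below are hypothetical, not the project's actual radar schema:

```python
def filter_returns(returns, max_range=200.0, min_rcs=0.5):
    """Drop clutter: out-of-range hits and low radar-cross-section noise."""
    return [r for r in returns
            if 0.0 < r["distance_m"] <= max_range and r["rcs"] >= min_rcs]


def should_brake(returns, ego_speed_ms, ttc_threshold_s=2.0):
    """Trigger AEB when the nearest closing target's time-to-collision falls below threshold."""
    for r in filter_returns(returns):
        closing_speed = ego_speed_ms - r["target_speed_ms"]  # > 0 means we are closing in
        if closing_speed <= 0:
            continue  # target pulling away: no collision risk
        ttc = r["distance_m"] / closing_speed
        if ttc < ttc_threshold_s:
            return True
    return False


# Stopped car 25 m ahead while ego travels at 20 m/s -> TTC = 1.25 s -> brake
radar = [{"distance_m": 25.0, "target_speed_ms": 0.0, "rcs": 3.0},
         {"distance_m": 180.0, "target_speed_ms": 0.0, "rcs": 0.1}]  # noise, filtered out
print(should_brake(radar, ego_speed_ms=20.0))  # True
```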
This demo shows real-time traffic sign detection and classification:
Extended Demo: Watch the full video here
VisionPilot does not yet support multi-camera setups; this is for demonstration purposes only.
This demo shows real-time traffic light detection and classification:
No extended demo available yet.
Watch the improved autonomous lane keeping demo (v2) in BeamNG.tech, featuring smoother fused CV+SCNN lane detection, stable PID steering, and robust cruise control:
Extended Demo: Watch the full video here
Note: Very low-light (tunnel) scenarios are not yet supported.
The original demo is still available for reference:
Lane Keeping & Multi-Model Detection Demo (v1)
See real-time LiDAR point cloud streaming and autonomous vehicle telemetry in Foxglove Studio:
Extended Demo: Watch the full video here
See real-time image segmentation using front and rear cameras:
Extended Demo: Watch the full video here
More demo videos and visualizations will be added as features are completed.
The vehicle is equipped with a comprehensive multi-sensor suite for autonomous perception and control:
| Sensor | Specification | Purpose |
|---|---|---|
| Front Camera | 1920x1080 @ 50Hz, 70° FOV, Depth enabled | Lane detection, traffic signs, traffic lights, object detection |
| LiDAR (Top) | 80 vertical lines, 360° horizontal, 120m range, 20Hz | Obstacle detection, 3D scene understanding |
| Front Radar | 200m range, 128×64 bins, 50Hz | Collision avoidance, adaptive cruise control |
| Rear Left & Right Radar | 30m range, 64×32 bins, 50Hz | Blindspot monitoring, rear object detection |
| Dual GPS | Front & rear positioning @ 50Hz | Localization |
| IMU | 100Hz update rate | Vehicle dynamics, pose estimation |
| Sensor Array | Front Radar | Lidar Visualization |
|---|---|---|
| ![]() | ![]() | ![]() |
Configuration files are located in the `/config` directory:
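A file in `/config` might look like the following. This is an illustrative sketch only: the keys, filenames, and values here are hypothetical, so consult the actual files for the real schema.

```yaml
# config/control.yaml (illustrative example, not the actual schema)
cruise_control:
  target_speed_kph: 80
aeb:
  enabled: true
  ttc_threshold_s: 2.0
steering_pid:
  kp: 0.8
  ki: 0.05
  kd: 0.2
foxglove:
  websocket_port: 8765
```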
VisionPilot uses a containerized microservices architecture where each perception task runs as an independent Flask service, orchestrated by a central Aggregator:
| Service | Port | Function | Model/Framework |
|---|---|---|---|
| CV Lane Detection | 4777 | Multi-lane detection (3→2→1 fallback) | OpenCV |
| Object Detection | 5777 | Vehicle, pedestrian, cyclist detection | YOLOv11 |
| Traffic Light Detection | 6777 | Traffic light detection & state classification | YOLOv11 |
| Sign Detection | 7777 | Traffic sign detection | YOLOv11 |
| Sign Classification | 8777 | Traffic sign type classification | CNN |
| YOLOP | 9777 | Unified: lanes + drivable area + objects | YOLOP |
BeamNG Simulation Loop
        ↓
PerceptionClient.process_frame()
        ↓
Aggregator (concurrent orchestration)
 ├── CV Lane Detection (4777)
 ├── Object Detection (5777)
 ├── Traffic Light (6777)
 ├── Sign Detection (7777)
 ├── Sign Classification (8777)
 └── YOLOP (9777)
        ↓
Merge all responses
        ↓
Return unified AggregationResult
        ↓
Extract individual results + visualize
- Concurrency: All services run in parallel (ThreadPoolExecutor)
- Modularity: Add/remove services without modifying BeamNG code
- Scalability: Easy horizontal scaling with container orchestration
- Fault Tolerance: Individual service failures don't break the pipeline
- Reusability: Services can be used independently or together
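The fan-out/merge step described above can be sketched with `concurrent.futures`. The service functions below are local stand-ins for the containerized services (the real aggregator makes HTTP calls to the ports listed in the table), so names and return shapes are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for HTTP calls to the perception services; the real aggregator
# would POST the camera frame to each container's port.
def detect_lanes(frame):   return {"lanes": 2}
def detect_objects(frame): return {"objects": ["car"]}
def detect_signs(frame):   raise RuntimeError("service down")  # simulated failure

SERVICES = {"lanes": detect_lanes, "objects": detect_objects, "signs": detect_signs}

def aggregate(frame):
    """Query all services in parallel; a failing service yields None instead of breaking the pipeline."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {name: pool.submit(fn, frame) for name, fn in SERVICES.items()}
        for name, fut in futures.items():
            try:
                results[name] = fut.result(timeout=1.0)
            except Exception:
                results[name] = None  # fault tolerance: degrade, don't crash
    return results

print(aggregate(frame=b"..."))
# {'lanes': {'lanes': 2}, 'objects': {'objects': ['car']}, 'signs': None}
```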
- Sign classification & detection (CNN / YOLOv11m)
- Traffic light classification & detection (CNN / YOLOv11m)
- Lane detection fusion (SCNN / CV)
- 🔥🔥 YOLOP integration
- Drivable area segmentation
- Lane detection (segmentation output)
- Object detection
- CV Lane Detection Service (OpenCV-based multi-lane detection)
- Advanced lane detection using OpenCV (robust highway, lighting, outlier handling)
- Integrate Majority Voting system for CV
- Lighting Condition Detection
- ✅ Semantic Segmentation (already built, not implemented here yet)
- Panoptic segmentation (instance + semantic)
- Depth Estimation (Monocular for obstacle distance)
- ✅ Real-Time Object Detection (Cars, Trucks, Buses, Pedestrians, Cyclists) (Trained)
- 🔥 Speed Estimation using detection from camera and lidar
- Multiple Object Tracking (MOT)
- 🔥🔥 Handle dashed lines better in lane detection
- Road Marking Detection (Arrows, Crosswalks, Stop Lines)
- 🔥🔥 Lidar Object Detection 3D
- Occluded Object Detection (Detect objects that are partially blocked or not visible in the camera view using radar/lidar)
- Detect multiple lanes
- 🔥 Classify lane types
- 🤏 Multi-Camera Setup (will implement after all other camera-based features are finished)
- 🤏 Overtaking, Merging (will be part of Path Planning)
- 🔥 Kalman Filtering
- Extended Kalman Filter (EKF)
- Integrate Radar
- Integrate Lidar
- Integrate GPS
- Integrate IMU
- 🔥 Ultrasonic Sensor Integration
- 🤏🤏 SLAM (Simultaneous Localization and Mapping)
- Build HD Map of the BeamNG.tech map
- Localize Vehicle on HD Map
- Integrate vehicle control (throttle, steering, and braking implemented; PID needs further tuning)
- Integrate PIDF controller
- ✅ Adaptive Cruise Control (currently only basic Cruise Control implemented)
- ✅ Automatic Emergency Braking (AEB) (still an issue with crashing after AEB is activated)
- Obstacle Avoidance (Steering away from obstacles instead of just braking)
- Model Predictive Control MPC (More advanced control strategy that optimizes control inputs over a future time horizon)
- Curve Speed Optimization (Slow down for sharp curves based on lane curvature)
- Trajectory Prediction for surrounding vehicles
- 🔥 Blindspot Monitoring (using left/right rear short-range radars)
- Traffic Rule Enforcement (Stop at red lights, stop signs, yield signs)
- Dynamic Target Speed based on Speed Limit Signs
- Global Path planning
- Local Path planning
- 🔥 Lane Change Logic
- Check blindspots before lane change
- Signal Lane Change
- Parking Logic (Path finding / Parallel or Perpendicular)
- 🤏🤏 U-Turn Logic (3-point turn)
- 🤏🤏 Advanced traffic participant prediction (trajectory, intent)
- Integrate and test in BeamNG.tech simulation (replacing CARLA)
- Modularize and clean up BeamNG.tech pipeline
- Tweak lane detection parameters and thresholds
- Fog Weather conditions (Rain or snow not supported in BeamNG.tech)
- Traffic scenarios: driving in heavy, moderate, and light traffic
- Test all Systems in different lighting conditions (Day, Night, Dawn/Dusk, Tunnel)
- Construction Zones (temporary lanes, cones, barriers)
- 🤏🤏 Test using an actual RC car
- ✅ Full Foxglove visualization integration (overhaul needed)
- Modular YAML configuration system
- Real-time drive logging and telemetry
- Bird's-Eye View (BEV) (top-down view of vehicle and surroundings)
- Real-time annotation overlay in Foxglove
- Show predicted trajectories in Foxglove
- Show Global and local path plans in Foxglove
- Live Map Visualization
Note: Considering moving away from Foxglove entirely to build a custom dashboard. Not a priority at this time.
- Containerize Models for easy deployment and scalability
- ✅ Microservices Architecture (Aggregator + individual services)
- Message Broker (Redis support in docker-compose)
- Docker Compose orchestration
- Aggregator service (concurrent service orchestration)
- Add demo images and videos to README
- Add performance benchmarks section
- Add Table of Contents for easier navigation
- Vibe-Code a website for the project
- Redo project structure for better modularity
A Driver Monitoring System would've been pretty cool, but human drivers are not implemented in BeamNG.tech.
🔥 = High Priority
✅ = Complete but still being improved/tuned/changed (not final version)
🤏 = Minimal Priority, can be addressed later
🤏🤏 = Very Low Priority, may not be implemented
Status: This project is currently in active development. A stable, production-ready release with pre-trained models and complete documentation will be available eventually.
- Tunnel/Low-Light Scenarios: Camera depth perception fails below certain lighting thresholds
- Multi-Camera Support: Single front-facing camera only (future roadmap)
- Dashed Lane Detection: Requires improvement for better accuracy
- PID Controller Tuning: May oscillate on aggressive maneuvers
- Real-World Testing: Only validated in simulation (BeamNG.tech), for now...
- Service Latency: Network overhead between BeamNG and containerized services (~50-100ms per aggregation)
- Rain/snow physics not supported in BeamNG.tech
- Pedestrians not controllable by traffic system
- Human drivers not implemented
Datasets:
- CULane, LISA, GTSRB, Mapillary, BDD100K
Simulation & Tools:
- BeamNG.tech by BeamNG GmbH
- Foxglove Studio for visualization
- Docker & Docker Compose for containerization
Special Thanks:
- Kaggle for free GPU resources (model training)
- Mr. Pratt (teacher/supervisor) for guidance
Academic Papers & Research:
- YOLOP/YOLOPX: Anchor-free multi-task learning network for panoptic driving perception
@article{YOLOPX2024,
title={YOLOPX: Anchor-free multi-task learning network for panoptic driving perception},
author={Zhan, Jiao and Luo, Yarong and Guo, Chi and Wu, Yejun and Liu, Jingnan},
journal={Pattern Recognition},
volume={148},
pages={110152},
year={2024}
}
If you use VisionPilot in your research, please cite:
@software{visionpilot2025,
title={VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception},
author={Julian Stamm},
year={2025},
url={https://github.com/visionpilot-project/VisionPilot}
}

Title: BeamNG.tech
Author: BeamNG GmbH
Address: Bremen, Germany
Year: 2025
Version: 0.35.0.0
URL: https://www.beamng.tech/
This project is licensed under the MIT License - see LICENSE file for details.









