- VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception (BeamNG.tech)
A modular Python project for autonomous driving research and prototyping, fully integrated with the BeamNG.tech simulator and Foxglove visualization. This system combines traditional computer vision and state-of-the-art deep learning (CNN, YOLO & YOLOP) with real-time sensor fusion and autonomous vehicle control to tackle:
- Lane Detection: YOLOP (unified), Traditional CV (multi-lane)
- Traffic Sign: Classification & detection (CNN, YOLO)
- Traffic Lights: Classification & detection (YOLO)
- Object Detection: Vehicles, pedestrians, cyclists and more (YOLO & YOLOP)
- Multi-Sensor Fusion: Camera, Lidar, Radar, GPS, IMU
- Microservices Architecture: Containerized multi-model inference (Docker), orchestrated via central aggregator
- Real-Time Control: PID steering, cruise control (CC), automatic emergency braking (AEB)
- Visualization: Real-time monitoring with Foxglove WebSocket + multiple CV windows
- Configuration System: YAML-based modular settings
- Drive Logging: Full telemetry and drive logs
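Since the steering control centers on a PID loop, here is a minimal sketch of such a controller. The gains, clamping range, and `dt` below are illustrative placeholders, not VisionPilot's tuned values:

```python
class PID:
    """Minimal PID controller sketch; gains are illustrative, not the project's tuned values."""

    def __init__(self, kp: float, ki: float, kd: float, output_limits=(-1.0, 1.0)):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lo, self.hi = output_limits
        self._integral = 0.0
        self._prev_error = None

    def update(self, error: float, dt: float) -> float:
        self._integral += error * dt
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        self._prev_error = error
        out = self.kp * error + self.ki * self._integral + self.kd * derivative
        return max(self.lo, min(self.hi, out))  # clamp to the steering range


# Example: steer toward lane center (error = lateral offset in metres, 50 Hz loop)
pid = PID(kp=0.8, ki=0.05, kd=0.2)
steering = pid.update(error=0.5, dt=0.02)
```

The same structure works for cruise control by feeding in a speed error instead of a lateral offset.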
Watch the Emergency Braking System (AEB) in action with real-time radar filtering and collision avoidance:
Extended Demo: Watch the full video here
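The radar filtering and braking decision shown in the demo can be illustrated with a time-to-collision check over filtered radar returns. The return format, field names, and thresholds below are hypothetical, not the project's actual radar schema:

```python
def filter_returns(returns, max_range=200.0, min_rcs=0.5):
    """Drop clutter: out-of-range hits and low radar-cross-section noise."""
    return [r for r in returns
            if 0.0 < r["distance_m"] <= max_range and r["rcs"] >= min_rcs]


def should_brake(returns, ego_speed_ms, ttc_threshold_s=2.0):
    """Trigger AEB when the nearest closing target's time-to-collision falls below threshold."""
    for r in filter_returns(returns):
        closing_speed = ego_speed_ms - r["target_speed_ms"]  # > 0 means we are closing in
        if closing_speed <= 0:
            continue  # target pulling away: no collision risk
        ttc = r["distance_m"] / closing_speed
        if ttc < ttc_threshold_s:
            return True
    return False


# Stopped car 25 m ahead while ego travels at 20 m/s -> TTC = 1.25 s -> brake
radar = [{"distance_m": 25.0, "target_speed_ms": 0.0, "rcs": 3.0},
         {"distance_m": 180.0, "target_speed_ms": 0.0, "rcs": 0.1}]  # noise, filtered out
print(should_brake(radar, ego_speed_ms=20.0))  # True
```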
This demo shows real-time traffic sign detection and classification:
Extended Demo: Watch the full video here
VisionPilot does not yet support multi-camera setups; this is for demonstration purposes only.
This demo shows real-time traffic light detection and classification:
No extended demo available yet.
Watch the improved autonomous lane keeping demo (v2) in BeamNG.tech, featuring smoother fused CV+SCNN lane detection, stable PID steering, and robust cruise control:
Extended Demo: Watch the full video here
Note: Very low-light (tunnel) scenarios are not yet supported.
The original demo is still available for reference:
Lane Keeping & Multi-Model Detection Demo (v1)
See real-time LiDAR point cloud streaming and autonomous vehicle telemetry in Foxglove Studio:
Extended Demo: Watch the full video here
See real-time image segmentation using front and rear cameras:
Extended Demo: Watch the full video here
More demo videos and visualizations will be added as features are completed.
The vehicle is equipped with a comprehensive multi-sensor suite for autonomous perception and control:
| Sensor | Specification | Purpose |
|---|---|---|
| Front Camera | 1920x1080 @ 50Hz, 70° FOV, Depth enabled | Lane detection, traffic signs, traffic lights, object detection |
| LiDAR (Top) | 80 vertical lines, 360° horizontal, 120m range, 20Hz | Obstacle detection, 3D scene understanding |
| Front Radar | 200m range, 128×64 bins, 50Hz | Collision avoidance, adaptive cruise control |
| Rear Left & Right Radar | 30m range, 64×32 bins, 50Hz | Blindspot monitoring, rear object detection |
| Dual GPS | Front & rear positioning @ 50Hz | Localization |
| IMU | 100Hz update rate | Vehicle dynamics, pose estimation |
| Sensor Array | Front Radar | Lidar Visualization |
|---|---|---|
| ![]() | ![]() | ![]() |
Configuration files are located in the `/config` directory:
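A file in `/config` might look like the following. This is an illustrative sketch only: the keys, filenames, and values here are hypothetical, so consult the actual files for the real schema.

```yaml
# config/control.yaml (illustrative example, not the actual schema)
cruise_control:
  target_speed_kph: 80
aeb:
  enabled: true
  ttc_threshold_s: 2.0
steering_pid:
  kp: 0.8
  ki: 0.05
  kd: 0.2
foxglove:
  websocket_port: 8765
```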
VisionPilot uses a containerized microservices architecture where each perception task runs as an independent Flask service, orchestrated by a central Aggregator:
| Service | Port | Function | Model/Framework |
|---|---|---|---|
| CV Lane Detection | 4777 | Multi-lane detection (3→2→1 fallback) | OpenCV |
| Object Detection | 5777 | Vehicle, pedestrian, cyclist detection | YOLOv11 |
| Traffic Light Detection | 6777 | Traffic light detection & state classification | YOLOv11 |
| Sign Detection | 7777 | Traffic sign detection | YOLOv11 |
| Sign Classification | 8777 | Traffic sign type classification | CNN |
| YOLOP | 9777 | Unified: lanes + drivable area + objects | YOLOP |
BeamNG Simulation Loop
        ↓
PerceptionClient.process_frame()
        ↓
Aggregator (concurrent orchestration)
 ├── CV Lane Detection (4777)
 ├── Object Detection (5777)
 ├── Traffic Light (6777)
 ├── Sign Detection (7777)
 ├── Sign Classification (8777)
 └── YOLOP (9777)
        ↓
Merge all responses
        ↓
Return unified AggregationResult
        ↓
Extract individual results + visualize
- Concurrency: All services run in parallel (ThreadPoolExecutor)
- Modularity: Add/remove services without modifying BeamNG code
- Scalability: Easy horizontal scaling with container orchestration
- Fault Tolerance: Individual service failures don't break the pipeline
- Reusability: Services can be used independently or together
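The fan-out/merge step described above can be sketched with `concurrent.futures`. The service functions below are local stand-ins for the containerized services (the real aggregator makes HTTP calls to the ports listed in the table), so names and return shapes are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for HTTP calls to the perception services; the real aggregator
# would POST the camera frame to each container's port.
def detect_lanes(frame):   return {"lanes": 2}
def detect_objects(frame): return {"objects": ["car"]}
def detect_signs(frame):   raise RuntimeError("service down")  # simulated failure

SERVICES = {"lanes": detect_lanes, "objects": detect_objects, "signs": detect_signs}

def aggregate(frame):
    """Query all services in parallel; a failing service yields None instead of breaking the pipeline."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {name: pool.submit(fn, frame) for name, fn in SERVICES.items()}
        for name, fut in futures.items():
            try:
                results[name] = fut.result(timeout=1.0)
            except Exception:
                results[name] = None  # fault tolerance: degrade, don't crash
    return results

print(aggregate(frame=b"..."))
# {'lanes': {'lanes': 2}, 'objects': {'objects': ['car']}, 'signs': None}
```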
- Sign classification & detection (CNN / YOLOv11m)
- Traffic light classification & detection (CNN / YOLOv11m)
- Lane detection fusion (SCNN / CV)
- 🔥🔥 YOLOP integration
- Drivable area segmentation
- Lane detection (segmentation output)
- Object detection
- CV Lane Detection Service (OpenCV-based multi-lane detection)
- Advanced lane detection using OpenCV (robust highway, lighting, outlier handling)
- Integrate Majority Voting system for CV
- Lighting Condition Detection
- ✅ Semantic Segmentation (already built, not implemented here yet)
- Panoptic segmentation (instance + semantic)
- Depth Estimation (Monocular for obstacle distance)
- ✅ Real-Time Object Detection (Cars, Trucks, Buses, Pedestrians, Cyclists) (Trained)
- 🔥 Speed Estimation using detection from camera and lidar
- Multiple Object Tracking (MOT)
- 🔥🔥 Handle dashed lines better in lane detection
- Road Marking Detection (Arrows, Crosswalks, Stop Lines)
- 🔥🔥 Lidar Object Detection 3D
- Occluded Object Detection (Detect objects that are partially blocked or not visible in the camera view using radar/lidar)
- Detect multiple lanes
- 🔥 Classify lane types
- 🤏 Multi-Camera Setup (will implement after all other camera-based features are finished)
- 🤏 Overtaking, Merging (will be part of Path Planning)
- 🔥 Kalman Filtering
- Extended Kalman Filter (EKF)
- Integrate Radar
- Integrate Lidar
- Integrate GPS
- Integrate IMU
- 🔥 Ultrasonic Sensor Integration
- 🤏🤏 SLAM (Simultaneous Localization and Mapping)
- Build HD Map of the BeamNG.tech map
- Localize Vehicle on HD Map
- Integrate vehicle control (throttle, steering, and braking implemented; PID needs further tuning)
- Integrate PIDF controller
- ✅ Adaptive Cruise Control (currently only basic Cruise Control implemented)
- ✅ Automatic Emergency Braking (AEB) (still an issue with crashing after AEB is activated)
- Obstacle Avoidance (Steering away from obstacles instead of just braking)
- Model Predictive Control MPC (More advanced control strategy that optimizes control inputs over a future time horizon)
- Curve Speed Optimization (Slow down for sharp curves based on lane curvature)
- Trajectory Prediction for surrounding vehicles
- 🔥 Blindspot Monitoring (using left/right rear short-range radars)
- Traffic Rule Enforcement (Stop at red lights, stop signs, yield signs)
- Dynamic Target Speed based on Speed Limit Signs
- Global Path planning
- Local Path planning
- 🔥 Lane Change Logic
- Check blindspots before lane change
- Signal Lane Change
- Parking Logic (Path finding / Parallel or Perpendicular)
- 🤏🤏 U-Turn Logic (3-point turn)
- 🤏🤏 Advanced traffic participant prediction (trajectory, intent)
- Integrate and test in BeamNG.tech simulation (replacing CARLA)
- Modularize and clean up BeamNG.tech pipeline
- Tweak lane detection parameters and thresholds
- Fog Weather conditions (Rain or snow not supported in BeamNG.tech)
- Traffic scenarios: driving in heavy, moderate, and light traffic
- Test all Systems in different lighting conditions (Day, Night, Dawn/Dusk, Tunnel)
- Construction Zones (temporary lanes, cones, barriers)
- 🤏🤏 Test using an actual RC car
- ✅ Full Foxglove visualization integration (overhaul needed)
- Modular YAML configuration system
- Real-time drive logging and telemetry
- Bird's-Eye View (BEV) (top-down view of vehicle and surroundings)
- Real-time annotation overlay in Foxglove
- Show predicted trajectories in Foxglove
- Show Global and local path plans in Foxglove
- Live Map Visualization
Note: Considering moving away from Foxglove entirely to build a custom dashboard. Not a priority at this time.
- Containerize Models for easy deployment and scalability
- ✅ Microservices Architecture (Aggregator + individual services)
- Message Broker (Redis support in docker-compose)
- Docker Compose orchestration
- Aggregator service (concurrent service orchestration)
- Add demo images and videos to README
- Add performance benchmarks section
- Add Table of Contents for easier navigation
- Vibe-Code a website for the project
- Redo project structure for better modularity
A Driver Monitoring System would've been pretty cool, but human drivers are not implemented in BeamNG.tech.
🔥 = High Priority
✅ = Complete but still being improved/tuned/changed (not final version)
🤏 = Minimal Priority, can be addressed later
🤏🤏 = Very Low Priority, may not be implemented
Status: This project is currently in active development. A stable, production-ready release with pre-trained models and complete documentation will be available eventually.
- Tunnel/Low-Light Scenarios: Camera depth perception fails below certain lighting thresholds
- Multi-Camera Support: Single front-facing camera only (future roadmap)
- Dashed Lane Detection: Requires improvement for better accuracy
- PID Controller Tuning: May oscillate on aggressive maneuvers
- Real-World Testing: Only validated in simulation (BeamNG.tech), for now...
- Service Latency: Network overhead between BeamNG and containerized services (~50-100ms per aggregation)
- Rain/snow physics not supported in BeamNG.tech
- Pedestrians not controllable by traffic system
- Human drivers not implemented
Datasets:
- CULane, LISA, GTSRB, Mapillary, BDD100K
Simulation & Tools:
- BeamNG.tech by BeamNG GmbH
- Foxglove Studio for visualization
- Docker & Docker Compose for containerization
Special Thanks:
- Kaggle for free GPU resources (model training)
- Mr. Pratt (teacher/supervisor) for guidance
Academic Papers & Research:
- YOLOP/YOLOPX: Anchor-free multi-task learning network for panoptic driving perception
@article{YOLOPX2024,
title={YOLOPX: Anchor-free multi-task learning network for panoptic driving perception},
author={Zhan, Jiao and Luo, Yarong and Guo, Chi and Wu, Yejun and Liu, Jingnan},
journal={Pattern Recognition},
volume={148},
pages={110152},
year={2024}
}
If you use VisionPilot in your research, please cite:
@software{visionpilot2025,
title={VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception},
author={Julian Stamm},
year={2025},
url={https://github.com/visionpilot-project/VisionPilot}
}

Title: BeamNG.tech
Author: BeamNG GmbH
Address: Bremen, Germany
Year: 2025
Version: 0.35.0.0
URL: https://www.beamng.tech/
This project is licensed under the MIT License - see LICENSE file for details.









