VisionPilot Banner

VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception (BeamNG.tech)

Star History Chart

Overview

A modular Python project for autonomous driving research and prototyping, fully integrated with the BeamNG.tech simulator and Foxglove visualization. This system combines traditional computer vision and state-of-the-art deep learning (CNN, YOLO & YOLOP) with real-time sensor fusion and autonomous vehicle control to tackle:

  • Lane Detection: YOLOP (unified), traditional CV (multi-lane)
  • Traffic Signs: Detection & classification (YOLO detection, CNN classification)
  • Traffic Lights: Detection & classification (YOLO)
  • Object Detection: Vehicles, pedestrians, cyclists, and more (YOLO & YOLOP)
  • Multi-Sensor Fusion: Camera, LiDAR, radar, GPS, IMU
  • Microservices Architecture: Containerized multi-model inference (Docker), orchestrated via a central aggregator
  • Real-Time Control: PID steering, cruise control (CC), automatic emergency braking (AEB); a minimal PID sketch follows this list
  • Visualization: Real-time monitoring with Foxglove WebSocket + multiple CV windows
  • Configuration System: YAML-based modular settings
  • Drive Logging: Full telemetry and drive logs
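
As a rough illustration of the control side, here is a minimal PID steering sketch. The gains, the error signal, and the 50 Hz update rate below are placeholder assumptions for illustration, not VisionPilot's tuned controller:

```python
# Minimal PID steering sketch. Gains, the error signal, and the 50 Hz
# rate are placeholder assumptions, not VisionPilot's tuned controller.
class PID:
    def __init__(self, kp: float, ki: float, kd: float,
                 out_min: float = -1.0, out_max: float = 1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float, dt: float) -> float:
        """One control step; dt is the time since the last update."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp to the simulator's steering range [-1, 1].
        return max(self.out_min, min(self.out_max, out))

# Example: steer toward the lane center at 50 Hz.
steering_pid = PID(kp=0.8, ki=0.05, kd=0.2)            # hypothetical gains
lane_center_offset = 0.12                              # hypothetical error
steering = steering_pid.update(error=lane_center_offset, dt=1 / 50)
```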

Demos

Emergency Braking (AEB) Demo

Watch the Emergency Braking System (AEB) in action with real-time radar filtering and collision avoidance:

AEB Demo

Extended Demo: Watch the full video here


Sign Detection & Classification Demo

This demo shows real-time traffic sign detection and classification:

Sign Detection & Vehicle/Pedestrian Detection Demo

Extended Demo: Watch the full video here

Note: VisionPilot does not yet support multi-camera setups; this demo is for demonstration purposes only.


Traffic Light Detection & Classification Demo

This demo shows real-time traffic light detection and classification:

Traffic Light Detection & Classification Demo

No extended demo available yet.


Latest Lane Detection Demo (v2)

Watch the improved autonomous lane keeping demo (v2) in BeamNG.tech, featuring smoother fused CV+SCNN lane detection, stable PID steering, and robust cruise control:

Lane Detection Demo

Extended Demo: Watch the full video here

Note: Very low-light (tunnel) scenarios are not yet supported.

Previous Lane Detection Demo (v1)

The original demo is still available for reference:

Lane Keeping & Multi-Model Detection Demo (v1)


Foxglove Visualization Demo

See real-time LiDAR point cloud streaming and autonomous vehicle telemetry in Foxglove Studio:

Foxglove Visualization Demo

Extended Demo: Watch the full video here


Segmentation Demo

See real-time image segmentation using front and rear cameras:

Segmentation Demo

Extended Demo: Watch the full video here

More demo videos and visualizations will be added as features are completed.

Sensor Suite

The vehicle is equipped with a comprehensive multi-sensor suite for autonomous perception and control:

| Sensor | Specification | Purpose |
| --- | --- | --- |
| Front Camera | 1920x1080 @ 50Hz, 70° FOV, depth enabled | Lane detection, traffic signs, traffic lights, object detection |
| LiDAR (Top) | 80 vertical lines, 360° horizontal, 120m range, 20Hz | Obstacle detection, 3D scene understanding |
| Front Radar | 200m range, 128×64 bins, 50Hz | Collision avoidance, adaptive cruise control |
| Rear Left & Right Radar | 30m range, 64×32 bins, 50Hz | Blindspot monitoring, rear object detection |
| Dual GPS | Front & rear positioning @ 50Hz | Localization |
| IMU | 100Hz update rate | Vehicle dynamics, pose estimation |
Images: sensor array views, front radar coverage, and LiDAR visualization.

Configuration files are located in the /config directory.
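
A sketch of what one of these YAML files might look like, mirroring the sensor table above; the file name, keys, and values are illustrative assumptions, not the project's actual schema:

```yaml
# Hypothetical /config/sensors.yaml - keys and values are illustrative
# and mirror the sensor table above, not the project's actual schema.
camera:
  front:
    resolution: [1920, 1080]
    update_hz: 50
    fov_deg: 70
    depth: true
lidar:
  top:
    vertical_lines: 80
    horizontal_fov_deg: 360
    range_m: 120
    update_hz: 20
radar:
  front:
    range_m: 200
    bins: [128, 64]
    update_hz: 50
```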

Microservices Architecture

VisionPilot uses a containerized microservices architecture where each perception task runs as an independent Flask service, orchestrated by a central Aggregator:

Service Stack

| Service | Port | Function | Model/Framework |
| --- | --- | --- | --- |
| CV Lane Detection | 4777 | Multi-lane detection (3→2→1 fallback) | OpenCV |
| Object Detection | 5777 | Vehicle, pedestrian, cyclist detection | YOLOv11 |
| Traffic Light Detection | 6777 | Traffic light detection & state classification | YOLOv11 |
| Sign Detection | 7777 | Traffic sign detection | YOLOv11 |
| Sign Classification | 8777 | Traffic sign type classification | CNN |
| YOLOP | 9777 | Unified: lanes + drivable area + objects | YOLOP |

Data Flow

BeamNG Simulation Loop
    ↓
PerceptionClient.process_frame()
    ↓
Aggregator (concurrent orchestration)
    ├─→ CV Lane Detection (4777)
    ├─→ Object Detection (5777)
    ├─→ Traffic Light (6777)
    ├─→ Sign Detection (7777)
    ├─→ Sign Classification (8777)
    └─→ YOLOP (9777)
    ↓
Merge all responses
    ↓
Return unified AggregationResult
    ↓
Extract individual results + visualize
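
A minimal sketch of how this fan-out could look in Python; the /infer endpoint, payload format, and response handling are assumptions for illustration, not the project's actual API:

```python
# Minimal aggregator sketch: fan one camera frame out to all perception
# services in parallel and merge the responses. The /infer endpoint and
# response format are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

import requests

SERVICES = {
    "cv_lane": 4777,
    "object": 5777,
    "traffic_light": 6777,
    "sign_detect": 7777,
    "sign_classify": 8777,
    "yolop": 9777,
}

def _call(name: str, port: int, jpeg_bytes: bytes) -> tuple[str, dict]:
    """POST the encoded frame to one service; never raise, so one
    failing service cannot break the whole aggregation."""
    try:
        resp = requests.post(
            f"http://localhost:{port}/infer",  # hypothetical endpoint
            files={"frame": jpeg_bytes},
            timeout=1.0,
        )
        resp.raise_for_status()
        return name, resp.json()
    except requests.RequestException as exc:
        return name, {"error": str(exc)}

def aggregate(jpeg_bytes: bytes) -> dict:
    """Query every service concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = [
            pool.submit(_call, name, port, jpeg_bytes)
            for name, port in SERVICES.items()
        ]
        return dict(f.result() for f in futures)
```

Because each call catches its own exceptions, a failing service degrades the merged result rather than breaking the pipeline, which is the fault-tolerance point listed below.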

Benefits

  • Concurrency: All services run in parallel (ThreadPoolExecutor)
  • Modularity: Add/remove services without modifying BeamNG code
  • Scalability: Easy horizontal scaling with container orchestration
  • Fault Tolerance: Individual service failures don't break the pipeline
  • Reusability: Services can be used independently or together

Roadmap

Perception

  • Sign classification & detection (CNN / YOLOv11m)
  • Traffic light classification & detection (CNN / YOLOv11m)
  • Lane detection fusion (SCNN / CV)
  • 🔥🔥 YOLOP integration
    • Drivable area segmentation
    • Lane detection (segmentation output)
    • Object detection
  • CV Lane Detection Service (OpenCV-based multi-lane detection)
  • Advanced lane detection using OpenCV (robust highway, lighting, and outlier handling)
  • Integrate majority voting system for CV
  • Lighting condition detection
  • ⭐ Semantic Segmentation (already built, not yet integrated here)
    • Panoptic segmentation (instance + semantic)
  • Depth Estimation (monocular, for obstacle distance)
  • ⭐ Real-Time Object Detection (cars, trucks, buses, pedestrians, cyclists) (trained)
  • 🔥 Speed estimation using detections from camera and LiDAR
    • Multiple Object Tracking (MOT)
  • 🔥🔥 Handle dashed lines better in lane detection
  • Road Marking Detection (arrows, crosswalks, stop lines)
  • 🔥🔥 3D LiDAR object detection
  • Occluded Object Detection (detect objects that are partially blocked or not visible in the camera view using radar/LiDAR)
  • Detect multiple lanes
  • 🔥 Classify lane types
  • 💀 Multi-Camera Setup (will be implemented after all other camera-based features are finished)
  • 💀 Overtaking, Merging (will be part of Path Planning)

Sensor Fusion & Calibration

  • 🔥 Kalman Filtering (see the sketch after this list)
    • Extended Kalman Filter (EKF)
  • Integrate Radar
  • Integrate LiDAR
  • Integrate GPS
  • Integrate IMU
  • 🔥 Ultrasonic Sensor Integration
  • 💀💀 SLAM (simultaneous localization and mapping)
    • Build HD map of the BeamNG.tech map
    • Localize vehicle on the HD map
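
As a rough sketch of the kind of filtering planned above: a 1-D constant-velocity Kalman filter fusing noisy position measurements (e.g. GPS) into a smooth position/velocity estimate. All matrices and noise values are illustrative assumptions:

```python
# Illustrative 1-D constant-velocity Kalman filter. All matrices and
# noise values are placeholder assumptions, not tuned parameters.
import numpy as np

dt = 0.02                                 # 50 Hz sensor rate
F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition (pos, vel)
H = np.array([[1.0, 0.0]])                # we only measure position
Q = np.eye(2) * 1e-3                      # process noise covariance
R = np.array([[0.5]])                     # measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle; x is the state, P its covariance,
    z a scalar position measurement. Returns the updated (x, P)."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = np.array([[z]]) - H @ x           # innovation
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Example: feed a few noisy position readings through the filter.
x, P = np.zeros((2, 1)), np.eye(2)
for z in [0.0, 0.9, 2.1, 2.9, 4.2]:
    x, P = kalman_step(x, P, z)
print(x.ravel())                          # fused (position, velocity)
```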

Control & Planning

  • Integrate vehicle control (throttle, steering, braking implemented; PID needs further tuning)
  • Integrate PIDF controller
  • ⭐ Adaptive Cruise Control (currently only basic cruise control is implemented)
  • ⭐ Automatic Emergency Braking (AEB) (still an issue with crashes after AEB activates)
    • Obstacle Avoidance (steering away from obstacles instead of just braking)
  • Model Predictive Control (MPC) (a more advanced control strategy that optimizes control inputs over a future time horizon)
  • Curve Speed Optimization (slow down for sharp curves based on lane curvature)
  • Trajectory Prediction for surrounding vehicles
  • 🔥 Blindspot Monitoring (using the left/right rear short-range radars)
  • Traffic Rule Enforcement (stop at red lights, stop signs, yield signs)
  • Dynamic target speed based on speed limit signs
  • Global path planning
  • Local path planning
  • 🔥 Lane Change Logic
    • Check blindspots before lane change
    • Signal lane change
  • Parking Logic (path finding / parallel or perpendicular)
  • 💀💀 U-Turn Logic (3-point turn)
  • 💀💀 Advanced traffic participant prediction (trajectory, intent)

Simulation & Scenarios

  • Integrate and test in BeamNG.tech simulation (replacing CARLA)
  • Modularize and clean up the BeamNG.tech pipeline
  • Tweak lane detection parameters and thresholds
  • Fog weather conditions (rain and snow are not supported in BeamNG.tech)
  • Traffic scenarios: driving in heavy, moderate, and light traffic
  • Test all systems in different lighting conditions (day, night, dawn/dusk, tunnel)
  • Construction zones (temporary lanes, cones, barriers)
  • 💀💀 Test using an actual RC car

Visualization & Logging

  • ⭐ Full Foxglove visualization integration (Overhaul needed)
  • Modular YAML configuration system
  • Real-time drive logging and telemetry
  • Birds eye view BEV (Top down view of vehicle and surroundings)
  • Real time Annotations Overlay in Foxglove
  • Show predicted trajectories in Foxglove
  • Show Global and local path plans in Foxglove
  • Live Map Visualization

Note: Considering moving away from Foxglove entirely to build a custom dashboard. Not a priority at this time.

Deployment & Infrastructure

  • Containerize models for easy deployment and scalability
    • ⭐ Microservices architecture (aggregator + individual services)
    • Message broker (Redis support in docker-compose)
    • Docker Compose orchestration
    • Aggregator service (concurrent service orchestration)

README To-Dos

  • Add demo images and videos to README
  • Add performance benchmarks section
  • Add Table of Contents for easier navigation

Other

  • Vibe-Code a website for the project
  • Redo project structure for better modularity

A Driver Monitoring System would've been pretty cool, but human drivers are not implemented in BeamNG.tech.

Legend

🔥 = High Priority

⭐ = Complete, but still being improved/tuned/changed (not the final version)

💀 = Minimal Priority, can be addressed later

💀💀 = Very Low Priority, may not be implemented

Note on Installation

Status: This project is currently in active development. A stable, production-ready release with pre-trained models and complete documentation will be available eventually.

Known Limitations

  • Tunnel/Low-Light Scenarios: Camera depth perception fails below certain lighting thresholds
  • Multi-Camera Support: Single front-facing camera only (future roadmap)
  • Dashed Lane Detection: Requires improvement for better accuracy
  • PID Controller Tuning: May oscillate on aggressive maneuvers
  • Real-World Testing: Only validated in simulation (BeamNG.tech), for now...
  • Service Latency: Network overhead between BeamNG and containerized services (~50-100ms per aggregation)

Simulator-Specific Limitations

  • Rain/snow physics not supported in BeamNG.tech
  • Pedestrians not controllable by traffic system
  • Human drivers not implemented

Credits

Datasets:

  • CULane, LISA, GTSRB, Mapillary, BDD100K

Simulation & Tools:

  • BeamNG.tech by BeamNG GmbH
  • Foxglove Studio for visualization
  • Docker & Docker Compose for containerization

Special Thanks:

  • Kaggle for free GPU resources (model training)
  • Mr. Pratt (teacher/supervisor) for guidance


Citation

If you use VisionPilot in your research, please cite:

@software{visionpilot2025,
  title={VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception},
  author={Julian Stamm},
  year={2025},
  url={https://github.com/visionpilot-project/VisionPilot}
}

BeamNG.tech Citation

@software{beamng2025,
  title={BeamNG.tech},
  author={{BeamNG GmbH}},
  address={Bremen, Germany},
  year={2025},
  version={0.35.0.0},
  url={https://www.beamng.tech/}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
