Various research video demos with links to available open access manuscripts, open source software and datasets.

Robust Semantic Segmentation for 3D LiDAR Point Clouds

Issue: 3D LiDAR point cloud segmentation typically relies on the spatial position and distribution of points; whilst robust across variable conditions, this sole reliance on coordinates and point intensity limits segmentation quality.

Approach: we introduce Range-Aware Pointwise Distance Distribution (RAPiD) features that exhibit rigid transformation invariance and effectively adapt to variations in point density, with a design focus on capturing the localized geometry of neighboring structures.

Application: RAPiD features exploit the inherent isotropic radiation of LiDAR and semantic categorisation for enhanced local representation and computational efficiency, while incorporating a 4D distance metric that integrates geometric and surface material reflectivity cues for improved semantic segmentation.

Our method outperforms contemporary LiDAR segmentation approaches in terms of mIoU on the SemanticKITTI (76.1) and nuScenes (83.6) datasets.
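
For illustration, a minimal sketch (not the published RAPiD implementation) of combining local 3D geometry with reflectivity into a per-point 4D neighbourhood distance distribution; the neighbour count, weighting and summary statistics below are assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree

def rapid_like_features(points_xyz, reflectivity, k=16, w_refl=0.5):
    """Illustrative only: per-point summary of a 4D distance (3D Euclidean +
    weighted reflectivity difference) to the k nearest neighbours."""
    tree = cKDTree(points_xyz)
    _, idx = tree.query(points_xyz, k=k + 1)        # first neighbour is the point itself
    nbr_xyz = points_xyz[idx[:, 1:]]                # (N, k, 3)
    nbr_refl = reflectivity[idx[:, 1:]]             # (N, k)
    d_geom = np.linalg.norm(nbr_xyz - points_xyz[:, None, :], axis=-1)
    d_refl = np.abs(nbr_refl - reflectivity[:, None])
    d_4d = np.sqrt(d_geom ** 2 + (w_refl * d_refl) ** 2)
    # summarise each point's neighbourhood distance distribution
    return np.stack([d_4d.mean(1), d_4d.std(1), d_4d.min(1), d_4d.max(1)], axis=1)

# usage: feats = rapid_like_features(xyz, refl)  # xyz: (N, 3) array, refl: (N,) array
```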

2024

[li24rapid-seg] RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for LiDAR Semantic Segmentation (L. Li, H.P.H. Shum, T.P. Breckon), In Proc. European Conference on Computer Vision, Springer, 2024 (to appear). Keywords: autonomous driving, LiDAR, semantic segmentation, 3D feature points. [bibtex] [pdf] [arxiv] [demo] [software] [poster]

Reducing Task and Model Complexity for Semantic Segmentation in 3D LiDAR Point Clouds

Issue: whilst the availability of 3D LiDAR point cloud data has grown significantly, annotation remains expensive and time-consuming, creating demand for semi-supervised semantic segmentation methods; existing approaches, however, come at the expense of computational cost and on-task accuracy.

Approach: we propose a new pipeline that employs a smaller architecture, requiring fewer ground-truth annotations to achieve superior segmentation accuracy.

Application: We use a novel Sparse Depthwise Separable Convolution module that significantly reduces complexity while retaining overall task performance, and sub-sample our training data via a new Spatio-Temporal Redundant Frame Downsampling (ST-RFD) method.

To leverage the limited annotated data samples available, we further propose a soft pseudo-label method informed by LiDAR reflectivity.

Our method outperforms contemporary semi-supervised approaches using less labeled data and with a 2.3x reduction in model parameters and 641x fewer multiply-add operations whilst also demonstrating significant performance improvement on limited training data (i.e., Less is More).
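
As a rough illustration of why depthwise separable convolutions reduce model complexity, a dense 3D analogy is sketched below; the actual Sparse Depthwise Separable Convolution module operates on sparse voxel data, so this is only indicative of the parameter saving, not the published implementation:

```python
import torch.nn as nn

def compare_parameter_counts(c_in=64, c_out=64, k=3):
    """Dense 3D analogy only: a standard 3D convolution vs. a depthwise
    separable equivalent (depthwise spatial filter + 1x1x1 pointwise mixing)."""
    standard = nn.Conv3d(c_in, c_out, k, padding=k // 2)
    separable = nn.Sequential(
        nn.Conv3d(c_in, c_in, k, padding=k // 2, groups=c_in),  # depthwise
        nn.Conv3d(c_in, c_out, 1),                              # pointwise
    )
    count = lambda m: sum(p.numel() for p in m.parameters())
    print(f"standard: {count(standard):,} params, separable: {count(separable):,} params")

compare_parameter_counts()   # e.g. ~110k vs ~6k parameters for 64->64 channels, k=3
```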

2023

[li23lim3D] Less is More: Reducing Task and Model Complexity for Semi-Supervised 3D Point Cloud Semantic Segmentation (L. Li, H.P.H. Shum, T.P. Breckon), In Proc. Computer Vision and Pattern Recognition, IEEE/CVF, pp. 9361-9371, 2023. Keywords: autonomous driving, LiDAR, laser scanning, vehicle perception, LiM3D. [bibtex] [pdf] [doi] [arxiv] [demo] [software] [poster]

RoadLoc: A development & test framework for ground-truth vehicle localisation

Issue: Developers of autonomous vehicle and associated technologies, and those validating future operational safety, need to objectively measure on-road performance at both a system-wide and per-component level.

Approach: This project set out an innovative framework - consisting of a hardware sensor and analytics software - that can be used to measure vehicle localisation error: the difference between where the vehicle thinks it is and where it actually is on the road surface itself.

Application: This can provide independent evaluation of on-board vehicle localisation systems to enable system-wide and per-component performance vs. cost optimization and improve the accuracy of any secondary environment mapping functionality.

2021

[li21durlar] DurLAR: A High-fidelity 128-channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-modal Autonomous Driving Applications (L. Li, K.N. Ismail, H.P.H. Shum, T.P. Breckon), In Proc. Int. Conf. on 3D Vision, IEEE, pp. 1227-1237, 2021. Keywords: autonomous driving, dataset, high resolution LiDAR, flash LiDAR, ground truth depth, dense depth, monocular depth estimation, stereo vision, 3D. [bibtex] [pdf] [doi] [demo] [software] [dataset] [poster]

DurLAR: A High-Fidelity 128-Channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery

Issue: a high-fidelity 128-channel 3D LiDAR dataset with panoramic ambient (near infrared) and reflectivity imagery captured in and around Durham, UK (DurLAR - Durham LiDAR with Ambient and Reflectivity)

Approach: Our driving platform is equipped with a high resolution 128 channel LiDAR, a 2MPix stereo camera, a lux meter and a GNSS/INS system. Ambient and reflectivity images are made available along with the LiDAR point clouds to facilitate multi-modal use of concurrent ambient and reflectivity scene information.

Application: Leveraging DurLAR, with a resolution exceeding that of prior benchmarks, we consider the task of monocular depth estimation and use this increased availability of higher resolution, yet sparse ground truth scene depth information to propose a novel joint supervised/self-supervised loss formulation.

Our evaluation shows that the joint use of supervised and self-supervised loss terms, enabled by the superior ground truth resolution and availability within this new dataset, improves the quantitative and qualitative performance of leading contemporary monocular depth estimation approaches.
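
A minimal sketch of how a joint supervised/self-supervised depth loss of this kind can be structured, assuming a sparse projected LiDAR ground-truth map and a precomputed photometric (view-synthesis) error term; the weights and function names are illustrative, not those of the paper:

```python
import torch
import torch.nn.functional as F

def joint_depth_loss(pred_depth, sparse_gt, photometric_error, w_sup=1.0, w_self=0.5):
    """Supervised L1 on the (sparse) LiDAR-projected ground truth pixels plus a
    self-supervised photometric term; weights and term names are illustrative."""
    valid = sparse_gt > 0                            # LiDAR projects to a sparse valid mask
    l_sup = (F.l1_loss(pred_depth[valid], sparse_gt[valid])
             if valid.any() else torch.zeros((), device=pred_depth.device))
    l_self = photometric_error.mean()                # e.g. reprojection / SSIM error from view synthesis
    return w_sup * l_sup + w_self * l_self
```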

2021

[li21durlar] DurLAR: A High-fidelity 128-channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-modal Autonomous Driving Applications (L. Li, K.N. Ismail, H.P.H. Shum, T.P. Breckon), In Proc. Int. Conf. on 3D Vision, IEEE, pp. 1227-1237, 2021. Keywords: autonomous driving, dataset, high resolution LiDAR, flash LiDAR, ground truth depth, dense depth, monocular depth estimation, stereo vision, 3D. [bibtex] [pdf] [doi] [demo] [software] [dataset] [poster]

Semantic and Geometric Automotive Scene Understanding in Adverse Weather

Issue: Automotive scene understanding under adverse weather conditions poses a realistic and challenging problem attributable to poor outdoor scene visibility (e.g. foggy weather).

Approach: we propose two multi-task learning end-to-end pipelines capable of performing real-time semantic scene understanding and monocular depth estimation under foggy weather conditions by leveraging recent advances in adversarial training and domain adaptation.

Application: we illustrate performance over several foggy weather condition datasets including both synthetic and real-world examples.

2021

[alshammari21competitive] Competitive Simplicity for Multi-Task Learning for Real-Time Foggy Scene Understanding via Domain Adaptation (N. Alshammari, S. Akcay, T.P. Breckon), In Proc. Intelligent Vehicles Symposium, IEEE, pp. 1413-1420, 2021. Keywords: autonomous driving, weather, low visibility, fog, cnn, deep learning, convolutional neural network. [bibtex] [pdf] [doi] [arxiv] [demo] [talk]
[alshammari21multimodal] Multi-Modal Learning for Real-Time Automotive Semantic Foggy Scene Understanding via Domain Adaptation (N. Alshammari, S. Akcay, T.P. Breckon), In Proc. Intelligent Vehicles Symposium, IEEE, pp. 1428-1435, 2021. Keywords: autonomous driving, weather, low visibility, fog, cnn, deep learning, convolutional neural network. [bibtex] [pdf] [doi] [arxiv] [demo] [talk]

Temporally Consistent Monocular Depth Prediction

Issue: temporally consistent depth prediction (monocular depth estimation and depth completion).

Approach: trained via synthetic images of urban driving scenarios, our approach also performs semantic scene segmentation within a single multi-task deep network.

Application: The model produces temporally consistent results via the temporal constraint of sequential frame recurrence and a pre-trained optical flow network.

A robust multi-stream deep architecture with complex skip connections results in better high-level contextual and geometric structural learning, leading to higher quality, more accurate results for both depth and segmentation.
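
A minimal sketch of one way to impose temporal consistency using a pre-trained optical flow network: warp the previous depth prediction into the current frame and penalise disagreement. Tensor conventions and the loss form are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(depth_t, depth_prev, flow):
    """Warp the previous depth map into the current frame using optical flow
    (in pixels, shape (B, 2, H, W)) and penalise disagreement."""
    b, _, h, w = depth_t.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(depth_t.device)   # (2, H, W)
    src = base.unsqueeze(0) + flow                                    # sample locations in frame t-1
    grid_x = 2.0 * src[:, 0] / (w - 1) - 1.0                          # normalise to [-1, 1]
    grid_y = 2.0 * src[:, 1] / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)                      # (B, H, W, 2)
    warped_prev = F.grid_sample(depth_prev, grid, align_corners=True)
    return (depth_t - warped_prev).abs().mean()
```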

2019

[abarghouei19depth] Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach (A. Atapour-Abarghouei, T.P. Breckon), In Proc. Computer Vision and Pattern Recognition, IEEE/CVF, pp. 3368-3379, 2019. Keywords: monocular depth, generative adversarial network, GAN, depth map, disparity, depth from single image, multiple task learning, semantic segmentation, temporal consistency. [bibtex] [pdf] [doi] [arxiv] [demo] [software] [poster] [more information]

Multi-Task Depth Completion and Monocular Depth Estimation

Issue: Robust three-dimensional scene understanding is now an ever-growing area of research highly relevant in many real-world applications such as autonomous driving and robotic navigation.

Approach: we propose a multi-task learning-based model capable of performing two tasks: sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as the input) and monocular depth estimation (i.e. predicting scene depth from a single RGB image) via two sub-networks jointly trained end to end using data randomly sampled from a publicly available corpus of synthetic and real-world images.

Application: The entire model can be used to infer complete scene depth from a single RGB image or the second network can be used alone to perform depth completion given a sparse depth input.

Using adversarial training, a robust objective function, a deep architecture relying on skip connections and a blend of synthetic and real-world training data, our approach is capable of producing superior high quality scene depth.

Extensive experimental evaluation demonstrates the efficacy of our approach compared to contemporary state-of-the-art techniques across both problem domains.

2019

[abarghouei19multi-task] To Complete or to Estimate, That is the Question: A Multi-Task Approach to Depth Completion and Monocular Depth Estimation (A. Atapour-Abarghouei, T.P. Breckon), In Proc. Int. Conf. 3D Vision, IEEE, pp. 183-193, 2019. Keywords: monocular depth estimation, convolutional neural networks, lidar, sparse-to-dense, depth completion. [bibtex] [pdf] [doi] [arxiv] [demo]

Extended Real-time Object Detection and Attribute Estimation

Issue: the majority of prior work on real-time object detection for vehicle sensing concentrates on the detection of obstacles, dynamic scene objects (pedestrians, vehicles) and road signs.

Approach: we consider the performance of extended "long-list" object detection, via an extended end-to-end Region-based Convolutional Neural Network (R-CNN) architecture, over a large-scale 31-class detection problem of urban scene objects with integrated object attribute estimation (colour and primary orientation) where appropriate.

Application: for an autonomous vehicle to be truly able to interact with occupants and other road users using a common semantic understanding of the environment it is traversing, it requires a considerably extended scene understanding capability.

We examine the extended performance of this multiple class object detection and attribute estimation task operating in real-time with on-vehicle processing at 10 fps.

Our work is evaluated under a range of real-world automotive conditions across multiple complex and cluttered urban environments.

2019

[ismail19understanding] On the Performance of Extended Real-Time Object Detection and Attribute Estimation within Urban Scene Understanding (K.N. Ismail, T.P. Breckon), In Proc. Int. Conf. on Machine Learning Applications, IEEE, pp. 641-646, 2019. Keywords: autonomous driving, object detection, stereo vision, driverless vehicles. [bibtex] [pdf] [doi] [demo]

DeGraF-Flow: Extending DeGraF Features to Optical Flow

Issue: we use the novel Dense Gradient Based Features (DeGraF) as the input to a sparse-to-dense optical flow scheme.

Approach: the pipeline consists of three stages: 1) efficient detection of uniformly distributed Dense Gradient Based Features (DeGraF); 2) feature tracking via robust local optical flow; and 3) edge-preserving flow interpolation to recover overall dense optical flow.

Application: The tunable density and uniformity of DeGraF features yield superior dense optical flow estimation compared to other popular feature detectors within this three-stage pipeline.

Furthermore, the comparable speed of feature detection also lends itself well to the aim of real-time optical flow recovery.

Evaluation on established real-world benchmark datasets shows performance in an autonomous vehicle setting where DeGraF-Flow delivers promising accuracy with competitive computational efficiency among non-GPU based methods, including a marked increase in speed over the conceptually similar EpicFlow approach.
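
A simplified sketch of the same three-stage sparse-to-dense structure using off-the-shelf components; DeGraF detection is not available in OpenCV, so goodFeaturesToTrack stands in for stage 1 and plain linear interpolation stands in for the edge-preserving interpolation of stage 3:

```python
import cv2
import numpy as np
from scipy.interpolate import griddata

def sparse_to_dense_flow(img0, img1, max_pts=2000):
    """Detect sparse features, track them with pyramidal Lucas-Kanade, then
    interpolate the tracked motion vectors to a dense (H, W, 2) flow field."""
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    pts0 = cv2.goodFeaturesToTrack(g0, maxCorners=max_pts, qualityLevel=0.01, minDistance=7)
    pts1, status, _ = cv2.calcOpticalFlowPyrLK(g0, g1, pts0, None)
    ok = status.ravel() == 1
    p0, p1 = pts0.reshape(-1, 2)[ok], pts1.reshape(-1, 2)[ok]
    h, w = g0.shape
    ys, xs = np.mgrid[0:h, 0:w]
    u = griddata(p0, (p1 - p0)[:, 0], (xs, ys), method="linear", fill_value=0.0)
    v = griddata(p0, (p1 - p0)[:, 1], (xs, ys), method="linear", fill_value=0.0)
    return np.dstack([u, v])
```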

2019

[stephenson19degraf-flow] DeGraF-Flow: Extending DeGraF Features for Accurate and Efficient Sparse-to-Dense Optical Flow Estimation (F. Stephenson, T.P. Breckon, I. Katramados), In Proc. Int. Conference on Image Processing, IEEE, pp. 1277-1281, 2019. Keywords: optical flow, Dense Gradient Based Features, DeGraF, automotive vision, feature points. [bibtex] [pdf] [doi] [arxiv] [demo]

Monocular Depth Estimation Based on a Semantic Segmentation Prior

Issue: Monocular depth estimation using novel learning-based approaches has recently emerged as a promising potential alternative to more conventional 3D scene capture technologies within real-world scenarios.

Approach: we propose a monocular depth estimation approach, which employs a jointly-trained pixel-wise semantic understanding step to estimate depth for individually-selected groups of objects (segments) within the scene.

Application: The separate depth outputs are efficiently fused to generate the final result. This creates simpler learning objectives for the jointly-trained individual networks, leading to more accurate overall depth.

Extensive experimentation demonstrates the efficacy of the proposed approach compared to contemporary state-of-the-art techniques within the literature.

2019

[abarghouei19segment-wise] Monocular Segment-Wise Depth: Monocular Depth Estimation Based on a Semantic Segmentation Prior (A. Atapour-Abarghouei, T.P. Breckon), In Proc. Int. Conf. on Image Processing, IEEE, pp. 4295-4299, 2019. Keywords: monocular depth estimation, convolutional neural networks, semantic segmentation. [bibtex] [pdf] [doi] [demo]

360 Degree Monocular Depth and Object Detection

Issue: future autonomous vehicles will not be viable without more comprehensive surround sensing, akin to that of a human driver, as can be provided by 360 degree panoramic cameras.

Approach: to adapt contemporary deep network architectures developed on conventional rectilinear imagery to work on equirectangular 360 degree panoramic imagery providing both full 360 degree depth and 3D object detection (for vehicles).

Application: To address the lack of annotated panoramic automotive datasets, we adapt a contemporary automotive dataset, via style and projection transformations, to facilitate the cross-domain retraining of contemporary algorithms for panoramic imagery.

Following this approach we retrain and adapt existing architectures to recover scene depth and 3D pose of vehicles from monocular panoramic imagery without any panoramic training labels or calibration parameters.

Our approach is evaluated qualitatively on crowd-sourced panoramic images and quantitatively using an automotive environment simulator to provide the first benchmark for such techniques within panoramic imagery.

2018

[pdlg18panoramic] Eliminating the Dreaded Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery (G. Payen de La Garanderie, A. Atapour-Abarghouei, T.P. Breckon), In Proc. European Conference on Computer Vision, Springer, pp. 812-830, 2018. Keywords: monocular depth, 360 monocular depth, 3D object pose, 360 video, 3D bounding box, vehicle detection. [bibtex] [pdf] [doi] [demo] [software] [poster]

Illumination Invariant Automotive Scene Understanding

Issue: extreme variations in environmental conditions such as varying illumination and adverse weather lead to inaccurate semantic scene understanding.

Approach: We compare four recent transforms for illumination invariant image representation, individually and with colour hybrid images, to show that, despite assumptions to the contrary, such invariant pre-processing can improve the state of the art in (deep learning based) scene understanding performance.

Application: By using an illumination invariant pre-process to reduce the impact of environmental illumination changes, we show that the performance of deep convolutional neural network based scene understanding and segmentation can be further improved.

This illuminating result reinforces the need for invariant (unbiased) training sets within such deep network training and shows that even a well-trained network may still not offer truly optimal performance (if we ignore any prior data transforms attributable to a priori insight).
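
For reference, one widely used illumination-invariant transform of the kind compared in this line of work takes the form I = 0.5 + log(G) - α·log(B) - (1-α)·log(R); a minimal sketch follows, where the α value is camera-dependent and assumed here:

```python
import numpy as np

def illumination_invariant(img_bgr, alpha=0.48):
    """I = 0.5 + log(G) - alpha*log(B) - (1 - alpha)*log(R);
    alpha is camera-dependent and the value here is a placeholder."""
    eps = 1e-6
    b, g, r = [img_bgr[..., i].astype(np.float64) / 255.0 + eps for i in range(3)]
    ii = 0.5 + np.log(g) - alpha * np.log(b) - (1.0 - alpha) * np.log(r)
    return np.clip(ii, 0.0, 1.0)
```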

2018

[alshammari18invariant] On the Impact of Illumination-Invariant Image Pre-transformation on Contemporary Automotive Semantic Scene Understanding (N. Alshammari, S. Akcay, T.P. Breckon), In Proc. Intelligent Vehicles Symposium, IEEE, pp. 1027-1032, 2018. Keywords: illumination invariance, semantic scene segmentation, pre-processing, CNN, all-weather performance, deep learning. [bibtex] [pdf] [doi] [demo] [poster]

Real-Time Monocular Depth Estimation

Issue: Synthetic images captured from a graphically-rendered virtual environment primarily designed for gaming can be employed to train a monocular depth estimation model. However, this will not generalize well to real-world images as the supervised model easily overfits to local features present within the training domain.

Approach: 1) train a primary model to estimate monocular depth based on synthetic images. 2) use a secondary model to transform real-world images to the synthetic style before their depth is estimated.

Application: At run-time, only two forward passes are required during inference – once through the style transfer network and once through the depth estimation model.

Our approach produces superior qualitative (sharper) and quantitative (lower error) results compared to the contemporary state-of-the-art.
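
A minimal sketch of the two-pass run-time described above; style_net and depth_net are placeholders for the trained style transfer generator and the synthetic-trained depth network, with pre/post-processing omitted:

```python
import torch

@torch.no_grad()
def estimate_depth(real_image, style_net, depth_net):
    """Pass 1: translate the real image into the synthetic training style;
    pass 2: estimate depth with the synthetic-trained network."""
    synthetic_style = style_net(real_image)
    return depth_net(synthetic_style)
```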

2018

[abarghouei18monocular] Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer (A. Atapour-Abarghouei, T.P. Breckon), In Proc. Computer Vision and Pattern Recognition, IEEE/CVF, pp. 2800-2810, 2018. Keywords: monocular depth, generative adversarial network, GAN, depth map, disparity, depth from single image, style transfer. [bibtex] [pdf] [doi] [demo] [software] [poster]

Real-time Low-Cost Omni-directional Stereo Vision

Issue: omni-directional dense stereo vision, based on imagery from two consumer-grade spherical cameras mounted in a novel bi-polar configuration.

Approach: we demonstrate omni-directional dense stereo vision, based on imagery from two consumer-grade spherical cameras mounted in a bi-polar configuration, offering 360 degree depth recovery at 5.5 fps.

Application: we illustrate the disparity and synchronization error achievable with the use of such consumer-grade spherical camera units, in addition to the quality of disparity (depth) available, within the context of on-road sensing for future vehicle autonomy.

2018

[lin18spherical] Real-time Low-Cost Omni-directional Stereo Vision via Bi-Polar Spherical Cameras (K. Lin, T.P. Breckon), In Proc. Int. Conf. Image Analysis and Recognition, Springer, pp. 315-325, 2018. Keywords: stereo vision, spherical camera, angular disparity correction, bi-polar stereo, vertical stereo, spherical stereo. [bibtex] [pdf] [doi] [demo] [poster]

DepthComp: Real-time Depth Image Completion

Issue: the problem of hole filling in RGB-D (colour and depth) images in real-time, when obtained from either active or stereo based sensing.

Approach: plausible hole filling in depth images via a computationally lightweight methodology that leverages recent advances in semantic scene segmentation. We identify a bounded set of explicit completion cases in a grammar-inspired context that can be performed effectively and efficiently to provide highly plausible localised depth continuity via a case-specific non-parametric completion approach.

Application: Results demonstrate that this approach has complexity and efficiency comparable to conventional interpolation techniques but with accuracy analogous to contemporary depth filling approaches.

Furthermore, we show it to be capable of fine depth relief completion beyond that of both contemporary approaches in the field and computationally comparable interpolation strategies.

2017

[abarghouei17depthcomp] DepthComp: Real-time Depth Image Completion Based on Prior Semantic Scene Segmentation (A. Atapour-Abarghouei, T.P. Breckon), In Proc. British Machine Vision Conference, BMVA, pp. 208.1-208.13, 2017. Keywords: depth filling, RGB-D, surface relief, hole filling, surface completion, 3D texture, depth completion, depth map, disparity hole filling. [bibtex] [pdf] [doi] [demo] [software] [poster]

Off-road Semantic Scene Segmentation

Issue: real-time road-scene understanding is a challenging computer vision task with recent advances in convolutional neural networks (CNN) achieving results that notably surpass prior traditional feature driven approaches.

Approach: we take an existing CNN architecture, pre-trained for urban road-scene understanding, and retrain it towards the task of classifying off-road scenes, assessing the network performance during the training cycle.

Application: We compare this CNN to a more traditional approach using a feature-driven Support Vector Machine (SVM) classifier and demonstrate state-of-the-art results in this particularly challenging problem of off-road scene understanding.

Within the paradigm of transfer learning, we analyse the effect on CNN classification of varying levels of prior training, assessed over varying sub-sets of our off-road training data.

2016

[holder16offroad] From On-Road to Off: Transfer Learning within a Deep Convolutional Neural Network for Segmentation and Classification of Off-Road Scenes (C.J. Holder, T.P. Breckon, X. Wei), In Proc. European Conference on Computer Vision Workshops, Springer, pp. 149-162, 2016. Keywords: automotive vision, off-road semantic understanding, off-road computer vision, off-road scene labelling, terrain segmentation, terrain segments, transfer learning, convolutional neural networks, bag of visual words, deep learning. [bibtex] [pdf] [doi] [demo] [poster]

Object Removal for Dense Stereo Vision Based Scene Mapping

Issue: Mapping an ever changing urban environment is a challenging task as we are generally interested in mapping the static scene and not the dynamic objects, such as cars and people.

Approach: a novel approach to dynamic object removal within stereo-based scene mapping; leveraging stereo odometry to recover camera motion in scene space, and stereo disparity to recover synthesised optic flow over the same pixel space, we isolate regions of inconsistency in depth and image intensity.

Application: This allows us to illustrate robust dynamic object removal within the stereo mapping sequence.

Results cover objects with a range of motion dynamics and sizes of those typically observed in an urban environment.

2016

[hamilton16removal] Generalized Dynamic Object Removal for Dense Stereo Vision Based Scene Mapping using Synthesised Optical Flow (O.K. Hamilton, T.P. Breckon), In Proc. Int. Conf. on Image Processing, IEEE, pp. 3439-3443, 2016. Keywords: 3D, automotive stereo, optic flow, disparity projection, pedestrian removal, object removal, scene mapping. [bibtex] [pdf] [doi] [demo] [dataset] [talk]

Raindrop Detection for Automotive Scene Understanding

Issue: degradation due to raindrop presence poses a key visual sensing integrity challenge.

Approach: raindrop detection using {shape, saliency and texture} features isolated from wider scene context and latterly deep learning based approaches.

Application: facilitates identification of scene regions where the image integrity cannot be trusted for other semantic scene understanding and navigation tasks.

Methodology proposed using both traditional feature-based processing pipeline [Webster 2015] and more recently using a combination of reduced complexity CNN architectures and a superpixel based region proposal strategy [Guo 2018].

Evaluated under a range of environmental conditions typical of all-weather automotive visual sensing applications.

2018

[guo18raindrop] On The Impact Of Varying Region Proposal Strategies For Raindrop Detection And Classification Using Convolutional Neural Networks (T. Guo, S. Akcay, P. Adey, T.P. Breckon), In Proc. Int. Conf. on Image Processing, IEEE, pp. 3413-3417, 2018. Keywords: raindrop detection, rain detection, rain removal, rain noise removal, rain interference, scene context, raindrop saliency, rain classification, CNN, deep learning. [bibtex] [pdf] [doi] [software] [poster]

2015

[webster15raindrop] Improved Raindrop Detection using Combined Shape and Saliency Descriptors with Scene Context Isolation (D.D. Webster, T.P. Breckon), In Proc. Int. Conf. on Image Processing, IEEE, pp. 4376-4380, 2015. Keywords: raindrop detection, rain detection, rain removal, rain noise removal, rain interference, scene context, raindrop saliency, rain classification. [bibtex] [pdf] [doi] [demo] [poster]

Real-time Driver Head-Pose Monitoring

Issue: Head pose estimation provides key information about driver activity and awareness. Prior comparative studies are limited to temporally consistent illumination conditions under the assumption of brightness constancy.

Approach: noting that illumination conditions inside a moving vehicle vary considerably with environmental conditions, we present a baseline comparison of three features for head pose estimation via support vector machine regression: Histogram of Oriented Gradient (HOG) features, Gabor filter responses and Active Shape Model (ASM) landmark features.

Application: we estimate driver head pose in two degrees-of-freedom and compare against a baseline approach for recovering head pose via weak perspective geometry.

Evaluation is performed over a number of in-vehicle sequences, exhibiting uncontrolled illumination variation, in addition to ground truth datasets.
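
A minimal sketch of the HOG-plus-SVM-regression variant of this comparison using scikit-image and scikit-learn; the feature parameters, kernel and two-output (yaw, pitch) setup are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
from skimage.feature import hog
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

def train_hog_pose_regressor(face_crops, poses_yaw_pitch):
    """face_crops: equally sized grayscale face images; poses_yaw_pitch: (N, 2)
    head pose angles. Returns a fitted two-output SVM regressor."""
    feats = np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                          cells_per_block=(2, 2)) for img in face_crops])
    model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0))
    model.fit(feats, np.asarray(poses_yaw_pitch))
    return model   # model.predict(hog_features) -> yaw, pitch estimates
```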

2014

[walger14headpose] A Comparison of Features for Regression-based Driver Head Pose Estimation under Varying Illumination Conditions (D.J. Walger, T.P. Breckon, A. Gaszczak, T. Popham), In Proc. International Workshop on Computational Intelligence for Multimedia Understanding, IEEE, pp. 1-5, 2014. Keywords: head pose, driver head tracking, gaze tracking, pose estimation regression. [bibtex] [pdf] [doi] [demo]

Quantitative Assessment of Automotive Dense Stereo Vision

Issue: Existing dense stereo assessment techniques consider global scene performance rather than specific performance on foreground objects.

Approach: Utilisation of a publicly available stereo imagery dataset containing foreground object annotation and laser scanner ground-truth data to produce a novel object-based 3D re-projection accuracy metric.

Application: An object based accuracy assessment via ICP provides a method for dense stereo algorithm performance analysis in dynamic real world automotive environments.

2013

[hamilton13stereo] A Foreground Object based Quantitative Assessment of Dense Stereo Approaches for use in Automotive Environments (O.K. Hamilton, T.P. Breckon, X. Bai, S. Kamata), In Proc. Int. Conf. on Image Processing, IEEE, pp. 418-422, 2013. Keywords: stereo vision, quantitative assessment, foreground objects, automotive stereo vision. [bibtex] [pdf] [doi] [demo] [poster]

Real-time Stereo Vision for Automotive Guidance

Issue: Comparative work on real-time dense stereo algorithms uses de-facto laboratory test imagery. Correspondence of these comparative results to real-world environment conditions is unexamined.

Approach: Construct a stereo rig mounted on a standard road vehicle. Compare the performance of five chosen algorithms in the dynamic, complex and noisy automotive environment.

Application: Five real-time dense stereo algorithms are evaluated over the de-facto stereo test imagery, virtual automotive stereo imagery, imagery from our own on-vehicle automotive stereo rig (below, upper) and that from an independent research project. Example results from the five algorithms are illustrated in the video in row-wise order left to right - [Block Matching - Konolige 1997], [Semi Global Block Matching - Hirschmuller, 2008], [Non Maximum Disparity - Unger, 2009], [Cross-based Local - Lu, 2009], [Adaptive Cost Aggregation - Wang, 2007] - with the upper left image showing the left stereo view of the scene for reference.

The block and semi-block based algorithms, [Block Matching - Konolige 1997] and [Semi Global Block Matching - Hirschmuller, 2008] are CPU based whilst the rest require GPU support for real-time performance.

Overall we find that the computationally complex algorithms (using GPU) can outperform the simpler ones against ground truth data on the de-facto test set. For real-world sequences, without ground truth, we subjectively assess the comparative accuracy of the results (in terms of coherent object separation and continuity) as differing from that on the test imagery.
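
For reference, the two CPU-based approaches above have readily available OpenCV counterparts; a minimal sketch with illustrative (not tuned) parameters:

```python
import cv2

def compute_disparities(left_gray, right_gray, max_disp=128):
    """Block matching and semi-global block matching disparity via OpenCV;
    both return fixed-point disparities scaled by 16."""
    bm = cv2.StereoBM_create(numDisparities=max_disp, blockSize=15)
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=max_disp, blockSize=5,
                                 P1=8 * 5 * 5, P2=32 * 5 * 5)
    d_bm = bm.compute(left_gray, right_gray).astype("float32") / 16.0
    d_sgbm = sgbm.compute(left_gray, right_gray).astype("float32") / 16.0
    return d_bm, d_sgbm
```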

2012

[mroz12stereo] An Empirical Comparison of Real-time Dense Stereo Approaches for use in the Automotive Environment (F. Mroz, T.P. Breckon), In EURASIP Journal on Image and Video Processing, Springer, Volume 2012, No. 13, pp. 1-19, 2012. Keywords: stereo vision, dense correspondence, semi-global matching, automotive stereo, vehicle-based stereo vision, survey, review. [bibtex] [pdf] [doi] [demo]

Road Environment Classification

Issue: The adaptive vehicle dynamics present in many modern vehicles generate a need for road environment classification – the ability to determine the nature of the current road or terrain environment from an on-board vehicle sensor.

Approach: Here we investigate the use of a low-cost camera vision solution capable of urban, rural or off-road classification based on the use of simple Gabor feature responses realised via specific hardware processing.

Application: Extending our earlier work, based on analysis of colour and texture features, we present a methodology for using Gabor response features for real-time visual road environment classification. Processing Gabor filters using hardware solely dedicated to this task enables improved real-time texture classification. Using such hardware enables us to successfully extract Gabor feature information for a four-class road environment classification problem. We use a summary histogram as an intermediate level of texture representation prior to final classification using a decision forest classifier.

A simple yet highly discriminatory Gabor feature response is extracted from multiple regions of interest in this forward facing camera view and combined with a trained classifier approach to resolve the multi-class road environment problem of {off-road, urban, major/trunk road and multi-lane motorway/carriageway}. The approach is shown to operate successfully over a range of illumination, weather and road conditions.

Optimal performance of ~96% correct classification is achieved for the {off-road, urban, major/trunk road, multi-lane motorway/carriageway} road type classification problem at ~22fps real-time performance.
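
A software-only sketch of the Gabor-response-histogram plus decision forest idea using OpenCV and scikit-learn; the kernel parameters, histogram binning and forest size are assumptions and do not reflect the dedicated hardware pipeline:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def gabor_histogram(gray_roi, n_orient=4, bins=16):
    """Concatenated histograms of Gabor filter responses over one region of
    interest; kernel size / sigma / wavelength values are illustrative only."""
    hists = []
    for i in range(n_orient):
        # args: ksize, sigma, theta, lambd (wavelength), gamma (aspect ratio)
        kernel = cv2.getGaborKernel((21, 21), 4.0, i * np.pi / n_orient, 10.0, 0.5)
        resp = cv2.filter2D(gray_roi.astype(np.float32), cv2.CV_32F, kernel)
        h, _ = np.histogram(resp, bins=bins, range=(-255, 255))
        hists.append(h / max(h.sum(), 1))
    return np.concatenate(hists)

# one feature vector per frame (histograms from several ROIs concatenated), labels in
# {off-road, urban, major/trunk road, motorway}; then a decision forest classifier:
clf = RandomForestClassifier(n_estimators=100)
# clf.fit(train_features, train_labels); clf.predict(test_features)
```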

2013

[mioulet13road] Gabor Features for Real-Time Road Environment Classification (L. Mioulet, T.P. Breckon, A. Mouton, H. Liang, T. Morie), In Proc. Int. Conf. on Industrial Technology, IEEE, pp. 1117-1121, 2013. Keywords: random forests, Gabor filters, histograms, road environment, scene classification. [bibtex] [pdf] [doi] [demo]

2011

[tang11classification] Automatic Road Environment Classification (I. Tang, T.P. Breckon), In IEEE Transactions on Intelligent Transportation Systems, IEEE, Volume 12, No. 2, pp. 476-484, 2011. Keywords: road type classification, colour texture classification, automotive vision, terrain classification. [bibtex] [pdf] [doi] [demo]

Automated Road Marking Recognition

Issue: The automatic extraction of road text markings from a forward facing on-board vehicle camera for secondary integration to vehicle navigation and driver control/display systems.

Approach: Marking extraction from the camera video image using a novel pipeline of inverse perspective mapping and multi-level binarisation. A trained classifier combined with additional rule-based post-processing then facilitates the real-time delivery of road marking information as required.

Application: An initial one-time calibration process facilitates the recovery of the vanishing points within the conventional (forward-facing) road scene using a variation on prior work in this area. From this an inverse perspective mapping facilitates the transformation of the forward-facing road surface video image into a "bird's eye view" of the road surface.

A multi-level intelligent thresholding technique is then used to isolate the road markings within the road surface image. This is performed over multiple levels of marking separation criteria in order to compensate for varying lighting and shadows. Overall we achieve real-time road marking extraction and symbol sequence recognition with a successful recognition rate of around 92% per symbol and around 85% for symbol sequences (words / labels). This is achieved at 15 fps.
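
A minimal sketch of the inverse perspective mapping step followed by a single thresholding pass; the four calibration points are assumed inputs from the one-time calibration, and adaptive thresholding here stands in for the multi-level binarisation described above:

```python
import cv2
import numpy as np

def birds_eye_markings(frame_bgr, src_pts, out_size=(400, 600)):
    """Warp the forward-facing view to a bird's-eye view using four calibrated
    road-plane points, then threshold to isolate bright road markings."""
    w, h = out_size
    dst_pts = np.float32([[0, h], [w, h], [w, 0], [0, 0]])
    H = cv2.getPerspectiveTransform(np.float32(src_pts), dst_pts)
    top_down = cv2.warpPerspective(frame_bgr, H, (w, h))
    gray = cv2.cvtColor(top_down, cv2.COLOR_BGR2GRAY)
    markings = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY, 31, -10)
    return top_down, markings
```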

2012

[kheyrollahi12marking] Automatic Real-time Road Marking Recognition Using a Feature Driven Approach (A. Kheyrollahi, T.P. Breckon), In Machine Vision and Applications, Springer, Volume 23, No. 1, pp. 123-133, 2012. Keywords: road marking recognition, vanishing point detection, intelligent vehicles. [bibtex] [pdf] [doi] [demo]

Real-time On-road Augmented Reality

Issue: Augmented Reality (AR) as an interface methodology for driver assistance systems requires intelligent placement of projected AR content within the scene to avoid existing environment objects.

Approach: Combined vanishing point and road surface detection enables the real-time adaptive emplacement of AR objects within a driver's natural field of view for on-road information display.

Application: Generalised Vanishing Points (VP) of the road scene are first obtained using a temporally filtered RANSAC approach. An available road surface area (i.e. free of scene objects) is subsequently identified, within this bounded area, using a colour-texture histogram surface tracking technique with additional temporal averaging of the output.

An AR object placement location, maximally distant from the available road boundary, is identified based on a Euclidean distance transform and scale parameter isolated from this road region.
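
A minimal sketch of the placement step: given a binary mask of the available road surface, the Euclidean distance transform yields both a maximally distant placement point and a scale estimate; mask conventions are assumed:

```python
import cv2
import numpy as np

def ar_placement_point(road_mask):
    """road_mask: binary image where non-zero marks free road surface. The pixel
    with the largest distance-transform value is maximally far from the region
    boundary; its distance doubles as a scale hint for the AR object."""
    dist = cv2.distanceTransform((road_mask > 0).astype(np.uint8), cv2.DIST_L2, 5)
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    return (int(x), int(y)), float(dist[y, x])
```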

2011

[bordes11ar] Adaptive Object Placement for Augmented Reality Use in Driver Assistance Systems (L. Bordes, T.P. Breckon, I. Katramados, A. Kheyrollahi), In Proc. 8th European Conference on Visual Media Production, IET, pp. sp-1, 2011. Keywords: augmented reality, road segmentation, driver assistance systems. [bibtex] [pdf] [demo] [poster]

Re-sampling & Re-Targeting for Driver Incident Simulation

Issue: Real-time dynamic video generation and visualization for integration into a driving incident simulator.

Approach: Generate a dynamically responsive and realistic composite output video visualisation of driving incidents using a combination of temporal video re-sampling, object extraction and re-targeting techniques.

Application: Motion Frame Rate Up-Conversion is used to generate an up-sampled video sequence with sufficient frame sampling to maintain perceptual video realism in response to variable speed demands from the simulator driver controls.

Background differencing is used both to track the object and to automatically set the background/foreground seed pixels needed to perform GrabCut object segmentation on a frame-by-frame basis for extracted object sequence generation. Objects are dynamically re-scaled, inserted and blended in each subsequent frame until either the end of the object sequence is reached or the driver's field of view passes the object transition path.
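
A minimal sketch of seeding GrabCut from a rough foreground estimate such as a background-differencing output; the iteration count and mask conventions are illustrative assumptions:

```python
import cv2
import numpy as np

def extract_object(frame_bgr, fg_estimate):
    """fg_estimate: rough foreground map (e.g. background differencing output,
    non-zero = probable foreground). Returns a refined 0/255 object mask."""
    mask = np.where(fg_estimate > 0, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(frame_bgr, mask, None, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_MASK)
    refined = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return refined.astype(np.uint8) * 255
```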

2011

[heras11driving] Video Re-sampling and Content Re-targeting for Realistic Driving Incident Simulation (A.M. Heras, T.P. Breckon, M. Tirovic), In Proc. 8th European Conference on Visual Media Production, IET, pp. sp-2, 2011. Keywords: Motion Frame Rate Up-Converters (MFRUC), frame interpolation, object segmentation, feature tracking. [bibtex] [pdf] [demo] [poster]

Real-time Speed Sign Recognition

Issue: We present a system for the real-time detection of the current speed limit from road-side speed signs (including national limit signs) from an onboard camera for subsequent on dashboard display or autonomous speed control.

Approach: The approach uses a novel combination of RANSAC shape detection and colour-based scene preprocessing to identify sign candidates within the scene. The detected sign candidates are then passed to a trained Neural Network classifier for final recognition. We use the host vehicle turn indicator to activate automatic turn detection based on the evaluation of a sparse optic flow field.

Application: An example of national limit detection (110 km/h limit on Polish roads) and numerical road signs is shown. Successful detection and recognition is performed in a complex environment with real-time (27 fps) performance including poor weather conditions.

Automatic turn detection causes cancellation of the current speed limit in use for display/control.

2008

[eichner08speedlimit_a] Integrated Speed Limit Detection and Recognition from Real-Time Video (M. L. Eichner, T.P. Breckon), In Proc. IEEE Intelligent Vehicles Symposium, IEEE, pp. 626-631, 2008. Keywords: automotive vision, real-time sign detection, speed limit detection, RANSAC. [bibtex] [pdf] [doi] [demo] [poster]
[eichner08speedlimit_b] Augmenting GPS Speed Limit Monitoring with Road Side Visual Information (M. L. Eichner, T.P. Breckon), In Proc. IET/ITS Conf. on Road Transport Information and Control, IET, pp. 1-5, 2008. Keywords: automotive vision, real-time sign detection, speed limit detection, RANSAC. [bibtex] [pdf] [doi] [demo]

Automotive Headlight Detection

Issue: Many drivers fail to be aware of oncoming vehicles at night. Here we aim to improve automatic detection in this area, using a single camera set-up behind the vehicle windscreen.

Approach: Light candidates are first extracted from the acquired image using intelligent thresholding (below). Tracking allows recovery of the light movement vector, which is then utilised in the rule-based identification. Lights below the artificial horizon are correctly classified according to their behaviour and arrangement.

Application: The overall performance of the system varies from 95% to 99% depending on weather conditions (left), with very few false positives occurring during the test footage utilised.

Our novel use of spatial and temporal information results in a significant reduction of false positives when compared to earlier approaches across various weather conditions.

2007

[eichner07headlights] Real-Time Video Analysis for Vehicle Lights Detection using Temporal Information (M. L. Eichner, T.P. Breckon), In Proc. 4th European Conference on Visual Media Production, IET, pp. I-9, 2007. Keywords: automotive vision, headlight detection, light tracking. [bibtex] [pdf] [doi] [demo]