Research Projects (Intelligent Vehicles)
Walking Pedestrian Detection
Pedestrian detection is a challenging problem that has been studied for decades. Most algorithms are based on human appearance, which varies widely and must be learned from large sample sets. Few works use motion as a feature component. In this paper, we tackle the problem by considering only the motion of walking pedestrians. This motion depends less on pedestrian pose, body shape, illumination, and background. We model pedestrian motion, which has unique properties compared with background and rigid-object motion in the spatial-temporal motion profiles. We identify pedestrian leg motion along with the body trace over a short time period. Our method works for a vehicle-borne camera, where the background also moves. We achieve more robust results by handling crowds and other degenerate cases of human motion against background and dynamic scenes. The method is particularly useful for screening pedestrians in large data sets of naturalistic driving video. Moreover, it has a low computational cost on motion profiles and can be combined with a shape-based method to reduce false positives. It further provides a feasible way to find pedestrian behaviors along the walking trace on the street.
Related publications:
Driving Video Segmentation
In vision-based autonomous driving tasks, the spatial layout of the road and traffic in the 2D image usually has to be understood at each moment. This involves detecting the road, vehicles, pedestrians, etc. In driving video, the spatial positions of these patterns are then tracked to obtain their motion. This spatial-to-temporal approach inherently demands large computational resources. In this work, we instead take a temporal-to-spatial approach to cope with fast-moving vehicles in autonomous navigation. We sample a one-pixel-wide line at each moment in the driving video, and the temporal accumulation of lines from consecutive frames forms a road profile image. The temporal connection of lines provides layout information about the road and the surrounding environment, and the method reduces the data to be processed to a fraction of the video in order to keep up with the driving speed of vehicles; we switch to the full video frame only when an unknown region is detected. The key issue is then to distinguish the different regions in a road profile; the road profile is divided in real time into road, roadside, lane mark, vehicle, etc. We show in this paper that the road profile can be learned through semantic segmentation. We apply semantic segmentation to RGB-F images of the road profile to capture both individual regions and their spatial relations on the road effectively. We have tested our method on naturalistic driving video, and the results are promising.
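The line-sampling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the sampling line is a single horizontal image row, and the function name `build_road_profile`, the `row` parameter, and the synthetic frames are all hypothetical.

```python
import numpy as np

def build_road_profile(frames, row):
    """Stack one pixel row from each frame into a temporal profile image.

    frames: iterable of H x W x C arrays, one per time step.
    row:    index of the sampling line (a hypothetical choice, e.g. a row
            that falls on the road surface ahead of the vehicle).
    Returns a T x W x C array whose vertical axis is time.
    """
    lines = [np.asarray(frame)[row] for frame in frames]
    return np.stack(lines, axis=0)

# Synthetic stand-in for driving video: 5 frames of 4 x 6 RGB pixels,
# where every pixel of frame t has the value t.
frames = [np.full((4, 6, 3), t, dtype=np.uint8) for t in range(5)]
profile = build_road_profile(frames, row=3)
```

With one row kept per frame, the profile holds 1/H of the pixels of the original video (H being the frame height), which is the kind of data reduction the paragraph refers to.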
Related publications:
Collision Avoidance
The objective of this work is the instantaneous computation of Time-to-Collision (TTC) for potential collisions using only the motion information captured with a vehicle-borne camera. The contribution is the detection of dangerous events and their degree of danger directly from motion divergence in the driving video, which is also a cue used by human drivers. Both horizontal and vertical motion divergence are analyzed simultaneously in several collision-sensitive zones. The video data are condensed into horizontal and vertical motion profiles in the lower half of the video, which show motion trajectories directly as edge traces. Stable motion traces of linear feature components are obtained through filtering in the motion profiles.
As a result, the method avoids prior object recognition and sophisticated depth sensing. The fine velocity computation yields reasonable TTC accuracy, so that a video camera alone can achieve collision avoidance from the size changes of visual patterns. We have tested the algorithm on various roads, environments, and traffic conditions, and we show the results by visualization in the motion profiles for overall evaluation.
Related publications:
Road Visualization
Nowadays, many vehicles are equipped with a vehicle-borne camera system for monitoring driver behavior, accident investigation, road environment assessment, and vehicle safety design. A huge amount of video data is recorded daily, and analyzing and interpreting these data efficiently has become a non-trivial task. To provide an index of the video for quick browsing, this work maps the video into a temporal image of reduced dimension that retains as much of the intrinsic information observed on the road as possible. The perspective-projection video is converted to a top-view temporal profile that carries precise time, motion, and event information over the course of driving. We then attempt to interpret dynamic events and the environment around the vehicle in this continuous and compact temporal profile. The reduced dimension of the temporal profile allows the video to be browsed intuitively and efficiently.
