Public Roads - Sept/Oct 2009

Date: Sept/Oct 2009
Issue No: Vol. 73 No. 2
Publication Number: FHWA-HRT-09-006

Detecting Pedestrians

by David R.P. Gibson, Bao Lang, Bo Ling, Uma Venkataraman, and James Yang

With support from FHWA, researchers are integrating the latest camera technology with traffic control to improve safety at intersections.

Using cutting-edge cameras and computers, researchers are developing systems to detect pedestrians in crosswalks and delay traffic lights if needed to ensure safe passage for pedestrians such as these in Washington, DC.

In 2007, more than 4,600 pedestrians died in traffic crashes in the United States, according to the National Highway Traffic Safety Administration (NHTSA). That same year, crashes injured about 70,000 pedestrians. Zeroing in on intersections, NHTSA reports that 984 pedestrians were killed and 31,000 injured in 2005. Although these figures are lower than in previous years, the statistics underscore the continuing need for safety improvements. For instance, children 14 years old and younger accounted for 20 percent of all pedestrian injuries and 7 percent of all pedestrian fatalities. For NHTSA and the Federal Highway Administration (FHWA), even one fatality or injury is one too many.

As part of the U.S. Department of Transportation's Intelligent Transportation Systems (ITS) program, FHWA is conducting research and development of vehicle safety and driver information systems. Pedestrian monitoring could add value to many of these systems and applications, such as IntelliDrive(SM), traffic control, security monitoring, and pedestrian counting and flow analysis. Specifically, monitoring can help avoid harm to pedestrians when collision avoidance measures or emergency vehicle preemptions are imposed while pedestrians are present. Pedestrian monitoring also can help reduce delays, minimize fuel consumption, and limit vehicle emissions by facilitating traffic control optimization when pedestrians are absent.

Even after decades of research, pedestrian detection at street intersections remains a challenge. Despite the variety of existing technologies, including microwave radar; video image processing; and ultrasonic, acoustic, passive infrared (IR), active IR, piezoelectric, and magnetic sensors, these approaches have yet to excel in detecting pedestrians in real-world applications. The limitations of pedestrian sensors are largely due to the highly dynamic backgrounds typical of intersections. Variable weather and illumination conditions, for example, make it difficult to design system features and templates suitable for all situations. The high false alarm rate—that is, detecting pedestrians who are not really there—associated with these technologies has kept traffic engineers from deploying them on a widespread basis.

But the tide might be turning. Using funding available through the FHWA Small Business Innovation Research (SBIR) program, researchers have developed a new stereo vision-based approach for detecting pedestrians at intersections. The technique involves a prototype of a new IR, light-emitting diode (LED) stereo camera that can detect pedestrians both during the day and at night. The researchers also developed advanced pedestrian detection algorithms that enable them to extract generic three-dimensional (3-D) features from a stereo disparity map, leaving the human figures behind. The technology can discriminate pedestrians from vehicles because automobiles appear basically flat, while human bodies have concave shapes.

With support from the Massachusetts Highway Department (MassHighway), the researchers installed the prototype camera system at the busy State highway intersection of Route 9 and Route 47 in the town of Hadley, MA, for testing over a 3-week period. The results from this pilot test indicate that the prototype is on track toward being ready for commercial sale and widespread use.

"The pedestrian detection application using computerized stereo vision has great potential for improving pedestrian safety," says Subramanian N. Sharma, chief of engineering and research at the New Hampshire Department of Transportation's (NHDOT) Bureau of Traffic, who has been monitoring the progress of this research.

Intelligent Traffic Signal Management

At signalized intersections and midblock crosswalks, pedestrians use pushbuttons to make service requests, that is, to request the WALK signal. Once a request is granted, pedestrians can safely cross the street. But in many cases, after pressing the button, pedestrians do not wait for the signal but instead cross the street when they see a break in the traffic flow. When the crosswalk signal finally turns to WALK, pedestrians might no longer be in the crosswalk, and vehicles end up needlessly stopped. When pedestrians do wait for the WALK signal, they might cross the street quickly, also leaving stopped vehicles idling for no reason.

A reverse situation also can occur. The service time for crosswalk signals often is fixed. However, slow-moving pedestrians such as children or senior citizens might need more time to cross a street than was preconfigured in the system. In these cases, the crosswalk service time should be extended to improve safety.

Both applications, reducing and extending crosswalk times, require a pedestrian-monitoring device at the intersection. The researchers developed and tested such a device during this study, aiming for a new, robust system to detect and track multiple pedestrians. When the device detects pedestrians in the crosswalk, it sends a signal to the traffic signal controller, which in turn extends the pedestrian walk phase.

Pedestrian Intervals and Signal Phases

Excerpted from the 2003 edition of the Manual on Uniform Traffic Control Devices (MUTCD):

Guidance: The pedestrian clearance time should be sufficient to allow a pedestrian crossing in the crosswalk who left the curb or shoulder during the WALKING PERSON (symbolizing WALK) signal indication to travel at a walking speed of 1.2 m (4 ft) per second, to at least the far side of the traveled way or to a median of sufficient width for pedestrians to wait. Where pedestrians who walk slower than 1.2 m (4 ft) per second, or pedestrians who use wheelchairs routinely use the crosswalk, a walking speed of less than 1.2 m (4 ft) per second should be considered in determining the pedestrian clearance time.

Option: Passive pedestrian detection equipment, which can detect pedestrians who need more time to complete their crossing and can extend the length of the pedestrian clearance time for that particular cycle, may be used in order to avoid using a lower walking speed to determine the pedestrian clearance time.

Source: FHWA.

According to the manual for the Econolite ASC/3 traffic signal controller used for this research project, once the normal walk time is set, if a pedestrian is detected, the walk time will be extended until (1) the maximum walk time is reached, (2) the elapsed length of the walk extension plus the pedestrian clear time equals the maximum in effect, or (3) the detector input goes to false, meaning there are no pedestrians in the crosswalk. As long as pedestrians occupy the crosswalk, the traffic signals remain red on the potentially conflicting approaches designated by the traffic engineer. When pedestrians have been absent from the crosswalk for a short time, the device sends another signal to the traffic controller, which can then change the signal phase to the next appropriate phase for vehicular traffic.
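The three stopping conditions above can be sketched as a small decision function. This is only an illustration of the logic as described in the text, not the Econolite ASC/3 firmware or any controller API. All names are hypothetical, and treating "the maximum in effect" as a single configurable limit on extension plus clearance time is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WalkTiming:
    """Hypothetical timing parameters, all in seconds."""
    normal_walk_s: float    # normal (minimum) walk time
    max_walk_s: float       # maximum walk time, condition (1)
    ped_clear_s: float      # pedestrian clearance time
    max_in_effect_s: float  # assumed limit on extension + clearance, condition (2)


def keep_walk_active(timing: WalkTiming, elapsed_walk_s: float, detector_on: bool) -> bool:
    """Return True while the WALK interval should continue."""
    if elapsed_walk_s < timing.normal_walk_s:
        return True  # the normal walk time always runs to completion
    extension_s = elapsed_walk_s - timing.normal_walk_s
    if elapsed_walk_s >= timing.max_walk_s:
        return False  # (1) maximum walk time reached
    if extension_s + timing.ped_clear_s >= timing.max_in_effect_s:
        return False  # (2) extension plus clearance reached the maximum in effect
    if not detector_on:
        return False  # (3) detector input is false: crosswalk is empty
    return True


# Illustrative values only; real timings are set by the traffic engineer.
timing = WalkTiming(normal_walk_s=7, max_walk_s=20, ped_clear_s=15, max_in_effect_s=25)
```

With these example values, a pedestrian still detected 10 seconds into the walk interval keeps the phase active, while an empty crosswalk at the same moment ends it.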

This approach, a type of intelligent traffic signal transition logic, offers many benefits for a community. For example, it can help senior citizens and others safely negotiate the crosswalk. Intelligent pedestrian detection can reduce traffic congestion at street intersections, ease driver frustration, and reduce idling and associated fuel consumption and pollution. By contrast, many of today's intersection traffic control systems are necessarily designed for worst-case pedestrian crossing time scenarios and are not "intelligent" in the sense of being responsive to real-time situations. For example, the researchers had to redesign signal timing on an arterial to accommodate elderly pedestrians who could not safely cross the street in the allotted time. The original timing was designed to meet the requirements of the Manual on Uniform Traffic Control Devices (MUTCD), which called for allowing a walk speed of 4 feet (1.2 meters) per second. Note that FHWA expects to change this standard to 3.5 feet (1.07 meters) per second in the future.

The flow diagram shows how the new system locates pedestrians in the crosswalk and converts the images into a signal used by traffic lights. The structure is essentially parallel because the system performs pedestrian detection from both images, instead of from a single image, with final detection made by fusing the right and left detections.

Software Components for Detection System

For automated pedestrian detection, traffic managers can employ stereo camera systems (twin cameras), wireless receivers, software algorithms, and stand-alone computers. The key issue in image capturing is sampling time. The most important issue for the algorithms is computation time. The cameras capture a new pair of samples once the computer and algorithms have completely processed the pair captured in the previous sampling. For pedestrians at street intersections, a system must capture two or three detections in 1 second to ensure proper detection.

Another approach is image differencing. Here, the same camera (right or left) takes two consecutive images, and a "difference image" is made by subtracting the two images. Image differencing is particularly suitable for detecting moving targets. In theory, stationary objects in the image background (such as buildings and streets) appear in both consecutive images, and computers can easily remove them.
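As a minimal sketch of image differencing, the snippet below subtracts two small gray-scale frames represented as nested lists. The frame contents and the function name are made up for illustration. A real system would work on full camera frames, and the article does not describe the researchers' implementation at this level.

```python
def difference_image(previous, current):
    """Absolute per-pixel difference of two equally sized gray-scale frames."""
    return [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(previous, current)
    ]


# Hypothetical 3x4 frames: a bright "object" moves one pixel to the right
# while the background (value 10) stays the same.
frame_1 = [
    [10, 10, 10, 10],
    [10, 200, 10, 10],
    [10, 10, 10, 10],
]
frame_2 = [
    [10, 10, 10, 10],
    [10, 10, 200, 10],
    [10, 10, 10, 10],
]

diff = difference_image(frame_1, frame_2)
```

In this idealized case the stationary background cancels to zero and only the old and new positions of the moving object remain.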

However, weather and illumination changes can generate "moving objects" and shadows that cause false detections. Using image differencing alone can generate a high number of false detections. For example, on a rainy day, image differencing can produce continuous detections even when no pedestrians are in the crosswalk. But the new stereo-based approach can further discriminate the moving objects (whether true or false) using 3-D features and thus resolve the problem in practice.

In reality, however, the difference image may retain part of the stationary objects. This phenomenon is caused by camera "noise," which is mainly related to the quality of the image-sensing device, such as the charge-coupled device used in the study. Camera jiggling and illumination changes also can cause retention of stationary objects. Camera jiggling is unavoidable: because the stereo camera system is mounted outdoors at the street intersection, it jiggles in strong winds and other conditions. Illumination changes can alter the pixel values of two consecutive images randomly, making the still objects appear in the difference image. Noise in digital cameras is hardware-dependent (high-quality cameras are less noisy because of the electronics used) and can never be completely filtered out, so even advanced image-filtering algorithms will not remove all stationary objects from the difference image. To mitigate this problem, researchers have developed an advanced image-filtering method that significantly reduces camera noise.

Compared to the image-differencing approach, detection using a single image offers some advantages. Because detection is made based on a single image, it is almost immune to camera jiggling and illumination changes. However, it is difficult to use a single camera or monocamera system to reliably detect pedestrians in an outdoor environment, such as at street intersections, largely because of dynamic variations of background and pedestrian appearance.

In the stereo camera approach, systems usually detect pedestrians based mainly on 3-D features extracted from a disparity map, which provides the depth information of objects in an image. Computing this map can be time consuming. The calculation of a full-size disparity map usually takes a few seconds, making it impossible for the system to capture two or three samples per second. Further, because a pedestrian usually occupies only a small area in the image, most disparity information in a full-size map is not even used for pedestrian detection. For example, buildings and light poles can have their own disparity maps as well. Because detecting these stationary objects is unnecessary, the stereo camera system can exclude them from the calculation of the overall disparity map, thus reducing computation time. Otherwise, the system would be unable to reach three or four detections per second using a low-cost industrial PC, which often has a slow central processing unit and limited memory.

Image Filtering

To overcome problems associated with some difference images, the researchers developed a suite of approaches to filter out glitches generated by camera noise, camera jiggling, and illumination changes. To window out moving pedestrians (that is, detect and extract them from an image), the researchers developed a new method to estimate the baseline, or threshold, noise characteristics of the camera. The pedestrian detection system adaptively estimates the threshold value from the image content, independent of the camera characteristics. Image pixels with intensity values larger than this threshold are retained; the rest are treated as noise and removed.
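The article does not say how the threshold is estimated, so the sketch below substitutes a common stand-in: the mean of the difference image plus a multiple of its standard deviation. The function names and the factor `k` are assumptions for illustration, not the researchers' method.

```python
from statistics import mean, pstdev


def adaptive_threshold(diff, k=3.0):
    """Estimate a noise threshold from the image content itself.

    Stand-in rule (an assumption): mean + k * standard deviation
    of all pixel values in the difference image.
    """
    pixels = [value for row in diff for value in row]
    return mean(pixels) + k * pstdev(pixels)


def suppress_noise(diff, k=3.0):
    """Keep pixels above the estimated threshold; zero out the rest as noise."""
    threshold = adaptive_threshold(diff, k)
    return [[value if value > threshold else 0 for value in row] for row in diff]


# Hypothetical 6x6 difference image: low-level noise everywhere
# plus one strong response from a moving object.
noisy_diff = [[1] * 6 for _ in range(6)]
noisy_diff[2][3] = 190

cleaned = suppress_noise(noisy_diff)
```

Because the threshold comes from the image statistics rather than from a fixed camera constant, the same code can be applied to frames from different cameras, which matches the camera-independence goal described above.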

Shown here, two consecutive low-resolution images from a single surveillance camera show a woman and little girl walking through a parking lot. By subtracting the static portion of the images (the parked cars), the system can "window" out a new image of just the moving "objects" (the woman and girl), facilitating pedestrian detection. Photo: Migma Systems, Inc.

This method is essential in accurately chipping out moving objects since camera noise is not the only source causing the imperfect difference image. This method can be applied to various types of cameras. The pedestrian detection system applies the method simultaneously to right and left images acquired from the stereo camera system. Moving pedestrians are windowed out from both images as well. By windowing out the pedestrians, the system can detect when a pedestrian is in the crosswalk and delay the signal change to allow more time for the person to cross the intersection safely.

Disparity Map Estimation

One of the main advantages of a stereo vision system is that it relates the distance between an object and the camera to the disparity obtained from the two images taken by the right and left cameras. Researchers have studied disparity estimation for years, yet it remains a hurdle for those in the computer vision community. The main challenges are noise and occlusions (or blockages) in the image, a lack of distinguishing textures in the search region, and depth discontinuity.

The researchers used this IR LED stereo camera, with 200 LED emitters, in the new pedestrian detection system.

To accomplish the goal of capturing three or four detections per second, the researchers needed to avoid developing a computation-intensive estimation scheme for the disparity maps. Many academic methods, such as phase-based matching, Markov random field modeling, and dynamic programming, will not work quickly enough for near real-time pedestrian detection applications. However, the researchers' new approach offers an efficient method for estimating the disparity maps that satisfies the time constraint. In short, the researchers only estimate the disparity values for the objects windowed out and refine the disparity values using spatial correlations.
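To illustrate why restricting the estimate to windowed-out objects saves time, the sketch below runs a basic sum-of-absolute-differences (SAD) block match only for pixels inside a given window. SAD matching is a textbook technique used here as a stand-in; the article does not name the researchers' matching method, and the refinement step using spatial correlations is not shown. All names and the synthetic images are hypothetical.

```python
def sad(left, right, row, col_left, col_right, half):
    """Sum of absolute differences between two square patches."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col_left + dc] - right[row + dr][col_right + dc])
    return total


def window_disparity(left, right, window, max_disp, half=1):
    """Estimate disparity only inside window = (top, left, bottom, right), inclusive.

    Returns {(row, col): disparity} for left-image pixels in the window.
    """
    top, lft, bottom, rgt = window
    rows, cols = len(left), len(left[0])
    result = {}
    for r in range(max(top, half), min(bottom, rows - 1 - half) + 1):
        for c in range(max(lft, half), min(rgt, cols - 1 - half) + 1):
            best_d, best_cost = 0, None
            for d in range(max_disp + 1):
                if c - d < half:
                    break  # candidate patch would leave the right image
                cost = sad(left, right, r, c, c - d, half)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            result[(r, c)] = best_d
    return result


# Synthetic pair: a textured 3x3 object appears 2 pixels further right
# in the left image than in the right image.
ROWS, COLS, SHIFT = 7, 12, 2
right_img = [[0] * COLS for _ in range(ROWS)]
for r in range(2, 5):
    for c in range(3, 6):
        right_img[r][c] = (r * 7 + c * 13) % 50
left_img = [
    [right_img[r][c - SHIFT] if c >= SHIFT else 0 for c in range(COLS)]
    for r in range(ROWS)
]

object_window = (2, 5, 4, 7)  # where the object sits in the left image
disparities = window_disparity(left_img, right_img, object_window, max_disp=4)
```

Only the nine pixels of the window are matched here instead of the whole frame, which is the source of the time saving.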

Pedestrian Detection

Once the system estimates the disparity map, it can extract features from the map. Ideally, the disparity map of a pedestrian has the corresponding parts of the pedestrian, such as head, upper body, and legs. However, due to camera noise, camera jiggling, and illumination changes, the disparity map of a pedestrian is often disconnected or sometimes incomplete. Therefore, the system has difficulty extracting human body shapes from the map. Also, range information alone is not sufficient to discriminate between a moving vehicle and a walking pedestrian.

At the field test intersection, researchers mounted the two receivers and antenna, which receive the signals from the wireless IR LED cameras, inside a wooden box on the side of the traffic controller cabinet. The box is circled in red.

To overcome this challenge, the researchers developed a set of 3-D features from the disparity map. The features reflect the geometric differences between moving pedestrians and moving vehicles or other image artifacts caused by camera jiggling or illumination changes. However, disparity alone is not enough to accomplish the goal of near zero false detections. In addition to using 3-D disparity maps, the researchers designed the system to extract features from color images and use them in pedestrian detection and discrimination. These features can be used to categorize 3-D objects as either pedestrians or nonpedestrians. Simply put, the color of a vehicle body, for example, often is uniformly distributed, while a person's clothes tend to have mixed colors, thus facilitating distinction between vehicles and pedestrians.
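One simple way to picture a color-uniformity feature is the spread of color values inside a detected region. The measure below (average per-channel standard deviation) and its cutoff are assumptions chosen for illustration; the article does not specify which color features the system uses.

```python
from statistics import pstdev


def color_spread(pixels):
    """Average per-channel standard deviation of (r, g, b) pixels in a region."""
    channels = list(zip(*pixels))
    return sum(pstdev(channel) for channel in channels) / len(channels)


def looks_uniform(pixels, max_spread=20.0):
    """True if the region's colors are close to one shade (hypothetical cutoff)."""
    return color_spread(pixels) <= max_spread


# Hypothetical regions: a red car body with slight shading, and a person
# wearing a red top and blue trousers with visible skin tones.
vehicle_region = [(200 + i % 3, 30, 30) for i in range(60)]
pedestrian_region = (
    [(220, 40, 40)] * 20 + [(30, 30, 160)] * 20 + [(230, 190, 160)] * 20
)
```

A feature like this would be combined with the 3-D features rather than used alone, since a pedestrian in single-colored clothing would otherwise look "uniform."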

IR LED Stereo Camera

The IR LED stereo camera used in this study consists of two stand-alone cameras paired together. The stereo system captures images during all illumination conditions. The cameras provide high-resolution color images during the day and gray-scale pictures in low-light conditions such as evening and night. The cameras use a high-resolution, 0.33-inch (0.85-centimeter) color, charge-coupled device and operate at 5.8 gigahertz. One hundred LED emitters in each camera make it possible for the system to detect pedestrians 80-100 feet (24-30 meters) away in total darkness.

To construct the stereo camera, the researchers positioned the two IR LED cameras side by side, such that the two focal rays of the lenses are parallel and perpendicular to the stereo baseline, and the image planes of both lenses are colinear. This arrangement ensures that the system can estimate the disparity map accurately.
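For a parallel camera arrangement like the one described above, the standard pinhole stereo relation gives depth as focal length times baseline divided by disparity. The sketch below applies that relation. The focal length and baseline values are hypothetical, since the article does not give the prototype's calibration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a parallel stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical calibration, for illustration only.
FOCAL_LENGTH_PX = 800.0  # focal length expressed in pixels
BASELINE_M = 0.12        # distance between the two lenses in meters

near_m = depth_from_disparity(16.0, FOCAL_LENGTH_PX, BASELINE_M)  # about 6 m
far_m = depth_from_disparity(4.0, FOCAL_LENGTH_PX, BASELINE_M)    # about 24 m
```

Because depth is inversely proportional to disparity, distant pedestrians produce only a few pixels of disparity, so small matching errors matter more at long range.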

Field Trial at a Highway Intersection

With assistance from MassHighway District 2, the researchers installed the prototype at a State highway intersection in Hadley, MA, for the 3-week field trial. Workers placed a mini-PC hosting all the detection algorithms inside a nearby traffic signal controller cabinet. The system configuration for this field trial included two wireless receivers and a wireless air card that received images from the stereo camera and provided wireless Internet access. Because the metal traffic controller cabinet blocks the wireless signals, the researchers placed the receivers and wireless card in a wooden box mounted beside the cabinet. They connected a separate underground power cable to the stereo camera through the inside of the traffic light pole, essentially invisible to the public.

This photo shows the traffic controller cabinet diagonally across the intersection, highlighted with an orange circle around it.

The researchers designed the system to recover itself from power and Internet communication failures. They implemented a mechanism to track the operation status of the mini-PC, whereby the system sent a "heartbeat image" to a Web server every 5 minutes. Each heartbeat image was a snapshot of the crosswalk at the moment it was sent. From this image, the researchers could determine whether the system was operating properly, detect interference between the wireless camera and its receiver, decide whether the wireless card was working properly, and assess weather conditions at the test site.
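On the server side, a heartbeat scheme like this is usually paired with a staleness check. The sketch below flags the field unit when heartbeats stop arriving. The 5-minute interval comes from the text, while the function name and the allowance of two missed heartbeats are assumptions.

```python
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(minutes=5)  # interval stated in the article


def is_overdue(last_heartbeat, now, missed_allowed=2):
    """True if more than `missed_allowed` heartbeat intervals have elapsed.

    `missed_allowed` is a hypothetical tolerance for network delays.
    """
    return now - last_heartbeat > HEARTBEAT_INTERVAL * missed_allowed


# Arbitrary example timestamps.
last_seen = datetime(2000, 1, 1, 7, 0)
on_time = is_overdue(last_seen, last_seen + timedelta(minutes=4))
stale = is_overdue(last_seen, last_seen + timedelta(minutes=11))
```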

During the field trial, one camera in the stereo camera system recorded an image every 2 seconds during four predetermined, 2-hour periods: period A, 7-9 a.m.; period B, 11 a.m.-1 p.m.; period C, 4-6 p.m.; and period D, 8-10 p.m. These four periods represent three rush hours and one night period. The system stored the recorded images on the mini-PC's hard drive, and the researchers used them to estimate the positive detection rate and missing detection rate. The researchers manually scanned through the recorded images to identify pedestrians, then checked them against the detections, which were also recorded. If a pedestrian present in a recorded image also appeared among the images with detected pedestrians at the same time, the researchers concluded that the pedestrian detection was accurate. Otherwise, the pedestrian was counted as missed.
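The comparison between manually identified pedestrians and recorded detections can be sketched as a timestamp match. The function name and the 2-second tolerance (chosen to mirror the recording interval) are assumptions, and the timestamps are invented.

```python
def count_missed(pedestrian_times, detection_times, tolerance_s=2.0):
    """Count manually labeled sightings with no detection within the tolerance.

    Times are seconds from the start of a recording period.
    """
    missed = 0
    for sighting in pedestrian_times:
        if not any(abs(sighting - hit) <= tolerance_s for hit in detection_times):
            missed += 1
    return missed


# Hypothetical data: three labeled pedestrians, two system detections.
labeled = [10.0, 50.0, 120.0]
detected = [11.0, 119.5]
missed_count = count_missed(labeled, detected)
```

Here the sighting at 50 seconds has no nearby detection, so it is the one counted as missed.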

Evaluation of Prototype Performance

Because the study did not account for the total number of vehicles passing through the test site during the field trial, the researchers expressed the false detection rate as the number of false detections (that is, vehicles) per minute. The two intervals with the most missing detections were 8-9 a.m. and 4-5 p.m., which coincided with the local rush hours.

Other time intervals had no missing detection information because there were no independent recordings during these time intervals. One way to estimate the missing detections in these time intervals was to infer the missing detections from the actual positive detections. The researchers assumed that the number of missing detections was approximately proportional to the number of actual detections during the same period. They used this rule to infer the missing detections in the time intervals in which there were no recorded images.
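The proportionality rule can be written out in a few lines. In this sketch the miss-to-detection ratio is pooled over all hours that have recorded images and then applied to the remaining hours. Pooling is one reading of the rule and is an assumption, as are the names and the example counts.

```python
def estimate_missing(detections, recorded_missing):
    """Fill in missing-detection counts for hours without recorded images.

    detections: {hour: positive detections} for every hour.
    recorded_missing: {hour: counted misses} for hours with recordings only.
    """
    recorded_detections = sum(detections[hour] for hour in recorded_missing)
    if recorded_detections == 0:
        raise ValueError("need at least one detection in the recorded intervals")
    ratio = sum(recorded_missing.values()) / recorded_detections
    return {
        hour: recorded_missing.get(hour, ratio * count)
        for hour, count in detections.items()
    }


# Hypothetical counts: hours 7 and 8 were recorded, hours 9 and 10 were not.
hourly_detections = {7: 10, 8: 20, 9: 30, 10: 15}
counted_misses = {7: 1, 8: 2}
estimated = estimate_missing(hourly_detections, counted_misses)
```

With these numbers the pooled ratio is 3 misses per 30 detections, so an unrecorded hour with 30 detections is assigned about 3 estimated misses.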

False Detection Rate Summary

Time Period            False Detection Rate
6:00 a.m.–6:00 p.m.    1 per 63 minutes
24 hours               1 per 77 minutes

Source: Migma Systems, Inc.

Using the number of missing detections (actual and estimated), the researchers estimated the rate of missed pedestrian detections as one pedestrian per 15 hours between 6 a.m. and 9 p.m. Because that window itself spans 15 hours, and few if any pedestrians crossed the intersection after 9 p.m. and before 6 a.m., the researchers concluded that the pedestrian missing rate is approximately one per day.

Detections Made During 6 a.m.–6 p.m. and 24 Hours

Date          Pedestrian             Vehicle                Pedestrian  Vehicle
              (6:00 a.m.–6:00 p.m.)  (6:00 a.m.–6:00 p.m.)  (24 hours)  (24 hours)
August 10     9                      9                      10          14
August 11     9                      19                     9           25
August 12     22                     13                     25          19
August 13     14                     11                     18          18
August 14     16                     8                      19          18
August 15     9                      10                     13          22
August 16     6                      5                      6           14
August 17     7                      12                     11          19
August 18     2                      3                      2           6
August 27     12                     2                      14          10
August 28     13                     10                     21          17
August 29     8                      20                     11          26
August 30     13                     8                      15          20
August 31     16                     11                     20          24
September 1   11                     9                      14          14
September 2   10                     3                      19          10
September 3   16                     19                     6           15
September 4   15                     10                     23          16
September 5   17                     20                     24          13
September 6   6                      19                     7           41
September 7   14                     9                      18          20
September 8   9                      12                     14          15
September 9   2                      2                      2           2
Total         256                    244                    321         398

Source: Migma Systems, Inc.

The researchers summarized the overall performance of the prototype as follows: false alarm rate = one per 60-70 minutes; pedestrian missing rate = one per day.

This bar chart records missing detections for every hour from 6 a.m. until 9 p.m.

Representative Pedestrians

The system detected many pedestrians at the intersection, individuals and groups, during the field trial. For example, the system detected a pedestrian walking with a cane. Detection of slow-moving people is important because they might require extended service times to walk across the street safely.

This bar chart records detections and missing detections for every hour from 6 a.m. until 9 p.m. The researchers counted the missing detections from images recorded during four predetermined intervals.

This bar chart records detections and missing detections for every hour from 6 a.m. until 9 p.m. For the time intervals other than the four predetermined ones, the researchers estimated the missing detections using the new method they developed.

The system also detected children with bicycles crossing the intersection. This ability is extremely important for protecting the safety of children. Although pedestrians generally are difficult to detect at night using regular video cameras, the IR LED stereo camera developed by the researchers was able to detect pedestrians in the dark. And, in early morning and late afternoon, pedestrians often cast long shadows that make it difficult to detect them accurately. But, again, the researchers' system overcame these challenges.

During the field trial, the IR LED stereo camera captured this 256- by 192-pixel-resolution color image of a pedestrian walking slowly with a cane.

Product Development

With the field trial completed, the third phase of the project, product development, is now underway. In the current prototype, all detection algorithms are executed by the mini-PC placed inside the traffic signal controller cabinet. In the final product offering, the IR LED stereo camera will host the detection algorithms. Moreover, the smart stereo camera will support Wi-Fi wireless communication, making it an Internet protocol (IP) IR LED stereo camera that overcomes the difficulties of disparity estimation caused by IP packet random delays. (The commercial IP camera has a random delay, so it is impossible to acquire two images simultaneously using two stand-alone commercial IP cameras. Therefore, the disparity map cannot be estimated using these IP cameras. In this study, the researchers acquired the images from two IR LED cameras to avoid the random delays. Detection results are transmitted wirelessly using TCP/IP.)

Phase three funding also will facilitate product prototyping and manufacturing. FHWA will carefully balance the mechanical design of the product and its cost to make the final product affordable for widespread deployment. The researchers expect that the final product would cost less than $3,000.

And what of the ultimate, long-term value of this research? Says Dan Stewart, manager of the bicycle and pedestrian program at the Maine Department of Transportation, "This initiative has the strong potential to improve pedestrian safety and reduce injuries and deaths, as well as improve traffic flow."

These photos are actual 256- by 192-pixel-resolution color images from the stereo camera showing a child on a bicycle (left) and, later, three children walking their bicycles across the intersection.

These gray-scale infrared images demonstrate the camera system's ability to detect pedestrians at night. At left, a couple crosses the intersection toward the camera; at right, a couple walks away from the camera.

The system is capable of distinguishing pedestrians from the long shadows they cast in the morning and evening. At left, the camera captures a man walking toward it and, at right, a woman and child walking away from the camera.

 


David R.P. Gibson is a highway research engineer on the Enabling Technologies Team in FHWA's Office of Operations Research and Development. He is a registered professional traffic engineer with bachelor's and master's degrees in civil engineering from Virginia Polytechnic Institute and State University. His areas of interest include traffic sensor technology, traffic control hardware, traffic modeling, and traffic engineering education.

Bao Lang is the district traffic engineer for MassHighway. He is a registered professional engineer with a bachelor's degree in civil engineering from the University of Massachusetts.

Bo Ling received his M.S. in applied mathematics and Ph.D. in electrical engineering from Michigan State University. He has served as a principal investigator for numerous government-funded research projects. He is a cofounder and president and CEO of Migma Systems, Inc. Ling is a senior member of the Institute of Electrical and Electronics Engineers, Inc. and a part-time faculty member of Northeastern University's Department of Electrical and Computer Engineering. His research interests include applying advanced signal processing algorithms to sensing applications.

Uma Venkataraman has an M.S. in computer science from India's University of Madras and more than 8 years of software development experience. She joined Migma Systems as a senior software engineer in 2003. She focuses on application software development, with research interest in the application of ontological models to enable data sharing and system interoperation among disparate systems.

James Yang received his B.S. in mechanical engineering and B.S. in computer science from Southeast University in China, and his master's in computer information systems from Boston University. He joined Migma Systems as a network engineer in 2005.

Acknowledgement: The technology described here was developed under USDOT SBIR Phase I funding (Contract No. DTRT57-05-C-10105) and Phase II funding (Contract No. DTRT57-06-C-10030).

For more information, contact David R.P. Gibson at 202-493-3271 or david.gibson@fhwa.dot.gov or Bo Ling at 508-660-0328 or bling@migmasys.com.