Public Roads - September/October 2016

Date:
September/October 2016
Issue No:
Vol. 80 No. 2
Publication Number:
FHWA-HRT-16-006

Big Data

by Tianjia Tang and Gene McHale

Collecting, storing, and processing transportation information is presenting new challenges but also new opportunities to track travel.

 

Scientists at the Transportation Secure Data Center analyze national multimodal transportation data on a wall-sized map. FHWA partnered with the National Renewable Energy Laboratory to establish the center to securely handle data for millions of miles of travel.

 

American spirit, ingenuity, and success are evident in the Nation’s canals, railways, highways, airports, and seaports. In 2009, this multimodal transportation system logged more than 392 billion person trips. In 2013, it moved more than 20.1 billion tons (18.2 billion metric tons) of goods worth nearly $18 trillion, according to a recent report from the U.S. Bureau of Transportation Statistics. Capturing data like these helps not only with understanding how the U.S. transportation system is functioning, but also with determining how transportation professionals can improve it.

In the last 20 years, increases in the amount and availability of data, driven primarily by the near ubiquity of the Internet and wireless devices, have enabled the transportation community to understand issues in a timelier manner at both the micro and macro levels. This increase in data is both a challenge and an opportunity for the transportation community. It enables a more in-depth understanding of highway safety issues, travel behavior, and mobility, but it also demands significant technological resources and human capital investment to maintain.

The Federal Highway Administration is undertaking a wide range of initiatives to adopt, use, and lead data-driven and fact-based decisionmaking and management. These initiatives cover subjects like financial management, program administration, and transportation planning and operations, as well as advanced research activities. Here’s some background on “big data” and a review of FHWA’s information collection goals and initiatives.

Defining Big Data

The phrase “big data” refers not only to the large volume of information, but also to the various types of data, the real-time nature of modern data, and the new tools and techniques necessary to manage and analyze it.

Big data has at least seven dimensions: volume, velocity, variety, veracity, value, people, and governance. Volume refers to the amount of data available. Velocity refers to how quickly the data are generated or gathered. Variety refers to how the data are structured. Veracity refers to the data’s trustworthiness, and value refers to how meaningful the information is. People play a vital role in processing and analyzing data. Finally, governance refers to the flow of information and how the data are gathered and processed.

Recording, Storing, and Processing Big Data

The digital revolution has facilitated the ability to record and analyze a tremendous amount of information. In the transportation sector, traffic flow sensors, cameras, and GPS devices can record needed information. In addition, transportation agencies record the funds they spend and the fees they charge to maintain a given highway. Smartphones record the paths of travelers, tracking geospatial and temporal information. Sensors track and record weather and roadway conditions. Connected vehicles and driverless vehicles communicate not only with central infrastructure, but also among themselves. All these devices and countless others generate and store big data.

Data are often stored in a database, on a hard drive, or, more recently, in Web-based storage known as a cloud. Databases store data digitally in rows or columns, or some other hybrid fashion. This storage method affects data retrieval and processing efficiencies. Differences in processing efficiency, which are usually minimal for small data, are hugely consequential for big data. Choosing the right way of storing and processing big data is critical.
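
To illustrate why storage layout matters, here is a minimal sketch in Python. The sensor records, field names, and helper functions are made up for illustration and do not come from any FHWA system. The same readings are held row by row and column by column. Averaging speed from the column layout reads only the one field it needs, while the row layout has to visit every full record.

```python
# Hypothetical traffic-sensor readings; field names and values are illustrative only.
rows = [
    {"sensor_id": "A1", "hour": 7, "speed_mph": 52.0, "volume": 410},
    {"sensor_id": "A1", "hour": 8, "speed_mph": 38.0, "volume": 655},
    {"sensor_id": "B2", "hour": 7, "speed_mph": 61.0, "volume": 220},
    {"sensor_id": "B2", "hour": 8, "speed_mph": 57.0, "volume": 305},
]


def to_columns(records):
    """Pivot a list of row records into one list per field (column layout)."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def mean_speed_from_rows(records):
    # Row layout: every whole record is visited just to read one field.
    return sum(r["speed_mph"] for r in records) / len(records)


def mean_speed_from_columns(cols):
    # Column layout: only the speed column is read.
    speeds = cols["speed_mph"]
    return sum(speeds) / len(speeds)


columns = to_columns(rows)
print(mean_speed_from_rows(rows))        # 52.0
print(mean_speed_from_columns(columns))  # 52.0
```

With four records the difference is invisible. With billions of records and dozens of fields, skipping the unused fields is where the efficiency gap described above comes from.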

Collecting Data of Value

When FHWA considers launching a big data initiative, it first makes an assessment of value. That is, how will the data initiative help plan, operate, and maintain the U.S. highway system more efficiently? How will the effort help the public understand transportation issues and challenges better? Could it enable the public to become part of the solution?

 

The FHWA Data Visualization Center produced this infographic to demonstrate how Federal fuel taxes have changed over time. Using graphics like this to visualize data makes it easier to see the big picture.

 

“We do not collect, process, and analyze data just for the sake of it,” says David Winter, senior policy information executive at FHWA. “We have a clear goal and objective behind all of our data programs. Through big data, we can better understand our highway system, which helps us accomplish our goal to develop more effective and efficient solutions and strategies in a timely manner. It also helps us determine what we’re missing.”

To ensure efficient data collection and processing, and to achieve maximum data compatibility and comparability, FHWA developed and recently began implementing its first data governance guidance with the publication of the FHWA Data Governance Plan – Volume 1: Data Governance Primer in July 2015. The guidance prescribes FHWA review processes for data investment and establishes internal data architectures for efficient data collection, storage, and analysis. The guidance also defines structured data elements that are common in the transportation community, promoting data consistency, comparability, and compatibility across the sector. More information on the guidance is available at www.fhwa.dot.gov/datagov.

 

The Data Visualization Center produced this infographic showing how the number of vehicles on the road is increasing at a rate faster than the number of drivers on the road.

 

To promote public involvement in transportation decisionmaking, FHWA strives to make big data accessible. As Web-based technology advances, FHWA has shifted its focus from releasing static data and analyses to delivering both data and analytical tools for the public to use in performing its own analysis. The agency also delivers data in many different formats, accommodating the range of analytical tools and commercial software available on the market.

“Our public deserves to know how government does its business,” says U.S. Department of Transportation Chief Data Officer Daniel Morgan. “Providing easy access for the public to our data is the first step in the right direction. We continually strive to do this in a timelier and more comprehensive manner.”

Making data accessible also means making it easy to understand. Data visualization helps analysts decipher patterns and interrelationships that may otherwise be hard to recognize, and using maps, charts, and figures offers an opportunity to see the big picture. FHWA has established a Data Visualization Center to deliver data and information to the public in a visually informative, easy-to-understand format.

Analyzing Passenger Travel

FHWA established the National Household Travel Survey to assist transportation planners and policymakers who need comprehensive data on travel and transportation patterns in the United States. Since 1969, Americans randomly selected from residential addresses have been taking part in the survey. The information gathered has contributed to improving safety, reducing congestion, tracking improvements in air quality, and planning for future investments in transportation.

USDOT and other agencies use this information in a number of public reports. For example, FHWA published a summary of travel trends captured by the 2009 survey (see http://nhts.ornl.gov/2009/pub/stt.pdf). Other reports are available at http://nhts.ornl.gov/publications.shtml. FHWA also has published a wide range of database queries for public use at http://nhts.ornl.gov/tools.shtml.

 

Participants at the 2016 annual meeting of the Transportation Research Board take a look at depictions of travel behavior based on results from the National Household Travel Survey.

 

Under FHWA’s Exploratory Advanced Research program, the Office of Highway Policy Information has initiated research to devise analytical tools to evaluate corridors of regional and national significance for passenger travel as related to both freight and local passenger movements. This effort has resulted in a functioning prototype model called the Traveler Analysis Framework. Reports and products resulting from this effort are available at www.fhwa.dot.gov/policyinformation/analysisframework/02.cfm.

Analyzing Freight Movement

FHWA also has developed online tools to analyze freight data. The Freight Analysis Framework, produced through a partnership between the Bureau of Transportation Statistics and FHWA, integrates data from a variety of sources to create a comprehensive picture of freight movement among States and major metropolitan areas by all modes of transportation. Starting with data from the 2012 Commodity Flow Survey and international trade data from the U.S. Census Bureau, the framework incorporates data from agriculture, extraction, utility, construction, service, and other sectors. Users can download freight data collected and analyzed under the framework and generate a wide range of customized traffic flow maps, charts, and figures and perform additional analysis, such as customized congestion evaluations. The framework is available at http://ops.fhwa.dot.gov/FREIGHT/freight_analysis/faf/index.htm.

Freight Flows by Highway, Railroad, and Waterway: 2010

Shown here are multimodal freight flows based on data from the FHWA Freight Analysis Framework. The framework integrates data from many sources to create a comprehensive picture of freight movement across the United States.

 

The Freight Analysis Framework Data Tabulation Tool is designed to enable users to create and download summary tables directly from the framework’s regional database. The user can select one or more elements from each of four categories (total, domestic, import, and export flows) to generate a customized dataset on demand and download the results in Excel® for further analysis. The tool is available at http://faf.ornl.gov/faf4/Extraction0.aspx.

Ensuring Data Security

In 2010, FHWA partnered with the National Renewable Energy Laboratory to establish the Transportation Secure Data Center to handle second-by-second GPS readings for millions of miles of travel, along with vehicle characteristics and demographics on survey participants, while preserving their privacy. The data center provides free access to detailed transportation data from a variety of travel surveys and studies via a secure network. This centralized repository offers features such as linked reference layers, data filtering, road grade and road network matching, summary statistics, and dataset comparisons through its online portal at www.nrel.gov/transportation/secure_transportation_data.html.

Technicians screen the initial data for quality control, translate each dataset into a consistent format, and interpret the data for spatial analysis. Users may quickly access publicly available cleansed data without personally identifiable information to support applications that require detailed travel distance and/or speed information, but not detailed latitude and longitude spatial information.
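
The idea of keeping travel distance and speed while withholding latitude and longitude can be sketched in a few lines of Python. This illustrates the concept only and is not the center's actual procedure. The function names, the record layout, and the sample coordinates are all assumptions made for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def strip_location(points):
    """Turn (seconds, lat, lon) GPS readings into records with no coordinates.

    Each output record keeps only elapsed time, distance, and speed for one
    step between consecutive readings.
    """
    cleansed = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        miles = haversine_miles(lat0, lon0, lat1, lon1)
        cleansed.append({
            "elapsed_s": dt,
            "distance_mi": miles,
            "speed_mph": miles / dt * 3600,
        })
    return cleansed


# Made-up second-by-second readings: (time in seconds, latitude, longitude).
trip = [(0, 39.0000, -77.0000), (1, 39.0002, -77.0000), (2, 39.0004, -77.0000)]
records = strip_location(trip)
print(len(records))            # 2
print(sorted(records[0]))      # ['distance_mi', 'elapsed_s', 'speed_mph']
```

The output records still support analyses that need distance or speed, but they carry no coordinates that could place a trip on a map.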

Researchers may access additional data, notably location data, by submitting an application for clearance by an oversight committee. Cleared users may work directly with protected data via remote connection and receive analysis results that have passed a disclosure assessment by the center’s staff.

Integrating Different Types of Data

Very often, a single organization will collect a single set of data for a specific purpose. But collecting data in multiple dimensions and from multiple sources presents a fuller picture. Recently, FHWA launched an Integrated Transportation Data Analysis Platform that combines project-level data on financial investments with annual data from the Highway Performance Monitoring System reported by States, annual data on the National Bridge Inventory and information reported by State highway agencies, and traffic flow data.

This new platform has enabled FHWA to perform integrated analysis with great flexibility and efficiency for the first time. It offers not only timely analysis of the results of program implementations and impacts by translating data into information, but also the evidence needed to decipher how certain conditions (for example, roadway surface, traffic, weather) impact one another. The system is currently available for use only by FHWA; as it is further enhanced, the agency plans to make it available to the public.

Gathering Performance Management Data

Although performance management for accountability is not a new concept in highway operations, the Moving Ahead for Progress in the 21st Century Act (2012) and the subsequent Fixing America’s Surface Transportation Act (2015) have made performance management a legal requirement. In response to the legal mandate and for consistency and comparability purposes, FHWA has been obtaining travel time data covering National Highway System roadways and making it available for use by all State highway agencies and metropolitan planning organizations.

The FHWA travel time data, known as the National Performance Management Research Data Set, includes passenger vehicle and truck travel time for a given roadway segment in intervals of 5 minutes. Roadway segments called traffic message channels range from less than 0.1 mile (0.16 kilometer) in congested urban areas to more than 10 miles (16 kilometers) in rural areas. With nearly 230,000 miles (370,000 kilometers) of National Highway System roadways, which consist of more than 2.6 million unique traffic message channels that each have 288 time intervals in a day, the datasets demand significant analytical capability.
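
A rough sense of scale follows directly from the figures above. The short Python calculation below multiplies the number of traffic message channels by the number of 5-minute intervals in a day. It treats "more than 2.6 million" as exactly 2.6 million and counts segment-interval slots, so it is an upper-bound sketch: not every slot necessarily holds an observation, and passenger vehicle and truck travel times would be separate values within each slot.

```python
MINUTES_PER_DAY = 24 * 60
INTERVAL_MINUTES = 5      # interval length stated in the article
SEGMENTS = 2_600_000      # "more than 2.6 million" traffic message channels

intervals_per_day = MINUTES_PER_DAY // INTERVAL_MINUTES
slots_per_day = SEGMENTS * intervals_per_day
slots_per_year = slots_per_day * 365

print(intervals_per_day)       # 288
print(f"{slots_per_day:,}")    # 748,800,000
print(f"{slots_per_year:,}")   # 273,312,000,000
```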

 

These server banks are inside the Transportation Secure Data Center, where FHWA and the National Renewable Energy Laboratory make available to the public detailed transportation data from a variety of travel surveys and studies via a secure network.

 

FHWA has integrated this data with its Highway Performance Monitoring System to analyze both speed and volume and underlying infrastructure conditions. Examples of the latter include the number of lanes, width of lane, curvature, and pavement condition.

In addition, FHWA is developing an online data portal that State highway agencies can use to submit performance management data and information.

Using Big Data for Safety and Operations Research

In 2006, FHWA’s research program on roadway safety partnered with the American Association of State Highway and Transportation Officials and the Transportation Research Board to launch the second Strategic Highway Research Program (SHRP2). This program created two unique sets of big data: the Naturalistic Driving Study and the Roadway Information Database.

The driving study recruited 3,500 male and female volunteers of various ages from six States and outfitted their vehicles with video cameras, radar, GPS, and other sensors to collect data continuously as they went about their daily driving tasks. The majority of participants were in the study for 1 to 2 years during a period from 2010 to 2013. Data recorded during the study include information on more than 5.4 million trips representing more than 30 million vehicle-miles (48 million vehicle-kilometers) and 1 million hours traveled. These data provide information on the driver and driving behavior, individual trip characteristics, including events (crashes and near-crashes), nonevent “normal” driving (exposure data), and continuous vehicle network data, such as accelerator and brake use, steering wheel angle, and speed.

The Roadway Information Database contains geospatial data that provide the context for the driving study’s trips, including roadway characteristics and features, crash histories, traffic volumes, weather, 511 information, work zones, and railroad crossings.

Integrating these data offers scaled and detailed information on human behavior, vehicle and roadway conditions, and other contextual information. Together, these data provide previously unavailable information to the highway community on how people actually drive in real-world conditions. The driving study and information database are geo-referenced and linkable, enabling driver behavior to be matched with the roadway and temporal elements, such as surrounding traffic, work zones, and weather. These data provide decisionmakers with better information, resulting in a more efficient, reliable, and inherently safer experience for road users.

FHWA’s Office of Safety Research and Development is currently developing the Safety Training and Analysis Center at the Turner-Fairbank Highway Research Center to assist the research community and State departments of transportation in using driving study data and the information database. The center will provide qualified researchers with secure access to the data without jeopardizing the privacy of those who participated in the study. It will advance the design, development, and execution of training to support research and analysis focused on addressing the safety of the roadway environment, specifically the impact of roadway features on driving behavior. It also will provide technical assistance to stakeholders (primarily State DOTs), including access to subject matter experts from the academic and scientific communities working at the Turner-Fairbank Highway Research Center to expand the roadway safety body of knowledge.

The Safety Training and Analysis Center also will support the needs of USDOT by using SHRP2 safety data to conduct research on priority topics, and by developing tools that enhance data extraction and analytical capabilities.

Average Speed on Maryland Interstate Highways: July 2013

These graphs showing morning peak-hour vehicle speeds are based on data from the National Performance Management Research Data Set.

 

FHWA also has begun a pooled fund study to support research using data from the driving study and information database. The goal is to advance the development of implementable solutions for State and local transportation agencies with an emphasis on the broad areas of safety, operations, and planning. FHWA will lead the pooled fund with active participation from member State and local agencies to determine what research to undertake. The studies could include development and improvement of countermeasures, predictive models, and design guides; policy recommendations; and the advancement of a connected-automated highway system, among other subjects. The Safety Training and Analysis Center will manage the pooled fund and, with oversight and approval of a technical advisory committee, develop work plans to address the committee’s research needs, manage research contracting, and conduct each individual project.

Big Data on Connected Vehicles

The USDOT connected vehicle research program is a multimodal initiative that aims to enable safe, interoperable networked wireless communications among vehicles, infrastructure, and personal communications devices. Research has resulted in a considerable body of work supporting pilot deployments, including concepts of operations and prototyping for more than two dozen applications.

The pilot deployments are expected to integrate connected vehicle research into practical and effective elements, enhancing existing operational capabilities. The intent of the pilot deployments is to encourage partnerships of multiple stakeholders, such as private companies, States, transit agencies, commercial vehicle operators, and freight shippers. The goal is to deploy applications using data captured from multiple sources to support improved system performance and enhanced performance-based management. The data sources include vehicles, mobile devices, and infrastructure, across all elements of the surface transportation system from transit to freeway, arterial, parking facilities, and tollways. The pilot deployments also are expected to support an impact evaluation that will inform a broader cost-benefit assessment of connected vehicle concepts and technologies.

In 2012, USDOT tested more than 2,800 equipped vehicles operating on the streets of Ann Arbor, MI. All of the vehicles were equipped with a vehicle awareness device that wirelessly transmitted data, including speed, location, and heading data, for receipt by specially equipped participating vehicles and roadside devices. In addition, nearly 400 of these vehicles were also equipped with vehicle-to-vehicle safety technology that could both transmit and receive information. The technology helped everyday drivers avoid crashes as they traveled along their normal routes. Safety apps warned drivers of hazards such as braking vehicles ahead, vehicles in their blind spots, or impending traffic light violations. The study proved that connected vehicle technology indeed works in the real world and in a variety of vehicle types, including cars, trucks, transit vehicles, motorcycles, and even bicycles.

2015 Transportation Datapalooza
USDOT hosted the 2015 Transportation Datapalooza on June 16 and 17 at its headquarters in Washington, DC. This conference was the third annual data-focused event hosted by USDOT. The 2015 event focused on sharing data collection, applications, and analytical techniques spanning all transportation modes and highlighting innovations in harnessing the power of big data to develop a safe and efficient multimodal transportation system. An estimated 440 participants attending in person and online learned the latest information on various USDOT data initiatives. They also learned about USDOT data coverage and primary data usages, industry and private business advancements in data collection and analysis, challenges associated with big data, research and development needs, and private and public partnerships in data collection and processing opportunities. USDOT plans to host the next Datapalooza in the summer of 2017.


Imagery used to promote the 2015 Datapalooza.

During the Ann Arbor Safety Pilot, each of the 2,836 vehicles generated data every one-tenth of a second. The FHWA Office of Operations Research and Development managed a project to test the pilot’s safety data using a 2-month sample of this data, with close to 4 billion basic safety messages. The project assessed the data’s capability to predict near-term future traffic conditions. Initial results indicate that researchers could use connected vehicle data to predict future road congestion, given current conditions and a network model created using big data analytics. A cleansed version of this 2-month data sample is available now on a research data exchange at www.its-rde.net, along with many other public datasets. For more information on the connected vehicle research program and its pilot deployments, visit the Intelligent Transportation Systems Joint Program Office Web site at www.its.dot.gov/pilots.
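
The message volume quoted above for the Ann Arbor Safety Pilot can be put in perspective with a back-of-the-envelope calculation in Python. It rests on assumptions the article does not state: that each tenth-of-a-second data point corresponds to one basic safety message, that "close to 4 billion" can be rounded to 4 billion, and that messages are spread evenly across all 2,836 vehicles. Under those assumptions, the result is an average transmitting time per vehicle over the 2-month sample, not a measured figure.

```python
VEHICLES = 2_836
MESSAGES_PER_SECOND = 10          # one every one-tenth of a second
SAMPLE_MESSAGES = 4_000_000_000   # "close to 4 billion", rounded for illustration

# Messages per second if every vehicle were transmitting at the same time.
fleet_rate = VEHICLES * MESSAGES_PER_SECOND

# Average transmitting time per vehicle implied by the sample size.
seconds_per_vehicle = SAMPLE_MESSAGES / fleet_rate
hours_per_vehicle = seconds_per_vehicle / 3600

print(fleet_rate)                    # 28360
print(round(hours_per_vehicle, 1))   # 39.2
```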

Building the Foundation of Technology Programs

FHWA champions big data through its leadership and significant investments. The Fixing America’s Surface Transportation (FAST) Act allows FHWA not only to dedicate monetary investments (per FAST Act Section 6028: Performance Management Data Support Program), but also to step up efforts to engage State highway agencies and local governmental agencies to improve data quality and timeliness, and to enhance data collection and utilization efforts.

FHWA’s big data programs will evolve with technological advances in data generation, collection, and analysis, but its goals will remain the same.

“FHWA’s goal to make our highways the safest, most efficient, and convenient to our people will never change,” says FHWA Associate Administrator for Research, Development, and Technology Michael Trentacoste. “Our effort to achieve the ultimate objective of ‘zero’ deaths will never waver. However, we must rely on new technology, and data and information are the foundation for all our technology programs.”


Tianjia Tang is chief of the Travel Monitoring and Surveys Division at FHWA’s Office of Highway Policy Information. He holds a bachelor’s degree in engineering from the University of Central Florida and a Ph.D. from the University of Arkansas. He is a registered professional engineer in Georgia.

Gene McHale leads the Transportation Operations Applications team, Office of Operations Research and Development, at FHWA’s Turner-Fairbank Highway Research Center. He holds a bachelor’s and a master’s degree in systems engineering from the University of Virginia and a doctorate in civil engineering from Virginia Tech. He is a registered professional engineer in Virginia.

For more information, contact Tianjia Tang at 202–366–2236 or tianjia.tang@dot.gov, or Gene McHale at 202–493–3275 or gene.mchale@dot.gov.