Pedestrian Count Expansion Methods: Bridging the Gap between Land Use Groups and Empirical Clusters

Abstract: 

Count expansion methods are a useful tool for creating long-term pedestrian or cyclist volume estimates from short-term counts for safety analysis or planning purposes. Expansion factors can be developed from the trends recorded by automated counters installed for long periods of time. Evidence has shown that activity patterns vary between sites, so grouping similar long-term count trends into factor groups can produce more accurate estimates. There are two common approaches to developing factor groups in pedestrian and cyclist count expansion studies. The land use classification approach has the advantage of being simple to apply to short-term count locations based on attributes of the surrounding area, but it requires researchers to assume which characteristics correlate with different activity patterns. Empirical clustering approaches can potentially create more distinct clusters by matching locations with similar patterns, but they offer no easy way to assign the resulting factor groups to short-term count sites. This study connects the two approaches and takes advantage of the benefits of both by using objective measures of the surrounding land use to model membership in the empirical cluster groups.

Count expansion methods are a useful tool for creating long-term pedestrian or cyclist volume estimates from short-term counts for safety analysis or planning purposes. Expansion factors can be developed from the trends recorded by automated counters installed for long periods of time. Evidence has shown that activity patterns vary between sites (1), so grouping similar long-term count trends into factor groups can produce more accurate estimates.
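The idea of an expansion factor can be illustrated with a minimal sketch. It assumes, for illustration only, that a factor is the share of weekly volume observed in each hour of the week and that hour 0 is Monday 00:00. The function names and the count profile are hypothetical and are not taken from the study.

```python
def expansion_factors(hourly_counts):
    """Return the share of the weekly total observed in each hour of the week."""
    total = sum(hourly_counts)
    if total <= 0:
        raise ValueError("long-term profile must contain at least one count")
    return [count / total for count in hourly_counts]


def expand_to_weekly(short_count, observed_hours, factors):
    """Estimate weekly volume from a short count covering `observed_hours`."""
    share = sum(factors[hour] for hour in observed_hours)
    if share <= 0:
        raise ValueError("observed hours carry no share of weekly volume")
    return short_count / share


# Hypothetical long-term profile: 168 hours, busier from 07:00 to 18:59.
profile = [30 if 7 <= hour % 24 < 19 else 5 for hour in range(168)]
factors = expansion_factors(profile)

# A 2-hour count of 60 pedestrians on Tuesday, 16:00 to 17:59.
weekly_estimate = expand_to_weekly(60, [24 + 16, 24 + 17], factors)
```

If the short-term site follows a different pattern than the long-term counter that supplied the factors, the estimate is biased, which is the motivation for grouping sites into factor groups.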

There are two common approaches to developing factor groups in pedestrian and cyclist count expansion studies. Land use classification approaches group count locations by nearby land use, on the assumption that land use plays an important role in determining hourly or daily fluctuations in pedestrian activity. Researchers in previous studies have distinguished unique patterns in pedestrian activity for employment centers, neighborhood commercial areas, residential areas, and locations near multi-use trails or schools (1-3). Others have grouped patterns by employment density (4, 5), the level of urbanism, or weather conditions (6). This approach has the advantage of being simple to apply to short-term count locations based on attributes of the surrounding area, but it requires researchers to assume which characteristics correlate with different activity patterns.

Several studies have used empirical approaches that group sites based directly on the count data. These studies have employed cluster analysis (7), heuristic methods (8), and longitudinal k-means clustering (1) to classify mixed-mode, bicycle, and pedestrian activity patterns, respectively. Other studies have used similar methods (9, 10). Empirical clustering approaches can potentially create more distinct clusters by matching locations with similar patterns, but they offer no easy way to assign the resulting factor groups to short-term count sites.
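Empirical grouping can be sketched as follows. This is not the longitudinal k-means algorithm used in (1); it is ordinary k-means applied to normalized count profiles, written only to show how sites with similar patterns end up in the same group. The six sites, their six time-of-day bins, and all function names are hypothetical.

```python
def normalize(profile):
    """Convert raw counts to shares so that busy and quiet sites are comparable."""
    total = sum(profile)
    return [value / total for value in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iter=100):
    # Farthest-point initialization keeps this sketch deterministic.
    centers = [list(profiles[0])]
    while len(centers) < k:
        farthest = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical counts in six time-of-day bins; sites alternate between a
# two-peak (commute-like) pattern and a single midday peak.
raw_sites = [
    [5, 40, 10, 10, 35, 5],
    [5, 10, 40, 35, 10, 5],
    [10, 80, 25, 20, 75, 10],
    [8, 25, 85, 70, 20, 10],
    [2, 22, 6, 5, 20, 3],
    [3, 6, 20, 22, 5, 2],
]
labels, centers = kmeans([normalize(site) for site in raw_sites], k=2)
```

Normalizing before clustering means sites are grouped by the shape of their activity pattern rather than by total volume. The resulting labels describe only the long-term sites, which is the limitation noted above: nothing in the clustering itself says which group a new short-term count site belongs to.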

Griswold et al. (1) compared the land use and empirical clustering approaches for a data set of long-term counts in California, using the clustering results to refine the land use classification and comparing the accuracy of the various expansion methods. This study continues the effort to connect the two approaches and take advantage of the benefits of both. In this study, we used automated counter data from 153 locations around the U.S., developing individual weekly trends for each site and classifying the trends using a longitudinal clustering algorithm. We summarized the surrounding land use at each location using data from the Google Places API, and used these summaries as explanatory variables in a multinomial logit model to predict membership in a particular cluster. We used repeated k-fold cross-validation to evaluate the prediction accuracy of the multinomial logit model. We also compared, for different short-term count periods, the in-sample expansion accuracy of the single factor group, the empirically classified factor groups, and the factor groups predicted by the multinomial logit models.
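The modeling and validation steps can be sketched under stated assumptions. The example below fits a multinomial logit (softmax regression) by plain gradient descent and scores it with repeated k-fold cross-validation. It is not the authors' implementation: the three land use features are invented shares of nearby place types, no Google Places API call is made, the cluster labels are synthetic, and every function name is hypothetical. One weight row is kept per class rather than fixing a reference class, which is a simplification.

```python
import math
import random


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_mnl(X, y, n_classes, lr=0.5, epochs=300):
    """Fit by batch gradient descent; each class gets a bias plus one weight per feature."""
    n_feat = len(X[0]) + 1
    W = [[0.0] * n_feat for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_feat for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            probs = softmax([sum(w * v for w, v in zip(Wc, row)) for Wc in W])
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == yi else 0.0)
                for j, v in enumerate(row):
                    grad[c][j] += err * v
        for c in range(n_classes):
            for j in range(n_feat):
                W[c][j] -= lr * grad[c][j] / len(X)
    return W


def predict(W, xi):
    row = [1.0] + list(xi)
    scores = [sum(w * v for w, v in zip(Wc, row)) for Wc in W]
    return max(range(len(W)), key=lambda c: scores[c])


def repeated_kfold_accuracy(X, y, n_classes, k=3, repeats=2, seed=0):
    """Mean held-out accuracy over `repeats` random partitions into k folds."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        order = list(range(len(X)))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            W = fit_mnl([X[i] for i in train_idx], [y[i] for i in train_idx], n_classes)
            hits = sum(predict(W, X[i]) == y[i] for i in test_idx)
            accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Synthetic sites: shares of nearby places that are offices, restaurants, homes.
typical_mix = {0: (0.8, 0.1, 0.1), 1: (0.1, 0.8, 0.1), 2: (0.1, 0.1, 0.8)}
data_rng = random.Random(1)
X, y = [], []
for cluster, mix in typical_mix.items():
    for _ in range(6):
        X.append([share + data_rng.uniform(-0.05, 0.05) for share in mix])
        y.append(cluster)

accuracy = repeated_kfold_accuracy(X, y, n_classes=3)
```

Repeating the k-fold split with different shuffles averages out the luck of any single partition, which matters when the number of count sites is small. The synthetic clusters here are deliberately easy to separate, so the resulting accuracy says nothing about the accuracy reported in the study.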

Publication date: April 12, 2019
Publication type: Journal Article