A woman takes a selfie at Yosemite National Park. Photo credit: Zsolt Palatinus, Pexels


Parks, green spaces, and other public lands provide people with a broad range of physical, social, and economic benefits. Perhaps most importantly, well-managed, maintained, and accessible open spaces offer unique and critical places to recreate – for restorative and rejuvenating activities such as exercise, socializing, and experiencing nature. However, not everyone has access to these natural places, in part because decision-makers cannot effectively evaluate management options or advocate for open spaces without better information on the amount and character of recreational use. The time and expense required to count and survey visitors are too great, and timelines for informing policy and management are too short to wait for new data collection.

Seasonal trends in visits (left) to Western U.S. National Parks from 2005-2014 according to the Park Service's visitor surveys (black) correspond with numbers of photographs shared on Flickr (red). Image credit: Spencer Wood


To help governments and community organizations meet the increasing and varying demand for outdoor recreation, eScience research scientist Spencer Wood is leading a team that is pioneering the use of crowd-sourced data and volunteered information from social media as a near-real-time source of data on recreation. The team is researching how to combine big digital data streams with traditional data sources in order to map and measure public land use more efficiently, and at finer spatial and temporal scales.

They are finding that the locations of photographs uploaded to photo-sharing websites such as Flickr and Instagram are a reliable source of data on the numbers of visitors, and that these methods can be scaled up to study many destinations and points in time. Locally, in Mount Baker-Snoqualmie National Forest, for example, the density of photographs taken by hikers on trails is correlated in space and time with on-site counts of visitors and with trip reports posted to the WA Trails Association’s online hiking guide. Similarly, at National Parks in the western U.S., seasonal trends in recreational use are mirrored by seasonal trends in the popularity of parks online. In New York City and the Twin Cities, trends in park use are mirrored by trends in the numbers and types of social media postings from the same locations.
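As a rough illustration of what such a comparison can look like, the Python sketch below counts photo activity per month and correlates it with on-site visitor counts. It is not the team's method: the photo records and visitor counts are invented toy values, and the choice to count each user at most once per day (so that a burst of photos from one person is not mistaken for many visitors) is an assumption made for this example.

```python
from collections import defaultdict
from datetime import date
from math import sqrt

# Hypothetical photo metadata for one site: (user_id, date taken). Toy data only.
photos = [
    ("u1", date(2016, 6, 4)), ("u1", date(2016, 6, 4)),  # same user, same day
    ("u2", date(2016, 6, 18)),
    ("u1", date(2016, 7, 2)), ("u3", date(2016, 7, 2)), ("u4", date(2016, 7, 16)),
    ("u2", date(2016, 8, 6)), ("u3", date(2016, 8, 6)), ("u5", date(2016, 8, 7)),
    ("u6", date(2016, 8, 20)),
    ("u1", date(2016, 9, 3)),
]

# Hypothetical on-site visitor counts per month for the same site. Toy data only.
onsite_counts = {6: 410, 7: 640, 8: 820, 9: 230}


def user_days_by_month(records):
    """Count unique (user, day) pairs per month.

    Assumption for this sketch: several photos by one user on one day
    are treated as a single visit.
    """
    totals = defaultdict(int)
    for _user, day in set(records):
        totals[day.month] += 1
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


monthly = user_days_by_month(photos)
months = sorted(onsite_counts)
r = pearson(
    [monthly.get(m, 0) for m in months],
    [onsite_counts[m] for m in months],
)
print(f"Correlation between photo activity and on-site counts: {r:.2f}")
```

With real data, a high correlation at sites that have on-site counts is what would justify using photo activity as a proxy at sites that do not.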

Visitors to Washington beaches (right) live in communities across the entire region, according to images of beaches shared on Flickr from 2005 - 2014. The size of the orange circles indicates the number of beach visitors from each community. Orange lines connect to the beaches that community members visited. Image credit: Spencer Wood


The team is also developing approaches for collecting demographic profiles of visitors to public lands, in order to answer questions about who is being served by open spaces and who isn’t. Affiliate eScience researcher Afra Mashhadi is testing methods that use convolutional neural networks to analyze publicly shared photographs for information on the socio-economic status of park visitors. These data feed new statistical models that are aimed at predicting the quantity and quality of recreational use. This allows managers to evaluate how features of the built and natural environments enhance recreational opportunities and benefits, as well as which neighborhoods or communities would be most affected by improvements or losses associated with potential changes in management or accessibility. This research is supporting ongoing planning processes at several locations globally, including parks within Seattle, King County, and elsewhere in Washington.

The project team is tackling several data and software design issues associated with collecting and synthesizing rich data on outdoor recreation. First, the team is developing an ontology for representing and storing data on recreational use. Traditionally, information about park visitors has been collected using very heterogeneous methods – including passive automated counters and traditional intercept surveys and visitor counts – and at varying spatial and temporal scales.

Practical information for recreation planning can be gleaned from millions of data points from social media postings to Flickr (purple), Twitter (red), Instagram (yellow), and WA Trails Association (green) around the Middle Fork Snoqualmie River and I-90 corridor, east of Seattle. Social media posts were used to train a statistical model that predicts weekly visits to recreation sites in this region. Inset graphs show predicted weekly visits to four popular recreation sites (yellow lines) compared with actual numbers of visitors to the same sites (blue lines) during the summer of 2016. Image credit: Spencer Wood


This heterogeneity and the lack of a data standard have made it difficult to integrate information from many sites or researchers. Next, the team is developing a database and infrastructure for querying recreational data through an application programming interface (API). As this infrastructure becomes available, the team is beginning to scope new user-facing software that will allow other decision-makers and partners to make use of their analytical methods for summarizing and visualizing data streams.
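To make the integration problem concrete, here is a minimal Python sketch of one way observations from different collection methods could share a common record format and be rolled up to a common time scale. Everything in it is hypothetical: the `Observation` fields, the source labels, the site identifier, and the values are invented for illustration and do not describe the team's ontology, database, or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Observation:
    """One recreation-use observation in a shared format (hypothetical schema)."""
    site_id: str   # identifier of the recreation site
    day: date      # day the observation refers to
    source: str    # collection method, e.g. "trail_counter" or "flickr"
    measure: str   # what the value means, e.g. "visits" or "posts"
    value: float


def week_start(day):
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_totals(observations):
    """Sum values per (site, week, source, measure).

    Sources and measures are kept separate so that visitor counts and
    social media posts are never added together.
    """
    totals = defaultdict(float)
    for obs in observations:
        key = (obs.site_id, week_start(obs.day), obs.source, obs.measure)
        totals[key] += obs.value
    return dict(totals)


# Toy records standing in for an automated counter and a photo-sharing site.
records = [
    Observation("site_a", date(2016, 7, 4), "trail_counter", "visits", 120),
    Observation("site_a", date(2016, 7, 6), "trail_counter", "visits", 95),
    Observation("site_a", date(2016, 7, 5), "flickr", "posts", 7),
    Observation("site_a", date(2016, 7, 12), "flickr", "posts", 4),
]

totals = weekly_totals(records)
for key, value in sorted(totals.items()):
    print(key, value)
```

Recording the source and the meaning of each value alongside the number is what lets data gathered at different spatial and temporal scales be compared week by week without being conflated.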

A vibrant community of practice is forming around data-driven tools and approaches for making better and more coordinated decisions and for improving the opportunities that people have to recreate outdoors. So far, the diverse team includes researchers, public land managers, and data experts from the UW eScience Institute, US Forest Service, National Park Service, WA Trails Association, Trust for Public Land, Western Washington University, Mountains to Sound Greenway Trust, King County Natural Resources and Parks, and Seattle Parks and Recreation, with funding from the Bullitt Foundation and the US Forest Service.

Through applied research and design, the team aims to create methods and tools that are accurate, legitimate, and relevant for real-life applications. For more information on this research, the recreation data infrastructure, and the community of practice in outdoor recreation planning, contact Spencer Wood at spwood(@)uw.edu.