Aggregate to Innovate: Lessons from Chelsea, Massachusetts
Why aggregate city data?
by Ashley Marcoux
From fire prevention in New Orleans to protecting children from lead poisoning in Chicago, there is no shortage of success stories from city governments that aggregate data to inform their actions. In New Orleans, private-public partnerships produced Smoke Signals, a data tool that gives the Fire Department block-by-block estimates of fire risk so it can target the distribution of free smoke detectors. Smoke Signals now offers risk assessments for 178 cities across America. In Chicago, the Health Department and the University of Chicago partnered to identify properties that are most likely to contain lead-based paint in order to prioritize interventions and outreach. This strategy now serves as a model for other Illinois cities to reduce the rate of child lead poisoning.
These success stories add to what are already strong incentives for cities to integrate and share data. Open data can increase trust and accountability in government, inform and improve service provision, and help ensure continuous improvement over time. When taken together, readily accessible city data can improve interagency collaboration, expand public engagement, support regulatory enforcement, and help evaluate program performance, among other established and emerging applications. Governments increasingly understand the value and importance of leveraging integrated data to understand and solve complex community problems.
So how do city governments become data-driven? Integrated data offers cities clear benefits, but the investments required to digitize, share, standardize, aggregate, and analyze data from different departments that use separate systems can prevent or impede data aggregation. A “smart city” may bring to mind big cities like New York City that have access to resources and sophisticated data analytics capabilities, but cities of all sizes and budgets can use data to make evidence-based policy decisions. The story of how Chelsea, a small city in Massachusetts with approximately 40,000 residents, adopted open data demonstrates how small to medium-sized cities can overcome common technical and cultural barriers to aggregate data into actionable evidence.
Case in Point: Chelsea, MA
Chelsea is a vibrant and close-knit community located across the Mystic River from Boston. It is the second most densely populated and fifth-lowest-income city in Massachusetts and is home to a diverse community of immigrants and New Americans.
There is a rich history in Chelsea of cross-sector collaboration among city departments, community agencies, and non-profits to understand and address residents’ needs. The Chelsea Collaborative, a non-profit formed in 1988 to bring stakeholders together to improve the social, environmental, and economic health of Chelsea’s residents, embodies in its name the City’s open, community-based approach to problem solving. This spirit underlies Chelsea’s open data practices today. Recognizing that siloed responses to interrelated problems like crime, poverty, and poor housing conditions often fall short, Chelsea instituted ‘the Hub’ in 2015 to reduce crime through targeted assistance to high-risk individuals and families. Led by the Chelsea Police Department, representatives from roughly 30 organizations and agencies meet weekly to discuss situations and coordinate responses. As of 2019, the Hub had addressed approximately 500 cases and received national recognition as a cost-effective model for community-based policing.
I had the opportunity to work in Chelsea as an Innovation Fellow at City Hall during the summer of 2019. The fellowship was part of the Innovation Field Lab (“IFL”), a program through the Ash Center for Democratic Governance and Innovation, which partners with city governments to develop data-driven strategies that prevent properties from falling into blight or becoming “problem properties.”
One of my primary goals over the summer was assisting the city with an initiative it had begun several years earlier: digitizing and aggregating datasets from across city departments. Through the Innovation Field Lab, the city also partnered with Tolemi, a data analytics company that hosts a map-based application called BuildingBlocks. BuildingBlocks connects and updates data held in different systems and formats across city departments and agencies. The goal of this work was not only to inform data-driven strategies to prevent and respond to problem properties, but also to support data-driven initiatives throughout City Hall.
Over the course of my fellowship, it became clear to me that aggregating data from across city departments meant overcoming two main types of challenges:
1. Technical challenges associated with identifying, accessing, cleaning, and, in one case, manually entering data held on paper so that it could be imported into BuildingBlocks and linked by property ID; and
2. Adaptive challenges related to developing a culture of data-sharing and demand for data-driven work.
Breaking Technical Barriers
Efforts to aggregate existing department data into a central open data portal had been ongoing in Chelsea for about four years when I began work at City Hall. Chelsea’s City Manager, Chief Information Officer, and Innovation and Strategy Advisor (a former Chelsea IFL fellow) recognized how valuable integrated city data could be for informing and evaluating initiatives throughout the City, and they spearheaded the initiative.
Former IFL student groups working in Chelsea, alongside City staff, had located and synced most property-related datasets to BuildingBlocks by 2019. That summer, the Chief Information Officer was working on building secure channels so that department data could update BuildingBlocks automatically. Each month, the Innovation and Strategy Officer pulled reports from various City department databases into a shared folder from which Tolemi could sync the data to BuildingBlocks. The process was time-intensive and subject to delays, and the City wanted to make data sharing more efficient, accurate, and accessible by automating updates through OpenGov, the City’s open data portal.
All that was needed to integrate a dataset into BuildingBlocks was a property ID attached to each record, whatever kind of data it held. BuildingBlocks already contained data on fire incidents, police calls, and property characteristics, among other records. My task was to locate and standardize data on delinquent taxes, fire inspections, water usage, and code violations from housing inspections. The last dataset was key: a year of code enforcement violations, from July 2018 to July 2019, was missing from BuildingBlocks, and this information was critical to building predictive models of code violations that pose significant threats to public health, one of the motivating objectives for the IFL in Chelsea.
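To make the idea of linking by property ID concrete, here is a minimal sketch in Python. It is not how BuildingBlocks works internally, which I did not see; the field name `property_id`, the dataset names, and the sample rows are all made up for illustration. It assumes each department's export can be read as a list of records and that IDs differ only in spacing and letter case.

```python
from collections import defaultdict


def normalize_property_id(raw_id):
    """Trim whitespace and uppercase so 'p-001 ' and 'P-001' match."""
    return str(raw_id).strip().upper()


def link_by_property_id(datasets, id_field="property_id"):
    """Group records from several departmental datasets under one property ID.

    `datasets` maps a dataset name to a list of dict records. Records with
    no usable ID are returned separately so they can be reviewed by hand.
    """
    linked = defaultdict(lambda: defaultdict(list))
    unmatched = []
    for name, records in datasets.items():
        for record in records:
            raw_id = record.get(id_field)
            if raw_id is None or not str(raw_id).strip():
                unmatched.append((name, record))
                continue
            linked[normalize_property_id(raw_id)][name].append(record)
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {pid: dict(by_name) for pid, by_name in linked.items()}, unmatched


# Hypothetical sample rows, for illustration only.
datasets = {
    "delinquent_taxes": [{"property_id": "p-001 ", "amount_owed": 1200.0}],
    "code_violations": [
        {"property_id": "P-001", "violation": "missing smoke detector"},
        {"property_id": "", "violation": "broken window"},
    ],
}

linked, unmatched = link_by_property_id(datasets)
```

In this sketch, both records for the same property end up under a single key, while the row with a blank ID is set aside rather than silently dropped. In practice, that leftover pile is often where the manual cleaning effort goes.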
While trying to solve the mystery of the missing inspection data, I learned that the best way to understand how inspectors use and collect data was to accompany them on inspections. Luckily, the dedicated public servants within Chelsea’s Inspectional Services Department were amenable to my joining them. I watched as they entered homes; checked for working smoke detectors, broken windows, and other violations of the state sanitary code; penciled down violations; and typed up inspection reports back at City Hall.
During conversations with inspectors, I learned that the City had changed permitting software the previous summer to bring inspection data into the cloud. However, inspectors generally preferred the form and functionality of the old software. Through observing inspections, I discovered that inspectors typed their inspection reports into Word documents, which they uploaded as PDFs to the new permitting software. The new system therefore recorded that an inspection had taken place, while the violations themselves stayed inside the attached PDFs rather than in fields that could be synced. This explained why BuildingBlocks contained records of inspections but not code violations since July 2018, when the City adopted the new software.
Other datasets were not quite as tricky to track down, but obtaining them still presented unique challenges. For example, the Assessor’s Office in Chelsea calculates property values and maintains data on properties that have delinquent taxes. Once I explained the value of aggregating delinquent tax data into BuildingBlocks, Chelsea’s assessor shared it with me without issue. The challenge was scheduling time with her so that she could explain the data to me. The data was not usable until I knew the meaning behind various abbreviations and codes. In a small city with few staff in each department, essential tasks and emergencies understandably come before nice-to-have projects like data aggregation.
My persistence paid off: by the end of the summer, I had synced most of the data I set out to find into BuildingBlocks. But securing time from the City staff whose help I needed could have been a barrier had I not had the time to press for it.
Developing a Culture of Data-sharing & Demand for Data-driven Work
Leadership may start at the top of City government, but top-down is not how city staff and community leaders describe City Manager Tom Ambrosino’s approach. The City Manager empowers city staff to strive to continuously improve the quality and delivery of City services through a spirit of collaboration and willingness to share information. The authority he lent the initiative was invaluable in gaining the buy-in and cooperation of department leaders to help make their data available.
Nevertheless, it was easier to obtain data from some departments than from others. As in most cities, each department in Chelsea has its own micro-culture and work processes that are shaped by the function it performs. Three factors determined how difficult each dataset was to obtain: restrictions in the way information was formatted; who (the City or the vendor) had access to the data files in interpretable formats; and whether departments were willing to change the way they stored data (on paper vs. digitally, as in the example with code violations in the previous section). Fortunately, as more data was aggregated in BuildingBlocks, the value of integrated data became apparent to more stakeholders. Departments became increasingly willing to contribute data in workable formats.
While the data integration process in Chelsea took several years, cities don’t need to wait for all data sources to be aggregated to start learning from them and acting. For example, even in the early stages of data aggregation, the Innovation and Strategy Officer used data on property characteristics to inform monthly Housing Task Force meeting discussions about management of “problem properties.” These early use cases helped to demonstrate the value of integrated data, strengthen incentives for data integrity, and build demand for further data-driven work.
Cloud-based applications like Tolemi’s BuildingBlocks can help cities integrate data without replacing technology or disrupting established work processes. While software reduces the time and expertise needed to build analytical capabilities, aggregating data remains a significant investment for smaller cities. The relationship between the Innovation Field Lab, Tolemi, and the City of Chelsea was entering its fifth year when I started my summer fellowship.
Building the infrastructure and investing the human capital necessary to aggregate information into actionable evidence can undoubtedly be costly, especially for resource-strapped cities. But the time and resources necessary to develop these partnerships are worthwhile investments and within the scope of most cities, with or without academic partnerships. As with any good investment, the long-term gains in government efficiency and program effectiveness are well worth the upfront costs of aggregated and open data.
Ashley Marcoux holds a Masters in Public Policy from the Harvard Kennedy School. She served as an Innovation Fellow at Chelsea City Hall in 2019, where she worked across city departments to obtain and organize city datasets. Ashley is an experienced grassroots political organizer and currently the President of the New Hampshire Young Democrats.