Data Collection Methodology

Working With ACS Data

What is ACS Data?

American Community Survey (ACS) data are datasets released annually by the Census Bureau. The survey draws on mailed questionnaires, telephone interviews, and information collected by Census Bureau field representatives, and it samples roughly 3.5 million household addresses in the U.S. each year. Unlike the Decennial Census, the ACS collects data from a small percentage of the population on a rotating basis and uses statistical analysis to estimate the demographics and characteristics of the wider population.
The data that is then made publicly available includes the American Community Survey Public Use Microdata Sample (PUMS) files. Because this data is collected as a sample, some estimates rest on too few responses to be statistically reliable, or cannot be released without being easily connected to individual respondents. This can limit the ability to pull detailed response data for small populations in a particular census geography (such as the income or education level responses connected to respondents of racial/ethnic groups that have a small population in a given geography).

ACS Data Decisions: 1, 3, 5, and 10 (decennial data)

ACS data is then cleaned and compiled as 1-year, 3-year, and 5-year datasets. These datasets also include the margin of error for each estimate.
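To make the margins of error mentioned above more concrete, here is a minimal Python sketch of how they can be used to judge reliability. It assumes the published ACS margins of error are at the 90% confidence level (so standard error = MOE / 1.645) and uses the usual root-sum-of-squares approximation for the margin of error of a sum. The function names and the example numbers are our own illustrations, not real ACS values.

```python
import math

Z_90 = 1.645  # z-score for the 90% confidence level used for ACS margins of error


def standard_error(moe: float) -> float:
    """Convert a published margin of error into a standard error."""
    return moe / Z_90


def coefficient_of_variation(estimate: float, moe: float) -> float:
    """Standard error as a percentage of the estimate (lower means more reliable)."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100 * standard_error(moe) / estimate


def moe_of_sum(moes) -> float:
    """Approximate margin of error for a sum of estimates."""
    return math.sqrt(sum(m ** 2 for m in moes))


# Hypothetical estimate and margin of error, invented for illustration
cv = coefficient_of_variation(estimate=850.0, moe=120.0)
combined_moe = moe_of_sum([30.0, 40.0])
```

A high coefficient of variation is a signal to move to a larger geography or a longer sample period before drawing conclusions.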
When building layered or comparative research, it’s important to note when data collection occurred. In the case of this project, we are using 2018 ACS data. The ACS itself is conducted every year; it is the 5-year estimates that pool responses collected over five years. The Detroit Community Health Survey data that we are also using comes from a 2018 assessment of Detroiters.
“Users interested in studying annual trends, particular moments in time, or the most current conditions will prefer the temporal precision of the 1-year estimates. But these data are also the least reliable ACS estimates because they have the smallest sample period, and to ensure an acceptable sample size for all estimates, the Census tabulates 1-year data only for areas with populations of 65,000 or more.
Because the 5-year ACS data have a longer sample period, they are considerably more reliable than 1-year data, and the Census provides 5-year estimates for all areas regardless of population. The 5-year data are the only ACS series to include data for census tracts and block groups.
The 3-year data represent a compromise option, providing estimates for areas with populations of 20,000 or more, with finer temporal precision than 5-year data and greater reliability than 1-year data. However, the 3-Year Summary Files were discontinued by the Census after the 2013 ACS release.” -
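The population thresholds described in the passage above can be written down as a small helper. This is a rough sketch only: the function name is hypothetical, it encodes just the thresholds and the 3-year discontinuation stated in the quote, and it ignores the first year each series became available.

```python
from typing import List


def available_acs_series(population: int, release_year: int) -> List[str]:
    """List the ACS series tabulated for an area, finest temporal precision first."""
    series = []
    if population >= 65_000:
        series.append("1-year")
    # 3-year files were discontinued after the 2013 ACS release
    if population >= 20_000 and release_year <= 2013:
        series.append("3-year")
    # 5-year estimates are provided for all areas regardless of population
    series.append("5-year")
    return series


# Hypothetical populations, used only to exercise the thresholds
large_city_2018 = available_acs_series(670_000, 2018)
small_town_2012 = available_acs_series(30_000, 2012)
tiny_area_2018 = available_acs_series(5_000, 2018)
```

For census tracts and block groups, the 5-year series is the only option regardless of what this helper returns for the surrounding city.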

ACS Data Hierarchy Reference

image1.png


Building A Data Collection Plan

Getting Specific

It’s important to narrow down the scope of your project. Especially when working with Census data, which is broad and complicated, you’ll want to make sure you are using the data in a specific, strategic way. You’ll also want to be aware of the timeframe that you are studying so that you can make sure that you can source Census data relevant to that time.
As you continue to work with your data, you may face constraints that shift how you move forward. Our project focuses on data in Detroit, Michigan from 2018. At the beginning of our methodology building, we scoped our analysis to data between 2010 and 2020. However, because we used survey data from 2018, we paired it with ACS data from the same time period. We made this choice to provide an example of how public data and data collected by organizations (such as a health survey) can be combined to provide insights about the communities an organization may be engaging with. We were specifically interested in overlaying spatial data to identify specific census areas, zip codes, and city council districts with information such as:
Median Rent and Percent of Income Spent on Rent
Employment
Income per capita and Median Rent
Migration and Length of time lived in the city of Detroit
Opinions on quality of life
We picked these areas of interest because we knew we could find public data that would reveal trends on these categories.
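As a rough illustration of combining public data with organization-collected survey data, the sketch below joins ACS-style values to survey responses by ZIP code. Every number here is invented, the field names are hypothetical, and dividing annualized median rent by median household income is only a simple proxy, not the ACS's own rent-burden measure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ACS-style values keyed by ZIP code (invented for illustration)
acs_by_zip = {
    "48201": {"median_rent": 800, "median_household_income": 24000},
    "48202": {"median_rent": 900, "median_household_income": 36000},
}

# Hypothetical survey responses with a 1-5 quality-of-life score
survey_responses = [
    {"zip": "48201", "quality_of_life": 3},
    {"zip": "48201", "quality_of_life": 4},
    {"zip": "48202", "quality_of_life": 5},
    {"zip": "48226", "quality_of_life": 2},  # no matching ACS row in this sketch
]


def rent_share_of_income(median_rent, annual_income):
    """Annualized rent as a percentage of annual income (a rough proxy)."""
    return 100 * (median_rent * 12) / annual_income


def combine(acs, responses):
    """Join ACS values with averaged survey scores for ZIP codes present in both."""
    scores = defaultdict(list)
    for response in responses:
        scores[response["zip"]].append(response["quality_of_life"])
    combined = {}
    for zip_code, values in acs.items():
        if zip_code not in scores:
            continue  # skip areas with no survey responses
        combined[zip_code] = {
            "rent_share_pct": rent_share_of_income(
                values["median_rent"], values["median_household_income"]
            ),
            "mean_quality_of_life": mean(scores[zip_code]),
            "n_responses": len(scores[zip_code]),
        }
    return combined


combined = combine(acs_by_zip, survey_responses)
```

Keeping the response count next to each average makes it easier to spot areas where the survey sample is too thin to compare against the ACS values.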

Example: our methodological process

Question:
How have changes across income, demographics, migration, and housing impacted the housing market and affordability of housing for resident ‘Detroiters’ in the past 10 years?
Trends from 2010 vs 2020
More recent trends 2021-2022
Alt Suggestions:
Define when looking at Detroit proper
Define when looking at Wayne County -or- Wayne County Metro
Focus on 2019-2022
2020 5 year ACS data
Questions coming up:
Deciding if we are staying in Detroit proper or extending to the metro area, and where we make those differentiations
The question is around changes and drivers, analysis is around trends
How do we capture information about populations that work but don’t live within the city?
Define characteristics/attributions of disparity
Drivers
Income shifts
Define characteristics/attributions of displacement
Drivers
Output:
Geospatial visualizations
Tableau
QGIS
Cleaned data covering scope topics
Python (or R)
Store definitions or high-level analysis in an Excel workbook
BigQuery
SQL
Analysis:
What is analyzed in Tableau
What is analyzed in Google Data Studio
Key trends showing up visually in data
Analyze correlations and trends (+ compare what is relevant)
Demographic changes
Income changes
Analysis topics coming up after reviewing data

Timeline

Another thing that is useful as you begin to plan out your project is estimating how long you expect various steps to take. Data collection can be tedious and difficult at times, so you’ll want to give yourself more than enough time to collect everything you’ll need to form an impactful analysis. For example, is the data that you are looking for publicly available? Is it available as a BigQuery public dataset? Is the raw data you found easily accessible or will it need standardization? Will you need to query an API? Is that something you already know how to do or will you have to learn? Figuring out the answers to these sorts of questions can help you set a realistic timeline for the data collection phase of your project. After these steps, you can move on to actually acquiring the data and storing it in your data warehouse.
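If the answer to "Will you need to query an API?" is yes, the sketch below shows roughly what a Census Data API request for ACS data can look like. It is an illustration under assumptions, not the method we used: it assumes the 2018 5-year endpoint, variable B25064_001E (median gross rent), and the FIPS codes for Michigan (26) and Wayne County (163). The helper names are hypothetical, and the sample response row is invented to show the response shape (a header row followed by rows of strings). The code only builds the URL and parses a response; it does not make a network request.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.census.gov/data"


def acs5_url(year, variables, for_clause, in_clause=None, api_key=None):
    """Build a request URL for the ACS 5-year endpoint of the Census Data API."""
    params = {"get": ",".join(variables), "for": for_clause}
    if in_clause:
        params["in"] = in_clause
    if api_key:
        params["key"] = api_key
    # Keep ":", "*" and "," readable; encode spaces as %20
    query = urlencode(params, safe=":*,", quote_via=quote)
    return f"{BASE_URL}/{year}/acs/acs5?{query}"


def rows_to_dicts(rows):
    """Turn the API's list-of-lists response (header row first) into dicts."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


url = acs5_url(
    2018,
    ["NAME", "B25064_001E"],
    for_clause="tract:*",
    in_clause="state:26 county:163",
)

# Invented sample response, shown only to illustrate the shape of the data
sample_response = [
    ["NAME", "B25064_001E", "state", "county", "tract"],
    ["Example Tract, Wayne County, Michigan", "850", "26", "163", "000000"],
]
records = rows_to_dicts(sample_response)
```

Values come back as strings, so plan time for type conversion and for handling missing values before loading the results into your data warehouse.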

Publicly Accessible Data Resources

Here’s a list of some public data resources we found while in the data collection phase of our project, in case they might be of use to you! We also suggest checking the data portals affiliated with the area you are analyzing in order to access shapefiles and routinely collected data, as we have done for Detroit, Wayne County, and Michigan below.
Exploring Census Data
Shapefiles
General Demographic Data
Demographics
Income Data
Property Data
Property Value
Other Property Data

