Methodology
Overview
This tool combines data from several sources to offer users unique access to information about variation in the revenue and employment of businesses in the nation’s top 100 metro areas and the wages that they pay. We demonstrate how these characteristics are related to the race of business owners, their industry, and the social and economic characteristics of the neighborhood in which they operate.
Census Tract Data
To show the race of owner(s), median revenue, industry, and employment size of businesses, we purchased business records for the year 2023 from Data Axle, a data provider that maintains an extensive private dataset of company-level information. We used 2-digit North American Industry Classification System (NAICS) codes, which are the most general standardized categories available. While Data Axle reports industry information at the 6-digit (most specific) level, concerns over maintaining data confidentiality and ensuring the best functionality of our tool led us to report higher-level 2-digit industry information. To further protect the confidentiality of individual businesses, we do not report data on employment size, revenue, or industry if a tract has three or fewer businesses in any employment size class, revenue range, or industry group. For this reason, the unfiltered tract totals will not always align with the sum of businesses created by selecting all categories in any one of the business characteristics filters.
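The suppression rule above can be illustrated with a minimal sketch. It assumes one reading of the rule, in which each category cell with three or fewer businesses is withheld while the tract total is left unchanged; the function name, the tract, and its industry codes are hypothetical and are not drawn from our data.

```python
from collections import Counter

SUPPRESSION_THRESHOLD = 3  # cells with this many businesses or fewer are withheld


def suppressed_counts(categories, threshold=SUPPRESSION_THRESHOLD):
    """Count businesses per category, replacing small cells with None."""
    counts = Counter(categories)
    return {cat: (n if n > threshold else None) for cat, n in counts.items()}


# Hypothetical tract: the 2-digit NAICS code of each business located in it.
tract_industries = ["44"] * 6 + ["72"] * 4 + ["54"] * 2 + ["23"]

reported = suppressed_counts(tract_industries)
tract_total = len(tract_industries)
# Sum of the cells that remain visible after suppression.
visible_sum = sum(n for n in reported.values() if n is not None)
```

In this example the tract holds 13 businesses, but only 10 remain visible once the two small industry cells are withheld, which is why the unfiltered total and the sum across filter categories can differ.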
We used data on the residential characteristics of neighborhoods, including median household income, percentage of residents from different racial and ethnic groups, poverty rate, and employment rate, from the American Community Survey (ACS) 5-year estimates for 2018-2022. We also used ACS 5-year estimates from 2018-2022 to define groups of contiguous tracts with high non-student poverty and low median income as under-resourced communities. For more information on this definition, please refer to ICIC’s report The New Face of Under Resourced Communities. We used ACS 1-year population estimates for 2022 to restrict our tool to the nation’s 100 most populous metropolitan areas, and 2022 Office of Management and Budget metropolitan area boundaries to define our metro areas.
We used 2010 decennial census tracts to represent the geographies of neighborhoods and combined the Data Axle business records and the ACS data into these tracts. This enabled us to link the information on neighborhood residential characteristics from the ACS and business data from the Data Axle dataset while also preserving the confidentiality of individual business records. Because the boundaries of Census geographies change between data years, we used land area to weight and standardize the ACS and decennial census data across the multiple years we used.
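As a rough illustration of land-area weighting, the sketch below reallocates a count variable from one vintage of tracts to another. It assumes a crosswalk that records the share of each source tract's land area lying inside each target tract; the function name, tract IDs, shares, and population values are hypothetical, and the exact procedure used for the tool may differ (rates and medians, for example, would need a weighted average rather than a weighted sum).

```python
from collections import defaultdict


def reallocate_counts(source_counts, crosswalk):
    """Move count data from source tracts to target tracts by land-area share.

    crosswalk rows are (source_tract, target_tract, share), where share is the
    fraction of the source tract's land area that lies inside the target tract.
    """
    target_counts = defaultdict(float)
    for source, target, share in crosswalk:
        target_counts[target] += source_counts[source] * share
    return dict(target_counts)


# Hypothetical tract IDs and values, for illustration only.
population_2020_tracts = {"A": 4000, "B": 2000}
crosswalk = [
    ("A", "X", 0.75),  # 75% of A's land area lies in 2010 tract X
    ("A", "Y", 0.25),  # the remaining 25% lies in 2010 tract Y
    ("B", "Y", 1.0),   # B lies entirely inside 2010 tract Y
]

population_2010_tracts = reallocate_counts(population_2020_tracts, crosswalk)
```

Because the shares for each source tract sum to one, the total count is preserved after reallocation.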
Racial and Ethnic Definitions
Data Axle classifies the race and ethnicity of business owners using a custom coding scheme, which we recoded into categories that more closely resemble the standardized categories used by the U.S. Census Bureau. These categories are: Black, Latine, Asian, White, Other, and Unknown. While these designations are more analogous to Census categories than the original labels provided by Data Axle, there are several important distinctions. First, while the Census Bureau has historically differentiated race (Black, Asian, white, etc.) from ethnicity (Hispanic, non-Hispanic), Data Axle treats racial and ethnic categories as mutually exclusive. Second, we included multiracial people in the “Other” category, along with Indigenous people. We initially planned on reporting both on majority-Indigenous tracts and on businesses owned by Indigenous people. However, because there are very few majority-Indigenous tracts in the 100 largest metro areas, we classified majority-Indigenous tracts as having “No Racial Majority.” For a similar reason, we classified Indigenous business owners as business owners of “Other” race or ethnicity. Data Axle’s racial identification methodology relies on a language learning model to code first and last names of business owners into racial or nationality groups and verifies the racial identities of samples of business owners using surveys.
Metro and State Data
To develop insights at the metro level, we grouped business revenue data by the racial categories of business owners, the racial majority category of the census tract where their businesses are located, and whether the census tract where a business is located is part of an under-resourced community.
To find the percent of businesses in each revenue category by race/ethnicity of business owner, we found the total number of businesses in the metro in each of our owner racial categories: all Black-owned businesses, all Latine-owned businesses, all Asian-owned businesses, all white-owned businesses, all businesses owned by a person from another racial group (Middle Eastern/North African, Indigenous, or other), and all businesses where the race of the owner was unknown. We then calculated the percentage of each of those groups that fell in each revenue range.
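The calculation described above can be sketched as follows. This is a minimal illustration rather than our production code; the function name, the records, and the revenue range labels are hypothetical.

```python
from collections import Counter, defaultdict


def revenue_shares_by_group(businesses):
    """Return {owner_group: {revenue_range: percent of that group's businesses}}."""
    counts = defaultdict(Counter)
    for owner_group, revenue_range in businesses:
        counts[owner_group][revenue_range] += 1
    shares = {}
    for group, by_range in counts.items():
        total = sum(by_range.values())  # all businesses in this owner group
        shares[group] = {rng: 100 * n / total for rng, n in by_range.items()}
    return shares


# Hypothetical records: (owner racial category, revenue range label).
metro_businesses = (
    [("Black", "Under $500K")] * 3
    + [("Black", "$500K to $1M")] * 1
    + [("Latine", "Under $500K")] * 1
    + [("Latine", "$500K to $1M")] * 1
)

shares = revenue_shares_by_group(metro_businesses)
```

The percentages are computed within each owner group, so the shares for any one group sum to 100 regardless of how large that group is relative to the others.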
To find the percent of census tracts in each business median revenue category by neighborhood majority racial group, we calculated the median business revenue of each tract in the metro, then found the percentage of Black-majority, Latine-majority, Asian-majority, White-majority, and No Majority tracts with median revenue in each of our revenue ranges. Similarly, to show the percent of census tracts in each business median revenue category by under-resourced status, we found the median revenue of businesses in each tract in a metro and calculated the percentage of under-resourced and non-under-resourced tracts with median revenue in each revenue range.
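A minimal sketch of this tract-level calculation appears below. The revenue cutoffs, range labels, function names, and tract values are all hypothetical, and the same function serves for either grouping (racial-majority category or under-resourced status).

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical revenue ranges: (label, exclusive upper bound in dollars).
REVENUE_RANGES = [("Under $500K", 500_000), ("$500K or more", float("inf"))]


def revenue_range(value):
    """Return the label of the first range whose upper bound exceeds value."""
    for label, upper in REVENUE_RANGES:
        if value < upper:
            return label


def tract_shares(tracts):
    """tracts: list of (tract_group, [business revenues in that tract]).

    Returns {tract_group: {revenue_range: percent of that group's tracts}}.
    """
    counts = defaultdict(Counter)
    for group, revenues in tracts:
        # Classify each tract by the median revenue of its businesses.
        counts[group][revenue_range(median(revenues))] += 1
    return {
        group: {label: 100 * n / sum(by_range.values()) for label, n in by_range.items()}
        for group, by_range in counts.items()
    }


# Hypothetical tracts, for illustration only.
metro_tracts = [
    ("Black-majority", [200_000, 300_000, 900_000]),    # median 300,000
    ("Black-majority", [600_000, 700_000]),             # median 650,000
    ("White-majority", [800_000, 450_000, 1_200_000]),  # median 800,000
    ("White-majority", [550_000, 650_000, 750_000]),    # median 650,000
]

shares = tract_shares(metro_tracts)
```

Note that the unit of analysis here is the tract, not the business: each tract counts once, whatever number of businesses it contains.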
To show the average wages paid by businesses of different sizes, we used data from the Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW). These data provide state-wide estimates of average weekly wages paid by businesses in each of the 2-digit NAICS industries by the size of the businesses’ employment. We report these data for the state in which each metro area is located. In instances where a metro area crosses state lines, we report the data for each state that intersects the metro.