This study is a reproduction of Charles W. Sterling III et al.'s study “Connections Between Present-Day Water Access and Historical Redlining”.
Sterling III, Charles W., et al. “Connections between present-day water access and historical redlining.” Environmental Justice (2023). DOI:[10.1089/env.2022.0115](https://doi.org/10.1089/env.2022.0115)
This study uses ACS census data and historical HOLC neighborhood records to examine the correlation between historical redlining and present-day access to water in cities. It uses a binary logistic regression to relate demographic and HOLC variables to water and sewage access. The study finds that worse historical HOLC grades were correlated with less access to water in cities across all regions of the United States.
- Key words: ACS, HOLC Grade, Redlining, Water Access
- Subject: Social and Behavioral Sciences: Geography: Human Geography
- Date created: 4/7/25
- Date modified: 4/7/25
- Spatial Coverage: A collection of inner-city cores with historical HOLC grades that are stored in the University of Richmond Mapping Inequality database.
- Spatial Resolution: Area-weighted census blocks clipped by HOLC zone
- Spatial Reference System: WGS84 EPSG:4326
- Temporal Coverage: 2016-2020
- Temporal Resolution: Observations collected yearly
- Funding Name: none
- Funding Title: none
- Award info URI: none
- Award number: none

Original study metadata:

- Spatial Coverage: Specific inner cities that are included in the MID HOLC grade dataset across the United States. Data is limited to post-WW2 inner core census tracts due to the boundaries of HOLC zones.
- Spatial Resolution: resolution of original study
- Spatial Reference System: spatial reference system of original study
- Temporal Coverage: temporal extent of original study
- Temporal Resolution: temporal resolution of original study

This study is a reproduction of Charles W. Sterling III et al.'s study “Connections Between Present-Day Water Access and Historical Redlining”, which set out to explore the relationship between the prevalence of the ACS complete plumbing variable in census block groups and historical neighborhood HOLC grades. Sterling hypothesizes that there will be a link between HOLC grade and complete plumbing. This hypothesis is based on the HOLC’s well-documented history of assigning neighborhoods of color the lowest grades and on previous findings of a nationwide link between race and complete plumbing prevalence in Deitz and Meehan 2019. Like this reproduction, Sterling’s work uses a logistic regression to test the link between neighborhood grades and plumbing.
Other methods in the original study include the use of areal interpolation to assign ACS variable characteristics to the HOLC-graded neighborhoods. This is a somewhat strange strategy. Areal interpolation is often used to fill in missing or anomalous values in gridded data, and it makes little sense for assigning data to an irregularly shaped polygon. Furthermore, areal interpolation assigns a value that is an average of the values of the areal units on either side of it, which implies that the phenomenon in the filled-in area continues the patterns present near it. This is almost the opposite of the point of HOLC grades, which were assigned to neighborhoods based on the characteristics that set them apart.

There is a fairly simple explanation for why Sterling describes the approach this way. The study that Sterling cites as the published usage of this technique for re-aggregating HOLC data, Fricker and Allen 2022, used area-weighted reaggregation (AWR) to reassign casualties from tornado paths to the neighborhoods they passed through, but described the method as ‘areal interpolation’. Sterling therefore likely also used AWR. This assumption is supported by an inspection of the Python package used for the reaggregation, which considers AWR “the simplest form of areal interpolation”, a somewhat misleading designation given that the two terms refer to different techniques.
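
To make the distinction concrete, the arithmetic behind area-weighted reaggregation of a count variable can be sketched in base R. This is a minimal illustration on invented numbers, not the study's implementation: the table `pieces`, its column names, and all values are hypothetical, and it assumes the overlap areas between block groups and HOLC polygons have already been computed.

```r
# Hypothetical overlap table: one row per (block group, HOLC polygon) piece.
pieces <- data.frame(
  bg_id      = c("bg1", "bg1", "bg2", "bg2"),
  holc_id    = c("A1",  "D3",  "A1",  "D3"),
  piece_area = c(30,    70,    50,    50),   # area of the overlap
  bg_area    = c(100,   100,   100,   100),  # area of the whole block group
  housing    = c(200,   200,   400,   400)   # block group housing-unit count
)

# Share of each block group that falls inside each HOLC polygon.
pieces$weight <- pieces$piece_area / pieces$bg_area

# Allocate the count to each piece in proportion to area, then sum by polygon.
pieces$housing_alloc <- pieces$housing * pieces$weight
holc_housing <- aggregate(housing_alloc ~ holc_id, data = pieces, FUN = sum)
print(holc_housing)
```

Because each block group's weights sum to one here, the polygon totals add back up to the original block group total. On real geometries, a function such as `st_interpolate_aw()` in the sf package performs this kind of weighting; whether either study used it is not stated in this report.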
This reproduction uses the following R packages:

```r
# record all the packages you are using here
# this includes any calls to library(), require(),
# and double colons such as here::i_am()
packages <- c("tidyverse", "here", "tidycensus", "margins", "sf", "tigris", "rstatix", "MASS", "knitr", "kableExtra", "ggcorrplot")
```
This study used only two data sources: American Community Survey (ACS) demographic data and maps of neighborhoods’ historical HOLC mortgage grades, published and maintained in the University of Richmond’s Mapping Inequality Database.

- Title: American Community Survey (ACS)
- Abstract: The ACS is a nationwide survey that collects and produces information on social, economic, housing, and demographic characteristics about our nation’s population every year
- Spatial Coverage: United States of America
- Spatial Resolution: ACS data is aggregated by census block group.
- Spatial Representation Type: vector, MULTIPOLYGON
- Spatial Reference System: NAD83
- Temporal Coverage: ACS data from 2016-2020.
- Temporal Resolution: Data is collected annually.
- Lineage: Downloaded and used as-is.
- Distribution: Data is publicly available from the US Census ([https://www.census.gov/programs-surveys/acs/data.html](https://www.census.gov/programs-surveys/acs/data.html))
- Constraints: Publicly available data.
- Data Quality: N/A

See acs_values.csv at /data/raw/public/acs_values.csv

## # A tibble: 30 × 3
## name label concept
## <chr> <chr> <chr>
## 1 B01003_001 Estimate!!Total TOTAL POPULATION
## 2 B02008_001 Estimate!!Total: WHITE ALONE OR IN COMBINATIO…
## 3 B02009_001 Estimate!!Total: BLACK OR AFRICAN AMERICAN AL…
## 4 B02010_001 Estimate!!Total: AMERICAN INDIAN AND ALASKA N…
## 5 B02011_001 Estimate!!Total: ASIAN ALONE OR IN COMBINATIO…
## 6 B03003_001 Estimate!!Total: HISPANIC OR LATINO ORIGIN
## 7 B03003_003 Estimate!!Total:!!Hispanic or Latino HISPANIC OR LATINO ORIGIN
## 8 B05012_001 Estimate!!Total: NATIVITY IN THE UNITED STATES
## 9 B05012_003 Estimate!!Total:!!Foreign-Born NATIVITY IN THE UNITED STATES
## 10 B25003_001 Estimate!!Total: TENURE
## # ℹ 20 more rows
Unplanned Deviation: The analysis calls for block group-level data for the foreign-born population. However, the ACS datasets have no information about place of birth at this level. Instead, the foreign-born population percentage is gathered at the tract level.

- Title: HOLC grades from the Mapping Inequality Database (MID)
- Abstract: HOLC grades are historical neighborhood delineations created by the Home Owners’ Loan Corporation in the 1930s to assess mortgage lending risk. These maps were used to guide investment and disinvestment in urban areas and are a foundational dataset for understanding redlining and its long-term impacts.
- Spatial Coverage: Select major U.S. cities
- Spatial Resolution: Neighborhood-scale boundaries (city block to district level, variable by city)
- Spatial Representation Type: vector, MULTIPOLYGON
- Spatial Reference System: NAD83
- Temporal Coverage: 1935–1940 (approximate dates of HOLC map creation)
- Temporal Resolution: Single mapping effort.
- Lineage: Downloaded and used as-is.
- Distribution: Data is available on the Mapping Inequality website ([https://dsl.richmond.edu/panorama/redlining/data](https://dsl.richmond.edu/panorama/redlining/data))
- Constraints: Publicly available data under a CC-by-NC 2.5 license
- Data Quality: Accurate to the degree that the original digitization of the 1930s HOLC maps was accurate.

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
city | City Name | City in which the HOLC designated area resides | character | N/A | Names of US cities | none | none |
state | State Abbreviation | USPS state designator | character | N/A | The 50 states | N/A | none |
category | Desirability | Assessment of investment desirability based on HOLC designation | character | subjective/historical | a range of designations | none | none |
grade | Grade | Historical HOLC grade | character | subjective/historical | A - D | none | none |
label | Shape Label | HOLC grade and arbitrary number (A1, C3) to divide same-grade areas from the same city | character | N/A | A-D, number of same-grade zones in the same city | none | none |
residential | Is it residential? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
commercial | Is it commercial? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
industrial | Is it industrial? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
Unplanned Deviation: Not all HOLC polygons have valid geometry, so invalid geometries were fixed by splitting multipart polygons into single parts before performing the intersection.
## Warning in st_cast.sf(holc_polygons_bugged, "POLYGON"): repeating attributes
## for all sub-geometries for which they may not be constant
To save data, the tract and census block data are first intersected with these HOLC polygons.
## `summarise()` has grouped output by 'area_id', 'city', 'state', 'city_survey',
## 'category', 'grade', 'label', 'residential', 'commercial', 'industrial'. You
## can override using the `.groups` argument.
## Reading layer `acs_censusblock' from data source
## `/Users/lucas/Documents/GitHub.nosync/RPr-Sterling-2023/data/raw/public/acs_censusblock.gpkg'
## using driver `GPKG'
## Simple feature collection with 49449 features and 72 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -122.8334 ymin: 25.69572 xmax: -69.56878 ymax: 48.28347
## Geodetic CRS: NAD83
## Reading layer `acs_tract' from data source
## `/Users/lucas/Documents/GitHub.nosync/RPr-Sterling-2023/data/raw/public/acs_tract.gpkg'
## using driver `GPKG'
## Simple feature collection with 18473 features and 17 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -122.8676 ymin: 25.69572 xmax: -69.49354 ymax: 48.28347
## Geodetic CRS: NAD83
There are no prior observations related to this data.
Three main sources of bias in this study come from the HOLC data itself. First, the spatial extent of the HOLC data only covers the centers of major cities across the United States, limiting the analysis to center cities. Second, ‘areal interpolation’ was used to determine census data values for each HOLC neighborhood, a method that does not seem logical or advisable given the study context, although some evidence suggests that this was actually the more logical area-weighted reaggregation. Third, choosing individual HOLC polygons as the study’s resolution increases the threat to validity through the Modifiable Areal Unit Problem. Many of the polygons represent outdated neighborhood lines and are limited to the areas that were populated at that time. This affects the data used to represent each neighborhood today, especially considering that area-weighted reaggregation is being used.
First, remap ACS values from block groups to HOLC polygons through what the original study calls areal interpolation, which is likely area-weighted reaggregation.
## Reading layer `acs_censusblock_intersected' from data source
## `/Users/lucas/Documents/GitHub.nosync/RPr-Sterling-2023/data/raw/public/acs_censusblock_intersected.gpkg'
## using driver `GPKG'
## Simple feature collection with 102515 features and 87 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -122.7675 ymin: 25.70537 xmax: -69.60044 ymax: 48.2473
## Geodetic CRS: NAD83
After performing AWR on the HOLC polygons, we are left with noticeably fewer geometries than we started with (10154 originally vs. 9614 after re-aggregation). This comes into play later in the analysis.
Using census statistics from the ACS dataset, the calculated variables are: Black Population %, White Population %, Indigenous American %, Asian Population %, Hispanic or Latino Population %, % Below Poverty Line, % Housing Units Owned, % Housing Units Rented, % Mobile Homes, % Homes before 1980, % Homes after 1980, % With Complete Plumbing, % With Incomplete Plumbing, and % Foreign Born. This data is summarized by region as well as nationally. Using regional and national averages for plumbing rates, a binary variable is calculated indicating whether plumbing is below or above the average.
```r
# Assign a binary variable for plumbing access: 1 if the incomplete plumbing
# percentage is higher than the national average, 0 if it is lower.
HOLC_polygons <- HOLC_polygons |>
  mutate(
    water_access = case_when(
      `ic_plumbE` > 2.61668 ~ 1,
      TRUE ~ 0
    )
  )

# Derive percentage variables from the ACS estimate columns.
data_clean <- HOLC_polygons |>
  mutate(
    holc_id = area_id,
    holc_grade = grade,
    black_pct = (black_popE / total_pop_raceE) * 100,
    white_pct = (white_popE / total_pop_raceE) * 100,
    indig_pct = (ind_popE / total_pop_raceE) * 100,
    asian_pct = (asian_popE / total_pop_raceE) * 100,
    hispan_Latino_pct = (hisp_popE / total_pop_hispE) * 100,
    pct_below_poverty = (bp_popE / total_pop_plE) * 100,
    pct_home_ownership = (total_ownE / total_hu_tenE) * 100,
    pct_renters = (total_rentE / total_hu_tenE) * 100,
    pct_mobile_homes = (total_mhE / total_hu_mhE) * 100,
    pct_pre_1980 = ((age_1939E + age_1940E + age_1950E + age_1960E + age_1970E) / total_hu_ageE) * 100,
    pct_post_1980 = ((age_1980E + age_1990E + age_2000E + age_2010E + age_2014E) / total_hu_ageE) * 100,
    pct_complete_plumb = (c_plumbE / total_hu_plumbE) * 100,
    pct_incomplete_plumb = (ic_plumbE / total_hu_plumbE) * 100,
    water_access = water_access
  )
```
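
The code above compares `ic_plumbE` against a hard-coded value of 2.61668. As a hedged alternative, the sketch below derives the benchmark from a percentage column instead. It runs on invented numbers, the data frame `polys` is hypothetical, and it assumes the benchmark is the mean of the polygon percentages, which may not be how the original national average was defined.

```r
# Hypothetical polygon-level counts (not the study data).
polys <- data.frame(
  area_id         = 1:5,
  ic_plumbE       = c(0, 2, 5, 1, 12),        # units lacking complete plumbing
  total_hu_plumbE = c(100, 200, 100, 50, 150) # all housing units
)

# Convert counts to a percentage before comparing against a percentage benchmark.
polys$pct_incomplete_plumb <- polys$ic_plumbE / polys$total_hu_plumbE * 100

# Assumption: the benchmark is the mean of the polygon percentages.
benchmark <- mean(polys$pct_incomplete_plumb, na.rm = TRUE)

# 1 = incomplete plumbing above the benchmark, 0 = at or below it.
polys$water_access <- as.integer(polys$pct_incomplete_plumb > benchmark)
print(polys)
```

Computing the threshold from the data keeps the units of the comparison explicit and avoids a constant that silently goes stale if the polygon set changes.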
Data will now be verified using the supplementary tables by subtracting the national summary statistics table from the calculated table, which should result in zeros across the board.
This is partly a test of whether the authors used area-weighted reaggregation or areal interpolation: if they used AWR, our results should more or less match, aside from the differing numbers of HOLC polygons remaining in the analysis. Additionally, because the original reaggregation was done in a Python package, this tests how closely our implementation agrees with it.
The total number of observations in the summary statistics tables is 8878, which is 736 fewer than in our AWR results. While we lost some polygons through the reaggregation, the original authors, using a Python package, lost many more than we did. This is likely the cause of much of the potential difference between the datasets.
black_pct_diff | white_pct_diff | indig_pct_diff | asian_pct_diff | hispan_Latino_pct_diff | pct_below_poverty_diff | pct_home_ownership_diff | pct_renters_diff | pct_mobile_homes_diff | pct_pre_1980_diff | pct_post_1980_diff | pct_complete_plumb_diff | pct_incomplete_plumb_diff |
---|---|---|---|---|---|---|---|---|---|---|---|---|
8.860255 | 10.38488 | 1.137381 | 3.257141 | 5.387422 | 6.488319 | 13.09047 | 13.18406 | 0.6540667 | 7.271628 | 7.204174 | 2.144397 | 1.780241 |
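
The subtraction check described above can be sketched as follows. The numbers are invented placeholders rather than values from either study, and the tolerance of 0.05 is an arbitrary assumption.

```r
# Hypothetical summary statistics; the real values would come from the
# reproduction and from the original supplementary tables.
reproduced <- c(black_pct = 20.5, white_pct = 60.2, pct_renters = 48.0)
original   <- c(black_pct = 20.5, white_pct = 59.9, pct_renters = 48.0)

# Align by name before subtracting so column order cannot cause a mismatch.
diffs <- reproduced[names(original)] - original

# Flag anything that differs by more than a small tolerance (assumed value).
tolerance <- 0.05
mismatched <- names(diffs)[abs(diffs) > tolerance]

print(round(diffs, 3))
print(mismatched)
```

A perfect reproduction would leave `mismatched` empty; any names it contains point to the variables worth investigating first.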
Then, use the Dunn test to determine whether incomplete plumbing % in Grade A is significantly different from the other grades, and compare the results with supplementary table 1.
Z | P.unadj | P.adj |
---|---|---|
8.341575 | -0.0003491 | -0.0020289 |
25.456331 | 0.0000000 | 0.0000000 |
22.150445 | 0.0000000 | 0.0000000 |
38.116993 | 0.0000000 | 0.0000000 |
37.871252 | 0.0000000 | 0.0000000 |
19.708821 | 0.0000000 | 0.0000000 |
The Dunn test results differ noticeably: the Z statistic is consistently negative in the original results and consistently positive in ours, at almost the same magnitude. All of the pairwise differences are still statistically significant, which matches the findings of the original authors.
Sterling III’s original study used a binary logistic regression model to compare categorical HOLC grades with a census block group’s binary designation of whether plumbing rates were above or below average.
Using the stepAIC function within the MASS package and HOLC Grade A as the reference group, we can reduce multicollinearity for the final regression analysis. In this case, highly correlated variable pairs are removed to match the original methodology. A visual of the correlation table is also available.
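
One possible reading of the sign flip is that the two analyses subtract the groups in opposite order. Dunn's z statistic is a scaled difference in mean ranks, so swapping the pair reverses its sign without changing the p-value. The sketch below hand-codes the tie-corrected statistic in base R on invented data. The function `dunn_z` and the data frame `toy` are hypothetical, and whether group ordering actually explains the discrepancy with the original study is an assumption, not a finding.

```r
# Hand-coded Dunn z statistic for one pair of groups (tie-corrected).
dunn_z <- function(values, groups, first, second) {
  n <- length(values)
  ranks <- rank(values)                         # ranks over the pooled sample
  by_group <- split(ranks, groups)
  mean_rank <- vapply(by_group, mean, numeric(1))
  group_n <- lengths(by_group)
  ties <- as.numeric(table(values))             # size of each tied block
  tie_term <- sum(ties^3 - ties) / (12 * (n - 1))
  se <- sqrt((n * (n + 1) / 12 - tie_term) *
               (1 / group_n[[first]] + 1 / group_n[[second]]))
  (mean_rank[[first]] - mean_rank[[second]]) / se
}

# Invented percentages for three grades (not the study data).
toy <- data.frame(
  grade = rep(c("A", "C", "D"), each = 3),
  pct_incomplete = c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5)
)

z_a_minus_d <- dunn_z(toy$pct_incomplete, toy$grade, "A", "D")
z_d_minus_a <- dunn_z(toy$pct_incomplete, toy$grade, "D", "A")

# Two-sided p-values depend only on the magnitude of z.
p_a_minus_d <- 2 * pnorm(-abs(z_a_minus_d))
p_d_minus_a <- 2 * pnorm(-abs(z_d_minus_a))

print(c(z_a_minus_d = z_a_minus_d, z_d_minus_a = z_d_minus_a))
print(c(p_a_minus_d = p_a_minus_d, p_d_minus_a = p_d_minus_a))
```

If this is the cause, the reproduction and the original would agree in substance and differ only in the direction in which each pairwise comparison is reported.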
| | black_pct | white_pct | indig_pct | asian_pct | hispan_Latino_pct | pct_below_poverty | pct_renters | pct_mobile_homes | pct_post_1980 | pct_incomplete_plumb | holc_gradeA | holc_gradeB | holc_gradeC | holc_gradeD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
black_pct | 1.0000000 | -0.8698584 | -0.0426595 | -0.2700047 | -0.1852961 | 0.4280641 | 0.2735503 | -0.0134052 | -0.0123755 | 0.4514905 | -0.1314971 | -0.1147203 | 0.0490533 | 0.1554326 |
white_pct | -0.8698584 | 1.0000000 | 0.0007625 | -0.0871363 | -0.0961478 | -0.3741649 | -0.3558496 | 0.0284169 | -0.0263567 | -0.3674112 | 0.1672536 | 0.1275965 | -0.0853035 | -0.1754799 |
indig_pct | -0.0426595 | 0.0007625 | 1.0000000 | -0.0379227 | 0.0958943 | 0.1150724 | 0.1062556 | 0.0744332 | 0.0173740 | 0.0312798 | -0.0452279 | -0.0627824 | -0.0112927 | 0.0283662 |
asian_pct | -0.2700047 | -0.0871363 | -0.0379227 | 1.0000000 | 0.0312383 | -0.1902106 | 0.0381656 | -0.0718153 | 0.1166294 | -0.1904061 | 0.0108093 | 0.0431247 | 0.0233518 | -0.0307498 |
hispan_Latino_pct | -0.1852961 | -0.0961478 | 0.0958943 | 0.0312383 | 1.0000000 | 0.0488876 | 0.2275354 | 0.0287248 | -0.0055973 | -0.0861892 | -0.1181908 | -0.0641764 | 0.0760293 | 0.1038673 |
pct_below_poverty | 0.4280641 | -0.3741649 | 0.1150724 | -0.1902106 | 0.0488876 | 1.0000000 | 0.5889550 | 0.0598075 | 0.0820857 | 0.3691761 | -0.1990502 | -0.1826611 | 0.0271861 | 0.2220992 |
pct_renters | 0.2735503 | -0.3558496 | 0.1062556 | 0.0381656 | 0.2275354 | 0.5889550 | 1.0000000 | -0.0250379 | 0.2541901 | 0.1306942 | -0.2698780 | -0.1901436 | 0.0738958 | 0.2218243 |
pct_mobile_homes | -0.0134052 | 0.0284169 | 0.0744332 | -0.0718153 | 0.0287248 | 0.0598075 | -0.0250379 | 1.0000000 | 0.1079025 | 0.0659484 | -0.0509174 | -0.0633809 | -0.0000382 | 0.0587096 |
pct_post_1980 | -0.0123755 | -0.0263567 | 0.0173740 | 0.1166294 | -0.0055973 | 0.0820857 | 0.2541901 | 0.1079025 | 1.0000000 | -0.1139085 | -0.0851896 | -0.1320052 | -0.0555425 | 0.1750454 |
pct_incomplete_plumb | 0.4514905 | -0.3674112 | 0.0312798 | -0.1904061 | -0.0861892 | 0.3691761 | 0.1306942 | 0.0659484 | -0.1139085 | 1.0000000 | -0.0979310 | -0.1041727 | 0.0330310 | 0.1355664 |
holc_gradeA | -0.1314971 | 0.1672536 | -0.0452279 | 0.0108093 | -0.1181908 | -0.1990502 | -0.2698780 | -0.0509174 | -0.0851896 | -0.0979310 | 1.0000000 | -0.1971853 | -0.2590029 | -0.1783255 |
holc_gradeB | -0.1147203 | 0.1275965 | -0.0627824 | 0.0431247 | -0.0641764 | -0.1826611 | -0.1901436 | -0.0633809 | -0.1320052 | -0.1041727 | -0.1971853 | 1.0000000 | -0.4296020 | -0.2957843 |
holc_gradeC | 0.0490533 | -0.0853035 | -0.0112927 | 0.0233518 | 0.0760293 | 0.0271861 | 0.0738958 | -0.0000382 | -0.0555425 | 0.0330310 | -0.2590029 | -0.4296020 | 1.0000000 | -0.3885127 |
holc_gradeD | 0.1554326 | -0.1754799 | 0.0283662 | -0.0307498 | 0.1038673 | 0.2220992 | 0.2218243 | 0.0587096 | 0.1750454 | 0.1355664 | -0.1783255 | -0.2957843 | -0.3885127 | 1.0000000 |
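
A correlation matrix like the one above can be screened for highly correlated pairs programmatically. The following is a minimal base-R sketch on simulated data. The data frame `toy` is hypothetical, and the cutoff of 0.8 is an assumption, since the threshold used in the original methodology is not stated here.

```r
set.seed(1)
n <- 200
black_pct <- runif(n, 0, 100)
toy <- data.frame(
  black_pct   = black_pct,
  white_pct   = 100 - black_pct + rnorm(n, sd = 5),  # built to mirror black_pct
  pct_renters = runif(n, 0, 100)                     # unrelated to the others
)

cors <- cor(toy)
cutoff <- 0.8  # assumed cutoff for "highly correlated"

# Look only above the diagonal so each pair is reported once.
high <- which(abs(cors) > cutoff & upper.tri(cors), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cors)[high[, "row"]],
  var2 = colnames(cors)[high[, "col"]],
  r    = cors[high]
)
print(high_pairs)
```

Listing the pairs explicitly makes it easier to document which variable of each pair was dropped and why.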
## Start: AIC=8733.5
## pct_incomplete_plumb ~ holc_grade + (holc_grade + black_pct +
## white_pct + indig_pct + asian_pct + hispan_Latino_pct + pct_below_poverty +
## pct_renters + pct_mobile_homes + pct_post_1980)
##
## Df Deviance AIC
## - pct_renters 1 8695.5 8731.5
## - indig_pct 1 8696.1 8732.1
## <none> 8695.5 8733.5
## - white_pct 1 8704.9 8740.9
## - hispan_Latino_pct 1 8732.0 8768.0
## - pct_mobile_homes 1 8732.7 8768.7
## - holc_grade 9 8766.5 8786.5
## - black_pct 1 8781.6 8817.6
## - pct_post_1980 1 8781.9 8817.9
## - asian_pct 1 8857.7 8893.7
## - pct_below_poverty 1 9098.7 9134.7
##
## Step: AIC=8731.52
## pct_incomplete_plumb ~ holc_grade + black_pct + white_pct + indig_pct +
## asian_pct + hispan_Latino_pct + pct_below_poverty + pct_mobile_homes +
## pct_post_1980
##
## Df Deviance AIC
## - indig_pct 1 8696.1 8730.1
## <none> 8695.5 8731.5
## + pct_renters 1 8695.5 8733.5
## - white_pct 1 8704.9 8738.9
## - hispan_Latino_pct 1 8732.7 8766.7
## - pct_mobile_homes 1 8732.9 8766.9
## - holc_grade 9 8767.3 8785.3
## - black_pct 1 8781.8 8815.8
## - pct_post_1980 1 8783.7 8817.7
## - asian_pct 1 8860.2 8894.2
## - pct_below_poverty 1 9149.9 9183.9
##
## Step: AIC=8730.08
## pct_incomplete_plumb ~ holc_grade + black_pct + white_pct + asian_pct +
## hispan_Latino_pct + pct_below_poverty + pct_mobile_homes +
## pct_post_1980
##
## Df Deviance AIC
## <none> 8696.1 8730.1
## + indig_pct 1 8695.5 8731.5
## + pct_renters 1 8696.1 8732.1
## - white_pct 1 8705.4 8737.4
## - hispan_Latino_pct 1 8732.7 8764.7
## - pct_mobile_homes 1 8734.1 8766.1
## - holc_grade 9 8768.4 8784.4
## - black_pct 1 8782.2 8814.2
## - pct_post_1980 1 8784.6 8816.6
## - asian_pct 1 8860.2 8892.2
## - pct_below_poverty 1 9155.8 9187.8
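
For readers unfamiliar with the trace above, the following sketch shows the general shape of a stepwise AIC selection on a binary logistic model with grade A as the reference level. It uses simulated data, and the variable `noise_pct` and the coefficients are invented for illustration. Note that `stepAIC()` selects on AIC alone and does not test for collinearity directly, so it complements a correlation screen rather than replacing it.

```r
library(MASS)  # provides stepAIC()

set.seed(42)
n <- 500
toy <- data.frame(
  holc_grade        = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)),
  pct_below_poverty = runif(n, 0, 60),
  noise_pct         = runif(n, 0, 100)  # unrelated predictor (hypothetical)
)

# Make grade A the reference level for the grade coefficients.
toy$holc_grade <- relevel(toy$holc_grade, ref = "A")

# Simulate a binary outcome that depends on poverty and on grade D.
linpred <- -2 + 0.05 * toy$pct_below_poverty + 0.8 * (toy$holc_grade == "D")
toy$water_access <- rbinom(n, 1, plogis(linpred))

# Full binary logistic model, then stepwise selection in both directions.
full <- glm(water_access ~ holc_grade + pct_below_poverty + noise_pct,
            data = toy, family = binomial)
selected <- stepAIC(full, direction = "both", trace = FALSE)

print(formula(selected))
print(c(full = AIC(full), selected = AIC(selected)))
```

The selected model never has a higher AIC than the starting model, which is a quick sanity check when comparing a reproduction's selection path with a published one.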
Sterling calculates Average Marginal Effects (AME) using the margins package for the census-derived variables in the binary model. Here, the results of the reproduction’s model are compared with those of the original model.
Unplanned Deviation: The average marginal effects results include some factors that are not meant to be used, such as HOLC grades of A, E, and F, and other variables whose names seem to end with a space. Additionally, reducing multicollinearity appeared to remove the indigenous population variable rather than the white population variable.
factor | AME | original_AME | difference |
---|---|---|---|
asian_pct | -0.1392510 | NA | NA |
black_pct | 0.1191453 | NA | NA |
hispan_Latino_pct | -0.0563915 | NA | NA |
holc_gradeB | 0.0022993 | NA | NA |
holc_gradeC | 0.0507086 | NA | NA |
holc_gradeD | 0.1022529 | NA | NA |
pct_below_poverty | 0.1887591 | NA | NA |
pct_mobile_homes | 0.0629468 | NA | NA |
pct_post_1980 | -0.0854183 | NA | NA |
Here the difference is very noticeable among the race/ethnicity variables and the poverty line variable, but among all other factors the difference remains very small. Notably, the AME values are very similar across HOLC grades.
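
As background for the comparison above, the average marginal effect of a continuous predictor in a logistic model can be computed by hand: the derivative of the predicted probability with respect to predictor j is the coefficient times p(1 - p), and the AME averages that over all observations. The sketch below uses simulated data with hypothetical predictors `x1` and `x2` and checks the analytic value against a finite difference. It is an illustration of the quantity, not a description of how the margins package computes it internally.

```r
set.seed(7)
n <- 1000
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(-1 + 0.8 * toy$x1 - 0.5 * toy$x2))

fit <- glm(y ~ x1 + x2, data = toy, family = binomial)
p_hat <- fitted(fit)

# Analytic AME for continuous predictors in a logit model:
# coefficient times the mean of p * (1 - p).
ame <- coef(fit)[c("x1", "x2")] * mean(p_hat * (1 - p_hat))

# Numerical check for x1: nudge x1 slightly and average the change in probability.
eps <- 1e-5
shifted <- transform(toy, x1 = x1 + eps)
ame_x1_numeric <- mean(
  (predict(fit, newdata = shifted, type = "response") - p_hat) / eps
)

print(ame)
print(ame_x1_numeric)
```

For factor predictors such as HOLC grade, the analogous quantity is the average difference in predicted probability between a level and the reference level, which this sketch does not cover.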
Data is presented as 3 figures, 2 tables, and 7 supplementary tables. We’ll reproduce each figure and table and note differences between our reproduction and the figures from the original manuscript.
This map differs somewhat from Sterling’s map, revealing one potential reason for the differing values in our AME and other comparisons. A different set of HOLC polygons is mapped in this version than in Sterling’s results: despite ending up with more polygons in our analysis overall, we seem to be missing several of the HOLC zones in this map.
We will map another city to test whether this is a widespread phenomenon.
The map of Buffalo is much more intact, with only one or two sections across the city missing. The source of this loss is unclear: it could have to do with our multipart-to-single-part transformation, the removal of broken geometries (of which we had more in R than in ArcGIS), or somewhere in our AWR step. Given that this affected multiple cities, it is likely a widespread issue.
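
One way to narrow down where polygons are lost is to compare identifiers before and after each processing step and tabulate the losses by city. The sketch below shows the idea on invented identifiers. The objects `original_polys` and `kept_ids` and the city labels are hypothetical stand-ins for the `area_id` and `city` columns of the HOLC data.

```r
# Hypothetical polygon list before processing.
original_polys <- data.frame(
  area_id = 1:8,
  city    = c(rep("city_x", 4), rep("city_y", 4))
)

# Hypothetical identifiers that survive a processing step.
kept_ids <- c(1L, 2L, 3L, 4L, 5L, 7L)

# Polygons present before the step but missing afterwards.
lost <- original_polys[!original_polys$area_id %in% kept_ids, ]
lost_by_city <- table(lost$city)

print(lost)
print(lost_by_city)
```

Running the same comparison after the geometry repair, the intersection, and the reaggregation separately would show which step drops the polygons.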
This finds the same results as the original study, with the only two extremely correlated variables being the white and black population percentages.
Aside from the missing HOLC polygons, the incomplete plumbing percentages appear similar to those in the figures of the original manuscript.
factor | AME |
---|---|
asian_pct | -0.1392510 |
black_pct | 0.1191453 |
hispan_Latino_pct | -0.0563915 |
holc_gradeB | 0.0022993 |
holc_gradeC | 0.0507086 |
holc_gradeD | 0.1022529 |
pct_below_poverty | 0.1887591 |
pct_mobile_homes | 0.0629468 |
pct_post_1980 | -0.0854183 |
HOLC Grade | Number of Polygons | Original | Difference |
---|---|---|---|
A | 1020 | 1040 | -20 |
B | 2367 | 2332 | 35 |
C | 3465 | 3381 | 84 |
D | 2037 | 2118 | -81 |
The number of polygons per grade differs between the two datasets by between 20 and 84 polygons.
Sterling III hypothesizes that areas of present-day incomplete plumbing within U.S. cities (i.e., communities with a proportion of homes lacking complete plumbing above the national average) are significantly associated with HOLC neighborhood designations. The study finds evidence for this hypothesis, with lower HOLC grades correlated with higher levels of incomplete household plumbing. That said, despite following the original study’s listed methods, we found numerically different results. The most likely cause is the difference in initial HOLC geometries between our study and the original, as seen in the first figure of our results section. We went to great lengths to repair the HOLC geometries in R, but ended up with a different subset of the polygons remaining. Geometry issues compound, as census data aggregated into incorrect polygons will result in incorrect summary statistics. This seems to be an R/sf issue: when loading the geometries into either ArcGIS or QGIS, only 2 geometries need repairing. While we are not sure exactly what caused this issue, it is worth keeping in mind for future reproductions.
The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.
This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:[10.17605/OSF.IO/W29MQ](https://doi.org/10.17605/OSF.IO/W29MQ)
Sterling III, Charles W., et al. “Connections between present-day water access and historical redlining.” Environmental Justice (2023). DOI:[10.1089/env.2022.0115](https://doi.org/10.1089/env.2022.0115)
Shiloh Deitz and Katie Meehan. “Plumbing Poverty: Mapping Hot Spots of Racial and Geographic Inequality in U.S. Household Water Insecurity.” Annals of the American Association of Geographers 109 (2019): 1092–1109.
Tyler Fricker and Douglas L. Allen. “A Place-Based Analysis of Tornado Activity and Casualties in Shreveport, Louisiana.” Natural Hazards 113 (2022): 1853–1874.