This is a study of gerrymandering in Alabama. We will test three shape-based compactness scores and assess the representativeness of districts based on prior presidential election results and race. We will then extend prior studies by calculating the representativeness of the convex hulls of district polygons.
- Key words: Gerrymandering, Compactness, Convex Hull, Political Representation
- Subject: Social and Behavioral Sciences: Geography: Geographic Information Sciences
- Date created: 2025-02-17
- Date modified: 2025-02-17
- Spatial Coverage: Alabama OSM:161950
- Spatial Resolution: Census Block Groups
- Spatial Reference System: EPSG:4269 NAD 1983 Geographic Coordinate System
- Temporal Coverage: 2020-2024 population and voting data
- Temporal Resolution: Decennial census

This is an original study based on literature on gerrymandering metrics.
It is an exploratory study to evaluate the usefulness of a new gerrymandering metric that compares representativeness inside the convex hull of a congressional district with representativeness inside the district itself.
I plan on using:
- the tidyverse package for general dataset processing
- the here package for file locations
- the sf package for spatial processing
- the tmap package for displaying data
- the tidycensus package for gathering census data
- the lwgeom package for minimum bounding circles
- the knitr package for better table displays
- the patchwork package allows for multiple graphs in one display
- the cowplot package allows for extraction of plot legends
We plan on using three data sources: precincts20, districts23, and blockgroups2020.
- Title: Voting Precincts 2020
- Abstract: Alabama voting data for 2020 elections by precinct.
- Spatial Coverage: Alabama
- Spatial Resolution: Voting precincts
- Spatial Reference System: EPSG 4269 NAD 1983 Geographic Coordinate System
- Temporal Coverage: voting precincts used for tabulating the 2020 election
- Temporal Resolution: annual election
- Lineage: Saved as geopackage format. Processing prior to download is explained in al_vest_20_validation_report.pdf
- Distribution: Data available at Redistricting Data Hub
- Constraints: Permitted for noncommercial and nonpartisan use only. Copyright and use constraints explained in redistrictingdatahub_legal.txt
- Data Quality: State any planned quality assessment
- Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
  - Label: variable name as used in the data or code
  - Alias: intuitive natural language name
  - Definition: Short description or definition of the variable. Include measurement units in description.
  - Type: data type, e.g. character string, integer, real
  - Accuracy: e.g. uncertainty of measurements
  - Domain: Expected range of Maximum and Minimum of numerical data, or codes or categories of nominal data, or reference to a standard codebook
  - Missing Data Value(s): Values used to represent missing data and frequency of missing data observations
  - Missing Data Frequency: Frequency of missing data observations: not yet known for data to be collected

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
VTDST20 | … | Voting district ID | … | … | … | … | … |
GEOID20 | … | Unique Geographic ID | … | … | … | … | … |
G20PRERTRU | … | total votes for Trump in 2020 | … | … | … | … | … |
G20PREDBID | … | total votes for Biden in 2020 | … | … | … | … | … |
precincts20 <- st_read(here("data","raw","public","districts.gpkg"), layer = "precincts20")
## Reading layer `precincts20' from data source
## `/Users/jorredahl/Documents/GitHub/OR-Gerrymander-Alabama/data/raw/public/districts.gpkg'
## using driver `GPKG'
## Simple feature collection with 1972 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.14442 xmax: -84.88825 ymax: 35.00803
## Geodetic CRS: NAD83
- Title: Voting Districts 2024
- Abstract: Alabama voting districts approved in 2023 for use in 2024
- Spatial Coverage: Alabama
- Spatial Resolution: US Congressional Districts
- Spatial Reference System: EPSG 3857 WGS 1984 Web Mercator projection
- Temporal Coverage: approved in 2023 for 2024 use
- Temporal Resolution: districting based on 2020 census data (updated every 10 years)
- Lineage: Loaded into QGIS as ArcGIS feature service layer and saved in geopackage format. Extraneous data fields were removed and the FIX GEOMETRIES tool was used to correct geometry errors.
- Distribution: Alabama State GIS via ESRI feature service at https://services7.arcgis.com/jF2q3LPxL7PETdYk/arcgis/rest/services/2023_Court_Ordered_Congressional_Plan/FeatureServer
- Constraints: Public Domain data free for use and redistribution.
- Data Quality: State any planned quality assessment
- Variables: see the table below.

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
DISTRICT | … | US Congressional District Number | … | … | … | … | … |
POPULATION | … | total population (2020 census) | … | … | … | … | … |
WHITE | … | total white population (2020 census) | … | … | … | … | … |
BLACK | … | total black population (2020 census) | … | … | … | … | … |
districts23 <- st_read(here("data","raw","public","districts.gpkg"), layer = "districts23")
## Reading layer `districts23' from data source
## `/Users/jorredahl/Documents/GitHub/OR-Gerrymander-Alabama/data/raw/public/districts.gpkg'
## using driver `GPKG'
## Simple feature collection with 7 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.14443 xmax: -84.88825 ymax: 35.00803
## Geodetic CRS: NAD83
- Title: 2020 Census block groups
- Abstract: Vector polygon layer of census block groups and their demographic data
- Spatial Coverage: Alabama
- Spatial Resolution: Census block groups
- Spatial Reference System: EPSG 4269 NAD 1983 Geographic Coordinate System
- Temporal Coverage: 2020 census
- Temporal Resolution: 10-year census
- Lineage: Downloaded data from US Census API “p1” public law summary file using tidycensus in R.
- Distribution: US Census API
- Constraints: Public Domain data free for use and redistribution.
- Data Quality: State any planned quality assessment
- Variables: see the table below.

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
GEOID | … | Code to uniquely identify block groups | … | … | … | … | … |
P4_001N | … | Total Population, 18 years or older | … | … | … | … | … |
P4_006N | … | Total: Not Hispanic or Latino, Population of one race, Black or African American alone, 18 years or older | … | … | … | … | … |
P5_003N | … | Total institutionalized population in correctional facilities for adults 18 years or older | … | … | … | … | … |
census_metadata_file <- here("data", "metadata", "census2020pl_vars.csv")
if(file.exists(census_metadata_file)){
census2020pl_vars <- read.csv(census_metadata_file)
} else {
census2020pl_vars <- load_variables(2020, "pl")
write.csv(census2020pl_vars, here("data", "metadata", "census2020pl_vars.csv"))
}
blockgroup_file <- here("data", "raw", "public", "block_groups.gpkg")
# if the data is already downloaded, just load it
# otherwise, query from the census and save
if(file.exists(blockgroup_file)){
blockgroups <- st_read(blockgroup_file)
} else {
blockgroups <- get_decennial(geography = "block group",
sumfile = "pl",
table = "P3",
year = 2020,
state = "Alabama",
output = "wide",
geometry = TRUE,
keep_geo_vars = TRUE)
st_write(blockgroups, blockgroup_file)
}
## Reading layer `block_groups' from data source
## `/Users/jorredahl/Documents/GitHub/OR-Gerrymander-Alabama/data/raw/public/block_groups.gpkg'
## using driver `GPKG'
## Simple feature collection with 3925 features and 83 fields (with 1 geometry empty)
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.22333 xmax: -84.88908 ymax: 35.00803
## Geodetic CRS: NAD83
I have not looked at this data before.
Modifiable Areal Unit Problem: demographic data are collected at both the district and the block group level. This experiment tries to mitigate many of the edge-effect problems associated with studies of gerrymandering.
Find the sum of Black or African American people by gathering all variables that include the term “Black”.
black_vars <- census2020pl_vars |>
dplyr::filter(str_detect(name, "P3"),
str_detect(label, "Black")) |>
select(-concept)
black_vars |> kable()
X | name | label |
---|---|---|
151 | P3_004N | !!Total:!!Population of one race:!!Black or African American alone |
158 | P3_011N | !!Total:!!Population of two or more races:!!Population of two races:!!White; Black or African American |
163 | P3_016N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; American Indian and Alaska Native |
164 | P3_017N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Asian |
165 | P3_018N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Native Hawaiian and Other Pacific Islander |
166 | P3_019N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Some Other Race |
174 | P3_027N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; American Indian and Alaska Native |
175 | P3_028N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Asian |
176 | P3_029N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander |
177 | P3_030N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Some Other Race |
184 | P3_037N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Asian |
185 | P3_038N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander |
186 | P3_039N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Some Other Race |
187 | P3_040N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander |
188 | P3_041N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Some Other Race |
189 | P3_042N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race |
195 | P3_048N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Asian |
196 | P3_049N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander |
197 | P3_050N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Some Other Race |
198 | P3_051N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander |
199 | P3_052N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Some Other Race |
200 | P3_053N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race |
205 | P3_058N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander |
206 | P3_059N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Some Other Race |
207 | P3_060N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race |
208 | P3_061N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
211 | P3_064N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander |
212 | P3_065N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Some Other Race |
213 | P3_066N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race |
214 | P3_067N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
216 | P3_069N | !!Total:!!Population of two or more races:!!Population of five races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
218 | P3_071N | !!Total:!!Population of two or more races:!!Population of six races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
Next, calculate new columns.
- Black: sum of all columns for any combination of groups including Black
- Total: equal to P3_001N, total population 18 or over
- PctBlack: percentage of people listed in Black
- CheckPct: sum of the P3_003N (White alone) and Black percentages; this value should not exceed 100%.
blockgroups_calc <- blockgroups |>
rowwise() |>
mutate(Black = sum(c_across(all_of(black_vars$name)))) |>
ungroup() |>
mutate(Total = P3_001N,
PctBlack = Black / Total * 100,
CheckPct = (Black + P3_003N) / Total * 100) |>
select(GEOID, Black, Total, PctBlack, CheckPct)
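The column arithmetic above can be checked by hand on a tiny table. The following is a minimal sketch in base R, not the study's code: the two block groups and their counts are made up, only two of the "Black" columns are included, and `rowSums()` stands in for the `rowwise()` sum.

```r
# Hypothetical block groups with invented counts (not census data).
bg <- data.frame(
  GEOID   = c("bg1", "bg2"),
  P3_001N = c(100, 200),  # total population 18 or over
  P3_003N = c(60, 50),    # White alone
  P3_004N = c(30, 120),   # Black or African American alone
  P3_011N = c(5, 10)      # White; Black or African American
)

# Columns that include "Black" (a stand-in for black_vars$name).
black_cols <- c("P3_004N", "P3_011N")

# Sum across the selected columns for each row.
bg$Black    <- rowSums(bg[, black_cols])
bg$PctBlack <- bg$Black / bg$P3_001N * 100
# Sanity check: Black plus White alone should not exceed 100% of the total.
bg$CheckPct <- (bg$Black + bg$P3_003N) / bg$P3_001N * 100

print(bg[, c("GEOID", "Black", "PctBlack", "CheckPct")])
```

A CheckPct value above 100 would suggest that some people were counted in more than one of the summed columns.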
Save the results as blockgroups_calc.gpkg
st_write(blockgroups_calc,
here("data", "derived", "public", "blockgroups_calc.gpkg"),
append=FALSE)
## Deleting layer `blockgroups_calc' using driver `GPKG'
## Writing layer `blockgroups_calc' to data source
## `/Users/jorredahl/Documents/GitHub/OR-Gerrymander-Alabama/data/derived/public/blockgroups_calc.gpkg' using driver `GPKG'
## Writing 3925 features with 5 fields and geometry type Multi Polygon.
Calculate area and perimeter of districts
sf_use_s2(FALSE)
## Spherical geometry (s2) switched off
districts23 <- districts23 |>
mutate(
area = st_area(geom),
perim = st_length(st_cast(st_cast(geom, "MULTIPOLYGON"), "MULTILINESTRING"))
)
Area-weighted re-aggregation is used to estimate the percent Democratic vote in each district from the precincts20 layer, assuming votes are distributed uniformly within each voting precinct.
First, the area of each precinct is calculated. The precincts are then intersected with the districts, and each resulting fragment is assigned an area weight equal to its area divided by the area of its source precinct. The weight is multiplied by the precinct's Democratic vote count and total vote count, and I summarize the weighted counts by district to get the percent Democratic in each district.
Area-weighted re-aggregation introduces some bias because it assumes that the units being fragmented are spatially homogeneous. While this experiment acknowledges that this assumption does not strictly hold, re-aggregating from small units such as voting precincts or census block groups minimizes the bias.
sf_use_s2(FALSE)
precincts20 <- precincts20 |>
st_transform(crs = 4269) |>
mutate(
precinctarea = st_area(geom),
total_vote = G20PRERTRU + G20PREDBID + G20PRELJOR + G20PREOWRI
)
precincts_int_districts <- st_intersection(precincts20, districts23) |>
mutate(f_area = st_area(geom),
aw = as.numeric(f_area / precinctarea),
aw_dem = aw * G20PREDBID,
aw_total = aw * total_vote)
## although coordinates are longitude/latitude, st_intersection assumes that they
## are planar
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
districts_dem <- precincts_int_districts |>
group_by(DISTRICT) |>
summarize(
sumvote = sum(aw_total),
sumdem = sum(aw_dem)
) |>
mutate(
pct_dem = as.numeric(sumdem / sumvote)
)
## although coordinates are longitude/latitude, st_union assumes that they are
## planar
districts23 <- districts23 |>
mutate(
pct_dem = districts_dem$pct_dem[match(DISTRICT, districts_dem$DISTRICT)]
)
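The area-weighting arithmetic used above can be illustrated without any spatial data. This is a minimal sketch in base R under stated assumptions: the fragment table, its column names, and every number in it are invented for illustration, and the fragment and precinct areas are taken as already measured.

```r
# Hypothetical fragments from intersecting precincts with districts.
# Precinct "B" straddles districts 1 and 2; "A" and "C" each sit in one district.
fragments <- data.frame(
  precinct      = c("A", "B", "B", "C"),
  district      = c(1, 1, 2, 2),
  fragment_area = c(10, 6, 4, 8),    # area of each fragment
  precinct_area = c(10, 10, 10, 8),  # area of the whole source precinct
  dem_votes     = c(300, 500, 500, 60),
  total_votes   = c(1000, 1000, 1000, 400)
)

# Area weight: share of the source precinct that lies in the fragment.
fragments$aw <- fragments$fragment_area / fragments$precinct_area

# Allocate votes to fragments in proportion to area.
fragments$aw_dem   <- fragments$aw * fragments$dem_votes
fragments$aw_total <- fragments$aw * fragments$total_votes

# Re-aggregate the weighted counts by district.
by_district <- aggregate(cbind(aw_dem, aw_total) ~ district,
                         data = fragments, FUN = sum)
by_district$pct_dem <- by_district$aw_dem / by_district$aw_total
print(by_district)
```

Because the weights for each precinct sum to 1, the total number of votes is preserved by the re-aggregation; only their allocation between districts is estimated.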
Area-weighted re-aggregation is used to estimate the percent Black in each district from the blockgroups_calc layer, assuming population is distributed uniformly within each census block group.
First, the area of each block group is calculated. The block groups are then intersected with the districts, and each fragment is assigned an area weight, which is multiplied by the Black population and total population of its block group. I then summarize the weighted counts by district to get the percent Black in each district.
sf_use_s2(FALSE)
blockgroups_calc <- blockgroups_calc |>
st_transform(crs = 4269) |>
mutate(
blockarea = st_area(geom),
)
blocks_int_districts <- st_intersection(blockgroups_calc, districts23) |>
mutate(f_area = st_area(geom),
aw = as.numeric(f_area / blockarea),
aw_black = aw * Black,
aw_total = aw * Total)
## although coordinates are longitude/latitude, st_intersection assumes that they
## are planar
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
districts_black <- blocks_int_districts |>
group_by(DISTRICT) |>
summarize(
sumblack = sum(aw_black),
sumpop = sum(aw_total)
) |>
mutate(
pct_black = as.numeric(sumblack / sumpop)
)
## although coordinates are longitude/latitude, st_union assumes that they are
## planar
districts23 <- districts23 |>
mutate(
pct_black = districts_black$pct_black[match(DISTRICT, districts_black$DISTRICT)]
)
Find compactness using the Polsby-Popper isoperimetric ratio
districts23 <- districts23 |>
mutate(
ppir_compact = round(as.numeric((4 * pi * area) / perim^2),2)
)
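For intuition about the scale of this score, the ratio can be evaluated on shapes whose area and perimeter are known exactly. The helper name `polsby_popper` and the example shapes below are illustrative assumptions, not part of the study's code.

```r
# Polsby-Popper ratio: 4 * pi * area / perimeter^2.
# It equals 1 for a circle and falls toward 0 for elongated or jagged shapes.
polsby_popper <- function(area, perimeter) {
  4 * pi * area / perimeter^2
}

r <- 1
circle <- polsby_popper(area = pi * r^2, perimeter = 2 * pi * r)  # equals 1
square <- polsby_popper(area = 1, perimeter = 4)                  # equals pi / 4
# A long, thin 10 x 0.1 rectangle scores far lower than the square.
sliver <- polsby_popper(area = 10 * 0.1, perimeter = 2 * (10 + 0.1))

print(round(c(circle = circle, square = square, sliver = sliver), 3))
```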
Calculate the compactness of each district using the ratio of district area to convex hull area
sf_use_s2(FALSE)
districts23_hull <- districts23 |> st_convex_hull()
districts23_hull <- districts23_hull |>
mutate(
hullarea = st_area(geom)
)
districts23 <- districts23 |>
mutate(
hullarea = districts23_hull$hullarea[match(DISTRICT, districts23_hull$DISTRICT)],
ch_compact = round(as.numeric(area / hullarea),2)
)
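The same ratio can be reproduced on a small planar polygon with base R alone. This is a sketch under assumptions: the L-shaped "district" is hypothetical, `polygon_area` is a helper written for this example, and `chull()` from grDevices stands in for `st_convex_hull()`.

```r
# Shoelace formula for the area of a simple polygon given as an n x 2 vertex matrix.
polygon_area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  nxt <- c(2:nrow(xy), 1)  # index of the next vertex, wrapping around
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

# Hypothetical L-shaped district; the notch at (1, 1) makes it non-convex.
district <- rbind(c(0, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 2), c(0, 2))

# chull() returns the row indices of the points on the convex hull, in order.
hull <- district[chull(district), ]

# District area (3) divided by hull area (3.5).
ch_compact <- polygon_area(district) / polygon_area(hull)
print(ch_compact)
```

A convex district scores 1 on this measure; the further a boundary is indented, the lower the score.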
Calculate the compactness of each district using the ratio of district area to minimum bounding circle area (the Reock score)
sf_use_s2(FALSE)
districts23_mbcircle <- districts23 |> st_minimum_bounding_circle()
districts23_mbcircle <- districts23_mbcircle |>
mutate(
circlearea = st_area(geom)
)
districts23 <- districts23 |>
mutate(
circlearea = districts23_mbcircle$circlearea[match(DISTRICT, districts23_mbcircle$DISTRICT)],
mbc_compact = round(as.numeric(area / circlearea),2)
)
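This area-to-bounding-circle ratio, often called the Reock score, can be worked out exactly for rectangles, whose minimum bounding circle has a radius of half the diagonal. The helper `reock` and the two shapes are illustrative assumptions rather than the study's code.

```r
# Reock-style score: polygon area divided by the area of its minimum bounding circle.
reock <- function(area, circle_radius) {
  area / (pi * circle_radius^2)
}

# Unit square: the bounding circle passes through all four corners.
square_score <- reock(area = 1, circle_radius = sqrt(2) / 2)  # equals 2 / pi

# 4 x 1 rectangle: the radius is again half the diagonal.
rect_score <- reock(area = 4 * 1, circle_radius = sqrt(4^2 + 1^2) / 2)

print(round(c(square = square_score, rectangle = rect_score), 3))
```

The score penalizes elongation: stretching the square into a rectangle lowers the share of the bounding circle that the shape fills.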
Using the convex hulls of the districts, we can perform the same area-weighted re-aggregation of block groups to find the percent Black within each district's convex hull.
sf_use_s2(FALSE)
blocks_int_districtshull <- st_intersection(blockgroups_calc, districts23_hull) |>
mutate(f_area = st_area(geom),
# weight = share of each block group's area that falls inside the hull
aw = as.numeric(f_area / blockarea),
aw_black = aw * Black,
aw_total = aw * Total)
## although coordinates are longitude/latitude, st_intersection assumes that they
## are planar
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
districtshull_black <- blocks_int_districtshull |>
group_by(DISTRICT) |>
summarize(
sumblack = sum(aw_black),
sumpop = sum(aw_total)
) |>
mutate(
pct_black = as.numeric(sumblack / sumpop)
)
## although coordinates are longitude/latitude, st_union assumes that they are
## planar
districts23 <- districts23 |>
mutate(
hull_pct_black = districtshull_black$pct_black[match(DISTRICT, districtshull_black$DISTRICT)]
)
Now, I can calculate a metric for how unusually concentrated a district is with Black or non-Black voters through abs(pct_black - hull_pct_black).
districts23 <- districts23 |>
mutate(
concentration_val = abs(pct_black - hull_pct_black)
)
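A small worked example shows how this metric behaves. The two districts and their shares below are invented for illustration and are not study results; the signed difference is kept alongside the absolute value only to show what the absolute value discards.

```r
# Hypothetical shares of Black voting-age population (not study results).
toy <- data.frame(
  DISTRICT       = c("X", "Y"),
  pct_black      = c(0.50, 0.10),  # inside the district
  hull_pct_black = c(0.35, 0.12)   # inside the district's convex hull
)

# Positive: the district is more Black than its hull; negative: less.
toy$signed_diff <- toy$pct_black - toy$hull_pct_black
# The metric keeps only the magnitude of the difference.
toy$concentration_val <- abs(toy$signed_diff)

print(toy)
```

Because the absolute value drops the sign, the metric alone does not say whether Black residents were drawn into or out of a district; comparing pct_black with hull_pct_black recovers the direction.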
Here are the percentages of Black voting-age people by census block group, with district boundaries and labels overlaid.
districts_labels <- tm_shape(districts23) +
tm_polygons(fill_alpha = 0,
col = "red") +
tm_labels(text = "DISTRICT",
col="red",
bgcol = "white",
bgcol_alpha = 0.5,
on_surface = TRUE,
just = c("center", "center")
)
##
## ── tmap v3 code detected ───────────────────────────────────────────────────────
## [v3->v4] `tm_text()`: migrate the layer options 'just' to 'options =
## opt_tm_text(<HERE>)'
## [tm_text()] Argument `on_surface` unknown.
black_blocks <- tm_shape(blockgroups_calc) +
tm_polygons(
fill = "PctBlack",
col_alpha = 0.2,
lwd = 0.1,
col = "grey90"
)
black_blocks +
districts_labels
Here are the percentages of votes cast for the Democratic candidate in the 2020 presidential election by precinct, with district boundaries and labels overlaid.
precincts20 <- precincts20 |>
mutate(
pct_dem = G20PREDBID / total_vote
)
dem_precincts <- tm_shape(precincts20) +
tm_polygons(
fill = "pct_dem",
col_alpha = 0.2,
lwd = 0.1,
col = "grey90"
)
dem_precincts +
districts_labels
Here are the districts' scores for Black concentration. A higher number means that Black or non-Black people are more unusually concentrated in the district.
districts_representation <- tm_shape(districts23) +
tm_polygons(
fill = "concentration_val",
col_alpha = 0.2,
lwd = 0.1,
col = "grey90"
)
districts_representation +
districts_labels
Lastly, here is a table listing each district with its calculated metrics.
districts23 |>
st_drop_geometry() |>
select(DISTRICT, ppir_compact, ch_compact, mbc_compact, pct_dem,
pct_black, concentration_val) |>
kable()
DISTRICT | ppir_compact | ch_compact | mbc_compact | pct_dem | pct_black | concentration_val |
---|---|---|---|---|---|---|
1 | 0.15 | 0.59 | 0.19 | 0.2483071 | 0.1622268 | 0.0220864 |
2 | 0.14 | 0.60 | 0.20 | 0.5535744 | 0.4861677 | 0.1284007 |
3 | 0.35 | 0.81 | 0.47 | 0.2891686 | 0.2069872 | 0.0609139 |
4 | 0.20 | 0.61 | 0.32 | 0.1890682 | 0.0761442 | 0.0004853 |
5 | 0.40 | 0.91 | 0.32 | 0.3547979 | 0.1832956 | 0.0919157 |
6 | 0.20 | 0.71 | 0.46 | 0.3021344 | 0.1753046 | 0.0186869 |
7 | 0.21 | 0.71 | 0.47 | 0.6394241 | 0.5186634 | 0.1297934 |
Each of the three compactness scores ranks the districts differently. As a result, it is unclear which score best tracks the calculated concentration_val statistic for representation. Here, the compactness metrics are plotted against one another to look for correlation both among themselves and with representation.
districts23_results_plot1 <- districts23 |> ggplot() +
aes(x = ch_compact, y = ppir_compact) +
geom_smooth(method="lm", col = "grey30") +
geom_label(aes(label = DISTRICT, fill = pct_black)) +
scale_fill_distiller(type = "div", palette = "PRGn") +
theme(legend.position = "none")
districts23_results_plot2 <- districts23 |> ggplot() +
aes(x = mbc_compact, y = ch_compact) +
geom_smooth(method="lm", col = "grey30") +
geom_label(aes(label = DISTRICT, fill = pct_black)) +
scale_fill_distiller(type = "div", palette = "PRGn") +
theme(legend.position = "none")
districts23_results_plot3 <- districts23 |> ggplot() +
aes(x = ppir_compact, y = mbc_compact) +
geom_smooth(method="lm", col = "grey30") +
geom_label(aes(label = DISTRICT, fill = pct_black)) +
scale_fill_distiller(type = "div", palette = "PRGn") +
theme(legend.position = "none")
legend_plot <- districts23 |> ggplot(aes(x = ppir_compact, y = ch_compact, fill = concentration_val)) +
geom_point() +
scale_fill_distiller(type = "div", palette = "PRGn") +
theme_minimal() +
theme(legend.position = "right")
legend <- patchwork::wrap_elements(get_legend(legend_plot))
## Warning in get_plot_component(plot, "guide-box"): Multiple components found;
## returning the first one. To return all, use `return_all = TRUE`.
(districts23_results_plot1 + districts23_results_plot2) / (districts23_results_plot3 + legend)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
Here, you can see that districts 2 and 7, which have the highest percentages of Black people over the age of 18, also had the highest values of unusual concentration of Black people. In each, the Black population percentage inside the district differs by over 12 percentage points from that inside a district with the same shape as its convex hull.
Overall, the compactness metrics did not reflect this representation value well. While compactness might not be a great indicator of whether a district was unfairly drawn, statistics like these can illuminate the advantages for some groups and disadvantages for others that maps like these create.
I certify that I completed this preregistration to the best of my knowledge and that no other preregistration exists pertaining to the same hypotheses and research.
This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:[10.17605/OSF.IO/W29MQ](https://doi.org/10.17605/OSF.IO/W29MQ)