Code
# import necessary libraries to perform analysis
library(tidyverse)
library(sf)
library(stars)
library(terra)
library(dplyr)
library(spData)
library(spDataLarge)
library(tmap)
library(raster)
library(ggspatial)
library(cowplot)
December 1, 2023
February 2021, the state of Texas suffered a major energy crisis, which came about as a result of three severe winter storms sweeping across the United States on February 10–11, 13–17, and 15–20(POWER Committee, n.d.). The energy crisis that occurred resulted in a loss of power for millions of Texans, fifty-seven deaths, and billions in property damages(POWER Committee, n.d.).
In the following exploration adapted from Dr. Ruth Oliver we will be:
- estimating the number of homes in Houston that lost power as a result of the first two storms
- investigating if socioeconomic factors are predictors of communities susceptibility to power outages
Analysis will be based on remotely-sensed night lights data, acquired from the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In particular, we will detect differences in night lights before and after the storm to identify areas that lost electric power.
This section illustrates the code utilized to explore the effects of the storms.
Loading and Exploring our Data
To investigate the effects of the Houston, TX storms we will use raster data primarily. The following code block uses SQL and the stars package to load in the data needed for the project.
For our raster data we used NASA’s Worldview data for February 7, 2021 (before the power outage) and February 16, 2021 (during the power outage). These dates were selected as other days during the time frame had cloud cover that would have skewed our investigation.
VIIRS data is distributed through NASA’s Level-1 and Atmospheric Archive & Distribution System Distributed Active Archive Center (LAADS DAAC). Data was downloaded and prepped in advance and stored in the .gitignore
.
# read in night lights tiles
night1 <- read_stars('data/VNP46A1/VNP46A1.A2021038.h08v05.001.2021039064328.tif')
night2 <- read_stars('data/VNP46A1/VNP46A1.A2021038.h08v06.001.2021039064329.tif')
night3 <- read_stars('data/VNP46A1/VNP46A1.A2021047.h08v05.001.2021048091106.tif')
night4 <- read_stars('data/VNP46A1/VNP46A1.A2021047.h08v06.001.2021048091105.tif')
# combine tiles into stars object for 2021-02-07 and 2021-02-16
night_07 <- st_mosaic(night1, night2)
night_16 <- st_mosaic(night3, night4)
# view 2021-02-07 raster
plot(night_07, main = "Raster of 2021-02-07")
Next we will minimize light emanating from major highways systems, we used geographic data from OpenStreetMap (OSM) from Geofabrik’s download sites. These data were downloaded and prepped in advance to contain a subset of highway and roads that intersect with the Houston metropolitan area. Data from Geofabrik’s download sites for residences in the Houston metropolitan area were also accessed.
# creates the query to select motorways
highway_query <- "SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"
# load highway data that fulfills query
highways <- st_read("data/gis_osm_roads_free_1.gpkg", query = highway_query)
# reproject highwyas to crs 3083
highways_reproj <- st_transform(highways, crs = 3083)
# creates the query to select residential buildings
residential_query <- "SELECT * FROM gis_osm_buildings_a_free_1 WHERE (type IS NULL AND name IS NULL) OR type in ('residential', 'apartments', 'house', 'static_caravan', 'detached')"
# load buildings data that fulfills query
residential <- st_read("data/gis_osm_buildings_a_free_1.gpkg", query = residential_query)
# reproject buildings to crs 3083
residential_reproj <- st_transform(residential, crs = 3083)
For the final data set utilized, data from the U.S. Census Bureau’s American Community Survey for census tracts in 2019 from an ArcGIS file geodatabase was accessed. The metadata for each layer is available at ACS metadata.
# load data that contains census tract geometries
texas_census <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb", layer = "ACS_2019_5YR_TRACT_48_TEXAS")
# reproject census data to crs 3083
texas_census_reproj <- st_transform(texas_census, crs = 3083)
# load data that contains census tract data and income data
texas_income <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb", layer = "X19_INCOME")
Blackout Mask
To find the change in night lights intensity (presumably) caused by the storm we will create a blackout mask. In order to do this we will be reclassifying the difference between the two rasters. We will be classifying any location that experienced a drop of more than 200 nW cm-2sr-1 experienced a blackout, and any location less than this will be classified to a NA
# find the change in night light intensity between the two dates
light_difference <- night_07-night_16
# reclassifying raster to show blackouts above 200, and label areas that experienced an outage outage, and that didn't NA
light_difference_outage <- cut(light_difference, c(200, Inf), labels = 'outage')
The blackout mask will then be vectorized and fixed for any invalid geometries.
The vectorized mask was then cropped to our region of interest of Houston, TX with the following coordinates being utilized: (-96.5, 29), (-96.5, 30.5), (-94.5, 30.5), (-94.5, 29). The blackout mask was then cropped to the size of Houston , TX and ultimately reprojected to EPSG:3083. The EPSG:3083 is also known as NAD83 / Texas Centric Albers Equal Area.
# defining houston coordinates
houston <- cbind(x = c(-96.5, -96.5, -94.5, -94.5, -96.5),
y = c(29, 30.5, 30.5, 29, 29))
#turn houston coordinates into polygon
houston <- st_sfc(st_polygon(list(houston)), crs = 4326)
# create a mask for the houston area
houston_mask <- st_intersects(blackouts, houston, sparse = FALSE)
# crop the blackout_mask to the houston region of interest
houston_blackouts <- blackouts[houston_mask,]
# reprojecting houston blackouts to Texas specific Datum
houston_blackouts_reproj <- st_transform(houston_blackouts, crs = 3083)
Highways and Residences Selection
We then excluded highways, and created a 200 meter buffer to include buildings outside of 200 meters from a highway. Additionally we filtered to the residential buildings only within the blackout mask. These processes allow us to identify the number of homes impacted by the blackouts.
# identify areas within 200m of all highways
highways_buffer <- st_buffer(highways_reproj, dist = 200) %>%
st_union()
# find areas that had blackouts that are further than 200m
blackouts_outside_highway <- st_difference(houston_blackouts_reproj, highways_buffer)
# dataframe where residential buildings are kept that are in houston outside of our highway buffer
blackout_residential <- residential_reproj[blackouts_outside_highway, , op = st_intersects]
#counting the number of residential buildings that got hit
print(paste0(nrow(blackout_residential), ' residential buildings were affected by the blackouts'))
[1] "157411 residential buildings were affected by the blackouts"
Socioeconomic Effects
To explore if blackouts were correlated with socioeconomic factors like median income, we used census tract data. We used spatial joins to join the census tract data, and buildings that were and were not impacted by blackouts.
# create data frame that contains median income data and geoID
texas_median_income <- texas_income %>%
dplyr::select(B19013e1, GEOID) %>%
rename(median_income = B19013e1, GEOID_Data = GEOID)
# join the income data to census tract geometries in Texas
texas_income_census <- left_join(texas_census_reproj, texas_median_income, by = "GEOID_Data")
# join the census tract data with residential buildings determined to be impacted by blackouts
texas_census_blackouts <- st_filter(texas_income_census, blackout_residential)
Comparison of median income for census tracts impacted by blackouts and those that were not
To compare incomes of impacted census tracts and unimpacted census tracts a map of median income per census tract with the tracts that had blackouts outlined was created.
#transforming houston boundary to Texas datum
houston_reproj <- st_transform(houston, crs = 3083)
# data frame that has all median income is in houston census tracts
houston_income_census <- st_filter(texas_income_census, houston_reproj)
# create visualization
median_income_map <- ggplot() +
geom_sf(data = houston_income_census, # plot the median income of all of houston
aes(fill = median_income)) +
scale_fill_viridis_c() +
geom_sf(data = texas_census_blackouts, aes(color = "red", # outline census tracts that experienced blackouts in red
fill = median_income)) +
scale_color_hue(labels = c("Blackout Census Tracts")) + # alter legend titles
labs(fill='Median Income') +
theme_void() +
annotation_north_arrow( # add north arrow
height = unit(1.5, "cm"),
width = unit(1.5, "cm"),
pad_x = unit(7.5, "cm"),
pad_y = unit(0.15, "cm"),
rotation = NULL,
style = north_arrow_fancy_orienteering
) +
annotation_scale( # add scale bar
plot_unit = NULL,
bar_cols = c("black", "white"),
line_width = 1,
height = unit(0.15, "cm"),
pad_x = unit(2, "cm"),
pad_y = unit(0.15, "cm"),
text_pad = unit(0.15, "cm"),
text_cex = 0.7,
text_face = NULL,
text_family = "",
tick_height = 0.6
) +
labs(title= "Median Income of Census Tracts in Houston, TX", color = "") # add map title
median_income_map
To further explore the impacts of blackouts on census tracts the distribution of income in impacted and unimpacted tracts was plotted.
# find areas that did not have blackouts within the houston area
unimpacted <- houston_income_census[texas_census_blackouts, , op = st_difference]
# create a histogram showing median income spread for homes in houston not impacted by blackouts
unimpacted_figure <- ggplot(data = unimpacted, aes(median_income)) +
geom_histogram(bins=50, fill = '#481467FF', color = 'white') +
labs(title = "Unimpacted by Blackouts", x = "Median Income ($)", y = "Census Tracts") +
theme_classic() +
ylim(0,105)
#unimpacted_figure
# create a histogram showing median income spread for homes in houston impacted by blackouts
impacted_figure <- ggplot(data = texas_census_blackouts, aes(median_income)) +
geom_histogram(bins=50, fill = '#481467FF', color = 'white') +
labs(title = "Impacted by Blackouts", x = "Median Income ($)", y = "Census Tracts") +
theme_classic() +
ylim(0,105)
# position histograms side by side for easier comparison
impact_unimpact_plot <- plot_grid(unimpacted_figure, impacted_figure, labels = c('', ''), label_size = 12)
impact_unimpact_plot
The median income of homes in Houston, TX affected and not affected by blackouts does not appear to have a significant difference. The blackouts do appear to have centered around the center of the city, but this area also has a wide variety of median incomes. Additionally, when comparing the median incomes of homes affected and not affected the distributions appear to be similar for the spread of wealth. Residents not impacted by the storm do appear to have slightly higher median incomes which could coincide with living farther outside the city center in the suburbs of Houston. A limitation of this analysis is that we did reduce the number of buildings counted that fell close to highways, which could be areas that have lower median incomes. Additionally, in exploring the power outages in Houston, TX we only compared the blackouts from the first two storms, without accounting for the third storm. Without accounting for the third storm the total number of outages that occurred is being underestimated.
This goal of this exploration was to become more familiar with spatial data. The results and findings of this investigation are not conclusive and should not be utilized without additional investigations. Overall, I hope this blog post was helpful in learning how different packages and functions can be used for working with spatial data.
@online{widas2023,
author = {Widas, Melissa},
title = {Houston, {Texas} {Power} {Outages}},
date = {2023-12-01},
url = {http://mwidas.github.io/2023-12-01-houston-outages},
langid = {en}
}