# Import necessary libraries for data manipulation and visualization
import pandas as pd
import matplotlib.pyplot as plt
# Read in AQI data for the years 2017 and 2018 from online sources
aqi_17 = pd.read_csv("https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2017.zip")
aqi_18 = pd.read_csv("https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2018.zip")
GitHub Link: https://github.com/maxwellpatt/eds220-pres-repo
Full repository can be found using this link above!
(NBC 2019)
About
Purpose
This notebook performs an environmental analysis using two distinct approaches: Air Quality Index (AQI) trend analysis for Santa Barbara County over 2017 and 2018, and remote sensing data visualization of the 2017 Thomas Fire in California. The primary objectives are to demonstrate data manipulation and visualization techniques, time-series analysis of AQI, and the use of true and false color imagery in assessing wildfire impacts. The two parts of the analysis are complementary: the AQI trend analysis captures the temporal scope of the fire, and the remote sensing segment reveals the spatial scope of the event.
Highlights of Analysis
- AQI Trend Analysis (2017-2018): Retrieval and preparation of AQI data from the EPA, focusing on data cleaning, concatenation, and column modification. Implementation of a 5-day rolling average for AQI to smooth daily variations and a detailed visualization of these trends over time.
- Remote Sensing Data Visualization: Utilization of Landsat 8 satellite imagery for creating true and false color images of Santa Barbara. Integration of California fire perimeter data to assess the spatial impact of the Thomas Fire, enhancing the understanding of wildfire effects through geospatial analysis.
- Data Concatenation and Cleaning: Merging AQI datasets for two consecutive years, followed by data cleaning processes such as modifying column names for consistency and dropping unnecessary columns.
- Visualization Techniques: Development of plots and maps to compare daily AQI values with the 5-day average and to overlay wildfire perimeters on satellite imagery, providing a clear visual representation of both air quality trends and the extent of wildfire damage.
Dataset Description
The analysis leverages two primary datasets:
- Air Quality Index (AQI) Data: Daily AQI measurements by county for 2017 and 2018, sourced from the EPA, providing insights into air quality trends over the two-year period.
- Landsat 8 Satellite Imagery and Fire Perimeter Data: High-resolution imagery capturing various spectral bands, combined with shapefile data of the 2017 California fire perimeters, to visualize and analyze the impact of wildfires. The imagery is pulled from the Microsoft Planetary Computer, where it has been pre-processed to remove data outside land and improve the spatial resolution.
References to Datasets
- EPA Air Quality Data: “Daily AQI.” EPA, October 16, 2023. https://aqs.epa.gov/aqsweb/airdata/download_files.html#AQI.
- NASA EarthData: “Microsoft Planetary Computer.” Planetary Computer. Accessed December 11, 2023. https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2.
- California Fire Perimeters 2017: “California Fire Perimeters.” California State Geoportal. Accessed December 11, 2023. https://gis.data.ca.gov/datasets/CALFIRE-Forestry::california-fire-perimeters-all-1/about.
Air Quality Index Data
Importing Data
This section imports pandas and matplotlib for data manipulation and visualization. The daily AQI data for 2017 and 2018 is read directly from the EPA's AirData download site as zipped CSV files, one file per year.
Analysis
Data Cleaning and Preparation
The datasets from different years are concatenated for a comprehensive analysis. Column names are cleaned for consistency, and irrelevant columns are removed, streamlining the data for effective analysis. The conversion of the ‘date’ column to a datetime object and setting it as an index is verified to ensure proper time-series analysis.
# Concatenate the two data frames for combined analysis
aqi = pd.concat([aqi_17, aqi_18])
# Cleaning column names for ease of use
aqi.columns = aqi.columns.str.lower().str.replace(' ', '_')
# Filtering data for Santa Barbara County
aqi_sb = aqi[aqi['county_name'] == 'Santa Barbara']
# Removing unnecessary columns
remove = ['state_name', 'county_name', 'state_code', 'county_code']
aqi_sb = aqi_sb.drop(columns=remove)
# Convert 'date' column to datetime object and set as index
aqi_sb['date'] = pd.to_datetime(aqi_sb['date'])
aqi_sb = aqi_sb.set_index('date')
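The cleaning steps above can be tried on a tiny table before running them on the full EPA download. The sketch below uses two made-up data frames (`aqi_17_demo`, `aqi_18_demo`) whose column names and values are hypothetical stand-ins, not the real EPA schema or measurements. It also adds `.copy()` after filtering, which is an optional precaution against pandas' chained-assignment warning rather than something the notebook requires.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the two yearly EPA tables
aqi_17_demo = pd.DataFrame({
    "State Name": ["California", "California"],
    "County Name": ["Santa Barbara", "Ventura"],
    "Date": ["2017-12-30", "2017-12-30"],
    "AQI": [52, 61],
})
aqi_18_demo = pd.DataFrame({
    "State Name": ["California", "California"],
    "County Name": ["Santa Barbara", "Ventura"],
    "Date": ["2018-01-01", "2018-01-01"],
    "AQI": [48, 55],
})

# Stack the two years, then normalize column names to snake_case
demo = pd.concat([aqi_17_demo, aqi_18_demo], ignore_index=True)
demo.columns = demo.columns.str.lower().str.replace(" ", "_")

# Keep one county; .copy() makes the subset an independent data frame
demo_sb = demo[demo["county_name"] == "Santa Barbara"].copy()
demo_sb = demo_sb.drop(columns=["state_name", "county_name"])

# Parse the dates and use them as the index for time-series work
demo_sb["date"] = pd.to_datetime(demo_sb["date"])
demo_sb = demo_sb.set_index("date")

print(demo_sb)
```

After these steps only the `aqi` column remains, indexed by a `DatetimeIndex`, which is the shape the rolling-average step expects.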
Time-Series Analysis
A 5-day rolling average for AQI is calculated and checked by displaying the first few entries. This step is critical for smoothing out daily fluctuations and observing longer-term trends in air quality.
# Create a 5-day rolling average for AQI
aqi_sb['five_day_average'] = aqi_sb.aqi.rolling('5D').mean()
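Passing `'5D'` to `rolling` makes the window time-based rather than row-based, which matters when days are missing from the record. The following sketch, built on invented AQI values with one day deliberately removed, illustrates the difference: each average covers only the observations that fall within the preceding five calendar days, and the first rows are averaged over however many observations exist so far.

```python
import pandas as pd

# Invented daily AQI values; these are not real measurements
dates = pd.date_range("2017-12-01", periods=6, freq="D")
toy = pd.DataFrame({"aqi": [40, 50, 60, 200, 100, 30]}, index=dates)

# Simulate a gap in the record by dropping December 3
toy = toy.drop(index=pd.Timestamp("2017-12-03"))

# Time-based window: requires a DatetimeIndex, looks back 5 calendar days
toy["five_day_average"] = toy["aqi"].rolling("5D").mean()

print(toy)
```

A row-based window such as `rolling(5)` would instead return `NaN` until five rows had accumulated and would silently span more than five days wherever the record has gaps.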
Final Output
The output visualizes the Air Quality Index (AQI) in Santa Barbara County across 2017 and 2018. The blue line shows the daily AQI values, which vary considerably and include several peaks indicating days of poor air quality. The orange line shows the 5-day rolling average, which smooths out the daily fluctuations to reveal the underlying trend. The dashed vertical line marks the start of December 2017, and the significant peak that follows it coincides with the Thomas Fire. Overall, the plot communicates how air quality changed over time and how a rolling average helps expose longer-term trends.
# Plotting daily AQI and 5-day average AQI
aqi_sb['aqi'].plot(label='Daily AQI', color='blue')
aqi_sb['five_day_average'].plot(label='5-Day Average AQI', color='orange', linewidth=2)
plt.axvline(pd.Timestamp('2017-12-01'), color='black', linestyle='--', label='Start of December 2017')
plt.title('Daily AQI vs. 5-Day Average AQI')
plt.xlabel('Date')
plt.ylabel('AQI Value')
plt.legend()
plt.savefig('images/aqi_averages.png')
plt.show()
This visual shows the extreme impact that the Thomas Fire in December 2017 had on AQI levels in Santa Barbara. The dashed line marks the start of December 2017; the fire started just a few days into the month, and AQI levels spiked significantly as a result. The spike is visible in both the daily values and the 5-day rolling average.
False Color Image
Importing Data
In this initial step, libraries essential for processing geospatial data, such as NumPy, Pandas, GeoPandas, and xarray, are imported. These tools enable the handling of complex raster and vector data formats necessary for environmental and geographical analyses.
# Import necessary libraries for handling geospatial and raster data
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import xarray as xr
import rioxarray as rioxr
import geopandas as gpd
from rasterio.features import rasterize
from rasterio.crs import CRS
# File path for raster data
data_path = os.path.join(os.getcwd(), "data", "landsat8-2018-01-26-sb-simplified.nc")
# Open raster file
landsat = rioxr.open_rasterio(data_path)
# Read fire data
fire = gpd.read_file("data/California_Fire_Perimeters_2017/California_Fire_Perimeters_2017.shp")
Geographical Context
The geographical context is established by loading Landsat 8 satellite imagery for the Santa Barbara region and fire perimeter data from 2017. This step situates the analysis within the specific area affected by the Thomas Fire, setting the stage for a targeted examination of the landscape.
Data Preparation and Alignment
This section involves transforming the CRS of the fire perimeter data to match the Landsat data. The successful alignment of these datasets is confirmed, which is imperative for precise spatial overlay in the analysis.
# Reduce dimensions: remove the length-1 'band' dimension, then its leftover coordinate
landsat_new = landsat.squeeze(['band'])
landsat_new = landsat_new.drop_vars('band')
# Display updated landsat
landsat_new
<xarray.Dataset>
Dimensions:      (x: 870, y: 731)
Coordinates:
  * x            (x) float64 1.213e+05 1.216e+05 ... 3.557e+05 3.559e+05
  * y            (y) float64 3.952e+06 3.952e+06 ... 3.756e+06 3.755e+06
    spatial_ref  int64 0
Data variables:
    red          (y, x) float64 ...
    green        (y, x) float64 ...
    blue         (y, x) float64 ...
    nir08        (y, x) float64 ...
    swir22       (y, x) float64 ...
Analysis
Creating True and False Color Images
The creation of true and false color images utilizes specific bands from the Landsat data. These images are normalized to enhance visual contrast, aiding in the identification of different land features. Checks confirm the normalization process has occurred correctly.
True Color Image
# Select R, G, B bands
red_band = landsat_new['red']
green_band = landsat_new['green']
blue_band = landsat_new['blue']
# Stack the bands along the 'color' dimension
rgb_image = xr.concat([red_band, green_band, blue_band], dim='color')
# Normalize all values to the 0-1 range expected by imshow
rgb_image = (rgb_image - rgb_image.min()) / (rgb_image.max() - rgb_image.min())
# Plot the RGB image
plt.imshow(rgb_image.transpose('y', 'x', 'color'))
# Visualize map
plt.show()
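The normalization used for the true color image is a min-max rescaling over all three bands at once. As a minimal sketch of that arithmetic, the example below applies it to small made-up arrays in NumPy; the band values are invented and the arrays stand in for the much larger Landsat bands.

```python
import numpy as np

# Invented 2x2 reflectance-like values for three bands
red = np.array([[100.0, 200.0], [300.0, 400.0]])
green = np.array([[150.0, 250.0], [350.0, 450.0]])
blue = np.array([[50.0, 120.0], [220.0, 500.0]])

# Stack into (y, x, color), the layout plt.imshow expects for RGB data
rgb = np.stack([red, green, blue], axis=-1)

# Min-max rescale using one global minimum and maximum,
# so the relative brightness between bands is preserved
rgb_norm = (rgb - rgb.min()) / (rgb.max() - rgb.min())

print(rgb_norm.shape, rgb_norm.min(), rgb_norm.max())
```

Because a single global minimum and maximum are used, one very bright pixel (a cloud, for example) compresses every other value toward zero and can make the image look dark.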
False Color Image
# Plot the false color image (SWIR, NIR, red) using imshow
landsat_new[['swir22', 'nir08', 'red']].to_array().plot.imshow(robust=True)
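The `robust=True` argument tells xarray to set the color range from the 2nd and 98th percentiles of the data instead of the extreme values, which keeps a few outlier pixels from washing out the image. The sketch below illustrates that idea in plain NumPy. The helper `percentile_stretch` is a hypothetical name introduced here for illustration, the band is synthetic random data, and the code is not xarray's own implementation.

```python
import numpy as np

def percentile_stretch(band, lower=2, upper=98):
    """Rescale a band to 0-1 between two percentiles, clipping the tails."""
    lo, hi = np.nanpercentile(band, [lower, upper])
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

# Synthetic band: mostly moderate values plus one extreme outlier
rng = np.random.default_rng(0)
band = rng.normal(0.2, 0.05, size=(50, 50))
band[0, 0] = 10.0  # stand-in for a cloud or sensor artifact

stretched = percentile_stretch(band)

# The outlier saturates at 1.0 while most pixels keep usable contrast
print(stretched[0, 0], stretched.min(), stretched.max())
```

With a plain min-max rescale, the single outlier would push nearly every other pixel close to zero; clipping at the percentiles spreads the bulk of the values across the full display range.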
Final Output
The final output is a composite image that illustrates the affected area during the Thomas Fire. This output serves as a potent visual tool for understanding the spatial extent of wildfires and highlights the value of remote sensing in environmental monitoring and disaster assessment.
# Filter for Thomas fire
thomas_fire = fire[fire['FIRE_NAME']=="THOMAS"]
# Reproject thomas_fire to the same CRS as landsat
thomas_fire = thomas_fire.to_crs(landsat_new.rio.crs)
# Store false color map
false_color = landsat_new[['swir22', 'nir08', 'red']].to_array()
# Initiate figure
fig, ax = plt.subplots(figsize=(6,6))
# Plot the false color image as the base layer
false_color.plot.imshow(ax=ax, robust=True)
# Plot the outline of the Thomas Fire perimeter and create key for legend
# Set facecolor to 'none' to only show the border
thomas_fire.plot(ax=ax, facecolor='none', edgecolor='red', linewidth=0.5)
thomas_patch = mpatches.Patch(edgecolor='red', facecolor='none', label='Thomas Fire')
# Create legend
ax.legend(handles=[thomas_patch], frameon=True, loc='upper right', bbox_to_anchor=(1.4, 1))
ax.set_title('False Color Image of Santa Barbara with Thomas Fire Boundary')
# Save the figure
plt.savefig('images/thomas_fire_boundary.png')
This visual provides context on the area that was impacted by the Thomas Fire. Paired with the air quality index temporal data, these visual outputs give us a better understanding of the scope of this catastrophic event in Santa Barbara.