2 Temporal analysis for remote sensing data
2.1 Temporal trend analysis for bushfire
Aim
This code is designed to compute the linear trend (slope and intercept) of burned area over time for each grid cell in a shapefile using linear regression, and save the results as a new shapefile.
Description of steps
- Load necessary libraries (sf for spatial data and dplyr for data manipulation).
- Read the shapefile containing burned area data.
- Extract burned area column names corresponding to years 2000-2024.
- Create a sequence of years as the independent variable for regression.
- Perform linear regression for each grid cell to estimate slope and intercept:
  - Use lm() to fit a linear model with years as the independent variable.
  - Extract the slope (trend of burned area change) and intercept.
- Remove the regression model objects to keep only numerical results.
- Save the updated shapefile with computed slope and intercept values.
Code (R version):
# Load necessary libraries
library(sf)
library(dplyr)
# Read the shapefile
grid <- st_read("/your_path_here/grid5_Sta.shp")

# Extract column names (ensure column order matches years)
ba_cols <- paste0("BA_", 2000:2024) # Burned area for 25 years

# Create a sequence of years (independent variable)
years <- 2000:2024

# Compute the regression slope and intercept for each grid cell
grid <- grid %>%
  rowwise() %>%
  mutate(
    lm_model = list(lm(as.numeric(c_across(all_of(ba_cols))) ~ years)), # Linear regression
    slope = coef(lm_model[[1]])[2],     # Slope (annual change in burned area)
    intercept = coef(lm_model[[1]])[1]  # Intercept
  ) %>%
  select(-lm_model) %>% # Remove model object
  ungroup()
# Save the new shapefile
st_write(grid, "/your_path_here/grid5_Linear.shp", delete_layer = TRUE)
Code (Python version):
import geopandas as gpd
import numpy as np
from scipy.stats import linregress
# Read the Shapefile
gdf = gpd.read_file("/your_path_here/grid5_Sta.shp")

# Extract column names for the time series
years = list(range(2000, 2025))  # 2000 to 2024
columns = [f"BA_{year}" for year in years]

# Ensure all columns exist in the dataset
columns = [col for col in columns if col in gdf.columns]

# Time axis
time = np.array(years[:len(columns)])

def compute_slope(row):
    y_values = row[columns].values.astype(float)  # Retrieve the time series for this grid cell
    if np.all(np.isnan(y_values)):  # If all values are NaN, return NaN
        return np.nan
    slope, _, _, _, _ = linregress(time, y_values)
    return slope

# Compute the slope and store it in a new column
gdf["BA_slope"] = gdf.apply(compute_slope, axis=1)

# Save the results
gdf.to_file("/your_path_here/grid5_Sta_slope.shp")
print("Linear regression calculation completed, results saved.")
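As a quick sanity check of the per-cell regression step, the slope logic can be run on a synthetic 25-year series (a standalone sketch; the toy values are invented and stand in for one grid cell's BA_ columns):

```python
import numpy as np
from scipy.stats import linregress

# Synthetic burned-area series: grows by exactly 2 units per year over 2000-2024
years = np.arange(2000, 2025)
ba = 10 + 2 * (years - 2000)

# Same call as in compute_slope above
slope, intercept, r_value, p_value, std_err = linregress(years, ba)
print(round(slope, 6))  # 2.0: the imposed annual trend is recovered
```

A recovered slope of 2 confirms that regressing against calendar years (rather than 0..24) only shifts the intercept, not the trend estimate.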
2.2 Temporal correlation analysis between bushfire and climate
Aim
This code is designed to compute, for each grid cell, the correlation coefficients and p-values between burned area and precipitation/temperature over the years 2000-2024, and save the results as a new shapefile. Note that the R version uses the Spearman rank correlation on all observations, while the Python version uses the Pearson correlation and excludes zero values.
Description of steps
- Load necessary libraries (sf for spatial data and dplyr for data manipulation).
- Read the shapefile containing burned area, precipitation, and temperature data.
- Extract column names corresponding to:
  - Burned area (BA_2000 to BA_2024)
  - Precipitation (Pre2000 to Pre2024)
  - Temperature (Tem2000 to Tem2024)
- Compute the Spearman correlation for each grid cell:
  - Burned area vs. Precipitation (cor_BA_Pre)
  - Burned area vs. Temperature (cor_BA_Tem)
- Extract the correlation coefficient (estimate) and p-value (p.value) for statistical significance.
- Save the updated shapefile containing correlation results.
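Spearman is a rank-based measure, so it captures monotone but nonlinear relationships that Pearson can understate. A quick illustration with invented values (Python, to match the section's Python examples):

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

# Monotone but strongly nonlinear relationship
x = np.arange(1, 11, dtype=float)
y = np.exp(x)  # strictly increasing, so the ranks of x and y agree exactly

rho, _ = spearmanr(x, y)
r, _ = pearsonr(x, y)
print(round(rho, 6))  # 1.0: perfect rank agreement
print(r < 1.0)        # True: the linear correlation understates the monotone link
```

This is why a rank correlation is a common choice when the burned area response to climate drivers is not expected to be linear.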
Code (R version):
# Load necessary libraries
library(sf)
library(dplyr)
# Read the shapefile
grid <- st_read("/your_path_here/grid5_Sta.shp")

# Extract column names (ensure column order matches years)
ba_cols  <- paste0("BA_", 2000:2024) # Burned area columns
pre_cols <- paste0("Pre", 2000:2024) # Precipitation columns
tem_cols <- paste0("Tem", 2000:2024) # Temperature columns

# Compute the Spearman correlation and p-values for each grid cell.
# cor.test() drops incomplete pairs itself (it has no 'use' argument);
# exact = FALSE uses the asymptotic p-value and avoids tie warnings.
grid <- grid %>%
  rowwise() %>%
  mutate(
    cor_BA_Pre = cor.test(as.numeric(c_across(all_of(ba_cols))),
                          as.numeric(c_across(all_of(pre_cols))),
                          method = "spearman", exact = FALSE)$estimate,
    p_BA_Pre = cor.test(as.numeric(c_across(all_of(ba_cols))),
                        as.numeric(c_across(all_of(pre_cols))),
                        method = "spearman", exact = FALSE)$p.value,
    cor_BA_Tem = cor.test(as.numeric(c_across(all_of(ba_cols))),
                          as.numeric(c_across(all_of(tem_cols))),
                          method = "spearman", exact = FALSE)$estimate,
    p_BA_Tem = cor.test(as.numeric(c_across(all_of(ba_cols))),
                        as.numeric(c_across(all_of(tem_cols))),
                        method = "spearman", exact = FALSE)$p.value
  ) %>%
  ungroup()
# Save the new shapefile
st_write(grid, "/your_path_here/grid5_Cor.shp", delete_layer = TRUE)
Code (Python version):
import geopandas as gpd
import numpy as np
import pandas as pd
from scipy.stats import pearsonr
# Read the Shapefile
gdf = gpd.read_file("/your_path_here/grid5_Sta_slope.shp")

# Extract column names for the time series
years = list(range(2000, 2025))  # 2000 to 2024
ba_columns = [f"BA_{year}" for year in years]
tem_columns = [f"Tem{year}" for year in years]
pre_columns = [f"Pre{year}" for year in years]

# Ensure all relevant columns exist in the dataset
ba_columns = [col for col in ba_columns if col in gdf.columns]
tem_columns = [col for col in tem_columns if col in gdf.columns]
pre_columns = [col for col in pre_columns if col in gdf.columns]

# Check if any required columns are missing
if not ba_columns or not tem_columns or not pre_columns:
    raise ValueError("BA, Tem, or Pre-related columns are missing. Please check the data file.")

print("BA Columns:", ba_columns)
print("Tem Columns:", tem_columns)
print("Pre Columns:", pre_columns)

# Ensure all data columns are numeric
for col in ba_columns + tem_columns + pre_columns:
    gdf[col] = pd.to_numeric(gdf[col], errors='coerce')

# Function to compute Pearson correlation (excluding zero values)
def compute_correlation(row, x_columns):
    ba_values = row[ba_columns].values.astype(float)
    x_values = row[x_columns].values.astype(float)

    # Check if arrays are empty
    if len(ba_values) == 0 or len(x_values) == 0:
        return np.nan, np.nan

    # Filter out NaN and zero values
    mask = (ba_values != 0) & (x_values != 0) & ~np.isnan(ba_values) & ~np.isnan(x_values)
    ba_filtered = ba_values[mask]
    x_filtered = x_values[mask]

    # Ensure both arrays have at least two valid values
    if len(ba_filtered) < 2 or len(x_filtered) < 2:
        return np.nan, np.nan

    correlation, p_value = pearsonr(ba_filtered, x_filtered)
    return correlation, p_value

# Compute Pearson correlation and significance level between BA and Tem
gdf[["BA_Tem_corr", "BA_Tem_pval"]] = gdf.apply(lambda row: compute_correlation(row, tem_columns), axis=1, result_type='expand')

# Compute Pearson correlation and significance level between BA and Pre
gdf[["BA_Pre_corr", "BA_Pre_pval"]] = gdf.apply(lambda row: compute_correlation(row, pre_columns), axis=1, result_type='expand')

# Save the results (note: shapefile field names are limited to 10 characters,
# so names like "BA_Tem_corr" will be truncated by the driver on write)
gdf.to_file("/your_path_here/grid5_Sta_corr.shp")
print("Pearson correlation analysis between BA and Tem, as well as BA and Pre, has been completed (excluding zero values). Results saved.")
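The zero/NaN masking inside compute_correlation can also be checked on toy vectors (a standalone sketch mirroring just the masking step; the helper name and values are invented):

```python
import numpy as np
from scipy.stats import pearsonr

def masked_pearson(ba_values, x_values):
    """Pearson correlation after dropping NaN and zero pairs, as in compute_correlation."""
    ba_values = np.asarray(ba_values, dtype=float)
    x_values = np.asarray(x_values, dtype=float)
    mask = (ba_values != 0) & (x_values != 0) & ~np.isnan(ba_values) & ~np.isnan(x_values)
    if mask.sum() < 2:
        return np.nan, np.nan
    return pearsonr(ba_values[mask], x_values[mask])

# The zero pair (index 2) and the NaN pair (index 3) are discarded;
# the surviving pairs (1,2), (2,4), (4,8) are perfectly proportional.
r, p = masked_pearson([1, 2, 0, np.nan, 4], [2, 4, 5, 6, 8])
print(round(r, 6))  # 1.0
```

With fewer than two valid pairs the helper returns NaN, matching the guard in the main script.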