Wildfire Analysis with ggplot and leaflet: Part I

Aaron Long

2/04/2020



Introduction: Part I

In 2017, the Tubbs Fire in Santa Rosa destroyed not only my family’s home but also our entire neighborhood. We lost 95% of our belongings, including irreplaceable photographs and mementos. It was a truly devastating event that traumatized me and my family.

For this report, I am exploring a dataset on the 1.88 million wildfires that occurred in the US from 1992 to 2015. The dataset can be found on Kaggle: https://www.kaggle.com/rtatman/188-million-us-wildfires. I will briefly look at all the data, then dive into large fires only. This means fires in size class F (1,000 to 4,999 acres burned) and size class G (5,000 or more acres burned). Some of the largest fires in CA, for example, burned hundreds of thousands of acres. I want to focus on California and Florida, as these are where I live now and where the data is most relevant to me.

Tubbs Fire Pictures

Before: Our home before the Tubbs Fire.

After: Our home a few days after the Tubbs Fire.

After two years of hardship and dealing with the situation, we were finally able to begin rebuilding the house in 2019.

Getting Started

Load in the Libraries used for the Dataset

To begin, load the libraries needed for SQLite, ggplot, dplyr, map data, and leaflet.
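The exact package list is not shown in this write-up, so the block below is only a sketch of what the loading step might look like. The package names are my assumption based on the tools named above (RSQLite for SQLite, maps for the map data).

```r
# Assumed package set, inferred from the tools named in the text.
# Install any that are missing with install.packages() first.
library(DBI)      # generic database interface used by RSQLite
library(RSQLite)  # SQLite driver
library(dplyr)    # data manipulation and glimpse()
library(ggplot2)  # plotting; also provides map_data()
library(maps)     # state outlines used by map_data("state")
library(leaflet)  # interactive maps (used in Part II)
```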

Load the Data

Connect to the SQLite database and load in the data. Then close the connection to free up computer resources such as memory and CPU.

The glimpse function gives an overall idea of how the data is structured. For instance, this data contains 39 columns and about 1.88 million rows. Useful columns include, but are not limited to, FIRE_YEAR and FIRE_SIZE.
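A minimal sketch of the connect, load, and disconnect pattern follows. The file name `FPA_FOD_20170508.sqlite` and the table name `Fires` are assumptions about the Kaggle download, and `load_fires` is a hypothetical helper; adjust them to match your copy.

```r
library(DBI)
library(RSQLite)

# Assumed name of the SQLite file from the Kaggle download.
db_path <- "FPA_FOD_20170508.sqlite"

# Hypothetical helper: read one table into a data frame and always
# close the connection afterwards, even if the read fails.
load_fires <- function(path, table = "Fires") {
  con <- dbConnect(SQLite(), path)
  on.exit(dbDisconnect(con), add = TRUE)
  dbReadTable(con, table)
}

# Only attempt the load when the file is actually present.
if (file.exists(db_path)) {
  fires <- load_fires(db_path)
}
```

Using `on.exit()` keeps the disconnect next to the connect, so the connection is released no matter how the function exits.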

## Observations: 1,880,465
## Variables: 39
## $ OBJECTID                   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1...
## $ FOD_ID                     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1...
## $ FPA_ID                     <chr> "FS-1418826", "FS-1418827", "FS-1418835"...
## $ SOURCE_SYSTEM_TYPE         <chr> "FED", "FED", "FED", "FED", "FED", "FED"...
## $ SOURCE_SYSTEM              <chr> "FS-FIRESTAT", "FS-FIRESTAT", "FS-FIREST...
## $ NWCG_REPORTING_AGENCY      <chr> "FS", "FS", "FS", "FS", "FS", "FS", "FS"...
## $ NWCG_REPORTING_UNIT_ID     <chr> "USCAPNF", "USCAENF", "USCAENF", "USCAEN...
## $ NWCG_REPORTING_UNIT_NAME   <chr> "Plumas National Forest", "Eldorado Nati...
## $ SOURCE_REPORTING_UNIT      <chr> "0511", "0503", "0503", "0503", "0503", ...
## $ SOURCE_REPORTING_UNIT_NAME <chr> "Plumas National Forest", "Eldorado Nati...
## $ LOCAL_FIRE_REPORT_ID       <chr> "1", "13", "27", "43", "44", "54", "58",...
## $ LOCAL_INCIDENT_ID          <chr> "PNF-47", "13", "021", "6", "7", "8", "9...
## $ FIRE_CODE                  <chr> "BJ8K", "AAC0", "A32W", NA, NA, NA, NA, ...
## $ FIRE_NAME                  <chr> "FOUNTAIN", "PIGEON", "SLACK", "DEER", "...
## $ ICS_209_INCIDENT_NUMBER    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ICS_209_NAME               <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ MTBS_ID                    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ MTBS_FIRE_NAME             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ COMPLEX_NAME               <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ FIRE_YEAR                  <int> 2005, 2004, 2004, 2004, 2004, 2004, 2004...
## $ DISCOVERY_DATE             <dbl> 2453404, 2453138, 2453157, 2453185, 2453...
## $ DISCOVERY_DOY              <int> 33, 133, 152, 180, 180, 182, 183, 67, 74...
## $ DISCOVERY_TIME             <chr> "1300", "0845", "1921", "1600", "1600", ...
## $ STAT_CAUSE_CODE            <dbl> 9, 1, 5, 1, 1, 1, 1, 5, 5, 1, 1, 1, 9, 4...
## $ STAT_CAUSE_DESCR           <chr> "Miscellaneous", "Lightning", "Debris Bu...
## $ CONT_DATE                  <dbl> 2453404, 2453138, 2453157, 2453190, 2453...
## $ CONT_DOY                   <int> 33, 133, 152, 185, 185, 183, 184, 67, 74...
## $ CONT_TIME                  <chr> "1730", "1530", "2024", "1400", "1200", ...
## $ FIRE_SIZE                  <dbl> 0.10, 0.25, 0.10, 0.10, 0.10, 0.10, 0.10...
## $ FIRE_SIZE_CLASS            <chr> "A", "A", "A", "A", "A", "A", "A", "B", ...
## $ LATITUDE                   <dbl> 40.03694, 38.93306, 38.98417, 38.55917, ...
## $ LONGITUDE                  <dbl> -121.0058, -120.4044, -120.7356, -119.91...
## $ OWNER_CODE                 <dbl> 5, 5, 13, 5, 5, 5, 5, 13, 13, 5, 5, 5, 5...
## $ OWNER_DESCR                <chr> "USFS", "USFS", "STATE OR PRIVATE", "USF...
## $ STATE                      <chr> "CA", "CA", "CA", "CA", "CA", "CA", "CA"...
## $ COUNTY                     <chr> "63", "61", "17", "3", "3", "5", "17", N...
## $ FIPS_CODE                  <chr> "063", "061", "017", "003", "003", "005"...
## $ FIPS_NAME                  <chr> "Plumas", "Placer", "El Dorado", "Alpine...
## $ Shape                      <blob> blob[60 B], blob[60 B], blob[60 B], blo...

Exploring the Data

Bar Charts

With the data loaded, I am using ggplot to create bar plots that help build an initial understanding of the data. For this example, I’m only revealing a small portion of the code as a sample, to keep things easy to follow.

Note: Feel free to message or email me if you are interested in looking at the R file with the rest of the code in detail.

This chart shows all the wildfires in the United States from 1992 to 2015. The blue dotted line indicates a small uptick in the number of wildfires over that period. As you can see, 2006 has the most wildfires, by a margin of approximately 56,000, while 1997 has the fewest.
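A chart like this one could be sketched as follows. This is not the original code: `plot_fires_by_year` is a hypothetical helper, the colors are arbitrary, and I am assuming the dotted line is a simple linear fit over the yearly counts.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: bar chart of wildfire counts per year with a
# dotted linear trend line. Expects a FIRE_YEAR column.
plot_fires_by_year <- function(fires) {
  per_year <- count(fires, FIRE_YEAR, name = "n_fires")
  ggplot(per_year, aes(x = FIRE_YEAR, y = n_fires)) +
    geom_col(fill = "firebrick") +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
                linetype = "dotted", colour = "blue") +
    labs(x = "Year", y = "Number of wildfires",
         title = "US wildfires per year, 1992-2015")
}

# Usage, assuming `fires` holds the loaded data:
# plot_fires_by_year(fires)
```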

This graph presents 13 different causes of wildfires from 1992 to 2015. According to the data, the four major causes of wildfires are debris burning, miscellaneous, arson, and lightning. Surprisingly, the three least common causes are structures, fireworks, and power lines.
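One way to rank the causes is to count rows per STAT_CAUSE_DESCR and sort the bars by that count. The helper name `plot_causes` and the horizontal layout are my own choices for this sketch.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: horizontal bar chart of wildfire counts per cause,
# ordered from least to most common. Expects a STAT_CAUSE_DESCR column.
plot_causes <- function(fires) {
  fires %>%
    count(STAT_CAUSE_DESCR, name = "n_fires") %>%
    ggplot(aes(x = reorder(STAT_CAUSE_DESCR, n_fires), y = n_fires)) +
    geom_col() +
    coord_flip() +
    labs(x = NULL, y = "Number of wildfires")
}

# Usage, assuming `fires` holds the loaded data:
# plot_causes(fires)
```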

For this graph, the wildfires are divided into seven size classes. According to the data, class B (0.26-9.9 acres) has the most wildfires, with approximately 940,000.
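The class letters in FIRE_SIZE_CLASS can be mapped to acre ranges before counting. The sketch below uses the standard class boundaries from the dataset description; the names `size_class_labels` and `count_size_classes` are hypothetical.

```r
library(dplyr)

# Acre ranges for each size class letter.
size_class_labels <- c(
  A = "0-0.25", B = "0.26-9.9", C = "10.0-99.9", D = "100-299",
  E = "300-999", F = "1000-4999", G = "5000+"
)

# Hypothetical helper: count fires per size class, keeping classes
# with zero fires so all seven rows always appear.
count_size_classes <- function(fires) {
  fires %>%
    mutate(size_range = factor(unname(size_class_labels[FIRE_SIZE_CLASS]),
                               levels = unname(size_class_labels))) %>%
    count(size_range, name = "n_fires", .drop = FALSE)
}

# Usage, assuming `fires` holds the loaded data:
# count_size_classes(fires)
```

The resulting table can be passed straight to `ggplot()` with `geom_col()` to draw the bar chart.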

Map Data

To start creating maps, load in the map data. I also created new variables, such as region and fire duration.
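The definitions of these variables are not shown here, so the following is a sketch under assumptions: `region` is taken to be the lowercase state name that `map_data("state")` uses as its join key, and fire duration is the difference in days between the Julian-day columns CONT_DATE and DISCOVERY_DATE. The helper `add_map_fields` and the extra `discovery_date` column are hypothetical.

```r
library(dplyr)

# Hypothetical helper: add a map join key, a duration in days, and a
# calendar date derived from the Julian day number.
add_map_fields <- function(fires) {
  fires %>%
    mutate(
      # state.abb and state.name ship with base R; codes that are not
      # among the 50 states (for example DC or PR) become NA.
      region = tolower(state.name[match(STATE, state.abb)]),
      fire_duration = CONT_DATE - DISCOVERY_DATE,
      # Julian day 2440587.5 corresponds to 1970-01-01 00:00 UTC.
      discovery_date = as.Date(DISCOVERY_DATE - 2440587.5,
                               origin = "1970-01-01")
    )
}

# Usage, assuming `fires` holds the loaded data:
# fires <- add_map_fields(fires)
```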

US Maps

Below are two US maps that illustrate the number of wildfires in each state from 1992 to 2015.

The first map displays the total count of wildfires. Based on the data, California, Texas, and, surprisingly, Georgia had the most fires.

The second map displays only large fires, those of 1,000 acres or more. It shows that large fires are concentrated in the western US. The states with the most large fires are California, Texas, and now Idaho. The gray states (Vermont, Massachusetts, New Hampshire, and Rhode Island) had no large wildfires recorded, likely because they are small and generally have colder climates.
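Both maps can be produced by one function that counts fires per state and joins the counts onto the state polygons. This is a sketch rather than the original code: `plot_state_counts` is a hypothetical helper, the color scale is arbitrary, and `map_data("state")` covers only the lower 48 states. States with no matching fires get NA and are drawn in gray.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: choropleth of wildfire counts per state.
# Set large_only = TRUE to keep only fires of 1,000 acres or more.
plot_state_counts <- function(fires, large_only = FALSE) {
  if (large_only) {
    fires <- filter(fires, FIRE_SIZE >= 1000)
  }
  counts <- fires %>%
    mutate(region = tolower(state.name[match(STATE, state.abb)])) %>%
    count(region, name = "n_fires")

  # left_join keeps the polygon rows in their original drawing order.
  map_data("state") %>%
    left_join(counts, by = "region") %>%
    ggplot(aes(x = long, y = lat, group = group, fill = n_fires)) +
    geom_polygon(colour = "white") +
    scale_fill_gradient(low = "yellow", high = "red", na.value = "grey80") +
    coord_quickmap() +
    theme_void()
}

# Usage, assuming `fires` holds the loaded data:
# plot_state_counts(fires)                     # all fires
# plot_state_counts(fires, large_only = TRUE)  # large fires only
```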

Large Wildfires in California

This map illustrates large wildfires in California from 1992 to 2015 in 5-year increments. It indicates that 1995-2000 and 2005-2010 had the most large wildfires in California. It also reveals that most of the 1992-1995 wildfires were in Southern California and that, after 2005, large fires shifted toward Northern California.
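A point map like this can be sketched by binning FIRE_YEAR into periods and plotting each large fire at its coordinates. The exact bin edges used for the map are not stated, so the breaks below are an assumption, and `plot_large_fires` is a hypothetical helper.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: map of large fires (1,000+ acres) in one state,
# colored by period and sized by acres burned.
plot_large_fires <- function(fires, state_abb = "CA") {
  region_name <- tolower(state.name[match(state_abb, state.abb)])
  outline <- map_data("state", region = region_name)

  large <- fires %>%
    filter(STATE == state_abb, FIRE_SIZE >= 1000) %>%
    mutate(period = cut(
      FIRE_YEAR,
      breaks = c(1991, 1995, 2000, 2005, 2010, 2015),  # assumed bin edges
      labels = c("1992-1995", "1996-2000", "2001-2005",
                 "2006-2010", "2011-2015")
    ))

  ggplot() +
    geom_polygon(data = outline, aes(x = long, y = lat, group = group),
                 fill = "grey95", colour = "grey40") +
    geom_point(data = large,
               aes(x = LONGITUDE, y = LATITUDE,
                   colour = period, size = FIRE_SIZE),
               alpha = 0.6) +
    coord_quickmap() +
    theme_void()
}

# Usage, assuming `fires` holds the loaded data:
# plot_large_fires(fires, "CA")
# plot_large_fires(fires, "FL")
```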

This graph presents the acreage burned in California. There is a clear upward trend in the total acreage burned each year.

This map presents the 13 causes of large wildfires in California. The top three causes are lightning, miscellaneous, and equipment use. In Northern California, a large number of fires are caused by lightning and equipment use; one example is the Tubbs Fire that destroyed my home in 2017.
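To put a number on the trend, the yearly totals can be fitted with a simple linear model. This is a sketch with hypothetical helper names (`acres_by_year`, `trend_slope`); whether the original graph covers all fires or only large ones is not stated, so the size cutoff is left as a parameter.

```r
library(dplyr)

# Hypothetical helper: total acres burned per year in one state.
# min_acres = 1000 would restrict the totals to large fires.
acres_by_year <- function(fires, state_abb, min_acres = 0) {
  fires %>%
    filter(STATE == state_abb, FIRE_SIZE >= min_acres) %>%
    group_by(FIRE_YEAR) %>%
    summarise(total_acres = sum(FIRE_SIZE), .groups = "drop")
}

# Hypothetical helper: slope of a least-squares line through the yearly
# totals, in acres per year. A positive value means an upward trend.
trend_slope <- function(per_year) {
  fit <- lm(total_acres ~ FIRE_YEAR, data = per_year)
  coef(fit)[["FIRE_YEAR"]]
}

# Usage, assuming `fires` holds the loaded data:
# ca <- acres_by_year(fires, "CA")
# trend_slope(ca)
```

A linear fit is a rough summary only; a few very large fire years can dominate the slope.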

Large Wildfires in Florida

This map illustrates large wildfires in Florida from 1992 to 2015 in 5-year increments. It indicates that 1995-2005 had the most large wildfires in Florida. It also reveals that most of the wildfires are in Southern Florida, near the Everglades.

This graph presents the acreage burned in Florida. There is a clear but small upward trend in the total acreage burned each year.

This map presents 9 causes of large wildfires in Florida. The top cause of wildfires in Florida is lightning, which accounts for at least four times as many fires as any other cause. This comes as no surprise, since Florida is the Thunderstorm Capital of America.

Conclusion:

Looking at large wildfires, there does seem to be an increase in their number across the U.S. as a whole and in California and Florida in particular. There are likely many causes, but climate change appears to be a strong one: as temperatures rise and weather grows more extreme, the number of wildfires in the U.S. increases. Check out Part II to see some interactive leaflet maps.