Wildfires are caused by two things coming together: an ignition source and conditions suitable for a fire to burn. Either one alone is unlikely to cause a wildfire. Using this intuition, we have built a web-application backed by a predictive machine learning model.
We used NASA, CSA and USFS (US Forest Service) datasets to aggregate data about past occurrences of wildfires in California, USA, the weather conditions just before the wildfires, the proximity of each location to human presence, and the vegetation cover of the region. This gives a comprehensive list of environmental and human-related data points prior to each wildfire. By applying machine learning algorithms to this dataset, we were able to find common patterns that existed across regions in California prior to the occurrence of a wildfire. Using our ML model, we are now able to predict the occurrence of a wildfire for a future date based on the environmental and human factors for that date.
To visualise our results, we have built an interactive web-application hosted on the cloud which allows the user to monitor regions in California that are at high risk of a wildfire over the next 10 days.
The web-application can be accessed at: wildfire hotspot predictor
By developing a predictive solution, we are able to help government agencies mitigate the risk of wildfires before they become a serious hazard. The proactive and predictive approach of our solution (as compared to a reactive one) can significantly help reduce the economic and ecological tolls wildfires cause.
After studying multiple solutions to the same problem from previous editions of NASA SpaceApps, we realized that all of them dealt with solving the problem after a wildfire was burning. We also realized that:
a) 85% of wildfires are man-made. This means that predictable human behaviour and activities cause most fires, for example camp fires, lit cigarettes, and power transmission lines.
b) Apart from the triggers (campfires, lightning, etc.), the environmental conditions of the region play a huge role in determining whether a lit cigarette escalates into a raging inferno. Again, this gives us predictable environmental conditions (temperature, rainfall, wind, etc.) that can lead to wildfires.
With these ideas forming our foundation, we collected and studied data pertaining to past wildfires and the environmental conditions during the events. We used the following data points to correlate the occurrence of wildfires with the environmental factors:
a) weather-related data: temperature, humidity, precipitation, UV radiation, wind speed, wind direction.
b) human factors: proximity to camp sites, population density.
c) other factors: proximity to power stations, number of power transmission lines in the region.
With vast amounts of data on these factors, we were able to train a machine learning model to find patterns in occurrences of wildfires with respect to the environmental factors. We used a random forest classifier as our machine learning model. With more computational resources and time, we could have experimented with deep neural networks.
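As a rough illustration of this step, the sketch below fits a scikit-learn random forest classifier on synthetic data. The column names, the synthetic labelling rule, and the hyperparameters are assumptions made for the example; they are not the project's actual schema or settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names mirroring the factors listed above.
FEATURES = [
    "temperature", "humidity", "precipitation", "uv_radiation",
    "wind_speed", "wind_direction", "campsite_distance",
    "population_density", "power_line_count",
]

# Synthetic stand-in data: every feature is drawn uniformly from [0, 1).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((500, len(FEATURES))), columns=FEATURES)

# Illustrative label only: hot, dry rows are marked as fire events.
data["fire"] = ((data["temperature"] > 0.6) & (data["humidity"] < 0.5)).astype(int)

# Hold out a quarter of the rows to check the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    data[FEATURES], data["fire"],
    test_size=0.25, random_state=0, stratify=data["fire"],
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# Relative contribution of each feature to the forest's splits.
importances = dict(zip(FEATURES, model.feature_importances_))
```

On real data, `importances` gives a quick check of which weather or human factors the forest relies on most.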
The results of our prediction are illustrated using a web-application that allows the user to monitor regions for the possibility of wildfires based on the weather conditions and human factors pertaining to that location. For now, the application only focuses on predicting the occurrence of
Technology stack of our project:
Programming languages used: Python, JavaScript, HTML and CSS
Tools and platforms used: Docker, Angular, Heroku, ArcGIS
Datasets obtained from NASA, CSA, USFS, and USGS were used to build a comprehensive dataset and train the model.
We used multiple datasets provided by NASA, CSA, the US Forest Service, and the US Geological Survey.
We used a wildfire dataset provided by USFS. Using this, we were able to collect all occurrences of wildfires from 1992 to 2015, their dates, the area burned in acres, and the number of fires per year in a given region. To simplify our task, we considered data only for the state of California. For all the dates on which there were wildfires, the weather before and during the wildfires was gathered from CIMIS data. The weather data points we aggregated for each day were temperature, precipitation, wind speed, wind direction, and UV radiation. We also included vegetation cover in the dataset, as this has a strong correlation with the size of forest fires.
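The aggregation described above can be sketched as a pandas join of fire records with daily weather readings on region and date. The tables, column names, and values here are made up for illustration. Keeping non-fire days as negative examples is also an assumption, not something the write-up confirms.

```python
import pandas as pd

# Hypothetical fire records: one row per wildfire.
fires = pd.DataFrame({
    "region": ["A", "B"],
    "date": pd.to_datetime(["2015-07-01", "2015-07-03"]),
    "acres_burned": [120.0, 15.5],
})

# Hypothetical daily weather readings per region.
weather = pd.DataFrame({
    "region": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2015-07-01", "2015-07-02", "2015-07-02", "2015-07-03"]),
    "temperature": [38.0, 35.5, 30.0, 36.2],
    "precipitation": [0.0, 0.0, 1.2, 0.0],
    "wind_speed": [6.1, 4.0, 2.5, 7.3],
})

# Left join from the weather table so that days without a fire are kept.
dataset = weather.merge(fires, on=["region", "date"], how="left")

# A row is a fire day when the join found a matching fire record.
dataset["fire"] = dataset["acres_burned"].notna().astype(int)
```

Region-level attributes such as vegetation cover could be attached the same way, with a second merge on `region` alone.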
Since human activities cause 85% of wildfires through irresponsible use of camp fires or disposal of cigarettes, we knew it was crucial to include human-related factors in the dataset. Using publicly available datasets, we added values for population density, number of campsites, and number of power lines for every region present in the dataset.
Once we had aggregated all the datasets into one, it was time to define the problem we would solve using the data. We decided to predict the likelihood of a wildfire occurring in a given region/location based on the weather metrics, vegetation cover, and human factors such as proximity to camp sites and power lines.
We trained a random forest classifier to predict whether a given region was at high risk of having a fire or not. pandas and scikit-learn were used to train the models. To visualise the data, we built a web-application to predict the probability of wildfires across regions in California. This web-app has been deployed on the cloud and can be accessed here: wildfire hotspot predictor.
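One way to turn a trained classifier into per-region risk scores is scikit-learn's `predict_proba`, sketched below on synthetic data. The feature names, the stand-in training rule, the forecast table, and the 0.5 threshold are all assumptions for the example; the deployed application may compute risk differently.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
FEATURES = ["temperature", "humidity", "wind_speed"]  # hypothetical subset

# Stand-in training data with an illustrative labelling rule.
train = pd.DataFrame(rng.random((300, len(FEATURES))), columns=FEATURES)
train["fire"] = (train["temperature"] - train["humidity"] > 0.2).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=1)
model.fit(train[FEATURES], train["fire"])

# Hypothetical 10-day forecast for two regions.
forecast = pd.DataFrame(rng.random((20, len(FEATURES))), columns=FEATURES)
forecast["region"] = np.repeat(["region_1", "region_2"], 10)
forecast["day"] = np.tile(np.arange(1, 11), 2)

# predict_proba returns one column per class, ordered as in model.classes_.
fire_column = list(model.classes_).index(1)
forecast["fire_probability"] = model.predict_proba(forecast[FEATURES])[:, fire_column]

# Assumed cut-off for flagging a region-day as high risk.
RISK_THRESHOLD = 0.5
high_risk = forecast[forecast["fire_probability"] >= RISK_THRESHOLD]
```

A map front end could then colour each region by its highest `fire_probability` over the 10 days.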
1) CSA dataset to obtain CO (carbon monoxide) values : ftp://data.asc-csa.gc.ca/users/OpenData_DonneesOuvertes/pub/SCISAT/
2) CSA MOPITT dataset: ftp://data.asc-csa.gc.ca/users/OpenData_DonneesOuvertes/pub/MOPITT/
3) Wildfires from 1992 to 2015 dataset: link
4) Weather dataset: ftp://ftpcimis.water.ca.gov/pub2/
5) Vegetation cover in California: link_1, link_2
6) Campsites in the US: link
7) Power lines in the US: link
8) Power substation in the US: link
9) Tree/vegetation density: ftp://data.asc-csa.gc.ca/users/OpenData_DonneesOuvertes/pub/SCISAT/Data%202004-2020/ACEFTS_L2_v4p1_CO.csv