At a high level, our project uses historical fire and weather data to predict high-risk fire zones across Australia (though the approach could be scaled to the entire globe). The 2019-2020 bushfire season alone cost the Australian economy $2 billion in lost tourism, agricultural and retail income. In addition, the smoke haze clouding cities cost the economy an extra $500 million in lost productivity, spending and ill-health. Clearly, the cost of wildfires to the Australian economy is extreme. By identifying danger zones prior to the annual fire season, we will be able to notify fire and rescue services of areas that will require particular attention regarding back-burning and other fire mitigation strategies.

At the front-end, we developed a web application which illustrates fire ratings (i.e. likelihood of a fire occurring) in various Australian cities on an interactive map. The user interacts with a date-time input to view how fire ratings across the country change as a function of time. The web application supports not only historic fire ratings, whereby users can become better informed about the history of wildfires across Australia, but more importantly, projected future fire ratings, estimated using machine learning models. Meteorologists (and other climate scientists) with future weather estimates for certain dates and locations are able to input this data into the application, which will then predict the likelihood of future wildfire prevalence.
At the back-end, we used machine learning to train various classification and regression models with historical Australian weather and fire data from 2015, and used these to predict the likelihood of future wildfires. Testing revealed that our implementation distinguishes between a wildfire occurring and not occurring with 99.7% accuracy, and estimates the numerical probability of it occurring with approximately 60% accuracy.
Bushfires in Australia are devastating, not only for the economy but also the community, wildlife, and vegetation, and last summer proved to be one of the most devastating bushfire seasons in Australian history. Given that two members of our group are Australian, we were inspired to embark on this project during our first ever hackathon due to its personal significance - indeed, we feel very strongly about implementing a viable solution to what will likely continue to be a very grave concern in years to come, and we believe that our project reflects this passion.
Our project revolved around two core ideas: data visualisation and predicting future wildfires. This closely aligned with our team structure - a software engineer, a mathematician, and an engineer-economist.
In terms of data visualisation, our goal was to display historic fire data on a map of Australia to inform the Australian public about how fire hot spots change over time. This was primarily developed using a combination of Australian weather data and NASA FIRMS data. The data was cleaned and processed into a .json file, which the JavaScript front-end loads and renders as a map using the d3 library.
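As a rough illustration of the cleaning step, the sketch below groups fire detections by date and serialises them to JSON for a date-driven map. It is not our actual pipeline: the sample rows are made up, the column names (`latitude`, `longitude`, `brightness`, `acq_date`) are assumed to follow a FIRMS-style CSV export, and the output layout and the helper `fires_by_date` are hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Made-up sample rows; column names are assumed to match a FIRMS-style CSV export.
SAMPLE_CSV = """latitude,longitude,brightness,acq_date
-33.87,151.21,320.5,2019-12-01
-33.90,151.10,335.0,2019-12-01
-37.81,144.96,310.2,2019-12-02
"""


def fires_by_date(csv_text):
    """Group fire detections by acquisition date (hypothetical output layout)."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["acq_date"]].append({
            "lat": float(row["latitude"]),
            "lon": float(row["longitude"]),
            "brightness": float(row["brightness"]),
        })
    return dict(grouped)


# One key per date, so a date picker can look up that day's detections directly.
fire_json = json.dumps(fires_by_date(SAMPLE_CSV), indent=2, sort_keys=True)
```

Keying the file by date is one possible design choice: it keeps the date-time input on the front-end cheap, at the cost of a larger file than a flat list.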
We decided to use Python for machine learning as it has great online support and accessible libraries. Initially, we chose a simple regression model, but that gave very poor results (sometimes even negative scores). Upon further investigation, we realised that the data was very skewed. Overall, fires occur quite rarely, so in most instances, the likelihood of a fire occurring was 0 (in fact, only 7% of rows had non-zero values). This confused our regressor because it was attempting to find a middle ground between two extremes (0 and >0).
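A small worked example may help explain how a score can be negative. This assumes the score in question was the coefficient of determination, R² = 1 - SS_res / SS_tot, which is what scikit-learn regressors return from `score`; the data and the helper `r_squared` below are made up for illustration. A model that settles on a "middle ground" for every row does worse than simply predicting the mean, so its R² drops below zero.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up target mirroring the skew described above: 93% zeros, 7% non-zero.
y_true = [0.0] * 93 + [0.8] * 7

# A regressor stuck between the two extremes predicts 0.4 for every row.
middle_ground = [0.4] * len(y_true)
r2_middle = r_squared(y_true, middle_ground)

# Baseline: always predicting the mean of the target gives R^2 of (about) zero.
mean_value = sum(y_true) / len(y_true)
r2_mean = r_squared(y_true, [mean_value] * len(y_true))
```

Here `r2_middle` is negative while `r2_mean` is approximately zero, which is the pattern that a heavily zero-inflated target tends to produce.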
After realising this, we revisited the model selection and chose a two-part model: a binary classifier followed by a regressor. Given our earlier problem of mostly zero-valued data, we decided it would be beneficial to first determine whether a fire is going to occur at all, and then perform a regression to establish the probability of the fire actually happening.
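The two-part idea can be sketched as follows. This is a minimal illustration on synthetic data, not our trained models: the three features, the rule generating fires, the use of `ExtraTreesRegressor` for the second stage, and the helper `predict_two_part` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, ExtraTreesRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: three hypothetical weather-like features in [0, 1].
n = 2000
X = rng.uniform(0, 1, size=(n, 3))
has_fire = (X[:, 0] > 0.9) & (X[:, 1] < 0.7)      # roughly 7% of rows
y = np.where(has_fire, 0.2 + 0.8 * X[:, 2], 0.0)  # zero unless a fire occurs

# Stage 1: binary classifier answers "does a fire occur at all?"
clf = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y > 0)

# Stage 2: regressor trained only on rows where a fire did occur.
fire_rows = y > 0
reg = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X[fire_rows], y[fire_rows])


def predict_two_part(X_new):
    """Return 0 where stage 1 predicts no fire, else the stage 2 estimate."""
    is_fire = clf.predict(X_new).astype(bool)
    out = np.zeros(len(X_new))
    if is_fire.any():
        out[is_fire] = reg.predict(X_new[is_fire])
    return out


preds = predict_two_part(X)
```

Because the regressor never sees the zero rows, it no longer has to compromise between the two extremes; the classifier handles that split on its own.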
For the binary classifier, we went with ExtraTreesClassifier from sklearn, and it gave amazing results - 99.7% accuracy across all data! Initially, we thought it was over-fitting, but upon closer inspection, realised that was not the case - we only trained this model on ~30% of the data, and scored it on the entirety of the data. We were also aware of the skew towards zero values, and only sampled 25% of the zero-valued rows for our training.
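The sampling step can be sketched in plain Python. The rows below are made up, and the helper `downsample_zeros` is hypothetical; the sketch also assumes that every non-zero row is kept, which is not stated above. Under that assumption, keeping 25% of the zero rows when 7% of rows are non-zero gives 0.07 + 0.25 × 0.93 ≈ 30% of the data, which would be consistent with the training share mentioned above.

```python
import random


def downsample_zeros(rows, label_of, keep_fraction=0.25, seed=0):
    """Keep every non-zero row and a random fraction of the zero rows."""
    rng = random.Random(seed)
    zeros = [r for r in rows if label_of(r) == 0]
    nonzeros = [r for r in rows if label_of(r) != 0]
    kept_zeros = rng.sample(zeros, int(len(zeros) * keep_fraction))
    sample = nonzeros + kept_zeros
    rng.shuffle(sample)  # avoid feeding the model all fire rows first
    return sample


# Made-up dataset of (row_id, label) pairs with 7% non-zero labels.
rows = [(i, 1 if i % 100 < 7 else 0) for i in range(10_000)]

train = downsample_zeros(rows, label_of=lambda r: r[1])
train_share = len(train) / len(rows)  # fraction of the full data used for training
```

The resulting `train` list would then be split into features and labels and passed to the classifier's `fit` method.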
As for the regressor, we could not achieve similar results. This is partly due to lack of data - we did not have enough weather data to match the fire data, and hence we had to train the regressor purely on the fire data. This was not a good dataset to use by itself as it did not have enough features to train the model on - it only provided us with the location and the satellite brightness sensor readings, giving us approximately 60% accuracy.
One of the biggest challenges that our team encountered was transforming multiple sources of data into a standard form that could be interpreted by the machine learning model for training. Initially, this frustrated us - this was the first hackathon for the majority of our group, and so we became quite concerned about just how much time this stumbling block was consuming.
However, after some extensive online research, we realised that this struggle is actually a very normal one in the process of data analytics. This discovery opened our eyes up to how even the most interesting of projects involve a balance of engaging activities and mundane tasks, and we now appreciate how some tedious tasks can make a final product feel so much more fulfilling!
In addition, we planned to develop a ‘backburning recommendation’ tool, where the same input weather data would be used to determine an optimal backburning strategy that firefighters can implement to mitigate the loss of vegetation, wildlife, housing, etc. Due to lack of time, we were unable to build a model for this purpose, so we recommend that future implementations of our solution attempt to incorporate this feature.
Finally, an additional feature that was out-of-scope for the current project, but would certainly be a useful addition to upscaled versions of our implementation, is the ability to not only predict fire likelihood from future weather data, but actually predict future weather data itself. This is a desirable tool so that the general public, and not just those with easy access to long-term weather forecasting, is able to be informed of and collaboratively fight against wildfires.
The primary space agency data we used came from NASA FIRMS. This was used to identify the location and number of fires within Australia over a three-year period.
In addition, we used Australian climate data from Kaggle to help us predict fire-prone conditions. We wished to use data from the Bureau of Meteorology, Australia, but their 5-10 day waiting period meant that we were unable to use it.
https://www.youtube.com/watch?v=CdDC915Pd3Y&feature=youtu.be