Supervised Learning

Methods: Building on the papers by Zhang et al. and Toni et al. on machine learning for wildfire detection, we implemented our CNN in TensorFlow. We scraped 202 images captured by the NASA MODIS sensors on the Aqua and Terra satellites, labelled each as containing a wildfire or not for classification purposes, and mounted them in our Colab notebook for use in our CNN. Our convolutional base is a stack of Conv2D and MaxPooling2D layers, with several dense layers added on top. This configuration reached a peak accuracy of 76.216%. Because our training dataset is built from only 202 satellite images, we find this accuracy somewhat encouraging, even though the dataset is not as comprehensive as the one we had hoped to train on. To train a CNN that works specifically on satellite images, we used six data augmentation techniques: horizontal flipping, vertical flipping, rotating, cropping, adding Gaussian noise, and adjusting brightness. This augmented our base image set by a factor of 6.
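
The architecture described above can be sketched roughly as follows. This is a minimal illustration, not our exact model: the number of layers, the filter counts, the 128x128 input size, the optimizer, and the function name build_cnn are all assumptions chosen for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_cnn(input_shape=(128, 128, 3)):
    """Binary wildfire classifier: Conv2D/MaxPooling2D base with dense layers on top.

    All sizes here are illustrative assumptions, not the tuned values.
    """
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Convolutional base: alternating Conv2D and MaxPooling2D layers.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        # Dense layers on top of the flattened feature maps.
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        # Single sigmoid unit: estimated probability that the image shows a wildfire.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_cnn().summary() prints the layer stack, which is a quick way to check how the feature maps shrink before training on a small dataset like ours.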

Results: With our hyperparameter tuning, layer configuration, and data augmentation, we achieved an accuracy of 76.216%. Because our base dataset included only about 100 images in each of the two categories (wildfire and non-wildfire), we feel this result is quite impressive for such a small training set. In addition, we gained a 6% increase in accuracy through data augmentation alone. We feel this is significant and speaks to the strength of our model’s architecture and the techniques used, even though we did not reach our original goal of 80% accuracy.

Discussion: Our largest hurdle in the supervised portion of this project was data collection. We quickly realized we would need many more images than we had collected to train our model successfully. We explored a variety of data collection techniques, including searching open-source image databases built for computer vision projects, looking at CIFAR-10-style data libraries, and scraping the images we wanted ourselves. However, we still struggled to collect enough high-quality data to reach high accuracy.

We specifically wanted to train a CNN for satellite images, which made data collection more difficult, as data sources are limited and often provide these images in difficult formats. The layout of the NASA website also made image scraping harder than a typical web-scraping task (for example, downloading a thousand images of lilies from Google Images).

Many neural networks are already trained to detect fires in images with high accuracy, so we hope that our satellite-image-specific model will add something a little unique to these efforts. We will also publish our training and testing image sets in our public Git repository for future work on this topic, as collecting our dataset was a significant portion of this battle.

Data augmentation did, however, significantly expand our dataset and increase the model’s accuracy. Because our dataset was so small to begin with, we were never able to surpass 80% accuracy, as we had hoped to do on a balanced two-category classification problem. Even so, our methods significantly increased the accuracy of the model, and with more time we are confident we could have achieved higher accuracy by adding novel data and applying our data augmentation techniques to it.
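
As a rough illustration of the six augmentation techniques listed in Methods, the sketch below produces one variant per technique for a single image using NumPy. It is an assumption-laden example rather than our pipeline: the function name augment, the 80% central crop, the noise standard deviation of 0.05, the 1.2x brightness factor, and the fixed 90-degree rotation are all values picked for the example, and reading "a factor of 6" as one variant per technique is likewise an assumption.

```python
import numpy as np


def augment(image, rng):
    """Return six augmented copies of an H x W x C float image with values in [0, 1].

    Parameter values are illustrative assumptions, not tuned settings.
    """
    h, w = image.shape[:2]

    # Crop the central 80% and stretch it back to H x W with
    # nearest-neighbour sampling so every variant keeps the input size.
    top, left = h // 10, w // 10
    crop = image[top:h - top, left:w - left]
    rows = np.arange(h) * crop.shape[0] // h
    cols = np.arange(w) * crop.shape[1] // w
    cropped = crop[rows][:, cols]

    # Additive Gaussian noise, clipped back into the valid pixel range.
    noisy = np.clip(image + rng.normal(0.0, 0.05, image.shape), 0.0, 1.0)

    # Simple brightness adjustment by scaling pixel intensities.
    brighter = np.clip(image * 1.2, 0.0, 1.0)

    return [
        np.fliplr(image),  # horizontal flip
        np.flipud(image),  # vertical flip
        np.rot90(image),   # 90-degree rotation (keeps shape only for square images)
        cropped,
        noisy,
        brighter,
    ]
```

Each variant keeps the label of its source image, so a function like this can be applied to every labelled image before training.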

No Templates

We were told only to find a problem and apply what we had learned in class to it

Did a lot of outside research

Spent a lot of time brainstorming with the team

Tried a lot of things (learned to deep dive) and learned how to start over

Really busy team

A lot of my team members were really busy that semester

Tried to delegate some easier tasks

Very time consuming

The CNN code especially would take hours to run

Made sure to multi-task

Learned to check over my work without running the code

Data accuracy

For both the supervised and unsupervised parts, we had to think of ways to get the accuracy up

Think outside the box / big picture

Use our resources (like TAs)

Skills I think I demonstrated

Analytical: There was not a lot of guidance given. We were basically given free rein over everything, including the problem statement, our variables, what we were measuring, and how we would grade the accuracy.

Ability to accept the situation as it is: After we missed a couple of team deadlines, I took charge but also delegated without making anybody feel bad about not contributing.

Eagerness to learn: I spent a lot of time looking into things and making sure I understood what was going on. Understanding every line of code was crucial.

Time management: The CNN took about 4 hours to run.

Thinking outside the box: For example, using data augmentation to increase the number of photos we had.
