Problem Statement: How do we most accurately model and predict the spread of wildfires in a given region using machine learning techniques?
In our project proposal, we set out to use NASA data to help predict and analyze forest fires. Over the course of the project we implemented three algorithms. For the unsupervised portion, we applied expectation–maximization (EM) and k-means to the images to preprocess them and detect features. For the supervised portion, we implemented a CNN with data augmentation.
Use supervised and unsupervised learning to explore how ML can be used to identify forest fires.
climate change. :)
Inspired by a 2019 paper on Segmentation of Fire and Smoke from Infra-Red Videos and a 2013 paper on flame segmentation based on flame pixel identification, we wanted to use clustering to find patterns in images that aid in the detection of forest fires. We also performed segmentation based on smoke pixel identification. On their own, both methods are prone to false positives, so we ran each segmentation separately, with the goal of later combining the results of both approaches to make fuller use of satellite imagery in identifying the presence of a wildfire.
Our input data was a series of labelled images from the NASA MODIS sensor on the Aqua and Terra satellites. We chose these satellite images because they have a high resolution and can be downloaded as RGB JPEGs, which made them ideal for the kind of image analysis we wanted to perform. The images ranged from those featuring fires and/or smoke to regular satellite images that did not include any kind of fire-related natural disaster.
After applying EM:
After k-means segmentation to find smoke:
Process: To feed the images through the algorithms, we first needed to reduce them to numbers. We standardized the images so that they were all the same size, then selected a subset of our dataset on which to run the unsupervised learning algorithms. We did not use the whole dataset, in order to save computation. We then used EM and k-means to segment the colors in our images, first trying to cluster by the warm tones featured in fires, and then by the color of smoke that differentiates it from fog, clouds, sandstorms, or terrain, which might have a gray hue as well.
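To make the EM color segmentation step concrete, here is a minimal sketch of how it could look. It is not the code used in the project: the function name `em_segment`, the diagonal-covariance Gaussian mixture, the brightness-quantile initialization, and the NumPy-only implementation are all assumptions made for illustration.

```python
import numpy as np

def em_segment(image, k=2, n_iter=30):
    """Segment an RGB image by fitting a k-component Gaussian mixture
    (diagonal covariance, an assumption of this sketch) to its pixel colors."""
    h, w, c = image.shape
    x = image.reshape(-1, c).astype(np.float64) / 255.0   # (n_pixels, c) in [0, 1]
    n = x.shape[0]

    # Assumed initialization: pick pixels at evenly spaced brightness quantiles.
    order = np.argsort(x.sum(axis=1))
    picks = order[((np.arange(k) + 0.5) / k * n).astype(int)]
    means = x[picks].copy()
    variances = np.tile(x.var(axis=0) + 1e-3, (k, 1))
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel (log space).
        diff = x[:, None, :] - means[None, :, :]            # (n, k, c)
        log_prob = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2
        )
        log_prob += np.log(weights)
        log_prob -= log_prob.max(axis=1, keepdims=True)     # numerical stability
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means and variances from responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkc->kc", resp, diff ** 2) / nk[:, None] + 1e-6

    labels = resp.argmax(axis=1).reshape(h, w)              # per-pixel segment id
    return labels, means * 255.0                            # means back in 0..255
```

Calling `em_segment(img, k=2)` on an `(H, W, 3)` array returns a label map with the same height and width plus the mean color of each segment, which can then be inspected for warm (fire-like) or gray (smoke-like) tones.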
In our first attempt to apply k-means to our images, we used the raw pixel colors as our features. When we clustered the images into two clusters (to simulate the binary fire vs. not-fire classification), we got two unequal clusters that performed worse than randomly assigning half the data to one cluster and the other half to the other.
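As an illustration of this first attempt, the sketch below flattens each image into a vector of raw pixel values and runs plain Lloyd's k-means with two clusters. The helper names `images_to_features` and `kmeans`, the random initialization, and the NumPy-only implementation are assumptions of this sketch, not the project's actual code.

```python
import numpy as np

def images_to_features(images):
    """Flatten equally sized RGB images into one row of raw pixel values each."""
    return np.stack(
        [np.asarray(img, dtype=np.float64).ravel() / 255.0 for img in images]
    )

def kmeans(features, k=2, n_iter=50, seed=0):
    """Plain Lloyd's k-means on an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct samples chosen at random.
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each sample to its nearest center (squared Euclidean distance).
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):          # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

With features this high-dimensional, the distance between two images is dominated by overall brightness and terrain color rather than by small fire or smoke regions, which is one plausible reason raw pixels cluster poorly.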
Then, after pre-processing the images by segmenting them with expectation–maximization, we got clustering results that were better than random assignment. In Cluster 0, only 5 of the 15 images were clustered incorrectly, and in Cluster 1, 9 of the 23 images were clustered incorrectly. By comparison, in the previous clustering, 10 of 21 images were clustered incorrectly. On average, applying the EM segmentation appears to have increased the accuracy by around 33%.
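The counts quoted above can be turned into misclustering rates with a few lines of arithmetic. This is only a sketch: the variable names are hypothetical, and pooling the two EM clusters into one overall rate is an assumption about how the counts should be aggregated.

```python
def error_rate(n_wrong, n_total):
    """Fraction of images assigned to the wrong cluster."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_wrong / n_total

# (misclustered, total) pairs copied from the text above.
with_em = {"cluster_0": (5, 15), "cluster_1": (9, 23)}
without_em = (10, 21)

per_cluster = {name: error_rate(w, t) for name, (w, t) in with_em.items()}
# Assumed aggregation: pool both EM clusters into one overall rate.
pooled_with_em = error_rate(
    sum(w for w, _ in with_em.values()),
    sum(t for _, t in with_em.values()),
)
baseline = error_rate(*without_em)

for name, rate in per_cluster.items():
    print(f"{name}: {rate:.1%} misclustered")
print(f"pooled with EM: {pooled_with_em:.1%} misclustered")
print(f"without EM (as quoted): {baseline:.1%} misclustered")
```

Reporting per-cluster and pooled rates side by side makes it easier to see which comparison a summary figure refers to.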
**After applying two different k-means techniques**