We used the available dataset and extracted the feature variables. This project matters not only for identifying what triggers landslides but also for giving researchers insight into the causes of landslides in the parts of the world we analyzed. Atmospheric scientists and weather experts can examine our analysis and could issue alerts or safety information before a disaster that leads to a landslide occurs. We analyzed the top 10 countries with the most landslide occurrences, along with their fatality and injury counts.
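A ranking like the top 10 described above could be sketched with pandas as follows. This is only an illustration: the column names country_name, fatality_count, and injury_count are assumptions about how the catalog labels these fields, and the rows are made-up placeholders, not values from the dataset.

```python
import pandas as pd

def top_countries(df, n=10):
    """Rank countries by number of recorded landslides, with casualty totals."""
    grouped = df.groupby("country_name")
    summary = pd.DataFrame({
        "landslide_count": grouped.size(),                  # one row per event
        "fatality_count": grouped["fatality_count"].sum(),  # NaN is skipped
        "injury_count": grouped["injury_count"].sum(),
    })
    return summary.sort_values("landslide_count", ascending=False).head(n)

# Placeholder rows for illustration only (not real catalog values).
records = pd.DataFrame({
    "country_name": ["Country A", "Country A", "Country A",
                     "Country B", "Country B", "Country C"],
    "fatality_count": [1, 0, None, 2, 3, 0],
    "injury_count": [0, 1, 0, None, 4, 0],
})

top = top_countries(records, n=2)
print(top)
```

Missing counts are skipped when summing here, which is one possible choice; treating them differently would change the totals.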
Given a country and a date as input features, the model predicts the likely cause of a landslide there, such as downpour, rain, snowmelt, or earthquake.
Through this model, researchers can examine the major causes of landslides in a particular country and investigate them further. We hope the project offers useful insight to both the general public and researchers.
We used Python and its data science libraries, including scikit-learn, NumPy, pandas, pydot, and GeoPandas. Preprocessing covered data cleaning, data integration, converting categorical values to numerical form, handling missing values, and removing redundant columns. We selected 21 unique feature variables, including longitude, latitude, country name, source link, fatality count, and injury count. We passed these features to two machine learning models, a decision tree and a random forest, to predict landslide_trigger, the cause of the landslide.
The main difficulty we faced was that we had to do this project virtually, which was somewhat harder than working with all team members in the same location. Even so, completing the project remotely was a great and memorable experience for the whole team. It also gave us a platform to work in our field of interest, data science and machine learning, and to gain experience working in a different way.
As a team, we want to thank all the individuals, sources, and media that helped us learn about this challenge and complete it.
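The preprocessing steps listed above (removing redundant columns, handling missing values, and turning categorical values into numbers) could look roughly like the sketch below. The specific choices are assumptions, not the project's confirmed method: median filling for numeric columns, an "unknown" category for text columns, and integer codes from pandas.factorize. The event_id column and the sample rows are hypothetical.

```python
import pandas as pd

def preprocess(df, target="landslide_trigger", drop_cols=()):
    """Clean a raw catalog frame and return (features, labels)."""
    df = df.drop(columns=list(drop_cols), errors="ignore")  # redundant columns
    df = df.drop_duplicates()
    df = df.dropna(subset=[target])  # rows without a label cannot be trained on

    y = df[target]
    X = df.drop(columns=[target]).copy()

    for col in X.columns:
        if pd.api.types.is_numeric_dtype(X[col]):
            # Assumption: fill numeric gaps with the column median.
            X[col] = X[col].fillna(X[col].median())
        else:
            # Assumption: treat missing text as its own category,
            # then map each distinct value to an integer code.
            filled = X[col].fillna("unknown")
            X[col] = pd.factorize(filled)[0]
    return X, y

# Hypothetical rows for illustration only.
raw = pd.DataFrame({
    "event_id": [1, 2, 3, 4],
    "country_name": ["Country A", None, "Country B", "Country A"],
    "latitude": [10.0, None, 30.0, 20.0],
    "landslide_trigger": ["rain", "downpour", None, "rain"],
})

X, y = preprocess(raw, drop_cols=["event_id"])
print(X)
print(y.tolist())
```

Integer codes are simple and work with tree-based models, but they impose an arbitrary order on categories; one-hot encoding is a common alternative.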
We used the dataset titled "Global_Landslide_Catalog-Export.xlsx", available at data.nasa.gov. We used the required columns of the dataset to train our models: we preprocessed the data, selected the feature and target variables, and fitted the models on the preprocessed data. We then generated predictions and calculated the accuracy of both models, the random forest and the decision tree. We also explored the data and analyzed correlations between columns. From this we identified the top countries and places where landslides are most likely. Finally, we developed a world map of landslide probabilities to help researchers and enthusiasts dig deeper into the analysis.
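The train, predict, and score workflow described above could be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions: the 75/25 split, the random seed, and the number of trees are not taken from the project, and the data is synthetic rather than drawn from the catalog.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def compare_models(X, y, seed=0):
    """Fit a decision tree and a random forest; return held-out accuracy."""
    # Assumption: hold out 25% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100,
                                                random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores

# Synthetic stand-in for the preprocessed features and trigger labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = np.where(X_demo[:, 0] > 0, "rain", "earthquake")

scores = compare_models(X_demo, y_demo)
print(scores)
```

Scoring on rows the models never saw during fitting gives a fairer accuracy estimate than scoring on the training data itself.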