Data Discovery for Earth Science

Websites like the NASA Earth Observatory showcase the many uses of satellite data to highlight interesting natural events. International partner instruments on NASA satellites such as Japan’s ASTER instrument and Canada’s MOPITT instrument, both onboard the Terra satellite, are also included as part of the Observatory. This challenge will ask you to devise a tool or technique to guide users to relevant datasets to study specific events.

Dots

Summary

Our project goes through the dataset and identifies the most frequently repeated words. It then forms connections between these words, the expressions commonly used with them, and their respective topics. When a researcher uses our service, these connections are used to display more relevant data than a search on the typed keywords alone would return. To build the connections, the application first identifies the topic of a text by looking for the main nouns that most adjectives in the text describe, then narrows the candidates down by frequency. Multiple articles are used per topic to make this step easier. Finally, it relates the adjectives used to the topic in the database.
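
The topic-identification step described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's actual back-end code. It assumes a small hand-written adjective list in place of a real part-of-speech tagger, and it treats the word that follows an adjective as the noun being described. The names `ADJECTIVES`, `described_nouns`, and `build_topic_index` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny hand-made adjective list stands in for a real
# part-of-speech tagger, which the project description does not name.
ADJECTIVES = {"severe", "tropical", "massive", "rapid", "dense"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def described_nouns(tokens):
    """Yield (adjective, noun) pairs, treating the word after an adjective as its noun."""
    for current, following in zip(tokens, tokens[1:]):
        if current in ADJECTIVES and following not in ADJECTIVES:
            yield current, following


def build_topic_index(articles):
    """Map each article's topic (its most-described noun) to the adjectives used with it."""
    index = defaultdict(Counter)
    for text in articles:
        pairs = list(described_nouns(tokenize(text)))
        if not pairs:
            continue  # no adjective-noun pairs, so no topic can be inferred
        # Filter the candidate nouns down by frequency: the most-described one wins.
        noun_counts = Counter(noun for _, noun in pairs)
        topic, _ = noun_counts.most_common(1)[0]
        # Relate the adjectives that describe the topic to that topic.
        for adjective, noun in pairs:
            if noun == topic:
                index[topic][adjective] += 1
    return index


# Illustrative input only; these sentences are made up for the example.
articles = [
    "A severe storm hit the coast. The tropical storm brought dense fog.",
    "The massive storm weakened after a rapid storm surge.",
]
index = build_topic_index(articles)
print(dict(index["storm"]))
```

Because both sample articles resolve to the same topic, their adjectives accumulate under one entry, which is one way that using several articles per topic could make the step more robust.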

How We Addressed This Challenge

Our project takes a fundamentally different approach from NASA's to formatting, analyzing, and searching data. NASA-partnered websites typically search articles only by the keywords the researcher types, while our approach identifies the topic the researcher is looking for from the words used. It therefore displays relevant data that might otherwise take the researcher much longer to find. This not only makes research quicker, but also makes such topics more accessible to the average curious person.
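
To make the contrast with keyword-only search concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's implementation. The topic associations in `TOPIC_WORDS`, the article titles in `ARTICLES`, and all function names are hypothetical and hand-written for the example.

```python
# Assumption: hypothetical topic associations of the kind the service would
# derive from its dataset; these particular words are invented for the example.
TOPIC_WORDS = {
    "wildfire": {"fire", "smoke", "burn", "blaze", "aerosol"},
    "hurricane": {"storm", "cyclone", "wind", "surge", "rainfall"},
}

# Assumption: made-up article titles, each already filed under a topic.
ARTICLES = [
    {"title": "Smoke plumes over California", "topic": "wildfire"},
    {"title": "Carbon monoxide from burning forests", "topic": "wildfire"},
    {"title": "Storm surge along the Gulf Coast", "topic": "hurricane"},
]


def infer_topic(query):
    """Return the topic whose associated words overlap most with the query, or None."""
    words = set(query.lower().split())
    best_topic, best_score = None, 0
    for topic, related in TOPIC_WORDS.items():
        score = len(words & (related | {topic}))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


def keyword_search(query):
    """Baseline: titles that literally contain at least one query word."""
    words = set(query.lower().split())
    return [a["title"] for a in ARTICLES if words & set(a["title"].lower().split())]


def topic_search(query):
    """Titles of every article filed under the topic inferred from the query."""
    topic = infer_topic(query)
    return [a["title"] for a in ARTICLES if a["topic"] == topic]


print(keyword_search("blaze smoke"))
print(topic_search("blaze smoke"))
```

In this toy data, the keyword baseline only finds the title that contains "smoke", while the topic-based search also returns the carbon monoxide article, which shares no word with the query.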

How We Developed This Project

As a team, we have always had a passion for online research, and we have long wished for an opportunity to make it easier for researchers and more accessible to curious minds in general. This challenge presented itself as the perfect medium for that aim. We approached the problem by forming a vision of what the project would ideally look like and by identifying, from our own experience, the pros and cons of existing search methods. We then worked out how to implement these ideas, adjusting our plans slightly during execution as we saw fit. We implemented the back end of our service in C, a low-level programming language, because we expected it to execute faster. We also used Flutter to develop an interactive user interface. Our main struggle was figuring out exactly how we wanted to process the dataset. We overcame it by each setting aside some time from our assigned tasks to look into different text and data analysis algorithms until we found one that best fit our idea.

How We Used Space Agency Data in This Project

Our dataset for word processing was gathered from various NASA and NASA-partnered websites, including NEO, NASA Earth Observatory, Aqua, and GMAO. Additionally, we used data from public-domain research papers from the Harvard and Nature journal websites. We also looked for existing reliable datasets from projects similar to ours, but we found hardly any, which we attribute to the uniqueness of our approach. This data was particularly useful because it is a sample of exactly the type of data we want to work with, not just mock-up data.

Tags
#search #data
Judging
This project was submitted for consideration during the Space Apps Judging process.