Data Discovery for Earth Science

Websites like the NASA Earth Observatory showcase the many uses of satellite data to highlight interesting natural events. International partner instruments on NASA satellites such as Japan’s ASTER instrument and Canada’s MOPITT instrument, both onboard the Terra satellite, are also included as part of the Observatory. This challenge will ask you to devise a tool or technique to guide users to relevant datasets to study specific events.

CLIQUE spell checker for multi-error misspellings

Summary

CLIQUE is a spell checker for misspellings that contain multiple errors. It combines four spell checking algorithms, each drawing on a different kind of knowledge, to correct a misspelled word.

How I Addressed This Challenge
Develop: an algorithm named CLIQUE that corrects misspellings containing multiple spelling errors.
Important: to help users search the NASA Earth Data datasets even when they misspell a word because they do not know the correct spelling, are typing fast, or are searching with long field-specific words, which are prone to spelling errors.
CLIQUE does: spell checking even when a word contains multiple errors, between 2 and 5. It also handles the single-error case, but it specializes in words with multiple errors.
CLIQUE works: by accepting one complete misspelled word, preferably with no spaces or special characters. It then invokes the first spell checking algorithm and gives the user up to 10 suggested words for the misspelling. If the correct word is not among them, the user can invoke the second, third, and fourth algorithms in turn until the correct word is found. If the correct word is still not found after all four algorithms have run, the user can either re-type the misspelled word or use another program to check it, as CLIQUE cannot help further.
Achieve: ease of finding the required dataset even when the search term contains spelling errors.
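The workflow above (one word in, up to 10 suggestions per algorithm, next algorithm only if the correct word was not found) can be sketched as follows. This is a minimal illustration, not the CLIQUE implementation: the four algorithms CLIQUE uses are not named here, so the sketch uses two stand-in rankers (Levenshtein edit distance and shared character bigrams). The names CliqueCascadeSketch, Suggester, and firstStageContaining, and the small dictionary, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CliqueCascadeSketch {

    /** Hypothetical interface for one spell checking algorithm in the cascade. */
    public interface Suggester {
        List<String> suggest(String misspelling, int limit);
    }

    /** Each algorithm offers at most this many suggestions, as described above. */
    public static final int MAX_SUGGESTIONS = 10;

    /** Levenshtein distance: minimum insertions, deletions, and substitutions. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Counts the two-letter sequences of b that also occur somewhere in a. */
    public static int sharedBigrams(String a, String b) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 1 < a.length(); i++) grams.add(a.substring(i, i + 2));
        int shared = 0;
        for (int i = 0; i + 1 < b.length(); i++) {
            if (grams.contains(b.substring(i, i + 2))) shared++;
        }
        return shared;
    }

    /** Sorts a copy of the dictionary and keeps the first `limit` words. */
    private static List<String> top(List<String> dictionary, Comparator<String> order, int limit) {
        List<String> ranked = new ArrayList<>(dictionary);
        ranked.sort(order);
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    /** Stand-in algorithm 1: closest words by edit distance first. */
    public static Suggester byEditDistance(List<String> dictionary) {
        return (word, limit) ->
                top(dictionary, Comparator.comparingInt(w -> editDistance(word, w)), limit);
    }

    /** Stand-in algorithm 2: words sharing the most bigrams first. */
    public static Suggester bySharedBigrams(List<String> dictionary) {
        return (word, limit) ->
                top(dictionary, Comparator.comparingInt(w -> -sharedBigrams(word, w)), limit);
    }

    /**
     * Simulates the user: runs the algorithms one at a time and returns the index
     * of the first one whose suggestions contain the intended word, or -1 if none do.
     */
    public static int firstStageContaining(List<Suggester> stages, String misspelling, String intended) {
        for (int stage = 0; stage < stages.size(); stage++) {
            List<String> suggestions = stages.get(stage).suggest(misspelling, MAX_SUGGESTIONS);
            if (suggestions.contains(intended)) return stage;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Tiny illustrative dictionary; a real run would load the full keyword list.
        List<String> dictionary = List.of("atmosphere", "environmental", "ocean", "aerosols");
        List<Suggester> stages = List.of(byEditDistance(dictionary), bySharedBigrams(dictionary));
        System.out.println(stages.get(0).suggest("atmusfer", MAX_SUGGESTIONS));
        System.out.println(firstStageContaining(stages, "atmusfer", "atmosphere"));
    }
}
```

A return value of -1 from firstStageContaining corresponds to the case where the user has to re-type the word or turn to another program.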
How I Developed This Project
Inspiration: the NASA Earth Data search engine needs spell checking, because users searching for datasets misspell words when they do not know the correct spelling or are typing fast, and because long words are prone to being misspelt.
Development Approach: developing an algorithm, CLIQUE, whose novelty is handling up to 5 mistakes in a word. CLIQUE combines 4 spell checking algorithms, each drawing on a different kind of knowledge. These algorithms were selected after a comparison, based on thorough experimental tests, of 12 spell checking algorithms and 2 spell checking programs. CLIQUE executes the algorithms one at a time, each offering up to 10 suggested corrections. Once the user finds the correct word, after the first algorithm or a later one, they can either quit the program or enter another word to check.
Coding Language: Java Programming Language
Achievement: a working program
How I Used Space Agency Data in This Project
NASA Earth Data keywords were used as test words, with up to 5 spelling errors imposed on each word. Examples of the resulting misspellings are enviruunetla for "environmental" and atmusfer for "atmosphere". The results, with screenshots, are presented in the project slides.
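How the errors were imposed on the keywords is not described here. One possible way to produce comparable test inputs is to apply a fixed number of random single-character edits to a keyword, as in the sketch below. The class name MisspellingGeneratorSketch, the method imposeErrors, and the choice of random insertions, deletions, and substitutions are all assumptions for illustration; they are not taken from the project.

```java
import java.util.Random;

public class MisspellingGeneratorSketch {

    private static final String LETTERS = "abcdefghijklmnopqrstuvwxyz";

    /**
     * Applies `errors` random single-character edits to the word.
     * A substitution may pick the letter already in place, so the result
     * contains at most `errors` real errors, not always exactly that many.
     */
    public static String imposeErrors(String word, int errors, Random random) {
        StringBuilder sb = new StringBuilder(word);
        for (int i = 0; i < errors; i++) {
            // An empty word can only take an insertion.
            int kind = sb.length() == 0 ? 0 : random.nextInt(3);
            char letter = LETTERS.charAt(random.nextInt(LETTERS.length()));
            switch (kind) {
                case 0: // insertion at a random position, including the end
                    sb.insert(random.nextInt(sb.length() + 1), letter);
                    break;
                case 1: // deletion of a random character
                    sb.deleteCharAt(random.nextInt(sb.length()));
                    break;
                default: // substitution of a random character
                    sb.setCharAt(random.nextInt(sb.length()), letter);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        for (int errors = 1; errors <= 5; errors++) {
            System.out.println(errors + " -> " + imposeErrors("environmental", errors, random));
        }
    }
}
```

Random edits like these differ from the examples above, which look like phonetic or typing mistakes, so hand-made misspellings remain a useful complement.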
Project Demo

CLIQUE webpage: https://multi-error-misspellings.github.io

Data & Resources

NASA Earth Data Keywords: https://search.earthdata.nasa.gov/search

Tags
#clique #spell checking #multi-error misspellings
Judging
This project was submitted for consideration during the Space Apps Judging process.