Our model achieves an accuracy of 95%. I want to attach some screenshots as proof.

Below is a screenshot of our dataset.

We are using the K nearest neighbours (KNN) machine learning algorithm to solve this challenge. KNN is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function). It has been used in statistical estimation and pattern recognition as a non-parametric technique since the early 1970s.

A case is classified by a majority vote of its neighbours: it is assigned to the class most common among its K nearest neighbours, as measured by a distance function. If K = 1, the case is simply assigned to the class of its nearest neighbour.

Note that the three usual distance measures (Euclidean, Manhattan, and Minkowski) are only valid for continuous variables. For categorical variables, the Hamming distance must be used. When the dataset mixes numerical and categorical variables, the numerical variables should also be standardized to the range 0 to 1, so that no single variable dominates the distance.

Choosing the optimal value of K is best done by first inspecting the data. In general, a larger K is more precise because it reduces the overall noise, but this is not guaranteed. Cross-validation is another way to determine a good K retrospectively, by using an independent dataset to validate it. Historically, the optimal K for most datasets has been between 3 and 10, which produces much better results than 1NN.

The tools we have used are Google Colab and RStudio.
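
The KNN procedure described above can be sketched in plain Python, which runs in Google Colab without extra packages. This is only an illustration under stated assumptions, not the project's actual code: the readings, the meaning of the two columns (PM2.5 and NO2), and the "Good"/"Poor" labels are made-up values, and every function name here is hypothetical.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Return the per-column minimum and maximum of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(row, lows, highs):
    """Rescale one row to [0, 1] per column; constant columns map to 0.0."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    ]


def euclidean(a, b):
    """Euclidean distance, for continuous variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance: count of positions where categorical values differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training cases."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k):
    """Hold out each case in turn and check whether KNN recovers its label."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) == ys[i]:
            hits += 1
    return hits / len(xs)


# Made-up sample: [PM2.5, NO2] readings with an air-quality label (assumption).
readings = [[30, 12], [35, 15], [40, 10], [160, 60], [180, 75], [170, 66]]
labels = ["Good", "Good", "Good", "Poor", "Poor", "Poor"]

lows, highs = fit_min_max(readings)
scaled = [scale(r, lows, highs) for r in readings]

# Classify a new reading, scaled with the same minima and maxima.
prediction = knn_predict(scaled, labels, scale([45, 14], lows, highs), k=3)

# Compare a few candidate K values by leave-one-out cross-validation.
scores = {k: leave_one_out_accuracy(scaled, labels, k) for k in (1, 3, 5)}
```

On this tiny sample, K = 1 and K = 3 classify every held-out case correctly, while K = 5 misclassifies all of them, because once a case is held out its own class is outnumbered among the five remaining neighbours. This is a small reminder that a larger K is not guaranteed to be better.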
I have used a dataset related to COVID-19. It has many columns that describe the contents of the air.
https://drive.google.com/file/d/1RYwCwjPmntJXDhtBc0lbESHXu-ZrNSOc/view?usp=sharing
https://github.com/RaviPrakash1264/NSAC-Environmental-club
COVID-19 dataset:
https://archive.ics.uci.edu/ml/index.php
https://www.kaggle.com/rohanrao/air-quality-data-in-india?select=station_day.csv