https://github.com/avid7-tech/Predicting-stars-galaxies-and-quasars-using-ML-algorithms
In this project, we apply Machine Learning (ML) algorithms to the task of classifying celestial objects. Using data from the Sloan Digital Sky Survey (SDSS), our primary objective is to build robust classification models that accurately distinguish between stars, galaxies, and quasars.

The domain poses several challenges. The first is class imbalance: some classes in the dataset have far more samples than others, which can bias a classifier toward the majority class at the expense of the minority classes. The second is feature scaling: the features span disparate ranges, which can skew model predictions, particularly for distance-based methods such as KNN. The third is data leakage, where information from the test set inadvertently influences model training and leads to overly optimistic performance estimates.

To address these, we explore a variety of ML techniques, with a particular focus on Decision Trees, K-Nearest Neighbors (KNN), and Logistic Regression. Each model is trained and tuned on the SDSS data, in which every observation is described by many distinct features. Through careful experimentation and validation, we aim to overcome these challenges and maintain high prediction accuracy across all classes. By combining ML with celestial object classification, we seek not only to reveal the cosmos but also to pave the way for new insights into the nature and dynamics of celestial phenomena.
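
The three pitfalls named above (class imbalance, feature scaling, and data leakage) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data is synthetic rather than SDSS, the class proportions and the two feature ranges are invented, and the helper names (`stratified_split`, `fit_scaler`, `apply_scaler`, `balanced_class_weights`) are hypothetical and not taken from this repository. The idea is to split per class so that rare classes appear in both sets, compute scaling statistics from the training rows only so that nothing about the test set leaks into training, and derive inverse-frequency class weights from the training labels.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, pstdev


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with each class split at the same rate."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one test sample per class, even for rare classes.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from these rows only."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows using previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Synthetic stand-in data (NOT SDSS): an imbalanced label list and two
# made-up features on very different numeric ranges.
rng = random.Random(42)
labels = ["GALAXY"] * 60 + ["STAR"] * 30 + ["QSO"] * 10
rows = [[rng.uniform(0.0, 3.0), rng.uniform(14.0, 24.0)] for _ in labels]

train_idx, test_idx = stratified_split(labels, test_fraction=0.25)
X_train = [rows[i] for i in train_idx]
X_test = [rows[i] for i in test_idx]
y_train = [labels[i] for i in train_idx]
y_test = [labels[i] for i in test_idx]

# Fit scaling statistics on the training rows only, then reuse them for the
# test rows. Fitting on all rows would leak test-set information.
means, stds = fit_scaler(X_train)
X_train_scaled = apply_scaler(X_train, means, stds)
X_test_scaled = apply_scaler(X_test, means, stds)

# Rare classes receive larger weights, which a weighted loss can use.
weights = balanced_class_weights(y_train)
```

Libraries such as scikit-learn offer equivalents of each step (`train_test_split` with `stratify`, `StandardScaler`, and `class_weight="balanced"` on several classifiers); the sketch spells the steps out only to make the order of operations explicit.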