Authors: Dr. J. Yogapriya, Abinithi S, Dhivya V, Harshavarthini V
Abstract: Survey-based research often faces data quality issues such as missing values, duplicate entries, inconsistent responses, and outliers, which can affect the accuracy of statistical analysis and decision-making. This paper proposes an AI-assisted web-based system for automated survey data cleaning and statistical reporting. The system integrates rule-based techniques and Large Language Model (LLM) capabilities to detect missing values, remove duplicates, identify outliers, and standardize survey responses with minimal manual intervention. After preprocessing, the platform automatically generates descriptive statistical reports that help researchers quickly analyze cleaned datasets. The proposed system improves the reliability and efficiency of survey data analysis while reducing manual preprocessing effort. It supports applications in healthcare surveys, educational research, and workforce studies, contributing to data-driven decision-making aligned with Sustainable Development Goals (SDGs) including Good Health and Well-Being; Quality Education; Decent Work and Economic Growth; Industry, Innovation and Infrastructure; and Partnerships for the Goals.
DOI: https://doi.org/10.5281/zenodo.19386018
