Constituent depletion and divination of hypothyroid prevalance using machine learning classification

Abstract

With the rapid growth of technology, people are shifting towards instant-food habits, which can lead to irregular functioning of the body's organs. One resulting problem is hypothyroidism, the underactive thyroid condition in which the thyroid gland does not produce the required amount of essential hormones. Predicting hypothyroidism remains challenging because it has no exact symptoms. With this in mind, this paper focuses on predicting hypothyroidism from clinical parameters. The hypothyroid dataset from the UCI Machine Learning Repository is used to predict the presence of hypothyroidism with machine learning classification algorithms. The work proceeds in four steps. First, the raw dataset is fitted with various classification algorithms. Second, the AdaBoost Regressor algorithm is applied to the dataset to extract its important features, and the resulting feature-importance dataset is fitted to the various classification algorithms. Third, the dataset is subjected to dimensionality reduction using principal component analysis (PCA), and the PCA-reduced dataset is fitted with the classification algorithms. Fourth, the raw dataset, the AdaBoost feature-importance dataset, and the PCA-reduced dataset are compared on precision, recall, F-score, and accuracy. The work is implemented as Python scripts in the Spyder IDE under Anaconda Navigator. Experimental results show that Random Forest, Naive Bayes, and Logistic Regression reach an accuracy of 99.5 on the raw dataset and the feature-importance-reduced dataset, and an accuracy of 99.8 on the five-component PCA-reduced dataset.
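The three-way comparison described in the abstract could be sketched roughly as follows with scikit-learn. This is an illustrative sketch, not the authors' code. It substitutes synthetic data from `make_classification` for the UCI hypothyroid dataset. The train/test split, the hyperparameters, the feature scaling before PCA, and the hypothetical `TOP_K` cutoff for the AdaBoost-ranked features are all assumptions, because the abstract does not state them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostRegressor, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data stands in for the UCI hypothyroid dataset.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Variant 2: rank features by AdaBoostRegressor importances, keep the top ones.
TOP_K = 8  # hypothetical cutoff; the abstract does not give one
booster = AdaBoostRegressor(n_estimators=50, random_state=0)
booster.fit(X_train, y_train)
top_idx = np.argsort(booster.feature_importances_)[::-1][:TOP_K]

# Variant 3: five-component PCA, fitted on the (scaled) training split only.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=5, random_state=0).fit(scaler.transform(X_train))

variants = {
    "raw": (X_train, X_test),
    "adaboost_top_k": (X_train[:, top_idx], X_test[:, top_idx]),
    "pca_5": (pca.transform(scaler.transform(X_train)),
              pca.transform(scaler.transform(X_test))),
}


def make_classifiers():
    """Return fresh instances of the three classifiers named in the abstract."""
    return {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "naive_bayes": GaussianNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }


# Fit every classifier on every dataset variant and collect the four metrics.
results = {}
for variant_name, (train_features, test_features) in variants.items():
    for clf_name, clf in make_classifiers().items():
        clf.fit(train_features, y_train)
        predicted = clf.predict(test_features)
        precision, recall, fscore, _ = precision_recall_fscore_support(
            y_test, predicted, average="binary", zero_division=0)
        results[(variant_name, clf_name)] = {
            "precision": precision,
            "recall": recall,
            "fscore": fscore,
            "accuracy": accuracy_score(y_test, predicted),
        }

for key, metrics in sorted(results.items()):
    print(key, {name: round(value, 3) for name, value in metrics.items()})
```

Fitting the AdaBoost ranking and the PCA projection on the training split only keeps test information out of the feature-reduction step. The scores printed for synthetic data say nothing about the accuracies reported for the hypothyroid dataset.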

Cite

Shyamala Devi, M., Shil, A., Katyayan, P., & Surana, T. (2019). Constituent depletion and divination of hypothyroid prevalance using machine learning classification. International Journal of Innovative Technology and Exploring Engineering, 8(12), 1607–1612. https://doi.org/10.35940/ijitee.L3150.1081219
