Minimizing the Subset of Features on BDHS Dataset to Improve Prediction on Pregnancy Termination

Ahmed, Faisal; Shultana, Shahana; Yasmin, Afrida; Prome, Junnatul Ferdouse

doi:10.1007/978-981-33-6984-9_6

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1335)

Included in the following conference series:

Congress on Intelligent Systems

Abstract

Predicting pregnancy termination and controlling the child mortality rate have long been major challenges for third-world countries. This research aims to extract the best subset of features in order to predict pregnancy termination more accurately than previous studies. To that end, we carried out an extensive study of the Bangladesh Demographic and Health Survey (BDHS) 2014 to identify the attributes that contribute most to pregnancy termination in Bangladesh. Bivariate and multivariate analyses of these data reveal details about the recent causes of pregnancy termination. To find the intended features, we first performed demographic feature selection with the visualization tools provided by Weka, and then applied Weka's feature-ranking attribute evaluators: Correlation, Gain Ratio, OneR, Symmetrical Uncertainty, Information Gain, and Relief. After minimizing the subset of features, we applied three traditional machine learning classifiers (Naïve Bayes, Bayesian Network, Decision Stump) along with the hybrid method, which shows better performance on the evaluation metrics. This research improved accuracy by 10.238% for Naïve Bayes, 8.2657% for Bayesian Network, 3.5853% for Decision Stump, and 9.03% for the hybrid method.
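As a rough illustration of what entropy-based feature ranking involves, the sketch below scores discrete features by information gain, gain ratio, and symmetrical uncertainty using their textbook definitions. It is not the authors' pipeline and not Weka's implementation. The function names, the toy columns, and the toy target are all hypothetical and were invented for this example; they are not BDHS records.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, target):
    """H(target) - H(target | feature) for two equal-length discrete columns."""
    total = len(target)
    conditional = 0.0
    for value in set(feature):
        subset = [t for f, t in zip(feature, target) if f == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(target) - conditional


def gain_ratio(feature, target):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, target) / split_info if split_info > 0 else 0.0


def symmetrical_uncertainty(feature, target):
    """2 * IG / (H(feature) + H(target)), a value in the range [0, 1]."""
    denom = entropy(feature) + entropy(target)
    return 2.0 * information_gain(feature, target) / denom if denom > 0 else 0.0


def rank_features(columns, target, score=information_gain):
    """Return (name, score) pairs sorted from most to least informative."""
    scored = [(name, score(values, target)) for name, values in columns.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data, invented for illustration only (not taken from BDHS).
toy_columns = {
    "residence": ["urban", "urban", "rural", "rural", "urban", "rural"],
    "education": ["none", "primary", "none", "primary", "none", "primary"],
}
toy_target = ["yes", "yes", "no", "no", "yes", "no"]

ranking = rank_features(toy_columns, toy_target)
for name, value in ranking:
    print(f"{name}: {value:.4f}")
```

In this toy data the hypothetical "residence" column separates the two target classes completely, so it ranks first under all three scores. Minimizing the feature subset would then amount to keeping only the top-ranked columns before training a classifier.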

Author information

Authors and Affiliations

Department of CSE, Daffodil International University, Dhaka, Bangladesh
Faisal Ahmed & Shahana Shultana
Department of CSE, Z. H. Sikder University of Science and Technology, Sikder, Bangladesh
Junnatul Ferdouse Prome
Department of Statistics, Jahangirnagar University, Jahangirnagar, Bangladesh
Afrida Yasmin

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Rajasthan Technical University, Kota, Rajasthan, India
Harish Sharma
Department of Computer Science and Engineering, Jaypee Institute of Information Technology, Noida, Uttar Pradesh, India
Mukesh Saraswat
National Institute of Technology, Jalandhar, Punjab, India
Anupam Yadav
Korea University, Seoul, Korea (Republic of)
Joong Hoon Kim
South Asian University, New Delhi, Delhi, India
Jagdish Chand Bansal

About this paper

Cite this paper

Ahmed, F., Shultana, S., Yasmin, A., Prome, J.F. (2021). Minimizing the Subset of Features on BDHS Dataset to Improve Prediction on Pregnancy Termination. In: Sharma, H., Saraswat, M., Yadav, A., Kim, J.H., Bansal, J.C. (eds) Congress on Intelligent Systems. CIS 2020. Advances in Intelligent Systems and Computing, vol 1335. Springer, Singapore. https://doi.org/10.1007/978-981-33-6984-9_6

DOI: https://doi.org/10.1007/978-981-33-6984-9_6
Published: 02 June 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-6983-2
Online ISBN: 978-981-33-6984-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
