Abstract

Over the last few decades, higher education has gradually deteriorated in several areas: the academic setting (both staff and students) and research and development output (including graduates). Colleges and universities are chiefly concerned with improving management decision-making and educating students. High-quality higher education can be pursued in several ways, one of which is to forecast students’ achievement accurately in their chosen educational context. Numerous prediction models are available, although it remains unclear whether any indicators can reliably predict whether a student will be an academic genius, a dropout, or an average performer. This article presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. First, features are selected using the relief algorithm. Machine learning classifiers, namely a backpropagation neural network (BPNN), random forest (RF), and naive Bayes (NB), are then used to classify student academic performance data. BPNN achieves better accuracy than the other classifiers for the classification and prediction of student academic performance.

1. Introduction

Data mining (DM) is the extraction and processing of valuable information from a large data warehouse. The first step in DM is to examine the data from several perspectives and to identify the most valuable information in its most summarized form. In marketing strategy, DM approaches are extremely beneficial because they filter out unnecessary data and save resources. They also help to discover consumer behavior patterns, and the knowledge they produce is simple enough to be practical. For forecasting and prediction, DM techniques can process large populations of records more quickly than earlier methods. Beyond the hype, the discipline is having a significant influence on the education, industry, and science sectors, and it may also lead to new methodological advances. Given the clear link between DM and statistical and mathematical data analysis, most approaches used in DM have so far arisen from the field of statistics. As part of our investigation, we look at some of the latest educational models and practices [1].

All colleges and universities are primarily concerned with improving the quality of managerial decisions and educating students. High-quality higher education may be achieved in a number of ways, one of which is the accurate prediction of students’ success in their chosen educational setting. A variety of prediction models are available, although it is not clear whether any indicators can reliably forecast whether a student will be an academic genius, a dropout, or an ordinary performer. Existing methods have not met the evolving needs of higher and professional education. The goal of this study is to identify issues and potential solutions related to the quality of education provided by universities and other higher learning institutions, and to provide a framework for doing so [2, 3].

In higher education, students and alumni face serious issues. An institution wants to determine who will enroll in a specific program and who will need extra help to complete a degree. It also wants to know which students are more likely than others to switch schools. These and other difficulties, such as student enrollment management and the time taken to complete a degree program, keep higher education institutions on their toes. DM, the analysis and presentation of data, may be an effective means of addressing these issues for students and alumni. With DM, organizations can use their existing reporting tools to find and analyze patterns in massive databases. Data mining models built on these patterns can predict individuals’ behavior with high accuracy, which allows schools to allocate resources and staff more effectively [4]. For example, DM might give an institution the information it needs to act before a student drops out, or an accurate forecast of how many students will attend a certain course [5, 6].

This article presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. First, the input data set is preprocessed using the AMF algorithm. Then, features are selected using the relief algorithm. Finally, machine learning classifiers, namely a backpropagation neural network (BPNN), random forest (RF), and naive Bayes (NB), are used to classify student academic performance data.

The literature survey section reviews existing work in education technology. The methodology section presents the proposed metaheuristics and machine learning-based method for the classification and prediction of student performance: features are first selected using the relief algorithm, and the BPNN, RF, and NB classifiers are then used to classify student academic performance data. The result analysis section describes the experimental setup and the results of the various algorithms. The conclusion section summarizes the contribution of the article.

2. Literature Survey

This section surveys existing work on evaluating and forecasting student performance in higher education institutions using DM technology.

Data from a preoperative evaluation were used to forecast whether an individual would pass or fail a course, and Bayesian, decision tree, and neural network algorithms were compared in terms of prediction accuracy, ease of learning, and user-friendliness. The researchers found that, when the proper measures are taken at the right time, such a system can help students and teachers improve student achievement and reduce the failure rate [7].

To forecast students’ grades, Jayasingh et al. [8] used a Bayesian classification algorithm that relied on the previous year’s performance data. According to the researchers, both teachers and students stand to benefit from the findings, which also help to identify the students who need additional attention in order to lower the failure rate. To create a multiclassifier, they compared the performance of four distinct classifiers. A genetic algorithm (GA) was used to reduce the error rate by at least 10% by classifying the data into three groups based on the attributes that were most important to the prediction.

Researchers found that student performance can be estimated automatically in the performance prediction problem. The use of a Bayesian network, an extensible classification framework, makes the incorporation of this information into the learning process simple and uniform. The study shows that strategies for performance prediction and further investigation of learning algorithms are both necessary and desirable [9].

These data showed a strong correlation between a student’s current performance and their performance in earlier courses (most likely prerequisite courses). According to [10], classification trees are popular because their classification criteria are easy to understand. To find the best classifier for student data and to forecast students’ performance on the end-of-semester assessment, the researchers compared commonly used decision tree classifiers. According to their experimental findings, CART was the best method for classifying the data.

The decision tree method was shown to predict students’ academic achievement accurately. The authors of [11] argued that DM can be used in higher education, in particular to predict students’ final achievement. They used questionnaires to gather information from students on their attitudes toward learning and their academic achievement, and then applied DM techniques. Students’ final grades were predicted using a decision tree-based model and an SVM algorithm that implemented the model’s criteria for prediction. The students were grouped using kernel k-means clustering.

To forecast the final exam scores of engineering students, it was important to develop prediction models that took into account each student’s individual characteristics as well as the social, psychological, and other influences on their performance [12, 13]. Compared with the ID3 and CART algorithms, the C4.5 approach achieved the highest accuracy, 67.7778 percent. Bharadwaj and Pal [14] examined several criteria to derive the performance prediction indicators that instructors need to measure, monitor, and evaluate performance. On their data, naive Bayes classification was the most effective method.

Another study used educational data mining to forecast the likelihood that students would stay in school [15]. Machine learning algorithms (ID3, C4.5, and ADT) were used to evaluate and extract knowledge from previously collected student records. ADT was shown to be able to build predictive models from the previous year’s retention data.

Rossi [16] was the first to propose an algorithm and system architecture suited to anticipating instructors’ performance and to recommending critical actions that help school administrators make decisions, motivated by the limitations of conventional approaches. According to [16], school districts that use this technique to help administrators make better decisions and teachers improve their performance may see an increase in students’ academic achievement. Hemaid and Halees [12] carried out a similar research project to better understand the aspects that influence the appraisal of instructors’ performance. Teachers from Gaza City’s Ministry of Education and Higher Education participated in this survey, which was conducted in English. In each activity, they spent a significant amount of time discussing the relationship between their outcomes and the teacher’s performance.

The technique developed by Ağaoğlu et al. [17] was designed with the main goal of improving student performance. As part of the course requirements, students were given a questionnaire on how they were taught and how they interacted with one another. The performance of the staff members who taught the relevant courses was then assessed using an educational data mining-based classification approach. In this investigation, the C4.5 classifier outperformed the alternatives. The investigation also found that a substantial number of the survey questions used to assess student satisfaction with the course were erroneous. According to Tripti et al., students’ social integration, emotional capacity, and intellectual accomplishment are all important factors in their development. Students in their third semester were examined using two classification approaches, J4.8 and the random tree method. In a side-by-side comparison, the random tree outperformed J4.8.

3. Methodology

This section presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. Firstly, features are selected using a relief algorithm. Machine learning classifiers such as BPNN, RF, and NB are used to classify student academic performance data. The flowchart for the proposed methodology is shown in Figure 1.

In 1992, Kira and Rendell [18] introduced the relief algorithm, which is based on an instance-based learning methodology. Relief is a filter method for feature selection: it scores each feature independently of the classifier that is trained afterwards. The feature statistics are computed from nearest neighbors, which allows the scores to account for interactions between features. The original algorithm does not handle data with missing values or with more than two classes.
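The nearest-neighbor scoring idea can be illustrated with a minimal sketch of a Relief-style weight update for two classes. This is not the implementation used in this study; the function name `relief`, the Manhattan distance, the range scaling, and the toy data are all assumptions made for illustration.

```python
import random

def relief(X, y, n_iterations=None, seed=0):
    """Return one relevance weight per feature (two-class, numeric data).

    Illustrative sketch only: assumes no missing values and at least
    two instances per class.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    n_iterations = n_iterations or n_samples

    # Scale feature differences by the feature range so weights are comparable.
    ranges = []
    for j in range(n_features):
        column = [row[j] for row in X]
        ranges.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / ranges[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(n_features))

    weights = [0.0] * n_features
    for _ in range(n_iterations):
        i = rng.randrange(n_samples)
        hits = [k for k in range(n_samples) if k != i and y[k] == y[i]]
        misses = [k for k in range(n_samples) if y[k] != y[i]]
        # Nearest hit: closest instance of the same class.
        hit = min(hits, key=lambda k: distance(X[i], X[k]))
        # Nearest miss: closest instance of the other class.
        miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(n_features):
            # Reward features that differ across classes, penalise those
            # that differ within a class.
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n_iterations
    return weights

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.6]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
```

Features with higher weights would be kept; the cutoff (a threshold or a top-k rule) is a design choice that the sketch leaves open.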

The backpropagation approach described by Haykin and Anderson is one of the most widely used techniques for training neural networks. A backpropagation network (BPN) is a good choice for simple pattern recognition and mapping tasks. Strictly speaking, backpropagation is a learning method rather than a network. Given a set of examples, the algorithm trains a network to produce the proper output for each input pattern presented to it, and the network’s weights are updated as a result. A training pair consists of an input and a target [19].
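To make the weight update concrete, the following is a minimal sketch of backpropagation for a network with one hidden layer, trained on input and target pairs. It is an illustration under stated assumptions, not the network used in this study: the class name `TinyBPNN`, the sigmoid activation, the squared-error loss, the learning rate, and the XOR training pairs are all hypothetical choices.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBPNN:
    """One-hidden-layer network trained with backpropagation (illustrative)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight row per unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_pair(self, x, target, lr=0.5):
        hidden, output = self.forward(x)
        # Error term at the output unit (squared error through the sigmoid).
        delta_out = (output - target) * output * (1.0 - output)
        # Propagate the error back to the hidden units before updating weights.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, ws in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= lr * delta_hidden[j] * v

    def loss(self, pairs):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in pairs) / len(pairs)

# Hypothetical training pairs (input, target): the XOR mapping.
pairs = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
         ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
net = TinyBPNN(n_in=2, n_hidden=3)
loss_before = net.loss(pairs)
for _ in range(5000):
    for x, target in pairs:
        net.train_pair(x, target)
loss_after = net.loss(pairs)
```

Comparing `loss_before` with `loss_after` shows whether the repeated weight updates reduced the error on the training pairs.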

The naive Bayes classifier is built on Bayes’ theorem and takes its name from the naive assumption that the features are independent of one another given the class. It is a widely used classification technique and has therefore received a great deal of attention. Naive Bayesian models are easy to build because they do not require iterative parameter estimation. Bayes’ theorem gives the posterior probability P(c|x) of class c given the features x in terms of the prior probability P(c), the likelihood P(x|c), and the evidence P(x): P(c|x) = P(x|c)P(c)/P(x). The probabilities can be estimated by first constructing frequency tables from the data. From these tables, the naive Bayesian approach calculates the posterior probability of each class in the dataset, and the class with the highest posterior probability is chosen as the prediction [20].
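The frequency-table procedure can be sketched for categorical features as follows. The function names, the add-one smoothing, and the toy attendance and assignment records are assumptions for illustration and do not come from the data set used in this study.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Build the frequency tables needed by a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)      # (feature index, class) -> value counts
    domains = [set() for _ in rows[0]]       # distinct values seen per feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            domains[j].add(value)
    return class_counts, value_counts, domains

def posterior(row, model):
    """Return P(c|x) for every class c, normalised to sum to one."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                  # prior P(c)
        for j, value in enumerate(row):
            # Likelihood P(x_j|c) with add-one smoothing (an assumption here,
            # used so that unseen values do not zero out the product).
            score *= (value_counts[(j, c)][value] + 1) / (n_c + len(domains[j]))
        scores[c] = score
    evidence = sum(scores.values())          # plays the role of P(x)
    return {c: s / evidence for c, s in scores.items()}

# Hypothetical records: (attendance, assignments) -> outcome.
rows = [("high", "done"), ("high", "done"), ("high", "missed"),
        ("low", "done"), ("low", "missed"), ("low", "missed")]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]
model = train_naive_bayes(rows, labels)
probs = posterior(("high", "done"), model)
predicted = max(probs, key=probs.get)
```

The predicted class is simply the one with the largest posterior, which matches the final selection step described in the text.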

The ID3 decision tree algorithm constructs a tree whose branches represent conditions on attribute values and whose leaves represent values of the target class label. Such a tree can be readily expressed in any programming language as nested if-else statements. ID3 uses entropy and information gain to examine the training data and to choose the attribute to split on. It is a greedy algorithm that does not backtrack to revise earlier decisions. Standard ID3 does not apply any pruning or other optimization to limit the size of the tree. Entropy measures how homogeneous a subset of the data is, where homogeneity means that the values are alike. The entropy of a collection of values is zero if they are all the same and, for two classes, is equal to one if the values are evenly split. To build the tree, entropy is computed from the frequency table of a single attribute (the target) and from the frequency table of two attributes (the target together with a candidate attribute).
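As a brief illustration of the two quantities that drive ID3, the sketch below computes entropy for a list of class labels and the information gain of splitting on one attribute. The function names and the toy records are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    probabilities = [n / total for n in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probabilities)

def information_gain(rows, labels, j):
    """Entropy of the target minus the weighted entropy after splitting on attribute j."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[j], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical records: (attendance, assignments) -> outcome.
rows = [("high", "done"), ("high", "done"), ("high", "missed"),
        ("low", "done"), ("low", "missed"), ("low", "missed")]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

gain_attendance = information_gain(rows, labels, 0)
gain_assignments = information_gain(rows, labels, 1)
# ID3 would split first on the attribute with the larger gain.
best_attribute = 0 if gain_attendance >= gain_assignments else 1
```

In this toy example, attendance separates the two outcomes perfectly, so a greedy ID3 step would choose it as the root split.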

4. Results and Discussion

The university data set (http://archive.ics.uci.edu/ml/datasets/university) is used for the experimental study. It consists of 285 instances with seventeen attributes. Of these, 240 instances are used to train the machine learning classifiers and the remaining 45 instances are used for testing. Results are shown in Figures 2 to 4.
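A holdout split of this size can be sketched as follows. The text does not state how the 240 training instances were chosen, so the random shuffle, the seed, and the function name `holdout_split` are assumptions; the integers stand in for the actual records.

```python
import random

def holdout_split(instances, n_train, seed=0):
    """Shuffle the instances and return (training set, test set)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder instance identifiers standing in for the 285 records.
train_set, test_set = holdout_split(range(285), n_train=240)
```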

Parameters used in the experimental study are discussed:

(i) Accuracy = (TP + TN)/(TP + TN + FP + FN)

(ii) Sensitivity = TP/(TP + FN)

(iii) Specificity = TN/(TN + FP)

where

TP = true positive,

TN = true negative,

FP = false positive,

FN = false negative.
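The three measures can be computed directly from the confusion-matrix counts, as in the sketch below. The function name and the example counts are hypothetical; the counts are chosen only so that they sum to 45 test instances and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Hypothetical counts for illustration only (not the paper's results).
metrics = classification_metrics(tp=20, tn=18, fp=4, fn=3)
```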

5. Conclusion

Higher education has gradually declined over the last few decades in several areas: the academic setting (both faculty and students) and research and development output (including graduates). In essence, all colleges and universities are devoted to improving management decision-making and educating their students. High-quality higher education can be pursued through a variety of means, one of which is to predict students’ performance accurately in their chosen educational context. Many prediction models are available, although it remains uncertain whether any indicators can show whether a student will be an academic genius, a dropout, or an ordinary performer. To classify students and predict their academic success, this article makes use of machine learning and metaheuristics. The relief algorithm is applied first to reduce the number of features that need to be considered. The data on students’ academic achievement are then classified using three machine learning classifiers: BPNN, RF, and NB. BPNN achieves better accuracy than the other classifiers in classifying and predicting students’ academic performance.

Data Availability

The data shall be made available on request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research work is self-funded.