Sentiment Analysis on E-commerce Product using Machine Learning and Combination of TF-IDF and Backward Elimination
Tommy Willianto1, Supryadi2, Antoni Wibowo3

1Tommy Willianto, Computer Science Department, Binus Graduate Program – Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia.
2Supryadi, Computer Science Department, Binus Graduate Program – Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia.
3Antoni Wibowo, Computer Science Department, Binus Graduate Program – Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia.
Manuscript received on March 12, 2020. | Revised Manuscript received on March 25, 2020. | Manuscript published on March 30, 2020. | PP: 2762-2767 | Volume-8 Issue-6, March 2020. | Retrieval Number: F7889038620/2020©BEIESP | DOI: 10.35940/ijrte.F7889.038620

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: E-commerce platforms are websites or mobile applications through which people buy products. Before purchasing, customers decide whether to buy a product by reading reviews from previous buyers. Because a product can have a large number of reviews, reading all of them takes a long time. This research uses sentiment analysis to classify review data. Sentiment analysis, or opinion mining, is a machine learning approach to classifying and analysing texts or documents according to the sentiments, emotions, and opinions they express. In this research, sentiment analysis is used to classify product reviews from e-commerce websites into positive or negative classes. The results can be processed further to summarize customers’ opinions about a product without reading every single review. The goal of this research is to optimize classification performance by using a feature selection technique. Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction, Backward Elimination feature selection, and five classifiers (Naïve Bayes, Support Vector Machine, K-Nearest Neighbour, Decision Tree, and Random Forest) are used to analyse the sentiment of the reviews. The dataset consists of Indonesian-language reviews labelled with two classes (positive and negative). The best accuracy, 85.97%, is achieved by combining TF-IDF, Backward Elimination, and Support Vector Machine (SVM), an increase of 7.91% compared to the same process without feature selection. Based on the results, Backward Elimination feature selection improved performance for every classifier used in this research.
Keywords: Backward Elimination, E-Commerce, Sentiment Analysis, TF-IDF (Term Frequency-Inverse Document Frequency)
Scope of the Article: E-Governance, E-Commerce, E-Business, E-Learning.
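
The pipeline summarized in the abstract (TF-IDF feature extraction followed by Backward Elimination feature selection and a classifier) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the reviews are hypothetical English toy data rather than the Indonesian dataset, a simple nearest-centroid classifier stands in for the five classifiers named above, Backward Elimination is taken to be a greedy wrapper that drops one feature at a time while leave-one-out accuracy improves, and the TF-IDF weighting uses a smoothed IDF chosen for illustration. All function names are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (an assumption of this sketch) so no weight collapses to zero.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        rows.append([tf[t] / len(toks) * idf[t] for t in vocab])
    return vocab, rows

def predict(train_x, train_y, x, feats):
    """Nearest-centroid prediction restricted to the feature indices in feats."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [r for r, y in zip(train_x, train_y) if y == label]
        dist = sum(
            (x[j] - sum(r[j] for r in members) / len(members)) ** 2
            for j in feats
        )
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(rows, labels, feats):
    """Leave-one-out accuracy of the stand-in classifier on a feature subset."""
    hits = 0
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += predict(train_x, train_y, rows[i], feats) == labels[i]
    return hits / len(rows)

def backward_elimination(rows, labels):
    """Start from all features; greedily drop one while accuracy improves."""
    feats = list(range(len(rows[0])))
    best = loo_accuracy(rows, labels, feats)
    while len(feats) > 1:
        # Score every subset obtained by removing exactly one feature.
        scored = [
            (loo_accuracy(rows, labels, [f for f in feats if f != j]), j)
            for j in feats
        ]
        score, drop = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best:
            break  # no single removal improves accuracy, so stop
        best = score
        feats.remove(drop)
    return feats, best

# Hypothetical toy reviews; the paper's dataset is Indonesian-language.
reviews = [
    "good product fast delivery",
    "great quality good price",
    "fast delivery great seller",
    "good seller recommended",
    "bad product slow delivery",
    "poor quality bad price",
    "slow delivery poor seller",
    "bad seller disappointed",
]
labels = ["positive"] * 4 + ["negative"] * 4

vocab, rows = tfidf(reviews)
baseline = loo_accuracy(rows, labels, list(range(len(vocab))))
selected, accuracy = backward_elimination(rows, labels)
print("baseline accuracy:", baseline)
print("accuracy after elimination:", accuracy)
print("kept terms:", [vocab[j] for j in selected])
```

Because a feature is removed only when the removal strictly improves the score, the accuracy after elimination in this sketch can never fall below the baseline. On a dataset this small the numbers carry no meaning beyond showing the mechanics.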