DNNRec: A novel deep learning based hybrid recommender system

https://doi.org/10.1016/j.eswa.2019.113054

Highlights

  • A novel hybrid deep learning based recommender system ‘DNNRec’ is proposed.

  • DNNRec leverages embeddings and combines side information with a very deep network.

  • DNNRec addresses the cold start case and learns non-linear latent factors.

  • Proposed solution is benchmarked against existing methods on accuracy and run time.

  • DNNRec outperforms state-of-the-art methods overall and in cold start case.

Abstract

We propose a novel deep learning hybrid recommender system to address the gaps in collaborative filtering systems and achieve state-of-the-art predictive accuracy using deep learning. While collaborative filtering systems are popular, with many state-of-the-art achievements in recommender systems, they suffer from the cold start problem when there is no history for users and items. Further, the latent factors learned by these methods are linear in nature. To address these gaps, we describe a novel hybrid recommender system using deep learning. The solution uses embeddings to represent users and items and to learn non-linear latent factors. It alleviates the cold start problem by integrating side information about users and items into a very deep neural network. The proposed solution uses a decreasing learning rate in conjunction with an increasing weight decay, with the values cyclically varied across epochs to further improve accuracy. The proposed solution is benchmarked against existing methods on both predictive accuracy and running time. Predictive accuracy is measured by Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and R-squared. Running time is measured by the mean and standard deviation across seven runs. Comprehensive experiments are conducted on several datasets, namely MovieLens 100K, FilmTrust, Book-Crossing and MovieLens 1M. The results show that the proposed technique outperforms existing methods in both non-cold start and cold start cases. The outperformance on four different datasets indicates that the proposed solution framework is generic and can be leveraged for other rating-prediction datasets in recommender systems.

Introduction

Recommender systems are increasingly used for suggesting movies, music, videos, e-commerce products and other items. They accelerate the search process for users and help businesses maximize their sales (Lee et al., 2014). Given the research focus on recommender systems and the business benefits of higher predictive accuracy, there has been an increasing focus on building better solutions. Recommender systems predict the future preference for a set of items for a user either as a rating, as a binary score, or as a ranked list of items. Popular recommender systems like the MovieLens recommender system, Amazon and Netflix express the user preference as a numeric rating. When the rating is numeric, metrics like Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Squared Error (MSE) or R-squared are used to measure the predictive accuracy of the recommender system. When the problem is one of ranking, metrics like Normalized Discounted Cumulative Gain (NDCG) are used. The scope of this paper is the ratings prediction case of recommender systems, and the goal is to make the predictions as accurate as possible.

Along with predictive accuracy, the other challenges in recommender systems include cold start (Lam, Vu & Le, 2008), sparsity and candidate generation (Covington, Adams & Sargin, 2016). One of the important problems in recommender systems is the cold start problem: making recommendations when there are no prior interactions available for a user or an item. The cold start problem has been classified into user cold start and item cold start, which refer to cases of insufficient examples of users and items respectively (Bernardi, Kamps, Kiseleva & Mueller, 2015).

Over the last few years, several techniques have been developed and benchmarked on publicly available datasets. The approaches in the literature to solve the recommender system problem have been classified into Content-Based (CB), Collaborative Filtering (CF) based and hybrid methods. Popular categories include heuristic methods, matrix factorization based collaborative filtering methods, neighborhood based collaborative filtering methods and machine learning methods. To address the cold start problem, CB features extracted for users and items are used as inputs to several machine learning techniques like regularized regression and tree-based methods. Given the strategic importance of recommender systems in both research and business, and the algorithmic advancements, improving predictive accuracy is an important task. Netflix released more than 100 million customer-generated movie ratings as part of the Netflix Prize competition, with a goal to reduce the RMSE on the test data by 10 percent and a prize of USD 1 million to the first team to reach that goal (Netflix, 2009).

Deep learning methods have seen huge success in computer vision, beating the state-of-the-art results in many computer vision applications (Lecun, Bengio & Hinton, 2015). Deep learning methods are a branch of machine learning that build neural networks with complicated architectures (such as multiple hidden layers, loops, or convolutions with shared weights and pooling layers). The building blocks of deep learning are neural networks, which are highly flexible function approximators. Deep learning methods also have the ability to do all-purpose parameter fitting through advanced stochastic gradient descent algorithms, while achieving speed and scalability using Graphics Processing Units (GPUs). The last few years have seen increasing performance and availability of GPUs for the parallel matrix computations that are at the core of deep learning. The availability of frameworks like PyTorch (Facebook), TensorFlow (Google) and MXNet (Apache) has accelerated the use of deep learning methods. There are different types of deep learning methods, such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Network (RNN) variants like Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, and unsupervised methods like autoencoders and Generative Adversarial Networks (GANs) (Lecun et al., 2015). All of these have enabled rapid adoption of and research in deep learning systems. Deep learning systems are now increasingly being applied to structured datasets and form an area of active research. There have been several advances in deep learning on speeding up running time by effectively finding optimal learning rates (Smith, 2017a), learning complex features from the data with minimal manual intervention (Araque, Corcuera-Platas, Sánchez-Rada & Iglesias, 2017), and modern stochastic gradient methods that can find minima more efficiently to deliver state-of-the-art results.

In this light, the motivation for this study is to seek answers to the following questions:

  1. Can the predictive accuracy in comparison to the existing CF techniques be improved by the non-linear latent features engineered from the embeddings?

  2. Can the performance in the cold-start case benefit from the side information about the users and items that are incorporated into the deep neural network?

  3. Can the proposed deep neural network work across different datasets for ratings prediction so that it can be a generic approach to solve recommender system problems in the ratings prediction case?

  4. Given that learning rate and weight decay greatly influence the convergence of a DNN, how can they be optimized for highest predictive accuracy?

  5. What is the tradeoff in running time of deep learning-based recommender systems vs. existing methods?

In this paper, a new deep learning-based hybrid recommender system is proposed. The solution makes four major contributions. First, it alleviates the cold start problem by incorporating side information about users and items into a DNN wherever such auxiliary information is available. Second, it retains the benefits of traditional matrix factorization techniques by learning latent factors and builds on them to learn non-linear latent factors as well, using embeddings. Third, it achieves higher predictive accuracy by using cyclical learning rates and decaying weights across epochs; this variation ensures that the solution does not get stuck in a local minimum, as explained in the solution section. Fourth, it shows that the solution is viable on both the optimizing criteria (RMSE, R-squared, MSE and MAE) and the satisficing criteria (mean and standard deviation of running times across seven runs). The performance is measured in both the overall and cold start cases on four datasets, namely MovieLens 100K, FilmTrust, Book-Crossing and MovieLens 1M. Experimental results show that the proposed solution beats the existing benchmarks and alleviates the cold start problem. Recommender systems are intelligent systems that find application not only in e-commerce but in several other industries. Our paper presents an approach to address the gaps in the collaborative filtering approach regarding non-linear latent factors and performance in cold start cases. We design a recommender system to address the gaps in CF approaches and test it on four different datasets, improving performance. We address different facets of building a more accurate recommender system solution using deep learning.
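As a concrete illustration of the third contribution, the following is a minimal sketch (not the authors' released code) of how a learning rate that decreases within each cycle can be paired with a weight decay that increases, with both varied cyclically across epochs. The model, data loader, cycle length and schedule bounds are assumptions made for illustration only.

```python
# Illustrative sketch: cyclically vary a decreasing learning rate together with
# an increasing weight decay across epochs. All names and values are placeholders.
import math
import torch

def cyclical_schedule(epoch, cycle_len=10, lr_max=1e-2, lr_min=1e-4,
                      wd_min=1e-5, wd_max=1e-3):
    """Return (lr, weight_decay) for the given epoch.

    Within each cycle the learning rate decays from lr_max to lr_min while the
    weight decay grows from wd_min to wd_max (cosine interpolation)."""
    t = (epoch % cycle_len) / max(cycle_len - 1, 1)   # position within the cycle, 0..1
    cos = 0.5 * (1 + math.cos(math.pi * t))           # goes from 1 down to 0
    lr = lr_min + (lr_max - lr_min) * cos             # decreasing learning rate
    wd = wd_max - (wd_max - wd_min) * cos             # increasing weight decay
    return lr, wd

def train(model, train_loader, n_epochs=30):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-5)
    loss_fn = torch.nn.MSELoss()
    for epoch in range(n_epochs):
        lr, wd = cyclical_schedule(epoch)
        for group in optimizer.param_groups:          # apply the per-epoch schedule
            group["lr"], group["weight_decay"] = lr, wd
        for users, items, side, ratings in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(users, items, side), ratings)
            loss.backward()
            optimizer.step()
```

The cyclical restart of the schedule is what the text refers to when it says the variation helps the optimizer escape poor local minima.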

The remainder of this paper is organized as follows. Section 2 provides a taxonomy of related recommender system methods that are subsequently benchmarked against; it also describes related work on recommender systems and the use of deep learning on structured data and recommender systems. Section 3 outlines the proposed deep learning solution DNNRec and its theoretical framework, describing its key components. Section 4 details the experimental setup, comprising the hardware, software packages, datasets and validation set creation. Section 5 summarizes the experimental results on four different datasets, namely MovieLens 100K, FilmTrust, MovieLens 1M and Book-Crossing, and highlights the superiority of the proposed solution against existing methods. Finally, Section 6 outlines the conclusions from the work and opportunities for future research.

Section snippets

Background and related work

The dominant approaches in recommender systems are broadly classified into three categories, namely Content-Based (CB), Collaborative Filtering (CF) and hybrid methods. CB methods utilize the content of the items to create features to match user profiles. These methods leverage user features and item features to predict the ratings using several machine learning methods. CB methods can include context-based recommender systems (Aggarwal, 2016) that use space and time features of the users and …

DNNRec: a hybrid deep learning-based recommender system

In this section, the proposed DNNRec recommender system algorithm is described in detail. A recommender system is designed to generate a recommended rating for each user and item pair. Consider U users, of which U_CS are cold start users (U_CS ⊂ U), and I items, of which I_CS are cold start items (I_CS ⊂ I). Let N be the number of ratings. Rating r_{i,j,t} denotes the rating by user i on item j at time t. Table 1 presents the list of abbreviations and notations used in this paper with their …
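To make the setup concrete, the following is a minimal sketch, under assumptions about naming and layer sizes (the class name DNNRecNet, the embedding width and the hidden-layer widths are illustrative, not taken from the paper), of a network that concatenates user and item embeddings with side-information features and passes them through a deep MLP to predict the rating.

```python
# Illustrative sketch (not the published implementation): user/item embeddings
# plus side information fed to a deep MLP that outputs a predicted rating.
import torch
import torch.nn as nn

class DNNRecNet(nn.Module):
    def __init__(self, n_users, n_items, side_dim, emb_dim=32,
                 hidden=(256, 128, 64, 32)):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)   # latent factors for users
        self.item_emb = nn.Embedding(n_items, emb_dim)   # latent factors for items
        layers, in_dim = [], 2 * emb_dim + side_dim
        for h in hidden:                                 # a deep stack of hidden layers
            layers += [nn.Linear(in_dim, h), nn.ReLU(), nn.Dropout(0.2)]
            in_dim = h
        layers.append(nn.Linear(in_dim, 1))              # predicted rating
        self.mlp = nn.Sequential(*layers)

    def forward(self, user_idx, item_idx, side_features):
        x = torch.cat([self.user_emb(user_idx),
                       self.item_emb(item_idx),
                       side_features], dim=1)            # side info supports cold start
        return self.mlp(x).squeeze(1)
```

Because the embeddings are trained end to end through the non-linear MLP, the latent factors they encode are not restricted to the linear interactions of classical matrix factorization.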

Experimental setup and test bed

In this section, we outline the computing environment, namely the hardware, software packages, datasets used and the validation dataset creation methodology. A brief exploratory analysis of the characteristics of the different datasets considered for experimentation is included. The evaluation criteria and the implementation of the competing techniques are also outlined in this section.
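As an illustration of how a validation split and the cold start cases could be derived from a ratings table, the sketch below uses pandas. The column names, the hold-out fraction and the random split are assumptions for illustration and not necessarily the protocol described in Section 4.

```python
# Generic sketch of a validation split with cold-start flagging; names and the
# 80/20 ratio are assumptions, not the paper's exact methodology.
import numpy as np
import pandas as pd

def split_ratings(ratings: pd.DataFrame, val_frac=0.2, seed=42):
    """Hold out val_frac of the ratings for validation and flag validation rows
    whose user or item never appears in the training set (cold start cases)."""
    rng = np.random.default_rng(seed)
    val_mask = rng.random(len(ratings)) < val_frac
    train, val = ratings[~val_mask].copy(), ratings[val_mask].copy()
    cold_user = ~val["user_id"].isin(train["user_id"])
    cold_item = ~val["item_id"].isin(train["item_id"])
    val["cold_start"] = cold_user | cold_item
    return train, val
```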

Results and discussion

The experimental results on the different datasets are detailed in this section. Table 7 shows the RMSE, MSE, MAE and R-squared values for the FilmTrust, MovieLens 100K, Book-Crossing and MovieLens 1M datasets. As we see, DNNRec outperforms the other techniques on all four datasets. For the ML100K dataset, the MSE, RMSE, MAE and R-squared of DNNRec are 1.2%, 6.27%, 0.7% and 2.5% better than the next best technique respectively. For the FilmTrust dataset, the values are 2%, 1%, 0.5% and 7.8% …
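The four reported accuracy metrics can be computed from predicted and true ratings as in the short sketch below, using scikit-learn; the function and array names are placeholders, not part of the paper's code.

```python
# Sketch: compute the four accuracy metrics reported in the results tables.
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate(y_true, y_pred):
    mse = mean_squared_error(y_true, y_pred)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": mean_absolute_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
    }
```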

Conclusion and future scope of work

In this work, a hybrid recommender system using a deep learning (DL) approach was proposed. The proposed solution leverages a complex architecture with a very deep neural network, uses embeddings to learn non-linear latent factors for users and items, combines the deep learning features with side information available about users and items to create a hybrid system, and tunes the learning rates and weight decay (regularization) for comparable running times and prevents overfitting through a …

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (43)

  • G. Adomavicius et al. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering (2005).

  • C.C. Aggarwal. An introduction to recommender systems. Recommender Systems (2016).

  • O. Araque et al. Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Systems with Applications (2017).

  • A. Balakrishnan. DeepPlaylist: Using recurrent neural networks to predict song similarity (2014).

  • T. Bansal et al. Ask the GRU.

  • J. Basiri et al. Alleviating the cold-start problem of recommender systems using a new hybrid approach.

  • L. Bernardi et al. The continuous cold start problem in e-commerce recommender systems.

  • L. Bottou. Large-scale machine learning with stochastic gradient descent.

  • F. Braida et al. Transforming collaborative filtering into supervised learning. Expert Systems with Applications (2015).

  • R. Catherine et al. TransNets.

  • P. Covington et al. Deep neural networks for YouTube recommendations.

  • Y.N. Dauphin et al. RMSProp and equilibrated adaptive learning rates for non-convex optimization. NIPS '15 (2015).

  • A. De Brébisson et al. Artificial neural networks applied to taxi destination prediction.

  • H.H. Do et al. Deep learning for aspect-based sentiment analysis: A comparative review. Expert Systems with Applications (2019).

  • Y. Goldberg et al. word2vec parameter learning explained: Continuous bag-of-word model.

  • C.A. Gomez-Uribe et al. The Netflix recommender system: Algorithms, business value, and innovation. ACM Transactions on Management Information Systems (2015).

  • G. Guo et al. A novel Bayesian similarity measure for recommender systems. IJCAI International Joint Conference on Artificial Intelligence (2013).

  • F.M. Harper et al. The MovieLens datasets. ACM Transactions on Interactive Intelligent Systems (2015).

  • X. He et al. Neural collaborative filtering (NCF). WWW (2017).

  • D. Jannach et al. What recommenders recommend: An analysis of recommendation biases and possible countermeasures. User Modeling and User-Adapted Interaction (2015).

  • D.P. Kingma et al. Adam: A method for stochastic optimization. International Conference on Learning Representations (2015).