Revisiting where are the hard knapsack problems? via Instance Space Analysis

https://doi.org/10.1016/j.cor.2020.105184

Abstract

In 2005, David Pisinger asked the question “where are the hard knapsack problems?”. Noting that the classical benchmark test instances were limited in difficulty due to their selected structure, he proposed a set of new test instances for the 0–1 knapsack problem with characteristics that made them more challenging for dynamic programming and branch-and-bound algorithms. This influential work highlighted the importance of test instance diversity for drawing reliable conclusions about algorithm performance. In this paper, we revisit the question in light of recent methodological advances – in the form of Instance Space Analysis – enabling the strengths and weaknesses of algorithms to be visualised and assessed across the broadest possible space of test instances. We show where the hard instances lie, and objectively assess algorithm performance across the instance space to articulate the strengths and weaknesses of algorithms. Furthermore, we propose a method to fill the instance space with diverse and challenging new test instances with controllable properties to support greater insights into algorithm selection, and drive future algorithmic innovations.

Introduction

It has long been recognised that rigorous evaluation of algorithm performance is a topic deserving greater attention than it typically receives (Hooker, 1995). The standard practice of reporting “on average” performance of algorithms across a given test suite, without adequately justifying that the chosen test instances are fit for purpose, risks misleading conclusions based on biased experimental results (McGeoch, 2002). This approach also offers little insight into the unique strengths and weaknesses of algorithms for particular types of test instances with different characteristics that are potentially hidden within a test suite average.

Of course, to enable fair comparisons between studies, it is important to continue the standard practice of adopting common benchmarks, inherited and shared between researchers. However, it is essential that we establish whether these benchmark test instances have the kind of properties that can support valid conclusions; namely that they are demonstrably diverse, unbiased, representative of intended applications, discriminating and challenging for a wide variety of algorithms. If we establish that the inherited benchmarks are lacking in any of these properties, we need methods to generate new test instances with the required characteristics to support rigorous performance evaluation (Hall and Posner, 2010), and drive new algorithm development. It is clear that new methodologies to evaluate both the suitability of test instance benchmarks, and the comparative performance of algorithms – more insightful and nuanced than simple “on average” reporting – are still needed, despite calls for the development of a more empirical science of algorithms over 25 years ago (Hooker, 1994).

In recent years, a new approach – known as Instance Space Analysis (Muñoz et al., 2018, Muñoz and Smith-Miles, 2020, Smith-Miles et al., 2014, Smith-Miles and Bowly, 2015) – has been proposed to answer this call. Test instances from a variety of sources, whether they are randomly generated, real-world, or classical benchmarks, can be visualised in a 2D projection of the entire space of possible test instances. The instance space is constructed by summarising each test instance as a high-dimensional feature vector of metrics that capture the intrinsic hardness of a test instance using a combination of problem-dependent and problem-independent difficulty measures (Smith-Miles and Lopes, 2012). Utilising the upper and lower bounds of each feature, the boundary of the possible test instance space can be projected to a 2D plane using dimension reduction methods, and the location of existing test instances can be scrutinised in the instance space to establish their diversity, unbiasedness and real-world-likeness. Superimposing algorithm performance metrics across the instance space offers the opportunity to infer, using machine learning methods applied to the experimental data, the broader region where good performance can be statistically expected from each algorithm. This region is known as the algorithm footprint, and its area is an objective measure of comparative algorithmic power across the broadest possible test instance space. Unique regions of strength and weakness can be identified for each algorithm, and the features defining these regions can be explored to gain valuable insights into the conditions under which each algorithm is expected to perform well or poorly. Furthermore, this view of the available test instances offers the opportunity to recognise where the current benchmarks provide inadequate coverage of the instance space, and where the generation of new test instances would significantly augment our ability to understand algorithm performance across a wide range of test scenarios. By locating target points in the instance space where no test instances currently exist, evolutionary algorithms can be used to fill the instance space with evolved test instances with controllable properties, in a manner that is often not possible to achieve by manipulating instance generator parameters (Muñoz and Smith-Miles, 2020, Smith-Miles and Bowly, 2015).
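The core workflow can be sketched in a few lines: each instance is summarised as a feature vector, the features are standardised, and a dimension reduction method maps every instance to a point in a 2D plane. The sketch below uses PCA purely as a stand-in for the optimised projection employed by the ISA methodology, with randomly generated feature vectors as placeholder data; it illustrates the idea rather than the published implementation.

```python
# A minimal sketch of the instance-space idea: summarise each instance as a
# feature vector, standardise, and project to 2D. The ISA methodology uses a
# purpose-built optimised projection (in the MATLAB toolkit); PCA is used
# here only as a simple stand-in to illustrate the workflow.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

def project_instances(feature_matrix: np.ndarray) -> np.ndarray:
    """Rows are instances, columns are hardness features; returns 2D coordinates."""
    z = StandardScaler().fit_transform(feature_matrix)   # put features on a common scale
    return PCA(n_components=2).fit_transform(z)          # stand-in for the ISA projection

# Example: 500 hypothetical instances described by 8 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 8))
coords = project_instances(features)                     # one (x, y) point per instance
```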

Instance Space Analysis has now been successfully applied to a wide variety of combinatorial optimisation problems (Smith-Miles et al., 2014, Smith-Miles and Lopes, 2012), as well as continuous optimisation (Muñoz and Smith-Miles, 2017, Muñoz and Smith-Miles, 2020), supervised classification (Muñoz et al., 2018), time series forecasting (Kang et al., 2017), and anomaly detection (Kandanaarachchi et al., 2019). It is applicable to any field where there are algorithms developed and evaluated on suites of test instances, and the characteristics of those test instances can be adequately described with features that suggest their intrinsic hardness. The tools to support such analysis and insights are publicly available as a MATLAB toolbox (Muñoz et al., 2020), and with a web-based user interface known as MATILDA (Smith-Miles et al., 2019).

In this paper we apply Instance Space Analysis for the first time to provide new insights into the 0–1 Knapsack Problem (0–1KP). This binary optimisation problem involves deciding which of a finite set of items should be included in a knapsack, given the weight and profit of each item, with the goal of maximising the total profit of the selected items while respecting a fixed capacity constraint on their total weight. It has long been acknowledged that the classical instance classes used to test and compare algorithms for 0–1KP cover only a limited portion of the potential problem space, and many of these instance classes contain instances that are not considered difficult to solve for most algorithms (Pisinger and Toth, 1998, Pisinger, 2005, Hill and Reilly, 2000). As the algorithms designed to solve 0–1KP have grown in sophistication, and computational resources have become more powerful, the need for more difficult instance classes to augment the classical test sets and illustrate the strength of new algorithms has become more pressing (Martello et al., 1999). More difficult instances can be produced from classical instance classes by increasing the number of items and the magnitudes of the coefficients (Pisinger, 2005). However, this approach to generating harder test instance classes does not adequately test an algorithm’s ability to deal with instances that are relatively small but are difficult to solve on account of their structural properties.
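To fix notation, the sketch below states the problem as code and solves it with a textbook dynamic programme over the capacity (pseudo-polynomial time). The algorithms compared in this paper are considerably more sophisticated; the example is intended only to pin down the problem definition.

```python
# A minimal sketch of the 0-1 knapsack problem and a textbook dynamic
# programme over capacity (pseudo-polynomial, O(n*c) time). The solvers
# compared in the paper are far more sophisticated; this only fixes notation.
def knapsack_dp(profits, weights, capacity):
    """Return the maximum total profit of items fitting within `capacity`."""
    best = [0] * (capacity + 1)                 # best[c] = best profit using capacity c
    for p, w in zip(profits, weights):
        for c in range(capacity, w - 1, -1):    # iterate downwards so each item is used at most once
            best[c] = max(best[c], best[c - w] + p)
    return best[capacity]

# Small example: three items with profits [6, 10, 12], weights [1, 2, 3], capacity 5.
print(knapsack_dp([6, 10, 12], [1, 2, 3], 5))   # -> 22 (take the second and third items)
```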

In response to this inadequacy of the classical instance classes, several new ones have been proposed in the last two decades which seek to explore the problem space more thoroughly. Martello et al. (1999) brought together and refined instance classes and ideas from earlier studies (Amado and Barcia, 1993, Martello and Toth, 1997, Pisinger and Toth, 1998, Chvátal, 1980, Pferschy et al., 1997) to more comprehensively test the capabilities of the powerful COMBO algorithm (Martello et al., 1999). Hill and Reilly (2000) studied the complex interplay between the profit and weight correlations, and the resulting correlation structure between the objective function and the constraint slackness, demonstrating the impact on algorithm performance when this is varied beyond random correlations. Following this, Pisinger (2005) designed several new instance classes which frequently produce relatively difficult instances for all known algorithms.
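For concreteness, the sketch below generates instances from three commonly cited classical classes (uncorrelated, weakly correlated and strongly correlated), broadly following the generator descriptions in Pisinger (2005). The coefficient range and the rule for setting the capacity are illustrative assumptions, not the exact settings used in the experiments reported in this paper.

```python
# A sketch of three commonly cited classical 0-1KP instance classes,
# broadly following the generators described by Pisinger (2005). The
# parameter choices here (coefficient range R, capacity set to half the
# total weight) are illustrative assumptions only.
import random

def generate_instance(n, R=1000, kind="uncorrelated", seed=None):
    rng = random.Random(seed)
    weights = [rng.randint(1, R) for _ in range(n)]
    if kind == "uncorrelated":
        profits = [rng.randint(1, R) for _ in range(n)]
    elif kind == "weakly_correlated":
        profits = [max(1, w + rng.randint(-R // 10, R // 10)) for w in weights]
    elif kind == "strongly_correlated":
        profits = [w + R // 10 for w in weights]
    else:
        raise ValueError(f"unknown instance class: {kind}")
    capacity = sum(weights) // 2    # illustrative; published generators often use a series of capacities
    return profits, weights, capacity

profits, weights, capacity = generate_instance(100, kind="strongly_correlated", seed=1)
```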

A primary goal of the Instance Space Analysis presented in this paper is to improve our understanding of the underlying similarities between the instance classes proposed by Pisinger and the harder classical instance classes; in essence, why these instances are particularly hard. We also investigate the differences between hard instance classes which make them difficult in different ways, or to varying degrees for different algorithms. The diversity of all published 0–1KP test instance suites is explored via an instance space construction, and a new set of test instances are generated to fill the instance space. In this manner, we provide a visual answer to Pisinger’s 2005 question: “Where are the hard knapsack problems?”.

In order to construct an instance space for 0–1KP, this paper builds upon previous efforts to identify features summarising instance difficulty, much of this work having already been exploited by research on automated algorithm selection methods for 0–1KP (Hall and Posner, 2007). In addition to drawing upon the established literature, we propose and test some new features of 0–1KP that help to explain algorithm performance. Earlier efforts to characterise the difficulty of 0–1KP can be traced back to 1980, when Balas and Zemel (1980) defined a measure of knapsack difficulty based on a combination of the gap between the optimal and linear relaxation solutions and the range of item efficiencies. Chung et al. (1988) found that, for a specific formulation of the strongly-correlated instance class, difficulty is strongly affected by several key instance features: the weight of the least heavy item, the quantity added to each item’s weight to define its profit, and the capacity of the knapsack. Hall and Posner (2007) also identified several significant features and applied them in an algorithm selection framework to decide between a branch-and-bound algorithm and a dynamic programming algorithm, the resulting meta-algorithm being termed KPCHOICE. Our selected features, based on the existing literature and some novel metrics, are described later in this paper.
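As an illustration of the kind of features involved, the sketch below computes two candidate hardness measures inspired by Balas and Zemel (1980): the spread of item efficiencies, and the Dantzig (linear relaxation) upper bound obtained by filling items in efficiency order and taking a fraction of the break item. These are illustrative examples only; the feature set actually used in this study is described later in the paper.

```python
# Two illustrative hardness features inspired by Balas and Zemel (1980):
# the spread of item efficiencies (profit/weight), and the Dantzig
# (LP-relaxation) upper bound. Not necessarily the exact features used in
# the paper's instance space.
def efficiency_range(profits, weights):
    eff = [p / w for p, w in zip(profits, weights)]
    return max(eff) - min(eff)

def dantzig_bound(profits, weights, capacity):
    items = sorted(zip(profits, weights), key=lambda pw: pw[0] / pw[1], reverse=True)
    bound, remaining = 0.0, capacity
    for p, w in items:
        if w <= remaining:              # take the whole item
            bound += p
            remaining -= w
        else:                           # break item: take the fractional part and stop
            bound += p * remaining / w
            break
    return bound
```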

The remainder of this paper is structured as follows. In Section 2 we present the Instance Space Analysis framework as applied to the 0–1KP. Specifically, we discuss the 0–1KP problem in its general form, before describing the various classes of test instance benchmarks considered in this paper. We present three state-of-the-art algorithms to illustrate the comparative study, and the measures used to evaluate their performance. Finally, the experimental dataset – known as the “meta-data” – is completed with the calculation of a comprehensive set of features to describe the test instances. Once the rationale for these features is presented at the end of Section 2, the entire set of meta-data for the 0–1KP study is completely described. In Section 3 we construct an initial instance space using this meta-data, enabling the existing benchmark test instances to be projected and visualised in a 2D plane. We demonstrate where the hard test instances are located for various instance classes found in the literature, and assess the diversity of the suite of test instances within the theoretical boundary of the instance space. Considering algorithm performance, we then describe and predict the algorithm footprints for our three chosen algorithms. Combining these machine learning efforts, we are also able to perform automated algorithm selection to identify which algorithm is recommended for different regions of the instance space. The opportunity to generate additional test instances to fill the instance space is the focus of Section 4, where several methods are introduced to achieve a more comprehensive set of test instances that reach to the boundaries of the instance space and fill interior gaps. The augmented meta-data, with the original benchmarks and our newly generated test instances, is then used to construct a more comprehensive instance space in Section 5 from which conclusions are drawn in Section 6.

Section snippets

Instance Space Analysis

The foundations for Instance Space Analysis (ISA) are firmly grounded in the Algorithm Selection Framework developed by Rice (1976) and inspired by the No-Free Lunch theorems of Wolpert and Macready (1997). Central to the methodology is the construction of an instance space whereby test instances are represented as points in a 2D plane, with the region of predicted good performance of an algorithm being its footprint. Moreover, through ISA we can identify regions where additional test instances

Constructing an initial instance space

The Instance Space Analysis Toolkit is a MATLAB-based set of tools that facilitate the construction of an instance space (Muñoz et al., 2020). The toolkit contains both an automated data processing pipeline, and functions that perform specific stages of the analysis. The implementation details for constructing an initial 0–1KP instance space are described in this section, with full reproducibility enabled by the meta-data and code available from https://matilda.unimelb.edu.au/matilda/problems/opt/knapsack.

Generating new instances

The existing classes of knapsack instances available in the literature cover a wide range of problem instance characteristics, but do not necessarily provide a representative sample of all possible knapsack instances. Attempting to create a fully representative set of test instances is likely to be both infeasible in terms of the number of instances required, and arguably unnecessary given the set of possible knapsack instances may include many which are unlikely to be encountered in practical
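One simple way to picture the targeted generation of new instances is a mutate-and-select loop that nudges an instance toward a chosen target location in the projected instance space. The sketch below is a deliberately simplified stand-in for the evolutionary approach used later in the paper: `features_of` and `project` are hypothetical placeholders for the feature calculation and the fitted projection, and the mutation scheme is an assumption for illustration only.

```python
# A much-simplified sketch of targeted instance generation: repeatedly mutate
# an instance and keep the variant whose projected 2D location is closest to a
# chosen target point. The paper uses an evolutionary algorithm for this step;
# `features_of` and `project` are hypothetical stand-ins for the feature
# calculation and the fitted instance-space projection.
import math
import random

def evolve_towards(instance, target_xy, features_of, project, iters=1000, seed=0):
    rng = random.Random(seed)

    def distance(inst):
        x, y = project(features_of(inst))
        return math.hypot(x - target_xy[0], y - target_xy[1])

    best, best_d = instance, distance(instance)
    for _ in range(iters):
        profits, weights, capacity = best
        j = rng.randrange(len(profits))                 # perturb one item at a time
        cand_profits, cand_weights = list(profits), list(weights)
        cand_profits[j] = max(1, cand_profits[j] + rng.randint(-5, 5))
        cand_weights[j] = max(1, cand_weights[j] + rng.randint(-5, 5))
        cand = (cand_profits, cand_weights, capacity)
        d = distance(cand)
        if d < best_d:                                  # greedy acceptance of improvements
            best, best_d = cand, d
    return best
```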

Updating the instance space

As mentioned in Section 2, the ISA methodology is iterative, i.e., an initial instance space is created and explored based on currently available meta-data, as shown in Section 3. Gaps in the instance space can then be examined to assess regions where new instances can increase the diversity and coverage of the instance space, as was indicated in Section 4. Now, with these new instances added to I, a different set of features may best describe the algorithms’ performance and the 2D axes best

Conclusions

This paper has provided a visual approach to answer the long-standing question “where are the hard knapsack instances?”. Using the recently developed methodology of Instance Space Analysis, we have shown how the strengths and weaknesses of three algorithms for the 0–1KP variant of knapsack problems can be compared objectively across the space of possible test instances, rather than losing valuable information by summarising “on-average” performance. We have demonstrated the areas of strength

CRediT authorship contribution statement

Kate Smith-Miles: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Supervision, Project administration, Funding acquisition. Jeffrey Christiansen: Formal analysis, Software, Investigation, Data curation, Writing - original draft, Visualization. Mario Andrés Muñoz: Formal analysis, Methodology, Investigation, Validation, Writing - original draft, Visualization.

Acknowledgements

We are grateful to the two reviewers and editors for their valuable suggestions. Funding was provided by the Australian Research Council through grant FL140100012. The authors are grateful to Samuel Fairchild for his assistance with feature calculations, and Dr. Neelofar for her work on the development of the MATILDA online tool for Instance Space Analysis available at https://matilda.unimelb.edu.au

References (36)

  • C.-S. Chung et al., A hard knapsack problem, Naval Res. Logist. (NRL) (1988)

  • V. Chvátal, Hard knapsack problems, Oper. Res. (1980)

  • N.G. Hall et al., Performance prediction and preselection for optimization and heuristic solution procedures, Oper. Res. (2007)

  • N.G. Hall et al., The generation of experimental data for computational testing in optimization (2010)

  • R.R. Hill et al., The effects of coefficient correlation structure in two-dimensional knapsack problems on solution procedure performance, Manage. Sci. (2000)

  • J. Hooker, Needed: An empirical science of algorithms, Oper. Res. (1994)

  • J. Hooker, Testing heuristics: We have it all wrong, J. Heuristics (1995)

  • S. Kandanaarachchi et al., On normalization and algorithm selection for unsupervised outlier detection, Data Mining Knowl. Discov. (2019)