Production, Manufacturing and Logistics
An approximate dynamic programming approach for the vehicle routing problem with stochastic demands

https://doi.org/10.1016/j.ejor.2008.03.023

Abstract

This paper examines approximate dynamic programming algorithms for the single-vehicle routing problem with stochastic demands from a dynamic or reoptimization perspective. The methods extend the rollout algorithm by implementing different base sequences (i.e. a priori solutions), look-ahead policies, and pruning schemes. The paper also considers computing the cost-to-go with Monte Carlo simulation in addition to direct approaches. The best new method found is a two-step lookahead rollout started with a stochastic base sequence. Its routing cost is about 4.8% lower than that of the one-step rollout algorithm started with a deterministic sequence. Results also show that Monte Carlo cost-to-go estimation reduces computation time by 65% in large instances with little or no loss in solution quality. Moreover, the paper compares results to the perfect information case, obtained by solving sampled vehicle routing problems exactly a posteriori. The confidence interval for the overall mean difference is (3.56%, 4.11%).

Introduction

The classical, deterministic vehicle routing problem (VRP) seeks minimum cost routes from a depot to a geographically dispersed set of customers with known demands. In this problem, all vehicles start and end their routes at the depot, each customer’s demand is satisfied by exactly one vehicle, and vehicle capacities are not exceeded. The VRP has received extensive attention in the literature. Bertsimas and Simchi-Levi (1996) and Toth and Vigo (2002) review exact approaches, algorithms, and relaxations for the VRP.

Stochastic vehicle routing problems (SVRP’s) result when one or more VRP elements are random variables. Random elements might be the set of customers, the travel times, or the customers’ demands. Gendreau et al. (1996) summarize the literature on various SVRP’s. Dror et al. (1989) indicate that optimal solution properties for the VRP do not hold for the SVRP. Further, Gendreau et al. (1995, 1999), Laporte et al. (2002), and Ichoua et al. (2006) show that VRP’s combining stochastic, integer, and in some cases dynamic elements (i.e. elements varying over time) call for complex solution methodologies.

This paper studies the Vehicle Routing Problem with Stochastic Demands (VRPSD). In this problem, customers’ demands follow known probability distributions and a customer’s actual demand is only revealed when the vehicle arrives at the customer location. The VRPSD’s goal is to minimize total expected route cost. VRPSD’s occur in practice when delivering petroleum products, industrial gases (Chepuri and Homem-De-Mello, 2005), and home heating oil (Dror et al., 1985). Other VRPSD’s arise when delivering products to cities under emergency (Dessouky et al., 2005), hospitals, restaurants, vending machines (Yang et al., 2000), and bank branches. Random demands are also present when collecting money (Laporte et al., 1989), packages (Marković et al., 2005), sludge, and recycled materials from banks, homes, and industrial plants.

Most VRPSD research assumes an a priori solution approach (Bertsimas, 1992, Teodorovic and Pavkovic, 1992, Gendreau et al., 1995, Savelsbergh and Goetschalckx, 1995, Hjorring and Holt, 1999, Laporte et al., 2002, Bianchi et al., 2004, Novoa et al., 2006). In the first stage, complete a priori routes are designed before any actual demands become known. In the second stage, routes are followed, demands are revealed, and extra trips to the depot for replenishment are performed if a customer’s demand exceeds current vehicle capacity. The order of the routes is not changed. The objective is to find a route sequence that minimizes the total expected cost of the originally planned distance plus the extra trips to and from the depot. To reduce costs further, Bertsimas et al. (1995) and Yang et al. (2000) design a priori routes that may prescribe returns to the depot before vehicle capacity is depleted (i.e. proactive returns).
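The second stage of the a priori approach can be sketched as a short simulation. The snippet below is a minimal illustration, not the implementation of any cited paper: the function name `a_priori_cost`, the Euclidean coordinates, the partial-service recourse (serve what is on board, refill at the depot, return to finish the customer), and the assumption that no single demand exceeds the capacity are all choices made here for the example.

```python
import math

def a_priori_cost(sequence, demands, coords, capacity):
    """Cost of following a fixed sequence for one realized demand vector.

    Assumed recourse (illustrative): when a demand exceeds the load on board,
    the vehicle delivers what it has, makes a round trip to the depot (node 0)
    to refill, and finishes serving the same customer.
    Assumes every single demand is at most `capacity`.
    """
    def dist(a, b):
        return math.dist(coords[a], coords[b])

    load, cost, prev = capacity, 0.0, 0
    for cust in sequence:
        cost += dist(prev, cust)          # planned leg of the route
        need = demands[cust]              # demand revealed on arrival
        if need > load:                   # route failure: extra depot trip
            need -= load
            cost += 2 * dist(cust, 0)
            load = capacity
        load -= need
        prev = cust
    return cost + dist(prev, 0)           # final return to the depot

coords = {0: (0, 0), 1: (3, 4), 2: (6, 8)}
print(a_priori_cost([1, 2], {1: 4, 2: 5}, coords, 10))  # no failure
print(a_priori_cost([1, 2], {1: 6, 2: 7}, coords, 10))  # failure at customer 2
```

In the second call the planned distance is unchanged, and the whole cost increase comes from the round trip between customer 2 and the depot.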

This paper assumes a dynamic solution approach that models the problem in multiple stages. Other authors (Dror et al., 1989, Bertsimas, 1992, Dror, 1993, Secomandi, 2001, Laporte et al., 2002, Secomandi and Margot, in review) call this approach “reoptimization”. In this approach, routing decisions occur concurrently with service and are based on the most current system state. The system state updates every time the vehicle arrives at a location and observes demand. There is no planned route; the decisions at each stage are which customer to visit next and whether or not to send the vehicle to the depot for replenishment, so as to minimize expected routing costs.

Powell et al. (1995) mention that stochastic and dynamic models are key for designing decision support systems that respond to the changing conditions often observed in practical applications. Further, Psaraftis (1995) indicates that dynamic approaches respond to the need for efficient real-time logistics and that they are implementable due to advances in communications technologies such as wireless phones and global positioning systems (GPS), which facilitate interaction between drivers and dispatchers. Bastian and Rinnooy Kan (1992) and Psaraftis (1995) mention that the dynamic or reoptimization approach is computationally challenging but results in flexible routes that may reduce total routing costs. Erera et al. (2007) favor fixed routes from a priori approaches since they may decrease management costs, increase driver performance, and achieve service regularity. Nevertheless, Savelsbergh and Goetschalckx (1995) present instances with a 10% increase in transportation cost from using fixed routes instead of reoptimization.

This paper examines the rollout algorithm as an efficient heuristic method for solving the dynamic single-vehicle VRPSD in real time. The rollout algorithm, originally proposed by Bertsekas and Tsitsiklis (1996), Bertsekas et al. (1997), and Bertsekas (2000, 2001), overcomes the curse of dimensionality in dynamic programming (DP). The only previous computational approaches applying rollout to the VRPSD are Secomandi (1998, 2000, 2001), Novoa (2005), and Secomandi and Margot (in review). Secomandi’s contribution in regard to rollout methods is the development of a one-step rollout algorithm.
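The general idea of a one-step rollout decision can be sketched as follows. This is a simplified illustration under stated assumptions rather than the algorithm of the paper or of Secomandi: each candidate next customer is scored by the average cost, over a given list of sampled demand scenarios, of serving the remaining customers in a rotated copy of the current order that starts at the candidate. The function names, the rotation rule, the scenario averaging, and the omission of proactive depot returns are all assumptions of this sketch.

```python
def follow(seq, start, load, demand, d, Q):
    """Cost of serving `seq` in order from node `start` with `load` on board,
    using a reactive depot round trip on failure, then returning to depot 0."""
    cost, prev = 0.0, start
    for c in seq:
        cost += d[prev][c]
        need = demand[c]
        if need > load:                 # stock-out: refill and come back
            need -= load
            cost += 2 * d[c][0]
            load = Q
        load -= need
        prev = c
    return cost + d[prev][0]

def rollout_next(start, load, remaining, scenarios, d, Q):
    """Pick the next customer by one-step lookahead over a base policy."""
    best, best_cost = None, float("inf")
    for i, c in enumerate(remaining):
        base = remaining[i:] + remaining[:i]   # rotated base sequence, c first
        est = sum(follow(base, start, load, s, d, Q) for s in scenarios)
        est /= len(scenarios)                  # estimated cost-to-go via c
        if est < best_cost:
            best, best_cost = c, est
    return best, best_cost

# Nodes on a line: depot at 0, customers 1, 2, 3 at -1, 1, 2 (illustrative data).
pos = [0, -1, 1, 2]
d = [[abs(a - b) for b in pos] for a in pos]
scenario = {1: 1, 2: 1, 3: 2}                  # a single sampled demand vector
print(rollout_next(0, 3, [1, 2, 3], [scenario], d, 3))  # → (2, 8.0)
```

In a full dynamic policy this decision would be repeated at every customer after the actual demand is observed, with the remaining set and load updated accordingly.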

Our paper has three main contributions. The first is the development of a two-step rollout algorithm that provides 1.6% cheaper solutions than the one-step rollout algorithm. The second is the use of Monte Carlo simulation (MCS) for computing the expected cost of the updated base sequences as an alternative to the exact computation in Secomandi (1998, 2000, 2001). We demonstrate that MCS may reduce the total computational time by about 65% for large instances. The third contribution is the development of improved base sequences and pruning schemes leading to a cost reduction of about 4% over previous methods. The best rollout method demonstrates the benefits of linking a priori and dynamic approaches for the VRPSD.
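The Monte Carlo idea can be illustrated with a small estimator of a base sequence's expected cost. This is a hedged sketch, not the paper's procedure: the names `realized_cost` and `mc_expected_cost`, the `support` dictionary of equally likely demand values per customer, the sample size, and the reactive recourse rule are assumptions introduced for the example.

```python
import random

def realized_cost(seq, demand, d, Q):
    """Cost of a fixed sequence for one demand realization (reactive returns)."""
    load, cost, prev = Q, 0.0, 0
    for c in seq:
        cost += d[prev][c]
        need = demand[c]
        if need > load:                 # failure: round trip to depot 0
            need -= load
            cost += 2 * d[c][0]
            load = Q
        load -= need
        prev = c
    return cost + d[prev][0]

def mc_expected_cost(seq, support, d, Q, n_samples=1000, seed=0):
    """Average realized cost over independently sampled demand vectors.

    `support[c]` lists the equally likely demand values of customer c
    (an illustrative discrete-uniform assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        demand = {c: rng.choice(support[c]) for c in seq}
        total += realized_cost(seq, demand, d, Q)
    return total / n_samples

pos = [0, 1, 2]                                   # depot and two customers on a line
d = [[abs(a - b) for b in pos] for a in pos]
print(mc_expected_cost([1, 2], {1: [1, 2], 2: [1, 2]}, d, Q=3))
```

For this tiny instance the exact expected cost is 5 (three of the four equally likely demand pairs cost 4 and one costs 8), so the printed estimate should be close to that value. The estimator trades a small sampling error for a running time that grows with the number of samples rather than with the size of the demand state space.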

Paper organization is as follows. Section 2 reviews literature on dynamic approaches. Section 3 describes the problem. Section 4 presents the dynamic programming formulation. Section 5 describes the proposed rollout algorithms and the Monte Carlo simulation approach. Section 6 contains numerical results and Section 7 concludes the paper.

Literature review

There are few papers on dynamic approaches to the VRPSD relative to those studying the a priori approach. The earliest contributions are the theoretical models in Dror et al. (1989) and Dror (1993), which model the dynamic VRPSD as a Markov decision process. However, these papers do not provide any computational results. Secomandi (1998, 2000, 2001, 2003) is the first author to provide a computationally tractable heuristic.

Secomandi (2000) solves the VRPSD using two

Problem description

A single vehicle with fixed capacity Q departs from a depot to perform only deliveries (or only pick-ups) at different customer locations. Node 0 represents the depot and I = {1, 2, ..., n} represents the set of customers. Distances between customers i and j, denoted by d(i,j), are known, symmetric, and satisfy the triangle inequality. Travel costs are proportional to distances.
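The distance assumptions can be checked mechanically on instance data. The following is a small sketch under the assumption that nodes are points in the plane with Euclidean distances (the text above only requires symmetry and the triangle inequality); the helper names are hypothetical.

```python
import math
from itertools import product

def distance_matrix(coords):
    """Euclidean distances d(i, j) between all node pairs (node 0 is the depot)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def is_symmetric_metric(d, tol=1e-9):
    """True if d is symmetric and satisfies the triangle inequality."""
    n = len(d)
    symmetric = all(abs(d[i][j] - d[j][i]) <= tol
                    for i, j in product(range(n), repeat=2))
    triangle = all(d[i][j] <= d[i][k] + d[k][j] + tol
                   for i, j, k in product(range(n), repeat=3))
    return symmetric and triangle

d = distance_matrix([(0, 0), (3, 4), (6, 0)])
print(is_symmetric_metric(d))                                   # Euclidean data
print(is_symmetric_metric([[0, 1, 5], [1, 0, 1], [5, 1, 0]]))   # 5 > 1 + 1
```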

In the computational study, customers’ demands follow known discrete distributions. They are assumed statistically independent

Problem formulation

The dynamic VRPSD can be formulated as a stochastic shortest path (SSP) problem. SSP’s are Markov decision models that reach an absorbing cost-free termination state in a random number of stages and have a discount factor of one. The formulation in this section closely follows those in Novoa (2005) and Secomandi (1998, 2000, 2001).
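For orientation, the optimality condition of a generic SSP with discount factor one can be written as below. The symbols are assumed here for illustration and are not necessarily the paper's notation: J^* is the optimal cost-to-go, U(x) the set of feasible decisions in state x, w the random disturbance (here, the revealed demand), g the stage cost, f the state transition, and x_T the cost-free termination state.

```latex
J^*(x) \;=\; \min_{u \in U(x)} \; \mathbb{E}_{w}\!\left[\, g(x,u,w) + J^*\big(f(x,u,w)\big) \right],
\qquad J^*(x_T) = 0 .
```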

Let x_k = (l, q_l, r_1, ..., r_n) be an array with n+2 components that represents the system state at decision stage k. l ∈ {0, 1, ..., n} is the current vehicle

Rollout algorithms

We introduce this section with a brief description of the rollout algorithm (RA) proposed by Bertsekas et al. (1997) and Bertsekas (2000, 2001) and implemented by Secomandi (1998, 2000, 2001) to solve the VRPSD dynamically. Then, we divide the section into four subsections, which describe the elements of the RA that we study further to extend the work in Secomandi (1998, 2000, 2001).

RA is an approximate method to solve large DP problems. RA assumes that a

Numerical results

This section compares the different RA’s studied using an analysis of variance (ANOVA) procedure and assesses the gap between RA’s and a posteriori or perfect information solutions. All the methods compared were coded in C++ and run on a 2.4 GHz Pentium IV with 512 MB of memory. The section starts with a brief description of the procedures used to generate the instances and to label the different algorithms compared.

Conclusions

The paper’s contribution is the development of efficient rollout algorithms (RA’s) for solving the single-vehicle VRPSD under a dynamic approach. RA is a type of policy iteration where single or multiple initial suboptimal base policies are sequentially improved. The RA’s studied extend the computational work on single-stage RA in Secomandi (2001) and the theoretical work on RA’s in Bertsekas and Tsitsiklis (1996), Bertsekas et al. (1997), and Bertsekas (2000, 2001). Developed RA’s consider only

References (49)

  • D.P. Bertsekas et al. Rollout algorithms for stochastic scheduling problems. Journal of Heuristics (1999)
  • D.P. Bertsekas et al. Neuro-Dynamic Programming (1996)
  • D.P. Bertsekas et al. Rollout algorithms for combinatorial optimization. Journal of Heuristics (1997)
  • D.J. Bertsimas. A vehicle routing problem with stochastic demand. Operations Research (1992)
  • D.J. Bertsimas et al. Computational approaches to stochastic vehicle routing problems. Transportation Science (1995)
  • D.J. Bertsimas et al. A new generation of vehicle routing research: Robust algorithms, addressing uncertainty. Operations Research (1996)
  • L. Bianchi et al. Metaheuristics for the vehicle routing problem with stochastic demands. Parallel problem solving from nature PPSN VIII (2004)
  • K. Chepuri et al. Solving the vehicle routing problem with stochastic demands using the cross entropy method. Annals of Operations Research (2005)
  • D.P. de Farias et al. The linear programming approach to approximate dynamic programming. Operations Research (2003)
  • M.M. Dessouky et al. Rapid Distribution of Medical Supplies
  • M. Dror et al. A computational comparison of algorithms for the inventory routing problem. Annals of Operations Research (1985)
  • M. Dror et al. Vehicle routing with stochastic demands: Properties and solution frameworks. Transportation Science (1989)
  • Erera, A.L., Savelsbergh, M., Uyar, E., 2007. Fixed routes with backup vehicles for stochastic vehicle routing problems...
  • M. Gendreau et al. Parallel tabu search for real-time vehicle routing and dispatching. Transportation Science (1999)