Abstract
Wormhole routing has emerged as the most widely used switching technique in massively parallel computers. We present a detailed survey of various techniques for enhancing the performance and reliability of wormhole-routing schemes in directly connected networks. We start with an overview of the direct network topologies and a comparison of various switching techniques. Next, the characteristics of the wormhole routing mechanism are described in detail along with the theory behind deadlock-free routing. The performance of routing algorithms depends on the selection of the path between the source and the destination, the network traffic, and the router design. The routing algorithms are implemented in the router chips. We outline the router characteristics and describe the functionality of various elements of the router. Depending on the usage of paths between the source and the destination, routing algorithms are classified as deterministic, fully adaptive, and partially adaptive. We discuss several representative algorithms for all these categories. The algorithms within each category vary in terms of resource requirements and performance under various traffic conditions. The main difference among various adaptive routing schemes is the technique used to avoid deadlocks. We also discuss a few algorithms based on deadlock recovery techniques. Along with performance, fault tolerance is essential for message routing in multicomputers, and we thus discuss several fault-tolerant wormhole routing algorithms along with their fault-handling capabilities. These routing schemes enable a message to reach its destination even in the presence of faults in the network. The implementation details of wormhole routing algorithms in contemporary commercial systems are also discussed. We conclude by itemizing several future directions and open issues.
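As an illustration of the deterministic category mentioned above, here is a minimal sketch of dimension-order (XY) routing on a 2D mesh. It is not taken from the survey. The function name `xy_route`, the `(x, y)` coordinate convention, and the hop-by-hop path representation are assumptions chosen for illustration. A message first corrects its X offset completely and only then its Y offset, so every source-destination pair always uses the same single path.

```python
from typing import List, Tuple

# Hypothetical node type: (x, y) coordinates in a 2D mesh.
Node = Tuple[int, int]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Return the nodes visited from src to dst, X dimension first, then Y.

    Illustrative sketch of deterministic dimension-order routing;
    it models only path selection, not flits, buffers, or channels.
    """
    x, y = src
    path = [src]
    # Phase 1: move along X until the column matches the destination.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along Y until the row matches the destination.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because the path is fixed by the source and destination alone, a sketch like this cannot route around congestion or faults. The adaptive algorithms the survey covers relax that restriction.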
Index Terms
- Wormhole routing techniques for directly connected multicomputer systems