Abstract
An architecture for improving computer performance is presented and discussed. The main feature of the architecture is a high degree of decoupling between operand access and execution. This results in an implementation which has two separate instruction streams that communicate via queues. A similar architecture has been previously proposed for array processors, but in that context the software is called on to do most of the coordination and synchronization between the instruction streams. This paper emphasizes implementation features that remove this burden from the programmer. Performance comparisons with a conventional scalar architecture are given, and these show that considerable performance gains are possible.
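The queue-coupled organization described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `run_decoupled`, the queue names, and the element-wise multiply workload are all assumptions chosen for illustration. The access stream performs every memory reference, the execute stream touches only queues, and the two phases run one after another, which corresponds to unbounded queues with the access stream running arbitrarily far ahead. Real hardware would run both streams concurrently with bounded queues.

```python
from collections import deque


def run_decoupled(a, b):
    """Compute c[i] = a[i] * b[i] using two queue-coupled streams.

    All names here are hypothetical; this is a software analogy,
    not a model of any specific machine.
    """
    memory = {"a": list(a), "b": list(b), "c": [None] * len(a)}
    load_q = deque()        # access -> execute: operands read from memory
    store_data_q = deque()  # execute -> access: results awaiting a store
    store_addr_q = deque()  # store addresses issued ahead of the data

    # Access stream: generates addresses and issues loads; it never computes.
    for i in range(len(a)):
        load_q.append(memory["a"][i])
        load_q.append(memory["b"][i])
        store_addr_q.append(i)

    # Execute stream: consumes operands in FIFO order; it never forms addresses.
    while load_q:
        x = load_q.popleft()
        y = load_q.popleft()
        store_data_q.append(x * y)

    # Access stream again: pairs each pending store address with its data.
    while store_data_q:
        memory["c"][store_addr_q.popleft()] = store_data_q.popleft()

    return memory["c"]


if __name__ == "__main__":
    print(run_decoupled([1, 2, 3], [4, 5, 6]))
```

Because operands arrive in program order through a FIFO, the execute stream needs no addresses at all, which is the sense in which operand access and execution are decoupled in this sketch.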
Single-instruction-stream versions, both physical and conceptual, are discussed with the primary goal of minimizing the differences from conventional architectures. This would allow known compilation and programming techniques to be used. Finally, the problem of deadlock in such a system is discussed, and one possible solution is given.
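To show how deadlock can arise between two queue-coupled streams, the following sketch steps two small programs made of `send` and `recv` operations and reports when neither can advance. It does not reproduce the solution given in the paper. The function `simulate`, the operation names, and the unbounded-queue assumption are all hypothetical.

```python
from collections import deque


def simulate(access_prog, execute_prog):
    """Step two queue-coupled streams; return "done" or "deadlock".

    Each program is a list of "send" / "recv" operations. A "recv" blocks
    while its incoming queue is empty. Queues are unbounded in this sketch,
    so a "send" never blocks.
    """
    a_to_e, e_to_a = deque(), deque()
    progs = {"A": access_prog, "E": execute_prog}
    out_q = {"A": a_to_e, "E": e_to_a}
    in_q = {"A": e_to_a, "E": a_to_e}
    pc = {"A": 0, "E": 0}

    while True:
        progressed = False
        for s in ("A", "E"):
            if pc[s] >= len(progs[s]):
                continue  # this stream has finished
            op = progs[s][pc[s]]
            if op == "send":
                out_q[s].append(1)
            elif op == "recv":
                if not in_q[s]:
                    continue  # blocked on an empty queue
                in_q[s].popleft()
            else:
                raise ValueError(f"unknown operation: {op}")
            pc[s] += 1
            progressed = True

        if all(pc[s] >= len(progs[s]) for s in progs):
            return "done"
        if not progressed:
            return "deadlock"  # no unfinished stream can advance


if __name__ == "__main__":
    print(simulate(["send", "recv"], ["recv", "send"]))
    print(simulate(["recv", "send"], ["recv", "send"]))
```

In the second call each stream waits to receive before it sends, so both queues stay empty and neither stream can proceed. That circular wait is the kind of situation a deadlock remedy has to prevent or break.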
Decoupled access/execute computer architectures
ISCA '82: Proceedings of the 9th annual symposium on Computer Architecture