3D-MAPS: 3D Massively Parallel Processor with Stacked Memory

Design for High Performance, Low Power, and Reliable 3D Integrated Circuits

Abstract

This chapter describes the architecture, design, and analysis of the 3D-MAPS (3D massively parallel processor with stacked memory) chip, together with simulation and measurement results. The chip is built in a 1.5 V, 130 nm process technology and a two-tier 3D stacking technology that uses through-silicon vias (TSVs) 1.2 µm in diameter and 6 µm in height, and face-to-face bonding pads 3.4 µm in diameter. 3D-MAPS consists of a core tier containing 64 cores and a memory tier containing 256 KB of SRAM. Each core communicates with its dedicated 4 KB SRAM through face-to-face bonding pads. The maximum feasible memory bandwidth of 3D-MAPS is 70.9 GB/s at 277 MHz, while the data transfer delay between the core tier and the memory tier is negligible. The maximum operating frequency is 277 MHz, the peak measured memory bandwidth usage is 63.8 GB/s, and the peak measured power is approximately 4 W, based on eight parallel benchmarks.
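The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant `BYTES_PER_CORE_PER_CYCLE = 4` (one 32-bit word per core per clock cycle) is an assumption that the abstract does not state, chosen because it reproduces the quoted 70.9 GB/s. The function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
NUM_CORES = 64            # from the abstract
SRAM_PER_CORE_KB = 4      # from the abstract
CLOCK_HZ = 277e6          # from the abstract (277 MHz)
MEASURED_PEAK_GBPS = 63.8 # from the abstract

# ASSUMPTION (not stated in the abstract): each core moves one 32-bit
# word, i.e. 4 bytes, to or from its SRAM on every clock cycle.
BYTES_PER_CORE_PER_CYCLE = 4


def total_sram_kb(cores=NUM_CORES, kb_per_core=SRAM_PER_CORE_KB):
    """Total memory-tier capacity in KB."""
    return cores * kb_per_core


def peak_bandwidth_gbps(cores=NUM_CORES,
                        bytes_per_cycle=BYTES_PER_CORE_PER_CYCLE,
                        clock_hz=CLOCK_HZ):
    """Aggregate bandwidth in GB/s (1 GB = 1e9 bytes) if every core
    transfers bytes_per_cycle bytes on every cycle."""
    return cores * bytes_per_cycle * clock_hz / 1e9


def utilization(measured_gbps, peak_gbps):
    """Fraction of the peak bandwidth actually used."""
    return measured_gbps / peak_gbps


if __name__ == "__main__":
    peak = peak_bandwidth_gbps()
    print(f"Total SRAM: {total_sram_kb()} KB")
    print(f"Peak bandwidth: {peak:.1f} GB/s")
    print(f"Utilization: {utilization(MEASURED_PEAK_GBPS, peak):.1%}")
```

Under that assumption, 64 cores × 4 KB matches the 256 KB memory tier, 64 × 4 B × 277 MHz matches the quoted maximum bandwidth, and the measured 63.8 GB/s corresponds to roughly 90% of that maximum.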

The materials presented in this chapter are based on [7].

Notes

  1. The footprint area given to us as part of the 2009 DARPA/Tezzaron 3D IC multi-project wafer run was 5 × 5 mm, and the core size was chosen so that the maximum number of cores would fit.

  2. We do not perform on-chip thermal analysis, mainly because our processor is low power, consuming up to 4 W as shown in Sect. 20.8. We observe that our package-level solution, a simple air-cooled heatsink as shown in Fig. 20.13, is enough to keep the processor temperature low.

References

  1. L.E. Cannon, A cellular computer to implement the Kalman Filter algorithm. PhD thesis, Montana State University, 1969

  2. G.V. der Plas, P. Limaye, I. Loi, A. Mercha, H. Oprins, C. Torregiani, S. Thijs, D. Linten, M. Stucchi, G. Katti, D. Velenis, V. Cherman, B. Vandevelde, V. Simons, I.D. Wolf, R. Labie, D. Perry, S. Bronckers, N. Minas, M. Cupac, W. Ruythooren, J.V. Olmen, A. Phommahaxay, M. de Potter de ten Broeck, A. Opdebeeck, M. Rakowski, B.D. Wachter, M. Dehan, M. Nelis, R. Agarwal, A. Pullini, F. Angiolini, L. Benini, W. Dehaene, Y. Travaly, E. Beyne, P. Marchal, Design issues and considerations for low-cost 3D TSV IC technology. IEEE J. Solid-State Circuits 46(1), 293–307 (2011)

  3. X. Dong, Y. Xie, System-level cost analysis and design exploration for three-dimensional integrated circuits (3D ICs), in Proceedings of Asia and South Pacific Design Automation Conference, Pacifico Yokohama, Jan 2009, pp. 234–241

  4. B. Hohlt, Pthread parallel K-means, 2001

  5. U. Kang, H.-J. Chung, S. Heo, S.-H. Ahn, H. Lee, S.-H. Cha, J. Ahn, D. Kwon, J.H. Kim, J.-W. Lee, H.-S. Joo, W.-S. Kim, H.-K. Kim, E.-M. Lee, S.-R. Kim, K.-H. Ma, D.-H. Jang, N.-S. Kim, M.-S. Choi, S.-J. Oh, J.-B. Lee, T.-K. Jung, J.-H. Yoo, C. Kim, 8Gb 3D DDR3 DRAM using through-silicon-via technology, in Proceedings of IEEE International Solid-State Circuits Conference, San Francisco, Feb 2009, pp. 130–131

  6. Y. Kikuchi, M. Takahashi, T. Maeda, H. Hara, H. Arakida, H. Yamamoto, Y. Hagiwara, T. Fujita, M. Watanabe, T. Shimazawa, Y. Ohara, T. Miyamori, M. Hamada, M. Takahashi, Y. Oowaki, A 222mW H.264 full-HD decoding application processor with x512b stacked DRAM in 40nm, in Proceedings of IEEE International Solid-State Circuits Conference, San Francisco, Feb 2010, pp. 326–327

  7. D.H. Kim, K. Athikulwongse, M.B. Healy, M.M. Hossain, M. Jung, I. Khorosh, G. Kumar, Y.-J. Lee, D.L. Lewis, T.-W. Lin, C. Liu, S. Panth, M. Pathak, M. Ren, G. Shen, T. Song, D.H. Woo, X. Zhao, J. Kim, H. Choi, G.H. Loh, H.-H. S. Lee, S.K. Lim, 3D-MAPS: 3D massively parallel processor with stacked memory, in Proceedings of IEEE International Solid-State Circuits Conference, San Francisco, 2012

  8. J.-S. Kim, C.S. Oh, H. Lee, D. Lee, H.R. Hwang, S. Hwang, B. Na, J. Moon, J.-G. Kim, H. Park, J.-W. Ryu, K. Park, S.K. Kang, S.-Y. Kim, H. Kim, J.-M. Bang, H. Cho, M. Jang, C. Han, J.-B. Lee, J.S. Choi, Y.-H. Jun, A 1.2 V 12.8 GB/s 2 Gb mobile wide-I/O DRAM with 4 × 128 I/Os using TSV based stacking. IEEE J. Solid-State Circuits 47(1), 107–116 (2012)

  9. M. Koyanagi, Y. Nakagawa, K.-W. Lee, T. Nakamura, Y. Yamada, K. Inamura, K.-T. Park, H. Kurino, Neuromorphic vision chip fabricated using three-dimensional integration technology, in Proceedings of IEEE International Solid-State Circuits Conference, San Francisco, Feb 2001, pp. 270–271

  10. C.C. Liu, I. Ganusov, M. Burtscher, S. Tiwari, Bridging the processor-memory performance gap with 3D IC technology. IEEE Des. Test Comput. 22(6), 556–564 (2005)

  11. G.H. Loh, 3D-Stacked memory architectures for multi-core processors, in Proceedings of IEEE International Symposium on Computer Architecture, Beijing, June 2008, pp. 453–464

  12. National Institute of Standards and Technology, Advanced encryption standard (AES), http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf

  13. H. Saito, M. Nakajima, T. Okamoto, Y. Yamada, A. Ohuchi, N. Iguchi, T. Sakamoto, K. Yamaguchi, M. Mizuno, A chip-stacked memory for on-chip SRAM-rich SoCs and processors. IEEE J. Solid-State Circuits 45(1), 15–22 (2010)

  14. D.H. Woo, N.H. Seong, D.L. Lewis, H.-H.S. Lee, An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth, in Proceedings of IEEE International Symposium on High-Performance Computer Architecture, Bangalore, Jan 2010

  15. Xilinx, Implementing median filters in XC4000E FPGAs

  16. J. Zhao, X. Dong, Y. Xie, Cost-aware three-dimensional (3D) many-core multiprocessor design, in Proceedings of ACM Design Automation Conference, Anaheim, June 2010, pp. 126–131

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Lim, S.K. (2013). 3D-MAPS: 3D Massively Parallel Processor with Stacked Memory. In: Design for High Performance, Low Power, and Reliable 3D Integrated Circuits. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9542-1_20

  • DOI: https://doi.org/10.1007/978-1-4419-9542-1_20

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4419-9541-4

  • Online ISBN: 978-1-4419-9542-1

  • eBook Packages: Engineering (R0)
