Abstract
Effective memory utilization is critical to reaping the benefits of the multi-core processors emerging in embedded systems. In this paper we explore the use of a stream model to exploit memory hierarchies. We target image processing algorithms running on the Analog Devices Blackfin BF561, a fixed-point, dual-core DSP. Using optimized assembly to keep the cores busy reduces runtime, but it also underscores the need to mitigate the memory bottleneck. Like other embedded processors, the Blackfin BF561 has L2 SRAM available. Applying the stream model lets us make full use of both cores and the L2 SRAM. We achieve almost a 10X speedup in execution time compared to non-optimized C code.
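As a rough illustration of the stream idea summarized above, the following portable C sketch processes an image in fixed-size strips that are staged through two small ping-pong buffers before a kernel runs on them. It is not the authors' implementation and uses no Blackfin-specific features. The names `STRIP_ROWS`, `MAX_WIDTH`, `fetch_strip`, `gradient_kernel`, and `stream_gradient` are hypothetical, `memcpy` merely stands in for a DMA transfer into fast memory, and the kernel (a horizontal gradient magnitude) is an assumed example chosen because each strip can be processed independently. The dual-core split is not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define STRIP_ROWS 4   /* hypothetical strip height, in rows */
#define MAX_WIDTH  64  /* hypothetical maximum row width, in pixels */

/* Two ping-pong buffers standing in for fast local memory (assumption). */
static uint8_t strip_buf[2][STRIP_ROWS * MAX_WIDTH];

/* Stand-in for a DMA transfer; on real hardware this could run
 * asynchronously while the previous strip is being processed. */
static void fetch_strip(uint8_t *dst, const uint8_t *src, size_t bytes)
{
    memcpy(dst, src, bytes);
}

/* Number of rows in strip s (the last strip may be shorter). */
static int rows_in_strip(int s, int height)
{
    int left = height - s * STRIP_ROWS;
    return left < STRIP_ROWS ? left : STRIP_ROWS;
}

/* Kernel: |p[x+1] - p[x-1]| per pixel; border columns are set to 0. */
static void gradient_kernel(const uint8_t *in, uint8_t *out,
                            int width, int rows)
{
    for (int y = 0; y < rows; ++y) {
        const uint8_t *r = in + (size_t)y * width;
        uint8_t *o = out + (size_t)y * width;
        o[0] = 0;
        o[width - 1] = 0;
        for (int x = 1; x < width - 1; ++x)
            o[x] = (uint8_t)abs((int)r[x + 1] - (int)r[x - 1]);
    }
}

/* Stream the image through the ping-pong buffers, one strip at a time. */
void stream_gradient(const uint8_t *image, uint8_t *result,
                     int width, int height)
{
    assert(width >= 1 && width <= MAX_WIDTH && height >= 1);

    int strips = (height + STRIP_ROWS - 1) / STRIP_ROWS;

    /* Prime the pipeline with the first strip. */
    fetch_strip(strip_buf[0], image,
                (size_t)rows_in_strip(0, height) * width);

    for (int s = 0; s < strips; ++s) {
        /* Stage the next strip into the other buffer before computing. */
        if (s + 1 < strips) {
            size_t offset = (size_t)(s + 1) * STRIP_ROWS * width;
            fetch_strip(strip_buf[(s + 1) & 1], image + offset,
                        (size_t)rows_in_strip(s + 1, height) * width);
        }
        /* Run the kernel on the strip that is already staged. */
        gradient_kernel(strip_buf[s & 1],
                        result + (size_t)s * STRIP_ROWS * width,
                        width, rows_in_strip(s, height));
    }
}
```

In this sequential sketch the prefetch and the kernel do not actually overlap; the structure only shows where an asynchronous transfer would be issued so that data movement could be hidden behind computation.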
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benjamin, M.G., Kaeli, D. (2007). Stream Image Processing on a Dual-Core Embedded System. In: Vassiliadis, S., Bereković, M., Hämäläinen, T.D. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2007. Lecture Notes in Computer Science, vol 4599. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73625-7_17
DOI: https://doi.org/10.1007/978-3-540-73625-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73622-6
Online ISBN: 978-3-540-73625-7
eBook Packages: Computer Science (R0)