Synthesis and comparison of low-power high-throughput architectures for SAD calculation

Analog Integrated Circuits and Signal Processing

Abstract

Video applications are increasingly present in consumer electronic devices, which demand low power and low energy consumption. The Sum of Absolute Differences (SAD) is the most widely used distortion metric in video coding implementations and occupies a relatively large area in motion estimation hardware. This paper presents the standard-cell synthesis and a comprehensive analysis of several parallel hardware architecture alternatives for SAD calculation, focusing on different design constraints such as high performance (maximum throughput) and the tradeoff between high performance and low power dissipation (an isoperformance target). Low-power techniques supported by commercial standard-cell tools are exercised in this design, such as clock gating, multi-threshold voltage (VT) cells, and a combination of slow and fast standard cells. We achieved significant power reduction for the architectures with lower frequencies and higher parallelism, with slow cells, and above all with only one pipeline stage.
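As a rough illustration of the metric and of the isoperformance idea mentioned in the abstract, the following Python sketch computes the SAD of two pixel blocks and models a datapath that consumes several sample pairs per clock cycle. It is a behavioral model only, not the architectures synthesized in the paper. The names `sad`, `sad_parallel`, `required_frequency_hz`, and the `lanes` parameter are hypothetical, and the model assumes one group of `lanes` sample pairs is absorbed per cycle.

```python
def sad(current, reference):
    """Sum of absolute differences between two equally sized pixel blocks."""
    if len(current) != len(reference):
        raise ValueError("blocks must have the same number of samples")
    return sum(abs(c - r) for c, r in zip(current, reference))


def sad_parallel(current, reference, lanes):
    """Behavioral model of a SAD datapath with `lanes` parallel inputs.

    Assumption (not from the paper): each clock cycle absorbs up to
    `lanes` sample pairs and adds their partial SAD to an accumulator.
    Returns the final SAD and the number of cycles the model needed.
    """
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    if len(current) != len(reference):
        raise ValueError("blocks must have the same number of samples")
    total = 0
    cycles = 0
    for start in range(0, len(current), lanes):
        # Partial SAD of the sample pairs handled in this cycle.
        total += sad(current[start:start + lanes],
                     reference[start:start + lanes])
        cycles += 1
    return total, cycles


def required_frequency_hz(samples_per_second, lanes):
    """Clock frequency needed to sustain a fixed sample rate.

    Under the one-group-per-cycle assumption above, the required clock
    frequency falls in proportion to the number of parallel lanes.
    """
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    return samples_per_second / lanes


if __name__ == "__main__":
    cur = [10, 20, 30, 40]
    ref = [12, 18, 33, 40]
    print(sad(cur, ref))
    print(sad_parallel(cur, ref, lanes=2))
    print(required_frequency_hz(100e6, lanes=4))
```

In this simplified model, doubling `lanes` halves the clock frequency required for the same throughput. That is the kind of lower-frequency, higher-parallelism operating point the abstract associates with reduced power. The model says nothing about area, switching activity, or pipeline depth, which the paper evaluates through synthesis.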

Corresponding author

Correspondence to Cláudio Machado Diniz.

Cite this article

Walter, F.L., Diniz, C.M. & Bampi, S. Synthesis and comparison of low-power high-throughput architectures for SAD calculation. Analog Integr Circ Sig Process 73, 873–884 (2012). https://doi.org/10.1007/s10470-012-9971-z

  • DOI: https://doi.org/10.1007/s10470-012-9971-z
