DOI: 10.1145/3474085.3475550
Research Article

No-Reference Video Quality Assessment with Heterogeneous Knowledge Ensemble

Published: 17 October 2021

ABSTRACT

Blind assessment of video quality remains challenging even in the deep learning era. The limited number of samples in existing databases is insufficient for learning a good feature extractor for video quality assessment (VQA), while manually labeling a larger database with subjective perception is labor-intensive and time-consuming. To alleviate this difficulty, we first collect 3589 high-quality video clips as references and build a large VQA dataset. The dataset contains more than 300K samples degraded by various distortion types arising from compression and transmission errors, and provides weak labels for each distorted sample via several full-reference VQA algorithms. To learn effective representations from the weakly labeled data, we mitigate the bias of any single weak label (i.e., single knowledge source) by learning from multiple heterogeneous knowledge sources. To this end, we propose a novel no-reference VQA (NR-VQA) method with HEterogeneous Knowledge Ensemble (HEKE). Compared to learning from a single knowledge source, HEKE can theoretically reach a lower infimum and learns a richer representation owing to the heterogeneity. Extensive experimental results show that the proposed HEKE outperforms existing NR-VQA methods and achieves state-of-the-art performance. The source code will be available at https://github.com/Sissuire/BVQA-HEKE.
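
As a concrete illustration of this weak-supervision setup, the sketch below shows one plausible HEKE-style training step, assuming a PyTorch implementation: a shared video backbone is regressed against the pseudo-scores of several full-reference metrics simultaneously, with one head per knowledge source. This is a minimal sketch, not the authors' released code (see the repository above); the module and variable names, the toy 3D-CNN backbone, and the plain MSE objective are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): train one shared video feature
# extractor against several heterogeneous weak labels at once. Each
# "knowledge source" is a pseudo-score from a different full-reference
# VQA metric; `HekeSketch` and `fr_scores` are hypothetical names.
import torch
import torch.nn as nn

class HekeSketch(nn.Module):
    def __init__(self, num_sources: int, feat_dim: int = 128):
        super().__init__()
        # Stand-in backbone: a tiny 3D CNN over (B, C, T, H, W) clips.
        self.backbone = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(16, feat_dim),
            nn.ReLU(),
        )
        # One regression head per weak-label source (one per FR metric).
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, 1) for _ in range(num_sources)]
        )

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(clips)
        # (B, num_sources): each column predicts one metric's pseudo-score.
        return torch.cat([head(feats) for head in self.heads], dim=1)

def heke_loss(preds: torch.Tensor, fr_scores: torch.Tensor) -> torch.Tensor:
    """Average regression loss over all heterogeneous knowledge sources."""
    return nn.functional.mse_loss(preds, fr_scores)

if __name__ == "__main__":
    model = HekeSketch(num_sources=3)      # e.g., three FR-VQA metrics
    clips = torch.randn(4, 3, 8, 64, 64)   # 4 clips, 8 frames of 64x64
    fr_scores = torch.randn(4, 3)          # weak labels from the 3 metrics
    loss = heke_loss(model(clips), fr_scores)
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```

The key design point the sketch mirrors is that all heads share one backbone, so the representation is shaped by every knowledge source at once rather than biased toward any single weak label.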


Published in

MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021, 5796 pages
ISBN: 9781450386517
DOI: 10.1145/3474085
Copyright © 2021 ACM


        Publisher

Association for Computing Machinery, New York, NY, United States


        Acceptance Rates

Overall Acceptance Rate: 995 of 4,171 submissions, 24%
