DOI: 10.1145/3477911.3477922
research-article

Unsupervised Construction of Task-Specific Datasets for Object Re-identification

Published: 15 October 2021

ABSTRACT

Over the last decade, deep neural networks have risen to prominence in image processing and many other research areas. However, while various neural architectures have successfully solved numerous tasks, they demand ever more processing time and training data. Moreover, the current trend of using existing pre-trained architectures merely as backbones and attaching new processing branches on top not only increases this demand but also diminishes the explainability of the whole model.

Our research focuses on combining explainable building blocks for image processing tasks such as object tracking. We propose combining Mask R-CNN, a state-of-the-art neural network for object detection and segmentation, with our previously published method of sparse feature tracking [16]. This combination allows us to track objects by connecting detected masks using the proposed sparse feature tracklets. However, the method cannot recover from complete object occlusions and has to be assisted by object re-identification.
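
The idea of connecting detected masks through shared feature tracklets can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data layout (masks as sets of pixel coordinates, tracklets as per-frame point dictionaries), the function name `link_masks`, the `min_votes` parameter, and the greedy vote-based matching are all assumptions made for the example.

```python
from collections import Counter


def link_masks(masks_a, masks_b, tracklets, frame_a, frame_b, min_votes=1):
    """Link masks in frame_a to masks in frame_b via shared feature tracklets.

    masks_a, masks_b: lists of sets of (x, y) pixels (hypothetical layout).
    tracklets: list of dicts mapping frame index -> (x, y) feature position.
    Returns a dict {index in masks_a: index in masks_b}.
    """
    votes = Counter()
    for track in tracklets:
        if frame_a not in track or frame_b not in track:
            continue  # this tracklet does not span both frames
        point_a, point_b = track[frame_a], track[frame_b]
        for i, mask_a in enumerate(masks_a):
            if point_a not in mask_a:
                continue
            for j, mask_b in enumerate(masks_b):
                if point_b in mask_b:
                    votes[(i, j)] += 1  # one tracklet supports linking i -> j

    links, used_b = {}, set()
    # Greedy assignment: strongest pair first, each mask linked at most once.
    for (i, j), count in votes.most_common():
        if count < min_votes:
            break
        if i in links or j in used_b:
            continue
        links[i] = j
        used_b.add(j)
    return links


# Two masks per frame; the last tracklet ends before frame 1 and is ignored.
masks_a = [{(0, 0), (1, 0)}, {(5, 5), (6, 5)}]
masks_b = [{(5, 6), (6, 6)}, {(1, 0), (2, 0)}]
tracklets = [
    {0: (0, 0), 1: (1, 0)},
    {0: (1, 0), 1: (2, 0)},
    {0: (5, 5), 1: (5, 6)},
    {0: (9, 9)},
]
links = link_masks(masks_a, masks_b, tracklets, frame_a=0, frame_b=1)
```

A mask that receives no link, for instance after a complete occlusion, is the case where object re-identification would have to take over.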

To this end, this paper uses our feature tracking method for a slightly different task: the unsupervised extraction of object representations that can be used directly to fine-tune an object re-identification algorithm (see Fig. 1 for a visualisation). Because object masks are already required for object tracking, our approach reuses them as an alpha channel of the object representations, which further increases the precision of the re-identification. An additional benefit is that our fine-tuning method can be employed even in a fully online scenario.
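
As a rough sketch of how tracked masks could be turned into alpha-augmented training samples, the snippet below cuts each object out of its frame and appends the mask as a fourth channel, grouping the crops by track identity. The abstract does not describe the extraction pipeline, so everything here is an assumption: images are nested lists of `(r, g, b)` tuples, masks are nested lists of 0/1 flags, and the names `masked_crop` and `build_dataset` are hypothetical.

```python
def masked_crop(image, mask):
    """Cut the bounding box of `mask` out of `image`, adding the mask as alpha.

    image: rows of (r, g, b) tuples; mask: rows of 0/1 flags of the same size.
    Returns rows of (r, g, b, a) tuples, or None if the mask is empty.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, flag in enumerate(row) if flag]
    if not ys:
        return None
    y0, y1, x0, x1 = min(ys), max(ys), min(xs), max(xs)
    return [
        [image[y][x] + (255 if mask[y][x] else 0,) for x in range(x0, x1 + 1)]
        for y in range(y0, y1 + 1)
    ]


def build_dataset(frames, track_masks):
    """Group RGBA crops by track; the track id acts as the identity label.

    frames: dict frame index -> image
    track_masks: dict track id -> dict frame index -> mask
    """
    dataset = {}
    for track_id, per_frame in track_masks.items():
        crops = [masked_crop(frames[f], m) for f, m in sorted(per_frame.items())]
        dataset[track_id] = [c for c in crops if c is not None]
    return dataset


# A 3x3 grey image with an L-shaped object mask in the lower-right corner.
image = [[(10, 10, 10)] * 3 for _ in range(3)]
mask = [[0, 0, 0], [0, 1, 1], [0, 0, 1]]
crop = masked_crop(image, mask)
dataset = build_dataset({0: image}, {7: {0: mask}})
```

Because the labels come from the track identities alone, no manual annotation enters this step; how such samples are then fed to a re-identification network is outside the scope of the sketch.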

References

  1. Alina Bialkowski, Simon Denman, Sridha Sridharan, Clinton Fookes, and Patrick Lucey. 2012. A database for person re-identification in multi-camera surveillance networks. In 2012 International Conference on Digital Image Computing Techniques and Applications (DICTA). IEEE, 1–8.
  2. Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua. 2010. Brief: Binary robust independent elementary features. In ECCV. Springer.
  3. Andrea Colombari, Andrea Fusiello, and Vittorio Murino. 2007. Segmentation and tracking of multiple video objects. Pattern Recognition 40, 4 (2007), 1307–1317.
  4. Afshin Dehghan, Shayan Modiri Assari, and Mubarak Shah. 2015. Gmmcp tracker: Globally optimal generalized maximum multi clique problem for multiple object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4091–4099.
  5. Pedro F Felzenszwalb, Ross B Girshick, David McAllester, and Deva Ramanan. 2009. Object detection with discriminatively trained part-based models. IEEE transactions on pattern analysis and machine intelligence 32, 9 (2009), 1627–1645.
  6. Christopher G Harris and Mike Stephens. 1988. A combined corner and edge detector. In Alvey vision conference, Vol. 15. Citeseer, 10–5244.
  7. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision. 2961–2969.
  8. Martin Hirzer, Csaba Beleznai, Peter M Roth, and Horst Bischof. 2011. Person re-identification by descriptive and discriminative classification. In Scandinavian conference on Image analysis. Springer, 91–102.
  9. Harold W Kuhn. 1955. The Hungarian method for the assignment problem. Naval research logistics quarterly 2, 1-2 (1955), 83–97.
  10. José Lezama, Karteek Alahari, Josef Sivic, and Ivan Laptev. 2011. Track to the future: Spatio-temporal video segmentation with long-range motion cues. In CVPR 2011. IEEE, 3369–3376.
  11. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In European conference on computer vision. Springer, 740–755.
  12. David G Lowe. 2004. Distinctive image features from scale-invariant keypoints. International journal of computer vision 60, 2 (2004), 91–110.
  13. Bruce D Lucas and Takeo Kanade. 1981. An iterative image registration technique with an application to stereo vision. In IJCAI.
  14. Niki Martinel, Christian Micheloni, and Claudio Piciarelli. 2012. Distributed signature fusion for person re-identification. In 2012 Sixth International Conference on Distributed Smart Cameras (ICDSC). IEEE, 1–6.
  15. A. Milan, L. Leal-Taixé, I. Reid, S. Roth, and K. Schindler. 2016. MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs] (March 2016). http://arxiv.org/abs/1603.00831
  16. Petr Pulc. 2019. Hierarchical Motion Tracking for UHD video processing. GitHub repository (2019). https://github.com/petrpulc/gpu_orb_tracker
  17. Petr Pulc. 2021. Mask R-CNN Pedestrian Tracklets. https://doi.org/10.34740/KAGGLE/DS/1376245
  18. Petr Pulc and Martin Holeňa. 2018. Hierarchical Motion Tracking Using Matching of Sparse Features. In 2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS). IEEE, 449–456.
  19. Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An Incremental Improvement. arXiv (2018).
  20. Edward Rosten and Tom Drummond. 2006. Machine learning for high-speed corner detection. In ECCV. Springer, 430–443.
  21. Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. 2011. ORB: An efficient alternative to SIFT or SURF. In ICCV. IEEE, 2564–2571.
  22. Jianbo Shi and Carlo Tomasi. 1993. Good features to track. Technical Report. Cornell University.
  23. Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. 2015. Learning spatiotemporal features with 3d convolutional networks. In Proceedings of the IEEE international conference on computer vision. 4489–4497.
  24. Qing Wang, Feng Chen, Wenli Xu, and Ming-Hsuan Yang. 2011. An experimental comparison of online object-tracking algorithms. In Wavelets and Sparsity XIV, Vol. 8138. International Society for Optics and Photonics.
  25. Shu Wang, Huchuan Lu, Fan Yang, and Ming-Hsuan Yang. 2011. Superpixel tracking. In 2011 International Conference on Computer Vision. IEEE, 1323–1330.
  26. Taiqing Wang, Shaogang Gong, Xiatian Zhu, and Shengjin Wang. 2014. Person re-identification by video ranking. In European conference on computer vision. Springer, 688–703.
  27. Nicolai Wojke and Alex Bewley. 2018. Deep cosine metric learning for person re-identification. In 2018 IEEE winter conference on applications of computer vision (WACV). IEEE, 748–756.
  28. Nicolai Wojke, Alex Bewley, and Dietrich Paulus. 2017. Simple Online and Realtime Tracking with a Deep Association Metric. In 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962
  29. Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, and Thomas Huang. 2018. Youtube-vos: A large-scale video object segmentation benchmark. arXiv preprint arXiv:1809.03327 (2018).
  30. Linjie Yang, Yuchen Fan, and Ning Xu. 2019. Video instance segmentation. CoRR abs/1905.04804 (2019). https://arxiv.org/abs/1905.04804
  31. Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, and Wenyu Liu. 2020. FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking. arXiv preprint arXiv:2004.01888 (2020).
  32. Zongpu Zhang, Yang Hua, Tao Song, Zhengui Xue, Ruhui Ma, Neil Robertson, and Haibing Guan. 2018. Tracking-assisted Weakly Supervised Online Visual Object Segmentation in Unconstrained Videos. In Proceedings of the 26th ACM international conference on Multimedia. 941–949.
  33. Liang Zheng, Zhi Bie, Yifan Sun, Jingdong Wang, Chi Su, Shengjin Wang, and Qi Tian. 2016. Mars: A video benchmark for large-scale person re-identification. In European Conference on Computer Vision. Springer, 868–884.

Published in

ICCTA '21: Proceedings of the 2021 7th International Conference on Computer Technology Applications
July 2021
103 pages
ISBN: 9781450390521
DOI: 10.1145/3477911

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

  • research-article
  • Research
  • Refereed limited