Unsupervised Construction of Task-Specific Datasets for Object Re-identification

Authors:
Petr Pulc, Czech Technical University in Prague, Czech Republic
Martin Holeňa, Czech Academy of Sciences, Czech Republic

ICCTA '21: Proceedings of the 2021 7th International Conference on Computer Technology Applications, July 2021, Pages 66–72. https://doi.org/10.1145/3477911.3477922

Published: 15 October 2021

ABSTRACT

Over the last decade, deep neural networks have risen to prominence in image processing and many other research areas. However, while various neural architectures have successfully solved numerous tasks, they demand ever more processing time and training data. Moreover, the current trend of using existing pre-trained architectures merely as backbones and attaching new processing branches on top not only increases this demand but also diminishes the explainability of the whole model.

Our research focuses on combinations of explainable building blocks for image processing tasks such as object tracking. We propose a combination of Mask R-CNN, a state-of-the-art neural network for object detection and segmentation, with our previously published method of sparse feature tracking [16]. This combination allows us to track objects by connecting detected masks using the proposed sparse feature tracklets. However, the method cannot recover from complete object occlusions and has to be assisted by object re-identification.
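The idea of connecting detected masks through feature tracklets can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `link_masks`, the vote-counting rule, and the `min_votes` threshold are all assumptions made for illustration, and the masks are taken to be plain boolean NumPy arrays rather than actual Mask R-CNN outputs.

```python
import numpy as np


def link_masks(masks_a, masks_b, tracklets, min_votes=1):
    """Link masks of frame A to masks of frame B by counting shared tracklets.

    masks_a, masks_b: lists of boolean arrays of shape (H, W).
    tracklets: iterable of ((xa, ya), (xb, yb)) integer pixel positions of
               the same sparse feature in frame A and in frame B.
    Returns a dict {index in masks_a: index in masks_b}.
    Hypothetical helper, not taken from the paper.
    """
    votes = np.zeros((len(masks_a), len(masks_b)), dtype=int)
    for (xa, ya), (xb, yb) in tracklets:
        # A tracklet votes for every pair of masks that contains its endpoints.
        hits_a = [i for i, m in enumerate(masks_a) if m[ya, xa]]
        hits_b = [j for j, m in enumerate(masks_b) if m[yb, xb]]
        for i in hits_a:
            for j in hits_b:
                votes[i, j] += 1

    links = {}
    if votes.size == 0:  # no masks in one of the frames
        return links
    for i in range(len(masks_a)):
        j = int(votes[i].argmax())
        if votes[i, j] >= min_votes:
            links[i] = j
    return links
```

This per-row argmax may assign two masks of frame A to the same mask of frame B; a one-to-one assignment would need an additional matching step. A mask with no supporting tracklets stays unlinked, which mirrors the stated limitation that complete occlusions cannot be bridged by tracklets alone.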

To this end, this paper uses our feature tracking method for a slightly different task: the unsupervised extraction of object representations that we can use directly to fine-tune an object re-identification algorithm (see Fig. 1 for a visualisation). As object masks are already required for object tracking, our approach uses this additional information as an alpha channel of the object representations, which further increases the precision of the re-identification. An additional benefit is that our fine-tuning method can be employed even in a fully online scenario.
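Using a mask as an alpha channel can be sketched as follows. This is only an illustration under stated assumptions: the helper name `rgba_crop`, the tight bounding-box crop, and the 0/255 alpha encoding are hypothetical choices, not details confirmed by the abstract.

```python
import numpy as np


def rgba_crop(frame, mask):
    """Crop an object from an RGB frame and attach its mask as alpha.

    frame: uint8 array of shape (H, W, 3).
    mask:  boolean array of shape (H, W) for one detected object.
    Returns a uint8 array of shape (h, w, 4), or None for an empty mask.
    Hypothetical helper, not taken from the paper.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    # Tight bounding box around the mask (an assumed cropping rule).
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    rgb = frame[y0:y1, x0:x1]
    # Encode the mask as a fully opaque (255) or fully transparent (0) alpha.
    alpha = mask[y0:y1, x0:x1].astype(np.uint8) * 255
    return np.dstack([rgb, alpha])
```

Crops produced this way for masks linked along one track could be grouped under a shared identity label, which is one plausible reading of how an unsupervised re-identification dataset would be assembled.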

References

[1] Alina Bialkowski, Simon Denman, Sridha Sridharan, Clinton Fookes, and Patrick Lucey. 2012. A database for person re-identification in multi-camera surveillance networks. In 2012 International Conference on Digital Image Computing Techniques and Applications (DICTA). IEEE, 1–8.
[2] Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua. 2010. Brief: Binary robust independent elementary features. In ECCV. Springer.
[3] Andrea Colombari, Andrea Fusiello, and Vittorio Murino. 2007. Segmentation and tracking of multiple video objects. Pattern Recognition 40, 4 (2007), 1307–1317.
[4] Afshin Dehghan, Shayan Modiri Assari, and Mubarak Shah. 2015. Gmmcp tracker: Globally optimal generalized maximum multi clique problem for multiple object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4091–4099.
[5] Pedro F Felzenszwalb, Ross B Girshick, David McAllester, and Deva Ramanan. 2009. Object detection with discriminatively trained part-based models. IEEE transactions on pattern analysis and machine intelligence 32, 9 (2009), 1627–1645.
[6] Christopher G Harris and Mike Stephens. 1988. A combined corner and edge detector. In Alvey vision conference, Vol. 15. Citeseer, 10–5244.
[7] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision. 2961–2969.
[8] Martin Hirzer, Csaba Beleznai, Peter M Roth, and Horst Bischof. 2011. Person re-identification by descriptive and discriminative classification. In Scandinavian conference on Image analysis. Springer, 91–102.
[9] Harold W Kuhn. 1955. The Hungarian method for the assignment problem. Naval research logistics quarterly 2, 1-2 (1955), 83–97.
[10] José Lezama, Karteek Alahari, Josef Sivic, and Ivan Laptev. 2011. Track to the future: Spatio-temporal video segmentation with long-range motion cues. In CVPR 2011. IEEE, 3369–3376.
[11] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In European conference on computer vision. Springer, 740–755.
[12] David G Lowe. 2004. Distinctive image features from scale-invariant keypoints. International journal of computer vision 60, 2 (2004), 91–110.
[13] Bruce D Lucas and Takeo Kanade. 1981. An iterative image registration technique with an application to stereo vision. In IJCAI.
[14] Niki Martinel, Christian Micheloni, and Claudio Piciarelli. 2012. Distributed signature fusion for person re-identification. In 2012 Sixth International Conference on Distributed Smart Cameras (ICDSC). IEEE, 1–6.
[15] A. Milan, L. Leal-Taixé, I. Reid, S. Roth, and K. Schindler. 2016. MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs] (March 2016). http://arxiv.org/abs/1603.00831
[16] Petr Pulc. 2019. Hierarchical Motion Tracking for UHD video processing. GitHub repository (2019). https://github.com/petrpulc/gpu_orb_tracker
[17] Petr Pulc. 2021. Mask R-CNN Pedestrian Tracklets. https://doi.org/10.34740/KAGGLE/DS/1376245
[18] Petr Pulc and Martin Holeňa. 2018. Hierarchical Motion Tracking Using Matching of Sparse Features. In 2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS). IEEE, 449–456.
[19] Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An Incremental Improvement. arXiv (2018).
[20] Edward Rosten and Tom Drummond. 2006. Machine learning for high-speed corner detection. In ECCV. Springer, 430–443.
[21] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. 2011. ORB: An efficient alternative to SIFT or SURF. In ICCV. IEEE, 2564–2571.
[22] Jianbo Shi and Carlo Tomasi. 1993. Good features to track. Technical Report. Cornell University.
[23] Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. 2015. Learning spatiotemporal features with 3d convolutional networks. In Proceedings of the IEEE international conference on computer vision. 4489–4497.
[24] Qing Wang, Feng Chen, Wenli Xu, and Ming-Hsuan Yang. 2011. An experimental comparison of online object-tracking algorithms. In Wavelets and Sparsity XIV, Vol. 8138. International Society for Optics and Photonics.
[25] Shu Wang, Huchuan Lu, Fan Yang, and Ming-Hsuan Yang. 2011. Superpixel tracking. In 2011 International Conference on Computer Vision. IEEE, 1323–1330.
[26] Taiqing Wang, Shaogang Gong, Xiatian Zhu, and Shengjin Wang. 2014. Person re-identification by video ranking. In European conference on computer vision. Springer, 688–703.
[27] Nicolai Wojke and Alex Bewley. 2018. Deep cosine metric learning for person re-identification. In 2018 IEEE winter conference on applications of computer vision (WACV). IEEE, 748–756.
[28] Nicolai Wojke, Alex Bewley, and Dietrich Paulus. 2017. Simple Online and Realtime Tracking with a Deep Association Metric. In 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962
[29] Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, and Thomas Huang. 2018. Youtube-vos: A large-scale video object segmentation benchmark. arXiv preprint arXiv:1809.03327 (2018).
[30] Linjie Yang, Yuchen Fan, and Ning Xu. 2019. Video instance segmentation. CoRR abs/1905.04804 (2019). https://arxiv.org/abs/1905.04804
[31] Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, and Wenyu Liu. 2020. FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking. arXiv preprint arXiv:2004.01888 (2020).
[32] Zongpu Zhang, Yang Hua, Tao Song, Zhengui Xue, Ruhui Ma, Neil Robertson, and Haibing Guan. 2018. Tracking-assisted Weakly Supervised Online Visual Object Segmentation in Unconstrained Videos. In Proceedings of the 26th ACM international conference on Multimedia. 941–949.
[33] Liang Zheng, Zhi Bie, Yifan Sun, Jingdong Wang, Chi Su, Shengjin Wang, and Qi Tian. 2016. Mars: A video benchmark for large-scale person re-identification. In European Conference on Computer Vision. Springer, 868–884.

Index Terms

Unsupervised Construction of Task-Specific Datasets for Object Re-identification
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        1. Scene understanding
  2. Machine learning
    1. Learning paradigms
    2. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.

Published in

ICCTA '21: Proceedings of the 2021 7th International Conference on Computer Technology Applications
July 2021
103 pages
ISBN:9781450390521
DOI:10.1145/3477911

Copyright © 2021 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Fine-tuning of Object Re-identification
Hierarchical Sparse Feature Tracking
Multiple Object Tracking
Qualifiers
- research-article
- Research
- Refereed limited