Interactive Open-Ended Learning for 3D Object Recognition: An Approach and Experiments

Kasaei, S. Hamidreza; Oliveira, Miguel; Lim, Gi Hyun; Seabra Lopes, Luís; Tomé, Ana Maria

doi:10.1007/s10846-015-0189-z


Published: 31 January 2015

Volume 80, pages 537–553, (2015)

Journal of Intelligent & Robotic Systems

S. Hamidreza Kasaei¹, Miguel Oliveira¹, Gi Hyun Lim¹, Luís Seabra Lopes¹,² & Ana Maria Tomé¹,²


Abstract

3D object detection and recognition is increasingly used for manipulation and navigation tasks in service robots. It involves segmenting the objects present in a scene, estimating a feature descriptor for the object view and, finally, recognizing the object view by comparing it to the known object categories. This paper presents an efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner. In this paper, “open-ended” implies that the set of object categories to be learned is not known in advance. The training instances are extracted from on-line experiences of a robot, and thus become gradually available over time, rather than at the beginning of the learning process. This paper focuses on two state-of-the-art questions: (1) How to automatically detect, conceptualize and recognize objects in 3D scenes in an open-ended manner? (2) How to acquire and use high-level knowledge obtained from the interaction with human users, namely when they provide category labels, in order to improve the system performance? This approach starts with a pre-processing step to remove irrelevant data and prepare a suitable point cloud for the subsequent processing. Clustering is then applied to detect object candidates, and object views are described based on a 3D shape descriptor called spin-image. Finally, a nearest-neighbor classification rule is used to predict the categories of the detected objects. A leave-one-out cross validation algorithm is used to compute precision and recall, in a classical off-line evaluation setting, for different system parameters. Also, an on-line evaluation protocol is used to assess the performance of the system in an open-ended setting. Results show that the proposed system is able to interact with human users, learning new object categories continuously over time.
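The last two stages named in the abstract, nearest-neighbor classification and leave-one-out evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `OpenEndedNN`, the function `leave_one_out`, the Euclidean distance, and the two-element feature vectors are all assumptions made for illustration. In particular, each object view is reduced here to one short fixed-length vector as a stand-in for a spin-image based description.

```python
import math
from collections import defaultdict


class OpenEndedNN:
    """Hypothetical instance-based learner: a category exists once a user labels a view."""

    def __init__(self):
        # category label -> list of stored feature vectors (one per taught view)
        self.memory = defaultdict(list)

    def teach(self, label, features):
        """Store a labelled view; an unseen label creates a new category."""
        self.memory[label].append(list(features))

    def classify(self, features, skip=None):
        """Return the label of the nearest stored view, or None if nothing is stored.

        `skip` is an optional (label, index) pair to ignore, used for leave-one-out.
        """
        best_label, best_dist = None, math.inf
        for label, views in self.memory.items():
            for idx, view in enumerate(views):
                if skip == (label, idx):
                    continue
                dist = math.dist(features, view)  # Euclidean distance (assumed metric)
                if dist < best_dist:
                    best_label, best_dist = label, dist
        return best_label


def leave_one_out(learner):
    """Per-category (precision, recall), holding out each stored view in turn."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for label, views in learner.memory.items():
        for idx, view in enumerate(views):
            predicted = learner.classify(view, skip=(label, idx))
            if predicted == label:
                tp[label] += 1
            else:
                fn[label] += 1
                if predicted is not None:
                    fp[predicted] += 1
    scores = {}
    for label in learner.memory:
        predicted_as = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted_as if predicted_as else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy usage: categories are introduced over time, not fixed in advance.
learner = OpenEndedNN()
learner.teach("mug", [0.1, 0.9])
learner.teach("mug", [0.2, 0.8])
learner.teach("plate", [0.9, 0.1])
learner.teach("plate", [0.8, 0.2])
print(learner.classify([0.15, 0.85]))  # prints "mug"
print(leave_one_out(learner))
```

Because the learner only stores labelled views, adding a category never requires retraining, which is one simple way to accommodate a category set that is not known in advance.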

Author information

Authors and Affiliations

IEETA - Instituto de Engenharia Electrónica e Telemática de Aveiro, Universidade de Aveiro, Aveiro, Portugal
S. Hamidreza Kasaei, Miguel Oliveira, Gi Hyun Lim, Luís Seabra Lopes & Ana Maria Tomé
Departamento de Electrónica, Telecomunicações e Informática, Universidade de Aveiro, Aveiro, Portugal
Luís Seabra Lopes & Ana Maria Tomé


Corresponding author

Correspondence to S. Hamidreza Kasaei.

About this article

Cite this article

Kasaei, S.H., Oliveira, M., Lim, G.H. et al. Interactive Open-Ended Learning for 3D Object Recognition: An Approach and Experiments. J Intell Robot Syst 80, 537–553 (2015). https://doi.org/10.1007/s10846-015-0189-z

Received: 07 July 2014
Accepted: 09 January 2015
Published: 31 January 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s10846-015-0189-z
