short-paper

Conversational Group Detection with Graph Neural Networks

Authors:
Sydney Thompson (Yale University, USA)
Abhijit Gupta (Yale University, USA)
Anjali W. Gupta (Yale University, USA)
Austin Chen (Yale University, USA)
Marynel Vázquez (Yale University, USA)

ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction, October 2021, Pages 248–252. https://doi.org/10.1145/3462244.3479963

Published: 18 October 2021

ABSTRACT

We study conversational group detection in varied social scenes using a message-passing Graph Neural Network (GNN) in combination with the Dominant Sets clustering algorithm. Our approach first describes a scene as an interaction graph, where nodes encode individual features and edges encode pairwise relationship data. Then, it uses a GNN to predict pairwise affinity values that represent the likelihood of two people interacting together, and computes non-overlapping group assignments based on these affinities. We evaluate the proposed approach on the Cocktail Party and MatchNMingle datasets. Our results suggest that using GNNs to leverage both individual and relationship features when computing groups is beneficial, especially when more features are available for each individual.
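
The final step of this pipeline, turning pairwise affinities into non-overlapping groups, can be illustrated with a minimal sketch. This is not the authors' implementation: the GNN is replaced here by a hypothetical hand-written affinity (`toy_affinity`, a Gaussian of the distance between two people), and the group extraction is a simple peeling variant of Dominant Sets based on replicator dynamics. The function names and the parameters `sigma`, `support_cutoff`, and `min_cohesion` are assumptions made for illustration only.

```python
import math

def toy_affinity(p, q, sigma=1.0):
    """Hypothetical stand-in for the GNN output: Gaussian of the distance."""
    d = math.hypot(p[0] - q[0], p[1] - q[1])
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

def affinity_matrix(positions, sigma=1.0):
    """Symmetric affinity matrix with a zero diagonal (no self-affinity)."""
    n = len(positions)
    return [[0.0 if i == j else toy_affinity(positions[i], positions[j], sigma)
             for j in range(n)] for i in range(n)]

def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x), starting from uniform weights."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:  # no affinity left among these people
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x

def dominant_set_groups(A, support_cutoff=1e-5, min_cohesion=0.05):
    """Peel off one dominant set at a time until no cohesive set remains."""
    remaining = list(range(len(A)))
    groups = []
    while remaining:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        x = replicator_dynamics(sub)
        m = len(x)
        cohesion = sum(x[a] * sub[a][b] * x[b] for a in range(m) for b in range(m))
        if cohesion < min_cohesion:
            # Assumed stopping rule: everyone left over becomes a singleton.
            groups.extend([i] for i in remaining)
            break
        members = [remaining[k] for k, w in enumerate(x) if w > support_cutoff]
        groups.append(members)
        remaining = [i for i in remaining if i not in members]
    return groups

if __name__ == "__main__":
    # Three people standing close together, a pair far away, and one person alone.
    positions = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.4),
                 (10.0, 10.0), (10.5, 10.0), (20.0, 0.0)]
    print(dominant_set_groups(affinity_matrix(positions)))
    # prints [[0, 1, 2], [3, 4], [5]]
```

In the approach described above, the entries of the affinity matrix would instead come from the GNN, which can draw on richer individual and pairwise features than position alone; the clustering step stays the same regardless of how the affinities are produced.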

Supplemental Material

conversational_group_detection_presentation.mp4 (MP4, 13.4 MB)

Published in
ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
October 2021
876 pages
ISBN: 9781450384810
DOI: 10.1145/3462244
Editors:
Zakia Hammal (Carnegie Mellon University)
Carlos Busso (University of Texas at Dallas)
Catherine Pelachaud (CNRS - ISIR, Sorbonne University)
Sharon Oviatt (Monash University)
Albert Ali Salah (Utrecht University and Boğaziçi University)
Guoying Zhao (University of Oulu)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
F-formation
clustering
graph neural network
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 453 of 1,080 submissions, 42%