ABSTRACT
Compared with traditional frame-based cameras, the event camera (also known as a dynamic vision sensor) has received increasing attention for its outstanding advantages. Inspired by biology, the camera naturally captures the dynamics of a scene with low latency, filtering out redundant information at low power consumption. Deep-learning-based instance segmentation, an influential line of research in visual recognition, could potentially benefit from these properties, but combining event-based data with deep learning still faces several challenges. In this work, we develop event-based instance segmentation that unlocks the potential of event data by combining the event camera with deep learning. To make the best use of the event data, we propose a novel event representation, the variable event stream structure (VESS), for event-based instance segmentation. Because event-based datasets are rare and none of them contains instance segmentation labels, we also produce accurate labels specialized for instance segmentation on event camera data. The proposed method is verified on this dataset; our approach reaches an average Intersection over Union (IoU) of 55.75% in real time and works properly in challenging environments such as motion blur and extreme lighting conditions.
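The abstract names two technical pieces without detail: a learned event representation (VESS) and the mask IoU metric used to report the 55.75% result. The sketch below is illustrative only; the specifics of VESS are not given here, so a common two-channel event-count image stands in for the representation, and `mask_iou` shows how the reported metric is conventionally computed on binary instance masks. All function names are assumptions, not the authors' API.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate an event stream of (x, y, t, polarity) tuples into a
    2-channel count image (channel 0: positive events, channel 1: negative).
    This is a common baseline event representation, NOT the paper's VESS."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, _t, p in events:
        frame[0 if p > 0 else 1, y, x] += 1.0
    return frame

def mask_iou(pred, gt):
    """Intersection over Union between two boolean instance masks,
    as conventionally used to score instance segmentation."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 0.0

# Toy usage: two overlapping 4x4 masks share 3 of 4 union pixels.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(mask_iou(pred, gt))  # 0.75
```

An average IoU of 55.75% means the per-instance value of `mask_iou` (or its class-wise analogue), averaged over the test set, is 0.5575.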
Index Terms
- VESS: Variable Event Stream Structure for Event-based Instance Segmentation Benchmark