Software Tool Article
Revised

GeNePy3D: a quantitative geometry python toolbox for bioimaging

[version 2; peer review: 2 approved]
PUBLISHED 17 Jun 2021

This article is included in the NEUBIAS - the Bioimage Analysts Network gateway.

This article is included in the Bioinformatics gateway.

This article is included in the Python collection.

Abstract

The advent of large-scale fluorescence and electron microscopy techniques, along with maturing image analysis, is giving the life sciences a deluge of geometrical objects in 2D/3D(+t) to deal with. These objects take the form of large-scale, localised, precise, single-cell, quantitative data such as cells' positions, shapes, trajectories or lineages, axon traces in whole-brain atlases or varied intracellular protein localisations, often in multiple experimental conditions. The data mining of those geometrical objects requires a variety of mathematical and computational tools of diverse accessibility and complexity. Here we present a new Python library for quantitative 3D geometry called GeNePy3D, which helps handle and mine information and knowledge from geometric data, providing a unified application programming interface (API) to methods from several domains including computational geometry, scale space methods and spatial statistics. By framing this library as generically as possible, and by linking it to as many state-of-the-art reference algorithms and projects as needed, we help render those often specialist methods accessible to a larger community. We exemplify the usefulness of the GeNePy3D toolbox by re-analysing a recently published whole-brain zebrafish neuronal atlas, with other applications and examples available online. Along with open-source, documented and exemplified code, we release reusable containers to allow for convenient and wide usability and increased reproducibility.

Keywords

bioimage informatics, quantitative geometry, computational geometry, workflow, python

Revised Amendments from Version 1

Small updates made in response to reviewer comments include removing the reference to 'large scale', clarifying the reasoning behind the licensing choices and removing the mention of alignment algorithms. We also updated the title to remove the mention of large scale, and updated the affiliation of one of the authors.

See the authors' detailed response to the review by Virginie Uhlmann

Introduction

Bioimage informatics aims at bringing microscopy into quantitative biology, associating higher-level information with pixels to answer complex biological questions. In particular, machine-learning-based techniques [1] are easing the image analysis step, extracting geometrical objects from multidimensional images. But the next step, transforming that geometrical information into biological knowledge, involves a very diverse set of algorithmic tools from distinct communities, from spatial statistics [2,3] to computational geometry [4,5] or neuroinformatics [6]. Similarly, the software ecosystem around geometrical data analysis is diverse and heterogeneous. Reference algorithm implementations are spread across languages (Spatstat [7] for spatial statistics in R, CGAL [8] for computational geometry in C++) or across modules in Python (scipy [9] for generic algorithms, anytree [10] for trees, trimesh [11] for meshes, etc.), there is no generic exchange format for geometric data, and standard graphical tools such as Fiji [12] and Icy [13] offer limited flexibility in the analyses that are easily available. To address this problem, we propose GeNePy3D [14,15], a Python library meant as a 'middleware' library to facilitate building data analysis workflows for geometrical objects. It provides one convenient API for geometrical data I/O, for conversion and interaction between geometrical objects, and for access to many common and less common algorithms. Below, we introduce the architecture of the library and show one example workflow, re-analysing a published dataset of zebrafish brain neuronal traces by combining traces and brain regions to extract quantitative metrics per region.

Methods

Architecture

GeNePy3D [14,15] was designed with any computationally minded life scientist as the target user, to provide a simple and homogeneous API. GeNePy3D consists of four main objects (Figure 1) corresponding to four basic geometrical objects of interest: Points (cells or intracellular object positions...), Curve (particle tracks, neurite branches, microtubules...), Tree (neuronal traces, dividing cell tracks) and Surface (cell surface or other tissue-level structure...). Each of them has its own attributes, functions and I/O. We provide ways to transform between them (decomposing a Tree into sequences of Curve, or converting Points into the Surface that encloses them). Interactions between objects of the same or different classes are also available (optimal transport-based distance between two Points, intersection between Curve and Surface, etc.). Altogether, GeNePy3D offers a unified and seamless way to analyse complex geometrical biological data.
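To make the Tree to Curve and Tree to Points conversions concrete, here is a minimal sketch in plain Python. It does not use the GeNePy3D API: the dictionary representation of the tree (node id mapped to parent id, with -1 for the root) and the function names `build_children`, `branching_nodes` and `decompose_into_sections` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy tree, NOT taken from the paper: node id -> parent id,
# with -1 marking the root. Node 3 has two children, so it is a branching node.
TOY_TREE = {1: -1, 2: 1, 3: 2, 4: 3, 5: 4, 6: 3, 7: 6}


def build_children(parents):
    """Map each node id to the sorted list of its child ids."""
    children = defaultdict(list)
    for node, parent in parents.items():
        if parent != -1:
            children[parent].append(node)
    return {node: sorted(kids) for node, kids in children.items()}


def branching_nodes(parents):
    """Return the ids of nodes with at least two children."""
    children = build_children(parents)
    return sorted(node for node, kids in children.items() if len(kids) >= 2)


def decompose_into_sections(parents):
    """Split the tree into unbranched sections (lists of node ids).

    Each section starts at the root or at a branching node and ends at
    the next branching node or at a leaf.
    """
    children = build_children(parents)
    stack = [node for node, parent in parents.items() if parent == -1]
    sections = []
    while stack:
        start = stack.pop()
        for child in children.get(start, []):
            section = [start, child]
            node = child
            # Follow the path while it does not branch.
            while len(children.get(node, [])) == 1:
                node = children[node][0]
                section.append(node)
            sections.append(section)
            # A branching end point starts new sections of its own.
            if len(children.get(node, [])) >= 2:
                stack.append(node)
    return sorted(sections)


print(branching_nodes(TOY_TREE))
print(decompose_into_sections(TOY_TREE))
```

Each returned section shares its first node with the end of its parent section, so node coordinates attached to the ids can be turned into one polyline per section.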


Figure 1. GeNePy3D architecture.

The library is structured around four main classes for four principal geometrical objects, and proposes various functions acting on them or converting between them, either implemented anew or linking to recognized libraries.

Implementation

GeNePy3D is implemented in Python, taking advantage of a high-level programming language with simple syntax and many open-source packages. We reused algorithms and functions available from various recognised packages when possible, and developed our own implementation when needed, within a single interface. Most of the packages we link to are available from the Python Package Index (PyPI) and can be easily installed via the Python package manager (pip). Figure 1 lists some functions, with colors denoting the package used. Beyond standard packages, more specific ones include anytree for tree manipulation, trimesh for surface manipulation and scikit-learn for machine learning tasks. Other features are listed as optional, as they come from harder-to-install or less recognized sources, including the C++ library CGAL, only partially available in Python, for generic object interaction in 3D, or the optimal transport method implemented in PyEMD. Some original developments available in GeNePy3D include an algorithm to compute local 3D scale that we recently published [16]. Many common input/output formats are supported, including SWC for Tree, CSV and XYZ for Points/Curve, and STL and OFF for Surface. We release the library in two packages for licensing reasons (see licenses below).
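As an illustration of the SWC input mentioned above, the following sketch parses SWC text with the standard library only. It relies on the usual SWC convention of seven whitespace-separated columns per node (id, type, x, y, z, radius, parent id, with -1 for the root) and `#` comment lines. It is not GeNePy3D's own reader: the names `SwcNode`, `parse_swc` and `SAMPLE_SWC` are hypothetical, and the sample coordinates are toy values.

```python
from typing import Dict, NamedTuple


class SwcNode(NamedTuple):
    """One SWC sample point (hypothetical helper type for this sketch)."""
    node_type: int
    x: float
    y: float
    z: float
    radius: float
    parent: int


def parse_swc(text: str) -> Dict[int, SwcNode]:
    """Parse SWC-formatted text into a dict mapping node id to SwcNode."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and header comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 7:
            raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
        node_id = int(fields[0])
        x, y, z, radius = (float(value) for value in fields[2:6])
        nodes[node_id] = SwcNode(int(fields[1]), x, y, z, radius, int(fields[6]))
    return nodes


# Toy input, not taken from the zebrafish dataset.
SAMPLE_SWC = """\
# id type x y z radius parent
1 1 0.0 0.0 0.0 1.0 -1
2 3 1.0 0.0 0.0 0.5 1
3 3 2.0 0.0 0.0 0.5 2
"""

nodes = parse_swc(SAMPLE_SWC)
print(len(nodes), nodes[3].parent)
```

The `parent` field is what encodes the tree structure, so a mapping of node id to parent id can be read directly from the parsed result.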

Operation

GeNePy3D works with Python 3.6. Details of the specific software requirements, documentation including the installation instructions, and Python notebook examples can be accessed via https://genepy3d.gitlab.io. Example pipelines using GeNePy3D are run using Jupyter notebooks. To ease the use and deployment of GeNePy3D, we provide ready-to-use Docker containers at https://gitlab.com/genepy3d/genepy3d_dockers.

Use case

To exemplify the use of GeNePy3D [14,15], we reanalyzed a recently published dataset containing up to 2000 traced neurons across the whole brain of larval zebrafish [17]. The authors annotated 36 symmetric regions and established a connectivity atlas for the neurons within these regions. Figure 2A illustrates a possible workflow using GeNePy3D for reanalyzing the dataset. The inputs consist of neuronal traces in SWC format and a 3D volume in NRRD format containing the annotated labels for the 36 brain regions. The traces are imported into GeNePy3D as Tree objects, while the regions are reconstructed into Surface objects using the marching cubes algorithm. Figure 2B (top) illustrates the outline of the Tectum along with all neuronal traces arriving at this brain region. We then extracted branching point positions from the neuronal traces (Tree→Points), decomposed the traces into sections (Tree→Curves) and checked whether the branching points or curve sections lie within or outside each region (interaction with Surface). Examples of decomposing the traces and computing sections inside and outside the Tectum region are shown in Figure 2B (bottom). Finally, we measured within the brain regions the neuronal lengths, number of branching points, tortuosities (ratio of the curve length to the distance between the two end points of the curve), and local 3D scales [16] (the scale at which the curve becomes 3D).
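The length and tortuosity measures defined above can be sketched in a few lines of standard-library Python. This is an illustration of the definitions only, not the GeNePy3D implementation: the function names `distance`, `curve_length` and `tortuosity` and the toy coordinates are assumptions.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def curve_length(points):
    """Total length of the polyline through the given points."""
    return sum(distance(p, q) for p, q in zip(points, points[1:]))


def tortuosity(points):
    """Ratio of the curve length to the distance between its end points."""
    chord = distance(points[0], points[-1])
    if chord == 0:
        raise ValueError("tortuosity is undefined when the end points coincide")
    return curve_length(points) / chord


# Toy curves, not taken from the zebrafish dataset.
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]

print(tortuosity(straight))
print(tortuosity(bent))
```

A straight section has tortuosity 1, and the value grows as the path wanders away from the straight line joining its end points.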


Figure 2. Example workflow for analysing the larval zebrafish brain dataset [17] with GeNePy3D.

(A) Workflow schema. (B) Examples of intermediate data and operations from the workflow: outline surface of the Tectum and all neurons arriving at it (top), decomposition of a neuronal tree into sections (displayed with random colors) based on branching positions (bottom left), and computation of neuronal sections inside/outside the Tectum (bottom right). (C) Resulting quantifications: distribution of average neuronal lengths for groups of neurons arriving at, originating from or passing through all brain regions (top), and heat map of averaged neuronal lengths over each brain region for the group of neurons arriving at the brain regions (bottom). Regions with a small number of arriving neurons (< 10 neurons) are excluded (in gray). The letters (i-iv) in (B) illustrate some steps in (A).

Part of the resulting quantifications is shown in Figure 2C. The top graph shows a longer neuronal length on average for groups of neurons arriving at and originating from the regions compared to those passing through. Figure 2C (bottom) shows a map of the averaged neuronal length in each brain region for arriving neurons, showing that neurons coming from the fore- and midbrain are much longer than those from the hindbrain. Details of all processing steps and additional quantified results can be found at https://gitlab.com/genepy3d/genepy3d_examples/-/tree/master/zebrafish_atlas.
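The per-region averaging behind the heat map, including the exclusion of regions with fewer than 10 arriving neurons stated in the Figure 2 caption, could be sketched as follows. This is not the authors' analysis code: the function name `mean_length_per_region`, the record layout, the region name `region_B` and all numeric values are hypothetical toy inputs, not results from the paper.

```python
from collections import defaultdict
from statistics import mean


def mean_length_per_region(records, min_neurons=10):
    """Average neuronal length per region.

    `records` is an iterable of (region_name, neuron_length) pairs.
    Regions with fewer than `min_neurons` neurons map to None, mirroring
    the excluded (gray) regions of the heat map.
    """
    lengths = defaultdict(list)
    for region, length in records:
        lengths[region].append(length)
    return {
        region: mean(values) if len(values) >= min_neurons else None
        for region, values in lengths.items()
    }


# Toy records (made-up lengths in arbitrary units).
toy_records = [("Tectum", 100.0 + i) for i in range(10)]
toy_records += [("region_B", 50.0)] * 3

print(mean_length_per_region(toy_records))
```

Returning None rather than dropping a region keeps excluded regions visible in the output, so a plotting step can still draw them in a neutral color.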

Conclusions

The advent of machine learning and developments in biological imaging are leading to numerous geometrical datasets, and GeNePy3D [14,15] aims at enabling complex analysis workflows based on those objects. But as in other aspects of bioimage informatics, the key will be for the community to work together and define common formats and structures for regions of interest and geometric objects, to ease the interactions between the various visualisation, data management or analysis tools, and to convert raw images into biological knowledge. GeNePy3D is ready to become a component of that ecosystem.

Data availability

Source data

The data used for Figure 2 have been published at https://fishatlas.neuro.mpg.de. To download the traces, choose 'single axons', then 'connect without logging in', then select 'Kunst et al. 2019' in publications; once all neurons are loaded, the download option appears.

Software availability

GeNePy3D is hosted at https://genepy3d.gitlab.io and is easily installable from the Python Package Index (PyPI).

Source code available at: https://gitlab.com/genepy3d.

Archived source code at time of publication:

License: The library is distributed as two packages. The main package GeNePy3D [14] is under a BSD 3-Clause Licence, while features that necessitate linking to GPL-licensed code are distributed separately in GeNePy3D_GPL [15], under the GNU General Public License v3.0.

We wanted to release GeNePy3D under a BSD license but could not avoid using some GPL-licensed software, which forced us to adopt this two-package solution. Practical consequences should be minimal in most circumstances thanks to modern Python package management.

The source code for the analysis of Figure 2 is available at https://gitlab.com/genepy3d/genepy3d_examples/-/tree/master/zebrafish_atlas.

how to cite this article
Phan MS and Chessel A. GeNePy3D: a quantitative geometry python toolbox for bioimaging [version 2; peer review: 2 approved] F1000Research 2021, 9:1374 (https://doi.org/10.12688/f1000research.27395.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to reviewer statuses:
Approved - The paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the paper's academic merit.
Not approved - Fundamental flaws in the paper seriously undermine the findings and conclusions
Version 2, published 17 Jun 2021 (revised)
Reviewer Report 24 Jun 2021
Virginie Uhlmann, Wellcome Genome Campus, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK 
Approved
Thank you for providing in-depth clarifications to all of my points and amending ...
HOW TO CITE THIS REPORT
Uhlmann V. Reviewer Report For: GeNePy3D: a quantitative geometry python toolbox for bioimaging [version 2; peer review: 2 approved]. F1000Research 2021, 9:1374 (https://doi.org/10.5256/f1000research.57818.r87836)
Version 1, published 26 Nov 2020
Reviewer Report 16 Mar 2021
Yizhi Wang, Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA 
Approved
The author design a Python package to analyze the various geometric objects extracted from microscopy images of the brain. The package combines tools from computational geometry, spatial statistic, and other fields into a unified API. The usefulness of the package ...
HOW TO CITE THIS REPORT
Wang Y. Reviewer Report For: GeNePy3D: a quantitative geometry python toolbox for bioimaging [version 2; peer review: 2 approved]. F1000Research 2021, 9:1374 (https://doi.org/10.5256/f1000research.30274.r80682)
Reviewer Report 15 Dec 2020
Virginie Uhlmann, Wellcome Genome Campus, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK 
Approved with Reservations
The authors describe GeNePy3D, a Python toolbox that facilitates the processing and quantification of geometrical objects extracted from images. The goal is for this package to bring together the methods provided in various existing geometrical processing libraries such as PyEMD, ...
HOW TO CITE THIS REPORT
Uhlmann V. Reviewer Report For: GeNePy3D: a quantitative geometry python toolbox for bioimaging [version 2; peer review: 2 approved]. F1000Research 2021, 9:1374 (https://doi.org/10.5256/f1000research.30274.r75563)
  • Author Response 17 Jun 2021
    Anatole Chessel, Laboratory of Optics and Biosciences, CNRS, INSERM, Ecole polytechnique, Institut polytechnique de Paris, Palaiseau, 91120, France
    The authors describe GeNePy3D, a Python toolbox that facilitates the processing and quantification of geometrical objects extracted from images. The goal is for this package to bring together the methods ...
