Impact Statement
This article discusses a 3D cell segmentation method for detecting individual cells from lightsheet volumetric images of cardiac tissue of murine embryo. Automatized 3D segmentation is a crucial step to deliver accurate, quantitative descriptions of cell morphology in 3D—relevant to the study of cell structure underlying cardiac muscle growth during development, with potential applications in tissue engineering for cardiac regenerative medicine. The article has been jointly written by development biologists with expertise in the quantitative analysis of heart development, and computer scientists with expertise in the development of mathematical models for image and data analysis. This collaboration makes this research an authentic endeavor to develop impactful tools and knowledge relevant to the biology of the cardiac muscle.
1. Introduction
Recent works have shown coordinated cell division during the embryonic stage^{(Reference Economou, Brock, Cobourne and Green1–Reference Li, Miao and Shieh3)} regulates the orientation of myofibres and consequently determines heart contraction patterns. Studying morphogenesis of a developing heart is thus crucial to accelerate fundamental research in cardiology. Novel imaging techniques, such as lightsheet microscopy have allowed larger depth of field imaging, facilitating cellular level 3D imaging of murine embryo heart. The quantification of cellular morphology in 3D is, however, restricted by the availability of dedicated automatic computational methods. In this article, we propose a method to segment cells from 3D lightsheet microscopy images of murine myocardial tissue (with membrane and nuclei fluorescence) to facilitate precise extraction of morphological features of cardiomyocytes.
While lightsheet microscopy allows larger depth of field imaging, this introduces greater inference from dense tissue leading to significant intensity heterogeneity. This adds to the challenges in identification of individual cells (segmentation) from dense tissue samples imaged in 3D. The tissue characteristics of the imaging organ/organism introduce significant signal variation and structural alteration. In comparison to the more commonly used prototype organisms—zebrafish or Caenorhabditis elegans, lower transparency of mouse myocardial tissue leads to low signal to noise ratio. Additionally, the myocardial tissue, unlike the epithelial, has more irregular architecture as shown in Figure 1. Due to this nonuniform structure of myocardium in the heart, regions consisting of nonmyocardial cells (shown in Figure 1 with blue arrows) are introduced in the images. It is necessary to identify and discard these regions at an early computational stage for a robust morphogenesis analysis of cardiomyocytes.
1.1. Related works
In the literature, few works have proposed methods for cell segmentation from 3D tissue samples. The more commonly used approaches^{(Reference Cao, Wong, Zhao and Yan4–Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo8)} employ only cell membrane images (without nuclei staining). These methods mainly employ techniques for cell membrane image enhancement and subsequent detection of the intermembranous regions to identify individual cells. The efficiency of these methods relies mainly on 3D membrane enhancement to deal with the intensity inhomogeneity and incomplete structures. Various techniques, such as Hessianbased image enhancement^{(}Reference Cao, Wong, Zhao and Yan^{4}^{,} Reference Mosaliganti, Noche, Xiong, Swinburne and Megason^{7}^{)}, unsupervised clustering,^{(}Reference Cao, Zhao and Yan^{5}^{)} and supervised machine learning^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)} have been adopted in literature to obtain a noisefree prediction of the cell membrane. The enhanced images of cell membrane are then subject to 3D Watershed^{(Reference Cao, Wong, Zhao and Yan4–Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo8)} to achieve final cell segmentation. A slightly different approach is proposed in Ref. (Reference Stegmaier, Amat and Lemon9), where watershed segmentation is first employed to detect cells from enhanced 2D membrane images. The detected cellular regions are then merged axially based on various constraints to obtain final 3D cell reconstructions. Using solely the cell membrane for segmentation is intuitive only when nuclei information is unavailable. But as pointed out in Ref. (Reference Cao, Wong, Zhao and Yan4), these approaches do not allow to discriminate between cellular and noncellular regions. This leads to significant false positives in more complex and developed tissue, such as the myocardium.
In presence of both membrane and nuclei images, the nuclei information can aid in determining precise cell locations. Very few works in literature^{(}Reference Azuma and Onami^{10}^{,} Reference Pop, Dufour and Le Garrec^{11}^{)} have employed multifluorescence images, cell membrane and nuclei, to guide cell segmentation. Azuma and Onami^{(}Reference Azuma and Onami^{10}^{)} achieve this by using detected nuclei regions as the seed for watershed method to segment cells from C. elegans embryo images. The nuclei segmentation is performed using the difference of Gaussian filtering. This approach relies heavily on the correct choice of parameters and is inadequate in spatially resolving nuclei from lowresolution 3D images. An alternative approach was developed by Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)}, where the intensity clustering technique was employed for 3D nuclei detection from confocal microscopy images of murine heart. This was followed by coupled active mesh initialized at the detected nuclei and evolved using the membrane cues to obtain 3D cell segmentation. The 3D active mesh approach is robust to noisy and discontinuous membrane structure avoiding leakage in the cell membrane. However, the clustering approach is dependent on image intensity and tends to oversegment nuclei due to the intensity heterogeneity. Neither of these methods combines the nuclei and membrane information to aid in their detection and for the final cell segmentation.
In order to deal with the aforementioned challenges and shortcomings of stateofart methods in 3D cell segmentation from tissues, we propose a method that exploits the robustness of convolutional neural network for detection, efficiency of a graphbased association method, and active mesh to obtain 3D reconstruction of cardiomyocytes Ca3D.
1.2. Main contributions
In this article, we propose a novel approach to 3D cell segmentation from multifluorescence image of cardiac tissue of a mouse embryo. A single convolutional neural network is designed to first obtain nuclei and membrane detection from 2D slices of a 3D volumetric image. The neural network is designed to disregard nuclei from nonmyocardial regions at an initial computational stage. The 3D nuclei reconstruction is then obtained via solving an optimization problem, which associates detected 2D nuclei along the axial direction. This global nuclei association approach aids in delineating cells along axial direction while maintaining a more regular 3D structure. A linear optimization embedded in an alternating direction method of multipliers is designed to solve the association problem. The global association problem can easily employ biologically inspired constraints within the optimization framework. Further, this also eliminates nuisances of overdetection due to intensity heterogeneity as observed in clustering^{(}Reference Pop, Dufour and Le Garrec^{11}^{)} and local heuristicsbased approaches^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}. Finally, an active mesh initialized at the detected 3D nuclei is evolved using cell membranebased force function to segment individual cells from the cluttered environment. The overview of the proposed method is shown in Figure 2. The main highlights of the article are summarized as follows:

– A hybrid pipeline that exploits a multitask neural network (mutually benefit from the multichannel signal) for 2D detection and an optimization method to reconstruct 3D nuclei from 2D detections. The final cell segmentation is obtained by employing an active contour model initialized at the 3D nuclei detections and evolved using the membrane information.

– The optimization problem is designed to associate 2D nuclei along axial direction to obtain 3D nuclei reconstruction. The global association problem is formulated as a linear optimization problem with quadratic cost and biologically inspired cost function.

– A novel algorithm is designed which embeds the linear optimization in an alternating direction method of multipliers to solve the association problem.
The article is organized as follows. In Section 2, the detailed pipeline developed for 3D cell segmentation is described. In Section 3, we describe in detail the mathematical formulation for obtaining 3D reconstruction from 2D detection. In Section 4, we provide the extensive experimental evaluation and comparison with existing methods for 3D cell segmentation. Finally, we conclude the article discussing the advantages of our method and future directions.
2. Ca3D: Pipeline for 3D Segmentation of Cardiomyocytes
The proposed hybrid method exploits convolutional neural networks and classical image processing approaches to obtain robust cell segmentation from 3D images of tissue. The different steps of this pipeline are described in detail in the following subsections.
2.1. Multitask learning for membrane and nuclei detection
The membrane cue is invariably necessary to segment cells from immunofluorescent images^{(}Reference Cao, Wong, Zhao and Yan^{4}^{,} ^{ Reference Mosaliganti, Noche, Xiong, Swinburne and Megason7–Reference Azuma and Onami10)} but as discussed in Ref. (Reference Azuma and Onami10) combining the nuclei and membrane is more conducive for cell segmentation from tissue samples. This is specifically more relevant for cardiomyocyte segmentation which demonstrates significant variation in cell morphology and noncellular structures. In the convolutional neural network community, the efficacy of multitask learning has been demonstrated in the literature^{(}Reference Ramesh and Tasdizen^{12}^{,} Reference Sarkar, Mukherjee, Labruyère and OlivoMarin^{13}^{)}, for various image analysis applications. We exploit the multitask neural network approach to combine the membrane and nuclei information at an initial detection stage for more robust detection.
The neural network is trained on 2D slices of nuclei and membrane images. The network is designed with a partially shared encoder between the nuclei and membrane channels to mutually benefit from one another. Further, due to low resolution, spatially delineating two nuclei is a challenging task. In this scenario, the membrane cue is advantageous to spatially resolve nuclei. An additional constraint for a robust delineation of the nuclei is employed via detection of intermembranous (inverted membrane signal) regions, which would ideally denote the cytoplasmic regions of the myocardial cells. An initial convolution block encodes the features from nuclei and membrane, which are then concatenated and input to the common encoder. Two separate decoderblocks input the original features and output from the encoder to provide pixelwise prediction of the nuclei and membrane region. The cytoplasmic region detection is performed via another branch which uses input from the membrane features and nucleimembrane encoder output. The detailed network architecture is shown in Figure 4. The neural network is trained on the combined loss function from the three channels,
Here $ {I}_m $ , $ {I}_n $ are the ground truth for membrane and nuclei, whereas $ {\hat{I}}_m $ , $ {\hat{I}}_n $ are the corresponding predictions. $ {I}_{im} $ is the intermembranous regions denoted by $ \left(1{I}_m\right) $ , which is precomputed and augmented (nonmyocardial regions are removed) during the ground truth generation process.
Dice loss $ {L}_d $ is used for nuclei and inverted membrane detection whereas mean squared loss, $ {L}_m $ is used for membrane detection. The final refined 2D nuclei prediction is obtained as an overlap of nuclei and intermembranous region prediction. This aids in discarding the nonmyocardial cell nuclei and noisy detection. Sample prediction results of the nuclei and membrane are shown in Figure 3. As can be seen, the enhanced membrane signal and the desired nuclei are obtained with significantly good precision.
2.2. 3D nuclei detection via association
The 3D nuclei structure can be obtained from the 2D detections. This can be achieved by associating the 2D detections axially to construct the 3D nuclei. A naïve approach is to employ local heuristics (such as position, size, etc.)^{(}Reference Stegmaier, Amat and Lemon^{9}^{,} Reference Wang, Sarkar, Aziz, Vaccari, Gahlmann and Acton^{14}^{)} to solve the association problem, but often lead to irregular and erroneous 3D structures. In Refs. (Reference Mukherjee, Basu, Condron and Acton15–Reference Xu, Su, Zhu, Guan and Zhang17), the authors present more sophisticated formulations for 3D segmentation by association. Xu et al. ^{(}Reference Xu, Su, Zhu, Guan and Zhang^{17}^{)} employ a graphbased global association to associate 2D detections for 3D neuron segmentation. In Ref. (Reference Mukherjee, Basu, Condron and Acton15), Mukherjee et al. used a graphbased method which solves the minimum spanning tree problem to associate disjoint objects and achieve 3D neuron segmentation. An attraction forcebased association of disjoint objects in 3D is designed by Ref. (Reference Mukherjee, Condron and Acton16). These aforementioned methods^{(}Reference Mukherjee, Basu, Condron and Acton^{15}^{,} Reference Mukherjee, Condron and Acton^{16}^{,} Reference Pop, Dufour and Le Garrec^{18}^{)} have been designed for segmenting neurons, which demonstrate a tubular treelike geometry unlike cell nuclei that are more convex in structure. A robust data association for 3D nuclei segmentation forms the core of our method since this serves as the initialization of an active mesh model for 3D cell segmentation. Here, we design a novel graphbased association approach to detect 3D nuclei while maintaining the convex geometry and certain biological constraints. In this section, we formally define the mathematical formulation of the association problem.
Let $ \mathcal{I} $ be the 3D volumetric image of nuclei and $ {\mathcal{I}}_z $ be a 2D image ( $ z th $ slice of the volumetric image $ \mathcal{I} $ ) in the domain $ \left[0,1\right] $ . Let $ \Phi ={\left\{{o}_z^j\right\}}_{z=1\dots {N}_s,\hskip0.24em j=1\dots {n}_z} $ a set consisting of binary object on 2D images and $ {o}_z^j\cap {o}_z^k=\mathrm{\varnothing} $ _{,} where $ j, k\in \left\{1\dots {n}_z\right\} $ . Here, N_{s} denotes the number of 2D slices and n_{z} denotes the number of 2D objects in zth slice. $ \Psi ={\left\{{\psi}_p\right\}}_{p=1\dots {N}_3} $ , where N _{3} is the total number of 3D objects in the volumetric image $ \mathcal{I} $ and $ {\psi}_p\cap {\psi}_q=\mathrm{\varnothing} $ . For 3D nuclei detection, we seek to find a function $ \mathcal{F}:\Phi \mapsto \Psi $ . The mapping, $ \mathcal{F} $ , is accomplished by solving a constrained optimization problem, which simultaneously selects a 2D object and a link connecting two 2D objects from consecutive zslices. The problem is defined as follows:
Here a graph structure is imposed on the ensemble of 2D detections. Each 2D nuclei detection $ {o}_z^i $ is considered as a node $ {x}_i $ in the graph (Figure 5a). The connectivity between nodes is defined such that $ {x}_i $ in stack $ z $ is connected to a node $ {x}_j $ in the subsequent stack $ z+1 $ . Each node in the graph is connected to a source, s and a sink, t. The edge between two nodes x_{i} and x_{j} is denoted by an indicator function $ {Y}_{i, j} $ , that is, Y_{i,j} = 1 if an edge is present between two 2D objects and 0 otherwise. $ {Y}_{i, j} $ determines the connectivity between 2D detections to reconstruct the 3D nuclei. The nucleus connectivity is penalized by a weight $ {C}_{i, j}, $ which is computed based on how likely $ \mathcal{F}\left({o}_z^i\right)=\mathcal{F}\left({o}_{z+1}^j\right) $ . The link between a source and a node is denoted by $ {Y}_{\hskip1pt s, i} $ and $ {Y}_{i, t} $ with connectivity weights $ {C}_{s, i} $ and $ {C}_{i, t} $ respectively. $ {C}_i $ is defined as the likelihood of a 2D object belonging to any flow path. The second order cost $ {Q}_{i, j, k} $ employed in the optimization regulates if nuclei from three subsequent stacks belong to the same 3D object. The optimal solution is obtained by sending the maximum flow from source to sink while minimizing overall cost. The 2D objects in one flow path from source to sink can be inferred as one 3D object. A detailed description of the optimization problem solution is given in Section 3.
This optimization problem satisfies three main properties necessary to reconstruct 3D nuclei from 2D detection.
Property 1. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}{\left({o}_z^k\right)}_{k\ne j} $ . This property implies that two disjoint objects in a particular 2D imaged plane, belong to two different 3D objects. It is satisfied directly during the graph construction where no edge is present between detection (nodes) in the same plane.
Property 2. The second property implies that one 2D object in a particular zslice can be connected to only one 2D object in the subsequent zslice, that is, If $ {o}_z^j\hskip0.70em \cap \hskip0.70em {o}_{\mathcal{N} z}^m\ne \mathrm{\varnothing} $ and $ {o}_z^k\hskip0.35em \cap \hskip0.35em {o}_{\mathcal{N} z}^m\ne \mathrm{\varnothing} $ , where $ j\in \left\{1\dots {n}_z\right\} $ and $ m\in \left\{1\dots {n}_{\mathcal{N} z}\right\} $ , (as shown in Figure 5b), then either one of the following needs to be satisfied:

a. $ \mathcal{F}\left({o}_z^j\right)=\mathcal{F}\left({o}_{\mathcal{N} z}^m\right)\ne \mathcal{F}\left({o}_z^k\right), $

b. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}\left({o}_{\mathcal{N} z}^m\right)=\mathcal{F}\left({o}_z^k\right), $

c. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}\left({o}_{\mathcal{N} z}^m\right)\ne \mathcal{F}\left({o}_z^k\right). $
Here $ \mathcal{N} z $ denotes the neighboring 2D slice. The constraint $ {\sum}_{i, j\in \varepsilon}{Y}_{i, j}={\sum}_{j, k\in \varepsilon}{Y}_{j, k} $ in equation (2) is essential in satisfying this property. The constraint indicates that the number of incoming edges to a node equals the number of outgoing edges. When the total number of incoming edges equals one it satisfies the property that a 2D detection can be connected to only one 2D detection in the subsequent stack.
Property 3. The third property takes into consideration how objects from three subsequent slices can be mapped to the same 3D object. If $ {o}_i^j\hskip0.35em \cap \hskip0.35em {o}_{\mathcal{N} z}^m\hskip0.35em \cap \hskip0.35em {o}_{\mathcal{N}{(\mathcal{N} z)}^l}\ne \mathrm{\varnothing} $ and $ {o}_z^k\hskip0.35em \cap \hskip0.35em {o}_{\mathcal{N} z}^m\hskip0.35em \cap \hskip0.35em {o}_{\mathcal{N}(\mathcal{N} z)}^l\ne \mathrm{\varnothing} $ (as in Figure 5b, where $ l\in \left\{1\dots {n}_{\mathcal{N}\left(\mathcal{N} z\right)}\right\} $ , then either of the following needs to be satisfied:

a. $ \mathcal{F}\left({o}_z^j\right)=\mathcal{F}\left({o}_{\mathcal{N} z}^m\right)=\mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)\ne \mathcal{F}\left({o}_z^k\right), $

b. $ \mathcal{F}\left({o}_z^j\right)=\mathcal{F}\left({o}_{\mathcal{N} z}^m\right)\ne \mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)\ne \mathcal{F}\left({o}_z^k\right), $

c. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}\left({o}_{\mathcal{N} z}^m\right)=\mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)=\mathcal{F}\left({o}_z^k\right), $

d. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}\left({o}_{\mathcal{N} z}^m\right)=\mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)\ne \mathcal{F}\left({o}_z^k\right), $

e. $ \mathcal{F}\left({o}_z^j\right)\ne \mathcal{F}\left({o}_{\mathcal{N} z}^m\right)\ne \mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)\ne \mathcal{F}\left({o}_z^k\right), $

f. $ \mathcal{F}\left({o}_z^k\right)=\mathcal{F}\left({o}_{\mathcal{N} z}^m\right)\ne \mathcal{F}\left({o}_{\mathcal{N}\left(\mathcal{N} z\right)}^l\right)\ne \mathcal{F}\left({o}_z^j\right). $
The secondorder cost, $ {Q}_{i, j, k} $ determines when Y_{i,j} = 1 and Y_{j,k} = 1 simultaneously and thus $ {x}_i $ , $ {x}_j $ , and $ {x}_k $ of stacks z, z + 1, and z + 2 belong to the same 3D object. The optimization details to solve for $ \mathcal{F} $ using equation (2) are detailed in Section 3.
2.3. Active mesh cell segmentation
The 3D nuclei detection provides an approximate location of nuclei. In order to obtain final cell segmentation, the membrane information is essential. Edgebased deformable models^{(}Reference Pop, Dufour and Le Garrec^{11}^{,} Reference Dufour, Thibeaux, Labruyere, Guillen and OlivoMarin^{19}^{)} are more appropriate choice for segmentation in this scenario. This is specifically advantageous in our case, since this formulation allows the mesh to coincide with image edge features and simultaneously forbid overlapping of meshes^{(}Reference Dufour, Thibeaux, Labruyere, Guillen and OlivoMarin^{19}^{)}. The active mesh model implemented in the bioimage analysis software Icy^{(}Reference De Chaumont, Dallongeville and Chenouard^{20}^{)} is used for this purpose. The active contour evolution parameters are estimated in an automated manner as described in Refs. (Reference Pop, Dufour and Le Garrec11) and (Reference Dufour, Thibeaux, Labruyere, Guillen and OlivoMarin19). We initialize the active mesh at the detected nuclei and evolve using the 3D membrane image $ {\mathcal{I}}_M $ . Each 2D slice in $ {\mathcal{I}}_M $ is multitask neural network membrane prediction $ {\hat{I}}_m $ .
3. Mathematical Framework
The solution of global data association problem, without the secondorder constraint, can be achieved via constrained linear programming^{(}Reference Chenouard, Bloch and OlivoMarin^{21}^{)} or network flow algorithm^{(}Reference Goldberg and Tarjan^{22}^{,} Reference Zhang, Li and Nevatia^{23}^{)}. The sparse matrix Q imposing the higherorder connectivity constraint on the graph makes the optimization problem nontrivial. We propose a relaxed linearization technique of the quadratic objective function via variable substitution, which can then be solved via network flow algorithm embedded within an alternating direction method of multipliers framework^{(}Reference Boyd, Parikh and Chu^{24}^{)}.
The objective function in the proposed optimization problem equation (2) can be written in a matrix form as
Here $ \mathbf{C},\mathbf{Y}\in {\mathcal{R}}^N $ and $ \mathbf{Q}\in {\mathcal{R}}^{N\times N} $ , N denotes the number of edges in the graph.
The relaxed linearization of the objective function in equation (3) via variable substitution is then written as
It should be noted here, that Y is a binary integer, but imposing similar constraints on variable Z would require to solve dual linear integer programming problem. In order to avoid that, we impose the designed constraints on the relaxation variable Z. In addition to equality constraint with binary variable Y, Z should have a nonnegative structure. The imposed constraints on Z imply that it will be as close to zero as possible when Y = 0 and $ \mathbf{Z}\simeq 1 $ when Y = 1.
We employ alternating direction method of multipliers to solve the augmented Lagrangian formulation of the problem defined in equation (4). The inequality constrained can be handled by introducing a slack variable v such that Z–v = 0. The augmented Lagrangian of is then given as
Here $ f\left(\nu \right)= max\left(0,{\unicode{x03BB}}_3+\mu \mathbf{Z}\right) $ . λ_{1}, λ_{2}, and λ_{3} are the Lagrangian multiplier and μ is the penalty parameter of the augmented Lagrangian function.
The variable $ \mathbf{Z} $ , $ \mathbf{Y} $ , $ {\lambda}_1 $ , $ {\lambda}_2 $ and $ {\lambda}_3 $ are updated in an alternating minimization fashion. The solution to obtain $ \mathbf{Z} $ and $ \mathbf{Y} $ are explained as follows.
3.1. Update Z
The solution for $ \mathbf{Z} $ can be obtained by first taking the derivative of the augmented Lagrangian with respect to $ \mathbf{Z} $ and setting it to zero. The derivative of cost function in equation (5) with respect to Z is given as
By setting the derivative of equation (6) to 0 and setting $ \beta =\frac{1}{3\mu} $ , we obtain a closedform solution of Z as
3.2. Update Y
The solution for Y is obtained by solving the augmented Lagrangian with the constraints on Y as in equation (3):
Since Y_{i,j} is a binary variable, one can assume $ {Y}_{i, j}^2={Y}_{i, j} $ . We can then rewrite $ \parallel {Z}_{i, j}{Y}_{i, j}{\parallel}_2^2={Y}_{i, j}2{Y}_{i, j}{Z}_{i, j}+{Z}_{i, j}^2 $ . By imposing this approximation^{(}Reference Chari, LacosteJulien, Laptev and Sivic^{25}^{)}, $ \parallel \mathbf{Z}\mathbf{Y}{\parallel}^2\simeq {\left(12\mathbf{Z}\right)}^T\mathbf{Y}+\parallel \mathbf{Z}{\parallel}^2 $ . The optimization problem for Y can now be written as
The optimal flow Y, is obtained by solving equation (9) as a mincost network flow optimization using the pushrelabel algorithm^{(}Reference Goldberg and Tarjan^{22}^{,} Reference Zhang, Li and Nevatia^{23}^{)}. The pseudoalgorithm of the optimization problem is given in Algorithm 1.
Algorithm 1 Ca3D
Input: $ {C}_{i, j} $ , $ {Q}_{i, j, k}\forall i, j\in 1\dots N $
Output: $ {Y}_{i, j}\forall i, j\in 1\dots N $
Method: The ADMM update for the Augmented Lagrangian is as follows:
Initialize $ {\mathbf{Y}}^{\tau} $ by solving equation (2) without the secondorder cost
For $ \tau \ge 0 $ ; For $ t\ge 0 $ ,
$ {\mathbf{Z}}^{t+1}=\beta \left(\left(\mu \mathbf{Q}\right){\mathbf{Y}}^{\tau}+\mu \left(1+\nu \right){\unicode{x03BB}}_1^t{\unicode{x03BB}}_2^t{\unicode{x03BB}}_3^t\right), $ (10)
$ {\unicode{x03BB}}_1^{t+1}={\unicode{x03BB}}_1^t+\mu \left({\mathbf{Z}}^{t+1}{\mathbf{Y}}^k\right), $ (11)
$ {\unicode{x03BB}}_2^{t+1}={\unicode{x03BB}}_2^t+\mu \left({\mathbf{Z}}^{t+1}1\right), $ (12)
$ {\unicode{x03BB}}_3^{t+1}={\unicode{x03BB}}_3^t+\mu \left({\mathbf{Z}}^{t+1}{\nu}^t\right), $ (13)
$ {\nu}^{t+1}=\max \left(0,{\unicode{x03BB}}_3+\mu {\mathbf{Z}}^{t+1}\right). $ (14)
Update $ {\mathbf{Y}}^{\tau +\mathbf{1}} $ using equation (9) with $ {\mathbf{Z}}^{t+1} $ , $ {\unicode{x03BB}}_1^{t+1} $ .
3.3. Graph weight selection
In addition to formulating the optimization method, a robust design of the link costs is an important aspect of the data association problem. For our problem, the link cost connecting the 2D nuclei detections $ {o}_z^i $ and $ {o}_{z+1}^j $ is designed taking into account three different criteria: spatial distance between 2D objects, the change in nuclei size, and a biologically inspired membraneoverlap condition. The spatial distance is simply obtained as the Euclidean distance between two nuclei given as $ {c}_{i j}^c=\frac{\parallel c\left({o}_z^i\right) c\left({o}_{z+1}^j\right){\parallel}_2^2}{2{\sigma}_c^2} $ , where $ c\left({o}_z^i\right) $ is the centroid of nuclei $ {o}_z^i $ and $ {\sigma}_c $ constrains the spatial distance (axially) between two nuclei centers. The size constraint is imposed by the change in area of two nuclei given by $ {c}_{i j}^a=\frac{\parallel a\left({o}_z^i\right) a\left({o}_{z+1}^j\right){\parallel}_2^2}{2{\sigma}_a^2} $ , $ a\left({o}_z^i\right) $ is the area of nuclei detection $ {o}_z^i $ and $ {\sigma}_a^2 $ controls the allowed change in the area. This criterion imposes that a significant increase (or decrease) in 2D nuclei detection size denote they are part of different 3D objects, that is, $ \mathcal{F}\left({o}_z^i\right)\ne \mathcal{F}\left({o}_{z+1}^j\right) $ (or false positives in 2D detection).
The biologically inspired membraneoverlap criterion states that even if $ {o}_z^i $ and $ {o}_{z+1}^j $ are spatially close, $ \mathcal{F}\left({o}_z^i\right)\ne \mathcal{F}\left({o}_{z+1}^j\right) $ , if cell membrane is present between them. We ensure this criterion by designing a membrane crossover cost defined by,
Here $ f\left({M}_{., z},{M}_{., z+1}\right) $ is the interpolation of membrane stack between z and z + 1. N is the number of pixels within the window centered at $ p(.) $ . $ p\left( i, j\right) $ is the midpoint of the line joining centroids of nuclei $ {o}_z^i $ and $ {o}_{z+1}^j $ . The cost of linking two nuclei, C_{i,j} is given as the sum of $ {c}_{ij}^c $ , $ {c}_{ij}^a $ , and $ {c}_{ij}^m $ . The pairwise cost, Q_{i,j,k} is formulated as the sum of C_{i,j} and C_{j,k} to ensure the cooccurrence of three nodes in a single flow. This implies that when the link costs concur with the above criteria for three subsequent nuclei $ {o}_z^i $ , $ {o}_{z+1}^j $ , and $ {o}_{z+1}^k $ , then $ \mathcal{F}\left({o}_z^i\right)=\mathcal{F}\left({o}_{z+1}^j\right)=\mathcal{F}\left({o}_{z+2}^k\right) $ . This also ensures a local smoothness in the linked path.
Sample data association results for 3D nuclei reconstruction are shown in Figure 6. It can be observed from the figure, the nuclei association (overlaid on neural network detection results, in magenta) can delineate nuclei axially even in scenarios where visually identifying the separation is challenging.
4. Experimental Results
Ca3d is applied to segment individual cells from 3D images of murine heart from lightsheet microscopy images. The experimental details and the results are discussed in the following subsections.
4.1. Data description
The mouse hearts were dissected from embryos at 10.5 days after fertilization (E10.5). They were fixed in 4% paraformaldehyde and labeled as a whole mount by immunostaining with antiScrib and antiCadh2 antibodies to detect cardiomyocyte membranes. Nuclei were counterstained with Hoechst. Wholemount hearts were cleared with CUBIC approach^{(}Reference Le Garrec, Domnguez and Desgrange^{26}^{)} and mounted on 0.4% agarose in Reagent2 in a capillary. Multichannel 8bit images were acquired on a Z.1 (Zeiss) lightsheet microscope with a 20×/1.0 objective with voxel resolution is 0.63 × 0.63 × 1.26 μm.
4.2. Neural network training
We obtain 56—2D cropped images (dimension 255 × 255) from the 3D scan of an embryo ventricular region. They are manually annotated to create nuclei and membrane ground truth. The 2D ground truth was generated in a semiautomated manner using Icy, a bioimage analysis software^{(}Reference De Chaumont, Dallongeville and Chenouard^{20}^{)}. Ground truth for nuclei was generated by drawing manual contours. The membrane images are first enhanced using a hessian enhancement scheme and manual thresholding to obtain binary detection. An inpainting tool^{(}Reference De Chaumont, Dallongeville and Chenouard^{20}^{)} is used to connect (disconnect) membrane regions to preserve continuity (remove noisy structures) of the structures. Finally, we compute the inverse of membrane ground truth and apply areathreshold to discard nonmyocardial regions. This forms the ground truth for the intermembrane prediction branch. We perform a qualitative and quantitative comparison with four stateoftheart methods to demonstrate the efficacy of our method. A dataset consisting of 500 images is created by employing various augmentations, such as horizontal and vertical flips, rotation, variation in brightness.
The neural network is trained for 100 epochs with a learning rate of 10^{−3} and a learning rate decay of 0.9 at every 50 epochs. Adam Optimizer^{(}Reference Kingma and Ba^{27}^{)} is used to train the network. Details of the network architecture are shown in Figure 4. The filter dimensions (height and width) and the number of channels (Channel Size) at each convolution layer are shown in the aforementioned figure (above each convolution step). The channel size at each step is also marked above each block in the schematic. The dotted blocks denote the concatenation of channels. Each convolution layer consists of ReLU activation followed by Maxpool (size 2 × 2) for the encoder section and an upsampling (scale factor 2) in the decoder section. Sigmoid activation layer is used in the final convolutional layer to obtain a smooth and scaled output.
4.3. 3D segmentation and validation
3D segmentation using Ca3d pipeline is performed on six crops of size 255 × 255 × 70, extracted from the ventricular region of the volumetric image of two different embryos. From each crop, the 2D slices of nuclei and membrane channel are first used to obtain 2D nuclei, membrane and intermembranous segmentation using the multitask neural network. As discussed earlier, the final refined 2D nuclei prediction is obtained as an overlap of nuclei and intermembranous region predictions. The 3D nuclei detections are obtained via the proposed association approach. Before applying active contour, the 3D nuclei of relatively small size are identified and discarded. This ensures errors due to nuclei oversegmentation are reduced in the final step. Due to the size and shape variations (resulting from intensity inhomogeneity), the terminal regions of 3D nuclei (axially) may lead to multiple associated fragments causing nuclei oversegmentation. As we seek to employ a single nuclei detection during active contour initialization for individual cell segmentation, we employ, a simple morphologybased threshold approach to discard relatively smaller 3D nuclei. Only the associated 3D nuclei which consist of 2D detections comprising of seven or more Zslices are used for active contour segmentation. This is determined from expert annotation of nuclei. 2D nuclei regions (from neural network) of the cardiomyocytes are merged along the axial direction manually by a developmental biologist to obtain 3D nuclei structures (this is also used to compute semimanual 3D cell segmentation for validation as discussed later in Section 4.4). From these annotations, minimum size (elongation) of the nuclei was determined as 12 μm. With Zresolution 1.26 μm and keeping into account possible errors ensuing from neural network and the 3D association, 9 μm(1.26 × 7 μm) is selected as threshold of nuclei size. 3D nuclei objects below this size are discarded before employing active contour segmentation. This confirms smaller fragments, specifically resulting from oversegmentation of nuclei, are discarded before active contour initialization.
Due to the lack of ground truth in 3D, we first perform manual validation on a part of the segmented data. The segmentation result on the rest of the data is statistically compared with the validation for different cell morphological parameters. We also compare our method with stateoftheart methods both qualitatively and quantitatively. The details of the validation and comparison methods are given in the following subsections.
4.4. Comparison with competing methods
4.4.1. Comparison methods
We compare Ca3d with four competing stateof theart methods for cell segmentation from 3D images of tissue. The comparison methods were developed to segment cells from 3D images of confocal or lightsheet microscopy images for different organisms (such as mouse, C. elegans) and different developing organs (embryo, heart, and kidney).

i. Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)}: The method was developed for cell segmentation from 3D images of mouse heart from confocal microscopy. This method employs a hierarchical Kmeans clustering^{(}Reference Pop, Dufour and Le Garrec^{18}^{,} Reference Dufour, MeasYedid, Grassart and OlivoMarin^{28}^{)} for 3D nuclei detection. The active surface model^{(}Reference Dufour, Thibeaux, Labruyere, Guillen and OlivoMarin^{19}^{)} is then initialized at the detected nuclei and evolved using the enhanced membrane image.

ii. RACE ^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}: An automated method was designed for cell segmentation of developing embryos from lightsheet and confocal microscopy images. The method employs membrane enhancement techniques to detect intermembranous regions indicating the cellular cytoplasmic region. Morphological operations are employed to enhance and discard detected regions of irrelevant size. 2D watershed segmentation is employed to segment cellular regions in each Zstack, which are then merged using heuristics to obtain the 3D cell reconstruction.

iii. 3DMMS ^{(}Reference Cao, Wong, Zhao and Yan^{4}^{)}: The method employs 3D hessianbased membrane enhancement and intermembrane information for cell region detection. 3D watershed segmentation is performed using these detected regions and a distance computed using the inverted membrane. In this method, precomputed nuclei lineage details from 3D image sequences are used to improve the final segmentation. Due to the absence of nuclei lineage information for our experimentation protocol, we employ the nuclei predicted using the neural network for postprocessing.

iv. ShapeMetrics ^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)} This method performs cell membrane detection using Ilastik ^{(}Reference Sommer, Straehle, Koethe and Hamprecht^{29}^{)}, an interactive machine learning approach for object prediction. Different threshold values are then applied for binary segmentation of the membrane and the threshold value producing visually best prediction is selected. 3D watershed segmentation on the enhanced membrane detection is performed to obtain cell segmentation.
4.4.2. 3D segmentation validation
As we have mentioned earlier, obtaining ground truth for individual cells from the tissue is a time consuming and laborious task even for experts in the field. Consequently, we compare the cell morphological values with two types of validation methods—manual and semimanual cell segmentation.

i. Manual evaluation: The manual evaluation was performed by looking at individually segmented cells and original images by experts to determine the segmentation quality. The segmentation accuracy was measured based on two different criteria: (a) celloverlap: axially and spatially nonoverlapping cells and (b) approximate cell size. The nonoverlapping criterion considers if the segmentation contour overlaps multiple cells either axially or spatially. The cell size criterion looks at the number of Zslices comprising a cell and cell elongation. Each of the segmented cells is given a highconfidence or low confidence score based on the above two criteria. Manual evaluation is performed on randomly selected 300 cells for three different volumetric crops. The frequency distribution of the evaluated cells based on highconfidence value using cell size and cell overlap criterion are individually provided in Figure 7. The distribution of the cell morphological parameters, when both criteria have high confidence value simultaneously, is also shown in Figure 7 (in green). Although the distributions are almost similar to individual criteria, high confidence in both provides a more precise range for cell parameters. We employ this combined highconfidence range to evaluate the efficiency of Ca3D and compare it with stateoftheart methods.

ii. Semimanual 3D cell segmentation: A semimanual method is used to generate cell segmentation for comparison and evaluation. In this process, the 2D nuclei detection (from neural network) of the cardiomyocytes are merged along the axial direction manually by a developmental biologist. The shape and size change of detected nuclei, spatial displacement, presence of cell membrane are evaluated manually during the merging procedure. The merged nuclei act as the initialization of active meshes which is evolved using the detected membrane image are to reconstruct individual cells in 3D. This also provides a confidence measure for the 3D nuclei reconstruction problem^{(}Reference Sarkar, Darby, Foucambert, Meilhac and OlivoMarin^{30}^{)}.
4.4.3. Quantitative comparison with cell morphology distribution profile
The methods are compared with the above manual and semimanual validation techniques with respect to various cell morphological parameters: the cell elongation, cellwidth, cell elongation ratio, surface area of cells. Any one of these cell parameters is not sufficient to assess the efficacy of a method, but if the performance is uniform over the different parameters, the efficacy is then definitely convincing. The mean and standard deviation of the parameters of segmented cells using Ca3D, RACE^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}, 3DMMS^{(}Reference Cao, Wong, Zhao and Yan^{4}^{)}, ShapeMetrics^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)}, and Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)} are presented in Table 1. In addition to that, the percentage of over and undersegmented cells are also presented in Table 1.
It is observed that for different cell parameters, Ca3D performs evenly in comparison to the other methods. For cell elongation, although the mean value of 3DMMS and ShapeMetrics is closer to the manually validated value, a significantly high standard deviation is observed. The mean values for RACE are significantly higher and that of Pop et al. are lower compared to the validation data. This observation is also apparent from the distribution of cells in Figure 8a. The distribution of cell population pertaining to cell elongation parameter for Ca3D is comparable to the validation data in comparison to the other methods. A similar observation is made for cell width. While Ca3D and 3DMMS have similar mean values of cell width, 3DMMS demonstrates a higher standard deviation. Pop et al. and RACE on the other hand have significantly smaller population mean demonstrating a shift in the cell population density from that of the validation data (Figure 8b). The combined effect of the cell elongation and cell width is reflected in the cell elongation ratio (=cell elongation/cell width). The high cell elongation and low cell width lead to a high value of cell elongation ratio for RACE leading to a more flat distribution (Figure 8c). In comparison, Ca3D demonstrates a distribution, which is concurrent with the validation data and closely followed by Pop et al. But, Pop et al. demonstrate a lower cell population mean for surface area. Contrarily, 3DMMS has a similar population mean as the validation data but a high standard deviation leading to higher oversized and undersized segmented cells.
To quantify the distribution presented in Figure 8b, we compute the KL divergence measure and Bhattacharya distance between the parameter distribution of each comparison method and that of the validation methods. A lower value of the distance measure implies that the distribution is similar. It is noted from Table 1, that Ca3D, for all the cell parameters concur with that of manual validation (7th and 8th column) as well as semiautomatic validation (9th and 10th column).
The percentage of oversegmented, undersegmented, and withinrange cells are also presented in Table 1. The range for accurate cell size based on the aforementioned parameters is given as μ_{v} ± k_{v} × σ_{v}. Here, μ_{v} and σ_{v} are the mean and standard deviation of the validated cells. k_{v} is decided such that 90% of the validated cells are within this range. It is observed, with Ca3D we obtain highest percentages of cells, which lie within the valid range for three of the cell parameters. The number of cells within the correct range for surface area is highest for RACE^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}, but percentage of undersized cells with respect to cell width is comparatively higher, which implies that detected cells are thinner and longer in structure. This is also evident from the elongation ratio values, where RACE^{(}Reference Stegmaier, Amat and Lemon^{9}^{)} demonstrates the highest percentage of oversized cells. Lowest percentage of oversized cells is obtained with Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)}, but percentage of undersized cells is significantly high. This is pertaining to the fact that intensitybased clustering possibly detects fragments of nuclei leading to the detection of smallersized cells. 3DMMS^{(}Reference Cao, Wong, Zhao and Yan^{4}^{)} and ShapeMetrics^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)} both demonstrate significantly high percentages of both oversized and undersized cells in comparison to Ca3D.
The qualitative evaluation by visual inspection is shown in Figures 9 and 10. The 3D segmented image viewed from three viewing planes is presented in Figure 9. As observed from the images, RACE^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}, Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)}, and ShapeMetrics^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)} demonstrate significant erroneous cell segmentation involving partial detection, significantly low number of detected cells, ambiguous cell boundary. Cell segmentation obtained using 3DMMS^{(}Reference Cao, Wong, Zhao and Yan^{4}^{)} identifies cell which aligns with the cellmembrane but generates oversegmented cells spatially as well as axially. 3D visualization of cell segmentation on a sample volumetric image and 3D images of some individual cells obtained for each of these methods are shown in Figure 10. Significantly, nonconvex and noisy cellboundary is observed for ShapeMetrics^{(}Reference Takko, Pajanoja, Kurtzeborn, Hsin, Kuure and Kerosuo^{8}^{)}, 3DMMS^{(}Reference Cao, Wong, Zhao and Yan^{4}^{)}, and RACE^{(}Reference Stegmaier, Amat and Lemon^{9}^{)}. Pop et al. ^{(}Reference Pop, Dufour and Le Garrec^{11}^{)} exhibit a smoother cell boundary owing to the application of active mesh model. The superiority in segmentation quality of Ca3D is observed from both Figures 9 and 10 where segmented cells are compactsized, with smoother boundary complying with the cell membrane while maintaining approximately convex shape.
4.5. Ablation study
We also present a comparison for ablation study for the convolutional neural network. The multitask neural network has three different task (nuclei, membrane, and intermembranous region) detection objectives. The ablation study is performed by using different task combinations to train the neural network. We employ three different tasks combinations: (a) nuclei, (b) nuclei and membrane, and (c) nuclei and inverted membrane. For the first and third task combination, we use 3D enhanced membrane (as in Pop et al.) for the active contour evolution. Table 2 provides the mean and standard deviation values along with percentage of over and undersized cells obtained for each of the neural network task combinations with respect to the validation data. For cell elongation and cell width, a lower percentage of oversized cells is obtained using nuclei and membrane tasks but demonstrates a higher percentage of undersized cells. The same setting also demonstrates a high percentage of oversized cells pertaining to ellelongation ratio. The error possibly occurs due to the consideration of nonmyocardial cells which are discarded by Ca3D using the combination of nuclei and invertedmembrane function. It is observed from the table, that under different cell parameters, Ca3D demonstrates a higher percentage of correctly sized cells in comparison to the three other task settings of the neural network.
5. Conclusion
In this article, we propose an approach for 3D cell segmentation multifluorescence volumetric images of mouse embryo heart imaged via lightsheet microscopy. We developed a hybrid approach, which exploits the robustness of convolutional neural network, a graphbased association method, and active mesh to delineate individual cells in myocardial tissue. The data association problem, which forms the main backbone of the method is obtained using a novel optimization technique that solves the mincost network flow problem embedded in an alternating direction method of multipliers framework. We demonstrate the efficacy and superiority of our method via various qualitative and quantitative analyses and comparison with stateoftheart methods. The experimental validation of our 3D cell segmentation approach provides confidence for the applicability of this method in future analysis of coordinated cell division and cardiac muscle orientation.
Competing Interests
The authors declare no competing interests exist.
Authorship Contributions
Conceptualization: R.S., J.C.O.M; Data curation: D.D., S.M.; Data visualization: R.S., D.D. J.C.O.M.; Methodology: R.S.; Writing—original draft: R.S., D.D., S.M., J.C.O.M. All authors approved the final submitted draft.
Funding Statement
This work was supported in part by the FranceBioImaging (FBI) infrastructure under Grant ANR10INBS04, and the Program PIA INCEPTION under Grant ANR16CONV0005. The Meilhac Lab is supported by core funding from the Institut Pasteur and state funding from the Agence Nationale de la Recherche under the “Investissements d’avenir” program (ANR10IAHU01) and an FRM grant (DPC20171239475). D.D. was a recipient of fellowships from the Region IledeFrance and FRM. S.M. is an INSERM research scientist.
Data Availability Statement
The code is available at https://github.com/rituparnaS/3Dcellsegmentationvianeuralnetworkanddataassociation.
Acknowledgment
We are grateful for the technical assistance of Héloise Foucambert for help during the initial stages of the study.