1. Introduction
Image-to-physical registration is an essential component of surgical navigation systems wherein patient-specific information from preoperative imaging is aligned to the intraprocedural coordinate space of the patient. This alignment process requires an intraoperative data collection step to acquire intraprocedural shape descriptors of the organ of interest. Several methods have been developed for three-dimensional measurement of intraoperative organ geometry, including optical or electromagnetic tracking of tool tips, stereo camera reconstruction, laser range scanning, tracked ultrasound, and cone-beam computed tomography, among others.1 However, constraints within the surgical environment require that organ data be collected in an expedient manner compatible with existing surgical workflows. In consideration of these constraints, the intraoperative data most often available to navigation systems consists of intraoperative point clouds collected over a limited extent of the organ surface exposed during the procedure. In soft tissue organs, these intraoperative surface measurements not only drive rigid spatial alignments between preoperative and intraoperative coordinate frames, but they also encode changes in organ shape that occur between preoperative imaging and intraoperative organ presentation. The ability to accurately estimate a complete soft tissue deformation field throughout the organ from these sparse organ surface measurements is a vital objective for many guidance systems involving soft tissue organs. 
This inference task is the basis of the sparse data challenge herein, which focuses on the application of image-guided liver surgery in which organ deformations frequently reach several centimeters in magnitude.2,3 Rigid and nonrigid registration approaches have been reported in the literature to accomplish image-to-physical alignment, yet direct comparisons of accuracy associated with these methods have long remained out of reach due to a lack of shared validation datasets. Opportunities to compare performance on shared challenge data will provide new insights toward effective common strategies that may be synthesized into the next generation of novel registration algorithms. At SPIE Medical Imaging 2019, Brewer et al. introduced the first image-to-physical liver registration sparse data challenge4 to allow research groups to validate registration algorithms on a shared dataset. This dataset is based on sparse intraoperative data collected from a tissue-mimicking silicone liver phantom under multiple configurations of intraoperative deformation, previously available at Ref. 5. The challenge data consisted of a preoperative liver mesh and 112 intraoperative data patterns sampled from four unique deformed intraoperative poses produced by placing mock surgical packing beneath the posterior surface of the liver phantom. Validation targets were distributed at blinded positions throughout the phantom to allow for an unbiased comparison of ground truth target positions against predicted target positions estimated by any proposed registration algorithm. 
Each of the 112 data patterns was mapped onto the deformed intraoperative organ poses from real patterns of intraoperative data collected with an optically tracked stylus over the visible patch of the anterior surface during clinical evaluation of an image-guided liver surgery system.6,7 Since the release of the sparse data challenge in 2019, a variety of methods including three rigid, two deep learning, and three biomechanically driven nonrigid registration techniques have been contributed to the challenge. This paper presents results comparing these registration approaches on the common validation dataset and offers the first direct comparison study of sparse data registration accuracy in a phantom simulation of image-guided liver surgery conditions. This paper reports full metrics for target registration errors (TRE) that were previously blinded to participants in the sparse data challenge to ensure impartiality. Furthermore, this work includes a detailed analysis of registration errors across multiple sparse data registration methods on a large common set of scenarios of image-guided navigation in the liver. Deformable registration methods are evaluated for sensitivity to data coverage, variability across anatomical segments, the effect of measurement noise, differences in initial alignment, and biomechanical consistency of the deformation field. Through these analyses, common limitations and effective strategies are identified across a variety of proposed methods to inform future algorithmic development in the domain of image-to-physical sparse data registration algorithms. Finally, the full validation data, including the previously blinded target positions and analysis techniques for the sparse data challenge, will be released hereafter via the Open Science Framework at Ref. 8 as a continuing platform for algorithm characterization and benchmarking. 
2. Methods
2.1. Sparse Data Challenge
The dataset associated with the image-to-physical liver registration sparse data challenge4 was generated via a silicone liver phantom fabricated from a mixture of 80% Ecoflex 00-10 platinum-cure silicone, 10% Silicone Thinner, and 10% Slacker Tactile Mutator (Smooth-On Inc., Macungie, Pennsylvania, United States) molded into a patient-specific 3D-printed cast of a CT-segmented liver volume. A total of 159 stainless steel beads were implanted into the silicone phantom as CT-visible validation targets. The liver phantom was removed from its cast, and mock laparotomy pads were placed under the posterior face of the phantom to simulate four different configurations of plausible intraoperative deformations. Repeat CT imaging was performed to establish a baseline phantom configuration representing the undeformed preoperative state and a series of four intraoperative deformation configurations from which the ground truth liver geometry and positions of validation targets were segmented using ITK-SNAP (Kitware Inc., Clifton Park, New York, United States). The 112 registration datasets were constructed by mapping 28 unique sparse data patterns to each of the four deformation states via the data transposition method in Ref. 9. Each sparse data pattern was acquired during a clinical study of an image-guided liver surgery system approved by the institutional review board at Memorial Sloan Kettering Cancer Center and with informed consent of all participants.6,7 The extent of data coverage over the total surface area of the liver varied with each pattern from 20% to 44%, with average extent of . To simulate intraoperative instrumentation noise, 21 of 28 data patterns (84 datasets) were mapped to the intraoperative liver geometry with sinusoidal noise of 2-mm amplitude, and the remaining 7 data patterns (28 datasets) were mapped to the intraoperative liver surfaces without additional noise. 
Randomized rotations and translations were applied to each dataset to ensure that the sparse intraoperative point clouds from each dataset were treated independently. Random rotations were sampled uniformly in SO(3) via normalized axis-angle parameters for and , whereas translations were randomly sampled using a uniform distribution in the range mm for . Figure 1 illustrates the structure of the sparse data challenge. Given the 112 sparse intraoperative datasets and the initial preoperative liver volume in Fig. 1(a), the task of the challenge is to perform a registration that most accurately predicts the deformed state of the whole organ based on the limited information provided by each sparse data pattern. Registration results are provided according to dense displacement fields [Fig. 1(b)] defined over the preoperative liver volume. Displacements at the blinded validation target locations [Fig. 1(c)] are then interpolated to determine the registration errors of the estimated target positions. The total variation in sparse data patterns across the 112 registration instances is depicted in Fig. 2. Additionally, the sparse data challenge was structured to provide participants with the ability to inform algorithmic development in a restricted manner. An incomplete set of ground truth data for 35 of 159 target positions in 4 of the 112 datasets drawn from 2 of the 4 underlying deformation poses was provided to participants. Furthermore, a web portal was implemented using Amazon Web Services (Amazon Web Services Inc., Seattle, Washington, United States) to allow participants to upload in-progress and finalized registration results. These results were automatically processed to yield coarse summary measures of average TRE across the full dataset and stratified across low, medium, and high ranges of data extent. These results files were hosted on a publicly available dashboard at Ref. 
5 for the purposes of benchmarking and to offer a limited capability for hyperparameter characterization and algorithmic tuning among participants. The dataset and contributed results will continue to be available through the Open Science Framework at Ref. 8 upon closure of the challenge site. 
2.2. Evaluation
The primary endpoint of the sparse data challenge is the set of TRE associated with individual registrations to the 112 unique intraoperative data patterns that comprise the dataset. Registration errors of the 159 validation targets in each registration were measured according to where is the ground truth intraoperative position of target and is the estimated position of target predicted by the registration. Average TRE () of each registration was computed as the mean of . Further, among registrations to each of the 112 datasets were statistically compared between registration methods via the Friedman test at a significance level of with Bonferroni correction applied to reported -values to adjust for multiple comparisons.
Furthermore, the sensitivity of TRE to digitization noise was evaluated. The impact of digitization noise was quantified through measures for noise efficiency and noise degradation according to for a noise level of and average TRE of noisy and noise-free registrations and , respectively. The impact of digitization noise on the average value and variance of was assessed for each method using the Wilcoxon rank sum and Brown–Forsythe tests, respectively, at a significance level of .
TRE performance was also stratified by the effects of surface data coverage and target location within the segmental anatomy of the liver. The extent of surface data coverage was defined according to the method of Ref. 3, in which the data extent was computed as the percentage of total organ surface area encompassed by an alpha shape fit to the sparse intraoperative point cloud. 
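The per-target error and per-registration average described in Sec. 2.2 can be illustrated with a minimal sketch. It assumes that each target error is the Euclidean distance between the ground truth and predicted target positions, which is the conventional definition of TRE; the function names and the three example targets are hypothetical and are not part of the challenge data or its evaluation code.

```python
import math


def target_registration_errors(truth, predicted):
    """Return the Euclidean distance (mm) between each ground truth
    target position and the corresponding predicted position."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must list the same targets")
    return [math.dist(t, p) for t, p in zip(truth, predicted)]


def mean_tre(truth, predicted):
    """Return the average TRE of one registration (mean over its targets)."""
    errors = target_registration_errors(truth, predicted)
    return sum(errors) / len(errors)


# Hypothetical target positions in mm (illustrative values only).
truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 5.0)]
predicted = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 7.0)]

errors = target_registration_errors(truth, predicted)
average = mean_tre(truth, predicted)
```

In the challenge itself, the predicted positions would come from interpolating a submitted displacement field at the 159 blinded target locations; that interpolation step is outside the scope of this sketch.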
Moreover, targets were stratified into the eight anatomical Couinaud segments to evaluate the potential impact of clinical designation of lesion location on registration performance. Anatomical segments S2 through S8 each contained 19 to 30 individual targets. The caudate lobe (S1) was not evaluated for lack of a sufficient number of validation targets in this region of the liver. Similarly, the effects of surface data coverage and target segment location on registration accuracy were assessed via the Friedman test with Bonferroni correction with significance level . The impact of initial rigid pose on TRE was also compared among methods that had code available to the authors. Sensitivities to initial alignment were assessed via two-sample Kolmogorov–Smirnov tests at a significance level of to detect whether statistical distributions of associated with the deformable registration methods differed under three conditions of initial rigid alignment: (1) an optimal point-based rigid registration of ground truth target positions, (2) an iterative closest point (ICP) rigid registration algorithm with manual initialization and refinement, and (3) a fully automatic salient feature weighted ICP (wICP) algorithm. Displacement fields of the deformable registration algorithms were also analyzed for biomechanical consistency by computing the norm of the rotation-invariant Green strain tensor and the Jacobian determinant of the displacement fields applied to the liver in each registration, according to for each element displacement gradient tensor , where is the L2 matrix norm, is the matrix determinant, and is the identity matrix.
Finally, correlations among the resulting of each method were compared in a Pearson correlation plot to identify similarities in target error behaviors among registration methods. 
3. Registration Comparators
Eight liver registration strategies were contributed and evaluated in the sparse data challenge. 
Complete submissions were made to the sparse data challenge for each method by providing displacement fields for registrations to all 112 sparse data patterns. These methods consisted of the following techniques:
Table 1 briefly summarizes the key features of these algorithms, which are described in detail in Sec. 7. 
Table 1. Algorithmic profile of registration methods contributed to the sparse data challenge. 
4. Results
4.1. Target Registration Errors
Registration results from one of the 112 sparse datasets are visualized in Fig. 3 for the rigid and deformable registration comparators, with predicted and ground truth target positions displayed alongside the intraoperative distribution of sparse data driving the registration. Figure 4 illustrates the distribution of average TRE within registrations to each of the 112 datasets and summarizes the overall mean, standard deviation, and median performance of each method. Among rigid registration methods, the manually supervised ICP approach led to significantly lower average TRE than the fully automatic salient feature wICP algorithm (), although both ICP () and wICP () rigid alignments led to significantly worse average TRE than the optimal PBR alignment of targets. With respect to the deformable registration methods, although the biomechanical boundary condition reconstruction methods of Heiselman and Mestdagh did not significantly differ from each other (), the method of Heiselman provided registrations with significantly lower TRE than the optimal point-based rigid registration of targets (), whereas the method of Mestdagh did not significantly improve over the optimal PBR (). The deep learning method by Jia was not found to produce TRE higher than either the optimal point-based registration () or Mestdagh (), but significantly worse performance was detected compared to Heiselman (). Meanwhile, the biomechanical method by Ringel produced average TRE that did not significantly differ from Jia () or the ICP method (), and the deep learning method by Pfeiffer was significantly less accurate than ICP () but not wICP (). All other pairs of registration methods were found to produce significantly different levels of average TRE (all ). 
4.2. Effect of Surface Data Coverage
The 112 sparse intraoperative datasets were stratified according to the extent of surface data coverage as a percentage of the total liver surface area. Of the 112 registration sets, 35 cases were associated with surface coverage between 20–28% extent, 42 cases between 28–36% extent, and 35 cases between 36–44% extent. Figure 5 plots the average TRE of each registration according to the extent of sparse surface data coverage provided over the liver. Across all registration methods, registration performance did not significantly differ as a function of surface data extent (). With respect to the rigid registration algorithms, ICP with manual initialization and refinement achieved average TRE values closer to the optimal rigid registration than the wICP algorithm over the full range of data extent (), which suggests that additional bias introduced by wICP feature weighting may lead to suboptimal rigid alignments despite its excellent practical utility in intraoperative workflows. In comparison with the deformable registration algorithms, Fig. 5 qualitatively shows that the finite element-based biomechanical methods 1 and 2 (Heiselman12 and Mestdagh14) outperform or achieve similar performance to the optimal point-based rigid registration, whereas the deep learning methods are associated with the highest errors among the deformable registration methods. These trends mirror the significance patterns reported in the previous section. The regularized Kelvinlet method (Ringel15) performed similarly to the deep learning method by Jia that incorporates a biomechanical simulation workflow, with slightly improved qualitative stability across extent ranges. Both the methods of Ringel and Jia offered significant improvements over wICP-based rigid registration () but failed to outperform the globally optimal rigid point-based registration. 
The relative stability of average TRE across the low to high extent ranges is consistent with the work of Refs. 3 and 18, which showed that rigid and nonrigid registration methods tend to reach an error floor plateau beyond extent ranges of . Yet, it should be noted that the deep learning approaches seem to exhibit less consistency in their performance across variations in surface data coverage. 
4.3. Effect of Target Location
Validation targets were partitioned into their associated Couinaud segments of the liver to evaluate variations in accuracy contingent on clinical designations of possible target locations. Figure 6 shows the segments identified on the liver mesh and the distribution of distances from each target to the nearest sparse data point in each registration scenario. Columns S2 through S8 in Table 2 summarize the TRE performances of each registration algorithm across the associated anatomical segments. Across all registration methods, TRE significantly varied depending on the anatomical location of the target (). TRE of the deformable registration methods were found to be highest in the most peripheral segments of the liver (S2, S6, and S7) where the informational influence of sparse data tended to be weakest. Compared with S5, significantly higher TRE values were observed across registration methods in S2 (), S6 (), and S7 (), whereas S6 was also found to produce significantly higher TRE than S4 (). Intraoperatively, it should be noted that it is often not possible to collect point cloud data on the surface of S7 due to the dome of the right lobe of the liver obstructing line of sight and direct access to this area of the liver. 
Comparing the performance of each method across the segmental anatomy, the deep learning method of Jia and the regularized Kelvinlet method of Ringel exhibited highest TRE values in S7, which was the anatomical segment on average farthest away from the sparse data included in the dataset, whereas the deep learning method of Pfeiffer produced the least accurate inference of deformations in S2 and S6. The finite element-based biomechanical registration methods of Heiselman and Mestdagh achieved the most consistent performance across segments, although they exhibited highest errors in S6 and S7. When adjusting registration performance for the effect of target location, the only deformable registration methods to show significant improvement over ICP or wICP in all segments were those of Heiselman ( and 0.002, respectively) and Mestdagh ( [N.S] and 0.002, respectively), whereas all other comparisons among methods did not approach statistical significance after correction for multiple comparisons (). These behaviors highlight the need to control uncertainties in anatomical regions that are distant from data and illuminate the benefit of biomechanical models for stabilizing performance in deformable registration algorithms. This consideration may become especially pertinent for deep learning approaches considering their tendency to develop extrapolative fragility when making inferences beyond the span of their training data. 
Table 2. TRE performance overall and in each anatomical segment, reported as mean ± standard deviation (median). Methods that achieve the lowest error within each segment and over all segments of the liver are emphasized in bold. 
4.4. Sensitivity to Measurement Noise
Varying levels of measurement noise are often involved in image-to-physical registrations due to differences in data collection strategies, which may involve user variability, contact or non-contact intraoperative organ digitization techniques, or non-standardized depth reconstruction and tool localization algorithms. Measurement noise was simulated within the sparse data challenge via 84 registration datasets generated with added noise and 28 generated without noise to characterize algorithmic sensitivity to input noise sources. Figure 7 and Table 3 convey the influence of noise on average TRE of each method. The rigid registration and deep learning methods were not significantly affected by differences in input noise (all ), whereas the biomechanical methods were associated with a significant increase in mean TRE (all ). TRE variances of the finite element-based biomechanical methods also significantly increased under conditions of elevated measurement noise (largest ). Although the deep learning and rigid registration methods were less sensitive to added noise, only the biomechanical methods achieved registrations with high efficiency scores indicating TRE magnitudes on par with the relatively small noise level under investigation. Considering the modest noise magnitude, statistical analyses on the degradative effects of noise may be shrouded by the elevated level of baseline error and error variances associated with several of the other methods under investigation. Of the methods evaluated herein, only the finite element-based biomechanical boundary condition reconstruction methods were able to achieve high noise efficiency and small error variances under the 2-mm measurement noise condition. 
Table 3. TRE performance in noise-free and noise-afflicted sparse registration datasets, reported as mean ± standard deviation.
Statistically significant findings are bolded. 
4.5. Sensitivity to Initial Alignment
All deformable registration methods with codes that were made available to the authors were further analyzed to characterize the susceptibility to differences in the choice of rigid pose that initializes the algorithm. The methods of Pfeiffer, Heiselman, and Ringel were included toward this objective. The optimal PBR, ICP, and wICP rigid alignment methods were chosen as initialization comparators for each of the three deformable registration methods, and the resulting distributions of average TRE across the 112 registrations are compared in Fig. 8. The finite element-based biomechanical strategy was robust to the initial pose, and the distributions of average TRE did not significantly shift across initialization strategies (). However, the regularized Kelvinlet-based biomechanical strategy expressed significant differences in TRE distribution under different initial pose configurations (), although the differences in magnitude shifted by . The deep learning method of Pfeiffer exhibited the largest differences in average TRE when varying the initial rigid alignment strategy (). 
4.6. Field Consistency
A biomechanical analysis of displacement fields from each method was performed to analyze constitutive regularity and yield deeper insights toward current algorithmic limitations. The strain norm and Jacobian determinant of displacements on each element were averaged across the 112 displacement fields and are rendered in Fig. 9 for each deformable registration method through a cross-section of the liver. The strain norm plots indicate the locality of where each registration method expects forces to be applied over the liver, and the Jacobian determinant, which measures local volume change and is expected to equal unity for nearly incompressible soft tissue, reveals additional field inconsistencies across registration methods. 
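The two field consistency measures can be sketched for a single element as follows. This is an illustrative sketch rather than the analysis code used in this study: it assumes the standard continuum mechanics definitions, with deformation gradient F = I + grad(u), Green strain E = (F^T F - I)/2, and Jacobian determinant J = det(F). The Frobenius norm is used as a stand-in for the matrix norm, which is an assumption, since the "L2 matrix norm" named in Sec. 2.2 may instead denote the spectral norm. All function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def deformation_gradient(grad_u):
    """F = I + grad(u) for an element displacement gradient grad_u."""
    return [[IDENTITY[i][j] + grad_u[i][j] for j in range(3)] for i in range(3)]


def green_strain(grad_u):
    """Green strain E = (F^T F - I) / 2, which vanishes under rigid rotation."""
    f = deformation_gradient(grad_u)
    c = matmul(transpose(f), f)
    return [[0.5 * (c[i][j] - IDENTITY[i][j]) for j in range(3)] for i in range(3)]


def strain_norm(grad_u):
    """Frobenius norm of the Green strain (assumed choice of matrix norm)."""
    e = green_strain(grad_u)
    return math.sqrt(sum(e[i][j] ** 2 for i in range(3) for j in range(3)))


def jacobian_determinant(grad_u):
    """J = det(F); J = 1 means the element volume is preserved."""
    return det3(deformation_gradient(grad_u))


# A rigid 90-degree rotation about z: grad(u) = R - I.
rotation = [[-1.0, -1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
# A 10% uniaxial stretch along x.
stretch = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

For the rigid rotation, the strain norm is zero and J is one, which is the rotation invariance that motivates the Green strain. For the stretch, the strain norm is 0.105 and J is 1.1, indicating a 10% volume increase.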
Given that the underlying deformations applied to the liver phantom consisted of mock laparotomy pads placed under the posterior surface of the liver, the distribution of strain is expected to be primarily distributed along the posterior surface of the liver. Given the absence of non-gravitational constitutive body forces and free exposure of the anterior surface, the remainder of the liver in this phantom experiment is expected to associate with low strain. Briefly, the finite element-based biomechanical methods demonstrate the closest results to the expected distribution of force deposition on the liver, whereas the regularized Kelvinlet method (Ringel15) offers the most concordant distribution of Jacobian determinants. Overall, each method produces distinct deformation field characteristics that are impacted by several modeling decisions and algorithmic choices, which are outlined in Sec. 7 and discussed in Sec. 5. 
4.7. Correlation Analysis
Finally, Pearson correlations between individual samples for each pair of methods are plotted in Fig. 10. This correlation plot reveals that target error magnitudes within and across rigid registration and deformable registration methods are in general poorly correlated with each other, with few Pearson correlation coefficients exceeding 0.5. Notably, the deep learning method of Pfeiffer is uncorrelated with the other biomechanically informed deformable registration algorithms, with correlation coefficients below 0.16. Similarly, the deep learning method of Jia is weakly correlated, with correlation coefficients below 0.37. Across all registration methods, the strongest correlations are achieved between manually supervised ICP and optimal PBR rigid registrations, and among the three biomechanical boundary condition reconstruction algorithms. 
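A pairwise correlation analysis of this kind can be sketched as below. The sketch assumes that each method contributes one error value per target, listed in the same target order; the method names and error values are hypothetical placeholders, not results from the challenge.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(errors_by_method):
    """Return {(method_a, method_b): r} for every ordered pair of methods."""
    names = list(errors_by_method)
    return {(a, b): pearson(errors_by_method[a], errors_by_method[b])
            for a in names for b in names}


# Hypothetical per-target errors (mm) for three placeholder methods.
errors_by_method = {
    "method_a": [1.0, 2.0, 3.0, 4.0],
    "method_b": [2.0, 4.0, 6.0, 8.0],
    "method_c": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(errors_by_method)
```

Here `method_a` and `method_b` are perfectly correlated, while `method_a` and `method_c` are perfectly anticorrelated; real methods would fall between these extremes, as in Fig. 10.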
Interestingly, the regularized Kelvinlet method of Ringel showed a high correlation with the rigid wICP method of Clements that served as the initial alignment for this method, suggesting that the regularized Kelvinlet approach could be more conservative in preserving the initial pose or could exhibit stiffening effects when performing deformable registration. The overarching lack of strong correlation implies that the specific algorithm choice is profoundly important with respect to the particular way targeting inaccuracies may manifest in prospective guidance applications, since a target location predicted by one algorithm may be only weakly related or even uncorrelated with the same target location predicted by a variant algorithm. Although certain families of registration methods may exhibit greater similarity, it is therefore important for multiple registration algorithms to be compared for performance when proposing new prospective applications for image guidance under sparse data-driven deforming environments. This finding also justifies potential investigation into decision fusion methods to identify consensus registrations that combine results from multiple methods. It is interesting to note that averaging the displacement fields between the two finite element deformable registration methods improves average TRE from individual baselines of and for the methods of Heiselman and Mestdagh, respectively, to merely across the sparse data challenge (, paired -test). 
5. Discussion
Results from the sparse data challenge reveal variabilities in the effectiveness of registration strategies evaluated on a common dataset. It should be noted that all deformable registration methods utilized biomechanical deformation models to different degrees, whether through direct finite element simulation or reinforced through training in deep learning approaches. 
This dataset consisted of small to moderate deformation magnitudes with maximum target displacements of (max 15.5) mm across deformation states after factoring out rigid motion, which is consistent with clinically observed deformation magnitudes for open liver surgery reported in Ref. 19. These modest deformation magnitudes suggest that the linear elastic approaches pursued by many of the participating deformable registration methods are adequately suited to the conditions associated with the sparse data challenge. Importantly, data sparsity is a considerable barrier to driving intricate models with high degrees of freedom. Biomechanical simulation offers an opportunity to incorporate underlying structure to the registration problem and restore algorithm performance under scarce informational constraints. Target error performance of rigid registration is also revealed to be of great importance, with results showing that ICP and wICP algorithms lead to significant differences in rigid alignment and TRE. These findings suggest that determining an optimal rigid alignment from sparse surface data in the presence of underlying deformation is a non-trivial problem when the accuracy of subsurface targets is a primary concern. Due to violation of the rigid body assumption, particular attention toward the extrapolative performance of surface-based ICP registrations is needed when these approaches are applied to soft tissue organs. Moreover, Fig. 8 indicates that deformable registration algorithms may express different levels of sensitivity to variations in the initial rigid alignment. It should be noted that three of the five deformable registration methods (Heiselman,14,15 Jia,13 and Ringel17) take additional precautions to concurrently re-optimize rigid pose parameters during deformable registration. 
Considering best practices, it may be worthwhile to operate under an assumption that surface-based rigid alignments are fundamentally unreliable in the presence of soft tissue deformation. Nonetheless, the low error associated with the optimal PBR may suggest that, in some situations, a locally rigid approach may be appropriate if a sufficient number of landmarks in close proximity to a target of interest could be measured. However, accurately localized landmark data can be burdensome to collect intraoperatively. Furthermore, the findings in Table 2 suggest that the segmental location of a target of interest in the liver can profoundly affect its registration accuracy. Some biomechanical deformable registration algorithms, especially those configured to match anatomically pertinent boundary conditions, can obtain comparable or better registration errors than the most optimal point-based rigid registration while only using sparse point clouds of the intraoperative organ surface. With respect to commonalities among methods, the two finite element-based deformable registration methods (Heiselman14,15 and Mestdagh16) made use of the fact that physical deformation was applied only to the posterior surface of the liver phantom. This a priori knowledge can be leveraged to reduce the complexity of the problem space and potentially improve the conditioning between latent model parameters and data constraints in optimization-based registration methods. In Heiselman and Mestdagh, this knowledge was incorporated into the registration by taking advantage of natural stress-free boundary conditions inherent to the finite element method and eliminating reconstructive degrees of freedom over the anterior surface, which is expected to remain stress-free. Notably, neither the deep learning methods nor the regularized Kelvinlet method incorporated similar a priori information, which is likely a distinguishing factor separating the performance of these deformable registration methods. 
It should be remarked that the regularized Kelvinlet method assumes the organ to be embedded in an infinite elastic medium that more naturally represents zero-displacement boundary conditions than stress-free conditions wherever degrees of freedom are removed from the reconstructive framework. Additional characterization detailed in Ref. 17 showed that the performance of the regularized Kelvinlet registration method on the sparse data challenge dataset was optimized when boundary condition reconstruction was performed over the full organ surface, unlike the behavior of finite element simulations, which benefitted from eliminating reconstructive degrees of freedom over stress-free regions. Overall, the biomechanical boundary reconstruction methods demonstrated excellent performance, and the methodologically succinct representations of anticipated boundary conditions most accurately reflecting the underlying anatomy, physiology, and clinical task seemed to offer the most success for accurately predicting motions of deforming targets from sparse intraoperative surface data. Incorporating these insights into boundary condition generation will likely remain important for training future deep learning approaches that attempt to conform to the underlying biomechanics and extrapolate soft tissue behaviors from the intraoperative locale of available data into more distant regions. Figure 9 also illuminates how algorithmic differences and design choices may affect deviations in predicted target displacements among methods. In Fig. 9(a), the approach of Heiselman exhibits strain artifacts on the posterior surface that likely arise from the biomechanical simulation of superposed point load displacements generating numerically induced stress concentrations associated with this formulation of boundary conditions. 
The method by Mestdagh offers a smoother strain field, although it concentrates strain disproportionately around the portal vein entry point due to the numerical need for a fixed displacement constraint in the force-based reconstruction implemented in this approach. The deep learning method of Pfeiffer is associated with an inconsistent strain field that exhibits voxelization artifacts likely associated with data discretization procedures at input and output layers of the convolutional neural network (CNN). This is in stark contrast to the deep learning method of Jia, which performs a biomechanical simulation guided by a learned point-convolutional shape occupancy objective function. Although the method of Jia produces approximately uniform strain, completely uniform strain distributions are inconsistent with the expected underlying biomechanics of elastic soft tissue deformation. The strain depositions of Heiselman and Mestdagh most closely represent the method by which the liver phantom was physically deformed, with mock laparotomy pads placed under the posterior surface of the liver in an open surgery configuration. These contact forces are expected to cause the highest concentration of strain on the posterior surface of the liver. Notably, the strain deposition of the Ringel method reveals an effective force distribution applied to the anterior surface of the liver, which is less representative of the underlying organ deformation but is likely required to compensate for the infinite elastic medium assumption of this method along a stress-free boundary. The Jacobian determinants in Fig. 9(b) reveal that the deep learning method by Pfeiffer produces volumetric deviations of up to 10%. These large deviations likely arise due to the voxelization procedure and high training loss of this method. 
It should be noted that the training procedure of this method on simulated biomechanical data converged to an error of 6 mm,12 which likely limits the overall ability of the network to accurately represent biomechanically consistent fine structures within displacement fields. By contrast, the deep learning method by Jia displays more uniform Jacobian determinants likely attributable to the use of an underlying biomechanical model and inclusion of strain energy regularization within the objective function. In fact, strain energy regularization was utilized in each of the contributed deformable registration methods except for those of Pfeiffer and Mestdagh. Strain energy regularization will likely continue to be an effective strategy for controlling field irregularities associated with deformable registration algorithms. In Fig. 9(b), the biomechanical registration algorithms all display approximately uniform Jacobian determinants, although the finite element-based methods tended to develop volumetric dilation within the thinner ridges of the liver. These volumetric distortions likely arise due to the use of linear elastic material simulation from a rigidly registered initial pose coordinate, wherein rotational components of finite element displacements relative to this coordinate frame will cause local dilatation due to rotation dependence of the linearized strain tensor. It should be noted that this dilatational effect can be abated by a technique used in Heiselman, Jia, and Ringel, wherein rigid transformation parameters are optimized concurrently with deformation parameters alongside strain energy regularization to redistribute local rotational effects globally throughout the mesh. It is noteworthy that the regularized Kelvinlet method of Ringel is unique among methods for its remarkable consistency with respect to the Jacobian determinant measure. 
This consistency is made possible due to the closed-form analytic nature of its deformation basis circumventing errors associated with numerical finite element simulation of linear elasticity. Finally, it needs to be emphasized that although the methods of Jia, Heiselman, and Mestdagh were based on numerical linear elastic simulation, the trained deformation responses in Pfeiffer instead were based on numerical simulation of a hyperelastic material. In Fig. 9(c), these choices in the underlying deformation model likely influence the relative symmetry of the Jacobian determinant around unity in Pfeiffer and Ringel as compared with the upward bias influenced by element dilatation evident in the other registration methods that employ linear elastic numerical simulation. With respect to the behavior of TRE among methods, quantitative differences in Table 2 would suggest that the algorithmic choices discussed above have profound consequences on the final accuracy of registrations between a soft tissue organ and sparse point cloud data representing its deformed state. These differences are also reflected in the low correlation of target errors in Fig. 10 and visual differences in predicted organ shapes in Fig. 3. Registration errors may crucially depend on the locations where errors are measured in combination with the locations where forces are imparted on the organ. Consequently, intraoperative data localization, algorithm initialization, and determination of where forces may act upon the organ are remarkably important to the task of image-to-physical deformable registration. With respect to the biomechanical model-based registration algorithms, the fidelity of boundary condition composition likely played a substantial role in separating the performance of Heiselman and Mestdagh from the other deformable registration methods. 
Comparing against the biomechanical method of Ringel suggests that although a biomechanical deformation basis offers a useful structure for constraining the registration problem, the specific tuning of boundary condition designation for relevant anatomical and physiological factors appears to be critical for optimizing registration performance. The ability to train a generalizable expectation for task-specific anatomical motions will likely be a major next step for similarly improving the accuracy of deep learning registration algorithms driven by sparse intraoperative data.

Regarding the plateau behavior of TRE within the 20–40% extent range of data, we note that, in the general case, registration problems are often ill-posed because the model parameters are underspecified relative to the available model constraints. Although an ordinary approach for resolving this indeterminacy would revolve around incorporating additional a posteriori measurement data, information located beyond the visible anterior surface would be necessary to more thoroughly inform unknown boundary conditions applied to regions where registration constraints are missing. Another potential approach to overcoming this performance plateau involves incorporating a priori domain knowledge such as biomechanical expectation of boundary conditions, large sets of training data, or other forms of regularization to impose bias on the registration model and its parameters. One exciting direction pursued by Jia attempts to alter the efficiency with which the data constraints inform the registration model using a learned objective function to optimize the extraction of model parameters from sparse data. However, all techniques that rely on strong a priori knowledge may ultimately impair generalizability. 
It is therefore necessary for registration algorithms to be explicitly tested for generalization performance through experimentation with unseen data or prospective validation studies that match the intended use case. The sparse data challenge also highlights the need to design studies that consider environmental factors such as the impact of measurement noise on registration performance, which has long remained underappreciated. Although rigid registrations appear to be relatively robust to measurement noise, deformable registration methods tend to be more susceptible. Particular care should be taken during algorithmic development and evaluation to quantitate or otherwise control elevations in error magnitudes and error variances that may occur due to changes in the level of input noise. This consideration will likely become even more important when training and validating deep learning approaches within medical image registration workflows as these algorithms continue to mature. The main limitations of the sparse data challenge include the simulation of measurement noise at only two noise amplitudes, restriction to a single baseline liver geometry, inclusion of only four distinct deformation states with modest deformations, and dependence on a synthetic silicone liver phantom over clinically obtained human validation data. In addition, the challenge does not provide subsurface data constraints to further inform registration beyond the provided sparse surface data patterns, and computational time requirements of each method were not part of the data collection process. Nonetheless, this challenge offers a detailed look into the performance of current state-of-the-art rigid and nonrigid sparse data registration algorithms for liver interventions and offers comparative insights into common and unique algorithmic traits that will continue to inform the next generation of image-to-physical sparse data registration algorithms. 
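As a supplementary illustration of the rotation dependence of the linearized strain tensor discussed above in connection with Fig. 9(b), the following minimal Python sketch compares an exact rigid rotation with its small-angle (linearized) counterpart. It is not taken from any contributed method or from the challenge analysis code, and the function names are hypothetical. The sketch shows that the linearized rotation field carries zero linear strain, so a linear elastic model does not penalize it, yet its Jacobian determinant exceeds one, which is one way to understand the local dilation observed in the finite element-based methods.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def symmetric_part(grad_u):
    """Linearized (small) strain tensor: symmetric part of the displacement gradient."""
    return [[0.5 * (grad_u[r][c] + grad_u[c][r]) for c in range(3)] for r in range(3)]


def linear_volumetric_strain(grad_u):
    """Trace of the linearized strain tensor."""
    eps = symmetric_part(grad_u)
    return eps[0][0] + eps[1][1] + eps[2][2]


def jacobian_determinant(grad_u):
    """det(F) with F = I + grad(u); equals 1 for a volume-preserving map."""
    F = [[grad_u[r][c] + (1.0 if r == c else 0.0) for c in range(3)] for r in range(3)]
    return det3(F)


def small_angle_rotation_gradient(theta):
    """Displacement gradient of the linearized rotation about z (a skew-symmetric matrix)."""
    return [[0.0, -theta, 0.0], [theta, 0.0, 0.0], [0.0, 0.0, 0.0]]


def exact_rotation_gradient(theta):
    """Displacement gradient R - I of an exact rotation about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c - 1.0, -s, 0.0], [s, c - 1.0, 0.0], [0.0, 0.0, 0.0]]


if __name__ == "__main__":
    theta = math.radians(10.0)
    for name, grad_u in (
        ("linearized rotation", small_angle_rotation_gradient(theta)),
        ("exact rotation", exact_rotation_gradient(theta)),
    ):
        print(name,
              "| linear volumetric strain:", linear_volumetric_strain(grad_u),
              "| Jacobian determinant:", jacobian_determinant(grad_u))
```

For the linearized field the determinant is exactly 1 + θ², so a local rotation of 10 degrees relative to the rigidly registered frame corresponds to roughly 3% apparent volumetric dilation even though the linear strain vanishes. The exact rotation preserves volume but registers a nonzero linear strain, which is why a linear elastic energy favors the dilating field. Reducing the local rotation, for example by optimizing rigid parameters concurrently as described above, shrinks this effect quadratically.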
6. Conclusion

Results were presented for the first Image-to-Physical Liver Registration Sparse Data Challenge, which evaluated and compared eight distinct rigid and deformable registration approaches with respect to registration accuracy under varying conditions of data coverage, target location, and measurement noise. In addition, sensitivity to algorithm initialization, displacement field consistency, and inter-registration similarity were explored. The overarching findings showed that biomechanical deformation bases tend to achieve the best registration accuracy and field consistency among state-of-the-art methods for sparse data-driven deformable image registration. Furthermore, only the family of biomechanical boundary condition reconstruction deformable registration algorithms outperformed the best achievable rigid registration, and only when they incorporated task-specific insight into boundary condition composition. The results of this challenge suggest that specific implementation choices profoundly affect the TRE produced by registration algorithms and their estimated displacement fields.

7. Appendix A: Contributed Methods

7.1. Optimal Point-Based Rigid Registration

The first registration strategy is a common comparator representing an optimal PBR between the ground-truth preoperative and intraoperative positions of all 159 validation targets. A singular value decomposition approach was utilized to find the rigid transformation that minimizes the sum of squared TRE within each registration instance. Although this method incorporates unobtainable information about the true intraoperative target positions and therefore is not achievable in practice, errors associated with this globally optimal rigid registration represent a useful benchmark against which to gauge competing methods. 
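The singular value decomposition approach described above can be sketched as follows. This is a minimal illustration of the standard closed-form least-squares rigid fit between corresponding point sets, assuming NumPy is available; it is not the challenge's evaluation code, and the function names `point_based_rigid_registration` and `target_registration_error` are hypothetical.

```python
import numpy as np


def point_based_rigid_registration(source, target):
    """Least-squares rigid transform (R, t) mapping source points onto target points.

    source, target: (N, 3) arrays of corresponding points.
    Returns a 3x3 rotation R and a translation t such that R @ p + t approximates q.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)

    # Cross-covariance of the centered point sets.
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Correct for a possible reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t


def target_registration_error(R, t, source, target):
    """Per-target Euclidean distance between mapped source points and target points."""
    mapped = np.asarray(source, dtype=float) @ R.T + t
    return np.linalg.norm(mapped - np.asarray(target, dtype=float), axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    preop_targets = rng.normal(size=(159, 3))  # synthetic stand-in for target positions
    angle = 0.3
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    intraop_targets = preop_targets @ R_true.T + np.array([5.0, -2.0, 1.0])
    R, t = point_based_rigid_registration(preop_targets, intraop_targets)
    print("max TRE:", target_registration_error(R, t, preop_targets, intraop_targets).max())
```

In the synthetic usage above the targets differ only by a rigid motion, so the residual error is at numerical precision. With deformed targets, the residual of this fit is the lower bound on TRE that any rigid registration can reach, which is what makes it a useful comparator.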
7.2. Manually Initialized Iterative Closest Point Rigid Registration

The second registration strategy is an ICP rigid registration algorithm10 initialized from a manually designated initial pose estimate. The ICP algorithm repeatedly updates a rigid transformation between the sparse data and their closest points on the preoperative organ model using a point-based rigid registration on each iteration. Naïve ICP algorithms are highly susceptible to local minima, and therefore the resulting rigid alignments were visually verified and re-initialized from new starting poses when any instance was deemed unsatisfactory.

7.3. Salient Feature Weighted Iterative Closest Point Rigid Registration (Clements)

To overcome the need for manual interaction in the rigid registration process, the third strategy evaluated a fully automatic salient feature wICP algorithm.11 This method biases point correspondences according to preoperatively annotated anatomical feature patches to improve intraoperative robustness of the ICP algorithm. This technique utilizes a weighted point-based registration of sparse feature points to corresponding patches on the preoperative organ surface, with weight functions controlled over an exponentially decaying iteration schedule. Although demonstrated to be more robust than naïve ICP, wICP may incorporate additional bias toward preferentially aligning the specified feature information as opposed to the overall fit of the intraoperative point cloud. The salient features included the falciform ligament and left and right inferior ridges of the liver, which are explicitly marked in the sparse data challenge intraoperative point cloud patterns in Fig. 2.

7.4. Deep Learning Method 1: Signed Distance Map CNN (Pfeiffer)

The fourth method is a deep learning deformable registration strategy (V2SNet12) based on a CNN trained on voxelized distance maps computed between a preoperative organ mesh and a partial data patch of the deformed intraoperative organ surface. 
Briefly, after an initial rigid registration is applied to align intraoperative data with the mesh, the network estimates a function $F(S, D) = \hat{\mathbf{u}}$, where $S$ is the voxelized signed distance map of the preoperative organ mesh, $D$ is the voxelized unsigned distance map of the partial intraoperative surface to the preoperative mesh, and $\hat{\mathbf{u}}$ is the estimated displacement field of the deformation. Training data for this method were generated from random organ mesh shapes on which random deformations with known ground truth were simulated using a hyperelastic biomechanical model. The neural network was trained in a multiresolution supervised manner according to a mean absolute error loss function:

$$L = \sum_{r} \lambda_r \, \frac{1}{N_r} \sum_{i=1}^{N_r} \left| \mathbf{u}_i - \hat{\mathbf{u}}_i \right|,$$

where $\mathbf{u}_i$ is the ground truth voxel displacement; $\hat{\mathbf{u}}_i$ is the estimated voxel displacement; $r$ is the resolution of the voxelized image, which contains $N_r$ voxels; and $\lambda_r$ is a resolution-dependent training weight hyperparameter. The authors report that training converged to a mean error of 6 mm.12 Pretrained network weights at the maximum resolution were used without retraining for inference of the liver model and intraoperative data associated with the sparse data challenge, using initial ICP rigid registration alignments from Sec. 7.2.

7.5. Deep Learning Method 2: Probabilistic Occupancy Map PCNN (Jia)

The fifth registration strategy is a data-driven nonrigid approach13 based on a learned occupancy map and point convolutional neural network (PCNN) to predict the likelihood that a particular volumetric shape takes a certain configuration based on a sparse input point cloud. The authors propose a deep neural network to model a differentiable occupancy map for the probability that a shape occupies a position $\mathbf{x}$ given a point cloud $P$ that describes a sparse representation of the shape surface. The occupancy map then represents a fuzzy boundary of the liver over the spatial support of $\mathbf{x}$, and an isocontour of the occupancy map represents an estimated organ shape. 
Agreement between the occupancy of a deforming preoperative liver model and a rigidly registered intraoperative point cloud is then optimized alongside a strain energy penalty to determine a set of rigid and nonrigid deformation parameters. The objective function combines an occupancy agreement term with a strain energy penalty term and is evaluated through a transformation function that applies a deformation field parameterized by a set of nonrigid deformation parameters together with rigid translations and rotations. In this work, the authors define the transformation function according to a superposed finite element deformation basis introduced in Ref. 3. The penalty term represents regularization by the strain energy in the manner of Rucker et al.20 More information and details about this method are provided in Ref. 13. Unlike the end-to-end deep learning method in Sec. 7.4, which attempts to learn a deep deformation basis to alleviate the need for intraoperative biomechanical simulation, this deep learning approach incorporates biomechanical simulation to establish a network-based filter for model-data correspondence errors. Conventional simulation approaches assume a model-data correspondence function and minimize the error between the set of observed data points and their corresponding locations on the deforming organ model. By comparison, this technique explores an interesting framework for learned objective functions that alternatively encode preoperative-to-intraoperative correspondences through probabilistic model occupancy, which may offer new approaches to offset deleterious effects of measurement noise and uncertainty associated with anatomical correspondences in image-to-physical registration to sparse point cloud data. 
A unique feature of the PCNN model is that data voxelization is not required because the occupancy function can be sampled continuously through space at any point of interest.

7.6. Biomechanical Method 1: Linearized Iterative Boundary Reconstruction (Heiselman)

The sixth registration strategy is a linearized iterative boundary reconstruction method14,15 that uses a biomechanical finite element model of the organ to control nonrigid deformation. The method reconstructs a set of boundary conditions applied to a mesh of the preoperative organ that best explains the intraoperative deformation state observed in sparse data measurements of the organ surface. This technique decomposes the mechanical load applied to the boundary of the organ into a set of localized point forces distributed over the active contact surfaces of the organ. These local point forces are superposed to allow for rapid estimation of the deformation state from a precomputed basis of perturbation responses obtained from a linear elastic finite element model. An intraoperative deformation state is obtained by iteratively optimizing the following weighted least squares objective function:

$$\Omega(\boldsymbol{\beta}) = \sum_{F} \frac{w_F}{N_F} \sum_{i=1}^{N_F} f_{F,i}^{2} + w_E \, f_E^{2}(\boldsymbol{\beta}), \tag{8}$$

where $f_{F,i}$ is the Euclidean model-data error associated with the $i$'th of $N_F$ sparse data points in feature $F$ and the penalty term $f_E$ is the strain energy of the deformation state for the vector of deformation basis weights $\boldsymbol{\beta}$. The relative contributions of error terms, and consequently the deformability of the registration, are controlled by the ratio of the weight factors $w_F$ and $w_E$. A total of 20 control points were distributed across the posterior surface of the liver to match the loading configuration applied to the organ in this dataset. The method is initialized with rigid transformation parameters determined by the wICP algorithm in Sec. 
7.3 to establish an automatic and robust starting alignment, after which the rigid and nonrigid parameters are simultaneously optimized in a Levenberg–Marquardt framework.

7.7. Biomechanical Method 2: Adjoint Boundary Reconstruction (Mestdagh)

The seventh registration strategy is a biomechanical method using a linear elastic finite element model within an adjoint optimization scheme that solves for boundary forces applied to the posterior surface of the liver mesh.16 First, an ICP rigid registration algorithm is applied prior to initiating the deformable registration. Then, an adjoint method is developed to iteratively minimize a least squares objective function defined over the nodal boundary forces, their associated displacement field, and an adjoint state with respect to the observed sparse data points. To numerically solve the system of equations, displacements over a small patch on the posterior face of the liver near the portal vein insertion point are fixed to zero-displacement Dirichlet boundary conditions. In contrast to the displacement-driven linearized method in Sec. 7.6, this method utilizes a force-based reconstruction over an iteratively updating forward model solution process. This method may be extended to hyperelastic and other nonlinear deformation models, although more sophisticated deformation models may incur potentially prohibitive costs to computation time. Notably, due to the adjoint approach, this linear elastic model does not concurrently optimize rigid transformation parameters.

7.8. Biomechanical Method 3: Regularized Kelvinlet Boundary Reconstruction (Ringel)

The eighth registration strategy builds upon the linearized iterative boundary reconstruction approach of Sec. 7.6 and similarly decomposes the mechanical load applied to the organ into a series of localized control point responses. However, this method replaces the finite element simulations with closed-form displacement equations associated with Kelvin state solutions established in formal elastic theory. 
The Kelvin state analytically models the linear elastic displacement response of a point load perturbation embedded within an infinite linear elastic domain. These point load responses can be superposed to establish a deformation basis consisting of a series of spatially localized Kelvinlet deformations that are distributed across the boundary of the organ. A regularization approach is incorporated to extend the analytic Kelvin response from a point load impulse to a spatially localized smoothed force density function. The regularized Kelvinlet displacement solutions are analytic algebraic equations that remove the need for computationally expensive finite element simulation and greatly accelerate realistic biomechanical simulation by leveraging classical solution methods to 3D linear elasticity. The regularized Kelvinlet displacement solution to a local force perturbation is defined as

$$\mathbf{u}(\mathbf{r}) = \left[ \frac{a-b}{r_\varepsilon}\,\mathbf{I} + \frac{b}{r_\varepsilon^{3}}\,\mathbf{r}\mathbf{r}^{T} + \frac{a}{2}\,\frac{\varepsilon^{2}}{r_\varepsilon^{3}}\,\mathbf{I} \right] \mathbf{f},$$

where $\mathbf{f}$ is the force applied to the Kelvinlet center point, $\mathbf{r}$ is the radial offset from the Kelvinlet center to any position in the 3D spatial support, $r_\varepsilon = \sqrt{\lVert \mathbf{r} \rVert^{2} + \varepsilon^{2}}$ is a regularized radius incorporating a radial scale $\varepsilon$, and $a$ and $b$ are material constants that depend on elastic modulus and compressibility. The regularized Kelvinlet displacement basis is used in the same reconstructive framework as the method of Ref. 14 in Sec. 7.6 with an identical objective function to Eq. (8), except that a larger set of 160 control points distributed over the complete liver surface is required to reach optimal algorithmic performance. Additional characterization and implementation details are provided in Ref. 17. 
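To make the closed-form nature of this deformation basis concrete, the following minimal Python sketch evaluates a regularized Kelvinlet displacement and superposes several of them. It assumes the commonly published regularized Kelvinlet form with constants a = 1/(4πμ) and b = a/[4(1 − ν)], where μ is the shear modulus and ν is Poisson's ratio; the exact constants, parameter values, and control point layout used by the contributed method are not given in this section, so the function names and numerical values below are illustrative assumptions only.

```python
import math


def kelvinlet_constants(youngs_modulus, poisson_ratio):
    """Material constants (a, b) from elastic modulus and compressibility (assumed form)."""
    shear_modulus = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    a = 1.0 / (4.0 * math.pi * shear_modulus)
    b = a / (4.0 * (1.0 - poisson_ratio))
    return a, b


def regularized_kelvinlet(position, center, force, epsilon, a, b):
    """Displacement at `position` due to a regularized point load `force` at `center`."""
    r = [p - c for p, c in zip(position, center)]
    r_eps = math.sqrt(sum(x * x for x in r) + epsilon * epsilon)  # regularized radius
    r_dot_f = sum(x * f for x, f in zip(r, force))
    # Coefficient of the identity terms and of the r r^T term, respectively.
    isotropic = (a - b) / r_eps + 0.5 * a * epsilon ** 2 / r_eps ** 3
    directional = b * r_dot_f / r_eps ** 3
    return [isotropic * f + directional * x for f, x in zip(force, r)]


def superpose(position, kelvinlets, a, b):
    """Sum the displacement contributions of several (center, force, epsilon) Kelvinlets."""
    total = [0.0, 0.0, 0.0]
    for center, force, epsilon in kelvinlets:
        u = regularized_kelvinlet(position, center, force, epsilon, a, b)
        total = [t + ui for t, ui in zip(total, u)]
    return total


if __name__ == "__main__":
    # Illustrative values only; not the parameters of the contributed method.
    a, b = kelvinlet_constants(youngs_modulus=2100.0, poisson_ratio=0.45)
    control_points = [
        ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.01),
        ((0.05, 0.0, 0.0), (0.0, 0.0, -0.5), 0.01),
    ]
    print(superpose((0.02, 0.01, 0.0), control_points, a, b))
```

Because each response is a short algebraic expression that is linear in the force, a displacement basis over many control points can be assembled without any mesh solve, and a registration then only needs to optimize the force weights. This is the property that the section credits for avoiding finite element simulation.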
It should be noted that, compared with the finite element simulation, this method assumes a specific local force density function that depends on the radial scale regularization parameter, and its assumption of an infinite homogeneous elastic medium renders the solution less adaptive to patient-specific organ geometry and mechanically disjoint contact interfaces.

Disclosures

The authors have no relevant financial conflicts of interest to disclose. The authors note that several methods developed within the laboratory of the principal investigator (M.I.M.) were contributed to this work. The full validation data presented in this manuscript were compartmentalized and did not influence the design or development of any method contributed by the authors.

Code and Data Availability

The complete sparse data challenge dataset, including unblinded target validation data and analysis techniques, is available for download via the Open Science Framework at Ref. 8.

Acknowledgments

This work was supported by the National Institutes of Health (NIH) Grant Nos. R01EB027498 and T32EB021937. Collection of clinical data patterns was supported by NIH Grant No. P30CA008748. Acquisitions of CT imaging data were supported by NIH Grant No. S10OD012297-01A1 and the PET-CT Scanner housed in the Vanderbilt Center for Human Imaging. This work is extended from conference proceedings presented at SPIE Medical Imaging 2023.21

References

1. L. Maier-Hein et al., “Comparative validation of single-shot optical techniques for laparoscopic 3-D surface reconstruction,” IEEE Trans. Med. Imaging 33(10), 1913–1930 (2014). https://doi.org/10.1109/TMI.2014.2325607
2. L. W. Clements et al., “Organ surface deformation measurement and analysis in open hepatic surgery: method and preliminary results from 12 clinical cases,” IEEE Trans. Biomed. Eng. 58(8), 2280–2289 (2011). https://doi.org/10.1109/TBME.2011.2146782
3. J. S. Heiselman et al., “Characterization and correction of soft tissue deformation in laparoscopic image-guided liver surgery,” J. Med. Imaging 5(2), 021203 (2018). https://doi.org/10.1117/1.JMI.5.2.021203
4. E. L. Brewer et al., “The image-to-physical liver registration sparse data challenge,” Proc. SPIE 10951, 109511F (2019). https://doi.org/10.1117/12.2513952
5. M. I. Miga, “The Image-to-Physical Liver Registration Sparse Data Challenge,” www.sparsedatachallenge.org
6. L. W. Clements et al., “Evaluation of model-based deformation correction in image-guided liver surgery via tracked intraoperative ultrasound,” J. Med. Imaging 3(1), 015003 (2016). https://doi.org/10.1117/1.JMI.3.1.015003
7. L. W. Clements et al., “Deformation correction for image guided liver surgery: an intraoperative fidelity assessment,” Surgery 162(3), 537–547 (2017). https://doi.org/10.1016/j.surg.2017.04.020
8. J. Heiselman, “Image-to-Physical Liver Registration Sparse Data Challenge,” https://osf.io/u3dxy/ (2023).
9. J. A. Collins et al., “Improving registration robustness for image-guided liver surgery in a novel human-to-phantom data framework,” IEEE Trans. Med. Imaging 36(7), 1502–1510 (2017). https://doi.org/10.1109/TMI.2017.2668842
10. P. J. Besl and N. D. McKay, “A method for registration of 3-D shapes,” IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 239–256 (1992). https://doi.org/10.1109/34.121791
11. L. W. Clements et al., “Robust surface registration using salient anatomical features for image-guided liver surgery: algorithm and validation,” Med. Phys. 35(6), 2528–2540 (2008). https://doi.org/10.1118/1.2911920
12. M. Pfeiffer et al., “Non-rigid volume to surface registration using a data-driven biomechanical model,” Lect. Notes Comput. Sci. 12264, 724–734 (2020). https://doi.org/10.1007/978-3-030-59719-1_70
13. M. Jia and M. Kyan, “Improving intraoperative liver registration in image-guided surgery with learning-based reconstruction,” in IEEE Int. Conf. Acoust. Speech Signal Process. - Proc. (ICASSP), 1230–1234 (2021). https://doi.org/10.1109/ICASSP39728.2021.9414549
14. J. S. Heiselman, W. R. Jarnagin and M. I. Miga, “Intraoperative correction of liver deformation using sparse surface and vascular features via linearized iterative boundary reconstruction,” IEEE Trans. Med. Imaging 39(6), 2223–2234 (2020). https://doi.org/10.1109/TMI.2020.2967322
15. J. S. Heiselman and M. I. Miga, “The image-to-physical liver registration sparse data challenge: characterizing inverse biomechanical model resolution,” Proc. SPIE 11315, 113151F (2020). https://doi.org/10.1117/12.2550535
16. G. Mestdagh and S. Cotin, “An optimal control problem for elastic registration and force estimation in augmented surgery,” Lect. Notes Comput. Sci. 13437, 74–83 (2022). https://doi.org/10.1007/978-3-031-16449-1_8
17. M. J. Ringel et al., “Comparing regularized Kelvinlet functions and the finite element method for registration of medical images to sparse organ data,” (2023).
18. M. Pfeiffer et al., “Learning soft tissue behavior of organs for surgical navigation with convolutional neural networks,” Int. J. Comput. Assist. Radiol. Surg. 14(7), 1147–1155 (2019). https://doi.org/10.1007/s11548-019-01965-7
19. L. W. Clements et al., “Organ surface deformation measurement and analysis in open hepatic surgery: method and preliminary results from 12 clinical cases,” IEEE Trans. Biomed. Eng. 58(8), 2280–2289 (2011). https://doi.org/10.1109/TBME.2011.2146782
20. D. C. Rucker et al., “A mechanics-based nonrigid registration method for liver surgery using sparse intraoperative data,” IEEE Trans. Med. Imaging 33(1), 147–158 (2014). https://doi.org/10.1109/TMI.2013.2283016
21. J. S. Heiselman et al., “Comparison study of biomechanical and data-driven soft tissue registration: preliminary results from the image-to-physical liver registration sparse data challenge,” Proc. SPIE 12466, 124660M (2023). https://doi.org/10.1117/12.2655468
Biography

Jon S. Heiselman, PhD, is a postdoctoral researcher at Vanderbilt University with a joint appointment at Memorial Sloan Kettering Cancer Center. He received his BSE degree in biomedical engineering from the University of Michigan in 2015 and his PhD in biomedical engineering from Vanderbilt University in 2020. His research interests involve image registration, biomechanics, and computational modeling to improve preoperative assessment, intraoperative delivery, and postoperative monitoring of therapy. He has been a member of SPIE since 2017.

Morgan J. Ringel, BSE, is a graduate student in biomedical engineering at Vanderbilt University. She received her BSE degree in biomedical and electrical and computer engineering from Duke University in 2018. She has been a student member of SPIE since 2019.

William R. Jarnagin, MD, has been an attending surgeon at Memorial Sloan Kettering Cancer Center since 1997, where he has served as chief of the Hepatopancreatobiliary Service since 2008 and was vice-chairman of the Department of Surgery from 2006 to 2010. He holds the Leslie H. Blumgart MD Chair in surgical oncology and is professor of surgery at Weill Medical College of Cornell University.

Michael I. Miga received his PhD from Dartmouth College specializing in biomedical engineering. He joined the faculty in the Department of Biomedical Engineering at Vanderbilt University in 2001 and is the Harvie Branscomb Professor at Vanderbilt and a professor of biomedical engineering. He is director of the Biomedical Modeling Laboratory and co-founder of the Vanderbilt Institute for Surgery and Engineering (VISE, www.vanderbilt.edu/vise). He is also PI and director of a novel NIH T32 training program entitled “Training Program for Innovative Engineering Research in Surgery and Intervention,” which is focused on the creation of translational technologies for treatment and discovery in surgery and intervention. He was also a co-inventor of the first FDA-cleared image-guided liver surgery system. He is an AIMBE and SPIE fellow. His research interests are in computational modeling, inverse problems/computational imaging, soft-tissue biomechanics/biotransport, technology-guided therapy, image/imaging-guided surgery and intervention, and data-driven procedural medicine.
Image registration
Deformation
Liver
Biomechanics
Deep learning
Rigid registration
Data modeling