Research Papers: Imaging

Monte Carlo–based fluorescence molecular tomography reconstruction method accelerated by a cluster of graphic processing units

[+] Author Affiliations
Guotao Quan, Hui Gong, Yong Deng, Jianwei Fu, Qingming Luo

Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Britton Chance Center for Biomedical Photonics, Wuhan 430074, China

J. Biomed. Opt. 16(2), 026018 (February 15, 2011). doi:10.1117/1.3544548
History: Received September 17, 2010; Revised December 29, 2010; Accepted January 03, 2011; Published February 15, 2011; Online February 15, 2011
Text Size: A A A

Open Access Open Access

* Address all correspondence to: Hui Gong, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Britton Chance Center for Biomedical Photonics, Wuhan 430074, China. E-mail: huigong@mail.hust.edu.cn.

High-speed fluorescence molecular tomography (FMT) reconstruction for 3-D heterogeneous media is still one of the most challenging problems in diffusive optical fluorescence imaging. In this paper, we propose a fast FMT reconstruction method that is based on Monte Carlo (MC) simulation and accelerated by a cluster of graphics processing units (GPUs). Based on the Message Passing Interface standard, we modified the MC code for fast FMT reconstruction, and different Green's functions representing the flux distribution in media are calculated simultaneously by different GPUs in the cluster. A load-balancing method was also developed to increase the computational efficiency. By applying the Fréchet derivative, a Jacobian matrix is formed to reconstruct the distribution of the fluorochromes using the calculated Green's functions. Phantom experiments have shown that only 10 min are required to get reconstruction results with a cluster of 6 GPUs, rather than 6 h with a cluster of multiple dual opteron CPU nodes. Because of the advantages of high accuracy and suitability for 3-D heterogeneity media with refractive-index-unmatched boundaries from the MC simulation, the GPU cluster-accelerated method provides a reliable approach to high-speed reconstruction for FMT imaging.

Figures in this Article

As fluorescence labeling technology develops,12 an increasing number of biologists want to observe the distribution of fluorescence targets in vivo. Therefore, fluorescence molecular tomography (FMT)34 has been developed. Because FMT can be used to quantitatively reveal the fluorescence marker's fluorescence yield56 and lifetime7in vivo, FMT is a promising, small-animal imaging method for drug development and cancer research.89

The reconstruction algorithm is the core technology of FMT. The algorithm can be summarized as follows.4,10 First, the distribution of the photon density of fluorescence with a different source, which is called Green's function, is calculated. Then, by applying the Fréchet derivative, a Jacobian matrix is formed with the calculated Green's functions. Finally, the distribution of the fluorescence yield is inversely calculated from the Jacobian matrix by an optimization algorithm. The first step in the algorithm is called the forward problem, which investigates the propagation of light in tissue; the second and third steps are called the inverse problem, which reconstructs the distribution of the fluorescence yield. A traditional reconstruction algorithm has been proposed based on a diffusion approximation (DA) and using the finite-element method (FEM).1114 The traditional method is very fast, but it is valid only when the reduced scattering coefficients of the tissue are far greater than the absorption coefficients. Although the FMT reconstruction method based on the high order of the radiative transport equation (RTE) is suitable for heterogeneous media with complex distributions of optical coefficients, low computational efficiency limits its application. For example, Joshi et al. proposed a radiative transport-based frequency-domain fluorescence tomography with the most accurate angular discretization of RTE, but it requires more than 3.5 h on a 16-node Beowulf cluster.15

Monte Carlo (MC) methods were introduced in FMT. They are used to calculate the Green's functions with which the Jacobian matrix is formed to reconstruct the distribution of the fluorescence yield. Because MC methods are the gold standard for simulating the light propagation in tissue and are valid for all kinds of tissue,1618 they are suitable for small-animal imaging, which has a complex distribution of optical coefficients. Kumar et al. proposed a reconstruction method based on a MC method that reconstructs the distribution of the fluorescence lifetime in time-domain fluorescence tomography for small-animal imaging. A phantom experiment showed that this method can clearly separate two fluorochromes with 6-mm spacing.19 Zhang et al. also introduced MC methods into the reconstruction of fluorescence tomography for small-animal imaging.20 However, because of the low computational efficiency and the large number of Green's functions requiring calculation, more than 6 h are required to reconstruct the result, even with a parallel computing cluster of central processing units (CPUs).

Graphics processing units (GPUs) have been introduced in MC simulations to accelerate the simulation of the propagation of light in tissue when the distribution of the optical coefficient is already determined. The GPU is the core of a graphics card. In the past, all software ran on CPU; GPUs were only used in image processing and output until general-purpose computing on graphics processing units were proposed and unleashed the power of GPUs for parallel computation. The peak floating-point operations per/s (flop/s) of marketavailable graphics hardware can reach 1000 Gflop/s, which is 10 times higher than that of a Harpertown CPU with a 3.2-GHz frequency.21 By using the power of the GPU for parallel computation, MC methods, which simulate the propagation of light in tissue, can be accelerated to speeds that are 100 to 1000 times faster than those obtained with the CPU only. For example, Alerstam et al. used a single GPU to accelerate an MC simulation by a factor of 1000 for multilayered tissues;22 Lo et al. proposed a multi-GPU-accelerated MC simulation for photodynamic therapy treatment;23 Fang and Boas sped up an MC simulation with a single GPU by a factor of 100 to 300 for complex 3-D turbid media;24 and Fang also released their code named Monte Carlo eXtreme (MCX).25

Until now, the FMT reconstruction method based on DA with FEM is still limited in high-scattering tissue; however, many kinds of tissues in small animals do not satisfy the condition of high-scattering, such as rabbit muscle, colon adenocarcinomas,26 the cerebral spinal fluid layer in the brain, and cysts in the human breast.24 FMT reconstruction methods based on the high orders of both RTE and MC are suitable for heterogeneity media with a complex distribution of optical coefficients, but poor computational efficiency has tremendously limited their applications. GPU can tremendously accelerate the speed of MC; however, it has not yet been used in FMT reconstruction. Furthermore, single GPU-accelerated FMT reconstruction cannot satisfy the demand for fast reconstruction because of the large number of source-detector pairs.

In this paper, we propose a fast MC-based FMT reconstruction, which is suitable for 3-D heterogeneity media with refractive-index-unmatched boundaries and a complex distribution of optical coefficients. Phantom experiments with either high or low-scattering are used to demonstrate the accuracy and speed of the method. By analyzing the reconstructed localization and concentration of the fluorochromes, we compare the accuracy of our method with a traditional method based on DA with FEM. By the load-balancing strategy, we optimize the performance of the GPU cluster for faster reconstruction and compare with that based on CPU and single GPU.

Theory of FMT

In this section, a brief introduction to the theory of FMT reconstruction based on MC simulations accelerated by a GPU cluster will be given.

The distribution of fluorescence photons in tissue can be expressed as follows:3,7Display Formula

1φf(Rd,Rs)=gλem(Rd,r)x(r)φλex(r,Rs)dr,
where ϕf(Rd, Rs) is the photon density of fluorescence at Rd with a source Rs, gλem(Rd, r) is Green's function with a source at r (whose wavelength is λem) and a detector at Rd, ϕλex(r, Rs) is the excitation photon density with a source at Rs (whose wavelength is λex) and a detector at r and x(r) can be expressed as follows: Display Formula
2x(r)=ημαf(r)1jωτ(r)1+(ωτ(r))2,
where ημαf(r) is fluorescence yield, η is the quantum efficiency, μαf(r) is the absorption coefficient of the fluorochrome, τ(r) is the lifetime of the fluorochrome at r,11 and ω is the modality frequency of the light source. Because our system is a continuous wavelength system, therefore in this paper, ω = 0 and x(r) = ημαf(r).

When the light source is a narrow collimation laser, the light source can be expressed as a δ function, so ϕλex(r, Rs) = gλex(r, Rs) (Refs. 7 and 27), and, according to reciprocity,28gλem(Rd, r) = gλem(r, Rd). Equation 1 can be expressed as follows: Display Formula

3φf(Rd,Rs)=gλem(r,Rd)x(r)gλex(r,Rs)dr.
Applying the Fréchet derivative and combining Eq. 3 yields the Jacobian function:4,7Display Formula
4J(Rd,Rs)=[φf(Rs,Rd)]x=gλem(r,Rd)gλex(r,Rs)dr,
where J(Rd, Rs) is the Jacobian function with a source at Rs and a detector at Rd.

The number of reconstructed fluorescence yields of FMT was determined by the number of discrete points (Np). To reduce the ill-posed number of coefficients for FMT, the measurement points referring to the number of sources (Ns) multiplied by the number of detectors (Nd) must be greater than the number of reconstructed fluorescence yield, which is equal to Np. Therefore, the number of source-detector pairs is very large. When there are Ns sources, Nd detectors, and Np discrete points in the experiment, the Jacobian matrix can be expressed as follows: Display Formula

5J=JRd1,Rs1......JRdi,RsjJRdNd,RsNs.

In this paper, the Green's function in Eq. 4 was calculated by the kernel of MCX. According to Eq. 5, there are Ns + Nd Green's functions to be calculated to construct a (Ns × Nd) × Np Jacobian matrix.

Combining the Jacobian matrix in Eq. 5, and according to Tikhonov regularization,2930 the distribution of the fluorescence yield can be reconstructed. Tikhonov regularization can be expressed as Display Formula

6min(||Jxy||2+λ||Γx||),
where J is the Jacobian matrix shown in Eq. 5, Γ is the Tikhonov matrix (generally, Γ = I where I is the identity matrix) and λ = 10−20, where λ is the relaxation factor, which was determined by the L-curve method.31 The variable y is the photon density of fluorescence in the experiment, and x = ημaf represents the reconstructed fluorescence yield.11 The conjugate gradient method was used to solve Eq. 6.32

Acceleration by a GPU Cluster

To further improve the speed of the reconstruction method based on MC simulations, with the goal of satisfying the demand for fast reconstruction in FMT, a GPU cluster was constructed with MPICH2, which is a high-performance and widely portable implementation of the MPI standard (both MPI 1.0 and MPI 2.0). The entire calculation task of the Green's functions needed by the inverse problem was distributed to the GPU cluster. The hardware configuration is listed in Table 1. Three personal computers in a local area network equipped with a total of six GPUs of the G200 framework (which supports both double and single precision) were used to construct a GPU cluster. A flowchart of the GPU cluster-accelerated FMT reconstruction method based on MC simulations is shown in Fig. 1.

Graphic Jump LocationF1 :

A flowchart of the GPU cluster-accelerated FMT reconstruction method based on MC simulations.

Table Grahic Jump Location
The hardware of the GPU cluster.

The processes in Fig. 1 can be briefly summarized as follows:

  1. The host computer receives the file containing the position and direction of the sources and the detectors, the average optical coefficients, and the phantom size from the experiment. The computer then transmits this information to each computing node in the GPU cluster.
  2. The CPU compute nodes receive all of the experimental information from the host computer and distribute the tasks of Green's function computation into different GPUs. Because the calculation time for Green's functions is different for different GPUs, the time consumption is determined by the GPU that would have the lowest performance if the computing tasks were equally distributed into each GPU of the cluster. Therefore, a load-balancing strategy is used to achieve maximum parallel efficiency. All tasks are distributed into GPUs according to the following equation: Display Formula
    7Ni=T task *P_ GPU iiN GPU P_ GPU i,
    where Ni is the number of computing tasks for Green's functions for the i’th GPU and Ttask is the total number of Green's function tasks. P_ GPU i is the processing power of the i’th GPU, which is listed in Table 1. NGPU is the total number of GPUs. The load-balancing of CPU is not considered because the main compute task in the compute node of CPU is to calculate the Green's function by GPU, so CPU is rarely used.
  3. According to Eq. 7, the start and end indices of the computing task (istart, iend) for each GPU are calculated. The MC simulation is used to calculate the Green's function from istart to iend, and the fluences produced by MC are saved in a file named i.mc2. The MC simulation refers to MCX.
  4. When the entire list of Green's functions has been calculated by the GPU cluster, all of the files that record the results of the Green's functions are transformed to the host computer to construct the Jacobian matrix required for reconstruction.
  5. Tikhonov regularization is used to reconstruct the distribution of the fluorescence yield. The expression is shown in Eq. 6.

To ensure the validity of the Green's function calculation, the kernel of the MC simulation refers to MCX with the fast math library of compute unified device architecture (CUDA) and without the atomic compile option. The random-number generators (RNGs) for the MC simulation use a floating-point based RNG with a chaotic logistic map, the size of the logistic lattice is 5, this RNG has a higher computational efficiency as the optimization of 32 bit floating-point operations.24 In addition, all of the codes based on the CPU were compiled in Visual Studio 2008 with the −02 (MaxSpeed) option on a Windows Server 2008 R2 X64 system. The codes based on the GPU were compiled with the NVCC 3.0 (beta) which is a compiler in CUDA.

In this section, two phantom experiments for both low- and high-scattering media are used to demonstrate the accuracy and speed of MC-based FMT reconstruction accelerated by a GPU cluster.

Reconstruction Accuracy

Four glass tubes (2.2-mm in diameter) with different concentrations of Dir-Boa fluorescence dye33 were placed in a rectangular box (13.5×40.5×60 mm) with a mixed solution of Intralipid and India ink. The concentration in the four glass tubes ranged from 1.8 μmol/L to 1.2 μmol/L. Two phantom experiments with different kinds of optical coefficients for the mixed solution of Intralipid and India ink were conducted. The glass tubes in the phantom experiment are heterogeneous media because the optical coefficients of the solution of Dir-Boa fluorescence dye (high-absorption, low-scattering) are different from those for the mixed solution of Intralipid and India ink, and the refractive index of glass is also mismatched with the mixed solution. The heterogeneity of the fluorescence solution is considered in the calculation of the Green's function for MC-based FMT reconstruction but not in that based on DA with FEM because it is not accurate for DA when μa > μs′.

Case 1. The absorption coefficient of the first experiment was 0.23 mm−1, and the reduced scattering coefficient was 0.68 mm−1. In this case, μα ≈ μs′, which indicates a low-scattering case. In theory, the DA will be invalid. However, the MC simulation was accurate.

Case 2. The absorption coefficient of the second experiment was 0.07 mm−1, and the reduced scattering coefficient was 1.8 mm−1. In this case, μα ≪ μs, which indicates the high-scattering case. In theory, the DA should be as accurate as the MC simulation.

These two types of optical coefficients refer to the muscle tissue and the skin of a rabbit at a wavelength of 790 nm.26 A dual-modality imaging system combined with micro-computer tomography (micro-CT) and FMT 34 was then used to obtain the fluorescence images. A micro-CT slice was used to prove the localization accuracy of the reconstruction.

The reconstruction area (1.35×40.5×20 mm) was smaller than the real size of the phantom because the fluorescence signal only appeared in that area. The spatial resolution of the FMT is about 1 mm;35 therefore, by applying a 1-mm3 discrete accuracy, the reconstruction area was separated into 14×41×20 voxels. In this experiment, there were 99 sources and 220 detectors, so 319 Green functions were calculated. Therefore, the amount of data from the experiment (99×220 = 21,780) was larger than the number of reconstructed fluorescence yield, which was equal to the number of voxels (14×41×20 = 11,480). As a result, the ill-posed FMT reconstruction could be relieved.

Six GPUs were used to calculate the Green's functions, and 105 photon moves were simulated at each thread of the GPU. There were 2560 threads in total and 256 threads in each block of the GPU. We repeated this calculation 12 times to increase the total number of simulation photons and to avoid the kernel launch time-out error.25 In each calculation of a Green's function, there were 1.36 × 107 photons to be simulated on average, and 4.34109 photons were simulated for the total reconstruction process.

The reconstruction method based on a DA with the FEM refers to NIRFAST3637and TOAST.27,38 By applying a 1-mm3 mesh, the reconstruction area was discrete, with 12,054 points and 50,100 tetrahedrons.

Tikhonov regularization was solved with a conjugate gradient method that was stopped at the 18th iteration, when the difference between two consecutive iterations was less than 10−6. The reconstructed distribution of the fluorescence yield is shown in Fig. 2.

Graphic Jump LocationF2 :

Reconstruction slices. (a), (b), and (c) show the reconstruction slices for case 1. (d), (e), and (f) show the reconstruction slices for case 2. Micro-CT refers to the central reconstruction slice with Micro-CT. MC refers to the central reconstruction slice for the FMT reconstruction method based on MC simulations accelerated by a GPU cluster. FEM refers to the central reconstruction slice for the FMT reconstruction method based on a DA with the FEM. The position and size of the white circles in (b) and (c) were determined by the Micro-CT slice in (a), and the white circles in panels (e) and (f) were determined by the Micro-CT slice in panel (d).

The average value of the reconstructed fluorescence coefficient of the four fluorochromes for both case 1 and case 2 was calculated from Figs. 2. These values were calculated with different FMT reconstruction methods based on both MC simulations and DA. Because the fluorescence yield was in direct proportion to the concentration of the fluorochromes, the reconstructed concentrations of four fluorochromes were calculated through a linear fit of the fluorescence yield against the real concentration of the fluorochromes. The results are shown in Fig. 3.

Graphic Jump LocationF3 :

Reconstruction of the concentration of fluorochromes. (a) shows the reconstructed concentration versus the real concentration of four different fluorochromes for case 1. The concentrations were reconstructed by the FMT reconstruction method based on both MC simulations and DA. (b) represents the case 2 experimental conditions. The linear fit function is also shown in (a) and (b).

The error between the linear fit and the real reconstructed concentration was used to determine the coefficient of the linear fit, which refers to the linearity of the reconstructed concentration. The resulting values are listed in Table 2. The sum of squares error (SSE) refers to the error between the linear fit and the reconstructed concentration, and R-squared indicates the linearity of the fit. Under ideal conditions, the SSE is zero and R-squared is 1.

Table Grahic Jump Location
The coefficients of the linear fit.
Acceleration Performance

In this section, the effect of the load-balancing method based on the calculation power of GPU is compared with equal-load-balancing. The acceleration performance for different numbers of GPUs in the GPU cluster is compared for both case 1 and case 2.

We distribute all the tasks into GPU clusters with two different balancing methods: load-balancing with Eq. 7 and equal loading, which equally distributes the task into GPUs. We use the experimental data for case 1 and case 2. The configuration of the MC and GPU is mentioned in Sec. 3b, and the reconstruction time is recorded for comparison. The results are shown in Table 3.

Table Grahic Jump Location
Performance of load-balancing.

From Table 3, we find that load-balancing based on Eq. 7 increases the computational efficiency.

We separately tested the performance of the FMT reconstruction method based on MC simulations with a GPU cluster containing one, two, four, or six GPUs. In every thread of the GPU, 105 photon moves were simulated. There were 7680 threads in total in computer 1 and computer 2, but there were 6144 threads in total in computer 3. Every block in the GPU contained 256 threads. Each GPU in a GTX295 graphics card contains 30 independent multiprocessors (MPs), which contain 16384 registers, and MCX occupies 54 registers at each thread. Therefore, 256 threads could be run simultaneously in each MP. For the entire GPU, 30×256 = 7680 threads could be run simultaneously. However, the GPU on the NVIDIA Quadro FX4800 graphics card of computer 3 only contained 24 MPs, so 24×256 = 6144 threads in total could be run simultaneously. With these settings, the computing power of a graphics card with a G200 framework can be fully unleashed. Each calculation of a Green's function was repeated 12 times to avoid kernel launch time-out errors25 and to increase the total number of simulated photons.

For each calculation of a Green's function, there were 4×107 simulated photons on average and 1.3×1010 in total for the entire reconstruction. Because the total number of photons in the reconstruction process was only determined by the total number of calculated Green's functions, the photon moves in each thread, the total number of threads, and the different numbers of GPUs in the GPU cluster did not influence the total number of simulation photons. However, there were small differences in the total number of photons when the GPU of the Quadro FX4800 in computer 3 was used because the total number of available threads for the Quadro FX4800 was only 6144.

The influence of the number of GPUs in the GPU cluster on the amount of time consumed for the FMT reconstruction method based on MC simulations is shown in Fig. 4. The time data for one and two GPUs in the GPU cluster came from the computing node of computer 1. Four GPUs refers to the computing nodes of computers 1 and 2, and six GPUs in the GPU cluster refers to computing nodes of computers 1, 2, and 3.

Graphic Jump LocationF4 :

The time performance of different numbers of GPUs in the GPU cluster for case 1 and case 2.

We also compared the acceleration ratio between the speeds of the CPU code and the GPU cluster code. The speed of the CPU code was estimated by the tMCimg code,39which was compiled with –O2(maxspeed) in Visual Studio 2008. Because it takes a long time to use the tMCimg in FMT reconstruction, we only tested the speed of tMCimg for one calculation of a Green's function. This speed was approximately equal to the speed for the whole reconstruction because the Green's function calculation consume 95% of the time needed for the total reconstruction. The results showed that for case 1, the calculation speed of a Green's function based on CPU code was 75 photons/ms, and for case 2, the speed was 7.6 photons/ms. The performance using different numbers of GPUs in the GPU cluster is shown in Fig. 5, and the acceleration ratio compared with the CPU is shown in Table 4.

Graphic Jump LocationF5 :

The speed performance of different numbers of GPUs in the GPU cluster for case 1 and case 2.

Table Grahic Jump Location
The acceleration ratio for different numbers of GPUs.

For the low-scattering case (μa ≈ μs′), Figs. 2, and Fig. 3 illustrate that the GPU-cluster-accelerated FMT reconstruction method based on MC simulations accurately reconstructed the localization and concentration of fluorochromes. The traditional reconstruction method based on a DA with the FEM failed in this case. This superiority of our method in reconstructing the concentration is quantitatively demonstrated in Table 2.

In the experiment for the high-scattering case (μα ≪ μs), which has typically been reconstructed with the traditional reconstruction method based on a DA with the FEM, the GPU cluster–accelerated FMT reconstruction method based on MC simulations also accurately reconstructed the localization and concentration of fluorochromes. Figures 2, Fig. 3, and Table 2 show that, in the high-scattering case, the two methods achieve almost identical results. The only disparity is a very small difference in the linearity of the reconstructed concentration because the heterogeneity of the fluorescence solution is considered in the MC-based FMT reconstruction method accelerated by GPU clusters and the size of the fluorescence solution is small, which cannot cause a large error in the calculation of the Green's function. This small difference indicates the advantage of our method for heterogeneous media with refractive-index-unmatched boundaries.

The GPU cluster-accelerated FMT reconstruction method based on MC simulations can accurately reconstruct the localization and concentration of fluorochromes for heterogeneous media under both low- and high-scattering conditions. In contrast, the traditional FMT reconstruction method based on a DA with the FEM was only valid for the case in which μα ≪ μs.

Figure 4 shows that the FMT reconstruction method based on MC simulations accelerated by a single GPU for case 1 and case 2 requires 44 and 49 min to finish the whole process, which does not satisfy the demand for fast reconstruction in FMT. However, the FMT reconstruction method based on MC simulations accelerated with six GPUs for case 1 and case 2 requires only 7.7 and 9.2 min, which is the same as the time required by the FMT reconstruction method based on a DA with the FEM.12 We also found that the more GPUs in the GPU cluster, the less time was required for the FMT reconstruction based on MC simulations, which predicts that a GPU cluster with about 15 Nvidia GTX 480 graphic cards (about 3 times faster than Nvidia GTX 295) may require only 1 min to finish the entire FMT-reconstruction process. Note, however, that the larger reduced scattering coefficient required more processing time in the FMT reconstruction based on MC simulations because a larger reduced scattering coefficient means that the photons need more steps to escape from the tissue, which requires more time to calculate.

The more GPUs in the GPU cluster, the faster the speed of the FMT reconstruction method based on MC simulations. Compared with the CPU code, the FMT reconstruction method based on MC simulations accelerated with 6 GPUs was 3088 times faster for case 2, but only 373 times faster for case 1. This difference is caused by the speed of the FMT reconstruction method based on MC simulations decreasing more quickly for the CPU code than for the GPU cluster when the reduced scattering coefficient increased.

In summary, with the MPI standard and load-balancing method, we propose a fast FMT reconstruction method which is based on MC simulation and accelerated by a GPU cluster. Through two phantom experiments on both high- and low-scattering media, we prove that this method can accurately and rapidly reconstruct the fluorochrome localization and concentration for 3-D heterogeneous media with refractive-index-unmatched boundaries and complex distributions of optical coefficients.

We gratefully acknowledge the financial support of the National Major Scientific Research Program of China (Grant No. UNSPECIFIED 2011CB910400 ) and the National Natural Science Foundation of China (Grant No. NNSF 60828009 ).

Tsien  R. Y., “ Building and breeding molecules to spy on cells and tumors. ,” Febs. Letters.. 579, (4 ), 927–932  ((2005)).
Funovics  M., , Weissleder  R., , and Tung  C. H., “ Protease sensors for bioimaging. ,” Anal. Bioanal. Chem.. 377, (6 ), 956–963  ((2003)).
Ntziachristos  V., “ Fluorescence Molecular Imaging. ,” Annu. Rev. Biomed. Eng.. 8, (1 ), 1–33  ((2006)).
Milstein  A. B., , Oh  S., , Webb  K. J., , Bouman  C. A., , Zhang  Q., , Boas  D. A., , and Millane  R. P., “ Fluorescence optical diffuse tomography. ,” Appl. Opt.. 42, (16 ), 3081–3094  ((2003)).
Churmakov  D. Y., , Meglinski  I. V., , and Greenhalgh  D. A., “ Amending of fluorescence sensor signal localization in human skin by matching of the refractive index. ,” J. Biomed. Opt.. 9, (2 ), 339–46  ((2004)).
Churmakov  D. Y., , Meglinski  I. V., , Piletsky  S. A., , and Greenhalgh  D. A., “ Analysis of skin tissues spatial fluorescence distribution by the Monte Carlo simulation. ,” J. Phys. D. 36, (14 ), 1722–1728  ((2003)).
Kumar  A. T., , Raymond  S. B., , Boverman  G., , Boas  D. A., , and Bacskai  B. J., “ Time resolved fluorescence tomography of turbid media based on lifetime contrast. ,” Opt. Express.. 14, (25 ), 12255–12270  ((2006)).
Montet  X., , Figueiredo  J. L., , Alencar  H., , Ntziachristos  V., , Mahmood  U., , and Weissleder  R., “ Tomographic fluorescence imaging of tumor vascular volume in mice. ,” Radiology. 242, (3 ), 751–758  ((2007)).
Zacharakis  G., , Kambara  H., , Shih  H., , Ripoll  J., , Grimm  J., , Saeki  Y., , Weissleder  R., , and Ntziachristos  V., “ Volumetric tomography of fluorescent proteins through small animals in vivo. ,” Proc. Natl. Acad. Sci. U.S.A.. 102, (51 ), 18252–18257  ((2005)).
Arridge  S. R., and Schotland  J. C., “ Optical tomography: forward and inverse problems. ,” Inverse Probl.. 25, (12 ), 1–59  ((2009)).
Cong  X. A., and Wang  G., “ A finite-element-based reconstrution method for 3-D fluorescence tomography. ,” Opt. Express. 13, (24 ), 9847–9857  ((2005)).
Ralf Schulz  B., , Ale  A., , Sarantopoulos  A., , Freyer  M., , Soehngen  E., , Zientkowska  M., , and Ntziachristos  V., “ Hybrid System for Simultaneous Fluorescence and X-ray Computed Tomography. ,” IEEE Trans. Med. Imaging. 29, (2 ), 465–473  ((2009)).
Kepshire  D., , Mincu  N., , Hutchins  M., , Gruber  J., , Dehghani  H., , Hypnarowski  J., , Leblond  F., , Khayat  M., , and Pogue  B. W., “ A microcomputed tomography guided fluorescence tomography system for small animal molecular imaging. ,” Rev. Sci. Instrum.. 80, (4 ), 043701  ((2009)).
Tan  Y., and Jiang  H., “ Diffuse optical tomography guided quantitative fluorescence molecular tomography. ,” Appl. Opt.. 47, (12 ), 2011–2016  ((2008)).
Joshi  A., , Rasmussen  J. C., , Sevick-Muraca  E. M., , Wareing  T. A., , and McGhee  J., “ Radiative transport-based frequency-domain fluorescence tomography. ,” Phys. Med. Biol.. 53, (8 ), 2069–2088  ((2008)).
Wilson  B. C., and Adam  G., “ A Monte Carlo model for the absorption and flux distributions of light in tissue. ,” Med. Phys.. 10, (6 ), 824–830  ((1983)).
Li  T., “ MCVM: Monte Carlo Modeling of Photon Migration In Voxelized Media. ,” J. Innov. Opt. Health Sci.. 3, (2 ), 91–102  ((2010)).
Wang  L., , Jacques  S. L., , and Zheng  L., “ MCML―Monte Carlo modeling of light transport in multi-layered tissues. ,” Comput. Methods Programs Biomed.. 47, (2 ), 131–46  ((1995)).
Kumar  A. T. N., , Raymond  S. B., , Dunn  A. K., , Bacskai  B. J., , and Boas  D. A., “ A time domain fluorescence tomography system for small animal Imaging. ,” IEEE Trans. Med. Imaging. 27, (8 ), 1152–1163  ((2008)).
Zhang  X., , Badea  C., , Jacob  M., , and Johnson  G. A., “ Development of a noncontact 3-D fluorescence tomography system for small animal in vivo imaging. ,” Proc. SPIE. 7191, , 71910D  ((2009)).
Alerstam  E., , Svensson  T., , and Andersson-Engels  S., “ Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration. ,” J. Biomed. Opt.. 13, (6 ), 060504  ((2008)).
Lo  W. C. Y., , Han  T. D., , Rose  J., , and Lilge  L., “ GPU-accelerated Monte Carlo simulation for photodynamic therapy treatment planning. ,” Proc. SPIE. 7373, , 737313  ((2009)).
Fang  Q. Q., and Boas  D. A., “ Monte Carlo Simulation of Photon Migration in 3-D Turbid Media Accelerated by Graphics Processing Units. ,” Opt. Express.. 17, (22 ), 20178–20190  ((2009)).
Fang  Q.,  Monte Carlo eXtreme (MCX). , (2009); 0.2: Available from: http://mcx.sourceforge.net/cgi-bin/index.cgi?Home.
Beek  J. F., , Blokland  P., , Posthumus  P., , Aalders  M., , Pickering  J. W., , Sterenborg  H. J. C. M., , and vanGemert  M. J. C., “ In vitro double-integrating-sphere optical properties of tissues between 630 and 1064 nm. ,” Phys. Med. Biol.. 42, (11 ), 2255–2261  ((1997)).
Arridge  S. R., “ Optical tomography in medical imaging. ,” Inverse Probl.. 15, (2 ), R41–R93  ((1999)).
Arridge  S. R., and Schweiger  M., “ Photon-measurement density functions. Part 2: Finite-element-method calculations. ,” Appl. Opt.. 34, (34 ), 8026–8037  ((1995)).
Pogue  B. W., , McBride  T. O., , Prewitt  J., , Osterberg  U. L., , and Paulsen  K. D., “ Spatially variant regularization improves diffuse optical tomography. ,” Appl. Opt.. 38, (13 ), 2950–2961  ((1999)).
Tikhonov  A. N., , Leonov  A. S., , and Yagola  A. G.,  Nonlinear Ill-Posed Problems (Applied Mathematics & Mathematical Computation). ,  Springer ,  Berlin  ((1997)).
Calvetti  D., , Morigi  S., , Reichel  L., , and Sgallari  F., “ Tikhonov regularization and the L-curve for large discrete ill-posed problems. ,” J. Comput. Appl. Math.. 123, (1–2 ), 423–446  ((2000)).
Rubaek  T., , Meaney  P. M., , Meincke  P., , and Paulsen  K. D., “ Nonlinear microwave imaging for breast-cancer screening using Gauss-Newton's method and the CGLS inversion algorithm. ,” IEEE Trans. Antenn. Propag.. 55, (8 ), 2320–2331  ((2007)).
Zhang  Z., , Cao  W., , Jin  H., , Lovell  J. F., , Yang  M., , Ding  L., , Chen  J., , Corbin  I., , Luo  Q., , and Zheng  G., “ Biomimetic Nanocarrier for Direct Cytosolic Drug Delivery. ,” Angew. Chem. Int. Ed.. 48, (48 ), 9171–9175  ((2009)).
Yang  X., , Gong  H., , Quan  G., , Deng  Y., , and Luo  Q., “ Combined system of fluorescence diffuse optical tomography and micro-CT for small animal imaging. ,” Rev. Sci. Instrum.. 81, (5 ), 054304  ((2010)).
Graves  E. E., , Ripoll  J., , Weissleder  R., , and Ntziachristos  V., “ A submillimeter resolution fluorescence molecular imaging system for small animal imaging. ,” Med. Phys.. 30, (5 ), 901–11  ((2003)).
Dehghani  H.,  NIRFAST.. (2008); Available from: http://www.dartmouth.edu/∼nir/nirfast/index.php.
Dehghani  H., and Delpy  D. T., “ Near-infrared spectroscopy of the adult head: effect of scattering and absorbing obstructions in the cerebrospinal fluid layer on light distribution in the tissue. ,” Appl. Opt.. 39, (25 ), 4721–9  ((2000)).
Schweiger  M., and Arridge  S.,  TOAST.. (2008); Available from: http://web4.cs.ucl.ac.uk/research/vis/toast/license.html.
Boas  D. A., , Culver  J. P., , Stott  J. J., , and Dunn  A. K., “ Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head. ,” Opt. Express.. 10, (3 ), 159–170  ((2002)).
Comparison of Nvidia graphics processing units. Available from: http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units.
© 2011 Society of Photo-Optical Instrumentation Engineers (SPIE)

Citation

Guotao Quan ; Hui Gong ; Yong Deng ; Jianwei Fu and Qingming Luo
"Monte Carlo–based fluorescence molecular tomography reconstruction method accelerated by a cluster of graphic processing units", J. Biomed. Opt. 16(2), 026018 (February 15, 2011). ; http://dx.doi.org/10.1117/1.3544548


Figures

Graphic Jump LocationF2 :

Reconstruction slices. (a), (b), and (c) show the reconstruction slices for case 1. (d), (e), and (f) show the reconstruction slices for case 2. Micro-CT refers to the central reconstruction slice with Micro-CT. MC refers to the central reconstruction slice for the FMT reconstruction method based on MC simulations accelerated by a GPU cluster. FEM refers to the central reconstruction slice for the FMT reconstruction method based on a DA with the FEM. The position and size of the white circles in (b) and (c) were determined by the Micro-CT slice in (a), and the white circles in panels (e) and (f) were determined by the Micro-CT slice in panel (d).

Graphic Jump LocationF3 :

Reconstruction of the concentration of fluorochromes. (a) shows the reconstructed concentration versus the real concentration of four different fluorochromes for case 1. The concentrations were reconstructed by the FMT reconstruction method based on both MC simulations and DA. (b) represents the case 2 experimental conditions. The linear fit function is also shown in (a) and (b).

Graphic Jump LocationF4 :

The time performance of different numbers of GPUs in the GPU cluster for case 1 and case 2.

Graphic Jump LocationF5 :

The speed performance of different numbers of GPUs in the GPU cluster for case 1 and case 2.

Graphic Jump LocationF1 :

A flowchart of the GPU cluster-accelerated FMT reconstruction method based on MC simulations.

Tables

Table Grahic Jump Location
The coefficients of the linear fit.
Table Grahic Jump Location
Performance of load-balancing.
Table Grahic Jump Location
The hardware of the GPU cluster.
Table Grahic Jump Location
The acceleration ratio for different numbers of GPUs.

References

Tsien  R. Y., “ Building and breeding molecules to spy on cells and tumors. ,” Febs. Letters.. 579, (4 ), 927–932  ((2005)).
Funovics  M., , Weissleder  R., , and Tung  C. H., “ Protease sensors for bioimaging. ,” Anal. Bioanal. Chem.. 377, (6 ), 956–963  ((2003)).
Ntziachristos  V., “ Fluorescence Molecular Imaging. ,” Annu. Rev. Biomed. Eng.. 8, (1 ), 1–33  ((2006)).
Milstein  A. B., , Oh  S., , Webb  K. J., , Bouman  C. A., , Zhang  Q., , Boas  D. A., , and Millane  R. P., “ Fluorescence optical diffuse tomography. ,” Appl. Opt.. 42, (16 ), 3081–3094  ((2003)).
Churmakov  D. Y., , Meglinski  I. V., , and Greenhalgh  D. A., “ Amending of fluorescence sensor signal localization in human skin by matching of the refractive index. ,” J. Biomed. Opt.. 9, (2 ), 339–46  ((2004)).
Churmakov  D. Y., , Meglinski  I. V., , Piletsky  S. A., , and Greenhalgh  D. A., “ Analysis of skin tissues spatial fluorescence distribution by the Monte Carlo simulation. ,” J. Phys. D. 36, (14 ), 1722–1728  ((2003)).
Kumar  A. T., , Raymond  S. B., , Boverman  G., , Boas  D. A., , and Bacskai  B. J., “ Time resolved fluorescence tomography of turbid media based on lifetime contrast. ,” Opt. Express.. 14, (25 ), 12255–12270  ((2006)).
Montet  X., , Figueiredo  J. L., , Alencar  H., , Ntziachristos  V., , Mahmood  U., , and Weissleder  R., “ Tomographic fluorescence imaging of tumor vascular volume in mice. ,” Radiology. 242, (3 ), 751–758  ((2007)).
Zacharakis  G., , Kambara  H., , Shih  H., , Ripoll  J., , Grimm  J., , Saeki  Y., , Weissleder  R., , and Ntziachristos  V., “ Volumetric tomography of fluorescent proteins through small animals in vivo. ,” Proc. Natl. Acad. Sci. U.S.A.. 102, (51 ), 18252–18257  ((2005)).
Arridge  S. R., and Schotland  J. C., “ Optical tomography: forward and inverse problems. ,” Inverse Probl.. 25, (12 ), 1–59  ((2009)).
Cong  X. A., and Wang  G., “ A finite-element-based reconstrution method for 3-D fluorescence tomography. ,” Opt. Express. 13, (24 ), 9847–9857  ((2005)).
Ralf Schulz  B., , Ale  A., , Sarantopoulos  A., , Freyer  M., , Soehngen  E., , Zientkowska  M., , and Ntziachristos  V., “ Hybrid System for Simultaneous Fluorescence and X-ray Computed Tomography. ,” IEEE Trans. Med. Imaging. 29, (2 ), 465–473  ((2009)).
Kepshire  D., , Mincu  N., , Hutchins  M., , Gruber  J., , Dehghani  H., , Hypnarowski  J., , Leblond  F., , Khayat  M., , and Pogue  B. W., “ A microcomputed tomography guided fluorescence tomography system for small animal molecular imaging. ,” Rev. Sci. Instrum.. 80, (4 ), 043701  ((2009)).
Tan  Y., and Jiang  H., “ Diffuse optical tomography guided quantitative fluorescence molecular tomography. ,” Appl. Opt.. 47, (12 ), 2011–2016  ((2008)).
Joshi  A., , Rasmussen  J. C., , Sevick-Muraca  E. M., , Wareing  T. A., , and McGhee  J., “ Radiative transport-based frequency-domain fluorescence tomography. ,” Phys. Med. Biol.. 53, (8 ), 2069–2088  ((2008)).
Wilson  B. C., and Adam  G., “ A Monte Carlo model for the absorption and flux distributions of light in tissue. ,” Med. Phys.. 10, (6 ), 824–830  ((1983)).
Li  T., “ MCVM: Monte Carlo Modeling of Photon Migration In Voxelized Media. ,” J. Innov. Opt. Health Sci.. 3, (2 ), 91–102  ((2010)).
Wang  L., , Jacques  S. L., , and Zheng  L., “ MCML―Monte Carlo modeling of light transport in multi-layered tissues. ,” Comput. Methods Programs Biomed.. 47, (2 ), 131–46  ((1995)).
Kumar  A. T. N., , Raymond  S. B., , Dunn  A. K., , Bacskai  B. J., , and Boas  D. A., “ A time domain fluorescence tomography system for small animal Imaging. ,” IEEE Trans. Med. Imaging. 27, (8 ), 1152–1163  ((2008)).
Zhang  X., , Badea  C., , Jacob  M., , and Johnson  G. A., “ Development of a noncontact 3-D fluorescence tomography system for small animal in vivo imaging. ,” Proc. SPIE. 7191, , 71910D  ((2009)).
Alerstam  E., , Svensson  T., , and Andersson-Engels  S., “ Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration. ,” J. Biomed. Opt.. 13, (6 ), 060504  ((2008)).
Lo  W. C. Y., , Han  T. D., , Rose  J., , and Lilge  L., “ GPU-accelerated Monte Carlo simulation for photodynamic therapy treatment planning. ,” Proc. SPIE. 7373, , 737313  ((2009)).
Fang  Q. Q., and Boas  D. A., “ Monte Carlo Simulation of Photon Migration in 3-D Turbid Media Accelerated by Graphics Processing Units. ,” Opt. Express.. 17, (22 ), 20178–20190  ((2009)).
Fang  Q.,  Monte Carlo eXtreme (MCX). , (2009); 0.2: Available from: http://mcx.sourceforge.net/cgi-bin/index.cgi?Home.
Beek  J. F., , Blokland  P., , Posthumus  P., , Aalders  M., , Pickering  J. W., , Sterenborg  H. J. C. M., , and vanGemert  M. J. C., “ In vitro double-integrating-sphere optical properties of tissues between 630 and 1064 nm. ,” Phys. Med. Biol.. 42, (11 ), 2255–2261  ((1997)).
Arridge  S. R., “ Optical tomography in medical imaging. ,” Inverse Probl.. 15, (2 ), R41–R93  ((1999)).
Arridge  S. R., and Schweiger  M., “ Photon-measurement density functions. Part 2: Finite-element-method calculations. ,” Appl. Opt.. 34, (34 ), 8026–8037  ((1995)).
Pogue  B. W., , McBride  T. O., , Prewitt  J., , Osterberg  U. L., , and Paulsen  K. D., “ Spatially variant regularization improves diffuse optical tomography. ,” Appl. Opt.. 38, (13 ), 2950–2961  ((1999)).
Tikhonov  A. N., , Leonov  A. S., , and Yagola  A. G.,  Nonlinear Ill-Posed Problems (Applied Mathematics & Mathematical Computation). ,  Springer ,  Berlin  ((1997)).
Calvetti  D., , Morigi  S., , Reichel  L., , and Sgallari  F., “ Tikhonov regularization and the L-curve for large discrete ill-posed problems. ,” J. Comput. Appl. Math.. 123, (1–2 ), 423–446  ((2000)).
Rubaek  T., , Meaney  P. M., , Meincke  P., , and Paulsen  K. D., “ Nonlinear microwave imaging for breast-cancer screening using Gauss-Newton's method and the CGLS inversion algorithm. ,” IEEE Trans. Antenn. Propag.. 55, (8 ), 2320–2331  ((2007)).
Zhang  Z., , Cao  W., , Jin  H., , Lovell  J. F., , Yang  M., , Ding  L., , Chen  J., , Corbin  I., , Luo  Q., , and Zheng  G., “ Biomimetic Nanocarrier for Direct Cytosolic Drug Delivery. ,” Angew. Chem. Int. Ed.. 48, (48 ), 9171–9175  ((2009)).
Yang  X., , Gong  H., , Quan  G., , Deng  Y., , and Luo  Q., “ Combined system of fluorescence diffuse optical tomography and micro-CT for small animal imaging. ,” Rev. Sci. Instrum.. 81, (5 ), 054304  ((2010)).
Graves  E. E., , Ripoll  J., , Weissleder  R., , and Ntziachristos  V., “ A submillimeter resolution fluorescence molecular imaging system for small animal imaging. ,” Med. Phys.. 30, (5 ), 901–11  ((2003)).
Dehghani  H.,  NIRFAST.. (2008); Available from: http://www.dartmouth.edu/∼nir/nirfast/index.php.
Dehghani  H., and Delpy  D. T., “ Near-infrared spectroscopy of the adult head: effect of scattering and absorbing obstructions in the cerebrospinal fluid layer on light distribution in the tissue. ,” Appl. Opt.. 39, (25 ), 4721–9  ((2000)).
Schweiger  M., and Arridge  S.,  TOAST.. (2008); Available from: http://web4.cs.ucl.ac.uk/research/vis/toast/license.html.
Boas  D. A., , Culver  J. P., , Stott  J. J., , and Dunn  A. K., “ Three dimensional Monte Carlo code for photon migration through complex heterogeneous media including the adult human head. ,” Opt. Express.. 10, (3 ), 159–170  ((2002)).
Comparison of Nvidia graphics processing units. Available from: http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units.

Some tools below are only available to our subscribers or users with an online account.

Related Content

Customize your page view by dragging & repositioning the boxes below.

Related Book Chapters

Topic Collections

Advertisement
  • Don't have an account?
  • Subscribe to the SPIE Digital Library
  • Create a FREE account to sign up for Digital Library content alerts and gain access to institutional subscriptions remotely.
Access This Article
Sign in or Create a personal account to Buy this article ($20 for members, $25 for non-members).
Access This Proceeding
Sign in or Create a personal account to Buy this article ($15 for members, $18 for non-members).
Access This Chapter

Access to SPIE eBooks is limited to subscribing institutions and is not available as part of a personal subscription. Print or electronic versions of individual SPIE books may be purchased via SPIE.org.