Literature DB >> 35534924

Graphics-processing-unit-accelerated Monte Carlo simulation of polarized light in complex three-dimensional media.

Shijie Yan¹, Steven L Jacques², Jessica C Ramella-Roman³, Qianqian Fang⁴.

Abstract

SIGNIFICANCE: Monte Carlo (MC) methods have been applied for studying interactions between polarized light and biological tissues, but most existing MC codes supporting polarization modeling can only simulate homogeneous or multi-layered domains, resulting in approximations when handling realistic tissue structures. AIM: Over the past decade, the speed of MC simulations has seen dramatic improvement with massively parallel computing techniques. Developing hardware-accelerated MC simulation algorithms that can accurately model polarized light inside three-dimensional (3D) heterogeneous tissues can greatly expand the utility of polarization in biophotonics applications. APPROACH: Here, we report a highly efficient polarized MC algorithm capable of modeling arbitrarily complex media defined over a voxelated domain. Each voxel of the domain can be associated with spherical scatters of various radii and densities. The Stokes vector of each simulated photon packet is updated through photon propagation, creating spatially resolved polarization measurements over the detectors or domain surface.
RESULTS: We have implemented this algorithm in our widely disseminated MC simulator, Monte Carlo eXtreme (MCX). It is validated by comparing with a reference central-processing-unit-based simulator in both homogeneous and layered domains, showing excellent agreement and a 931-fold speedup.
CONCLUSION: The polarization-enabled MCX offers biophotonics community an efficient tool to explore polarized light in bio-tissues, and is freely available at http://mcx.space/.

Entities: Chemical

Keywords: Monte Carlo method; light transport; optical imaging; polarization

Mesh：

Year: 2022 PMID： 35534924 PMCID： PMC9084406 DOI： 10.1117/1.JBO.27.8.083015

Source DB: PubMed Journal: J Biomed Opt ISSN： 1083-3668 Impact factor: 3.758

Introduction

Polarized light has been found to be highly sensitive to medium structures and hence has been widely adopted in optical imaging to probe microstructural features inside biological tissues. For example, the polarization status of the backscattered light can be measured to characterize the superficial layer of skin for cancer diagnostic purposes., The measurements of tissue birefringence permit quantification of abnormalities of the retinal nerve fiber layer and cornea, as well as three-dimensional (3D) reconstruction of nerve fiber orientations inside human brains and orientations of collagen within the uterine cervix. By measuring the unequal absorption of left-handed and right-handed circularly polarized light, circular dichroism can rapidly determine the folding properties of proteins. Quantification of collagen and birefringent media alignment can improve evaluation of a therapeutic strategy and its outcome in scar management. Polarized light imaging (PLI) uses linearly co-polarized images subtracted by those of cross-polarized light to create a differential image based on the small population of superficially scattered photons that still retain much of the incident polarization state. PLI subtracts the large randomized population of multiple-scattered photons that produce a blinding background of diffuse light. The resulting difference image enhances the contrast of superficial tissue layers and rejects deeper tissue structures, enabling wide-field screening of epidermal or epithelial layers. Mueller matrix polarimetry and the use of various decomposition methods can also be used to pinpoint different regions and structures within biological tissue., Accurately simulating polarized light transport inside complex tissues allow quantitative investigations of the depth response of polarized light and the perturbations produced by local tissue abnormalities. The propagation of polarized light inside scattering media can be described by the vector radiative transfer equation (VRTE). Analytically solving the VRTE is not viable in complex media such as human tissues. Owing to its high flexibility and simplicity in programming, the Monte Carlo (MC) method, among other numerical techniques, has been one of the limited approaches available to quantitatively model interactions of polarized light with scattering media. Depending on the vectorial representations of polarization states, polarized light MC algorithms can be largely categorized into two formalisms—Jones calculus and Mueller calculus.,, The Jones calculus used in the electric-field MC (EMC) algorithm traces the amplitudes and phases of two orthogonal electric-field components (Jones vector) and is therefore well suited for simulating light coherence effects. On the other hand, the Mueller calculus describes the state of polarization using the Stokes vector. The Stokes vector does not contain the absolute phase of the electrical field but allows to model unpolarized, partially polarized and fully polarized light. It can be obtained by measuring four intensity values. Similarly, Mueller matrices can be obtained through 16 intensity measurements and using Mueller matrix decomposition, quantities, such as tissue retardation, depolarization and de-attenuation can be obtained. A well-known limitation of MC methods is the long computation time. Due to the rapid emergence of massively parallel computing techniques, benefit largely from the fast advances in many core processors such as graphics processing units (GPUs), MC simulations of polarized light have seen significant speed improvement over the past decade. Several groups have reported parallel EMC implementations. Wang et al. presented a compute unified device architecture (CUDA)-based EMC to model coherent light in a single homogeneous slab and achieved over speedup compared to the central processing unit (CPU) counterpart. Ding et al. extended the GPU-based EMC algorithm to consider multi-layered media at the expense of reduced speedup (). In addition, Li et al. presented a CUDA-based polarized light MC algorithm to model interstitial media embedded with spherical and cylindrical scatterers. It employed a single-kernel scheme and was hundreds of times faster than its CPU version. In 2019, Oulhaj et al. reported a GPU-accelerated MC algorithm to efficiently compute the sensitivity profile for polarized light inside homogeneous media. The reported GPU implementation was verified against a widely used CPU-based code developed by Ramella-Roman et al. and reported over speedup. Although these studies have demonstrated significantly improved simulation speed, most of these simulators only support layered domains and can not address the needs in modeling increasingly complex media. In the simulation of biological tissues with irregular-shaped structures, employing simplifications in domain geometries could introduce significant errors. For example, in the MC modeling of human brains, noticeable differences have been observed between layered-slab models and more anatomically realistic models such as voxel-based and mesh-based brain models. In this work, we present an open-source and GPU-accelerated MC simulator to model polarized light inside 3D heterogeneous media. This MC algorithm utilizes a 3D voxelated grid to represent spatially varying distributions of spherical scatterers, characterized by their radii and densities. We use Muller calculus to update the Stokes vectors of simulated photon packets, from which we can compute various polarimetry related measurements along the surface of the domain. In the remainder of this paper, we first briefly review the steps of the meridian-plane MC algorithm. Then we detail our GPU-implementation of this algorithm, as part of our enhanced open-source MC software—Monte Carlo eXtreme (MCX), including the preprocessing steps to encode the distribution of particles into a 3D array data structure and optimization strategies to better use GPU resources. In Sec. 3, we validate the proposed GPU-based polarization-enabled MCX (pMCX) against the widely used CPU MC simulator “meridianMC” written by Ramella-Roman et al. and quantify the speed improvement using several benchmarks of homogeneous and heterogeneous domains. Finally, we summarize the key findings and discuss future directions.

Methods

Meridian-Plane Polarized Light MC

The meridian-plane polarized light MC algorithm largely follows the standard MC photon transport simulation steps, including “launch,” “move,” “absorb,” “scatter,” and “detection.” At the “launch” stage, the initial weight, position, and direction vector are defined for each photon packet depending on the profile of the incident beam. To describe the polarization state, the Stokes vector is defined with respect to the initial meridian plane for every simulated photon packet. The meridian plane is defined by the plane spanned by the photon propagation direction and the axis, as shown in Fig. 2 in Ref. 24. The Stokes vector consists of four quantities [], where () describes the total light intensity, () controls the mixing between horizontally () and vertically () linearly polarized light, () controls the mixing between () and () linearly polarized light, and () controls the mixing between right () and left () circularly polarized light. After the “launch” step, the photon packet starts propagating inside the simulation domain. In lossy media, the packet weight is monotonically reduced along the photon’s paths and the weight loss is accumulated into the local grid element (such as a voxel or tetrahedral element). When arriving at an interaction site, the photon packet changes direction due to scattering. To compute the new direction cosines, the scattering zenith angle and azimuth angle are statistically sampled. Compared to the standard MC, the scattering step in a polarized light simulation requires additional computation to properly update . First, the probability density function (also known as the scattering phase function) of polarized light has a bivariate dependence on both and . For incident light with a Stokes vector , the phase function is where and are elements from the scattering matrix from a homogeneous spherical particle, computed via the Mie theory A rejection method is employed to select both angles and . Once and are determined, the Stokes vector must be rotated relative to the new meridian plane using to update the polarization states. To efficiently perform the rejection method and Stokes vector rotation, the elements of the scattering matrix of all user-specified spherical scatter species are pre-computed over a discreteized set of . A photon packet is terminated when it escapes from the simulation domain or, if enabled, fails to survive a Russian roulette. It is noteworthy that the original meridian-plane polarized light MC assumes refractive-index matched domain boundaries. The Stokes vector of the escaping photon is rotated relative to the meridian plane of the detector placed immediately outside the domain boundaries, before being accumulated to generate desired output quantities. A detailed description of the formulas used in the meridian plane MC algorithm can be found in the literature.

Implementing Meridian Plane MC in MCX

The original CPU-based meridian-plane MC program, referred to as “mcMeridian” (stok1.c) hereinafter, is dedicated to modeling homogeneous infinite slab geometries. In contrast, the CUDA-based MCX is capable of modeling arbitrarily heterogeneous media represented by a 3D voxelated domain. In a non-polarization MCX simulation, the domain is represented by a 3D integer array with each number representing the index or label of the tissue at each voxel. The actual optical properties of the tissue label are stored in a look-up table, with four element per tissue type: absorption coefficient (1/mm), scattering coefficient (1/mm), anisotropy , and refractive index . To simulate polarized light, the scattering properties of each type of scatterer must be included in addition. To simply the computation, here we only consider spherical scatterers. The radius (), refractive index , and volumetric number density () of the spherical particle scatters can be specified for each tissue type. When spherical particle properties are specified, the corresponding , , and elements of the scattering matrix are pre-computed using the Mie theory on the host (i.e., CPU). As shown in Eq. (2), the scattering matrix of homogeneous spherical scatterers consists of four independent floating-point numbers , and . In our implementation, the scattering parameters are sampled at 1000 evenly spaced points between 0 and , as done in mcMeridian. The pre-computed optical properties and scattering matrix data are then transferred to the device (i.e., GPU), as shown in Fig. 1.

Fig. 1

Media representation and media data preprocessing in a polarization-enabled Monte Carlo simulation.

Results and Discussions

In this section, we first validate the aforementioned pMCX in a homogeneous slab using the single-threaded CPU-based implementation (mcMeridian) as a reference, which has been extensively used by the community and validated by experimental studies. The speed improvement is also quantified. It is noteworthy that mcMeridian simulates an infinite slab media geometry in the directions, whereas in an MCX simulation, a photon is confined inside a bounding box with user-specified dimensions. To ensure that our speed comparison is valid, we modified the source code of mcMeridian and added an implicit bounding box to match the dimension settings in pMCX. In the first benchmark, the simulation domain is a homogeneous slab, the Mie scattering parameters of the embedded spherical scatterer are , , , and . The refractive index of the background medium is . A monochromatic pencil beam source is positioned at the bottom center of the domain, pointing toward the axis and emitting horizontally polarized light at wavelength nm. The initial Stokes vector of the incident beam is . The backscattered photons are collected by a square-shaped area detector () placed on the boundary at . In this benchmark, photon packets are simulated on a desktop running Ubuntu 18.04 with an Intel i7-6700K CPU and an NVIDIA RTX 2080 GPU. In Fig. 2, we compare the distribution of backscattered [] components using contour plots in MATLAB (MathWorks, Inc., Natick, Massachusetts, United States) and observed excellent agreement between mcMeridian and pMCX solutions. For further quantitative analysis, the root-mean-square errors of (in scale) between mcMeridian and pMCX are computed on the matching detector area (), reporting 0.0076, 0.0881, 0.1015, and 0.0938, respectively. We measure the total runtimes, including input data preprocessing, photon transport simulation, and output image generation, with mcMeridian reporting 18111.44 s using the Intel CPU and pMCX reporting 19.45 s on the NVIDIA RTX 2080 GPU, suggesting a speedup. In addition, we also benchmark simulation speeds when storing the scattering matrix data over different GPU memory locations, including global, shared, and constant memories. The global memory implementation reports the fastest speed at 8401 photons/ms, followed by the shared memory (4965 photons/ms) and constant memory (2182 photons/ms) implementations. Although the shared memory is known to be the fastest among the three memory types, its has a very small size, up to 48 KB per block. For storing the scattering matrix of a single species of scatterer at 1000 angular steps, a total of 16 KB memory is needed. Allocating a large amount of shared memory can lead to drastically reduced active block number, which explains the lower speed compared to the global memory case. On the other hand, constant memory also has a small size (64 KB). It is most efficient when a memory value is being reused many times after a single read. However, the use of the rejection method requires random access to the buffer which fails to be accelerated by the constant memory due to high “cache-miss.”

Fig. 2

Contour plots of the absolute backscattered (a) , (b) , (c) and (d) (in scale) generated by mcMeridian (black solid lines) and pMCX (white dashed lines).

Contour plots of the absolute backscattered (a) , (b) , (c) and (d) (in scale) generated by mcMeridian (black solid lines) and pMCX (white dashed lines). In the next benchmark, we further validate our pMCX simulator by comparing with an extended mcMeridian (with added support of layered media) in a two-layer domain. The slab-shaped simulation domain has a size of with the thickness of the superficial layer, , ranging between 0 and 10 mm. The Mie scattering parameters are , , sphere radius , number density , , and for the superficial layer. The bottom layer has , , and the other parameters are the same as the superficial layer. These choices of and yield a reduced scattering coefficient for both layers, along with the same absorption and hence approximately the same reflected intensity for all . A pencil beam is located on the surface of the slab at , pointing toward the axis and emitting horizontally polarized light, with the initial Stokes vector . A total of photon packets are simulated for both mcMeridian and pMCX. In Fig. 3, we plot the total reflected and components as a function of the superficial layer thickness . We do not include total and plots because they are nearly zeros across all values, this is expected as and distributions sum to zeros due to symmetric positive and negative components. The outputs from mcMeridian (black solid lines) and those from pMCX (red circles) once again show excellent agreements. The plot of increases with the thickness of the superficial layer, which contains smaller spherical scatters and hence stronger back-scattering than the deeper layer. With thickness growing from 0 to 1 cm, the total value shows a minuscule increase by 0.17% (from 0.9080 to 0.9095), as shown in the inset in Fig. 3(a), as a result of sub-diffusive scattering. The two-phase transition of matches our expectations: when the superficial layer is very thin, the reflectance values are close to the value as if the domain is entirely filled with the bottom medium (green dashed line); as we increase , the reflectance values asymptotically approach those determined by the media in the superficial layer (blue dashed line).

Fig. 3

Validation of pMCX in a two-layer domain. We plot the backscattered (a) and (b) components computed by pMCX and mcMeridian as the superficial layer thickness () increases from 0 to 10 mm. Two dashed lines in (b) indicate back-scattered values computed from a homogeneous slab filled only with the medium of the bottom layer (green) and that of the superficial layer (blue). The inset in (a) shows a zoom-in view of the axis of the component obtained by pMCX to demonstrate subtle variations due to sub-diffusive scattering effect. Finally, we show simulation of a slab-shaped medium embedded with a spherical inclusion, showcasing pMCX’s capability of modeling heterogeneous domains. In this benchmark (Fig. 4), the simulation domain is a slab with a spherical inclusion of radius 0.5 mm centered at . The inclusion and the slab share identical absorption coefficient , reduced scattering coefficient , and refractive index . However, the Mie scatters inside both domains are different. The background medium is filled with scatterers of radius and volume density ; the inclusion is filled with scatterers of radius and volume density . The choices of and values in either domain was computed based on the Mie theory to ensure their reduced scattering coefficients are the same. All spherical scatterings have a refractive index of, . A uniform planar light source is placed at the bottom () surface, pointing toward the axis and emitting horizontally polarized light with the wavelength nm. A cyclic boundary condition is applied to the four bounding box facets at directions to approximate an infinite slab and infinite-plane source. The incident Stokes vector is . A total of photons are simulated on an NVIDIA RTX 2080 GPU. We compare the distributions of backscattered at , as shown in Fig. 4. Because the inclusion and background slab share the same , , and , a regular diffuse optics forward model without polarization capability would generate no contrast to the inclusion. However, our pMCX simulation has revealed distinct image contrasts in , , and images at the correct inclusion locations, suggesting the potential to detect tissue microstructure differences using polarized light. It is noteworthy that the noise level in each of the images is partially related to the anisotropy determined based on the background scatterer parameters—a larger spherical radius results in a higher value and less back-scattered photons. The significantly higher amplitude of inclusion contrast in image compared to that in further demonstrates the advantages of using polarized imaging in detecting scattering differences compared to traditional diffuse optics where only is typically measured.

Fig. 4

Distributions of (a) , (b) , (c) and (d) backscattered from a slab. A spherical inclusion of radius 0.5 mm is centered at (6,6,0.6) mm.

Conclusion

In summary, we report a massively parallel implementation of polarized MC algorithm in our MCX simulator for modeling the propagation of polarized light inside complex media filled with spherical scatterers. Enabled by its built-in voxel-based geometric representation, the pMCX can handle arbitrarily heterogeneous media. We have described the preprocessing steps to encode the scattering properties of spherical particles with various radii and volume densities into a 3D voxel-based data structure. We provide validation and speed benchmarks ranging from simple homogeneous to complex heterogeneous domains. In all benchmarks, our pMCX solver reports excellent match with the widely used reference solver mcMeridian, while providing a speedup nearly three orders of magnitude. In addition, we observed different GPU memory utilization efficiency among global, constant, and shared memories, with the global memory implementation yielding the highest speed and least restriction. It is noteworthy that a limitation in both mcMeridian and pMCX simulations is that all media boundaries are assumed to have matched refractive indices. We plan to further extend this work to update the Stokes vector across mismatched boundaries.

23 in total

1. Light propagation in biological tissue.

Authors: Arnold D Kim; Joseph B Keller
Journal: J Opt Soc Am A Opt Image Sci Vis Date: 2003-01 Impact factor: 2.129

2. Imaging skin pathology with polarized light.

Authors: Steven L Jacques; Jessica C Ramella-Roman; Ken Lee
Journal: J Biomed Opt Date: 2002-07 Impact factor: 3.170

3. Monte Carlo simulations of the diffuse backscattering mueller matrix for highly scattering media.

Authors: S Bartel; A H Hielscher
Journal: Appl Opt Date: 2000-04-01 Impact factor: 1.980

4. Three Monte Carlo programs of polarized light transport into scattering media: part II.

Authors: Jessica C Ramella-Roman; Scott A Prahl; Steven L Jacques
Journal: Opt Express Date: 2005-12-12 Impact factor: 3.894

5. Robust concentration determination of optically active molecules in turbid media with validated three-dimensional polarization sensitive Monte Carlo calculations.

Authors: Daniel Côté; I Vitkin
Journal: Opt Express Date: 2005-01-10 Impact factor: 3.894

6. Three Monte Carlo programs of polarized light transport into scattering media: part I.

Authors: Jessica Ramella-Roman; Scott Prahl; Steve Jacques
Journal: Opt Express Date: 2005-06-13 Impact factor: 3.894

7. Monte Carlo modeling of polarized light propagation: Stokes vs. Jones. Part I.

Authors: H Günhan Akarçay; Ansgar Hohmann; Alwin Kienle; Martin Frenz; Jaro Rička
Journal: Appl Opt Date: 2014-11-01 Impact factor: 1.980

8. GPU acceleration of Monte Carlo simulations for polarized photon scattering in anisotropic turbid media.

Authors: Pengcheng Li; Celong Liu; Xianpeng Li; Honghui He; Hui Ma
Journal: Appl Opt Date: 2016-09-20 Impact factor: 1.980

9. Mesh-based Monte Carlo method using fast ray-tracing in Plücker coordinates.

Authors: Qianqian Fang
Journal: Biomed Opt Express Date: 2010-07-15 Impact factor: 3.732

10. Improving model-based functional near-infrared spectroscopy analysis using mesh-based anatomical and light-transport models.

Authors: Anh Phong Tran; Shijie Yan; Qianqian Fang
Journal: Neurophotonics Date: 2020-02-22 Impact factor: 3.593

2 in total

1. History of Monte Carlo modeling of light transport in tissues using mcml.c.

Authors: Steven L Jacques
Journal: J Biomed Opt Date: 2022-07 Impact factor: 3.758

2. Special Section Guest Editorial: Introduction to the Special Section Celebrating 30 years of Open Source Monte Carlo Codes in Biomedical Optics.

Authors: Qianqian Fang; Fabrizio Martelli; Lothar Lilge
Journal: J Biomed Opt Date: 2022-08 Impact factor: 3.758

2 in total