Direct Parametric Reconstruction With Joint Motion Estimation/Correction for Dynamic Brain PET Data

—Direct reconstruction of parametric images from raw photon counts has been shown to improve the quantitative analysis of dynamic positron emission tomog- raphy (PET) data. However it suffers from subject motion which is inevitable during the typical acquisition time of 1–2 hours. In this work we propose a framework to jointly estimate subject head motion and reconstruct the motion- corrected parametric images directly from raw PET data, so that the effects of distorted tissue-to-voxel mapping due to subject motion can be reduced in reconstructing the parametric images with motion-compensated attenu- ation correction and spatially aligned temporal PET data. The proposed approach is formulated within the maximum likelihood framework, and efﬁcient solutions are derived for estimating subject motion and kinetic parameters from raw PET photon count data. Results from evaluations on simulated [ 11 C]raclopride data using the Zubal brain phantom and real clinical [ 18 F]ﬂorbetapir data of a patient with Alzheimer’s disease show that the proposed joint direct parametric reconstruction motion correction approach can improve the accuracy of quantifying dynamic PET data with large subject motion.


D YNAMIC Positron Emission Tomography (PET)
imaging plays an important role in medical research as it allows for the in vivo quantification of a wide range of biological parameters [1]- [3]. To derive the biological parameters of interest, conventionally the raw projection data recorded by PET detectors are divided and reconstructed into a series of temporal frames to provide the spatial distribution of the PET tracer over time, and then the time activity curves on a voxel/region basis are extracted for kinetic analysis with a selected model. For such indirect methods, modelling the noise distribution in the kinetic analysis of the sequence of reconstructed activity images is difficult, and direct parametric reconstruction approaches [4]- [11] have been developed to reduce noise amplification in kinetic quantification, by incorporating the kinetic model into the reconstruction to derive the kinetic parameters directly from the raw PET data, with improved modelling of the noise statistics.
Meanwhile, subject motion is another fundamental problem in dynamic PET imaging as the scan duration is frequently one hour long and sometimes can exceed two hours depending on the tracer kinetics. Subject motion is therefore inevitable, causing a mismatch in the attenuation correction and displacements in tracer distribution that propagate into the reconstructed PET data. If not corrected for, subject motion may lead to significant errors in the kinetic quantification. Dynamic PET motion correction can be performed retrospectively to realign the reconstructed PET time frames [12]- [16], however such approaches cannot address the motion that occurred within the time frames. External motion tracking systems can also be applied to measure the subject's rigid head movements during a PET scan [17]- [20]. The tracking-based methods obtain the motion information independent of the PET data, and therefore can be applied in principle to PET data using any tracer. However this requires additional devices and calibrations, as well as synchronisation of motion tracking data with the dynamic PET data.
In this work we focus on PET data-based motion correction methods. We presented the preliminary work at the 17th International Conference on Medical Image Computing and Computer Assisted Intervention [21] where we proposed This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/ Fig. 1. Overview of the proposed work for direct parametric reconstruction of dynamic brain PET data with joint subject motion estimation/correction. to estimate subject motion and kinetic parameters directly from the raw PET photon count data. Here we make additional modifications to the PET forward model in the maximum likelihood framework to include the motion-transformed attenuation. To estimate subject motion and kinetic parameters directly from the raw PET photon count data, efficient solutions are derived with a linearised kinetic model and analytical derivatives for the motion parameters. The proposed approach for direct parametric reconstruction with joint motion correction shows improved accuracy of quantifying dynamic PET data with large subject movements in evaluations on simulated [ 11 C]raclopride data using the Zubal brain phantom and real clinical [ 18 F]florbetapir data of a patient with Alzheimer's disease.
This paper is organised as follows. We start with the description of the proposed method in Sec. II. Then Sec. III presents the simulation-based evaluation of the proposed method and a comparison with other PET data-based motion correction methods, and the application of these methods to clinical [ 18 F]florbetapir data. Finally the discussion is provided in Sec. IV. Figure 1 provides an overview of the framework for direct parametric reconstruction of dynamic brain PET data with subject motion correction. The method uses alternating optimisation of the likelihood function (Sec. II-A) in terms of the kinetic (Sec. II-B) and motion (Sec. II-C) parameters.

A. Likelihood Function for Dynamic PET Data With Motion
In PET, the photon counts g i,l measured for the line-ofresponse (LOR) indexed by i during a time interval l is Poisson distributed. Given the expected number of countsḡ i,l , the loglikelihood function for all measured events is Here we formulate the expectedḡ i,l by a forward projection with attenuation of the activity in the image space, modelled by tracer kinetics θ and subject motion m, as illustrated by Figure 2.
Firstly, we use a linearised kinetic model to describe the motion-free tracer time activity. The kinetic parameters are defined in the image space on a voxel grid x = {x j } n v j =1 ⊂ R 3 , where n v is the number of voxels, and are represented by the parametric image θ ∈ R n v ×n k + , of which the j -th row θ j ∈ R n k + represents the kinetic parameters at voxel j , n k being the number of kinetic parameters. Using the matrix of temporal basis functions B ∈ R n k ×n t + , the time activity f ∈ R n v ×n t + is modelled by B is pre-calculated and has non-negative values. Here B is generated by using spectral analysis [22], and each entry [ B] q,l = B q,l , where q is the index of basis functions, is computed by where ⊗ stands for the convolution operator, t denotes time, C p (t) is the metabolite-corrected plasma input function of the tracer, and φ = {φ q } n k q=1 is chosen in a physiologically plausible range. For more details on spectral analysis the readers are referred to [22].
For a time frame l, the activity f l = [θ B] l is transformed due to subject motion parameterised by a vector m l . In this work m l ∈ R 6 because we limit our study to rigid transformations. Using a square matrix W m l ∈ R n v ×n v to describe the image transformation, the motion-transformed activity is W m l f l and the attenuation map μ sampled on the same grid is affected by the same transformation. Note that no assumptions need to be made on the alignment of the attenuation map and the real activity, the motion parameters are estimated relative to the attenuation map μ, and { f l } are aligned with μ.
With the system information matrix P ∈ R n b ×n v + , the expected count of photons is where is the attenuation factor with P a being the projection matrix for attenuation, and r i,l is the expected count of background events (scatter and randoms). In (4) and (5), the transformation is performed before applying projection. To calculate W m l f l and W m l μ, discrete f l and μ are interpolated to continuous f l and μ in R 3 , such that f l = f l (x) and μ = μ(x), and the transformation is applied by composing f l and μ with the 3-dimensional transformation w m l , and where for brain PET data, w m l is the rigid transformation.

B. Update Kinetic Parameters θ
With a given estimation of the motion m = {m l }, updating the kinetic parameters θ can be considered as a standard direct parametric reconstruction with additional image transformation steps. Since a linear kinetic model is used, θ can be estimated by applying the MLEM algorithm to maximise the likelihood function in (1) with respect to θ . However as pointed out in [9], [23], the convergence of this approach can be very slow when the temporal basis functions B in the spectral analysis are highly correlated. Here we adopted the nested-EM method [24] that improves the convergence by introducing surrogate functions that transfers the optimisation into a problem where the estimation of the kinetic parameters with given motion parameters is separated from the image reconstruction.
At iteration (n + 1), firstly the intermediate dynamic images where f (n) denotes the transpose of a matrix. W m l and W m l are both performed by image transformations as in (6). Then θ is updated as another EM step by n z iterations with { f (n),em l } by the equation where and B j,q,l = [W m l P A l 1] j B q,l .This step is initialised with θ (n,1) = θ (n) , and terminated with θ (n+1) = θ (n,n z ) . Equation (10) is derived from the maximisation of the surrogate function for the log-likelihood function. The surrogate function used in this work is a Kullback-Leibler divergence solved by the nested-EM steps using Equation (10). Full details on the derivations of the surrogate functions are given in [24].

C. Update Motion Parameters m
Estimation of the subject motion m with given θ involves maximising the likelihood function L in (1) with respect to m. We derive the gradient ∇ m L and approximate the Hessian H m (L) from the second order Taylor expansion of L, and use a trust-region algorithm to update m. 1) Derivation of the Gradient ∇ m L: By applying the chain rule, the gradient can be derived for m being a collection of 3D rigid transformation parameters, as Rewrite (4) and (5) as where Denote m l = [m l,1 , . . . , m l,γ , . . . , m l,6 ], and from (6) and (7), we can derive where ·, · is the inner product in R 3 , and can be derived from the rigid transformation defined in (8).
2) Approximation of the Hessian H m (L): To use a second order optimisation algorithm, from the Taylor series of L(ḡ(m)) we have (17) where J m (ḡ) is the Jacobian ofḡ with respect to m, and the Hessian can be approximated by where and ∂ḡ i,l ∂m l,w can be calculated by (15) and (16).

3) Optimisation:
With the gradient and Hessian, m can be solved by an appropriate second order optimisation algorithm. In this work we use the trust-region algorithm [25] implemented in MATLAB © optimisation toolbox and {m l } are estimated in parallel.

D. Calculation of Outcome Measures
Using the kinetic parameters θ directly reconstructed from raw projection PET data with motion compensation, the outcome measure for quantifying the dynamic PET data can be derived. For reversible tracers, the outcome measures of interest which is the volume of distribution V T , can be derived using θ , by [26]

E. Summary
With the formulation of the likelihood function, the kinetics θ and motion m are estimated by alternation from the raw PET projection data by maximising the likelihood. A summary of the proposed approach is given in Algorithm 1.

III. EXPERIMENTS AND RESULTS
This section describes the experiments and results of evaluating the joint motion correction and parametric reconstruction performance of the proposed approach. Simulated 3D [ 11 C]raclopride data with various motion and noise levels were used to quantitatively assess the motion estimation performance and the resulting parametric reconstruction with motion-compensated attenuation and removal of event-byevent motion as well as continuous motion. Then the proposed approach was applied to real clinical [ 18 F]florbetapir data of a patient with Alzheimer's disease who made large head movements during the scan.

A. Validation With Simulated Data
In this work, simulated 3D dynamic PET scans were generated based on the Zubal brain phantom [27] and dopamine

Algorithm 1 Joint 4-D Parametric Reconstruction and Motion Correction
Input: PET projection data g, the μ map, and basis functions B. Output: Motion-corrected kinetic parameters θ and the motion estimation m. Initialisation m = 0, θ = 0.01; for I = 1, . . . , n I do update kinetics θ : for n = 1, . . . , n em do calculate f n,em by (9); θ (n,1) = θ (n) ; for z = 1, . . . , n z do update θ (n,z+1) by (10); end θ (n+1) = θ (n,n z ) ; end update motion m: calculate gradient ∇ m L by (12) (15) (16); calculate Hessian H m (L) by (18); optimise m using trust-region algorithm with ∇ m L and H m (L); end calculate the outcome measures using θ by (20). receptor imaging with [ 11 C]raclopride. The brain phantom was defined on a grid of 128 × 128 × 80 voxels (isotropic voxel size 2.2 mm), and various kinetics were defined on background, blood, grey matter, white matter, cerebellum, putamen and caudate nucleus as shown in Figure 3, together with the plasma input (blood). All the time activity curves were extracted from clinical data, and were used to derive the ground truth kinetic parameter V T by using a two-tissue compartment model. A CT image was synthesised for the phantom from its MR data using the work in [28] for calculating attenuation. The simulated dynamic [ 11 C]raclopride PET scans were 60 minutes long, and were firstly divided into 26 frames (6 × 30s, 7 × 60s, 5 × 120s, 8 × 300s) based on [ 11 C]raclopride kinetics. Some of the frames were then subdivided into 30-second intervals to introduce more motion events (shown in the Figure 5 as an example). Rigid head movements were introduced by transforming the activity images and the CT image at various time points. Based on the average displacement caused to the phantom, the movements were divided in two groups, and therefore the experiments were performed at motion levels of on average 6-voxel displacement and 3-voxel displacement.
The motion-transformed 3D activity images and CT image were then forward projected to generate the photon count data with a system resolution of 5 mm FWHM. 20% of the total counts in each frame were added as uniform background events (scatter and randoms, c.f. r in (4)). After scaling the projection data such that the total counts in the whole scan were 2 million, 20 million, and 200 million counts, Poisson noise was introduced to the projection data. The simulated data were thus generated at 3 count levels and 2 motion levels,  To evaluate the performance of the proposed joint direct parametric reconstruction and motion correction approach (referred to as the joint direct MC), a post-reconstruction joint motion correction and kinetic analysis method [16] was also applied to the simulated data for comparison (referred to as the indirect MC). For the post-reconstruction indirect MC method, the simulated PET projection data were firstly reconstructed by the MLEM method. If not stated otherwise, indirect MC refers to the case where the CT image was not transformed with the motion when performing attenuation correction. To investigate the effects of mismatched attenuation on estimating motion from the PET data by the indirect MC method, simulated [ 11 C]raclopride projection data were also reconstructed with matched attenuation correction by transforming the CT image with the simulated motion. For both methods, the motion estimation performance was quantified by the target registration error (TRE) calculated as the difference between the displacements derived from the ground truth motion parameters and the estimated motion parameters averaged over the image volume.

1) Abrupt Motion:
This section describes the main results from the simulation-based experiments in which the motion occurred abruptly and the duration of the motion is negligible compared to the whole length of a dynamic PET scan. TRE was calculated for both the indirect MC and the proposed joint direct MC methods to assess the motion correction performance at various motion levels and count levels. Figure 4 shows that the proposed joint direct parametric reconstruction and motion correction approach achieved subvoxel motion compensation residual in the experiments, and outperformed the indirect MC method at lower count levels. This indicates that the accuracy of estimating the subject motion is improved by directly using raw PET data in the joint motion correction and parametric reconstruction framework, which avoids the influence of low image quality and artefacts from possible mismatched attenuation correction on the postreconstruction indirect MC method. It is also shown that the effects on the estimated motion of attenuation mismatch depend on the count level and motion level of the data. Figure 5 shows an example of the simulated abrupt head motion and the motion compensation residual of the proposed joint direct MC method over n I = 50 iterations.
The advantage of the proposed joint direct parametric reconstruction and motion correction approach can also be seen in Figure 6, which shows example parametric maps of volume of distribution (V T ) at various count levels. As a result of robust motion estimation and compensation to sub-voxel size accuracy, the proposed joint direct MC method improves the reconstructed V T images with similar quality of the details and contrast to those from motion-free data.      [29]. The proposed joint direct MC method achieved lower bias than the indirect MC method.
2) Slow Motion: In this work the performance of the proposed joint direct MC method was also assessed on simulated dynamic [ 11 C]raclopride PET data with slow motion. The slow motion was estimated from the short frames obtained by dividing the affected PET data. This poses a more challenging problem for motion estimation due to very limited photon counts in the short frames. The data were simulated with realistic continuous motion occurring in a scan. An example of the motion is shown in Figure 8. Since the count rate is not constant during a dynamic PET scan, the simulated slow motion was introduced to the dynamic [ 11 C]raclopride data in both early and late scan times to assess the performance under different count rates. Figure 9 shows the rigid motion

B. Evaluation With Clinical Data
The proposed joint direct MC approach was further evaluated on clinical [ 18 F]florbetapir data of a patient with Alzheimer's disease (AD). The clinical study was approved by the Queen Square Research Ethics Committee. [ 18 F]florbetapir is a PET radiotracer that binds to amyloid-β, which is considered to be a major target in the AD brain. The PET data were acquired on a Siemens Biograph mMR PET/MR scanner in list mode over a period of 50 mins following the injection of 340 MBq [ 18 F]florbetapir. The T1-weighted and T2-weighted MR images of the subject acquired from the mMR scanner in the same imaging session was used to synthesise a CT image by using [28] to calculate the μ map. Normalisation was applied to the PET data using the manufacturer's software. Scatter and random events were estimated by using the manufacturer's software, and once motion was estimated, the scatter estimation was updated with the subject's μ map transformed by the estimated motion parameters. A population plasma input function was used to generate the basis functions in the absence of blood sampling. To detect motion and define the dynamic frames, firstly the list mode data were binned into 15-second sinograms and then converted into 2D projections using single-slice rebinning. The length of 15-second was chosen as a balance between count statistics and temporal resolution for detecting motion. Motion was then detected by visual inspection of the 2D projection data, and the list mode data were binned again where the 29 frames based only on the kinetics of [18F]florbetapir (4 × 15s, 8 × 30s, 9 × 60s, 2 × 180s, 6 × 300s) were further subdivided to  For estimating motion directly from the raw projection PET data, Figure 11 shows the translation and rotation parameters for the head movements estimated by applying the proposed joint direct MC approach to the [ 18 F]florbetapir data of the AD subject.
Quantification of the amyloid-β burden with [ 18 F]florbetapir was calculated by distribution volume ratio (DVR) [30], [31], which is derived by DVR = V T /V ref with V ref being the volume of distribution in a reference region free of amyloidbeta. Cerebellar grey matter was used as the reference region. The reconstructed DVR images, together with the subject's T1-weighted images, are shown in Figure 12.

IV. DISCUSSION
Direct parametric reconstruction of dynamic PET data outperforms the indirect methods, however the effects of subject motion need to be addressed which is inevitable during a typical scan duration. This work shows that the subject motion can be incorporated into the likelihood function to be estimated jointly with the kinetic parameters directly from the raw PET projection data using the MLEM framework. The subject motion affects the spatio-temporal distribution of the PET radiotracer, as well as the attenuation map. These factors can be both formulated in the forward model in the likelihood function, thus the motion can be estimated and compensated for the estimation of kinetics. Based on the image transformation model that describes the motion, first and second order derivatives of the likelihood function with respect to the motion parameters can be analytically derived and the optimisation can be performed by a suitable second order algorithm. For parametric estimation, in this work we used a linearised kinetic model (spectral analysis) with the flexibility to describe a wide range of kinetics which makes the proposed approach generic to be applicable to different clinical studies. Also the use of a linearised kinetic model allows the separation of the kinetic parameters to have a closed-form solution for updating all the parameters in parallel, and in this work we used a GPU-based implementation that greatly accelerated the computation.
The proposed joint direct MC approach was evaluated with simulated and real clinical PET data and compared to an indirect MC approach. The experiments conducted in this work aimed to assess the performance of these methods under realistic as well as extreme conditions with low PET counts and large subject movements. The proposed joint direct MC showed that, for estimating abrupt motion occurring at discrete time points, the results with sub-voxel size residual in simulation-based evaluation at low count level, for which the Fig. 12. AD patient data: the T1-weighted MR images to show the anatomical structures, the result distribution volume ratio (DVR) images by applying the proposed joint direct parametric reconstruction and motion estimation/correction approach, the DVR images by applying the post-reconstruction indirect joint motion correction and parameter analysis method, and the result DVR images directly reconstructed from the patient [ 18 F]florbetapir data with large head movements. The proposed approach effectively improves the reconstruction of the DVR images, by showing consistency in the anatomical structures between the MR and DVR images. In addition the cortical amyloid-beta deposition suggested by the [ 18 F]florbetapir DVR images reconstructed by the proposed approach as amyloid positive is in good agreement with the diagnosis of the AD patient. The part of the head that moved out of the scanner was removed from the displayed image because of the lack of photon data.
indirect MC approach failed for motion estimation due to low image quality. This suggests the direct estimation of motion from raw PET data overcomes the limitations of the indirect methods by allowing more accurate noise modelling, in the same way direct parametric reconstruction methods outperform the indirect ones. Given the advantage of estimating motion from low count data, the proposed approach has the potential to address continuous motion by dividing the PET data into very short frames with limited photon counts, and the performance was shown as evaluated in the experiments with simulated continuous motion.
The proposed approach achieved effective motion estimation and joint direct parametric reconstruction in the experiments performed in this work, however as a PET data-based method, it relies on the fact that the photon count data contains sufficient spatial information for reliable motion estimation and therefore it is limited for accurately estimating high frequency motion.
A possible solution could be adding a regularisation term to the likelihood function. Introducing prior knowledge to the ill posed inverse problem usually improves PET reconstruction. However this will also increase the computational burden and choosing the regularisation term and its weight in the penalised likelihood function is non-trivial. Another possible approach for a simultaneous PET/MR scanner is to use the MR data to facilitate motion estimation. Successful applications of using a specifically designed MR sequence for motion tracking have been developed [32], however, this is at the expense of not using the MR for clinical purpose. To extend the proposed approach, the incorporation of MR data acquired simultaneously with a typical clinical protocol will be explored in the future.
In the proposed method, subject motion was incorporated into the PET forward model, and the likelihood function was not concave. Due to the non-concavity in motion, local maxima exist and could be avoided by using global optimisation techniques. Such techniques will be investigated in the future. Also due to this non-concavity, the convergence cannot be established. The proposed algorithm is monotonic and in each iteration, both the kinetic parameter update and motion parameter update increase the likelihood function with proof given in [24] and [25] respectively. However, full convergence is not guaranteed.
In this work scatter was not handled as a function of the activity and attenuation map, which are both transformed by the subject motion. This is based on the considerations that scatter correction can be addressed by using the Gaussian fit method, which involves fitting a Gaussian profile to the scatter tails at the edge of each projection after random correction. The inclusion of a more sophisticated scatter model in the PET forward projection will be explored in the future.

V. CONCLUSIONS
In this work we propose a framework to jointly estimate subject head motion and reconstruct the motion-corrected parametric images directly from raw PET data, so that the effects of distorted tissue-to-voxel mapping due to subject motion can be reduced in reconstructing the parametric images with motion-compensated attenuation correction and spatially aligned temporal PET data. The proposed approach is formulated within the maximum likelihood framework, and efficient solutions are derived for estimating subject motion and kinetic parameters from raw PET photon count data. Results from evaluations on simulated [ 11 C]raclopride data using the Zubal brain phantom and real clinical [ 18 F]florbetapir data of a patient with Alzheimer's disease show the proposed joint direct parametric reconstruction motion correction approach can effectively improve the accuracy of quantifying dynamic PET data with large subject motion.