Abstract
Objectives: To determine the diagnostic efficiencies of multiple diffusion-weighted imaging (DWI) techniques for hepatic fibrosis (HF) staging under the premise of high inter-examiner reliability.
Methods: Participants with biopsy-confirmed HF were recruited and divided into the early HF (EHF) and advanced HF (AHF) groups; healthy volunteers (HVs) served as controls. Two examiners independently analyzed the intravoxel incoherent motion (IVIM)-DWI and diffusion kurtosis imaging (DKI) data. Intravoxel incoherent motion-DWI, DKI, and diffusion tensor imaging (DTI) parameters with intraclass correlation coefficients (ICCs) of ≥0.6 were used to create 2 regression models: HVs versus (vs.) EHF and EHF vs. AHF.
Results: We enrolled 48 HVs, 59 EHF patients, and 38 AHF patients. Mean, radial, and axial kurtosis; fractional anisotropy; mean, radial, and axial diffusivity; and α exhibited excellent reliability (ICCs: 0.80-0.98). Fractional anisotropy of kurtosis, f, and apparent diffusion coefficient (ADC) showed good reliability (ICCs: 0.69-0.92). The real (0.58-0.67), pseudo- (0.27-0.76), and distributed diffusion coefficients (0.58-0.67) showed low reliability. In the HVs vs. EHF model, α (p=0.008) and ADC (p=0.011) differed significantly (area under the curve [AUC]: 0.710). In the EHF vs. AHF model, α (p=0.04) and the distributed diffusion coefficient (p=0.02) differed significantly (AUC: 0.758).
Conclusion: Under the premise of high inter-examiner reliability, DWI and IVIM-derived stretched-exponential model parameters may help stage HF.
In patients with chronic hepatic disease, the development of advanced hepatic fibrosis (AHF) and hepatic cirrhosis has been correlated with significantly increased risks of hepatocellular carcinoma and death.1 Although untreated hepatic fibrosis (HF) can progress to cirrhosis, timely treatment with anti-fibrosis drugs may reverse HF.1 Therefore, a reliable method of evaluating HF is essential for monitoring the therapeutic response to anti-fibrosis drugs and for detecting HF progression early. Liver biopsy can be used to evaluate HF; however, this procedure is invasive, and the results rely on the accuracy of sampling and examiner skill. Thus, non-invasive methods are urgently required for HF staging.2
Diffusion-weighted imaging (DWI) has been applied to noninvasively quantify HF, but is insufficient for differentiating between HF stages.3-5 In contrast, intravoxel incoherent motion (IVIM) and diffusion kurtosis imaging (DKI) more accurately reflect the variation in non-Gaussian water diffusion on DWI. Intravoxel incoherent motion is calculated using the biexponential model and multiple b-values, and reflects both tissue perfusion and true water-molecule diffusion. Three quantitative parameters can be derived using IVIM: real diffusion coefficient (D), pseudo-diffusion coefficient (D*), and perfusion fraction (f).6 The stretched-exponential model of IVIM describes variations in the rates of intravoxel water diffusion, represented by the parameter α, and the distributed effect of water-molecule diffusion, indicated by the distributed diffusion coefficient (DDC).7 However, studies on HF staging using IVIM-derived parameters have yielded inconsistent results.8-19 Diffusion kurtosis imaging can characterize the restriction of water diffusion in the complex tissue microstructure and estimate the excess kurtosis of the probability distribution of diffusion displacement.20 Studies evaluating the value of DKI in HF staging have also reported inconsistent results.21-26 Furthermore, most of these studies investigated only mean diffusivity (MD) and mean kurtosis (MK); other DKI-derived parameters, including radial diffusivity (RD), axial diffusivity (AD), axial kurtosis (AK), radial kurtosis (RK), and fractional anisotropy of kurtosis (FAK), are scarcely mentioned in the literature on HF staging.27
We consider that the above inconsistent results of IVIM and DKI may be related to the low reliability of the parameters investigated; moreover, few studies have explored the combined application of all three techniques in HF staging. Therefore, this study aimed to determine the diagnostic efficiencies of parameters derived using DWI, IVIM, and DKI in HF staging based on the premise of high inter-examiner reliability.
Methods
The study protocol was approved by the Institutional Review Board and Ethics Committee of Guangdong Provincial Hospital of Chinese Medicine, Zhuhai, China and was carried out according to the principles of the Helsinki Declaration. All participants signed informed consent forms before being enrolled in the study. We consecutively recruited hepatitis B patients with biopsy-confirmed HF and healthy volunteers (HVs; all aged >18 years) from March 2021 to September 2022. Hepatic fibrosis patients were eligible for this study if they satisfied the following criteria: i) a history of hepatitis; ii) absence of severe ascites; and iii) acceptable image quality. Hepatic fibrosis patients who met either of the following criteria were excluded: i) hepatitis patients without HF on biopsy and ii) inability to undergo biopsy examination. Healthy volunteers were eligible for inclusion in this study if they lacked a history of hepatitis or HF caused by other pathological factors. They were excluded from the study if they satisfied any one of the following criteria: i) any other disease potentially affecting the study results; ii) image quality too poor for taking measurements; and iii) unacceptable image quality.
Hepatitis and HF were diagnosed using histological examination of needle or laparoscopic biopsy specimens and the Scheuer scoring system.28 The biopsy examination was considered the gold standard for HF staging. Hepatitis patients with HF stages 1 and 2 on histology were assigned to the early hepatic fibrosis (EHF) group, while those with HF stages 3 and 4 were allocated to the AHF group. In both groups, liver biopsy was carried out less than one month after the magnetic resonance imaging (MRI) examination.
Magnetic resonance imaging was carried out on a 3.0 T device (Signa Discovery 750w; GE Healthcare, Pittsburgh, MA, USA) with a 16-channel abdominal coil. Axial, 3-dimensional, in-phase and opposed-phase T1-weighted fast-spin-echo pulse sequences and respiratory-triggered, 2-dimensional, fat-suppressed, axial T2-weighted fast-spin-echo pulse sequences were obtained.5
Intravoxel incoherent motion-DWI in the axial plane was carried out with the patient in a supine position and breathing freely. The scanning parameters were as follows: average repetition time, 7000 ms; average echo time, 69 ms; slice thickness, 6 mm; interslice gap, 1 mm; matrix, 96 × 128; field of view, 340 mm × 272 mm; b-values, 0, 25, 50, 75, 100, 150, 200, 300, 400, 500, 600, and 800 s/mm²; number of excitations, 1 (b=25-200), 2 (b=0 and 300-500), and 3 (b=600 and 800); and total scan time, 25 minutes.
Diffusion kurtosis imaging in the axial plane was carried out with the patient in a supine position and breathing freely. The scanning parameters were as follows: average repetition time, 2900 ms; average echo time, 75 ms; slice thickness, 6 mm; interslice gap, 1 mm; field of view, 340 mm × 272 mm; matrix, 96 × 128; number of slices, 28; b-values, 0, 800, and 1600 s/mm² with 30 directions at each b-value; and total scan time, 10 minutes.
Image post-processing, region of interest (ROI) placement, quality assessment, and image analysis were carried out using an AW 4.6 workstation (GE Healthcare).
Quantitative IVIM-DWI parameters were calculated using the mono-, bi-, and stretched-exponential models. Apparent diffusion coefficient (ADC) was calculated using the monoexponential linear fitting technique and the following equation:
$$S(b)/S_0 = \exp(-b \cdot \mathrm{ADC})$$
where S(b) represents the mean signal intensity at a given b value, and S0 indicates the mean signal intensity at b=0 s/mm².
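As an illustration of monoexponential linear fitting, the sketch below assumes the standard form S(b) = S0·exp(−b·ADC) and estimates ADC as the negated least-squares slope of ln(S(b)/S0) against b. The function name `fit_adc`, the synthetic signal, and the ADC value used to generate it are hypothetical; the vendor software used in this study may fit the data differently.

```python
import math

def fit_adc(b_values, signals):
    """Estimate ADC (mm^2/s) from a log-linear fit constrained through the origin.

    Assumes S(b) = S0 * exp(-b * ADC), so ln(S(b)/S0) = -ADC * b.
    """
    s0 = signals[b_values.index(0)]          # signal at b = 0 s/mm^2
    ys = [math.log(s / s0) for s in signals]
    num = sum(b * y for b, y in zip(b_values, ys))
    den = sum(b * b for b in b_values)
    return -num / den                         # negated slope of the fitted line

# b-values (s/mm^2) from the IVIM-DWI protocol described above
b_values = [0, 25, 50, 75, 100, 150, 200, 300, 400, 500, 600, 800]

# Synthetic, noise-free signal generated with a hypothetical ADC
true_adc = 1.2e-3
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]

adc = fit_adc(b_values, signals)
```

With noisy in vivo data, the estimate depends on which b-values enter the fit, because low b-values also carry perfusion-related signal decay.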
The biexponential model of IVIM was represented by the following equation:
$$S(b)/S_0 = (1 - f) \cdot \exp(-b \cdot D) + f \cdot \exp(-b \cdot D^*)$$
D was calculated using b values of >200 s/mm², and D* was calculated using b values of <200 s/mm².
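The split at 200 s/mm² suggests a segmented fit. The sketch below is one common way to implement such a fit, assuming the standard biexponential form S(b)/S0 = (1 − f)·exp(−b·D) + f·exp(−b·D*): D and f come from a log-linear fit of the high-b signal, and D* comes from the residual perfusion signal at low b. The function names, the synthetic data, and the exact fitting steps are assumptions for illustration and are not taken from the workstation software.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_ivim_segmented(b_values, signals, b_split=200):
    """Two-step IVIM fit; returns (D, D*, f)."""
    s0 = signals[b_values.index(0)]
    norm = [s / s0 for s in signals]

    # Step 1: for b > b_split the perfusion term is assumed negligible,
    # so ln(S/S0) ~ ln(1 - f) - b * D.
    high = [(b, s) for b, s in zip(b_values, norm) if b > b_split]
    slope, intercept = linear_fit([b for b, _ in high],
                                  [math.log(s) for _, s in high])
    d = -slope
    f = 1.0 - math.exp(intercept)

    # Step 2: for 0 < b < b_split, subtract the diffusion term and fit
    # ln(residual / f) = -b * D* with a line through the origin.
    xs, ys = [], []
    for b, s in zip(b_values, norm):
        if 0 < b < b_split:
            residual = s - (1.0 - f) * math.exp(-b * d)
            if residual > 0:                  # skip points lost in the noise floor
                xs.append(b)
                ys.append(math.log(residual / f))
    d_star = -sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return d, d_star, f

# Synthetic, noise-free example with hypothetical parameter values
b_values = [0, 25, 50, 75, 100, 150, 200, 300, 400, 500, 600, 800]
f_true, d_true, d_star_true = 0.25, 1.0e-3, 50.0e-3
signals = [1000.0 * ((1 - f_true) * math.exp(-b * d_true)
                     + f_true * math.exp(-b * d_star_true)) for b in b_values]

d_fit, d_star_fit, f_fit = fit_ivim_segmented(b_values, signals)
```

In this scheme D* is estimated from a small residual signal at only a few low b-values, so it is sensitive to noise and to small differences in ROI placement.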
The stretched-exponential model was represented by the following equation:
$$S(b)/S_0 = \exp[-(b \cdot \mathrm{DDC})^{\alpha}]$$
The α value ranges from 0 to 1, and higher α values reflect decreased heterogeneity of intravoxel diffusion.
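Assuming the standard stretched-exponential form S(b)/S0 = exp[−(b·DDC)^α], taking logarithms twice gives ln(−ln(S/S0)) = α·ln(b) + α·ln(DDC), which is a straight line in ln(b). The sketch below uses this linearization to recover DDC and α; the function names and the synthetic values are hypothetical, and a nonlinear fit (as clinical software may use) could give different estimates on noisy data.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_stretched_exponential(b_values, signals):
    """Return (DDC, alpha) from the double-log linearization."""
    s0 = signals[b_values.index(0)]
    xs, ys = [], []
    for b, s in zip(b_values, signals):
        # b = 0 and non-decaying points cannot be log-transformed
        if b > 0 and 0 < s < s0:
            xs.append(math.log(b))
            ys.append(math.log(-math.log(s / s0)))
    alpha, intercept = linear_fit(xs, ys)     # slope is alpha
    ddc = math.exp(intercept / alpha)         # intercept is alpha * ln(DDC)
    return ddc, alpha

# Synthetic, noise-free example with hypothetical parameter values
b_values = [0, 25, 50, 75, 100, 150, 200, 300, 400, 500, 600, 800]
ddc_true, alpha_true = 1.5e-3, 0.7
signals = [1000.0 * math.exp(-((b * ddc_true) ** alpha_true)) for b in b_values]

ddc_fit, alpha_fit = fit_stretched_exponential(b_values, signals)
```

When α equals 1, the model reduces to the monoexponential case and DDC coincides with ADC.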
Diffusion kurtosis imaging parameters were derived using the following equation:27
$$S(b)/S_0 = \exp(-b \cdot D + b^2 \cdot D^2 \cdot K / 6)$$
In this equation, D denotes the diffusion coefficient corrected for non-Gaussian behavior, and K describes the degree to which molecular motion deviates from a perfect Gaussian distribution. When K is equal to 0, the above equation reduces to a conventional monoexponential equation.29,30 Multiple DKI-derived parametric maps were obtained for the diffusion tensor imaging (DTI) parameters fractional anisotropy (FA), AD, RD, and mean diffusivity (MD) as well as for the DKI parameters MK, AK, RK, and FAK.31
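With only 3 b-values (0, 800, and 1600 s/mm²), the kurtosis model along a single diffusion direction is exactly determined. The sketch below assumes the standard form ln(S(b)/S0) = −b·D + b²·D²·K/6 and solves the resulting 2 × 2 linear system for D and K. It covers one direction only; the tensor estimation across 30 directions that yields MK, AK, RK, and FAK is not shown, and the function name and test values are hypothetical.

```python
import math

def fit_dki_three_point(s0, s1, s2, b1=800.0, b2=1600.0):
    """Solve for (D, K) from signals at b = 0, b1, and b2 along one direction.

    Model: ln(S(b)/S0) = -b * D + c * b^2, with c = D^2 * K / 6.
    """
    y1 = math.log(s1 / s0)
    y2 = math.log(s2 / s0)
    det = b1 * b2 * (b1 - b2)                 # determinant of the 2 x 2 system
    d = (y1 * b2 ** 2 - y2 * b1 ** 2) / det   # apparent diffusion coefficient
    c = (b2 * y1 - b1 * y2) / det             # quadratic coefficient
    k = 6.0 * c / d ** 2                      # apparent kurtosis
    return d, k

# Synthetic, noise-free example with hypothetical parameter values
d_true, k_true = 1.1e-3, 0.8

def model(b):
    return 1000.0 * math.exp(-b * d_true + (b ** 2) * (d_true ** 2) * k_true / 6.0)

d_fit, k_fit = fit_dki_three_point(model(0.0), model(800.0), model(1600.0))
```

Because the system has no redundancy, measurement noise propagates directly into D and K, which is one reason additional b-values are often acquired for DKI.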
The DWI, IVIM-DWI, and DKI scans were analyzed using the post-processing software provided with the AW 4.6 workstation (GE Healthcare). Maps of multiple DKI-derived parameters were obtained. Image quality was assessed by an examiner with 17 years of experience in abdominal MRI-based diagnosis. The images that passed the quality assessment were post-processed and quantitatively analyzed by 2 trained examiners with 10 and 16 years of experience in abdominal MRI-based diagnosis. The examiners worked independently and were blinded to all other patient data. To obtain relatively objective data regarding the right hepatic lobe, we placed 3 discrete ROIs in each right-lobe segment (Couinaud segments V-VIII) while avoiding liver margins, blood vessels, and artifacts. Region of interest position and size (mean: 100 mm²; range: 80-120 mm²) were identical on different parametric maps. Apparent diffusion coefficient, D, D*, f, DDC, α, FA, MD, AD, RD, MK, AK, RK, and FAK values were averaged across 12 ROIs (3 ROIs × 4 hepatic segments), and the mean values were used for the analysis.
Statistical analysis
Statistical analyses were carried out using the Statistical Package for the Social Sciences, version 26.0 (IBM Corp., Armonk, NY, USA). Continuous variables were presented as mean ± standard deviation (SD). To evaluate the inter-examiner reliability of the DWI, IVIM-DWI, DKI, and DTI measurements, we calculated intraclass correlation coefficients (ICCs) by using a 2-way random model and absolute agreement. ICCs of 0.0-0.2 indicated poor, 0.21-0.4 indicated fair, 0.41-0.6 indicated moderate, 0.61-0.8 indicated good, and 0.81-1 indicated excellent inter-examiner reliability. Multivariate logistic regression models constructed using the enter method were used to analyze parameters that were significantly associated with diagnostic efficiency and had ICCs exceeding 0.60. Between-group differences were compared using the Mann-Whitney U test. Highly correlated independent variables were eliminated using the multicollinearity test. The selected parameters were entered as independent variables, and the different study groups were entered as dependent variables. Receiver operating characteristic (ROC) curves were plotted to determine the diagnostic efficacy of the models. Differences were considered to be statistically significant at p-values of <0.05.
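The reliability analysis can be sketched as follows. The code assumes the single-measure form of the 2-way random, absolute-agreement ICC, often written ICC(2,1); whether single or average measures were used is not stated here, so this choice is an assumption. The function names and the example measurements are hypothetical, and the bands follow the cut-offs given above.

```python
def icc_2_1(ratings):
    """ICC(2,1): 2-way random effects, absolute agreement, single measures.

    ratings: one row per participant, one column per examiner.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between examiners
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def reliability_band(icc):
    """Map an ICC to the qualitative bands used in this study."""
    if icc <= 0.2:
        return "poor"
    if icc <= 0.4:
        return "fair"
    if icc <= 0.6:
        return "moderate"
    if icc <= 0.8:
        return "good"
    return "excellent"

# Hypothetical parameter values measured by 2 examiners in 5 participants
measurements = [
    [1.10, 1.14],
    [0.95, 0.93],
    [1.32, 1.28],
    [1.05, 1.11],
    [1.21, 1.19],
]
icc = icc_2_1(measurements)
band = reliability_band(icc)
```

Absolute agreement penalizes a systematic offset between examiners, whereas a consistency-type ICC would not.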
Results
We initially recruited 62 EHF patients, 44 AHF patients, and 54 HVs, of whom 3 EHF patients, 6 AHF patients, and 6 HVs were excluded owing to inadequate image quality. Thus, the study ultimately included 38 AHF patients (17 women, 21 men), 59 EHF patients (26 women, 33 men), and 48 HVs (28 women, 20 men). The mean age was 43.7 years (range: 27-72 years) in the AHF group, 40.4 years (range: 21-62 years) in the EHF group, and 39.2 years (range: 21-69 years) in the HV group. No significant differences were found among the 3 groups in terms of age (AHF versus [vs.] EHF, p=0.11; AHF vs. HVs, p=0.13; EHF vs. HVs, p=0.10) or gender (AHF vs. EHF, p=0.07; AHF vs. HVs, p=0.10; EHF vs. HVs, p=0.09). Representative liver biopsy specimens are shown in Figure 1, and the demographic characteristics of the participants are presented in Table 1.
The inter-examiner reliability was excellent for the DKI parameters MK (ICC: 0.91-0.98), AK (ICC: 0.86-0.96), and RK (ICC: 0.88-0.95); for the DTI parameters FA (ICC: 0.83-0.91), MD (ICC: 0.80-0.96), AD (ICC: 0.80-0.97), and RD (ICC: 0.86-0.96); and for the IVIM-DWI parameter α (ICC: 0.81-0.84). The inter-examiner reliability was good for the DKI-derived FAK (ICC: 0.75-0.92), IVIM-DWI-derived f (ICC: 0.71-0.79), and DWI-derived ADC (ICC: 0.69-0.83). Moderate-to-poor inter-examiner reliability was observed for the IVIM-DWI parameters D (ICC: 0.58-0.67), D* (ICC: 0.27-0.76), and DDC (ICC: 0.58-0.67). These results are shown in Table 2.
Using the Mann-Whitney U test, we evaluated whether DWI, IVIM, DKI, and DTI parameters with ICCs exceeding 0.60 significantly differed between the different study groups (Table 3). After the test of parallel lines, 2 multivariate regression models were established for differential diagnosis: HVs vs. EHF and EHF vs. AHF.
Based on the screening results, ADC (p=0.01), α (p=0.01), and FAK (p=0.04) formed the HVs vs. EHF regression model, and DDC (p=0.02) and α (p=0.04) made up the EHF vs. AHF regression model.
In the HVs vs. EHF regression model, ADC (p=0.038) and α (p=0.015) showed significant differences, while FAK did not (p=0.07; AUC: 0.710). To avoid missing potentially significant parameters, we added the FAK parameter to this model and repeated the statistical analysis, but the AUC (0.704) did not improve.
In the EHF vs. AHF regression model, the DDC (p=0.001) and α (p=0.001) values exhibited significant differences. The AUC for this model was 0.758. The results of the 2 regression models are shown in Appendix 1 and their ROC curves are shown in Figure 2.
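For readers less familiar with the AUC, the sketch below computes it from predicted probabilities using the rank-based identity: the area under an empirical ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the scores are hypothetical and are not data from this study.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the empirical ROC curve via pairwise comparison."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0        # positive case ranked above negative case
            elif p == n:
                wins += 0.5        # ties count as half
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical model outputs (predicted probability of the more advanced stage)
advanced = [0.81, 0.66, 0.58, 0.47, 0.72]
early = [0.35, 0.52, 0.61, 0.28, 0.44]

auc = roc_auc(advanced, early)
```

An AUC of 0.5 corresponds to chance-level discrimination, and 1.0 corresponds to perfect separation of the 2 groups.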
Discussion
Our findings showed that D, D*, and DDC were associated with moderate-to-poor inter-examiner reliability (ICC of <0.60) in some groups. In contrast, the DKI- and DTI-derived parameters were reliable, but all of them were excluded because they showed no statistically significant differences. Finally, only ADC and α were entered into the HVs vs. EHF regression model, and DDC and α were entered into the EHF vs. AHF regression model. Both models presented moderate diagnostic efficiency (AUC: 0.710-0.758).
Several studies have hypothesized that ADC values would decrease with the progression of HF, possibly due to the increased connective tissue limiting the Brownian motion of water molecules.5,32,34,35 However, a high degree of overlap in ADC values was found between different HF stages.18 Intravoxel incoherent motion and DKI, which are both derived from DWI technology, have been applied for HF staging in many studies but the conclusions were inconsistent.18,19,36-38 Considering the above results, we hypothesized that more stable and prominent models could be established by combining DWI-, IVIM-, DKI-, and DTI-derived parameters.
We found that the inter-examiner reliability of D, DDC, and especially D* was low in all groups, which is partially consistent with previous studies.12,14,15,39 Although several studies have claimed that D, D*, and DDC have high diagnostic efficiency, we had to exclude these parameters to ensure the reliability and stability of the diagnostic models.14,15,40 Diffusion-weighted imaging, DKI, and DTI parameters showed good-to-excellent reliability, but most DKI and DTI parameters did not significantly differ between the study groups. Based on previous studies, we consider that there are 2 reasons for this finding. First, the sensitivity and specificity of DKI and DTI in the differentiation of HF stages are poor. Yoon et al18 and Yang et al24 concluded that the kurtosis model offered no additional value over the mono- and biexponential models. Second, we only applied 3 b values (0, 800, and 1600 s/mm²) for DKI, which may have decreased its sensitivity and specificity for HF diagnosis. The unsatisfactory inter-examiner reliability of D* may be related to the high sensitivity of DWI to the perfusion of body fluids; however, D* has shown significant diagnostic efficiency in some studies.9,10,14 Increasing the reliability of this parameter remains a persistent problem. Although we collected ROIs from multiple right-lobe segments, the reliability of D* remained unsatisfactory. Other novel data-acquisition methods should be attempted to potentially improve the reliability of D*.35
The AUCs of the ROC curves indicated acceptable diagnostic efficiency of the models in this study, which is consistent with previous results.12 However, the specificity and sensitivity of both the HVs vs. EHF and EHF vs. AHF models were poor, indicating that the models are not suitable for clinical application. Parameters derived via the stretched-exponential model (DDC and especially α) showed the highest diagnostic efficiency in this study, which suggests that stretched-exponential model parameters may be the most valuable factors for HF staging when good inter-examiner reliability is taken as the premise.15,38
Study limitations
First, the sample sizes were not equal across groups. Second, DKI was possibly carried out with too few b values. Third, histological data could not be obtained from the HVs, potentially decreasing the credibility of the data from these participants. Lastly, the AUCs of the ROC curves were not as good as those reported in another similar study.18 We believe that this may be related to the removal of some of the derived parameters.
In conclusion, on the premise of high inter-examiner reliability, parameters derived from DWI and the stretched-exponential model of IVIM may be more useful than DKI- and DTI-derived parameters to establish a model for HF staging.
Acknowledgment
The authors gratefully acknowledge Medjaden Inc. for their English language editing.
Footnotes
Disclosure. This study was funded by the Zhuhai Medical Research Project, Zhuhai, China, 2022 (No.: 2220009000170) and the Zhuhai Social Development Science and Technology Plan Project, Zhuhai, China, 2023 (No.: 2320004000261).
- Received January 16, 2024.
- Accepted August 10, 2024.
- Copyright: © Saudi Medical Journal
This is an Open Access journal and articles published are distributed under the terms of the Creative Commons Attribution-NonCommercial License (CC BY-NC). Readers may copy, distribute, and display the work for non-commercial purposes with the proper citation of the original work.