Abstract
Objectives: To assess the usefulness of sonographically measured anogenital distance (AGD) in predicting fetal gender in Saudi fetuses during the first trimester and to provide normal reference centiles for AGD.
Methods: A retrospective cohort study was conducted at King Faisal Specialist Hospital and Research Center, Riyadh, Saudi Arabia between November 2020 and May 2021. The ultrasound scans of 313 singleton pregnancies between 11 weeks and 13 weeks plus 6 days of gestation and the fetal gender at birth were collected. Anogenital distance was measured from the inferior base of the genital tubercle to the rump. Binomial logistic regression and receiver operating characteristic (ROC) curves were used to evaluate the predictive performance of AGD for determining fetal gender.
Results: There was a significant difference of approximately 15% in mean AGD between female (5.92 mm [95% CI= 5.70, 6.14]) and male (6.80 mm [95% CI= 6.61, 7.00]) fetuses (p<0.001). Anogenital distance significantly correlated with gestational age (r=0.573, p<0.001) and crown-rump length (r=0.562, p<0.001). The logistic regression determined AGD as a significant predictor of fetal gender (p<0.001). However, ROC analysis showed that overall accuracies were low at 68% (p=0.001) for 11 weeks, 70% (p<0.001) for 12 weeks, and 64% (p=0.017) for 13 weeks. The average AGD of our Saudi cohort was longer than that reported in the literature for other populations.
Conclusion: The first-trimester ultrasound evaluation of AGD was feasible and reliable. It showed a difference between the genders but did not yield high predictive accuracy. Future research should consider racial factors when evaluating AGD.
Fetal gender identification in early pregnancy is of great interest to expectant parents. Physicians also take an interest in gender identification for fetuses at higher risk of inheriting gender-dependent genetic disorders. Several X-linked recessive inherited diseases are male-specific, while congenital adrenal hyperplasia can affect female fetuses. Hence, early determination of fetal gender could be vital for early diagnosis and intervention.
Early gender identification during the first trimester can be achieved with sonography and genetic testing. Definitive results can be achieved with the latter by sampling the chorionic villus under ultrasonography guidance.1 However, it is an invasive procedure with an associated 1.46 relative risk of pregnancy loss before 28 weeks.2 Alternatively, analysis of cell-free fetal DNA is non-invasive but rather expensive and limited in availability.3 In contrast, ultrasonography can be used non-invasively to determine fetal gender with great accuracy in the second and third trimesters by visual identification of the penis or labia majora and minora.4 Ultrasonography can also be used to determine gender by measuring the genital tubercle angle, with 100% sensitivity at 13 weeks but lower sensitivity between 11 and 12 weeks.5,6
Anogenital distance (AGD) is a recently introduced sonographic marker of fetal gender.7-11 It is based on measuring the distance between the anus and the base of the genital tubercle in the perineal region.12 Anogenital distance is sexually dimorphic since its length is dependent on hormonal levels. Hence, AGD in male fetuses can be longer than that in female fetuses.12,13 In one study, prenatal AGD differences between male and female fetuses were found to be maintained across their life spans.12 Experimental preclinical studies suggested that androgen exposure affects AGD during masculinization.14 In humans, sonographically measured AGD successfully dichotomized male and female fetuses.7-11,15 In these studies, AGD measurement was a safe, non-invasive, and relatively cheap method of early gender identification. However, AGD measurement lacks the required validation in large samples from different populations. Recent studies reported significantly different AGD values between races, recommending that population-specific normative values are needed for accurate clinical assessment.9,10 These factors support the rationale for conducting this study.
The primary objective of this study was to evaluate the accuracy of AGD measurement via ultrasound for the early determination of fetal gender during the first trimester. Secondary objectives included evaluating the inter-rater agreement in AGD measurement and identifying population-specific normative AGD values for each gender.
Methods
This study was conducted as a retrospective cohort study at King Faisal Specialist Hospital and Research Center (KFSHRC), Riyadh, Saudi Arabia between November 2020 and May 2021. The KFSHRC Research Ethics Committee approved the study and granted a waiver of informed consent (approval memo EC ref: C380/308/42). The local records of obstetric ultrasound scans performed between January 2015 and January 2019 were screened consecutively to identify eligible cases based on the inclusion criteria: i) singleton pregnancies from 11 weeks to 13 weeks plus 6 days of gestation; ii) acquisition of the ultrasound image transabdominally in the mid-sagittal plane with the fetus lying in a natural position (neck and spine neither hyper-flexed nor hyper-extended); and iii) Saudi nationality. The exclusion criteria were as follows: i) cases where the rump and genital tubercle were not captured clearly; and ii) cases where information on the gender at birth was not available (for example, when delivery occurred at a different hospital).
All scans were performed by accredited sonographers using 2 calibrated ultrasound systems (Philips EpiQ 7 and Philips Affiniti 70). An accredited obstetric sonographer screened the database and consecutively recruited eligible cases until the required sample size was attained. The sonographer exported a single mid-sagittal plane image for each case. Next, the definitive gender of the fetus after birth was documented. Data on additional variables were collected, including gestational age, maternal age, use of assisted reproduction technology, history of diabetes or polycystic ovarian syndrome, and birth weight. Sonographic information included fetal heart rate and crown-rump length (CRL).
The images for all eligible cases were extracted in Digital Imaging and Communications in Medicine (DICOM) format. Two experienced sonographers (AMA and MJA) measured the AGD to assess its inter-reader reproducibility using the OSIRIX DICOM viewer. A proximal caliper was placed at the inferior base of the genital tubercle and a distal caliper was placed at the most prominent rump location similar to the distal CRL measurement.7,9 The sonographers were blinded to all clinical information except for the ultrasound images. To standardize measurement acquisition, the sonographers practiced on a random subset of cases, and then reviewed the technique for measuring AGD.
Based on previously published data,7 to detect a minimum AGD difference of 10% on independent sample t-test with alpha level of 0.05 and power (1‒β) of 0.95, the minimum sample size is 250, assuming an expected even allocation ratio between the male and female groups.
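The sample size reasoning can be illustrated with a minimal sketch based on the normal approximation for a two-sided comparison of 2 means with equal allocation. The function name and the standardized effect sizes passed to it are hypothetical; the source states only a 10% minimum difference and does not report the standard deviation used, so the sketch does not attempt to reproduce the figure of 250.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.95) -> int:
    """Approximate per-group sample size for a two-sided, two-sample
    comparison of means (normal approximation, equal allocation).

    effect_size is the standardized difference (difference / pooled SD).
    It is an assumed input here, not a value taken from the study.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical standardized effect sizes, for illustration only.
for d in (0.3, 0.5, 0.8):
    print(d, n_per_group(d), "per group")
```

The normal approximation slightly underestimates the sample size required by an exact t-test calculation, so dedicated power analysis software may return marginally larger values.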
Statistical analysis
Descriptive and inferential statistics were carried out using SPSS version 27 for Mac (IBM Corp., Armonk, N.Y., USA). To compare the AGD difference between male and female fetuses, an independent samples Student's t-test was conducted. A p-value of <0.05 was considered significant. To evaluate the performance of AGD in identifying fetal gender at specific gestational ages, the data were sub-grouped into the following: a) Group 1: women with gestational age of 11 weeks to 11 weeks, 6 days; b) Group 2: women with gestational age of 12 weeks to 12 weeks, 6 days; and c) Group 3: women with gestational age of 13 weeks to 13 weeks, 6 days.
Centiles for AGD reference ranges were calculated using a previously described method.16 Briefly, the required centiles for AGD per gestational week were calculated using centile= mean + K*SD, where mean refers to the mean AGD, SD is the standard deviation of AGD, and K is the corresponding quantile of the standard normal distribution. For example, determination of the 25th and 75th centiles requires that K= ±0.675.
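As an illustration of the centile formula, the following sketch derives K from the standard normal distribution and applies centile= mean + K*SD. The function name and the mean and SD values are hypothetical placeholders, not figures from this study.

```python
from statistics import NormalDist


def centile_value(mean: float, sd: float, centile: float) -> float:
    """Return the AGD value at a given centile, assuming AGD is
    normally distributed within a gestational week."""
    k = NormalDist().inv_cdf(centile / 100)  # K for the requested centile
    return mean + k * sd


# Hypothetical mean and SD (mm) for one gestational week, for illustration only.
example_mean, example_sd = 6.0, 1.0
for c in (2.5, 25, 50, 75, 97.5):
    print(f"{c}th centile: {centile_value(example_mean, example_sd, c):.2f} mm")
```

For the 25th and 75th centiles, the computed K is approximately ±0.674, which matches the rounded value quoted in the text.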
Binomial logistic regression was used to evaluate the predictive performance of AGD for gender determination. To specify valid and relevant independent variables, primary maternal and fetal characteristics were evaluated for normality, linearity in the logit (via the Box-Tidwell procedure), and multicollinearity.17 Relevant variables that satisfied all assumptions were entered into the model. Receiver operating characteristic curves were used to define optimum cutoffs for AGD in each group and to assess diagnostic performance. The AGD cutoff that yielded the highest Youden index was selected as the best AGD cutoff.
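The cutoff selection can be sketched as follows, assuming that a fetus is classified as male when AGD is at or above the cutoff and that the Youden index is defined as sensitivity + specificity − 1. The function name and the small data set are hypothetical and serve only to show the procedure; the study itself used SPSS.

```python
def best_youden_cutoff(agd_mm, is_male):
    """Scan every observed AGD value as a candidate cutoff and return the
    (cutoff, Youden index) pair with the highest Youden index.

    A measurement at or above the cutoff is classified as male.
    """
    n_male = sum(is_male)
    n_female = len(is_male) - n_male
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(agd_mm)):
        # Males correctly classified as male (AGD >= cutoff).
        true_pos = sum(1 for v, m in zip(agd_mm, is_male) if m and v >= cutoff)
        # Females correctly classified as female (AGD < cutoff).
        true_neg = sum(1 for v, m in zip(agd_mm, is_male) if not m and v < cutoff)
        j = true_pos / n_male + true_neg / n_female - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical measurements (mm) and genders at birth, for illustration only.
agd = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
male = [False, False, True, False, True, True, True]
cutoff, youden = best_youden_cutoff(agd, male)
print(cutoff, youden)
```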
The inter-reader reproducibility was analyzed using the Bland-Altman plot and intraclass correlation coefficients (ICC). The results were interpreted as follows: 0.00-0.20, ‘poor agreement’; 0.21-0.40, ‘fair agreement’; 0.41-0.60, ‘moderate agreement’; 0.61-0.80, ‘substantial agreement’; and >0.80, ‘almost perfect agreement’.18
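A minimal sketch of the Bland-Altman summary is given below, assuming paired AGD measurements from 2 readers on the same images. It computes the mean difference (bias) and the conventional 95% limits of agreement (bias ± 1.96 SD of the differences), which may not be the same interval as the one reported in this study. The function name and the readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (reader A minus reader B);
    the limits of agreement are bias +/- 1.96 * SD of those differences.
    """
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired AGD readings (mm), for illustration only.
reader_a = [6.0, 6.5, 7.0, 5.5]
reader_b = [6.1, 6.4, 7.2, 5.6]
bias, lower, upper = bland_altman(reader_a, reader_b)
print(f"bias={bias:.3f} mm, limits of agreement=({lower:.3f}, {upper:.3f})")
```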
Results
A total of 322 pregnancy cases were collected, of which 9 (2.7%) were later excluded due to poor image quality in the genital pedicle region. Hence, 313 cases were finally included in the analysis. The descriptive statistics for the main characteristics are presented in Table 1. Among the included cases, 23 (7.3%) pregnancies occurred following assisted reproduction technologies, 44 (14.1%) mothers had diabetes, and only 2 mothers had a history of polycystic ovarian syndrome. None of these 3 factors was significantly associated with AGD (p>0.1). No fetus had pathological development of the genitalia at birth. When AGD was analyzed, male fetuses had longer measurements than female fetuses by 14.8% (p<0.001). The mean AGD measurements per gestational week were also significantly different between the genders, with differences ranging from 10.7% in week 13 to 17.4% in week 11 (Table 2). The calculated 2.5th to 95th centile reference ranges for normal AGD per gender are listed in Appendix 1. An example of the AGD measurement is illustrated in Figure 1. A comparison of the current values with measurements from different populations is shown in Table 3. AGD significantly correlated with gestational age (r=0.573, p<0.001) and CRL (r=0.562, p<0.001). There were no significant correlations of AGD with maternal age (r=0.085, p=0.134), fetal heart rate (r= -0.031, p=0.584), or birth weight (r=0.012, p=0.836). The charts in Figures 2 and 3 show the changes in AGD relative to fetal age.
In the logistic regression, AGD was the only independent variable that satisfied the model’s assumptions for predicting fetal gender. The model was statistically significant: χ2 = 51.7, p<0.001. It explained 20% (Nagelkerke R2) of the variance in gender and correctly classified 65% of cases. Anogenital distance was a significant predictor of fetal gender (p<0.001), with larger measurements holding 2.27 times higher odds of being those for male fetuses. Reanalyzing the data for each gestational week independently yielded similar results.
When considering different cutoffs in the receiver operating characteristic analysis, the best overall cutoff value of 6.00 mm achieved the highest Youden index of 0.29, with sensitivity of 69%, specificity of 60%, and area under the curve of 0.686. This can be interpreted as poor discrimination according to Hosmer et al.19 The cutoffs and their predictive performances for each gestational week are listed in Appendix 2. The accuracy describes the overall proportion of correctly identified fetal genders. Sensitivity refers to the percentage of correctly identified male fetuses, whereas specificity is the percentage of correctly identified female fetuses. The positive predictive value is the chance of being male when the AGD is above or equal to the cutoff, and the negative predictive value is the chance of being female when the AGD is below the cutoff. The positive likelihood ratio indicates how much more likely an AGD at or above the cutoff is in a male fetus than in a female fetus, and the negative likelihood ratio indicates how much more likely an AGD below the cutoff is in a male fetus than in a female fetus.
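The relationships among these metrics can be sketched from a 2x2 table, taking male as the positive class and an AGD at or above the cutoff as a positive test. The function name and the counts are hypothetical; the counts were chosen only so that sensitivity and specificity equal the reported 69% and 60%, and they are not the study's actual cell counts.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute diagnostic metrics from a 2x2 table.

    tp: males with AGD >= cutoff      fn: males with AGD < cutoff
    tn: females with AGD < cutoff     fp: females with AGD >= cutoff
    Assumes specificity is neither 0 nor 1 (no division-by-zero handling).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # P(male | AGD >= cutoff)
        "npv": tn / (tn + fn),  # P(female | AGD < cutoff)
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "youden": sensitivity + specificity - 1,
    }


# Hypothetical counts with 100 fetuses per gender, for illustration only.
metrics = diagnostic_metrics(tp=69, fn=31, tn=60, fp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, the Youden index is 0.69 + 0.60 − 1 = 0.29. The predictive values depend on the proportion of male fetuses in the sample, whereas sensitivity, specificity, and the likelihood ratios do not.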
As for the inter-reader reproducibility of the AGD measurements, the mean difference was -0.14 mm [95% confidence interval (CI)= -0.32, 0.49]. The intraclass correlation coefficient demonstrated almost perfect agreement at 0.896 [95% CI= 0.845, 0.930].
Discussion
This study aimed to compare the AGD between male and female fetuses during the first trimester. The most important finding was that the mean AGD in male fetuses was statistically significantly longer than that in female fetuses by approximately 15%. However, the magnitude of the difference was not large enough to yield high accuracy for predicting the genders. The diagnostic metrics tended to yield higher confidence for identifying male fetuses when AGD was higher than the mentioned cutoffs (high sensitivity). However, when AGD was small, the post-test probability for identifying a female fetus was low (poor specificity).
Our results agree with those reported by Najdi et al8 for fetuses at approximately 11 weeks, where the sensitivity was 70% and the area under the curve was 0.748, corresponding to 72% and 0.738 in our study. However, the majority of previous studies reported significant predictive performances, in contrast with the current findings. Najdi et al8 reported >85% sensitivity, specificity, positive predictive value, and negative predictive value for 12- and 13-week-old fetuses. The first study that evaluated first-trimester fetuses was carried out by Arfi et al.7 They reported sensitivity (identification of male fetuses) of 87% and specificity (identification of female fetuses) of 89%. Later, Sipahi et al9 reported 76% accuracy in identifying male fetuses and 97% accuracy in identifying female fetuses between 11 and 13 weeks.
In older fetuses aged between 26 and 30 weeks, AGD was significantly different between the 2 genders (p<0.001), but the researchers, unfortunately, did not report the predictive performance.10 Likewise, Gilboa et al11 found significant AGD differences in 20- to 35-week-old fetuses, where the mean difference increased from 2.2 mm at 20 weeks to 11.8 mm at 35 weeks (r2=0.808). It should be noted that at such advanced gestational ages, AGD is measured differently. It is acquired in an axial plane from the center of the anus to the posterior convergence of the fourchette in female fetuses and to the posterior base of the scrotum in male fetuses. The comparability of this technique with the sagittal plane acquisition is unknown.
An important and notable observation from the existing literature on AGD is the variability in normal AGD values from different populations. In first-trimester pregnancies, female and male fetuses had mean AGDs of 3.6 mm and 5.1 mm respectively in a Turkish population, 4.0 mm and 4.7 mm in a Persian population, and 4.1 mm and 5.90 mm in a French study.7-9 Such values expressed 2.2-23% differences in each gender. The Arabic Saudi population had a larger absolute mean AGD compared with values found in previous studies, with differences ranging from 14-36% in male fetuses and from 36-48% in female fetuses. With a standardized ultrasound protocol and no demonstrable concerns over the methodologies used in the recent studies, it is evident that AGD is race-dependent. In neonatal studies, Bedouin neonates had larger AGD than Jewish neonates,20 and Caucasians had larger AGD than Native Americans and Asians.12 This indicates that no universal AGD cutoffs can be employed across different populations.
The current study provides normative values for AGD in the Arabic Saudi population. These results are valuable for assessing genital anomalies in the Arabic ethnic group that we evaluated. A recent study demonstrated that AGD is longer in mothers who had polycystic ovarian syndrome and in those who used assisted reproduction treatments.21 In contrast, a study on suspected isolated abnormal male genitalia showed that fetuses with hypospadias had AGD below the fifth percentile of normal cases.22 These reports suggest that AGD can be a useful biomarker for prenatal hormonal status and genital development.
Despite the low predictive performance of AGD, it demonstrated excellent reliability. The inter-reader reproducibility was relatively similar to that reported by Aydin et al10 in second-trimester fetuses. Arfi et al7 found a higher ICC of 0.97 in first-trimester fetuses. Other studies also confirmed the low variability and high reproducibility of AGD.11,15,21 Overall, sonographic AGD is a feasible objective measure, which can be easily acquired by sonographers.
An alternative technique for gender determination in the late first trimester is the sagittal sign, which is based on calculating the angle of the genital tubercle relative to the lumbosacral skin surface. The fetus would be identified as male if the angle is >30 degrees or female if it is <10 degrees.5,23 However, a systematic review of this technique demonstrated a high failure rate, ranging from 7.5-40.6%.1 Its overall accuracy was between 70% and 100%, and the accuracy was particularly poor in fetuses aged 11 and 12 weeks. Indeed, this technique requires post-processing and presents difficulty for sonographers regarding the accuracy of angle placement. Another study employed 3D ultrasound and found poor gender prediction accuracy at 56%.24
Study limitations
Despite using an adequate sample size, the retrospective design of the study prevented us from controlling the mid-sagittal plane acquisition of AGD to ensure accurate detection of the genital pedicles. However, this resulted in a failure rate of only 2.7%. The retrospective design also meant that the images were not acquired by a single sonographer. The study also did not include fetuses that were not positioned optimally, a situation that can be frequently encountered in routine scans. Additional contributors to fetal growth, such as maternal anxiety, socioeconomic status, and education, were not controlled due to the unavailability of retrospective data for these variables.25 Future studies are encouraged to further compare the AGD results in Arabic Saudi fetuses with those of fetuses from other ethnic groups. A longitudinal examination of AGD in the Arabic population from prenatal life to adulthood is needed in order to understand the potential associations of AGD with developmental pathologies.
In conclusion, AGD was significantly different between male and female fetuses in the first trimester. Contrary to the existing literature, the magnitude of the difference was not substantial enough to yield high predictive accuracy. Anogenital distance increased gradually with gestational age and demonstrated reliable readings. It appears that AGD varies among races, and it appears to be longer in Arabic fetuses than the reported measurements from other populations. The study reported the normative centile reference ranges for AGD.
Further longitudinal research is warranted to investigate the usefulness of AGD as an imaging biomarker for congenital genital anomalies and fetal androgen levels.
Acknowledgment
The authors gratefully acknowledge Vicas Narange from Editage (www.editage.com) for the English language editing.
Footnotes
Disclosure. Authors have no conflict of interests, and the work was not supported or funded by any drug company.
- Received June 13, 2021.
- Accepted September 1, 2021.
- Copyright: © Saudi Medical Journal
This is an Open Access journal and articles published are distributed under the terms of the Creative Commons Attribution-NonCommercial License (CC BY-NC). Readers may copy, distribute, and display the work for non-commercial purposes with the proper citation of the original work.