Abstract
Objectives: To translate the Pregnancy Physical Activity Questionnaire (PPAQ) into Arabic, cross-culturally adapt it, and test its reliability and validity among pregnant Saudi women.
Methods: The PPAQ, which consists of 36 items, was translated into Arabic following the World Health Organization’s guidelines for tool translation (forward translation, expert panel and back translation, pretesting and cognitive interviewing, and final version), followed by validation by experts. In this cross-sectional study, data were collected from 118 healthy pregnant Saudi women from May to June 2019. Validity assessment included content validity indices (CVI) and construct validity by Rasch analysis. Reliability was assessed by test-retest reliability and Cronbach’s alpha coefficient.
Results: The mean age of the participants was 30.15 ± 5.59 years; 38.2% of them had normal pre-gestational body mass index (n=45). The median of total energy expenditure in physical activity was 356.1 METs.h/week (IQR=162.3-648.3). Item content validity index was good ranging between 0.8-1. Rasch analysis showed good construct validity and excellent reliability for all types of physical activity (>0.89).
Conclusion: This Arabic PPAQ is a reliable and valid tool that can be used in Arab countries.
Physical activity, defined as any bodily movement produced by skeletal muscles and resulting in energy expenditure,1 is one of the most important factors in preventing and controlling non-communicable diseases such as diabetes, hypertension and coronary heart disease.2 It is important for the health of the human body in all population subgroups, including adults, children, and the elderly, as well as pregnant women.
Physical activity during pregnancy has been shown to be not only safe,3 but also beneficial for the health of mother and baby. For instance, it has been associated with better fetal birth weight,3 and with decreased risk of gestational diabetes4-6 and preeclampsia.7,8 It is recommended that pregnant women engage in at least 30 minutes of moderate-intensity physical activity a day on most days of the week in the absence of any medical or obstetric contraindication to physical activity.9,10
Despite the benefits of physical activity during pregnancy, the level of physical activity among pregnant women is generally low worldwide. A review by Povidine et al concluded that 60% of pregnant women were physically inactive.11 Another review by Gatson et al, which included 25 peer-reviewed papers, showed similar results, with fewer than 30% of pregnant women being sufficiently active.12
The recent Saudi Arabia Ministry of Health initiatives13 acknowledged these challenges and stressed the importance of a healthy lifestyle, including physical activity (PA). A report published in 2018 recommended that more intensive efforts toward promoting PA and reducing sedentary behaviors among the Saudi population are needed.14 Moreover, recent recommendations15 have highlighted the issues of healthy pregnancies, including PA for pregnant women to help them adapt to physical and mental changes, prepare for delivery, and prevent back pain and constipation.16
As direct measurement of physical activity is difficult and challenging, it is crucial to develop and improve tools that are easy to use yet accurate in assessing physical activity for pregnant women. Tools should be able to assess different types and intensities of physical activity. Many tools have been developed to assess physical activity among adults,15 but the Pregnancy Physical Activity Questionnaire (PPAQ) was developed specifically for pregnant women and includes various activities that can be performed by pregnant women.17
The Pregnancy Physical Activity Questionnaire has been translated into different languages, including Japanese,18 Chinese,19 French,20 Polish,21 and Turkish,22 and has been widely used in epidemiologic studies evaluating physical activity in pregnant women. A systematic review conducted in 2018 to assess the PPAQ showed sufficient reliability for total and vigorous physical activity.23
The aim of this study was to translate the English PPAQ into Arabic, culturally adapt it, and assess its validity and reliability among pregnant Saudi women.
Methods
This was a cross-sectional study that included 118 pregnant Saudi women who were followed up at King Khalid University Hospital (KKUH), Riyadh, Saudi Arabia between May 13, 2019 and June 10, 2019. The exclusion criteria were multiple pregnancies, medical or obstetric conditions that may have prevented PA such as vaginal bleeding or paralysis, and receipt of advice for bedrest for any reason. This study was approved by the Institutional Review Board at King Saud University, Riyadh, Saudi Arabia (Research Project No.: E-15-1555).
All participants provided informed consent before their enrolment in the study.
Data collection tool
The PPAQ is a self-administered questionnaire that was originally developed in English by Chasan-Taber et al.17 The questionnaire contains 36 questions. The first 3 questions ask about the date of completion of the questionnaire, date of last menstrual period, and expected date of delivery, followed by 33 questions covering different types of PAs performed during household or caregiving activities, occupational activities, sports or exercise, and transportation, as well as including inactivity assessment. Two additional questions are open for participants to add other types of PA that are not included in the questionnaire. The participants were asked to select the approximate number of hours spent per day/week (depending on the question) in each activity during the last 3 months. Durations ranged between 0 and ≥6 hours per day and 0 and ≥3 hours per week.17 For each question, the number of hours was multiplied by the reported intensity of that activity to obtain the average weekly energy expenditure in metabolic equivalent hours per week units (MET.h/wk). The activities included in the questionnaire were further classified according to their intensity in metabolic equivalents of task (METs) as sedentary (<1.5 METs), light (1.5-3.0 METs), moderate (3.0-6.0 METs), or vigorous (>6.0 METs), based on calculations as reported by Chasan-Taber et al.17
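As a rough illustration of the scoring described above, the following Python sketch computes MET.h/wk per item and sums it by intensity band. The MET values and hours are hypothetical and are not the published PPAQ intensities. Multiplying per-day items by 7 and assigning exactly 3.0 METs to the moderate band are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical items: (MET value, hours reported, reporting unit).
# These numbers are illustrative, not the published PPAQ weights.
ITEMS = [
    (1.0, 2.0, "day"),   # a sedentary activity, 2 h/day
    (2.5, 1.0, "day"),   # a light household task, 1 h/day
    (4.0, 1.5, "week"),  # a moderate exercise item, 1.5 h/week
]


def classify_intensity(met):
    """Map a MET value to the intensity bands quoted in the text.

    Assumption: a value of exactly 3.0 is treated as moderate.
    """
    if met < 1.5:
        return "sedentary"
    if met < 3.0:
        return "light"
    if met <= 6.0:
        return "moderate"
    return "vigorous"


def weekly_met_hours(met, hours, unit):
    """Energy expenditure of one item in MET.h/wk.

    Assumption: per-day durations are scaled to a week by multiplying by 7.
    """
    days = 7 if unit == "day" else 1
    return met * hours * days


def summarize(items):
    """Sum MET.h/wk by intensity band and overall."""
    totals = defaultdict(float)
    for met, hours, unit in items:
        totals[classify_intensity(met)] += weekly_met_hours(met, hours, unit)
    totals["total"] = sum(totals.values())
    return dict(totals)


if __name__ == "__main__":
    for band, value in summarize(ITEMS).items():
        print(f"{band}: {value:.1f} MET.h/wk")
```

The same grouping could be keyed by activity type (household, occupational, sports, transportation) instead of intensity by replacing `classify_intensity` with a lookup on the item's domain.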
Translation
Questionnaire translation followed the World Health Organization’s guidelines for tool translation, which consisted of 4 steps: forward translation, expert panel and back translation, pretesting and cognitive interviewing, and the final version.24
1) Forward translation. The questionnaire was translated by a health professional whose mother tongue was Arabic and who was fluent in English. Translation was conceptual rather than literal, aiming to use words common among the target population and to avoid professional jargon, long sentences, and any words that might be considered offensive by participants.
2) Expert panel and back-translation. After forward translation, a panel of 5 experts (2 family physicians, a preventive medicine specialist, an obstetrician, and a medical researcher with experience in questionnaire translation) reviewed the Arabic version of the questionnaire and compared it to the original English version. All 5 panel members were fluent in both English and Arabic. The panel suggested changing the terminology “عندما لا تكونین في العمل” (“when you are not at work”) in questions 11 and 13 to “عندما تكونین في المنزل” (“when you are at home”), as negative sentences might have been confusing for participants, and changing the terminology “during this trimester” to “during the last 3 months”, as the word “trimester” is not commonly used in the Arabic language.
Later, back-translation of the translated Arabic questionnaire to English was performed by a certified English linguist who was fluent in Arabic and had good knowledge about local spoken terminology and expressions. The resulting version of the questionnaire was compared with the original version for consistency. Both versions of the questionnaire were found to be consistent.
3) Pre-testing and cognitive interviewing. The translated questionnaire was piloted in 30 participants who were pregnant Saudi women being followed up at KKUH. Face-to-face interviews were conducted with participants to ask about the clarity of the questions and to ensure that participants understood them correctly. These participants were not included in further analyses for the validity and reliability of the questionnaire.
4) Final version. The Arabic version of the PPAQ was subjected to validity and reliability testing.
Validity
Face and content validity were assessed by consulting a panel of 10 experts that included 3 obstetricians, 3 family physicians who run antenatal clinics, 2 general physicians, one preventive medicine specialist and one epidemiologist. The panel did not include any of the authors nor the experts who translated the questionnaire.
Experts reviewed the contents and wording of the questions. They also checked the appearance and format of the questionnaire to ensure it was clear and appropriate for the participants. They were asked to evaluate each question on a 4-point Likert scale as follows (1-not relevant, 2-item needs some revision, 3-relevant but needs minor revision, 4-very relevant). Later, points 1 and 2 were combined and labeled “not relevant” and points 3 and 4 were also combined and labeled “relevant”.
Item content validity index (I-CVI) was calculated for each question. It ranges between 0 and 1 and was calculated as the proportion of experts giving the question a rating of 3 or 4 (agreeing on that question).
Scale content validity index (S-CVI) was calculated by 2 methods. The first was the scale content validity index based on the average (S-CVI/Ave), which represented the sum of I-CVI values divided by the number of items. The second was the scale content validity index based on universal agreement (S-CVI/UA), which was calculated as the sum of the universal agreement scores of items divided by the number of items. The universal agreement (UA) score for an item was one if it was rated as relevant (points 3 or 4) by all experts; otherwise, the UA score was zero.25
Items with an I-CVI greater than 0.78 were considered acceptable, and those with a lower I-CVI were to be eliminated or revised. The S-CVI was considered excellent if it was 0.9 or more.26
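The three content validity indices defined above can be sketched in a few lines of Python. The ratings matrix below is hypothetical (4 items rated by 5 experts) and is not the study data; the function name `content_validity` is likewise only illustrative.

```python
# Hypothetical ratings: RATINGS[i][j] is the 1-4 relevance rating
# given to item i by expert j. Not the study data.
RATINGS = [
    [4, 4, 3, 4, 4],
    [4, 3, 3, 4, 2],
    [3, 4, 4, 4, 4],
    [4, 4, 4, 3, 3],
]


def content_validity(ratings):
    """Return (I-CVI per item, S-CVI/Ave, S-CVI/UA).

    A rating of 3 or 4 counts as "relevant", as described in the text.
    """
    # I-CVI: proportion of experts rating the item 3 or 4.
    i_cvi = [sum(r >= 3 for r in item) / len(item) for item in ratings]
    # S-CVI/Ave: mean of the item-level indices.
    s_cvi_ave = sum(i_cvi) / len(i_cvi)
    # S-CVI/UA: share of items rated relevant by every expert.
    ua_scores = [1 if all(r >= 3 for r in item) else 0 for item in ratings]
    s_cvi_ua = sum(ua_scores) / len(ua_scores)
    return i_cvi, s_cvi_ave, s_cvi_ua


if __name__ == "__main__":
    i_cvi, ave, ua = content_validity(RATINGS)
    print("I-CVI:", i_cvi)
    print("S-CVI/Ave:", round(ave, 2))
    print("S-CVI/UA:", round(ua, 2))
```

Because a single dissenting expert sets an item's UA score to zero, S-CVI/UA falls quickly as the panel grows, which is one reason it can be much lower than S-CVI/Ave for the same ratings.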
Construct validity
The construct validity of this questionnaire was tested by 3 main features of Rasch analysis: item fit to the Rasch model, unidimensionality of data, and item difficulty hierarchy. Rasch analysis was performed using Andrich’s Rating Scale Model (RSM), as the items of this questionnaire had multiple response categories and fixed intervals between categories.27 Items in each domain of the questionnaire were analyzed separately.
Item fit to the Rasch model was tested by infit and outfit mean squares. A fit mean square compares the observed variation in responses with the variation expected under the model and should ideally be 1.0. As the Rasch model is probabilistic, some variation in results is expected. The adequate range for fit mean squares is 0.5-1.5; however, values between 1.5 and 2 can be acceptable.28,29
Unidimensionality means that items within one domain measure the same construct. It was tested by evaluating eigenvalues from principal component analysis (PCA) of item residuals. Eigenvalues show how much information is explained by residuals from the Rasch model. The smaller the eigenvalues for residuals, the stronger the unidimensionality of items in that domain. For each domain, unidimensionality was accepted if the ratio of the first to the second eigenvalue was less than 3.30
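To make the fit statistics more concrete, the sketch below computes infit and outfit mean squares for a single item using the standard Rasch definitions: outfit is the mean of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of model variances. The data are hypothetical and use a dichotomous item for simplicity, whereas the study fitted a rating scale model in jMetrik, so this is an illustration of the idea rather than a reproduction of that analysis. The function names and the "outside the stated ranges" label are assumptions of this example.

```python
# Hypothetical data for one dichotomous item answered by 4 persons.
OBS = [1, 0, 1, 1]                    # observed responses
EXP = [0.8, 0.3, 0.5, 0.9]            # model-expected responses
VAR = [p * (1 - p) for p in EXP]      # model variance of each response


def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item."""
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals,
    # so unexpected responses far from the item's level weigh heavily.
    outfit = sum(r / w for r, w in zip(sq_resid, variance)) / len(observed)
    # Infit: information-weighted version of the same statistic.
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit


def classify_fit(mean_square):
    """Apply the ranges quoted in the text (0.5-1.5 and 1.5-2)."""
    if 0.5 <= mean_square <= 1.5:
        return "adequate"
    if 1.5 < mean_square <= 2.0:
        return "acceptable"
    return "outside the stated ranges"


if __name__ == "__main__":
    infit, outfit = fit_mean_squares(OBS, EXP, VAR)
    print(f"infit={infit:.3f} ({classify_fit(infit)})")
    print(f"outfit={outfit:.3f} ({classify_fit(outfit)})")
```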
Item difficulty hierarchy
The difficulty level of each item was calculated and expressed in logits, the natural logarithm of the odds of a person being able to perform a certain task. The greater the item logit, the more difficult the task that the item asks about.31 If the items of a tool have a theoretical hierarchy, difficulty measurements from Rasch analysis can be compared with that hierarchy to test the internal construct validity of the tool.31 Items were listed in ascending order of activity MET value, on the assumption that activities with lower MET values are easier than those with higher MET values. This order was then compared with the difficulty levels from Rasch analysis to test construct validity. Internal consistency was assessed by measuring the item separation index and item reliability. The item separation index was calculated for each domain and showed the ability of items in that domain to distinguish groups of people according to differences in their ability levels. An item separation index of 1.5 was considered acceptable, 2 good, and 3 excellent.31,32 A reliability measure was also calculated for each domain separately and was interpreted in the same way as Cronbach’s alpha.31
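The logit scale and the link between separation and reliability can be illustrated with a short Python sketch. The relation R = G²/(1 + G²) used here is the standard Rasch relation between a separation index G and the corresponding reliability R; the text does not state how jMetrik derives its values, so treating the reported figures this way is an assumption. The function names are illustrative.

```python
import math


def logit(p):
    """Natural log of the odds for a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))


def reliability_from_separation(g):
    """Standard Rasch relation between separation G and reliability R."""
    return g * g / (1 + g * g)


if __name__ == "__main__":
    # A 50% chance of success corresponds to 0 logits.
    print("logit(0.5) =", logit(0.5))
    # Separation thresholds quoted in the text: 1.5, 2, and 3.
    for g in (1.5, 2.0, 3.0):
        print(f"G={g}: R={reliability_from_separation(g):.2f}")
```

Under this relation, separation indices of 2 and 3 correspond to reliabilities of 0.80 and 0.90. The separation indices reported in the Results (5.45 and 2.81) map to about 0.97 and 0.89, which matches the item reliabilities reported there.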
Reliability
In addition to Rasch model, internal consistency was assessed by calculating Cronbach’s alpha coefficient for each subscale. A score of less than 0.50 indicated poor internal consistency, scores ranging from 0.51 to 0.69 were considered suspicious, scores ranging from 0.70 to 0.80 were considered acceptable, scores ranging from 0.81 to 0.90 were considered good, and scores greater than 0.90 indicated excellent internal consistency.33
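For readers who want to reproduce this kind of calculation, the following is a minimal Python sketch of Cronbach's alpha for one subscale, together with the interpretation bands listed above. The score matrix is hypothetical, and the handling of values that fall exactly on a band boundary (for example 0.50 or 0.805) is an assumption, since the text leaves small gaps between bands.

```python
from statistics import variance

# Hypothetical subscale: SCORES[p][i] is participant p's score on item i.
SCORES = [
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 4],
    [5, 5, 6],
]


def cronbach_alpha(scores):
    """Cronbach's alpha from a participants-by-items score matrix."""
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def interpret_alpha(alpha):
    """Bands from the text; boundary handling is an assumption."""
    if alpha > 0.90:
        return "excellent"
    if alpha > 0.80:
        return "good"
    if alpha >= 0.70:
        return "acceptable"
    if alpha > 0.50:
        return "suspicious"
    return "poor"


if __name__ == "__main__":
    a = cronbach_alpha(SCORES)
    print(f"alpha={a:.3f} ({interpret_alpha(a)})")
```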
Test-retest reliability was checked by calculating the intraclass correlation coefficient (ICC) for each subscale. For this purpose, 59 participants were asked to complete the questionnaire again 1 or 2 weeks after completing the first one. Their responses on both occasions were then checked for consistency. The ICC was calculated for log-transformed data, as PA scores were not normally distributed. Values less than 0.5 indicated poor reliability, values between 0.5 and 0.75 indicated moderate reliability, values between 0.75 and 0.9 indicated good reliability, and values greater than 0.90 indicated excellent reliability.34 Rasch analysis was performed using jMetrik software (version 4.1.1);35 all other analyses were performed using SPSS version 25.36
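The test-retest calculation can be sketched as follows. The text does not specify which ICC form was used or how zero scores were handled before the log transformation, so this example assumes a two-way random-effects, absolute-agreement, single-measure ICC (often written ICC(2,1)) and a log(x + 1) transform. The test-retest pairs are hypothetical.

```python
import math

# Hypothetical test-retest pairs in MET.h/wk (first and second completion).
PAIRS = [(100, 120), (50, 40), (300, 280), (0, 10), (200, 260)]


def log_transform(pairs):
    """Assumption: log(x + 1) is used so that zero scores stay defined."""
    return [[math.log1p(x) for x in pair] for pair in pairs]


def icc_2_1(data):
    """Two-way random-effects, absolute-agreement, single-measure ICC.

    data[s][j] is the score of subject s on occasion j.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-occasion mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(icc):
    """Bands from the text; boundary handling is an assumption."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


if __name__ == "__main__":
    value = icc_2_1(log_transform(PAIRS))
    print(f"ICC={value:.2f} ({interpret_icc(value)})")
```

A consistency-type or one-way ICC would give somewhat different values for the same data, so the form chosen should be reported alongside the coefficient.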
Results
A total of 118 pregnant women were enrolled in this study. The mean age was 30.15 years (SD=5.59). Most of the participants (n=64) were in their second trimester, followed by the third trimester (n=47). About two-thirds (n=74) did not have previous abortions. Most of the participants (n=69) were housewives, and more than one-third had normal pre-gestational body mass index (n=45) (Table 1).
The median of total energy expenditure in PA was 356.1 METs.h/week (interquartile range [IQR]=162.3-648.3). Most of this was expended in sedentary activities with a median (IQR) of 135.4 (36.5-355.1) METs.h/week. Regarding the types of PA, most of the participants’ energy expenditure was in household/caregiving activities (median=123.5), while the least was in occupational activities (median=0.0) (Table 2).
Instrument content validity
Item content validity index ranged between 0.8 and 1, which indicates excellent content validity. The S-CVI was excellent by the average method (0.94) but not adequate by the universal agreement method (0.72). The average proportion of items judged as relevant across the 10 experts was 0.97 (Table 3).
Construct validity. Item fit to Rasch model
The infit mean square for all items was within the range of 0.5-1.5 except for item number 29, for which it was 1.65. Item number 29 did not show perfect fit, but it did not distort the integrity of the model. Outfit mean squares, which are sensitive to outliers, were excellent for all items except for items number 11 and 26, which were acceptable, and items number 27 and 28, which were questionable (Table 4).
Unidimensionality
The ratio of the first to the second eigenvalue was less than 3 for all domains, indicating that little information was explained by residuals and thus supporting the unidimensionality of items (Table 5).
Item difficulty hierarchy
Item difficulty ranged between -1.71 and 1.38, indicating that the tool covered tasks with a variety of difficulty levels, although no items were extremely difficult or extremely easy.
When item difficulty measures were compared with activity MET values, the order of item difficulty was consistent with MET values (theoretical hierarchy) for transportation and occupational activities, indicating high construct validity for these 2 domains. For household activities, the order of difficulty in comparison to MET values was slightly distorted, yet easy items were listed before difficult items. For leisure/sports activities, the order of MET values and difficulty was consistent except for items 24 and 25, which have high MET values but low difficulty levels (Table 4).
Internal consistency
The highest item separation index was for household activities (5.45) and the lowest was for leisure activities (2.81) indicating that at least 3 groups of participants can be distinguished by items in different domains of the questionnaire. In addition, item reliability was 0.97 for household and work activities while it was 0.89 for transportation and leisure activities representing excellent internal consistency for all domains (Table 5).
In addition, Cronbach’s alpha coefficient was applied to each subscale. Cronbach’s alpha coefficient was good for occupational and leisure/sports activities (0.83), while it was lower for household and transportation activities (0.68 and 0.56, respectively), indicating suspicious internal consistency (Table 6).
Based on the ICC results, the test-retest reliability was good for total PA (0.78). Regarding the type of activity, the reliability was moderate to good (0.59-0.85), with the highest value recorded for household activity. The exception was occupational activity, for which the ICC was 0.15; however, this value was not statistically significant. Moreover, the reliability was moderate to good for activity intensity, ranging from 0.61 to 0.80. The highest reliability was for moderate-intensity activity and the lowest was for vigorous activity, although the latter was not statistically significant (Table 6).
Discussion
Regular PA during pregnancy can help psychological well-being, physical fitness, and weight management.10 In addition, PA can reduce the risk of gestational diabetes in obese women.10 Hence, it is essential to establish an instrument for assessing pregnant women’s PA to help prevent complications and maintain a healthy pregnancy.
This study involved the translation and cross-cultural adaptation process of the PPAQ from English to Arabic for women living in Saudi Arabia. The availability of this tool in many languages adds value to this study in terms of comparing the results.
The median of the total PA (light and more vigorous in intensity) was 163.8 METs.h/week. This finding is higher than those determined in other studies. However, most of the energy expenditure in PA corresponded to light-intensity activities and household/caregiving activities. This is in line with findings of studies in different populations.17-22,37
In this study, the least amount of energy was consumed in occupational activities. This may be explained by the fact that most of the participants in this study were unemployed (58.5%). However, the type of activity that consumed the second-least amount of energy in this study was sports/exercise. This agrees with the findings of previous studies showing that pregnant women’s energy expenditure is the least in occupational activities and sports/exercise.17-22,37
The study finding of low sports exercise activity and vigorous-intensity activity may be explained by the lack of information and common misbeliefs about exercise and sports during pregnancy in Saudi Arabia. These results are consistent with those of other studies conducted in Japan and Taiwan.18,38
The content validity for the Arabic version of the tool was good. Although the S-CVI/UA was not adequate for this questionnaire, both the S-CVI/Ave and I-CVI were excellent.
Construct validity for this questionnaire was good, as all constructs were unidimensional with good fit to the Rasch model. In addition, item difficulty order was consistent with the theoretical hierarchy in almost all domains except for leisure/sports activities, where 2 items (number 24 and number 25) distorted the order. These items were about walking quickly and walking quickly up hills, respectively. Although these 2 activities consume more energy than other activities, walking in general is a common activity for pregnant women and is performed regularly throughout pregnancy. This might explain the low difficulty level yielded for these 2 items.
Cronbach’s alpha coefficients for different types of PA ranged from 0.56 to 0.83. However, Rasch analysis revealed good to excellent item separation index (2.81-5.45) and excellent reliability (0.89-0.97) indicating that this tool has excellent internal consistency.
The current study showed good test–retest reliability for total PA (ICC=0.78). This was comparable with other translations and adaptations of the PPAQ, as the ICC ranged from 0.75 to 0.9.17,19-22
During the preparation of this manuscript, an Arabic version of the PPAQ was published.39 It showed validity in a sample of 179 pregnant Lebanese women with different educational backgrounds, socioeconomic statuses, and gestational ages.39 An advantage of the current study is that our tool is a translation of the original English PPAQ without major changes. The researchers retained the same questions and number of items, with the same content and types of activities, without deleting any items. During the stage of validating the tool, all questions were relevant to the population, and most of the participants answered them. Thus, we believe that this tool might be more applicable to pregnant women in the Arabian Gulf region. In addition, the reliability of this tool was tested at 2-week intervals and showed good repeatability and internal consistency.
There were some limitations in this study. It used a self-report measure of PA, which could have been limited by recall bias; adding a pedometer in future research, for example, would provide a more objective assessment. Another limitation was the low number of participants in the retest, which could have weakened the ICC values.
In conclusion, the results of our reliability and validity testing are in line with those of previous studies indicating successful translation and adaptation of the PPAQ to the Saudi/Arabian Gulf culture and acceptable reliability of the Arabic PPAQ.
Acknowledgment
This research was supported by the College of Medicine Research Center, Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia. The authors would like to thank Dr. Ali Al-Hazmi, Dr. Hayfaa Wahabi and Dr. Samia Esmaeil for revising the initial questionnaire after translation.
Footnotes
Disclosure. Authors have no conflict of interests, and the work was not supported or funded by any drug company.
- Received December 29, 2020.
- Accepted March 22, 2021.
- Copyright: © Saudi Medical Journal
This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial License (CC BY-NC), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.