Research Article: Original Article
Open Access

Assessing the accuracy and efficiency of Chat GPT-4 Omni (GPT-4o) in biomedical statistics

Comparative study with traditional tools

Anusha S. Meo, Narmeen Shaikh and Sultan A. Meo
Saudi Medical Journal December 2024, 45 (12) 1383-1390; DOI: https://doi.org/10.15537/smj.2024.45.12.20240454
Anusha S. Meo, MBBS, MPH
Narmeen Shaikh, MBBS
Sultan A. Meo, MBBS, PhD (for correspondence: [email protected])
From the School of Medicine (AS Meo), Medical Sciences and Nutrition, University of Aberdeen, Scotland, United Kingdom; from the College of Medicine (Shaikh), King Saud University; and from the Department of Physiology (SA Meo), College of Medicine, King Saud University, Riyadh, Kingdom of Saudi Arabia.

Figures

Figure 1. Large dataset file upload and command entry into ChatGPT Omni. ANOVA: analysis of variance.

Figure 2. Comparison of large dataset-based graphs generated by ChatGPT Omni versus the Statistical Package for the Social Sciences (SPSS).

Tables

Table 1. Summary of the types of statistical tests used for data analysis.

Descriptive analysis: Frequency, percentage, mean, median, mode, range, standard deviation, and skewness of the data.

Bivariate analysis:
  • a. T-test or Mann–Whitney U test, depending on the variables in the dataset, between a continuous and a dichotomous categorical variable.
  • b. ANOVA or Kruskal–Wallis test, depending on the variables in the dataset. Where ANOVA was used, a post hoc test (Scheffé or Tukey HSD) was also done for intergroup comparison of means.
  • c. Chi-square test between two categorical variables.
  • d. Correlation analysis: Spearman or Pearson test, depending on the variables in the dataset.
  • e. Simple linear regression to find a linear relationship between two continuous variables.

Multivariate analysis: Multiple linear regression if the defined outcome was a continuous variable.

ANOVA: analysis of variance; HSD: honestly significant difference
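The selection logic in Table 1 can be read as a small decision rule. The sketch below is a hedged illustration only, not the authors' procedure. The function name `choose_bivariate_test` and the `normal` flag are hypothetical, and using approximate normality to choose between the parametric and non-parametric option is an assumption, since Table 1 says only that the choice depends on the variables in the dataset.

```python
def choose_bivariate_test(kind_a, kind_b, levels=None, normal=True):
    """Suggest a bivariate test following the pairings listed in Table 1.

    kind_a, kind_b: "continuous" or "categorical".
    levels: number of groups in the categorical variable (needed when one
            variable is continuous and the other categorical).
    normal: hypothetical flag (an assumption, not from the source); True if
            the continuous data look approximately normally distributed.
    """
    kinds = sorted([kind_a, kind_b])

    if kinds == ["continuous", "continuous"]:
        return "Pearson correlation" if normal else "Spearman correlation"

    if kinds == ["categorical", "categorical"]:
        return "Chi-square test"

    if kinds == ["categorical", "continuous"]:
        if levels is None or levels < 2:
            raise ValueError("levels must be 2 or more for a grouping variable")
        if levels == 2:
            return "t-test" if normal else "Mann-Whitney U test"
        return "ANOVA with post hoc test" if normal else "Kruskal-Wallis test"

    raise ValueError(f"unknown variable kinds: {kind_a!r}, {kind_b!r}")
```

For example, `choose_bivariate_test("continuous", "categorical", levels=3, normal=False)` returns the Kruskal-Wallis option, which matches the case in which no post hoc test applies.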

Table 2. The script of the nine questions used to assess the various statistical tools.

Q1: Re-code the string data into numeric values by allotting each group a number.
Q2: Transform the continuous age variable into a categorical age variable.
Q3a: Find mode, median and range.
Q3b: Calculate mean and standard deviation for continuous variables and frequency and per cent for categorical variables.
Q3c: Calculate the skewness of the data.
Q4a: What is the most accurate statistical test for the variables (continuous variable) and (continuous variable)? Perform it and give its coefficient and p-value.
Q4b: For the above continuous variables, perform simple linear regression analysis if applicable, with y as the dependent and x as the independent variable.
Q5: What is the most accurate statistical test for the variables (continuous variable) and (dichotomous categorical variable)? Perform it and give its coefficient and p-value.
Q6: What is the most accurate statistical test for the variables (continuous variable) and (categorical variable with > 2 levels)? Perform it and give its coefficient and p-value. Do a post hoc test for the ANOVA test.
Q7: What is the most accurate statistical test for the variables (categorical variable) and (categorical variable)? Perform it and give its coefficient and p-value.
Q8: Perform multiple linear regression/logistic regression with (y) as the dependent variable (based on the dataset).
Q9: Make appropriate charts: continuous versus continuous variable plot, chart of normality for a continuous variable, and cluster bar chart of frequency for a categorical variable.
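As a rough illustration of what Q1 to Q3c ask for, the following sketch uses only the Python standard library. It is not the authors' script. The sample values, the age cut-offs, and all function names are hypothetical, and the skewness formula shown (the adjusted Fisher-Pearson coefficient) is one common definition chosen here as an assumption, since the source does not state which definition was used.

```python
import statistics
from collections import Counter

def recode(values):
    """Q1: give each distinct string an integer code (1, 2, ...), in sorted order."""
    codes = {label: i for i, label in enumerate(sorted(set(values)), start=1)}
    return [codes[v] for v in values], codes

def age_group(age, cutoffs=(18, 40, 60)):
    """Q2: bin a continuous age; the cut-offs are hypothetical."""
    labels = ("<18", "18-39", "40-59", "60+")
    for cutoff, label in zip(cutoffs, labels):
        if age < cutoff:
            return label
    return labels[-1]

def frequency_table(values):
    """Q3b: frequency and per cent for a categorical variable."""
    n = len(values)
    return {label: (count, 100 * count / n) for label, count in Counter(values).items()}

def skewness(xs):
    """Q3c: adjusted Fisher-Pearson sample skewness (needs at least 3 values)."""
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in xs)

# Hypothetical sample data, for illustration only.
sex = ["male", "female", "female", "male"]
ages = [12, 25, 47, 63, 25]

sex_codes, code_map = recode(sex)            # Q1
age_groups = [age_group(a) for a in ages]    # Q2
age_mode = statistics.mode(ages)             # Q3a
age_median = statistics.median(ages)         # Q3a
age_range = max(ages) - min(ages)            # Q3a
age_mean = statistics.fmean(ages)            # Q3b
age_sd = statistics.stdev(ages)              # Q3b
sex_freq = frequency_table(sex)              # Q3b
age_skew = skewness(ages)                    # Q3c
```

Coding groups in sorted order keeps the mapping reproducible across runs; any fixed, documented mapping would serve the same purpose.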
Table 3. Performance of the ChatGPT Omni tool and recorded response times.

S # | Question (maximum score) | Small dataset: score | Small dataset: time (s) | Medium dataset: score | Medium dataset: time (s) | Large dataset: score | Large dataset: time (s) | Total score
1 | Re-code the string data into numeric values by allotting each group a number (1) | 1/1 | 17.70 | 0/1 | 20.33 | 1/1 | 29.43 | 2/3
2 | Transform continuous age variable into categorical age variable (1) | 1/1 | 15.38 | 1/1 | 23.88 | 1/1 | 11.00 | 3/3
3a | Find mode, median and range (3) | 3/3 | 44.11 | 3/3 | 40.61 | 3/3 | 37.02 | 23/24 (questions 3a to 3c)
3b | Calculate the mean and standard deviation for continuous variables (2) | 2/2 | 33.86 | 2/2 | 19.81 | 2/2 | 22.11 |
 | Calculate frequency and per cent for categorical variables (2) | 1.5/2 | 47.35 | 2/2 | 30.02 | 2/2 | 13.16 |
3c | Calculate the skewness of the data (1) | 0.5/1 | 27.23 | 1/1 | 34.23 | 1/1 | 20.03 |
4a | What statistical test will be performed between cont. variable and cont. variable? | -- | 34.96 | -- | 26.34 | -- | 31.53 | 11/13 (questions 4a and 4b)
 | Choice of test (1) | 0/1 | | 0/1 | | 1/1 | |
 | Calculation of correct test statistic and p-value (2) | 2/2 | 34.53 | 2/2 | 34.03 | 2/2 | |
4b | For the above continuous variables, perform a simple linear regression analysis, with y as the dependent and x as the independent variable (4): R² (variance), p-value of the regression, intercept (constant), slope | N/A* | N/A* | N/A* | N/A* | 4/4 | 143.59 |
5 | What statistical test will be performed between cont. variable and dichotomous cat. variable? | -- | 62.91 | -- | 67.51 | -- | 108.71 | 8/9
 | Choice of test (1) | 1/1 | | 1/1 | | 0/1 | |
 | Calculation of correct test statistic and p-value (2) | 2/2 | | 2/2 | | 2/2 | 117.14 |
6 | What statistical test will be performed between cont. variable and cat. variable with > 2 levels? | -- | 103.79 | -- | 120.24 | -- | 80.14 | 11/11
 | Choice of test (1) | 1/1 | | 1/1 | | 1/1 | |
 | Calculation of correct test statistic and p-value (2) | 2/2 | | 2/2 | | 2/2 | |
 | Post hoc for ANOVA (2): mean difference, p-value | N/A† | N/A† | N/A† | N/A† | 2/2 | 260.19 |
7 | What statistical test will be performed between cat. variable and cat. variable? | -- | 41.50 | -- | 54.98 | -- | 119.92 | 7/9
 | Choice of test (1) | 1/1 | | 1/1 | | 1/1 | |
 | Calculation of correct test statistic and p-value (2) | 0/2 | | 2/2 | | 2/2 | |
8 | Perform multiple linear regression with defined outcome (4): adjusted R² (variance), F statistic, p-value, coefficients | N/A§ | N/A§ | N/A§ | N/A§ | 2/4 | 39.18 | 2/4
9 | Make appropriate charts based on the dataset and specific variables (3): cont. versus cont. variable plot, chart of normality for cont. variable, cluster bar chart of frequency for cat. variable | 3/3 | 24.51 | 0/3 | 65.01 | 3/3 | 39.16 | 6/9
 | Total | 21/25 | 487.74 | 20/25 | 747.02 | 32/35 | 1071.31 | Score: 73/85; Time: 2306.7 s

ANOVA: analysis of variance; cont.: continuous; cat.: categorical; s: seconds

  • * N/A because linear regression could not be performed for the small and medium datasets: the data were not linear and not normally distributed.

  • † Post hoc analysis was done only where ANOVA was performed and is therefore not applicable where the Kruskal–Wallis test was performed instead.

  • § N/A because multiple linear regression could not be performed for the small and medium datasets.
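The totals in Table 3 can be checked by summing the per-question scores. The sketch below is a minimal tally under the assumption that the values are transcribed correctly from the table; sub-items are grouped by question number (for example, Q3 combines 3a to 3c), N/A cells count as 0 out of 0, and the names `scores` and `totals` are hypothetical.

```python
# (points earned, points possible) per question and dataset, from Table 3.
# N/A cells are entered as (0, 0) so they affect neither numerator nor denominator.
scores = {
    "Q1": {"small": (1, 1), "medium": (0, 1), "large": (1, 1)},
    "Q2": {"small": (1, 1), "medium": (1, 1), "large": (1, 1)},
    "Q3": {"small": (7, 8), "medium": (8, 8), "large": (8, 8)},
    "Q4": {"small": (2, 3), "medium": (2, 3), "large": (7, 7)},
    "Q5": {"small": (3, 3), "medium": (3, 3), "large": (2, 3)},
    "Q6": {"small": (3, 3), "medium": (3, 3), "large": (5, 5)},
    "Q7": {"small": (1, 3), "medium": (3, 3), "large": (3, 3)},
    "Q8": {"small": (0, 0), "medium": (0, 0), "large": (2, 4)},
    "Q9": {"small": (3, 3), "medium": (0, 3), "large": (3, 3)},
}

def totals(dataset):
    """Sum earned and possible points for one dataset across all questions."""
    earned = sum(q[dataset][0] for q in scores.values())
    possible = sum(q[dataset][1] for q in scores.values())
    return earned, possible

per_dataset = {name: totals(name) for name in ("small", "medium", "large")}
overall_earned = sum(e for e, _ in per_dataset.values())
overall_possible = sum(p for _, p in per_dataset.values())
overall_percent = round(100 * overall_earned / overall_possible, 1)
```

Under these assumptions the tally reproduces the table's column totals (21/25, 20/25, 32/35) and the overall score of 73/85.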

    Vol. 45, Issue 12
    1 Dec 2024
    Keywords

    • artificial intelligence
    • GPT-4 Omni
    • medical statistics
    • statistical analysis


    © 2025 Saudi Medical Journal Saudi Medical Journal is copyright under the Berne Convention and the International Copyright Convention.  Saudi Medical Journal is an Open Access journal and articles published are distributed under the terms of the Creative Commons Attribution-NonCommercial License (CC BY-NC). Readers may copy, distribute, and display the work for non-commercial purposes with the proper citation of the original work. Electronic ISSN 1658-3175. Print ISSN 0379-5284.
