Table 3 Performance metrics of each AI model based on etiological groups

From: Performance of the Large Language Models in African rheumatology: a diagnostic test accuracy study of ChatGPT-4, Gemini, Copilot, and Claude artificial intelligence

| AI Model | Etiological Group | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Accuracy (%) | AUC (95% CI) |
|---|---|---|---|---|---|---|---|
| ChatGPT-4 | Infectious diseases | 91.83 | 18.51 | 50.56 | 71.42 | 53.39 | 0.552 (0.451–0.650) |
| ChatGPT-4 | Degenerative diseases | 86.66 | 13.63 | 14.60 | 85.71 | 24.72 | 0.502 (0.401–0.602) |
| ChatGPT-4 | Chronic inflammatory rheumatic diseases | 94.11 | 15.11 | 17.97 | 92.85 | 28.15 | 0.546 (0.445–0.645) |
| ChatGPT-4 | Microcrystalline diseases | 84.61 | 13.33 | 12.36 | 85.71 | 22.33 | 0.490 (0.390–0.590) |
| ChatGPT-4 | Neoplastic diseases | 44.44 | 9.57 | 4.49 | 64.28 | 12.62 | 0.270 (0.187–0.367) |
| Gemini | Infectious diseases | 69.38 | 25.92 | 45.94 | 48.27 | 46.60 | 0.477 (0.377–0.577) |
| Gemini | Degenerative diseases | 80.00 | 29.54 | 16.21 | 89.65 | 36.89 | 0.548 (0.447–0.646) |
| Gemini | Chronic inflammatory rheumatic diseases | 94.11 | 32.55 | 21.62 | 96.55 | 42.71 | 0.633 (0.533–0.726) |
| Gemini | Microcrystalline diseases | 61.53 | 26.66 | 10.81 | 82.75 | 31.06 | 0.441 (0.343–0.542) |
| Gemini | Neoplastic diseases | 44.44 | 25.53 | 5.40 | 82.75 | 27.18 | 0.350 (0.259–0.450) |
| Copilot | Infectious diseases | 81.63 | 29.63 | 51.28 | 64.00 | 54.36 | 0.556 (0.455–0.654) |
| Copilot | Degenerative diseases | 80.00 | 25.00 | 15.38 | 88.00 | 33.01 | 0.525 (0.424–0.624) |
| Copilot | Chronic inflammatory rheumatic diseases | 82.35 | 25.58 | 17.94 | 88.00 | 34.95 | 0.540 (0.439–0.649) |
| Copilot | Microcrystalline diseases | 69.23 | 23.33 | 11.53 | 84.00 | 29.12 | 0.463 (0.364–0.564) |
| Copilot | Neoplastic diseases | 33.33 | 20.21 | 3.84 | 76.00 | 21.35 | 0.268 (0.185–0.364) |
| Claude AI | Infectious diseases | 91.83 | 20.37 | 51.13 | 73.33 | 54.36 | 0.561 (0.460–0.659) |
| Claude AI | Degenerative diseases | 73.33 | 12.50 | 12.50 | 73.33 | 21.35 | 0.429 (0.332–0.530) |
| Claude AI | Chronic inflammatory rheumatic diseases | 94.11 | 16.27 | 18.18 | 93.33 | 29.12 | 0.552 (0.451–2.568) |
| Claude AI | Microcrystalline diseases | 76.92 | 13.33 | 11.36 | 80.00 | 21.35 | 0.451 (0.353–0.552) |
| Claude AI | Neoplastic diseases | 66.66 | 12.76 | 6.81 | 80.00 | 17.47 | 0.397 (0.302–0.492) |

  1. PPV: Positive Predictive Value; NPV: Negative Predictive Value
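
For readers who want to see how the columns in Table 3 relate to one another, here is a minimal Python sketch that derives each metric from a 2×2 confusion matrix. The table does not report raw counts, so the counts below (`tp=45`, `fp=44`, `fn=4`, `tn=10`) are an assumption: they were chosen because they reproduce the ChatGPT-4 "Infectious diseases" row to within rounding or truncation in the second decimal. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical. The single-point AUC formula, (sensitivity + specificity) / 2, is the area under a ROC curve with one operating point; the tabulated AUC values appear consistent with it, but the source does not state how AUC was computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 confusion matrix for one model and one etiological group."""
    tp: int  # model says "in group", reference diagnosis agrees
    fp: int  # model says "in group", reference diagnosis disagrees
    fn: int  # model says "not in group", reference says "in group"
    tn: int  # model says "not in group", reference diagnosis agrees


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the Table 3 metrics; percentages for all but the AUC.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)
    # Area under a ROC curve that has a single operating point
    # (trapezoid through (0,0), (1 - specificity, sensitivity), (1,1)).
    auc_single_point = (sensitivity + specificity) / 2
    return {
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "ppv": 100 * ppv,
        "npv": 100 * npv,
        "accuracy": 100 * accuracy,
        "auc_single_point": auc_single_point,
    }


# Assumed counts (not reported in the table), N = 103 cases in total.
example = ConfusionCounts(tp=45, fp=44, fn=4, tn=10)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, high sensitivity combined with low specificity pulls the single-point AUC toward 0.5, which matches the pattern seen across most rows of the table.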