Assessing ChatGPT's Potential in HIV Prevention Communication: A Comprehensive Evaluation of Accuracy, Completeness, and Inclusivity

Creators: De Vito, Andrea; Colpani, Agnese; Moi, Giulia; Babudieri, Sergio; Calcagno, Andrea; Calvino, Valeria; Ceccarelli, Manuela; Colpani, Gianmaria; d'Ettorre, Gabriella; Di Biagio, Antonio; Farinella, Massimo; Falaguasta, Marco; Focà, Emanuele; Giupponi, Giusi; Habed, Adriano José; Isenia, Wigbertson Julian; Lo Caputo, Sergio; Marchetti, Giulia; Modesti, Luca; Mussini, Cristina; Nunnari, Giuseppe; Rusconi, Stefano; Russo, Daria; Saracino, Annalisa; Serra, Pier Andrea; Madeddu, Giordano

Description

With the advancement of artificial intelligence (AI), platforms like ChatGPT have gained traction in different fields, including medicine. This study aims to evaluate the potential of ChatGPT in addressing questions related to HIV prevention and to assess its accuracy, completeness, and inclusivity. A team consisting of 15 physicians, six members from HIV communities, and three experts in gender and queer studies designed an assessment of ChatGPT. Queries were categorized into five thematic groups: general HIV information, behaviors increasing HIV acquisition risk, HIV and pregnancy, HIV testing, and prophylaxis use. A team of medical doctors developed the questions submitted to ChatGPT. The other members critically assessed the generated responses regarding level of expertise, accuracy, completeness, and inclusivity. The median accuracy score was 5.5 out of 6, with 88.4% of responses achieving a score ≥ 5. Completeness had a median of 3 out of 3, while the median for inclusivity was 2 out of 3. Some thematic groups, like behaviors associated with HIV transmission and prophylaxis, exhibited higher accuracy, indicating variable performance across different topics. Issues of inclusivity were identified, notably the use of outdated terms and a lack of representation for some communities. ChatGPT demonstrates significant potential in providing accurate information on HIV-related topics. However, while responses were often scientifically accurate, they sometimes lacked the socio-political context and inclusivity essential for effective health communication. This underlines the importance of aligning AI-driven platforms with contemporary health communication strategies and of balancing accuracy with inclusivity.
