Ahmad Pratama
Data Literacies Lead
Stony Brook University
Artificial intelligence (AI) text detection tools are promoted as a means of preserving the integrity of scholarly publishing by determining whether a text was written by a human or generated by AI. This study evaluates three popular tools (GPTZero, ZeroGPT, and DetectGPT) in two experiments: first, distinguishing human-written abstracts from abstracts generated by ChatGPT o1 and Gemini 2.0 Advanced; second, assessing AI-assisted abstracts in which the original text was revised by the same large language models (LLMs) to improve readability. The results reveal notable trade-offs in accuracy and bias, with errors disproportionately affecting non-native English speakers and certain disciplines. These findings highlight the limitations of detection-focused approaches and support a shift toward ethical, responsible, and transparent use of LLMs in scholarly publishing.