STEM item generation: Can ChatGPT be culturally responsive?
Abstract
This exploratory study investigates bias in multiple-choice biology items generated by ChatGPT-4o, focusing not only on the impact of prompt phrasing but also on how a user’s query history influences item content. Specifically, it addresses three research questions: (1) How well does ChatGPT generate introductory biology items? (2) How does it interpret a request for culturally relevant content? and (3) How do outputs vary across three distinct user profiles? Using a standardized series of prompts, 100 items were generated per condition. All items were analyzed for factual accuracy, content representation, and patterns in correct answer distribution. Additional analyses for each research question evaluated the representation of scientists (e.g., perceived name diversity, gendered pronouns) and the depth of culturally responsive framing. While ChatGPT produced largely accurate items across conditions, biases emerged. Culturally responsive prompts often yielded tokenized cultural statements rather than contextually rich items. Correct answers were non-randomly distributed, posing a threat to test validity. Crucially, user query history influenced multiple aspects of the generated items: the representation of content topics, the representation of scientists, and what is considered “culture.” These findings have implications for test developers at any level who are considering genAI tools that preserve a user’s query history in assessment design, and they emphasize the need for careful attention to both prompt engineering and user history.
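The abstract reports that correct answers were non-randomly distributed across options. A minimal sketch of how such a pattern could be checked, assuming four answer options and purely illustrative counts (the study’s actual data are not reproduced here):

    # Chi-square goodness-of-fit test for answer-key balance (Python).
    # Counts are hypothetical, illustrating the method only.
    from scipy.stats import chisquare

    # Hypothetical counts of correct answers by option across 100 items.
    observed = [18, 41, 27, 14]  # options A, B, C, D

    # Default expected frequencies are uniform (25 each for 100 items),
    # i.e., the distribution a randomly assigned key would produce.
    stat, p = chisquare(observed)
    print(f"chi-square = {stat:.2f}, p = {p:.4f}")

Under a balanced key, each option should appear as the correct answer roughly equally often; a small p-value flags a skewed key that test-wise examinees could exploit.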
Keywords: AI, Culturally responsive, Test development
How to Cite:
Lambert, L., & Jones, M. (2026). STEM item generation: Can ChatGPT be culturally responsive? Practical Assessment, Research, and Evaluation, 30(2), 8. https://doi.org/10.7275/pare.3152
