Evaluating LLM-Generated Versus Human-Authored Responses in Role-Play Dialogues [Oral]
The 18th International Natural Language Generation Conference (INLG 2025) [Slides] [Poster]
Description: We compare LLM-generated and human-authored responses in multi-turn role-play interactions. We observe that the LLM-generated responses degrade over multiple turns and show consistency issues.