To find out how well GPT-4 can act as a dialogue agent in the video game Disco Elysium: The Final Cut, we conducted a statistically robust study with 28 players recruited from the r/DiscoElysium subreddit. Players were tasked with playing through a conversation from the game in which GPT-4 outputs were evaluated against the game designers’ writing via both preference judgements and free-form feedback. The study used a web interface that recreates the game’s core conversation functionality.

You can experience a slightly modified version of the same conversation our study participants played. The conversation comes from a late stage in the game (it is part of the validation split), which is why we only recruited players who had previously completed at least one full playthrough of the game.

Note: You can use any username/password combination. The login screen is purely decorative.

The setup is nearly identical to the one used in the study, except for two small modifications:

  • Registering and logging in are just decorative. As a consequence, we do not collect any of the feedback you might provide through the interface. It is only stored locally.
  • Once you’ve chosen a dialogue option, hovering over its info icon reveals whether that option was human-authored or generated by GPT-4.

Disclaimer: The dialogue displayed below is intended for educational and research purposes only. Copyright is retained by the original authors.

Hint: You can maximize the display by pressing the maximize button.