Game of Questions: An automated method for unconventional evaluation of Large Language Models

Przemysław Świat; Łukasz Hein; Marcel Cymanowski

doi:10.34808/tq2025/29.4/b

Abstract

The rapid advancement of Large Language Models (LLMs) has created a need for methods to evaluate their performance, particularly in assessing their domain-specific knowledge and the ability to apply such knowledge in reasoning tasks. Current benchmarks often require substantial manual effort for test case construction and answer scoring. We address this limitation by providing a robust, automatic evaluation method that relies only on unstructured domain text. We introduce the Game of Questions, a method that allows the model's knowledge to be tested via an interaction with another model, inspired by the popular web-based game Akinator. The approach requires minimal input from the evaluator and no prepared questions, making it convenient to apply.

Keywords:

Large Language Model, benchmark

Details

Issue

Vol. 29 No. 4 (2025)

Section

Research article

Published

2026-05-25

DOI:

https://doi.org/10.34808/tq2025/29.4/b

License:

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors

Przemysław Świat
Gdańsk University of Technology https://orcid.org/0009-0005-7547-1151
Łukasz Hein
Gdańsk University of Technology
Marcel Cymanowski
Gdańsk University of Technology
