
Automating agentic collaborative ontology engineering with role-playing simulation of LLM-powered agents and RAG technology

Published online by Cambridge University Press:  19 December 2025

Andreas Soularidis*
Affiliation:
Department of Cultural Technology and Communications, Intelligent Systems Lab, University of the Aegean, University Hill, 81100 Lesvos, Greece
Dimitrios Doumanas
Affiliation:
Department of Cultural Technology and Communications, Intelligent Systems Lab, University of the Aegean, University Hill, 81100 Lesvos, Greece
Konstantinos Kotis
Affiliation:
Department of Cultural Technology and Communications, Intelligent Systems Lab, University of the Aegean, University Hill, 81100 Lesvos, Greece
George A. Vouros
Affiliation:
Department of Digital Systems, AI Lab, Gr. Lampraki 126, University of Piraeus, Piraeus, Greece
*
Corresponding author: Andreas Soularidis; Email: soularidis@aegean.gr

Abstract

Motivated by the astonishing capabilities of large language models (LLMs) in text generation, reasoning, and the simulation of complex human behaviors, in this paper we propose a novel multi-component LLM-based framework, namely LLM4ACOE, that fully automates the collaborative ontology engineering (COE) process using role-playing simulation of LLM agents and retrieval-augmented generation (RAG) technology. The proposed solution enhances the LLM-powered role-playing simulation with RAG, ‘feeding’ the LLM with three types of external knowledge, each corresponding to the knowledge required by one of the COE roles (agents) in a component-based framework: (a) domain-specific, data-centric documents, (b) OWL documentation, and (c) ReAct guidelines. These components are evaluated in combination, with the aim of investigating their impact on the quality of the generated ontologies. The aim of this work is twofold: (a) to identify the capacity of LLM-based agents to generate ontologies acceptable to human experts through agentic collaborative ontology engineering (ACOE) role-playing simulation, at specific levels of acceptance (accuracy, validity, and expressiveness of ontologies) without human intervention, and (b) to investigate whether, and to what extent, the selected RAG components affect the quality of the generated ontologies. The evaluation of this novel approach is performed using ChatGPT-o in the domain of search and rescue (SAR) missions. To assess the generated ontologies, quantitative and qualitative measures are employed, focusing on coverage, expressiveness, structure, and human involvement.
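The abstract describes feeding each COE role-playing agent with context retrieved from three separate knowledge sources (domain documents, OWL documentation, ReAct guidelines). A minimal sketch of that per-role retrieval-and-prompting idea is shown below; it is purely illustrative, not the authors' implementation, and all names (`KnowledgeStore`, `build_role_prompt`, the sample snippets, the word-overlap scorer standing in for a real similarity method) are assumptions:

```python
# Illustrative sketch of per-role RAG augmentation, as described in the abstract.
# The similarity function, class names, and sample corpora are hypothetical.

def score(query: str, chunk: str) -> int:
    """Crude relevance proxy: number of shared lowercase word tokens."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

class KnowledgeStore:
    """One retrievable corpus per external knowledge type:
    (a) domain documents, (b) OWL documentation, (c) ReAct guidelines."""
    def __init__(self, chunks):
        self.chunks = chunks

    def retrieve(self, query, k=1):
        # Return the k chunks most relevant to the query.
        return sorted(self.chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

def build_role_prompt(role, task, stores):
    """Assemble an augmented prompt for one COE role by pulling the
    top chunk from each external knowledge source."""
    context = []
    for name, store in stores.items():
        context.append(f"[{name}] " + " ".join(store.retrieve(task)))
    return f"You are the {role} agent.\nTask: {task}\nContext:\n" + "\n".join(context)

stores = {
    "domain": KnowledgeStore(["SAR missions deploy drones to locate missing persons"]),
    "owl": KnowledgeStore(["owl:Class declares a class; rdfs:subClassOf asserts a hierarchy"]),
    "react": KnowledgeStore(["Interleave Thought, Action, and Observation steps"]),
}
prompt = build_role_prompt("knowledge engineer", "model SAR missions with drones", stores)
print(prompt.splitlines()[0])  # → You are the knowledge engineer agent.
```

In the paper's actual framework the keyword overlap would be replaced by the RAG similarity method under study, and the assembled prompt would be sent to the LLM playing the given role.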

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (https://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press
Figure 1. The Sim-HCOME COE methodology

Figure 2. Basic RAG architecture

Figure 3. The ReAct approach

Figure 4. Proposed framework architecture

Figure 5. Example screenshot of reasoning traces following the ReAct approach tailored to OE methodology

Table 1. Values of hyperparameters tested in the first experimental phase

Figure 6. Experimental settings of the second experimental phase

Table 2. Comparative results of the first experimental phase

Table 3. Results of the first experimental phase

Table 4. Average, SD, and median measures of the first experimental setting

Table 5. Average, SD, and median measures of the second experimental setting

Table 6. Average, SD, and median measures of the third experimental setting

Table 7. Average, SD, and median measures of the fourth experimental setting

Table 8. Comparative average measures among experimental settings

Table 9. Average, SD, and median measures of the augmentation of domain-specific data

Figure 7. The sequential approach

Table 10. Comparative average measures among all experimental approaches

Table B1. Comparative average measures among similarity methods of RAG in the first experimental phase

Table B2. Comparative average measures regarding LLM temperature in the first experimental phase

Table B3. Comparative average measures of the first experimental phase

Table C1. Experimental results from the first experimental setting

Table C2. Measures of the first experimental setting

Table C3. Experimental results from the second experimental setting

Table C4. Measures of the second experimental setting

Table C5. Experimental results from the third experimental setting

Table C6. Measures of the third experimental setting

Table C7. Experimental results from the fourth experimental setting

Table C8. Measures of the fourth experimental setting

Table D1. Measures of the domain-specific data augmentation experiment

Table D2. Experimental results from the augmentation of domain-specific data

Table D3. Measures of the sequential approach

Table D4. Experimental results from the sequential approach