LLM-based embeddings for clustering and predicting integrated reporting quality levels of companies

dc.contributor.author Mert Sarioglu
dc.contributor.author Gorkem Sariyer
dc.contributor.author Mert Erkan Sözen
dc.contributor.author Sarioglu, Mert
dc.contributor.author Sariyer, Gorkem
dc.contributor.author Sozen, Mert Erkan
dc.date.accessioned 2025-10-06T17:48:32Z
dc.date.issued 2025
dc.description.abstract Artificial Intelligence (AI) offers a range of functions and algorithms that help firms improve their decision-making. Moreover, with the adoption of Integrated Reporting (IR), reporting practices, which are critical communication channels for companies, have become more practical. Given the importance of both subjects, AI methodologies based on LLM embeddings are expected to contribute positively to IR quality (IRQ). Additionally, grouping companies according to their IRQ characteristics can save time and cost in decision-making. The main purpose of this study is therefore to cluster companies by their IRQ characteristics using LLM embeddings and to use this grouping in further decision-making. The paper provides evidence on whether LLMs are a useful AI tool in IR practice and whether LLM-based clustering is an efficient way of generating predictions for decision-making. The sample consists of 260 integrated reports published in 2019. The study also contributes to the literature on the applicability of LLMs to small data sets, which is relevant because the number of integrated reports published in a year is low. The findings reveal the superiority of LLM and indicate its usefulness in predicting IRQ with respect to different firm indicators. Given this empirical evidence, firms should adopt the techniques and steps presented, both to identify ways of improving IRQ and in other AI applications in the future.
dc.identifier.doi 10.1007/s10791-025-09590-6
dc.identifier.issn 2948-2992
dc.identifier.issn 2948-2984
dc.identifier.scopus 2-s2.0-105006715760
dc.identifier.uri https://www.scopus.com/inward/record.uri?eid=2-s2.0-105006715760&doi=10.1007%2Fs10791-025-09590-6&partnerID=40&md5=ed70194bf313db3bcb55b25742bd8fc2
dc.identifier.uri https://gcris.yasar.edu.tr/handle/123456789/7964
dc.identifier.uri https://doi.org/10.1007/s10791-025-09590-6
dc.language.iso English
dc.publisher Springer Science and Business Media B.V.
dc.relation.ispartof Discover Computing
dc.rights info:eu-repo/semantics/openAccess
dc.source Discover Computing
dc.subject K-means
dc.subject SBERT
dc.subject Integrated Reporting Quality
dc.subject Large Language Model
dc.subject XGBoost
dc.subject Artificial Intelligence
dc.title LLM-based embeddings for clustering and predicting integrated reporting quality levels of companies
dc.type Article
dspace.entity.type Publication
gdc.author.id SÖZEN, Mert Erkan/0000-0002-7965-6461
gdc.author.id Sariyer, Görkem/0000-0002-8290-2248
gdc.author.id Sarıoğlu, Mert/0000-0001-7186-228X
gdc.author.scopusid 59919309700
gdc.author.scopusid 57430116000
gdc.author.scopusid 57189867008
gdc.author.wosid Sarıoğlu, Mert/LQK-2569-2024
gdc.author.wosid Sariyer, Görkem/AAA-1524-2019
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.departmenttemp [Sarioglu, Mert] Yasar Univ, Fac Business, Business Adm Dept Part Time Lecturer, Izmir, Turkiye; [Sariyer, Gorkem] Yasar Univ, Fac Business, Business Adm Dept, Izmir, Turkiye; [Sozen, Mert Erkan] Izmir Metro Co, Business Dev & Data Sci Execut, Izmir, Turkiye
gdc.description.issue 1
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member
gdc.description.volume 28
gdc.description.woscitationindex Science Citation Index Expanded
gdc.identifier.openalex W4410791185
gdc.identifier.wos WOS:001497385600001
gdc.index.type Scopus
gdc.index.type WoS
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 2.0
gdc.oaire.influence 2.503339E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 4.0656E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration National
gdc.openalex.fwci 6.6607
gdc.openalex.normalizedpercentile 0.96
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 10
gdc.plumx.scopuscites 2
gdc.scopus.citedcount 2
gdc.virtual.author Sözen, Mert Erkan
gdc.virtual.author Pedergnana, Matthieu Joseph
gdc.wos.citedcount 2
person.identifier.scopus-author-id Sarioglu, Mert (59919309700); Sariyer, Gorkem (57189867008); Sözen, Mert Erkan (57430116000)
publicationissue.issueNumber 1
publicationvolume.volumeNumber 28
relation.isAuthorOfPublication 9aa64924-97d5-47da-8bde-f07877223c7c
relation.isAuthorOfPublication cbdb2729-def4-496b-882b-ca6102fcbd56
relation.isAuthorOfPublication.latestForDiscovery 9aa64924-97d5-47da-8bde-f07877223c7c
relation.isOrgUnitOfPublication ac5ddece-c76d-476d-ab30-e4d3029dee37
relation.isOrgUnitOfPublication.latestForDiscovery ac5ddece-c76d-476d-ab30-e4d3029dee37

Files

Original bundle

Name: s10791-025-09590-6.pdf
Size: 1.58 MB
Format: Adobe Portable Document Format