Large Language Models and Data Quality for Knowledge Graphs
Abstract submission deadline:
Full paper deadline: 2024-09-01
Impact factor: 4.787
Journal difficulty:
CCF classification: Class B
CAS JCR ranking:
• Major category: Computer Science - Q1
• Subcategory: Computer Science, Information Systems - Q1
• Subcategory: Library, Information and Archives Management - Q1
Overview
Knowledge Graphs (KGs) have become crucial for virtual assistants, web search, and organizational data comprehension in recent years. Notable examples include Wikidata, DBpedia, YAGO, and NELL, and large companies increasingly build industry-scale KGs to organize and comprehend their data. Building KGs draws on AI areas such as data integration, cleaning, named entity recognition, relation extraction, and active learning. However, automated methods often result in sparse and inaccurate KGs. Evaluating KG quality is therefore vital for gaining insights into the data, refining the construction process, and ensuring accurate information for downstream applications. Despite its significance, there is limited research on data quality and evaluation for KGs at scale.
Large Language Models (LLMs) present both opportunities and challenges for KG construction and evaluation, bridging human and machine capabilities. Integrating LLMs into KG construction systems can make them more context-aware, but LLMs are prone to hallucination and may introduce mis/disinformation; managing these hallucinations is crucial to prevent KG pollution. Combining LLMs with quality evaluation also shows promise, as demonstrated by LLM-generated relevance judgments in information retrieval.
The special issue advocates human-machine collaboration for KG construction and evaluation, emphasizing the intersection of KGs and LLMs. Submissions are encouraged on LLMs in KG construction systems, KG quality evaluation, and quality control systems for KG-LLM interactions in both research and industry. Topics include KG construction, the use of LLMs in KG generation, deploying LLMs on large-scale KGs, efficient KG quality assessment, human-in-the-loop architectures, domain-specific applications, and industry-scale KG maintenance. The issue aims to advance the understanding and application of KGs and LLMs, fostering innovation at this evolving intersection.
Guest editors:
1. Dr. Gianmaria Silvello (Managing Guest Editor), University of Padua, Department of Information Engineering, Padua, Italy.
2. Dr. Omar Alonso, Amazon, Palo Alto, California, United States of America.
3. Dr. Stefano Marchesin, University of Padua, Department of Information Engineering, Padua, Italy.
Special issue information:
In recent years, Knowledge Graphs (KGs), encompassing millions of relational facts, have emerged as central assets to support virtual assistants and search and recommendations on the web. Notable examples are Wikidata, DBpedia, YAGO, and NELL. Moreover, KGs are increasingly used by large companies and organizations to organize and comprehend their data, with industry-scale KGs fusing data from various sources for downstream applications. Building KGs involves data management and artificial intelligence areas, such as data integration, cleaning, named entity recognition and disambiguation, relation extraction, and active learning [1, 2].
However, the methods used to build these KGs rely on automated components that are far from perfect, resulting in KGs that are highly sparse and contain several inaccuracies and incorrect facts. As a result, evaluating KG quality plays a significant role, as it serves multiple purposes – e.g., gaining insights into the quality of the data, triggering the refinement of the KG construction process, and providing valuable information to downstream applications. In this regard, the information in the KG must be correct to ensure an engaging user experience for entity-oriented services like virtual assistants. Despite its importance, there is little research on data quality and evaluation for KGs at scale [3].
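To make the evaluation problem concrete, the following is a minimal, purely illustrative sketch (not part of the call itself) of estimating KG accuracy by auditing a random sample of facts; the toy `knowledge_graph`, the `estimate_accuracy` function, and the `judge` callable are hypothetical names standing in for whatever annotation source a study might use.

```python
import random

# A toy KG represented as (subject, predicate, object) triples -- purely illustrative.
knowledge_graph = [
    ("Padua", "locatedIn", "Italy"),
    ("Wikidata", "instanceOf", "KnowledgeGraph"),
    ("YAGO", "derivedFrom", "Wikipedia"),
]

def estimate_accuracy(triples, judge, sample_size=100, seed=42):
    """Estimate KG accuracy from a small audited sample of facts.

    `judge` is any callable returning True/False for a triple: a human
    annotator, a crowdsourcing task, or an LLM-based checker.
    """
    random.seed(seed)
    sample = random.sample(triples, min(sample_size, len(triples)))
    correct = sum(1 for t in sample if judge(t))
    return correct / len(sample)

# Placeholder judge that accepts every fact; prints 1.0 for the toy KG.
print(estimate_accuracy(knowledge_graph, judge=lambda t: True))
```

Sampling-based auditing of this kind is one simple way to keep evaluation tractable at industry scale, since exhaustively checking millions of facts is rarely feasible.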
In this context, the rise of Large Language Models (LLMs) opens up unprecedented opportunities – and challenges – to advance KG construction and evaluation, providing an intriguing intersection between human and machine capabilities. On the one hand, integrating LLMs within KG construction systems could trigger the development of more context-aware and adaptive AI systems. At the same time, however, LLMs are known to hallucinate and can thus generate mis/disinformation, which can affect the quality of the resulting KG. In this sense, reliability and credibility components are of paramount importance to manage the hallucinations produced by LLMs and avoid polluting the KG. On the other hand, investigating how to combine LLMs and quality evaluation has excellent potential, as shown by promising results from using LLMs to generate relevance judgments in information retrieval [4, 5].
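As an equally illustrative sketch of the kind of LLM-in-the-loop quality control alluded to above, the snippet below asks a language model to label candidate facts before they enter a KG and routes everything else to human review; the `llm_complete` callable, the prompt wording, and the SUPPORTED/REFUTED/UNKNOWN label set are hypothetical placeholders, not any specific system's API.

```python
def llm_judge(triple, llm_complete):
    """Ask an LLM whether a candidate fact is supported; anything else goes to review.

    `llm_complete` is a placeholder for whatever completion API is available;
    the prompt and label set are illustrative, not prescriptive.
    """
    subject, predicate, obj = triple
    prompt = (
        "Answer SUPPORTED, REFUTED, or UNKNOWN.\n"
        f"Fact: {subject} {predicate} {obj}."
    )
    return llm_complete(prompt).strip().upper() == "SUPPORTED"

def filter_candidates(candidates, llm_complete):
    """Split extracted candidate facts into accepted triples and triples for human review."""
    accepted, needs_review = [], []
    for triple in candidates:
        (accepted if llm_judge(triple, llm_complete) else needs_review).append(triple)
    return accepted, needs_review

# Usage with a stub "model" that defers every decision to human review.
stub_llm = lambda prompt: "UNKNOWN"
print(filter_candidates([("Padua", "locatedIn", "Italy")], stub_llm))
```

Routing uncertain or refuted candidates to human reviewers rather than discarding them is one way such a reliability component could limit hallucinated facts without losing recall.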
Thus, this special issue promotes novel research on human-machine collaboration for KG construction and evaluation, fostering the intersection between KGs and LLMs [6, 7]. To this end, we encourage submissions related to using LLMs within KG construction systems, evaluating KG quality, and applying quality control systems to empower KG and LLM interactions in both research- and industry-oriented scenarios.
Possible topics for submission
Potential topics include but are not limited to the following:
KG construction systems
Use of LLMs for KG generation
Efficient solutions to deploy LLMs on large-scale KGs