This repository contains the implementation of ANTS, a research paper accepted at ESWC 2025 (Research Track). ANTS addresses the challenges of abstractive entity summarization in Knowledge Graphs (KGs) by generating optimal summaries that integrate existing triples with inferred (absent) triples. It leverages Knowledge Graph Embeddings (KGE) and Large Language Models (LLMs) to enhance summarization quality.
ANTS generates entity summaries in natural language from Knowledge Graphs by leveraging both KGE and LLM techniques. It addresses the problem of missing information by predicting absent triples and verbalizing them into readable summaries.
To run the ANTS framework, you need to install the following packages:
- python 3.7+
- torch
- Create and activate a Conda environment:
conda create --name ants python=3.7
conda activate ants
- Download the project
git clone https://github.com/dice-group/ANTS.git
# Navigate to ANTS directory
cd ANTS
- Install required packages:
pip install torch
pip install -r requirements.txt
⚠️ Important Note: Ensure that all dependencies are correctly installed.
├── data
│   ├── ESBM-DBpedia
│   │   ├── ESSUM
│   │   │   ├── silver-standard-summaries
│   │   │   └── absent
│   │   ├── predictions
│   │   │   ├── ANTS
│   │   │   ├── baselines
│   │   │   ├── KGE
│   │   │   └── LLM
│   │   └── elist.txt
│   └── FACES
│       ├── ESSUM
│       │   ├── silver-standard-summaries
│       │   └── absent
│       ├── predictions
│       │   ├── ANTS
│       │   ├── baselines
│       │   ├── KGE
│       │   └── LLM
│       └── elist.txt
├── src
│   ├── evaluation-modules
│   ├── KGE-triples
│   ├── LLM-triples
│   ├── ranking-modules
│   └── verbalizing-modules
├── LICENSE
└── README.md
A silver-standard dataset combining entities from ESBM-DBpedia and FACES. For each entity, we extract the sentences that mention related entities from the first paragraph of its Wikipedia page. In our experiments, we created two subsets: (1) ESSUM-DBpedia: 110 entities from ESBM-DBpedia, and (2) ESSUM-FACES: 50 entities from FACES.
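For illustration, a minimal sketch of how such a silver-standard summary can be assembled, assuming the first Wikipedia paragraph and the entity mentions are already available as plain strings (the actual ESSUM construction pipeline may differ):

```python
import re
from typing import List

def silver_summary(first_paragraph: str, entity_mentions: List[str]) -> str:
    """Keep only the sentences of the first Wikipedia paragraph that
    mention at least one linked entity (illustrative only)."""
    # Naive sentence splitting; the real pipeline may use a proper tokenizer.
    sentences = re.split(r"(?<=[.!?])\s+", first_paragraph.strip())
    selected = [s for s in sentences
                if any(m.lower() in s.lower() for m in entity_mentions)]
    return " ".join(selected)

# Hypothetical example input
paragraph = ("Barack Obama is an American politician who served as the 44th "
             "president of the United States. He was born in Honolulu, Hawaii. "
             "He is a member of the Democratic Party.")
print(silver_summary(paragraph, ["United States", "Honolulu", "Democratic Party"]))
```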
Derived by randomly removing 20% of triples from ESBM-DBpedia and FACES. These omitted triples serve as ground-truth absent triples to evaluate a model's ability to infer missing facts.
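A minimal sketch of this hold-out idea, assuming an entity's triples are given as (subject, predicate, object) tuples; the split shipped with the dataset under data/ is precomputed, so this is only for illustration:

```python
import random

def split_absent(triples, absent_ratio=0.2, seed=42):
    """Hold out a random fraction of an entity's triples as ground-truth
    'absent' triples; the rest form the (incomplete) input KG."""
    rng = random.Random(seed)
    triples = list(triples)
    n_absent = max(1, int(len(triples) * absent_ratio))
    absent = rng.sample(triples, n_absent)
    remaining = [t for t in triples if t not in absent]
    return remaining, absent

# Hypothetical triples for one entity
triples = [("dbr:Berlin", "dbo:country", "dbr:Germany"),
           ("dbr:Berlin", "dbo:populationTotal", "3769495"),
           ("dbr:Berlin", "rdfs:label", "Berlin"),
           ("dbr:Berlin", "dbo:leaderTitle", "Governing Mayor"),
           ("dbr:Berlin", "dbo:areaTotal", "891.68")]
remaining, absent = split_absent(triples)
print(len(remaining), "kept,", len(absent), "held out as absent")
```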
cd src/KGE-triples
# Clone the LiteralE repository
git clone https://github.com/SmartDataAnalytics/LiteralE.git
# Navigate to the LiteralE directory and download the DBpedia dataset
cd LiteralE/data
wget https://zenodo.org/records/10991461/files/dbpedia34k.tar.gz
tar -xvf dbpedia34k.tar.gz
# back to KGE-triples folder
cd ../..
# Update LiteralE modules
bash update-LiteralE-modules.sh
# Navigate to the KGE-triples directory
cd src/KGE-triples
# Execute the script for missing triples prediction
python run_missing_triples_prediction.py --dataset dbpedia34k --system Conve_text --input_drop 0.2 --embedding_dim 100 --batch_size 1 --epochs 100 --lr 0.001 --process True
This component leverages a Large Language Model (LLM), such as GPT, for knowledge graph (KG) completion tasks, including triple classification, relation prediction, and the prediction of missing triples. As illustrated below, the ANTS approach integrates the LLM-triples component (e.g., GPT-4) to address the inherent limitations of KGE methods in inferring literal triples.
cd src/LLM-triples
# Execute the script for missing triples prediction
python run_missing_triples_prediction.py --model <gpt-model> --system gpt-4 --dataset ESSUM-DBpedia
# Execute the script for post-processing
python post_processing.py --system gpt-4 --dataset ESSUM-DBpedia
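For illustration, a minimal sketch of how an LLM can be prompted to infer missing triples for an entity. The exact prompt, model parameters, and output parsing used by run_missing_triples_prediction.py may differ, and the use of the openai client here is an assumption:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def predict_missing_triples(entity, known_triples, model="gpt-4"):
    """Ask the LLM for plausible triples that are absent from the input KG."""
    prompt = (
        "Entity: " + entity + "\n"
        "Known triples:\n"
        + "\n".join("({}, {}, {})".format(s, p, o) for s, p, o in known_triples)
        + "\n\nList additional factual triples about this entity that are missing "
          "above, one (subject, predicate, object) per line."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip().splitlines()
```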
Triples ranking utilizes the frequency of predicate occurrences within the knowledge graph, such as DBpedia: triples whose predicates occur most frequently are placed at the top of the list. Run the triples-ranking process (which includes the ranking step and entity summary generation), as shown below.
# Navigate to ranking-modules directory
cd src/ranking-modules
# Run triple-ranking and entity summary
python triples-ranking.py --kge_model conve_text --llm_model gpt-4 --combined_model conve_text_gpt-4 --dataset ESSUM-DBpedia --base_model ANTS
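A minimal sketch of the predicate-frequency idea behind this step (triples-ranking.py additionally merges the KGE and LLM predictions before building the final summary; the predicate counts below are hypothetical):

```python
from collections import Counter

def rank_triples(triples, predicate_counts, top_k=None):
    """Place triples whose predicates occur most frequently in the KG first."""
    ranked = sorted(triples, key=lambda t: predicate_counts.get(t[1], 0), reverse=True)
    return ranked[:top_k] if top_k else ranked

# Hypothetical DBpedia-wide predicate frequencies
predicate_counts = Counter({"dbo:birthPlace": 1500000,
                            "dbo:occupation": 900000,
                            "dbo:spouse": 120000})
triples = [("dbr:Ada_Lovelace", "dbo:spouse", "dbr:William_King"),
           ("dbr:Ada_Lovelace", "dbo:birthPlace", "dbr:London"),
           ("dbr:Ada_Lovelace", "dbo:occupation", "dbr:Mathematician")]
print(rank_triples(triples, predicate_counts, top_k=2))
```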
Provides automatic evaluation of verbalized summaries using multiple NLP metrics.
Requirement:
- Download the pre-trained model used to verbalize the abstractive summaries (verbalization-P2): https://zenodo.org/records/10984714
- Move the pre-trained model to the verbalizing-modules directory.
# Navigate to verbalizing-modules directory
cd src/verbalizing-modules
# Execute the script for verbalizing entity summary
python verbalization-process.py --dataset ESSUM-DBpedia --system conve_text_gpt-4 --base_model ANTS --semantic_constraints True
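Purely as a sketch, assuming the downloaded verbalization-P2 checkpoint is a Hugging Face seq2seq model; the actual loading, input linearization, and decoding logic live in verbalization-process.py and may differ:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical local path to the downloaded checkpoint
model_dir = "verbalization-P2"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

def verbalize(triples):
    """Turn (subject, predicate, object) triples into a fluent sentence."""
    # Simple linearization of the triples; the real preprocessing may differ.
    source = " && ".join("{} | {} | {}".format(s, p, o) for s, p, o in triples)
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=128, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```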
# Navigate to evaluation-modules directory
cd src/evaluation-modules
# Convert the verbalization results to the evaluation format
python converting-to-evaluation-format.py --system "conve_text_gpt-4" --dataset "ESSUM-DBpedia" --base_model "ANTS" --semantic_constraints
# Make sure you are still in the src/evaluation-modules directory
cd GenerationEval
# Execute the script to perform automatic evaluation
python eval.py -R ../../data/ESBM-DBpedia/predictions/ANTS/semantic-constraints/conve_text_gpt-4/evaluation/refs.txt -H ../../data/ESBM-DBpedia/predictions/ANTS/semantic-constraints/conve_text_gpt-4/evaluation/hyp.txt -lng en -nr 1 -m bleu,meteor,chrf++,ter,bert,bleurt
@inproceedings{ANTS2025,
  author    = {Firmansyah, Asep Fajar and Zahera, Hamada and Sherif, Mohamed Ahmed and Moussallem, Diego and Ngonga Ngomo, Axel-Cyrille},
  booktitle = {ESWC 2025},
  title     = {ANTS: Abstractive Entity Summarization in Knowledge Graphs},
  year      = {2025}
}
If you have any questions or feedback, feel free to contact us at [email protected]