VulKG

This repo contains the data and code for the paper submitted to ACM TKDD, titled "A Compact Vulnerability Knowledge Graph for Risk Assessment".

1. Repo Structure

VulKG
├── README.md 
├── import
│   ├── AffectsAddProperty.csv
│   ├── DomainNodes_Vulnerabiliy_HAS_REFERENCE_Domain_relationship.csv
│   ├── ExploitNodes.csv
│   ├── ProductNodes_VendorNodes_Vulnerability_AFFECTS_Product_BELONGS_TO_Vendor.csv
│   ├── VulnerabilityNodes.csv
│   ├── VulnerabilityNodesAddProperties.csv
│   ├── Vulnerabiliy_HAS_EXPLOIT_Exploit_relationship.csv
│   └── WeaknessNodes.csv
├── DescriptionEmbedding
│   └── VulnerabilityNodesTextEmbedding20.pkl
├── 2.VulKG_Deployment_Cypher.cypher
├── 3.3.VCBD_link_prediction.py
├── 3.4.1.Plot_F1_score_comparison.py
├── 3.4.2.Plot_ROC.py
├── 3.5.1.select_co_exploitation_links_in_training_and_test_sets.py
├── 3.5.2.CO_AFFECT_subgraph_and_topological_feature_extraction.py
└── 3.5.3.non_topological_feature_extraction.py

The import folder contains all the original data for VulKG deployment.

The DescriptionEmbedding folder contains 20-dimensional features extracted from the vulnerability descriptions with a pre-trained BERT model. This file is used by 3.5.3.non_topological_feature_extraction.py.
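A minimal sketch of how the embedding file might be inspected before it is used. The internal layout of the pickle is not documented here, so the code only reports the top-level type and length of whatever object is stored. `summarize_pickle` is a hypothetical helper and is not part of this repo. Loading may also require the libraries that were used to create the file (for example numpy or pandas) to be installed.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return (object, short description).

    No assumption is made about the stored layout: only the top-level
    type and, when available, the length are reported.
    """
    with Path(path).open("rb") as fh:
        obj = pickle.load(fh)  # only unpickle files from a source you trust
    size = len(obj) if hasattr(obj, "__len__") else None
    return obj, f"type={type(obj).__name__}, len={size}"


if __name__ == "__main__":
    # Path taken from the repo structure above; run from the repo root.
    target = Path("DescriptionEmbedding") / "VulnerabilityNodesTextEmbedding20.pkl"
    if target.exists():
        _, summary = summarize_pickle(target)
        print(summary)
    else:
        print(f"{target} not found; run this from the repo root")
```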

File 2.VulKG_Deployment_Cypher.cypher contains the Cypher code for deploying VulKG on the Neo4j graph database platform, as described in Section 2.

Files 3.3 to 3.5.3 contain the Python code for the use case, Vulnerability Co-Exploitation Behaviour Discovery (VCBD), on VulKG.

2. VulKG Deployment

This section describes how to deploy VulKG on the Neo4j graph database platform.

2.1 Programming Language

Cypher Query Language

2.2 Software/Platform

Neo4j Desktop

2.3 A step-by-step guide for VulKG Deployment

Download and install Neo4j Desktop 1.4.15 or higher.
Create a project named VulKG Project with Neo4j Desktop.
Add a local DBMS named Graph DBMS with Neo4j Desktop and set the password to Neo4j. Choose version 4.4.11 for Graph DBMS.
Start Graph DBMS.
Install the APOC and Graph Data Science Library plugins for Graph DBMS with Neo4j Desktop.
Open the settings file of Graph DBMS and add the following line:

    apoc.import.file.enabled=true

This setting avoids the following error:

    Failed to invoke procedure `apoc.periodic.iterate`: Caused by: 
    java.lang.RuntimeException: Import from files not enabled, please set 
    apoc.import.file.enabled=true in your apoc.conf

Copy all files from the repo's import folder into the import folder of Graph DBMS.
Open Graph DBMS with Neo4j Browser. Neo4j Browser ships with Neo4j Desktop, so no separate installation is required.
In the Neo4j Browser settings, check Enable multi statement query editor so that multiple Cypher statements separated by semicolons (;) can be run together.
Run the Cypher statements in the 2.VulKG_Deployment_Cypher.cypher file with Neo4j Browser to deploy VulKG.
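Before running the Cypher file, it can help to confirm that the two manual steps above (the APOC setting and the copied CSV files) are in place. The sketch below is a hedged illustration, not part of this repo: `read_settings` and `preflight` are hypothetical helpers, the locations of the settings file and of the DBMS import folder depend on your installation and must be supplied by you, and the CSV names are copied from the tree in Section 1 (adjust them if the names in your checkout differ).

```python
from pathlib import Path

# File names as listed under the repo's import folder in Section 1.
EXPECTED_CSVS = [
    "AffectsAddProperty.csv",
    "DomainNodes_Vulnerabiliy_HAS_REFERENCE_Domain_relationship.csv",
    "ExploitNodes.csv",
    "ProductNodes_VendorNodes_Vulnerability_AFFECTS_Product_BELONGS_TO_Vendor.csv",
    "VulnerabilityNodes.csv",
    "VulnerabilityNodesAddProperties.csv",
    "Vulnerabiliy_HAS_EXPLOIT_Exploit_relationship.csv",
    "WeaknessNodes.csv",
]


def read_settings(conf_path):
    """Parse key=value lines from a settings file, skipping comments."""
    settings = {}
    for line in Path(conf_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def preflight(conf_path, dbms_import_dir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if read_settings(conf_path).get("apoc.import.file.enabled") != "true":
        problems.append("apoc.import.file.enabled=true is not set")
    for name in EXPECTED_CSVS:
        if not (Path(dbms_import_dir) / name).is_file():
            problems.append(f"missing import file: {name}")
    return problems
```

Calling `preflight(conf_path, dbms_import_dir)` with the two paths from your own Graph DBMS returns an empty list when both checks pass.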

3. Use case: Vulnerability Co-Exploitation Behaviour Discovery

This section describes how to run the use case, Vulnerability Co-Exploitation Behaviour Discovery (VCBD), on VulKG.

3.1 Programming language

Python and Cypher

3.2 Library

numpy==1.22.4 (this version requires Python 3.9 or 3.10)

scikit-learn==1.1.1

matplotlib==3.5.2
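A small sketch for checking whether the installed packages match the pinned versions above. It uses only the standard library (`importlib.metadata`, available from Python 3.8); `find_mismatches` is a hypothetical helper name and is not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the library list above.
PINNED = {"numpy": "1.22.4", "scikit-learn": "1.1.1", "matplotlib": "3.5.2"}


def find_mismatches(pinned):
    """Map each package whose installed version differs from its pin
    to a tuple (installed_version_or_None, pinned_version)."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    for name, (installed, wanted) in find_mismatches(PINNED).items():
        print(f"{name}: installed={installed}, expected={wanted}")
```

The same function can be reused with any other mapping of package names to versions.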

3.3 Data

The data is provided in the GD_VCBD_Ready_to_go folder. The generation process of this subgraph dataset is described in Section 3.5.

3.3 Vulnerability Co-Exploitation Behaviour Discovery

Run the Python code in 3.3.VCBD_link_prediction.py to get the results reported in Table 7 and Table 8.

3.4 Result Visualization

Run the Python code in 3.4.1.Plot_F1_score_comparison.py to get the visualization results reported in Fig. 4.

Run the Python code in 3.4.2.Plot_ROC.py to get the visualization results reported in Fig. 5.
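For readers unfamiliar with the two quantities being plotted, the sketch below shows how an F1 score and the points of a ROC curve can be computed for binary link labels (1 = a co-exploitation link exists, 0 = it does not). This is a dependency-free illustration of the standard definitions, not the code used in the plotting scripts; the function names and the toy inputs are assumptions made for this example.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, starting at (0, 0), with one point per
    distinct score threshold taken from highest to lowest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # A link is predicted positive when its score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg if neg else 0.0, tp / pos if pos else 0.0))
    return points


# Toy example with made-up labels and scores.
labels = [1, 1, 0, 0]
print(f1_score(labels, [1, 0, 1, 0]))            # 0.5
print(roc_points(labels, [0.9, 0.4, 0.6, 0.1]))
```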

3.5 Subgraph dataset generation for VCBD task

This subsection describes how to generate the raw and ready-to-go versions of the graph datasets for the VCBD task, which are provided in the GD_VCBD_Raw and GD_VCBD_Ready_to_go folders. It is intended for readers who want the details of how the subgraph datasets are extracted from VulKG.

3.5.1 Programming language

Python and Cypher

3.5.2 Library

py2neo==2021.2.3 (pip does not offer this version; 2021.2.4 is available)

pandas==1.4.2

numpy==1.22.4

3.5.3 A step-by-step guide

Open Neo4j Desktop and start Graph DBMS.
Open the settings file of Graph DBMS, then find and change the memory settings as follows:
```
 dbms.memory.heap.initial_size=4G
 dbms.memory.heap.max_size=4G
```

These settings avoid the following error:

    py2neo.errors.ClientError: [Procedure.ProcedureCallFailed] Failed to invoke procedure 
    `gds.graph.project.cypher`: Caused by: java.lang.OutOfMemoryError: Java heap space

Run the Python code in 3.5.1.select_co_exploitation_links_in_training_and_test_sets.py to construct the head-tail pairs of the links in the training and test sets.
Run the Python code in 3.5.2.CO_AFFECT_subgraph_and_topological_feature_extraction.py to extract the CO_AFFECT subgraph and the topological features.
Run the Python code in 3.5.3.non_topological_feature_extraction.py to extract the non-topological features.
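The three scripts above depend on each other's output, so they need to run in order. A minimal sketch of a runner that stops at the first failure is shown below. `run_in_order` is a hypothetical helper and is not part of this repo; the script names are taken from the repo structure in Section 1, and the sketch assumes Graph DBMS is already running and that the command is issued from the repo root.

```python
import subprocess
import sys

# Script names as listed in the repo structure; run from the repo root.
PIPELINE = [
    "3.5.1.select_co_exploitation_links_in_training_and_test_sets.py",
    "3.5.2.CO_AFFECT_subgraph_and_topological_feature_extraction.py",
    "3.5.3.non_topological_feature_extraction.py",
]


def run_in_order(scripts, cwd=None):
    """Run each script with the current interpreter, one after another.

    Stops at the first failure: check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code. Returns the list of
    scripts that completed.
    """
    completed = []
    for script in scripts:
        subprocess.run([sys.executable, script], cwd=cwd, check=True)
        completed.append(script)
    return completed


# Example call (not executed here): run_in_order(PIPELINE)
```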

Once done, the generated GD-VCBD subgraph datasets will be saved in the corresponding folders, GD_VCBD_Raw and GD_VCBD_Ready_to_go.
