This project applies Abbott's heuristics to identify classes, class instances, attributes, and functionalities in software engineering scenarios written in natural language. It uses spaCy, NLTK (with WordNet), transformers, googletrans, and deepl for natural language processing and translation.
Initialize and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate
To install the required libraries, run the following command:
pip install -r requirements.txt
To use the DeepL translation service, also set the deepl_api_auth_key environment variable to your DeepL API key.
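As a quick check before running a DeepL translation, the following minimal sketch reads the variable and reports whether it is set. The helper name `get_deepl_key` is hypothetical, and the variable name is taken exactly as spelled above; the project's own scripts may read it differently.

```python
import os

# Variable name as written in this README; the exact spelling is an assumption.
DEEPL_KEY_VAR = "deepl_api_auth_key"


def get_deepl_key(environ=os.environ):
    """Return the DeepL API key, or None if the variable is unset or blank."""
    key = environ.get(DEEPL_KEY_VAR, "").strip()
    return key or None
```

Calling `get_deepl_key()` with no arguments inspects the current process environment; passing a plain dictionary makes the helper easy to try without touching real settings.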
The main script of the project is main.py, which takes four arguments:
-m: The spaCy model to use for token classification. The 'trf' and 'lg' models give the best results.
-i: The path to the JSON file containing the scenarios.
-o: The path to the output JSON file.
-v: The method for verb classification, either 'wordnet' or 't5'.
For example, to run the script with the 'trf' model, the 'scenarios.json' file as input, the 'output.json' file as output, and the 't5' method for verb classification, run the following command:
python main.py -m trf -i scenarios.json -o output.json -v t5
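For readers who want to wrap or script this interface, here is a hypothetical sketch of how the four flags could be declared with `argparse`. It is not taken from main.py: the long option names (`--model`, `--input`, `--output`, `--verb-method`), the help texts, and the choice to make every flag required are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the four documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Extract word categories from scenarios using Abbott's heuristics."
    )
    # Long option names below are hypothetical; only the short flags are documented.
    parser.add_argument("-m", "--model", required=True,
                        help="spaCy model to use, e.g. 'trf' or 'lg'")
    parser.add_argument("-i", "--input", required=True,
                        help="path to the JSON file containing scenarios")
    parser.add_argument("-o", "--output", required=True,
                        help="path to the output JSON file")
    parser.add_argument("-v", "--verb-method", required=True,
                        choices=["wordnet", "t5"],
                        help="method for verb classification")
    return parser
```

Parsing the example command's arguments with `build_parser().parse_args([...])` yields a namespace with `model`, `input`, `output`, and `verb_method` attributes.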
The input JSON file should contain a list of scenarios, each with an 'id' and an 'en_text' field. For example:
[
  {
    "id": "1",
    "en_text": "A user can log in to his Facebook account with an email and a password."
  },
  {
    "id": "2",
    "en_text": "A user can create a Facebook account with a name, an email, and a password."
  }
]
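Before passing a file to main.py, it can help to confirm that it matches this shape. The sketch below is one possible check and is not part of the project; the function names are hypothetical, and requiring both fields to be non-empty strings is an assumption based on the example above.

```python
import json


def validate_scenarios(scenarios):
    """Return a list of problems; an empty list means the input looks valid."""
    if not isinstance(scenarios, list):
        return ["top-level JSON value must be a list"]
    problems = []
    for index, item in enumerate(scenarios):
        if not isinstance(item, dict):
            problems.append(f"item {index}: expected an object")
            continue
        # Assumption: both fields are non-empty strings, as in the example.
        for field in ("id", "en_text"):
            value = item.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{field}'")
    return problems


def load_scenarios(path):
    """Load a scenarios file and raise ValueError if it does not match the format."""
    with open(path, encoding="utf-8") as handle:
        scenarios = json.load(handle)
    problems = validate_scenarios(scenarios)
    if problems:
        raise ValueError("; ".join(problems))
    return scenarios
```

Returning a list of problems rather than stopping at the first one makes it easier to fix a large scenario file in a single pass.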
The output JSON file will contain the same scenarios, with additional fields for the extracted words according to Abbott's heuristics. For example:
[
  {
    "id": "1",
    "en_text": "A user can log in to his Facebook account with an email and a password.",
    "proper_nouns": ["Facebook"],
    "nouns": ["user", "account", "email", "password"],
    "adjectives": [],
    "modal_verbs": ["can"],
    "possession_verbs": [],
    "categorization_verbs": [],
    "action_verbs": ["log"]
  },
  {
    "id": "2",
    "en_text": "A user can create a Facebook account with a name, an email, and a password.",
    "proper_nouns": ["Facebook"],
    "nouns": ["user", "account", "name", "email", "password"],
    "adjectives": [],
    "modal_verbs": ["can"],
    "possession_verbs": [],
    "categorization_verbs": [],
    "action_verbs": ["create"]
  }
]
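To turn these fields into design candidates, the output can be post-processed along the following lines. This is a sketch, not part of the project: the function names are hypothetical, and the mapping used (nouns as candidate classes, proper nouns as instances, adjectives as attributes, action verbs as functionalities) is the commonly cited reading of Abbott's heuristics, assumed here rather than taken from the source code.

```python
# Assumed mapping from output fields to design roles (not taken from the project).
ROLE_BY_FIELD = {
    "nouns": "classes",
    "proper_nouns": "instances",
    "adjectives": "attributes",
    "action_verbs": "functionalities",
}


def candidates_from_record(record):
    """Map one output record to candidate design elements."""
    return {role: list(record.get(field, [])) for field, role in ROLE_BY_FIELD.items()}


def merge_candidates(records):
    """Combine candidates from several records, dropping duplicates but keeping order."""
    merged = {}
    for record in records:
        for role, words in candidates_from_record(record).items():
            bucket = merged.setdefault(role, [])
            for word in words:
                if word not in bucket:
                    bucket.append(word)
    return merged
```

Applied to the two records above, `merge_candidates` collects each distinct word once per role, which gives a compact starting list for a class diagram.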
The project also includes some utility scripts, such as:
translate.py: A script to translate Greek scenarios to English, using either the Google or the DeepL translation service. The scenarios must be in a JSON file in the same format as the input file described above.
evaluate_json.py: A script to evaluate the JSON fields according to Abbott's heuristics and save the results in a text file.
For more details on how to use these scripts, please refer to their source code and comments.
This project is licensed under the MIT License - see the LICENSE file for details.