diff --git a/.gitignore b/.gitignore index 05dff405d1..9d3b5937be 100644 --- a/.gitignore +++ b/.gitignore @@ -109,6 +109,10 @@ ENV/ #vscode IDE .vscode +# Vim +*.vim +*.vimrc + #GIT .git/ diff --git a/README.md b/README.md index d539e52d3a..7624393b01 100644 --- a/README.md +++ b/README.md @@ -17,11 +17,13 @@ DeepPavlov is designed for * Contribution Guide [*docs:contribution_guide/*](http://docs.deeppavlov.ai/en/master/devguides/contribution_guide.html) * Issues [*github/issues/*](https://github.com/deepmipt/DeepPavlov/issues) * Forum [*forum.ipavlov.ai*](https://forum.ipavlov.ai/) -* Blogs [*ipavlov.ai/#rec108281800*](http://ipavlov.ai/#rec108281800) +* Blogs [*medium.com/deeppavlov*](https://medium.com/deeppavlov) * Tutorials [*examples/*](https://github.com/deepmipt/DeepPavlov/tree/master/examples) and [extended colab tutorials](https://github.com/deepmipt/dp_tutorials) * Docker Hub [*hub.docker.com/u/deeppavlov/*](https://hub.docker.com/u/deeppavlov/) * Docker Images Documentation [*docs:docker-images/*](http://docs.deeppavlov.ai/en/master/intro/installation.html#docker-images) +Please leave us [your feedback](https://forms.gle/i64fowQmiVhMMC7f9) on how we can improve the DeepPavlov framework. + **Models** [Named Entity Recognition](http://docs.deeppavlov.ai/en/master/features/models/ner.html) | [Slot filling](http://docs.deeppavlov.ai/en/master/features/models/slot_filling.html) @@ -54,6 +56,14 @@ DeepPavlov is designed for [Tuning Models with Evolutionary Algorithm](http://docs.deeppavlov.ai/en/master/features/hypersearch.html) +**Integrations** + +[REST API](http://docs.deeppavlov.ai/en/master/integrations/rest_api.html) | [Socket API](http://docs.deeppavlov.ai/en/master/integrations/socket_api.html) | [Yandex Alice](http://docs.deeppavlov.ai/en/master/integrations/yandex_alice.html) + +[Telegram](http://docs.deeppavlov.ai/en/master/integrations/telegram.html) | [Microsoft Bot Framework](http://docs.deeppavlov.ai/en/master/integrations/ms_bot.html) + +[Amazon Alexa](http://docs.deeppavlov.ai/en/master/integrations/amazon_alexa.html) | [Amazon AWS](http://docs.deeppavlov.ai/en/master/integrations/aws_ec2.html) + ## Installation 0. We support `Linux` and `Windows` platforms, `Python 3.6` and `Python 3.7` @@ -85,8 +95,8 @@ List of models is available on [the doc page](http://docs.deeppavlov.ai/en/master/features/overview.html) in the `deeppavlov.configs` (Python): -```python - from deeppavlov import configs +```python +from deeppavlov import configs ``` Once you've decided on the model (and its config file), there are two ways to train, @@ -98,8 +108,8 @@ evaluate and infer it: Before choosing an interface, install the model's package requirements (CLI): -```bash - python -m deeppavlov install <config_path> +```bash +python -m deeppavlov install <config_path> ``` * where `<config_path>` is the path to the chosen model's config file (e.g. @@ -111,8 +121,8 @@ Before choosing an interface, install the model's package requirements To get predictions from a model interactively through CLI, run -```bash - python -m deeppavlov interact <config_path> [-d] +```bash +python -m deeppavlov interact <config_path> [-d] ``` * `-d` downloads required data -- pretrained model files and embeddings @@ -121,7 +131,7 @@ To get predictions from a model interactively through CLI, run You can train it in the same simple way: ```bash - python -m deeppavlov train <config_path> [-d] +python -m deeppavlov train <config_path> [-d] ``` The dataset will be downloaded regardless of whether the `-d` flag was set. @@ -132,8 +142,8 @@ The data format is specified in the corresponding model doc page.
There are even more actions you can perform with configs: -```bash - python -m deeppavlov <action> <config_path> [-d] +```bash +python -m deeppavlov <action> <config_path> [-d] ``` * `<action>` can be @@ -157,13 +167,13 @@ There are even more actions you can perform with configs: To get predictions from a model interactively through Python, run -```python - from deeppavlov import build_model +```python +from deeppavlov import build_model - model = build_model(<config_path>, download=True) +model = build_model(<config_path>, download=True) - # get predictions for 'input_text1', 'input_text2' - model(['input_text1', 'input_text2']) +# get predictions for 'input_text1', 'input_text2' +model(['input_text1', 'input_text2']) ``` * where `download=True` downloads required data from web -- pretrained model @@ -175,10 +185,10 @@ To get predictions from a model interactively through Python, run You can train it in the same simple way: -```python - from deeppavlov import train_model +```python +from deeppavlov import train_model - model = train_model(<config_path>, download=True) +model = train_model(<config_path>, download=True) ``` * `download=True` downloads pretrained model, therefore the pretrained @@ -194,9 +204,9 @@ The data format is specified in the corresponding model doc page. You can also calculate metrics on the dataset specified in your config file: ```python - from deeppavlov import evaluate_model +from deeppavlov import evaluate_model - model = evaluate_model(<config_path>, download=True) +model = evaluate_model(<config_path>, download=True) ``` Integrations with various messengers are also available, see @@ -206,6 +216,18 @@ and others in the Integrations section for more info. ## Breaking Changes +**Breaking changes in version 0.6.0** +- [REST API](http://docs.deeppavlov.ai/en/0.6.0/integrations/rest_api.html): + - all models' default endpoints were renamed to `/model` + - by default, model argument names are taken from the `chainer.in` + [configuration parameter](http://docs.deeppavlov.ai/en/0.6.0/intro/configuration.html) instead of pre-set names + from a [settings file](http://docs.deeppavlov.ai/en/0.6.0/integrations/settings.html) + - the Swagger API endpoint moved from `/apidocs` to `/docs` +- when using `"max_proba": true` in + a [`proba2labels` component](http://docs.deeppavlov.ai/en/0.6.0/apiref/models/classifiers.html) for classification, + it will return a single label for every batch element instead of a list. One can set `"top_n": 1` + to get batches of single-item lists, as before + **Breaking changes in version 0.5.0** - dependencies have to be reinstalled for most pipeline configurations - models depending on `tensorflow` require `CUDA 10.0` to run on GPU instead of `CUDA 9.0` @@ -246,7 +268,7 @@ DeepPavlov is Apache 2.0-licensed. ## The Team -DeepPavlov is built and maintained by [Neural Networks and Deep Learning Lab](https://mipt.ru/english/research/labs/neural-networks-and-deep-learning-lab) at [MIPT](https://mipt.ru/english/) within [iPavlov](http://ipavlov.ai/) project (part of [National Technology Initiative](https://asi.ru/eng/nti/)) and in partnership with [Sberbank](http://www.sberbank.com/). +DeepPavlov is built and maintained by [Neural Networks and Deep Learning Lab](https://mipt.ru/english/research/labs/neural-networks-and-deep-learning-lab) at [MIPT](https://mipt.ru/english/) within the [iPavlov](http://ipavlov.ai/) project.
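The REST API changes listed under Breaking Changes above can be sketched from the client side. A minimal sketch, assuming a server started with `python -m deeppavlov riseapi <config_path>` on the default port and a config whose `chainer.in` is `["x"]` (the host, port, and argument name here are illustrative assumptions, not part of this changeset):

```python
import requests

# 0.6.0: every model is served at /model instead of a per-model endpoint,
# and the JSON keys must match the names listed in the config's chainer.in
response = requests.post("http://localhost:5000/model",
                         json={"x": ["Is this text insulting?"]})
print(response.json())
```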

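The `proba2labels` breaking change above can likewise be illustrated with a short sketch; the import path and the exact output types are assumptions based on the component's registry name and are not verified against this changeset:

```python
from deeppavlov.models.classifiers.proba2labels import Proba2Labels

probas = [[0.1, 0.9], [0.8, 0.2]]

# 0.6.0 with "max_proba": true -- one label id per batch element, roughly [1, 0]
print(Proba2Labels(max_proba=True)(probas))

# "top_n": 1 restores the pre-0.6.0 shape -- single-item lists, roughly [[1], [0]]
print(Proba2Labels(top_n=1)(probas))
```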
diff --git a/deeppavlov/__init__.py b/deeppavlov/__init__.py index b5f7926eed..8d259ea039 100644 --- a/deeppavlov/__init__.py +++ b/deeppavlov/__init__.py @@ -37,7 +37,7 @@ def evaluate_model(config: [str, Path, dict], download: bool = False, recursive: except ImportError: 'Assuming that requirements are not yet installed' -__version__ = '0.5.1' +__version__ = '0.6.0' __author__ = 'Neural Networks and Deep Learning lab, MIPT' __description__ = 'An open source library for building end-to-end dialog systems and training chatbots.' __keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot'] diff --git a/deeppavlov/configs/classifiers/insults_kaggle.json b/deeppavlov/configs/classifiers/insults_kaggle.json index a2d95b4f9c..a3ca0f7238 100644 --- a/deeppavlov/configs/classifiers/insults_kaggle.json +++ b/deeppavlov/configs/classifiers/insults_kaggle.json @@ -112,7 +112,7 @@ "epochs": 1000, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/insults_kaggle_bert.json b/deeppavlov/configs/classifiers/insults_kaggle_bert.json index 4755ac9f10..83974a46a1 100644 --- a/deeppavlov/configs/classifiers/insults_kaggle_bert.json +++ b/deeppavlov/configs/classifiers/insults_kaggle_bert.json @@ -97,7 +97,7 @@ "y_pred_probas" ] }, - "sets_accuracy", + "accuracy", "f1_macro" ], "validation_patience": 5, diff --git a/deeppavlov/configs/classifiers/insults_kaggle_conv_bert.json b/deeppavlov/configs/classifiers/insults_kaggle_conv_bert.json index a19bfa114e..0884166985 100644 --- a/deeppavlov/configs/classifiers/insults_kaggle_conv_bert.json +++ b/deeppavlov/configs/classifiers/insults_kaggle_conv_bert.json @@ -113,7 +113,7 @@ "y_pred_probas" ] }, - "sets_accuracy", + "accuracy", "f1_macro" ], "validation_patience": 5, diff --git a/deeppavlov/configs/classifiers/intents_sample_csv.json b/deeppavlov/configs/classifiers/intents_sample_csv.json index 9b15809ae4..b3eeee623e 100644 --- a/deeppavlov/configs/classifiers/intents_sample_csv.json +++ b/deeppavlov/configs/classifiers/intents_sample_csv.json @@ -118,7 +118,7 @@ "epochs": 100, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/intents_sample_json.json b/deeppavlov/configs/classifiers/intents_sample_json.json index e8d8034591..23f2bdd8e9 100644 --- a/deeppavlov/configs/classifiers/intents_sample_json.json +++ b/deeppavlov/configs/classifiers/intents_sample_json.json @@ -113,7 +113,7 @@ "epochs": 100, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/intents_snips.json b/deeppavlov/configs/classifiers/intents_snips.json index b64349b16f..de684dd21e 100644 --- a/deeppavlov/configs/classifiers/intents_snips.json +++ b/deeppavlov/configs/classifiers/intents_snips.json @@ -103,7 +103,7 @@ "epochs": 1000, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/intents_snips_big.json b/deeppavlov/configs/classifiers/intents_snips_big.json index 64e4363572..e113c9f86c 100644 --- a/deeppavlov/configs/classifiers/intents_snips_big.json +++ b/deeppavlov/configs/classifiers/intents_snips_big.json @@ -103,7 +103,7 @@ "epochs": 1000, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/rusentiment_bert.json 
b/deeppavlov/configs/classifiers/rusentiment_bert.json index 927178635d..4de0d36918 100644 --- a/deeppavlov/configs/classifiers/rusentiment_bert.json +++ b/deeppavlov/configs/classifiers/rusentiment_bert.json @@ -104,7 +104,7 @@ "metrics": [ "f1_weighted", "f1_macro", - "sets_accuracy", + "accuracy", { "name": "roc_auc", "inputs": [ diff --git a/deeppavlov/configs/classifiers/rusentiment_bigru_superconv.json b/deeppavlov/configs/classifiers/rusentiment_bigru_superconv.json index 2a765a4dce..4a7fbf490a 100644 --- a/deeppavlov/configs/classifiers/rusentiment_bigru_superconv.json +++ b/deeppavlov/configs/classifiers/rusentiment_bigru_superconv.json @@ -130,7 +130,7 @@ "metrics": [ "f1_weighted", "f1_macro", - "sets_accuracy", + "accuracy", { "name": "roc_auc", "inputs": ["y_onehot", "y_pred_probas"] diff --git a/deeppavlov/configs/classifiers/rusentiment_cnn.json b/deeppavlov/configs/classifiers/rusentiment_cnn.json index 90d18503bf..3b4f9e9c75 100644 --- a/deeppavlov/configs/classifiers/rusentiment_cnn.json +++ b/deeppavlov/configs/classifiers/rusentiment_cnn.json @@ -127,7 +127,7 @@ "batch_size": 64, "metrics": [ "f1_weighted", - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/rusentiment_convers_bert.json b/deeppavlov/configs/classifiers/rusentiment_convers_bert.json new file mode 100644 index 0000000000..ccc9012847 --- /dev/null +++ b/deeppavlov/configs/classifiers/rusentiment_convers_bert.json @@ -0,0 +1,154 @@ +{ + "dataset_reader": { + "class_name": "basic_classification_reader", + "x": "text", + "y": "label", + "data_path": "{DOWNLOADS_PATH}/rusentiment/", + "train": "rusentiment_random_posts.csv", + "test": "rusentiment_test.csv" + }, + "dataset_iterator": { + "class_name": "basic_classification_iterator", + "seed": 42, + "split_seed": 23, + "field_to_split": "train", + "split_fields": [ + "train", + "valid" + ], + "split_proportions": [ + 0.9, + 0.1 + ] + }, + "chainer": { + "in": [ + "x" + ], + "in_y": [ + "y" + ], + "pipe": [ + { + "class_name": "bert_preprocessor", + "vocab_file": "{DOWNLOADS_PATH}/bert_models/ru_conversational_cased_L-12_H-768_A-12/vocab.txt", + "do_lower_case": false, + "max_seq_length": 64, + "in": [ + "x" + ], + "out": [ + "bert_features" + ] + }, + { + "id": "classes_vocab", + "class_name": "simple_vocab", + "fit_on": [ + "y" + ], + "save_path": "{MODEL_PATH}/classes.dict", + "load_path": "{MODEL_PATH}/classes.dict", + "in": "y", + "out": "y_ids" + }, + { + "in": "y_ids", + "out": "y_onehot", + "class_name": "one_hotter", + "depth": "#classes_vocab.len", + "single_vector": true + }, + { + "class_name": "bert_classifier", + "n_classes": "#classes_vocab.len", + "return_probas": true, + "one_hot_labels": true, + "bert_config_file": "{DOWNLOADS_PATH}/bert_models/ru_conversational_cased_L-12_H-768_A-12/bert_config.json", + "pretrained_bert": "{DOWNLOADS_PATH}/bert_models/ru_conversational_cased_L-12_H-768_A-12/bert_model.ckpt", + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", + "keep_prob": 0.5, + "learning_rate": 1e-05, + "learning_rate_drop_patience": 5, + "learning_rate_drop_div": 2.0, + "in": [ + "bert_features" + ], + "in_y": [ + "y_onehot" + ], + "out": [ + "y_pred_probas" + ] + }, + { + "in": "y_pred_probas", + "out": "y_pred_ids", + "class_name": "proba2labels", + "max_proba": true + }, + { + "in": "y_pred_ids", + "out": "y_pred_labels", + "ref": "classes_vocab" + } + ], + "out": [ + "y_pred_labels" + ] + }, + "train": { + "batch_size": 64, + "epochs": 100, + "metrics": [ 
+ "f1_weighted", + "f1_macro", + "accuracy", + { + "name": "roc_auc", + "inputs": [ + "y_onehot", + "y_pred_probas" + ] + } + ], + "show_examples": false, + "pytest_max_batches": 2, + "validation_patience": 5, + "val_every_n_epochs": 1, + "log_every_n_epochs": 1, + "evaluation_targets": [ + "train", + "valid", + "test" + ], + "tensorboard_log_dir": "{MODEL_PATH}/" + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "MODEL_PATH": "{MODELS_PATH}/classifiers/rusentiment_convers_bert_v0/" + }, + "requirements": [ + "{DEEPPAVLOV_PATH}/requirements/tf.txt", + "{DEEPPAVLOV_PATH}/requirements/bert_dp.txt" + ], + "labels": { + "telegram_utils": "IntentModel", + "server_utils": "KerasIntentModel" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/bert/ru_conversational_cased_L-12_H-768_A-12.tar.gz", + "subdir": "{DOWNLOADS_PATH}/bert_models" + }, + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/classifiers/rusentiment_convers_bert_v0.tar.gz", + "subdir": "{MODELS_PATH}/classifiers/" + } + ] + } +} diff --git a/deeppavlov/configs/classifiers/rusentiment_elmo_twitter_cnn.json b/deeppavlov/configs/classifiers/rusentiment_elmo_twitter_cnn.json index 1b1b9b2c61..038f4104bb 100644 --- a/deeppavlov/configs/classifiers/rusentiment_elmo_twitter_cnn.json +++ b/deeppavlov/configs/classifiers/rusentiment_elmo_twitter_cnn.json @@ -135,7 +135,7 @@ "metrics": [ "f1_weighted", "f1_macro", - "sets_accuracy", + "accuracy", { "name": "roc_auc", "inputs": ["y_onehot", "y_pred_probas"] diff --git a/deeppavlov/configs/classifiers/sentiment_twitter.json b/deeppavlov/configs/classifiers/sentiment_twitter.json index 7a6f5d295a..4d8e3ce4b1 100644 --- a/deeppavlov/configs/classifiers/sentiment_twitter.json +++ b/deeppavlov/configs/classifiers/sentiment_twitter.json @@ -103,7 +103,7 @@ "epochs": 100, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/sentiment_twitter_preproc.json b/deeppavlov/configs/classifiers/sentiment_twitter_preproc.json index 1b9abffcba..cf127367c7 100644 --- a/deeppavlov/configs/classifiers/sentiment_twitter_preproc.json +++ b/deeppavlov/configs/classifiers/sentiment_twitter_preproc.json @@ -113,7 +113,7 @@ "epochs": 100, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/topic_ag_news.json b/deeppavlov/configs/classifiers/topic_ag_news.json index 047e242d9b..8def9bcace 100644 --- a/deeppavlov/configs/classifiers/topic_ag_news.json +++ b/deeppavlov/configs/classifiers/topic_ag_news.json @@ -111,7 +111,7 @@ "epochs": 100, "batch_size": 64, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/classifiers/yahoo_convers_vs_info.json b/deeppavlov/configs/classifiers/yahoo_convers_vs_info.json index 877dd64792..04c5a54d9a 100644 --- a/deeppavlov/configs/classifiers/yahoo_convers_vs_info.json +++ b/deeppavlov/configs/classifiers/yahoo_convers_vs_info.json @@ -121,7 +121,7 @@ ] }, { - "name": "sets_accuracy", + "name": "accuracy", "inputs": [ "y", "y_pred_labels" diff --git a/deeppavlov/configs/classifiers/yahoo_convers_vs_info_bert.json b/deeppavlov/configs/classifiers/yahoo_convers_vs_info_bert.json index c0136a0c7c..a33a240b9f 100644 --- a/deeppavlov/configs/classifiers/yahoo_convers_vs_info_bert.json +++ 
b/deeppavlov/configs/classifiers/yahoo_convers_vs_info_bert.json @@ -114,7 +114,7 @@ ] }, { - "name": "sets_accuracy", + "name": "accuracy", "inputs": [ "y", "y_pred_labels" diff --git a/deeppavlov/configs/evolution/evolve_intents_snips.json b/deeppavlov/configs/evolution/evolve_intents_snips.json index 29ccf85b9f..ca37789fbd 100644 --- a/deeppavlov/configs/evolution/evolve_intents_snips.json +++ b/deeppavlov/configs/evolution/evolve_intents_snips.json @@ -168,7 +168,7 @@ "discrete": true }, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/evolution/evolve_rusentiment_cnn.json b/deeppavlov/configs/evolution/evolve_rusentiment_cnn.json index d674c6874b..6ff2b47e19 100644 --- a/deeppavlov/configs/evolution/evolve_rusentiment_cnn.json +++ b/deeppavlov/configs/evolution/evolve_rusentiment_cnn.json @@ -167,7 +167,7 @@ "discrete": true }, "metrics": [ - "sets_accuracy", + "accuracy", "f1_macro", { "name": "roc_auc", diff --git a/deeppavlov/configs/go_bot/gobot_dstc2.json b/deeppavlov/configs/go_bot/gobot_dstc2.json index a56fcaef0b..d2f7707654 100644 --- a/deeppavlov/configs/go_bot/gobot_dstc2.json +++ b/deeppavlov/configs/go_bot/gobot_dstc2.json @@ -1,7 +1,7 @@ { "dataset_reader": { "class_name": "dstc2_reader", - "data_path": "{DOWNLOADS_PATH}/dstc2" + "data_path": "{DATA_PATH}" }, "dataset_iterator": { "class_name": "dialog_iterator" @@ -21,20 +21,13 @@ "id": "word_vocab", "class_name": "simple_vocab", "fit_on": ["x_tokens"], - "save_path": "{MODELS_PATH}/gobot_dstc2/word.dict", - "load_path": "{MODELS_PATH}/gobot_dstc2/word.dict" - }, - { - "id": "restaurant_database", - "class_name": "sqlite_database", - "table_name": "mytable", - "primary_keys": ["name"], - "save_path": "{DOWNLOADS_PATH}/dstc2/resto.sqlite" + "save_path": "{MODEL_PATH}/word.dict", + "load_path": "{MODEL_PATH}/word.dict" }, { "class_name": "go_bot", - "load_path": "{MODELS_PATH}/gobot_dstc2/model", - "save_path": "{MODELS_PATH}/gobot_dstc2/model", + "load_path": "{MODEL_PATH}/model", + "save_path": "{MODEL_PATH}/model", "in": ["x"], "in_y": ["y"], "out": ["y_predicted"], @@ -53,7 +46,12 @@ "word_vocab": "#word_vocab", "template_path": "{DOWNLOADS_PATH}/dstc2/dstc2-templates.txt", "template_type": "DualTemplate", - "database": "#restaurant_database", + "database": { + "class_name": "sqlite_database", + "table_name": "mytable", + "primary_keys": ["name"], + "save_path": "{DOWNLOADS_PATH}/dstc2/resto.sqlite" + }, "api_call_action": "api_call", "use_action_mask": false, "slot_filler": { @@ -99,9 +97,11 @@ "metadata": { "variables": { "ROOT_PATH": "~/.deeppavlov", + "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs", "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "DATA_PATH": "{DOWNLOADS_PATH}/dstc2", "MODELS_PATH": "{ROOT_PATH}/models", - "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs" + "MODEL_PATH": "{MODELS_PATH}/gobot_dstc2" }, "requirements": [ "{DEEPPAVLOV_PATH}/requirements/tf.txt", @@ -115,7 +115,7 @@ }, "download": [ { - "url": "http://files.deeppavlov.ai/deeppavlov_data/gobot_dstc2_v7.tar.gz", + "url": "http://files.deeppavlov.ai/deeppavlov_data/gobot_dstc2_v9.tar.gz", "subdir": "{MODELS_PATH}" }, { @@ -124,7 +124,7 @@ }, { "url": "http://files.deeppavlov.ai/datasets/dstc2_v2.tar.gz", - "subdir": "{DOWNLOADS_PATH}/dstc2" + "subdir": "{DATA_PATH}" } ] } diff --git a/deeppavlov/configs/go_bot/gobot_dstc2_best.json b/deeppavlov/configs/go_bot/gobot_dstc2_best.json index 6a27fb83bb..d0d3842694 100644 --- a/deeppavlov/configs/go_bot/gobot_dstc2_best.json +++ 
b/deeppavlov/configs/go_bot/gobot_dstc2_best.json @@ -40,7 +40,7 @@ "out": ["y_predicted"], "main": true, "debug": false, - "learning_rate": 3e-4, + "learning_rate": 3e-3, "learning_rate_drop_patience": 10, "learning_rate_drop_div": 4.0, "momentum": 0.95, @@ -51,13 +51,12 @@ "hidden_size": 128, "dense_size": 128, "attention_mechanism": { - "type": "cs_bahdanau", + "type": "general", "hidden_size": 32, - "depth": 3, "action_as_key": true, "intent_as_key": true, "max_num_tokens": 100, - "projected_align": false + "projected_align": false }, "word_vocab": "#token_vocab", "template_path": "{DOWNLOADS_PATH}/dstc2/dstc2-templates.txt", @@ -124,7 +123,7 @@ }, "download": [ { - "url": "http://files.deeppavlov.ai/deeppavlov_data/gobot_dstc2_best_v3.tar.gz", + "url": "http://files.deeppavlov.ai/deeppavlov_data/gobot_dstc2_best_v4.tar.gz", "subdir": "{MODELS_PATH}" }, { diff --git a/deeppavlov/configs/go_bot/gobot_simple_dstc2.json b/deeppavlov/configs/go_bot/gobot_simple_dstc2.json new file mode 100644 index 0000000000..01ba40b1f3 --- /dev/null +++ b/deeppavlov/configs/go_bot/gobot_simple_dstc2.json @@ -0,0 +1,131 @@ +{ + "dataset_reader": { + "class_name": "simple_dstc2_reader", + "data_path": "{DATA_PATH}" + }, + "dataset_iterator": { + "class_name": "dialog_iterator" + }, + "chainer": { + "in": ["x"], + "in_y": ["y"], + "out": ["y_predicted"], + "pipe": [ + { + "class_name": "deeppavlov.models.go_bot.wrapper:DialogComponentWrapper", + "component": { "class_name": "split_tokenizer" }, + "in": ["x"], + "out": ["x_tokens"] + }, + { + "id": "word_vocab", + "class_name": "simple_vocab", + "fit_on": ["x_tokens"], + "save_path": "{MODEL_PATH}/word.dict", + "load_path": "{MODEL_PATH}/word.dict" + }, + { + "class_name": "go_bot", + "load_path": "{MODEL_PATH}/model", + "save_path": "{MODEL_PATH}/model", + "in": ["x"], + "in_y": ["y"], + "out": ["y_predicted"], + "main": true, + "debug": false, + "learning_rate": 0.003, + "learning_rate_drop_patience": 5, + "learning_rate_drop_div": 10.0, + "momentum": 0.95, + "optimizer": "tensorflow.train:AdamOptimizer", + "clip_norm": 2.0, + "dropout_rate": 0.4, + "l2_reg_coef": 3e-4, + "hidden_size": 128, + "dense_size": 160, + "word_vocab": "#word_vocab", + "template_path": "{DATA_PATH}/simple-dstc2-templates.txt", + "template_type": "DefaultTemplate", + "database": { + "class_name": "sqlite_database", + "table_name": "mytable", + "primary_keys": ["name"], + "save_path": "{DATA_PATH}/resto.sqlite" + }, + "api_call_action": "api_call", + "use_action_mask": false, + "slot_filler": { + "config_path": "{CONFIGS_PATH}/ner/slotfill_dstc2.json" + }, + "intent_classifier": null, + "embedder": { + "class_name": "glove", + "load_path": "{DOWNLOADS_PATH}/embeddings/glove.6B.100d.txt" + }, + "bow_embedder": { + "class_name": "bow", + "depth": "#word_vocab.__len__()", + "with_counts": true + }, + "tokenizer": { + "class_name": "stream_spacy_tokenizer", + "lowercase": false + }, + "tracker": { + "class_name": "featurized_tracker", + "slot_names": ["pricerange", "this", "area", "food", "name"] + } + } + ] + }, + "train": { + "epochs": 200, + "batch_size": 8, + + "metrics": ["per_item_dialog_accuracy"], + "validation_patience": 10, + "val_every_n_batches": 15, + + "log_every_n_batches": 15, + "show_examples": false, + "evaluation_targets": [ + "valid", + "test" + ], + "class_name": "nn_trainer" + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "DATA_PATH": 
"{DOWNLOADS_PATH}/simple-dstc2", + "MODELS_PATH": "{ROOT_PATH}/models", + "MODEL_PATH": "{MODELS_PATH}/gobot_dstc2" + }, + "requirements": [ + "{DEEPPAVLOV_PATH}/requirements/tf.txt", + "{DEEPPAVLOV_PATH}/requirements/gensim.txt", + "{DEEPPAVLOV_PATH}/requirements/spacy.txt", + "{DEEPPAVLOV_PATH}/requirements/en_core_web_sm.txt" + ], + "labels": { + "telegram_utils": "GoalOrientedBot", + "server_utils": "GoalOrientedBot" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/gobot_dstc2_v9.tar.gz", + "subdir": "{MODELS_PATH}" + }, + { + "url": "http://files.deeppavlov.ai/embeddings/glove.6B.100d.txt", + "subdir": "{DOWNLOADS_PATH}/embeddings" + }, + { + "url": "http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz", + "subdir": "{DATA_PATH}" + } + ] + } +} diff --git a/deeppavlov/configs/ner/ner_dstc2.json b/deeppavlov/configs/ner/ner_dstc2.json index ca6a3790a1..9840c2fb3a 100644 --- a/deeppavlov/configs/ner/ner_dstc2.json +++ b/deeppavlov/configs/ner/ner_dstc2.json @@ -1,11 +1,11 @@ { "dataset_reader": { "class_name": "dstc2_reader", - "data_path": "{DOWNLOADS_PATH}/dstc2" + "data_path": "{DATA_PATH}" }, "dataset_iterator": { "class_name": "dstc2_ner_iterator", - "dataset_path": "{DOWNLOADS_PATH}/dstc2" + "slot_values_path": "{SLOT_VALS_PATH}" }, "chainer": { "in": ["x"], @@ -27,8 +27,8 @@ "class_name": "simple_vocab", "pad_with_zeros": true, "fit_on": ["x_lower"], - "save_path": "{MODELS_PATH}/slotfill_dstc2/word.dict", - "load_path": "{MODELS_PATH}/slotfill_dstc2/word.dict", + "save_path": "{MODEL_PATH}/word.dict", + "load_path": "{MODEL_PATH}/word.dict", "out": ["x_tok_ind"] }, { @@ -43,8 +43,8 @@ "class_name": "simple_vocab", "pad_with_zeros": true, "fit_on": ["y"], - "save_path": "{MODELS_PATH}/slotfill_dstc2/tag.dict", - "load_path": "{MODELS_PATH}/slotfill_dstc2/tag.dict", + "save_path": "{MODEL_PATH}/tag.dict", + "load_path": "{MODEL_PATH}/tag.dict", "out": ["y_ind"] }, { @@ -62,8 +62,8 @@ "n_hidden_list": [64, 64], "net_type": "cnn", "n_tags": "#tag_vocab.len", - "save_path": "{MODELS_PATH}/slotfill_dstc2/model", - "load_path": "{MODELS_PATH}/slotfill_dstc2/model", + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", "embeddings_dropout": true, "top_dropout": true, "intra_layer_dropout": false, @@ -107,8 +107,10 @@ "metadata": { "variables": { "ROOT_PATH": "~/.deeppavlov", - "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", - "MODELS_PATH": "{ROOT_PATH}/models" + "DATA_PATH": "{ROOT_PATH}/downloads/dstc2", + "SLOT_VALS_PATH": "{DATA_PATH}/dstc_slot_vals.json", + "MODELS_PATH": "{ROOT_PATH}/models", + "MODEL_PATH": "{MODELS_PATH}/slotfill_dstc2" }, "requirements": [ "{DEEPPAVLOV_PATH}/requirements/tf.txt" @@ -118,10 +120,14 @@ "server_utils": "NER" }, "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz", + "subdir": "{DATA_PATH}" + }, { "url": "http://files.deeppavlov.ai/deeppavlov_data/slotfill_dstc2.tar.gz", "subdir": "{MODELS_PATH}" } ] } -} \ No newline at end of file +} diff --git a/deeppavlov/configs/ner/slotfill_dstc2.json b/deeppavlov/configs/ner/slotfill_dstc2.json index dc82bf9b45..217f2b2bfc 100644 --- a/deeppavlov/configs/ner/slotfill_dstc2.json +++ b/deeppavlov/configs/ner/slotfill_dstc2.json @@ -1,11 +1,11 @@ { "dataset_reader": { "class_name": "dstc2_reader", - "data_path": "{DOWNLOADS_PATH}/dstc2" + "data_path": "{DATA_PATH}" }, "dataset_iterator": { "class_name": "dstc2_ner_iterator", - "dataset_path": "{DOWNLOADS_PATH}/dstc2" + "slot_values_path": "{SLOT_VALS_PATH}" }, "chainer": { 
"in": ["x"], @@ -18,7 +18,7 @@ }, { "in": ["x_tokens"], - "config_path": "{CONFIGS_PATH}/ner/ner_dstc2.json", + "config_path": "{NER_CONFIG_PATH}", "out": ["x_tokens", "tags"] }, @@ -26,8 +26,8 @@ "in": ["x_tokens", "tags"], "class_name": "dstc_slotfilling", "threshold": 0.8, - "save_path": "{MODELS_PATH}/slotfill_dstc2/dstc_slot_vals.json", - "load_path": "{MODELS_PATH}/slotfill_dstc2/dstc_slot_vals.json", + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", "out": ["slots"] } ], @@ -44,9 +44,11 @@ "metadata": { "variables": { "ROOT_PATH": "~/.deeppavlov", - "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "NER_CONFIG_PATH": "{DEEPPAVLOV_PATH}/configs/ner/ner_dstc2.json", + "DATA_PATH": "{ROOT_PATH}/downloads/dstc2", + "SLOT_VALS_PATH": "{DATA_PATH}/dstc_slot_vals.json", "MODELS_PATH": "{ROOT_PATH}/models", - "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs" + "MODEL_PATH": "{MODELS_PATH}/slotfill_dstc2" }, "requirements": [ "{DEEPPAVLOV_PATH}/requirements/tf.txt" @@ -56,6 +58,10 @@ "server_utils": "DstcSlotFillingNetwork" }, "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz", + "subdir": "{DATA_PATH}" + }, { "url": "http://files.deeppavlov.ai/deeppavlov_data/slotfill_dstc2.tar.gz", "subdir": "{MODELS_PATH}" diff --git a/deeppavlov/configs/ner/slotfill_dstc2_raw.json b/deeppavlov/configs/ner/slotfill_dstc2_raw.json index 709c40c6c3..f66293b8bc 100644 --- a/deeppavlov/configs/ner/slotfill_dstc2_raw.json +++ b/deeppavlov/configs/ner/slotfill_dstc2_raw.json @@ -1,14 +1,15 @@ { "dataset_reader": { "class_name": "dstc2_reader", - "data_path": "{DOWNLOADS_PATH}/dstc2" + "data_path": "{DATA_PATH}" }, "dataset_iterator": { "class_name": "dstc2_ner_iterator", - "dataset_path": "{DOWNLOADS_PATH}/dstc2" + "slot_values_path": "{SLOT_VALS_PATH}" }, "chainer": { "in": ["x"], + "in_y": ["y"], "pipe": [ { "in": ["x"], @@ -23,18 +24,25 @@ { "in": ["x_lower"], "class_name": "slotfill_raw", - "save_path": "{MODELS_PATH}/slotfill_dstc2/dstc_slot_vals.json", - "load_path": "{MODELS_PATH}/slotfill_dstc2/dstc_slot_vals.json", + "save_path": "{SLOT_VALS_PATH}", + "load_path": "{SLOT_VALS_PATH}", "out": ["slots"] } ], "out": ["slots"] }, + "train": { + "metrics": ["slots_accuracy"], + "evaluation_targets": [ + "valid", + "test" + ] + }, "metadata": { "variables": { "ROOT_PATH": "~/.deeppavlov", - "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", - "MODELS_PATH": "{ROOT_PATH}/models" + "DATA_PATH": "{ROOT_PATH}/downloads/dstc2", + "SLOT_VALS_PATH": "{DATA_PATH}/dstc_slot_vals.json" }, "requirements": [ "{DEEPPAVLOV_PATH}/requirements/tf.txt" @@ -44,8 +52,8 @@ }, "download": [ { - "url": "http://files.deeppavlov.ai/deeppavlov_data/slotfill_dstc2.tar.gz", - "subdir": "{MODELS_PATH}" + "url": "http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz", + "subdir": "{DATA_PATH}" } ] } diff --git a/deeppavlov/configs/ner/slotfill_simple_dstc2_raw.json b/deeppavlov/configs/ner/slotfill_simple_dstc2_raw.json new file mode 100644 index 0000000000..47f176394c --- /dev/null +++ b/deeppavlov/configs/ner/slotfill_simple_dstc2_raw.json @@ -0,0 +1,60 @@ +{ + "dataset_reader": { + "class_name": "simple_dstc2_reader", + "data_path": "{DATA_PATH}" + }, + "dataset_iterator": { + "class_name": "dstc2_ner_iterator", + "slot_values_path": "{SLOT_VALS_PATH}" + }, + "chainer": { + "in": ["x"], + "in_y": ["y"], + "pipe": [ + { + "in": ["x"], + "class_name": "lazy_tokenizer", + "out": ["x_tokens"] + }, + { + "in": ["x_tokens"], + "class_name": "str_lower", + "out": ["x_lower"] + }, + { 
+ "in": ["x_lower"], + "class_name": "slotfill_raw", + "save_path": "{SLOT_VALS_PATH}", + "load_path": "{SLOT_VALS_PATH}", + "out": ["slots"] + } + ], + "out": ["slots"] + }, + "train": { + "metrics": ["slots_accuracy"], + "evaluation_targets": [ + "valid", + "test" + ] + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DATA_PATH": "{ROOT_PATH}/downloads/simple-dstc2", + "SLOT_VALS_PATH": "{DATA_PATH}/dstc_slot_vals.json" + }, + "requirements": [ + "{DEEPPAVLOV_PATH}/requirements/tf.txt" + ], + "labels": { + "telegram_utils": "NERModel" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz", + "subdir": "{DATA_PATH}" + } + ] + } +} diff --git a/deeppavlov/configs/aiml_skill/aiml_skill.json b/deeppavlov/configs/skills/aiml_skill.json similarity index 100% rename from deeppavlov/configs/aiml_skill/aiml_skill.json rename to deeppavlov/configs/skills/aiml_skill.json diff --git a/deeppavlov/configs/skills/rasa_skill.json b/deeppavlov/configs/skills/rasa_skill.json new file mode 100644 index 0000000000..fae1521b0a --- /dev/null +++ b/deeppavlov/configs/skills/rasa_skill.json @@ -0,0 +1,43 @@ +{ + "chainer": { + "in": [ + "utterances" + ], + "out": [ + "responses_batch", + "confidences_batch" + ], + "pipe": [ + { + "class_name": "rasa_skill", + "path_to_models": "{PROJECT_ROOT}/models", + "in": [ + "utterances" + ], + "out": [ + "responses_batch", + "confidences_batch", + "output_states_batch" + ] + } + ] + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "PROJECT_ROOT": "{DOWNLOADS_PATH}/rasa_tutorial_project" + }, + "requirements": [ + "{DEEPPAVLOV_PATH}/requirements/rasa_skill.txt", + "{DEEPPAVLOV_PATH}/requirements/tf.txt" + ], + "download": [ + { + "url": "http://files.deeppavlov.ai/rasa_skill/rasa_tutorial_project.tar.gz", + "subdir": "{DOWNLOADS_PATH}" + } + ] + } +} diff --git a/deeppavlov/core/common/registry.json b/deeppavlov/core/common/registry.json index 079010b520..0d50ff6de9 100644 --- a/deeppavlov/core/common/registry.json +++ b/deeppavlov/core/common/registry.json @@ -1,4 +1,5 @@ { + "UD_pymorphy_lemmatizer": "deeppavlov.models.morpho_tagger.lemmatizer:UDPymorphyLemmatizer", "aiml_skill": "deeppavlov.skills.aiml_skill.aiml_skill:AIMLSkill", "amazon_ecommerce_reader": "deeppavlov.dataset_readers.amazon_ecommerce_reader:AmazonEcommerceReader", "api_requester": "deeppavlov.models.api_requester.api_requester:ApiRequester", @@ -6,7 +7,6 @@ "basic_classification_iterator": "deeppavlov.dataset_iterators.basic_classification_iterator:BasicClassificationDatasetIterator", "basic_classification_reader": "deeppavlov.dataset_readers.basic_classification_reader:BasicClassificationDatasetReader", "bert_classifier": "deeppavlov.models.bert.bert_classifier:BertClassifierModel", - "bert_context_add": "deeppavlov.models.preprocessors.bert_preprocessor:BertContextAdd", "bert_ner": "deeppavlov.models.bert.bert_ner:BertNerModel", "bert_ner_preprocessor": "deeppavlov.models.preprocessors.bert_preprocessor:BertNerPreprocessor", "bert_preprocessor": "deeppavlov.models.preprocessors.bert_preprocessor:BertPreprocessor", @@ -21,13 +21,13 @@ "bow": "deeppavlov.models.embedders.bow_embedder:BoWEmbedder", "capitalization_featurizer": "deeppavlov.models.preprocessors.capitalization:CapitalizationPreprocessor", "char_splitter": "deeppavlov.models.preprocessors.char_splitter:CharSplitter", + "char_splitting_lowercase_preprocessor": 
"deeppavlov.models.preprocessors.capitalization:CharSplittingLowercasePreprocessor", "conll2003_reader": "deeppavlov.dataset_readers.conll2003_reader:Conll2003DatasetReader", "cos_sim_classifier": "deeppavlov.models.classifiers.cos_sim_classifier:CosineSimilarityClassifier", "dam_nn": "deeppavlov.models.ranking.deep_attention_matching_network:DAMNetwork", "dam_nn_use_transformer": "deeppavlov.models.ranking.deep_attention_matching_network_use_transformer:DAMNetworkUSETransformer", "data_fitting_iterator": "deeppavlov.core.data.data_fitting_iterator:DataFittingIterator", "data_learning_iterator": "deeppavlov.core.data.data_learning_iterator:DataLearningIterator", - "default_vocab": "deeppavlov.core.data.vocab:DefaultVocabulary", "dialog_db_result_iterator": "deeppavlov.dataset_iterators.dialog_iterator:DialogDBResultDatasetIterator", "dialog_iterator": "deeppavlov.dataset_iterators.dialog_iterator:DialogDatasetIterator", "dialog_state": "deeppavlov.models.seq2seq_go_bot.dialog_state:DialogState", @@ -69,13 +69,10 @@ "lemmatized_output_prettifier": "deeppavlov.models.morpho_tagger.common:LemmatizedOutputPrettifier", "line_reader": "deeppavlov.dataset_readers.line_reader:LineReader", "logit_ranker": "deeppavlov.models.doc_retrieval.logit_ranker:LogitRanker", - "char_splitting_lowercase_preprocessor": "deeppavlov.models.preprocessors.capitalization:CharSplittingLowercasePreprocessor", "mask": "deeppavlov.models.preprocessors.mask:Mask", "morpho_tagger": "deeppavlov.models.morpho_tagger.morpho_tagger:MorphoTagger", "morphotagger_dataset": "deeppavlov.dataset_iterators.morphotagger_iterator:MorphoTaggerDatasetIterator", "morphotagger_dataset_reader": "deeppavlov.dataset_readers.morphotagging_dataset_reader:MorphotaggerDatasetReader", - "morphotagger_multidataset": "deeppavlov.dataset_iterators.morphotagger_iterator:MorphoTaggerMultiDatasetIterator", - "morphotagger_multidataset_reader": "deeppavlov.dataset_readers.morphotagging_dataset_reader:MorphotaggerMultiDatasetReader", "mpm_nn": "deeppavlov.models.ranking.mpm_siamese_network:MPMSiameseNetwork", "multi_squad_dataset_reader": "deeppavlov.dataset_readers.squad_dataset_reader:MultiSquadDatasetReader", "multi_squad_iterator": "deeppavlov.dataset_iterators.squad_iterator:MultiSquadIterator", @@ -105,14 +102,15 @@ "ru_sent_tokenizer": "deeppavlov.models.tokenizers.ru_sent_tokenizer:RuSentTokenizer", "ru_tokenizer": "deeppavlov.models.tokenizers.ru_tokenizer:RussianTokenizer", "russian_words_vocab": "deeppavlov.vocabs.typos:RussianWordsVocab", + "rasa_skill": "deeppavlov.skills.rasa_skill.rasa_skill:RASASkill", "sanitizer": "deeppavlov.models.preprocessors.sanitizer:Sanitizer", - "sent_label_splitter": "deeppavlov.models.morpho_tagger.misc:SentLabelSplitter", "seq2seq_go_bot": "deeppavlov.models.seq2seq_go_bot.bot:Seq2SeqGoalOrientedBot", "seq2seq_go_bot_nn": "deeppavlov.models.seq2seq_go_bot.network:Seq2SeqGoalOrientedBotNetwork", "siamese_iterator": "deeppavlov.dataset_iterators.siamese_iterator:SiameseIterator", "siamese_predictor": "deeppavlov.models.ranking.siamese_predictor:SiamesePredictor", "siamese_preprocessor": "deeppavlov.models.preprocessors.siamese_preprocessor:SiamesePreprocessor", "siamese_reader": "deeppavlov.dataset_readers.siamese_reader:SiameseReader", + "simple_dstc2_reader": "deeppavlov.dataset_readers.dstc2_reader:SimpleDSTC2DatasetReader", "simple_vocab": "deeppavlov.core.data.simple_vocab:SimpleVocabulary", "sklearn_component": "deeppavlov.models.sklearn.sklearn_component:SklearnComponent", "slotfill_raw": 
"deeppavlov.models.slotfill.slotfill_raw:SlotFillingComponent", @@ -123,6 +121,7 @@ "spelling_error_model": "deeppavlov.models.spelling_correction.brillmoore.error_model:ErrorModel", "spelling_levenshtein": "deeppavlov.models.spelling_correction.levenshtein.searcher_component:LevenshteinSearcherComponent", "split_tokenizer": "deeppavlov.models.tokenizers.split_tokenizer:SplitTokenizer", + "sq_reader": "deeppavlov.dataset_readers.sq_reader:OntonotesReader", "sqlite_database": "deeppavlov.core.data.sqlite_database:Sqlite3Database", "sqlite_iterator": "deeppavlov.dataset_iterators.sqlite_iterator:SQLiteDataIterator", "squad_ans_postprocessor": "deeppavlov.models.preprocessors.squad_preprocessor:SquadAnsPostprocessor", @@ -155,7 +154,6 @@ "ubuntu_v1_mt_reader": "deeppavlov.dataset_readers.ubuntu_v1_mt_reader:UbuntuV1MTReader", "ubuntu_v2_mt_reader": "deeppavlov.dataset_readers.ubuntu_v2_mt_reader:UbuntuV2MTReader", "ubuntu_v2_reader": "deeppavlov.dataset_readers.ubuntu_v2_reader:UbuntuV2Reader", - "UD_pymorphy_lemmatizer": "deeppavlov.models.morpho_tagger.lemmatizer:UDPymorphyLemmatizer", "wiki_sqlite_vocab": "deeppavlov.vocabs.wiki_sqlite:WikiSQLiteVocab", "wikitionary_100K_vocab": "deeppavlov.vocabs.typos:Wiki100KDictionary" } \ No newline at end of file diff --git a/deeppavlov/core/data/simple_vocab.py b/deeppavlov/core/data/simple_vocab.py index 7ccd78ab02..fde9433079 100644 --- a/deeppavlov/core/data/simple_vocab.py +++ b/deeppavlov/core/data/simple_vocab.py @@ -89,7 +89,7 @@ def __call__(self, batch, is_top=True, **kwargs): looked_up_batch = [self(sample, is_top=False) for sample in batch] else: return self[batch] - if is_top and self._pad_with_zeros and not is_str_batch(looked_up_batch): + if self._pad_with_zeros and is_top and not is_str_batch(looked_up_batch): looked_up_batch = zero_pad(looked_up_batch) return looked_up_batch diff --git a/deeppavlov/core/data/sqlite_database.py b/deeppavlov/core/data/sqlite_database.py index 21e026a4ff..4d575c69b2 100644 --- a/deeppavlov/core/data/sqlite_database.py +++ b/deeppavlov/core/data/sqlite_database.py @@ -35,17 +35,19 @@ class Sqlite3Database(Estimator): Parameters: save_path: sqlite database path. - table_name: name of the sqlite table. primary_keys: list of table primary keys' names. keys: all table keys' names. + table_name: name of the sqlite table. unknown_value: value assigned to missing item values. - **kwargs: parameters passed to parent :class:`~deeppavlov.core.models.estimator.Estimator` class. + **kwargs: parameters passed to parent + :class:`~deeppavlov.core.models.estimator.Estimator` class. 
""" - def __init__(self, save_path: str, - table_name: str, + def __init__(self, + save_path: str, primary_keys: List[str], keys: List[str] = None, + table_name: str = "mytable", unknown_value: str = 'UNK', *args, **kwargs) -> None: super().__init__(save_path=save_path, *args, **kwargs) @@ -57,14 +59,15 @@ def __init__(self, save_path: str, self.keys = keys self.unknown_value = unknown_value - self.conn = sqlite3.connect(str(self.save_path), check_same_thread=False) + self.conn = sqlite3.connect(str(self.save_path), + check_same_thread=False) self.cursor = self.conn.cursor() if self._check_if_table_exists(): - log.info("Loading database from {}.".format(self.save_path)) + log.info(f"Loading database from {self.save_path}.") if not self.keys: self.keys = self._get_keys() else: - log.info("Initializing empty database on {}.".format(self.save_path)) + log.info(f"Initializing empty database on {self.save_path}.") def __call__(self, batch: List[Dict], order_by: str = None, @@ -76,32 +79,19 @@ def __call__(self, batch: List[Dict], return [self._search(b, order_by=order_by, order=order) for b in batch] def _check_if_table_exists(self): - self.cursor.execute("SELECT name FROM sqlite_master" - " WHERE type='table'" - " AND name='{}';".format(self.tname)) + self.cursor.execute(f"SELECT name FROM sqlite_master" + f" WHERE type='table'" + f" AND name='{self.tname}';") return bool(self.cursor.fetchall()) - def _search(self, kv, order_by, order): - if not kv: - # get all table content - if order_by is not None: - self.cursor.execute("SELECT * FROM {}".format(self.tname) + - " ORDER BY {} {}".format(order_by, order)) - else: - self.cursor.execute("SELECT * FROM {}".format(self.tname)) + def _search(self, kv=None, order_by=None, order=''): + order_expr = f" ORDER BY {order_by} {order}" if order_by else '' + if kv: + keys, values = zip(*kv.items()) + where_expr = " AND ".join(f"{k}=?" for k in keys) + self.cursor.execute(f"SELECT * FROM {self.tname} WHERE {where_expr}" + order_expr, values) else: - keys = list(kv.keys()) - values = [kv[k] for k in keys] - where_expr = ' AND '.join('{}=?'.format(k) for k in keys) - if order_by is not None: - self.cursor.execute("SELECT * FROM {}".format(self.tname) + - " WHERE {}".format(where_expr) + - " ORDER BY {} {}".format(order_by, order), - values) - else: - self.cursor.execute("SELECT * FROM {}".format(self.tname) + - " WHERE {}".format(where_expr), - values) + self.cursor.execute(f"SELECT * FROM {self.tname}" + order_expr) return [self._wrap_selection(s) for s in self.cursor.fetchall()] def _wrap_selection(self, selection): @@ -110,17 +100,18 @@ def _wrap_selection(self, selection): return {f: v for f, v in zip(self.keys, selection)} def _get_keys(self): - self.cursor.execute("PRAGMA table_info({});".format(self.tname)) + self.cursor.execute(f"PRAGMA table_info({self.tname});") return [info[1] for info in self.cursor] def _get_types(self): - self.cursor.execute("PRAGMA table_info({});".format(self.tname)) + self.cursor.execute(f"PRAGMA table_info({self.tname});") return {info[1]: info[2] for info in self.cursor} def fit(self, data: List[Dict]) -> None: if not self._check_if_table_exists(): - self.keys = self.keys or list(set(k for d in data for k in d.keys())) - types = ('integer' if type(data[0][k]) == int else 'text' for k in self.keys) + self.keys = self.keys or [key for key in data[0]] + # because in the next line we assume that in the first dict there are all (!) 
necessary keys: + types = ('integer' if isinstance(data[0][k], int) else 'text' for k in self.keys) self._create_table(self.keys, types) elif not self.keys: self.keys = self._get_keys() @@ -129,13 +120,14 @@ def fit(self, data: List[Dict]) -> None: def _create_table(self, keys, types): if any(pk not in keys for pk in self.primary_keys): - raise ValueError("Primary keys must be from {}.".format(keys)) - new_types = ("{} {} primary key".format(k, t) if k in self.primary_keys else - "{} {}".format(k, t) + raise ValueError(f"Primary keys must be from {keys}.") + new_types = (f"{k} {t} primary key" + if k in self.primary_keys else f"{k} {t}" for k, t in zip(keys, types)) - self.cursor.execute("CREATE TABLE IF NOT EXISTS {} ({})" - .format(self.tname, ', '.join(new_types))) - log.info("Created table with keys {}.".format(self._get_types())) + new_types_joined = ', '.join(new_types) + self.cursor.execute(f"CREATE TABLE IF NOT EXISTS {self.tname}" + f" ({new_types_joined})") + log.info(f"Created table with keys {self._get_types()}.") def _insert_many(self, data): to_insert = {} @@ -154,8 +146,8 @@ def _insert_many(self, data): if to_insert: fformat = ','.join(['?'] * len(self.keys)) - self.cursor.executemany("INSERT into {}".format(self.tname) + - " VALUES ({})".format(fformat), + self.cursor.executemany(f"INSERT into {self.tname}" + + f" VALUES ({fformat})", to_insert.values()) if to_update: for record in to_update.values(): @@ -165,24 +157,25 @@ def _insert_many(self, data): def _get_record(self, primary_values): ffields = ', '.join(self.keys) or '*' - where_expr = ' AND '.join("{} = '{}'".format(pk, v) - for pk, v in zip(self.primary_keys, primary_values)) - fetched = self.cursor.execute("SELECT {} FROM {}".format(ffields, self.tname) + - " WHERE {}".format(where_expr)).fetchone() + where_expr = " AND ".join(f"{pk} = '{v}'" + for pk, v in zip(self.primary_keys, + primary_values)) + fetched = self.cursor.execute(f"SELECT {ffields} FROM {self.tname}" + + f" WHERE {where_expr}").fetchone() if not fetched: return None return fetched def _update_one(self, record): - set_expr = ', '.join("{} = '{}'".format(k, v) + set_expr = ', '.join(f"{k} = '{v}'" for k, v in zip(self.keys, record) if k not in self.primary_keys) - where_expr = ' AND '.join("{} = '{}'".format(k, v) + where_expr = " AND ".join(f"{k} = '{v}'" for k, v in zip(self.keys, record) if k in self.primary_keys) - self.cursor.execute("UPDATE {}".format(self.tname) + - " SET {}".format(set_expr) + - " WHERE {}".format(where_expr)) + self.cursor.execute(f"UPDATE {self.tname}" + + f" SET {set_expr}" + + f" WHERE {where_expr}") def save(self): pass diff --git a/deeppavlov/core/data/utils.py b/deeppavlov/core/data/utils.py index 0c7abda692..6efd1477c5 100644 --- a/deeppavlov/core/data/utils.py +++ b/deeppavlov/core/data/utils.py @@ -428,10 +428,9 @@ def jsonify_data(data): result[key] = jsonify_data(data[key]) elif isinstance(data, np.ndarray): result = data.tolist() - elif isinstance(data, (np.int_, np.intc, np.intp, np.int8, np.int16, np.int32, - np.int64, np.uint8, np.uint16, np.uint32, np.uint64)): + elif isinstance(data, np.integer): result = int(data) - elif isinstance(data, (np.float_, np.float16, np.float32, np.float64)): + elif isinstance(data, np.floating): result = float(data) else: result = data diff --git a/deeppavlov/core/layers/tf_layers.py b/deeppavlov/core/layers/tf_layers.py index b5b4e54c5b..7cb8298fb8 100644 --- a/deeppavlov/core/layers/tf_layers.py +++ b/deeppavlov/core/layers/tf_layers.py @@ -945,4 +945,4 @@ def 
variational_dropout(units, keep_prob, fixed_mask_dims=(1,)): noise_shape = [units_shape[n] for n in range(len(units.shape))] for dim in fixed_mask_dims: noise_shape[dim] = 1 - return tf.nn.dropout(units, keep_prob, noise_shape) + return tf.nn.dropout(units, rate=1-keep_prob, noise_shape=noise_shape) diff --git a/deeppavlov/dataset_iterators/basic_classification_iterator.py b/deeppavlov/dataset_iterators/basic_classification_iterator.py index 42cb4a5256..1168142c4d 100644 --- a/deeppavlov/dataset_iterators/basic_classification_iterator.py +++ b/deeppavlov/dataset_iterators/basic_classification_iterator.py @@ -49,12 +49,12 @@ class BasicClassificationDatasetIterator(DataLearningIterator): def __init__(self, data: dict, fields_to_merge: List[str] = None, merged_field: str = None, field_to_split: str = None, split_fields: List[str] = None, split_proportions: List[float] = None, - seed: int = None, shuffle: bool = True, split_seed: int=None, + seed: int = None, shuffle: bool = True, split_seed: int = None, stratify: bool = None, *args, **kwargs): """ Initialize dataset using data from DatasetReader, - merges and splits fields according to the given parameters + merges and splits fields according to the given parameters. """ super().__init__(data, seed=seed, shuffle=shuffle) diff --git a/deeppavlov/dataset_iterators/dstc2_ner_iterator.py b/deeppavlov/dataset_iterators/dstc2_ner_iterator.py index 10c07ab3ad..420e3f3a95 100644 --- a/deeppavlov/dataset_iterators/dstc2_ner_iterator.py +++ b/deeppavlov/dataset_iterators/dstc2_ner_iterator.py @@ -14,12 +14,12 @@ import json import logging -from typing import List, Tuple, Dict +from overrides import overrides +from typing import List, Tuple, Dict, Any from deeppavlov.core.commands.utils import expand_path from deeppavlov.core.common.registry import register from deeppavlov.core.data.data_learning_iterator import DataLearningIterator -from deeppavlov.core.data.utils import download logger = logging.getLogger(__name__) @@ -36,50 +36,60 @@ class Dstc2NerDatasetIterator(DataLearningIterator): seed: value for random seed shuffle: whether to shuffle the data """ - def __init__(self, data: Dict[str, List[Tuple]], dataset_path: str, seed: int = None, shuffle: bool = False): + def __init__(self, + data: Dict[str, List[Tuple]], + slot_values_path: str, + seed: int = None, + shuffle: bool = False): # TODO: include slot vals to dstc2.tar.gz - dataset_path = expand_path(dataset_path) / 'slot_vals.json' - self._build_slot_vals(dataset_path) - with open(dataset_path, encoding='utf8') as f: + with expand_path(slot_values_path).open(encoding='utf8') as f: self._slot_vals = json.load(f) super().__init__(data, seed, shuffle) - def preprocess(self, data_part, *args, **kwargs): - processed_data_part = list() + def preprocess(self, + data: List[Tuple[Any, Any]], + *args, **kwargs) -> List[Tuple[Any, Any]]: + processed_data = list() processed_texts = dict() - for sample in data_part: - for utterance in sample: - if 'intents' not in utterance or len(utterance['text']) < 1: - continue - text = utterance['text'] - intents = utterance.get('intents', dict()) - slots = list() - for intent in intents: + for x, y in data: + text = x['text'] + if not text.strip(): + continue + intents = [] + if 'intents' in x: + intents = x['intents'] + elif 'slots' in x: + intents = [x] + # aggregate slots from different intents + slots = list() + for intent in intents: + current_slots = intent.get('slots', []) + for slot_type, slot_val in current_slots: + if not self._slot_vals or (slot_type in 
self._slot_vals): + slots.append((slot_type, slot_val,)) + # remove duplicate pairs (text, slots) + if (text in processed_texts) and (slots in processed_texts[text]): + continue + processed_texts[text] = processed_texts.get(text, []) + [slots] - current_slots = intent.get('slots', []) - for slot_type, slot_val in current_slots: - if slot_type in self._slot_vals: - slots.append((slot_type, slot_val,)) + processed_data.append(self._add_bio_markup(text, slots)) + return processed_data - # remove duplicate pairs (text, slots) - if (text in processed_texts) and (slots in processed_texts[text]): - continue - processed_texts[text] = processed_texts.get(text, []) + [slots] - - processed_data_part.append(self._add_bio_markup(text, slots)) - return processed_data_part - - def _add_bio_markup(self, utterance, slots): + def _add_bio_markup(self, + utterance: str, + slots: List[Tuple[str, str]]) -> Tuple[List, List]: tokens = utterance.split() n_toks = len(tokens) tags = ['O' for _ in range(n_toks)] for n in range(n_toks): for slot_type, slot_val in slots: - for entity in self._slot_vals[slot_type][slot_val]: + for entity in self._slot_vals[slot_type].get(slot_val, + [slot_val]): slot_tokens = entity.split() slot_len = len(slot_tokens) - if n + slot_len <= n_toks and self._is_equal_sequences(tokens[n: n + slot_len], - slot_tokens): + if n + slot_len <= n_toks and \ + self._is_equal_sequences(tokens[n: n + slot_len], + slot_tokens): tags[n] = 'B-' + slot_type for k in range(1, slot_len): tags[n + k] = 'I-' + slot_type @@ -90,8 +100,3 @@ def _add_bio_markup(self, utterance, slots): def _is_equal_sequences(seq1, seq2): equality_list = [tok1 == tok2 for tok1, tok2 in zip(seq1, seq2)] return all(equality_list) - - @staticmethod - def _build_slot_vals(slot_vals_json_path='data/'): - url = 'http://files.deeppavlov.ai/datasets/dstc_slot_vals.json' - download(slot_vals_json_path, url) diff --git a/deeppavlov/dataset_iterators/snips_intents_iterator.py b/deeppavlov/dataset_iterators/snips_intents_iterator.py index 4f90455336..306329e762 100644 --- a/deeppavlov/dataset_iterators/snips_intents_iterator.py +++ b/deeppavlov/dataset_iterators/snips_intents_iterator.py @@ -28,5 +28,5 @@ def preprocess(self, data, *args, **kwargs): for query in data: text = ''.join(part['text'] for part in query['data']) intent = query['intent'] - result.append((text, [intent])) + result.append((text, intent)) return result diff --git a/deeppavlov/dataset_readers/basic_classification_reader.py b/deeppavlov/dataset_readers/basic_classification_reader.py index 4438f84cd1..8b33963dd5 100644 --- a/deeppavlov/dataset_readers/basic_classification_reader.py +++ b/deeppavlov/dataset_readers/basic_classification_reader.py @@ -34,7 +34,7 @@ class BasicClassificationDatasetReader(DatasetReader): @overrides def read(self, data_path: str, url: str = None, - format: str = "csv", class_sep: str = ",", + format: str = "csv", class_sep: str = None, *args, **kwargs) -> dict: """ Read dataset from data_path directory. @@ -47,7 +47,7 @@ def read(self, data_path: str, url: str = None, url: download data files if data_path not exists or empty format: extension of files. Set of Values: ``"csv", "json"`` class_sep: string separator of labels in column with labels - sep (str): delimeter for ``"csv"`` files. Default: ``","`` + sep (str): delimeter for ``"csv"`` files. 
Default: None -> only one class per sample header (int): row number to use as the column names names (array): list of column names to use orient (str): indication of expected JSON string format @@ -88,9 +88,21 @@ def read(self, data_path: str, url: str = None, x = kwargs.get("x", "text") y = kwargs.get('y', 'labels') if isinstance(x, list): - data[data_type] = [([row[x_] for x_ in x], str(row[y]).split(class_sep)) for _, row in df.iterrows()] + if class_sep is None: + # each sample is a tuple ("text", "label") + data[data_type] = [([row[x_] for x_ in x], str(row[y])) + for _, row in df.iterrows()] + else: + # each sample is a tuple ("text", ["label", "label", ...]) + data[data_type] = [([row[x_] for x_ in x], str(row[y]).split(class_sep)) + for _, row in df.iterrows()] else: - data[data_type] = [(row[x], str(row[y]).split(class_sep)) for _, row in df.iterrows()] + if class_sep is None: + # each sample is a tuple ("text", "label") + data[data_type] = [(row[x], str(row[y])) for _, row in df.iterrows()] + else: + # each sample is a tuple ("text", ["label", "label", ...]) + data[data_type] = [(row[x], str(row[y]).split(class_sep)) for _, row in df.iterrows()] else: log.warning("Cannot find {} file".format(file)) diff --git a/deeppavlov/dataset_readers/dstc2_reader.py b/deeppavlov/dataset_readers/dstc2_reader.py index e8fc16862b..187d047e52 100644 --- a/deeppavlov/dataset_readers/dstc2_reader.py +++ b/deeppavlov/dataset_readers/dstc2_reader.py @@ -70,7 +70,7 @@ class DSTC2DatasetReader(DatasetReader): @staticmethod def _data_fname(datatype): assert datatype in ('trn', 'val', 'tst'), "wrong datatype name" - return 'dstc2-{}.jsonlist'.format(datatype) + return f"dstc2-{datatype}.jsonlist" @classmethod @overrides @@ -92,7 +92,7 @@ def read(self, data_path: str, dialogs: bool = False) -> Dict[str, List]: """ required_files = (self._data_fname(dt) for dt in ('trn', 'val', 'tst')) if not all(Path(data_path, f).exists() for f in required_files): - log.info('[downloading data from {} to {}]'.format(self.url, data_path)) + log.info(f"[downloading data from {self.url} to {data_path}]") download_decompress(self.url, data_path) mark_done(data_path) @@ -109,7 +109,7 @@ def read(self, data_path: str, dialogs: bool = False) -> Dict[str, List]: @classmethod def _read_from_file(cls, file_path, dialogs=False): """Returns data from single file""" - log.info("[loading dialogs from {}]".format(file_path)) + log.info(f"[loading dialogs from {file_path}]") utterances, responses, dialog_indices =\ cls._get_turns(cls._iter_file(file_path), with_indices=True) @@ -122,14 +122,15 @@ def _read_from_file(cls, file_path, dialogs=False): @staticmethod def _format_turn(turn): - x = {'text': turn[0]['text'], - 'intents': turn[0]['dialog_acts']} - if turn[0].get('db_result') is not None: - x['db_result'] = turn[0]['db_result'] - if turn[0].get('episode_done'): + turn_x, turn_y = turn + x = {'text': turn_x['text'], + 'intents': turn_x['dialog_acts']} + if turn_x.get('db_result') is not None: + x['db_result'] = turn_x['db_result'] + if turn_x.get('episode_done'): x['episode_done'] = True - y = {'text': turn[1]['text'], - 'act': turn[1]['dialog_acts'][0]['act']} + y = {'text': turn_y['text'], + 'act': turn_y['dialog_acts'][0]['act']} return (x, y) @staticmethod @@ -180,9 +181,9 @@ def _get_turns(data, with_indices=False): else: new_turn = copy.deepcopy(utterances[-1]) if 'db_result' not in responses[-1]: - raise RuntimeError("Every api_call action should have" - " db_result, turn = {}" - .format(responses[-1])) + raise 
RuntimeError(f"Every api_call action" + f" should have db_result," + f" turn = {responses[-1]}") new_turn['db_result'] = responses[-1].pop('db_result') utterances.append(new_turn) responses.append(turn) @@ -198,3 +199,164 @@ def _get_turns(data, with_indices=False): if with_indices: return utterances, responses, dialog_indices return utterances, responses + + +@register('simple_dstc2_reader') +class SimpleDSTC2DatasetReader(DatasetReader): + """ + Contains labelled dialogs from Dialog State Tracking Challenge 2 + (http://camdial.org/~mh521/dstc/). + + There've been made the following modifications to the original dataset: + + 1. added api calls to restaurant database + + - example: ``{"text": "api_call area=\"south\" food=\"dontcare\" + pricerange=\"cheap\"", "dialog_acts": ["api_call"]}``. + + 2. new actions + + - bot dialog actions were concatenated into one action + (example: ``{"dialog_acts": ["ask", "request"]}`` -> + ``{"dialog_acts": ["ask_request"]}``) + + - if a slot key was associated with the dialog action, the new act + was a concatenation of an act and a slot key (example: + ``{"dialog_acts": ["ask"], "slot_vals": ["area"]}`` -> + ``{"dialog_acts": ["ask_area"]}``) + + 3. new train/dev/test split + + - original dstc2 consisted of three different MDP policies, the original + train and dev datasets (consisting of two policies) were merged and + randomly split into train/dev/test + + 4. minor fixes + + - fixed several dialogs, where actions were wrongly annotated + - uppercased first letter of bot responses + - unified punctuation for bot responses + """ + + url = 'http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz' + + @staticmethod + def _data_fname(datatype): + assert datatype in ('trn', 'val', 'tst'), "wrong datatype name" + return f"simple-dstc2-{datatype}.json" + + @classmethod + @overrides + def read(self, data_path: str, dialogs: bool = False) -> Dict[str, List]: + """ + Downloads ``'simple_dstc2.tar.gz'`` archive from internet, + decompresses and saves files to ``data_path``. + + Parameters: + data_path: path to save DSTC2 dataset + dialogs: flag which indicates whether to output list of turns or + list of dialogs + + Returns: + dictionary that contains ``'train'`` field with dialogs from + ``'simple-dstc2-trn.json'``, ``'valid'`` field with dialogs + from ``'simple-dstc2-val.json'`` and ``'test'`` field with + dialogs from ``'simple-dstc2-tst.json'``. + Each field is a list of tuples ``(user turn, system turn)``. 
+ """ + required_files = (self._data_fname(dt) for dt in ('trn', 'val', 'tst')) + if not all(Path(data_path, f).exists() for f in required_files): + log.info(f"{[Path(data_path, f) for f in required_files]}]") + log.info(f"[downloading data from {self.url} to {data_path}]") + download_decompress(self.url, data_path) + mark_done(data_path) + + data = { + 'train': self._read_from_file( + Path(data_path, self._data_fname('trn')), dialogs), + 'valid': self._read_from_file( + Path(data_path, self._data_fname('val')), dialogs), + 'test': self._read_from_file( + Path(data_path, self._data_fname('tst')), dialogs) + } + log.info(f"There are {len(data['train'])} samples in train split.") + log.info(f"There are {len(data['valid'])} samples in valid split.") + log.info(f"There are {len(data['test'])} samples in test split.") + return data + + @classmethod + def _read_from_file(cls, file_path: str, dialogs: bool = False): + """Returns data from single file""" + log.info(f"[loading dialogs from {file_path}]") + + utterances, responses, dialog_indices =\ + cls._get_turns(json.load(open(file_path, 'rt')), with_indices=True) + + data = list(map(cls._format_turn, zip(utterances, responses))) + + if dialogs: + return [data[idx['start']:idx['end']] for idx in dialog_indices] + return data + + @staticmethod + def _format_turn(turn): + turn_x, turn_y = turn + x = {'text': turn_x['text']} + y = {'text': turn_y['text'], + 'act': turn_y['act']} + if 'act' in turn_x: + x['intents'] = turn_x['act'] + if 'episode_done' in turn_x: + x['episode_done'] = turn_x['episode_done'] + if turn_x.get('db_result') is not None: + x['db_result'] = turn_x['db_result'] + if turn_x.get('slots'): + x['slots'] = turn_x['slots'] + if turn_y.get('slots'): + y['slots'] = turn_y['slots'] + return (x, y) + + @staticmethod + def _get_turns(data, with_indices=False): + n = 0 + utterances, responses, dialog_indices = [], [], [] + for dialog in data: + cur_n_utter, cur_n_resp = 0, 0 + for i, turn in enumerate(dialog): + speaker = turn.pop('speaker') + if speaker == 1: + if i == 0: + turn['episode_done'] = True + utterances.append(turn) + cur_n_utter += 1 + elif speaker == 2: + responses.append(turn) + cur_n_resp += 1 + if cur_n_utter not in range(cur_n_resp - 2, cur_n_resp + 1): + raise RuntimeError("Datafile has wrong format.") + if cur_n_utter != cur_n_resp: + if i == 0: + new_utter = { + "text": "", + "episode_done": True + } + else: + new_utter = copy.deepcopy(utterances[-1]) + if 'db_result' not in responses[-2]: + raise RuntimeError("Every api_call action" + " should have db_result") + db_result = responses[-2].pop('db_result') + new_utter['db_result'] = db_result + utterances.append(new_utter) + cur_n_utter += 1 + if cur_n_utter != cur_n_resp: + raise RuntimeError("Datafile has wrong format.") + n += cur_n_utter + dialog_indices.append({ + 'start': n - cur_n_utter, + 'end': n, + }) + + if with_indices: + return utterances, responses, dialog_indices + return utterances, responses diff --git a/deeppavlov/deep.py b/deeppavlov/deep.py index 4dd73302ea..3439935343 100644 --- a/deeppavlov/deep.py +++ b/deeppavlov/deep.py @@ -27,6 +27,7 @@ from deeppavlov.utils.ms_bot_framework.server import run_ms_bf_default_agent from deeppavlov.utils.pip_wrapper import install_from_config from deeppavlov.utils.server.server import start_model_server +from deeppavlov.utils.socket.socket import start_socket_server from deeppavlov.utils.telegram.telegram_ui import interact_model_by_telegram log = getLogger(__name__) @@ -35,7 +36,7 @@ parser.add_argument("mode", 
help="select a mode, train or interact", type=str, choices={'train', 'evaluate', 'interact', 'predict', 'interactbot', 'interactmsbot', - 'alexa', 'riseapi', 'download', 'install', 'crossval'}) + 'alexa', 'riseapi', 'risesocket', 'download', 'install', 'crossval'}) parser.add_argument("config_path", help="path to a pipeline json config", type=str) parser.add_argument("-e", "--start-epoch-num", dest="start_epoch_num", default=None, @@ -60,7 +61,9 @@ parser.add_argument("--key", default=None, help="ssl key", type=str) parser.add_argument("--cert", default=None, help="ssl certificate", type=str) -parser.add_argument("-p", "--port", default=None, help="api port", type=str) +parser.add_argument("-p", "--port", default=None, help="api port", type=int) +parser.add_argument("--socket-type", default='TCP', type=str, choices={"TCP", "UNIX"}) +parser.add_argument("--socket-file", default="/tmp/deeppavlov_socket.s", type=str) parser.add_argument("--api-mode", help="rest api mode: 'basic' with batches or 'alice' for Yandex.Dialogs format", type=str, default='basic', choices={'basic', 'alice'}) @@ -120,6 +123,8 @@ def main(): start_alice_server(pipeline_config_path, https, ssl_key, ssl_cert, port=args.port) else: start_model_server(pipeline_config_path, https, ssl_key, ssl_cert, port=args.port) + elif args.mode == 'risesocket': + start_socket_server(pipeline_config_path, args.socket_type, port=args.port, socket_file=args.socket_file) elif args.mode == 'predict': predict_on_stream(pipeline_config_path, args.batch_size, args.file_path) elif args.mode == 'install': diff --git a/deeppavlov/download.py b/deeppavlov/download.py index 7ae7b73846..6b4f48723a 100644 --- a/deeppavlov/download.py +++ b/deeppavlov/download.py @@ -109,8 +109,9 @@ def check_md5(url: str, dest_paths: List[Path]) -> bool: return True -def download_resource(url: str, dest_paths: Iterable[Path]) -> None: - dest_paths = list(dest_paths) +def download_resource(url: str, dest_paths: Iterable[Union[Path, str]]) \ + -> None: + dest_paths = [Path(dest) for dest in dest_paths] if check_md5(url, dest_paths): log.info(f'Skipped {url} download because of matching hashes') diff --git a/deeppavlov/models/classifiers/keras_classification_model.py b/deeppavlov/models/classifiers/keras_classification_model.py index f2b0c6a2a9..bcde5df3d3 100644 --- a/deeppavlov/models/classifiers/keras_classification_model.py +++ b/deeppavlov/models/classifiers/keras_classification_model.py @@ -190,7 +190,7 @@ def train_on_batch(self, texts: List[List[np.ndarray]], labels: list) -> Union[f """ features = self.check_input(texts) - metrics_values = self.model.train_on_batch(features, np.squeeze(np.array(labels))) + metrics_values = self.model.train_on_batch(features, np.array(labels)) return metrics_values def infer_on_batch(self, texts: List[List[np.ndarray]], labels: list = None) -> \ @@ -209,7 +209,7 @@ def infer_on_batch(self, texts: List[List[np.ndarray]], labels: list = None) -> features = self.check_input(texts) if labels: - metrics_values = self.model.test_on_batch(features, np.squeeze(np.array(labels))) + metrics_values = self.model.test_on_batch(features, np.array(labels)) return metrics_values else: predictions = self.model.predict(features) diff --git a/deeppavlov/models/classifiers/proba2labels.py b/deeppavlov/models/classifiers/proba2labels.py index 5cb2dfe659..29ab96ebc5 100644 --- a/deeppavlov/models/classifiers/proba2labels.py +++ b/deeppavlov/models/classifiers/proba2labels.py @@ -33,12 +33,12 @@ class Proba2Labels(Component): Args: max_proba: whether 
to choose label with maximal probability
-        confident_threshold: boundary probability value for smaple to belong with the class (best use for multi-label)
+        confident_threshold: boundary probability value for a sample to belong to the class (best used for multi-label)
         top_n: how many top labels with the highest probabilities to return
 
     Attributes:
         max_proba: whether to choose label with maximal probability
-        confident_threshold: boundary probability value for smaple to belong with the class (best use for multi-label)
+        confident_threshold: boundary probability value for a sample to belong to the class (best used for multi-label)
         top_n: how many top labels with the highest probabilities to return
     """
 
@@ -54,14 +54,12 @@ def __init__(self,
         self.top_n = top_n
 
     def __call__(self,
                  data: Union[np.ndarray, List[List[float]], List[List[int]]],
-                 *args, **kwargs) -> Union[List[List[str]], List[str]]:
+                 *args, **kwargs) -> Union[List[List[int]], List[int]]:
         """
         Process probabilities to labels
 
         Args:
             data: list of vectors with probability distribution
-            *args:
-            **kwargs:
 
         Returns:
             list of labels (only label classification)
             or list of lists of labels (multi-label classification)
@@ -70,7 +68,7 @@ def __call__(self, data: Union[np.ndarray, List[List[float]], List[List[int]]],
             return [list(np.where(np.array(d) > self.confident_threshold)[0])
                     for d in data]
         elif self.max_proba:
-            return [[np.argmax(d)] for d in data]
+            return [np.argmax(d) for d in data]
         elif self.top_n:
             return [np.argsort(d)[::-1][:self.top_n] for d in data]
         else:
diff --git a/deeppavlov/models/go_bot/network.py b/deeppavlov/models/go_bot/network.py
index a5e713fb47..e7c304f95d 100644
--- a/deeppavlov/models/go_bot/network.py
+++ b/deeppavlov/models/go_bot/network.py
@@ -15,8 +15,9 @@
 import collections
 import json
 import re
+import copy
 from logging import getLogger
-from typing import Dict, Any
+from typing import Dict, Any, List, Optional, Union, Tuple
 
 import numpy as np
 import tensorflow as tf
@@ -38,9 +39,9 @@
 @register("go_bot")
 class GoalOrientedBot(LRScheduledTFModel):
     """
-    The dialogue bot is based on https://arxiv.org/abs/1702.03274, which introduces
-    Hybrid Code Networks that combine an RNN with domain-specific knowledge
-    and system action templates.
+    The dialogue bot is based on https://arxiv.org/abs/1702.03274, which
+    introduces Hybrid Code Networks that combine an RNN with domain-specific
+    knowledge and system action templates.
 
     The network handles dialogue policy management.
     Inputs features of an utterance and predicts label of a bot action
@@ -91,8 +92,8 @@ class GoalOrientedBot(LRScheduledTFModel):
     slot_filler: component that outputs slot values for a given utterance
         (:class:`~deeppavlov.models.slotfill.slotfill.DstcSlotFillingNetwork`
         recommended).
-    intent_classifier: component that outputs intents probability distribution
-        for a given utterance (
+    intent_classifier: component that outputs intents probability
+        distribution for a given utterance (
        :class:`~deeppavlov.models.classifiers.keras_classification_model.KerasClassificationModel`
        recommended). 
database: database that will be used during inference to perform
@@ -132,21 +133,21 @@ def __init__(self,
                  slot_filler: Component = None,
                  intent_classifier: Component = None,
                  database: Component = None,
-                 api_call_action: str = None,  # TODO: make it unrequired
+                 api_call_action: str = None,
                  use_action_mask: bool = False,
                  debug: bool = False,
-                 **kwargs):
+                 **kwargs) -> None:
         if any(p in network_parameters for p in self.DEPRECATED):
             log.warning(f"parameters {self.DEPRECATED} are deprecated,"
-                        " for learning rate schedule documentation see"
-                        " deeppavlov.core.models.lr_scheduled_tf_model"
-                        " or read gitub tutorial on super convergence.")
+                        f" for learning rate schedule documentation see"
+                        f" deeppavlov.core.models.lr_scheduled_tf_model"
+                        f" or read github tutorial on super convergence.")
         if 'learning_rate' in network_parameters:
             kwargs['learning_rate'] = network_parameters.pop('learning_rate')
         super().__init__(load_path=load_path, save_path=save_path, **kwargs)
 
         self.tokenizer = tokenizer
-        self.tracker = tracker
+        self.default_tracker = tracker
         self.bow_embedder = bow_embedder
         self.embedder = embedder
         self.slot_filler = slot_filler
@@ -157,13 +158,13 @@ def __init__(self,
         template_path = expand_path(template_path)
         template_type = getattr(templ, template_type)
-        log.info("[loading templates from {}]".format(template_path))
+        log.info(f"[loading templates from {template_path}]")
         self.templates = templ.Templates(template_type).load(template_path)
         self.n_actions = len(self.templates)
-        log.info("{} templates loaded".format(self.n_actions))
+        log.info(f"{self.n_actions} templates loaded.")
 
         self.database = database
-        self.api_call_id = None
+        self.api_call_id = -1
         if api_call_action is not None:
             self.api_call_id = self.templates.actions.index(api_call_action)
@@ -185,14 +186,21 @@ def __init__(self,
         new_network_parameters.update(network_parameters)
         self._init_network(**new_network_parameters)
 
+        self.states = {}
         self.reset()
 
-    def _init_network(self, hidden_size, action_size, obs_size, dropout_rate,
-                      l2_reg_coef, dense_size, attn):
+    def _init_network(self,
+                      hidden_size: int,
+                      action_size: int,
+                      obs_size: int,
+                      dropout_rate: float,
+                      l2_reg_coef: float,
+                      dense_size: int,
+                      attn: dict) -> None:
         # initialize network
         dense_size = dense_size or hidden_size
         if obs_size is None:
-            obs_size = 6 + self.tracker.num_features + self.n_actions
+            obs_size = 6 + self.default_tracker.num_features + self.n_actions
             if callable(self.bow_embedder):
                 obs_size += len(self.word_vocab)
             if callable(self.embedder):
@@ -237,17 +245,14 @@ def _init_network(self, hidden_size, action_size, obs_size, dropout_rate,
         self.sess.run(tf.global_variables_initializer())
 
         if tf.train.checkpoint_exists(str(self.load_path.resolve())):
-            log.info("[initializing `{}` from saved]".format(self.__class__.__name__))
+            log.info(f"[initializing `{self.__class__.__name__}` from saved]")
             self.load()
         else:
-            log.info("[initializing `{}` from scratch]".format(self.__class__.__name__))
-
-    def _encode_context(self, context, db_result=None):
-        # tokenize input
-        tokens = self.tokenizer([context.lower().strip()])[0]
-        if self.debug:
-            log.debug("Tokenized text= `{}`".format(' '.join(tokens)))
+            log.info(f"[initializing `{self.__class__.__name__}` from scratch]")
 
+    def _encode_context(self,
+                        tokens: List[str],
+                        state: dict) -> List[np.ndarray]:
         # Bag of words features
         bow_features = []
         if callable(self.bow_embedder):
@@ -282,128 +287,126 @@ def _encode_context(self, context, db_result=None):
         # Intent features
         intent_features = []
         if 
callable(self.intent_classifier):
-            intent_features = self.intent_classifier([context])[1][0]
+            intent_features = self.intent_classifier([' '.join(tokens)])[1][0]
             if self.debug:
                 intent = self.intents[np.argmax(intent_features[0])]
-                log.debug("Predicted intent = `{}`".format(intent))
+                log.debug(f"Predicted intent = `{intent}`")
 
         attn_key = np.array([], dtype=np.float32)
         if self.attn:
             if self.attn.action_as_key:
-                attn_key = np.hstack((attn_key, self.prev_action))
+                attn_key = np.hstack((attn_key, state['prev_action']))
             if self.attn.intent_as_key:
                 attn_key = np.hstack((attn_key, intent_features))
             if len(attn_key) == 0:
                 attn_key = np.array([1], dtype=np.float32)
 
-        # Text entity features
-        if callable(self.slot_filler):
-            self.tracker.update_state(self.slot_filler([tokens])[0])
-            if self.debug:
-                log.debug("Slot vals: {}".format(self.slot_filler([tokens])))
-
-        state_features = self.tracker.get_features()
+        state_features = state['tracker'].get_features()
 
         # Other features
         result_matches_state = 0.
-        if self.db_result is not None:
-            result_matches_state = all(v == self.db_result.get(s)
-                                       for s, v in self.tracker.get_state().items()
+        if state['db_result'] is not None:
+            matching_items = state['tracker'].get_state().items()
+            result_matches_state = all(v == state['db_result'].get(s)
+                                       for s, v in matching_items
                                        if v != 'dontcare') * 1.
-        context_features = np.array([bool(db_result) * 1.,
-                                     (db_result == {}) * 1.,
-                                     (self.db_result is None) * 1.,
-                                     bool(self.db_result) * 1.,
-                                     (self.db_result == {}) * 1.,
+        context_features = np.array([bool(state['current_db_result']) * 1.,
+                                     (state['current_db_result'] == {}) * 1.,
+                                     (state['db_result'] is None) * 1.,
+                                     bool(state['db_result']) * 1.,
+                                     (state['db_result'] == {}) * 1.,
                                      result_matches_state],
                                     dtype=np.float32)
 
         if self.debug:
-            log.debug("Context features = {}".format(context_features))
-            debug_msg = "num bow features = {}, ".format(len(bow_features)) +\
-                        "num emb features = {}, ".format(len(emb_features)) +\
-                        "num intent features = {}, ".format(len(intent_features)) +\
-                        "num state features = {}, ".format(len(state_features)) +\
-                        "num context features = {}, ".format(len(context_features)) +\
-                        "prev_action shape = {}".format(len(self.prev_action))
+            log.debug(f"Context features = {context_features}")
+            debug_msg = f"num bow features = {len(bow_features)}" +\
+                        f", num emb features = {len(emb_features)}" +\
+                        f", num intent features = {len(intent_features)}" +\
+                        f", num state features = {len(state_features)}" +\
+                        f", num context features = {len(context_features)}" +\
+                        f", prev_action shape = {len(state['prev_action'])}"
             log.debug(debug_msg)
 
         concat_feats = np.hstack((bow_features, emb_features, intent_features,
-                                  state_features, context_features, self.prev_action))
+                                  state_features, context_features,
+                                  state['prev_action']))
         return concat_feats, emb_context, attn_key
 
-    def _encode_response(self, act):
+    def _encode_response(self, act: str) -> int:
         return self.templates.actions.index(act)
 
-    def _decode_response(self, action_id):
+    def _decode_response(self, action_id: int, state: dict) -> str:
         """
         Convert action template id and entities from tracker to final response. 
""" template = self.templates.templates[int(action_id)] - slots = self.tracker.get_state() - if self.db_result is not None: - for k, v in self.db_result.items(): + slots = state['tracker'].get_state() + if state['db_result'] is not None: + for k, v in state['db_result'].items(): slots[k] = str(v) resp = template.generate_text(slots) # in api calls replace unknown slots to "dontcare" - if (self.templates.ttype is templ.DualTemplate) and\ - (action_id == self.api_call_id): + if action_id == self.api_call_id: resp = re.sub("#([A-Za-z]+)", "dontcare", resp).lower() - if self.debug: - log.debug("Pred response = {}".format(resp)) return resp - def calc_action_mask(self, previous_action): + def calc_action_mask(self, state: dict) -> np.ndarray: mask = np.ones(self.n_actions, dtype=np.float32) if self.use_action_mask: - known_entities = {**self.tracker.get_state(), **(self.db_result or {})} + known_entities = {**state['tracker'].get_state(), + **(state['db_result'] or {})} for a_id in range(self.n_actions): tmpl = str(self.templates.templates[a_id]) for entity in set(re.findall('#([A-Za-z]+)', tmpl)): if entity not in known_entities: mask[a_id] = 0. # forbid two api calls in a row - if np.any(previous_action): - prev_act_id = np.argmax(previous_action) + if np.any(state['prev_action']): + prev_act_id = np.argmax(state['prev_action']) if prev_act_id == self.api_call_id: mask[prev_act_id] = 0. return mask - def prepare_data(self, x, y): + def prepare_data(self, x: List[dict], y: List[dict]) -> List[np.ndarray]: b_features, b_u_masks, b_a_masks, b_actions = [], [], [], [] b_emb_context, b_keys = [], [] # for attention max_num_utter = max(len(d_contexts) for d_contexts in x) for d_contexts, d_responses in zip(x, y): - self.reset() - if self.debug: - preds = self._infer_dialog(d_contexts) + state = self._zero_state() d_features, d_a_masks, d_actions = [], [], [] d_emb_context, d_key = [], [] # for attention for context, response in zip(d_contexts, d_responses): - if context.get('db_result') is not None: - self.db_result = context['db_result'] - features, emb_context, key = \ - self._encode_context(context['text'], context.get('db_result')) + tokens = self.tokenizer([context['text'].lower().strip()])[0] + + # update state + state['current_db_result'] = context.get('db_result', None) + if state['current_db_result'] is not None: + state['db_result'] = state['current_db_result'] + if callable(self.slot_filler): + context_slots = self.slot_filler([tokens])[0] + state['tracker'].update_state(context_slots) + + features, emb_context, key = self._encode_context(tokens, + state=state) d_features.append(features) d_emb_context.append(emb_context) d_key.append(key) - d_a_masks.append(self.calc_action_mask(self.prev_action)) + d_a_masks.append(self.calc_action_mask(state)) action_id = self._encode_response(response['act']) d_actions.append(action_id) - # previous action is teacher-forced here - self.prev_action *= 0. - self.prev_action[action_id] = 1. + # update state + # - previous action is teacher-forced here + state['prev_action'] *= 0. + state['prev_action'][action_id] = 1. 
if self.debug: - log.debug("True response = `{}`".format(response['text'])) - if preds[0].lower() != response['text'].lower(): - log.debug("Pred response = `{}`".format(preds[0])) - preds = preds[1:] + log.debug(f"True response = '{response['text']}'.") if d_a_masks[-1][action_id] != 1.: log.warn("True action forbidden by action mask.") @@ -424,41 +427,48 @@ def prepare_data(self, x, y): b_actions.append(d_actions) return b_features, b_emb_context, b_keys, b_u_masks, b_a_masks, b_actions - def train_on_batch(self, x, y): + def train_on_batch(self, x: List[dict], y: List[dict]) -> dict: return self.network_train_on_batch(*self.prepare_data(x, y)) - def _infer(self, context, db_result=None, prob=False): - if db_result is not None: - self.db_result = db_result - features, emb_context, key = self._encode_context(context, db_result) - action_mask = self.calc_action_mask(self.prev_action) - probs = self.network_call([[features]], [[emb_context]], [[key]], - [[action_mask]], prob=True) - pred_id = np.argmax(probs) - - # one-hot encoding seems to work better then probabilities - if prob: - self.prev_action = probs - else: - self.prev_action *= 0 - self.prev_action[pred_id] = 1 - - return self._decode_response(pred_id) - - def _infer_dialog(self, contexts): - self.reset() + def _infer(self, tokens: List[str], state: dict) -> List: + features, emb_context, key = self._encode_context(tokens, state=state) + action_mask = self.calc_action_mask(state) + probs, state_c, state_h = \ + self.network_call([[features]], [[emb_context]], [[key]], + [[action_mask]], [[state['network_state'][0]]], + [[state['network_state'][1]]], + prob=True) + return probs, np.argmax(probs), (state_c, state_h) + + def _infer_dialog(self, contexts: List[dict]) -> List[str]: res = [] + state = self._zero_state() for context in contexts: if context.get('prev_resp_act') is not None: - action_id = self._encode_response(context.get('prev_resp_act')) + prev_act_id = self._encode_response(context['prev_resp_act']) # previous action is teacher-forced - self.prev_action *= 0. - self.prev_action[action_id] = 1. - - res.append(self._infer(context['text'], db_result=context.get('db_result'))) + state['prev_action'] *= 0. + state['prev_action'][prev_act_id] = 1. + + state['current_db_result'] = context.get('db_result') + if state['current_db_result'] is not None: + state['db_result'] = state['current_db_result'] + + tokens = self.tokenizer([context['text'].lower().strip()])[0] + if callable(self.slot_filler): + utter_slots = self.slot_filler([tokens])[0] + state['tracker'].update_state(utter_slots) + _, pred_act_id, state['network_state'] = \ + self._infer(tokens, state=state) + state['prev_action'] *= 0. + state['prev_action'][pred_act_id] = 1. 
+ + resp = self._decode_response(pred_act_id, state) + res.append(resp) return res - def make_api_call(self, slots): + def make_api_call(self, state: dict) -> dict: + slots = state['tracker'].get_state() db_results = [] if self.database is not None: # filter slot keys with value equal to 'dontcare' as @@ -467,43 +477,82 @@ def make_api_call(self, slots): db_slots = {s: v for s, v in slots.items() if (v != 'dontcare') and (s in self.database.keys)} db_results = self.database([db_slots])[0] + # filter api results if there are more than one + if len(db_results) > 1: + db_results = [r for r in db_results if r != state['db_result']] else: log.warn("No database specified.") - log.info("Made api_call with {}, got {} results.".format(slots, len(db_results))) - # filter api results if there are more than one - if len(db_results) > 1: - db_results = [r for r in db_results if r != self.db_result] - return db_results[0] if db_results else {} + log.info(f"Made api_call with {slots}, got {len(db_results)} results.") + return {} if not db_results else db_results[0] - def __call__(self, batch): + def __call__(self, + batch: Union[List[dict], List[str]], + user_ids: Optional[List] = None) -> List[str]: + # batch is a list of utterances if isinstance(batch[0], str): res = [] - for x in batch: - pred = self._infer(x) + if not user_ids: + user_ids = ['finn' for i in range(len(batch))] + for user_id, x in zip(user_ids, batch): + state = self.states[user_id] + state['current_db_result'] = None + + tokens = self.tokenizer([x.lower().strip()])[0] + if callable(self.slot_filler): + utter_slots = self.slot_filler([tokens])[0] + state['tracker'].update_state(utter_slots) + _, pred_act_id, state['network_state'] = \ + self._infer(tokens, state=state) + state['prev_action'] *= 0. + state['prev_action'][pred_act_id] = 1. + # if made api_call, then respond with next prediction - prev_act_id = np.argmax(self.prev_action) - if prev_act_id == self.api_call_id: - db_result = self.make_api_call(self.tracker.get_state()) - res.append(self._infer(x, db_result=db_result)) - else: - res.append(pred) + if pred_act_id == self.api_call_id: + state['current_db_result'] = self.make_api_call(state) + if state['current_db_result'] is not None: + state['db_result'] = state['current_db_result'] + _, pred_act_id, state['network_state'] = \ + self._infer(tokens, state=state) + state['prev_action'] *= 0. + state['prev_action'][pred_act_id] = 1. 
+ + resp = self._decode_response(pred_act_id, state) + res.append(resp) + self.states[user_id] = state return res + # batch is a list of dialogs, user_ids ignored return [self._infer_dialog(x) for x in batch] - def reset(self): - self.tracker.reset_state() - self.db_result = None - self.prev_action = np.zeros(self.n_actions, dtype=np.float32) - self.reset_network_state() + def _zero_state(self) -> dict: + return { + 'tracker': copy.deepcopy(self.default_tracker), + 'db_result': None, + 'current_db_result': None, + 'prev_action': np.zeros(self.n_actions, dtype=np.float32), + 'network_state': ( + np.zeros([1, self.hidden_size], dtype=np.float32), + np.zeros([1, self.hidden_size], dtype=np.float32) + ) + } + + def reset(self, user_id: Union[str, int] = 'finn') -> None: + self.states[user_id] = self._zero_state() if self.debug: log.debug("Bot reset.") - def network_call(self, features, emb_context, key, action_mask, prob=False): + def network_call(self, + features: np.ndarray, + emb_context: np.ndarray, + key: np.ndarray, + action_mask: np.ndarray, + states_c: np.ndarray, + states_h: np.ndarray, + prob: bool = False) -> List[np.ndarray]: feed_dict = { self._features: features, self._dropout_keep_prob: 1., self._utterance_mask: [[1.]], - self._initial_state: (self.state_c, self.state_h), + self._initial_state: (states_c, states_h), self._action_mask: action_mask } if self.attn: @@ -514,13 +563,17 @@ def network_call(self, features, emb_context, key, action_mask, prob=False): self.sess.run([self._probs, self._prediction, self._state], feed_dict=feed_dict) - self.state_c, self._state_h = state if prob: - return probs - return prediction - - def network_train_on_batch(self, features, emb_context, key, utter_mask, - action_mask, action): + return probs, state[0], state[1] + return prediction, state[0], state[1] + + def network_train_on_batch(self, + features: np.ndarray, + emb_context: np.ndarray, + key: np.ndarray, + utter_mask: np.ndarray, + action_mask: np.ndarray, + action: np.ndarray) -> dict: feed_dict = { self._dropout_keep_prob: 1., self._utterance_mask: utter_mask, @@ -539,7 +592,7 @@ def network_train_on_batch(self, features, emb_context, key, utter_mask, 'learning_rate': self.get_learning_rate(), 'momentum': self.get_momentum()} - def _init_network_params(self): + def _init_network_params(self) -> None: self.dropout_rate = self.opt['dropout_rate'] self.hidden_size = self.opt['hidden_size'] self.action_size = self.opt['action_size'] @@ -557,7 +610,7 @@ def _init_network_params(self): else: self.attn = None - def _build_graph(self): + def _build_graph(self) -> None: self._add_placeholders() @@ -584,7 +637,7 @@ def _build_graph(self): self._loss += self.l2_reg * tf.losses.get_regularization_loss() self._train_op = self.get_train_op(self._loss) - def _add_placeholders(self): + def _add_placeholders(self) -> None: self._dropout_keep_prob = tf.placeholder_with_default(1.0, shape=[], name='dropout_prob') @@ -618,13 +671,13 @@ def _add_placeholders(self): [None, None, self.attn.key_size], name='key') - def _build_body(self): + def _build_body(self) -> Tuple[tf.Tensor, tf.Tensor]: # input projection _units = tf.layers.dense(self._features, self.dense_size, kernel_regularizer=tf.nn.l2_loss, kernel_initializer=xav()) if self.attn: - attn_scope = "attention_mechanism/{}".format(self.attn.type) + attn_scope = f"attention_mechanism/{self.attn.type}" with tf.variable_scope(attn_scope): if self.attn.type == 'general': _attn_output = am.general_attention( @@ -673,7 +726,10 @@ def _build_body(self): # 
recurrent network unit
         _lstm_cell = tf.nn.rnn_cell.LSTMCell(self.hidden_size)
-        _utter_lengths = tf.to_int32(tf.reduce_sum(self._utterance_mask, axis=-1))
+        _utter_lengths = tf.cast(tf.reduce_sum(self._utterance_mask, axis=-1),
+                                 tf.int32)
+        # _output: [batch_size, max_time, hidden_size]
+        # _state: tuple of two [batch_size, hidden_size]
         _output, _state = tf.nn.dynamic_rnn(_lstm_cell,
                                             _units,
                                             time_major=False,
@@ -688,35 +744,30 @@ def _build_body(self):
                                   kernel_initializer=xav(), name='logits')
         return _logits, _state
 
-    def load(self, *args, **kwargs):
+    def load(self, *args, **kwargs) -> None:
         self.load_params()
         super().load(*args, **kwargs)
 
-    def save(self, *args, **kwargs):
+    def save(self, *args, **kwargs) -> None:
         super().save(*args, **kwargs)
         self.save_params()
 
-    def save_params(self):
+    def save_params(self) -> None:
         path = str(self.save_path.with_suffix('.json').resolve())
-        log.info('[saving parameters to {}]'.format(path))
+        log.info(f"[saving parameters to {path}]")
         with open(path, 'w', encoding='utf8') as fp:
             json.dump(self.opt, fp)
 
-    def load_params(self):
+    def load_params(self) -> None:
         path = str(self.load_path.with_suffix('.json').resolve())
-        log.info('[loading parameters from {}]'.format(path))
+        log.info(f"[loading parameters from {path}]")
         with open(path, 'r', encoding='utf8') as fp:
             params = json.load(fp)
         for p in self.GRAPH_PARAMS:
             if self.opt.get(p) != params.get(p):
-                raise ConfigError("`{}` parameter must be equal to saved model "
-                                  "parameter value `{}`, but is equal to `{}`"
-                                  .format(p, params.get(p), self.opt.get(p)))
+                raise ConfigError(f"`{p}` parameter must be equal to saved"
+                                  f" model parameter value `{params.get(p)}`,"
+                                  f" but is equal to `{self.opt.get(p)}`")
 
-    def process_event(self, event_name, data):
+    def process_event(self, event_name, data) -> None:
         super().process_event(event_name, data)
-
-    def reset_network_state(self):
-        # set zero state
-        self.state_c = np.zeros([1, self.hidden_size], dtype=np.float32)
-        self.state_h = np.zeros([1, self.hidden_size], dtype=np.float32)
diff --git a/deeppavlov/requirements/rasa_skill.txt b/deeppavlov/requirements/rasa_skill.txt
new file mode 100644
index 0000000000..bfb2598b2d
--- /dev/null
+++ b/deeppavlov/requirements/rasa_skill.txt
@@ -0,0 +1 @@
+git+https://github.com/deepmipt/rasa.git@b0a80916e54ed9f4496c709a28f1093f7a5f2492#egg=rasa==1.2.7
diff --git a/deeppavlov/skills/rasa_skill/__init__.py b/deeppavlov/skills/rasa_skill/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/deeppavlov/skills/rasa_skill/rasa_skill.py b/deeppavlov/skills/rasa_skill/rasa_skill.py
new file mode 100644
index 0000000000..d29350c881
--- /dev/null
+++ b/deeppavlov/skills/rasa_skill/rasa_skill.py
@@ -0,0 +1,256 @@
+import uuid
+import asyncio
+import logging
+from pathlib import Path
+from typing import Tuple, Optional, List
+from functools import reduce
+
+from deeppavlov.core.common.registry import register
+from deeppavlov.core.skill.skill import Skill
+
+from rasa.core.agent import Agent
+from rasa.core.channels import UserMessage
+from rasa.core.channels import CollectingOutputChannel
+from rasa.model import get_model
+from rasa.cli.utils import get_validated_path
+from rasa.constants import DEFAULT_MODELS_PATH
+
+logger = logging.getLogger(__name__)
+
+
+@register("rasa_skill")
+class RASASkill(Skill):
+    """RASASkill lets you wrap a RASA Agent as a Skill within the DeepPavlov environment. 
+
+    The component requires a path to your RASA models (a folder with timestamped tar.gz archives),
+    the same as used in the command `rasa run -m models --enable-api --log-file out.log`
+
+    """
+
+    def __init__(self, path_to_models: str, **kwargs) -> None:
+        """
+        Constructs RASA Agent as a DeepPavlov skill:
+            reads the model folder,
+            initializes rasa.core.agent.Agent and wraps its interfaces
+
+        Args:
+            path_to_models: string path to the folder with RASA models
+
+        """
+        # we need an absolute path (expanded for user home and resolved if relative):
+        self.path_to_models = Path(path_to_models).expanduser().resolve()
+
+        model = get_validated_path(self.path_to_models, "model", DEFAULT_MODELS_PATH)
+
+        model_path = get_model(model)
+        if not model_path:
+            # cannot load model path
+            raise Exception("cannot load model path: %s" % model)
+
+        self._agent = Agent.load(model_path)
+        self.ioloop = asyncio.new_event_loop()
+        logger.info(f"path to RASA models is: `{self.path_to_models}`")
+
+    def __call__(self, utterances_batch: List[str],
+                 history_batch: Optional[List]=None,
+                 states_batch: Optional[List]=None) -> Tuple[List[str], List[float], list]:
+        """Returns skill inference result.
+
+        Returns batches of skill inference results, estimated confidence
+        levels and up-to-date states corresponding to the incoming utterance
+        batch.
+
+        Args:
+            utterances_batch: A batch of utterances of str type.
+            history_batch: A batch of list typed histories for each utterance.
+            states_batch: A batch of arbitrary typed states for
+                each utterance.
+
+
+        Returns:
+            response: A batch of arbitrary typed skill inference results.
+            confidence: A batch of float typed confidence levels for each
+                skill inference result.
+            output_states_batch: A batch of arbitrary typed states for
+                each utterance. 
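For orientation, a minimal usage sketch of this wrapper; `my_rasa_project/models` is a placeholder folder with archives produced by `rasa train`:

```python
from deeppavlov.skills.rasa_skill.rasa_skill import RASASkill

skill = RASASkill(path_to_models='my_rasa_project/models')  # placeholder path
responses, confidences, states = skill(['Hello!'])
print(responses[0], confidences[0])
```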
+ + """ + user_ids, output_states_batch = self._handle_user_identification(utterances_batch, states_batch) + ################################################################################# + # RASA use asyncio for handling messages and handle_text is async function, + # so we need to instantiate event loop + # futures = [rasa_confident_response_decorator(self._agent, utt, sender_id=uid) for utt, uid in + futures = [self.rasa_confident_response_decorator(self._agent, utt, sender_id=uid) for utt, uid in + zip(utterances_batch, user_ids)] + + asyncio.set_event_loop(self.ioloop) + results = self.ioloop.run_until_complete(asyncio.gather(*futures)) + + responses_batch, confidences_batch = zip(*results) + return responses_batch, confidences_batch, output_states_batch + + async def rasa_confident_response_decorator(self, rasa_agent, text_message, sender_id): + """ + Args: + rasa_agent: rasa.core.agent.Agent instance + text_message: str with utterance from user + sender_id: id of the user + + Returns: None or tuple with str and float, where first element is a message and second is + confidence + """ + + resp = await self.rasa_handle_text_verbosely(rasa_agent, text_message, sender_id) + if resp: + responses, confidences, actions = resp + else: + logger.warning("Null response from RASA Skill") + return None + + # for adaptation to deep pavlov arch we need to merge multi-messages into single string: + texts = [each_resp['text'] for each_resp in responses if 'text' in each_resp] + merged_message = "\n".join(texts) + + merged_confidence = reduce(lambda a, b: a * b, confidences) + # TODO possibly it better to choose another function for calculation of final confidence + # current realisation of confidence propagation may cause confidence decay for long actions + # chains. If long chains is your case, try max(confidence) or confidence[0] + return merged_message, merged_confidence + + async def rasa_handle_text_verbosely(self, rasa_agent, text_message, sender_id): + """ + This function reimplements RASA's rasa.core.agent.Agent.handle_text method to allow to retrieve + message responses with confidence estimation altogether. + + It reconstructs with merge RASA's methods: + https://github.com/RasaHQ/rasa_core/blob/master/rasa/core/agent.py#L401 + https://github.com/RasaHQ/rasa_core/blob/master/rasa/core/agent.py#L308 + https://github.com/RasaHQ/rasa/blob/master/rasa/core/processor.py#L327 + + This required to allow RASA to output confidences with actions altogether + (Out of the box RASA does not support such use case). + + Args: + rasa_agent: rasa.core.agent.Agent instance + text_message: str with utterance from user + sender_id: id of the user + + Returns: None or + tuple where first element is a list of messages dicts, the second element is a list + of confidence scores for all actions (it is longer than messages list, because some actions + does not produce messages) + + """ + message = UserMessage(text_message, + output_channel=None, + sender_id=sender_id) + + processor = rasa_agent.create_processor() + tracker = processor._get_tracker(message.sender_id) + + confidences = [] + actions = [] + await processor._handle_message_with_tracker(message, tracker) + # save tracker state to continue conversation from this state + processor._save_tracker(tracker) + + # here we restore some of logic in RASA management. 
+        # ###### Loop of IntraStep decisions ##########################################################
+        # await processor._predict_and_execute_next_action(msg, tracker):
+        # https://github.com/RasaHQ/rasa/blob/master/rasa/core/processor.py#L327-L362
+        # keep taking actions decided by the policy until it chooses to 'listen'
+        should_predict_another_action = True
+        num_predicted_actions = 0
+
+        def is_action_limit_reached():
+            return (num_predicted_actions == processor.max_number_of_predictions and
+                    should_predict_another_action)
+
+        # action loop. predicts actions until we hit action listen
+        while (should_predict_another_action and
+               processor._should_handle_message(tracker) and
+               num_predicted_actions < processor.max_number_of_predictions):
+            # this actually just calls the policy's method by the same name
+            action, policy, confidence = processor.predict_next_action(tracker)
+
+            confidences.append(confidence)
+            actions.append(action)
+
+            should_predict_another_action = await processor._run_action(
+                action,
+                tracker,
+                message.output_channel,
+                processor.nlg,
+                policy, confidence
+            )
+            num_predicted_actions += 1
+
+        if is_action_limit_reached():
+            # circuit breaker was tripped
+            logger.warning(
+                "Circuit breaker tripped. Stopped predicting "
+                "more actions for sender '{}'".format(tracker.sender_id))
+            if processor.on_circuit_break:
+                # call a registered callback
+                processor.on_circuit_break(tracker, message.output_channel, processor.nlg)
+
+        if isinstance(message.output_channel, CollectingOutputChannel):
+
+            return message.output_channel.messages, confidences, actions
+        else:
+            return None
+
+    def _generate_user_id(self) -> str:
+        """
+        Put user id generation logic here if you want to implement it in the skill.
+
+        Although it is better to delegate user_id generation to the Agent layer.
+        Returns: str
+
+        """
+        return uuid.uuid1().hex
+
+    def _handle_user_identification(self, utterances_batch, states_batch):
+        """Method preprocesses the states batch to guarantee that all users are identified (or
+        that identifiers are generated for all users).
+
+        Args:
+            utterances_batch: batch of utterances
+            states_batch: batch of states
+
+        Returns:
+
+        """
+        # grab user_ids from the states batch.
+        # We expect the skill to receive None or a state dict for each utterance.
+        # If a state has a user_id then the skill uses it, otherwise it generates a user_id and
+        # refers to the user by this name from then on. 
+
+        # In this implementation we use time-based UUIDs (uuid1) for generating unique ids
+        output_states_batch = []
+        user_ids = []
+        if states_batch is None:
+            # generate states batch matching batch of utterances:
+            states_batch = [None] * len(utterances_batch)
+
+        for state in states_batch:
+            if not state:
+                user_id = self._generate_user_id()
+                new_state = {'user_id': user_id}
+
+            elif 'user_id' not in state:
+                new_state = state
+                user_id = self._generate_user_id()
+                new_state['user_id'] = user_id
+
+            else:
+                new_state = state
+                user_id = new_state['user_id']
+
+            user_ids.append(user_id)
+            output_states_batch.append(new_state)
+        return user_ids, output_states_batch
+
+    def destroy(self):
+        self.ioloop.close()
+        super().destroy()
diff --git a/deeppavlov/utils/alexa/server.py b/deeppavlov/utils/alexa/server.py
index 8164c10034..06e923d865 100644
--- a/deeppavlov/utils/alexa/server.py
+++ b/deeppavlov/utils/alexa/server.py
@@ -38,7 +38,6 @@
 log = getLogger(__name__)
 
 app = Flask(__name__)
-Swagger(app)
 CORS(app)
 
 
@@ -104,6 +103,10 @@ def run_alexa_server(agent_generator: callable,
 
     host = server_params['common_defaults']['host']
     port = port or server_params['common_defaults']['port']
+    docs_endpoint = server_params['common_defaults']['docs_endpoint']
+
+    Swagger.DEFAULT_CONFIG['specs_route'] = docs_endpoint
+    Swagger(app)
 
     alexa_server_params = server_params['alexa_defaults']
 
@@ -248,7 +251,7 @@ def run_alexa_server(agent_generator: callable,
 
     @app.route('/')
     def index():
-        return redirect('/apidocs/')
+        return redirect(docs_endpoint)
 
     @app.route('/interact', methods=['POST'])
     @swag_from(endpoint_description)
diff --git a/deeppavlov/utils/alice/alice.py b/deeppavlov/utils/alice/alice.py
index 51073f6450..332c5bce0d 100644
--- a/deeppavlov/utils/alice/alice.py
+++ b/deeppavlov/utils/alice/alice.py
@@ -36,7 +36,6 @@
 log = getLogger(__name__)
 
 app = Flask(__name__)
-Swagger(app)
 CORS(app)
 
 DialogID = namedtuple('DialogID', ['user_id', 'session_id'])
@@ -86,6 +85,10 @@ def start_alice_server(model_config, https=False, ssl_key=None, ssl_cert=None, p
     server_params = get_server_params(server_config_path, model_config)
 
     https = https or server_params['https']
+    docs_endpoint = server_params['docs_endpoint']
+
+    Swagger.DEFAULT_CONFIG['specs_route'] = docs_endpoint
+    Swagger(app)
 
     if not https:
         ssl_key = ssl_cert = None
@@ -112,6 +115,10 @@ def start_alice_server(model_config, https=False, ssl_key=None, ssl_cert=None, p
         skill = DefaultStatelessSkill(model, lang='ru')
     agent = DefaultAgent([skill], skills_processor=DefaultRichContentWrapper())
 
+    @app.route('/')
+    def index():
+        return redirect(docs_endpoint)
+
     start_agent_server(agent, host, port, model_endpoint, ssl_key, ssl_cert)
 
 
@@ -126,10 +133,6 @@ def start_agent_server(agent: Agent, host: str, port: int, endpoint: str,
     else:
         ssl_context = None
 
-    @app.route('/')
-    def index():
-        return redirect('/apidocs/')
-
     endpoint_description = {
         'description': 'A model endpoint',
         'parameters': [
diff --git a/deeppavlov/utils/ms_bot_framework/server.py b/deeppavlov/utils/ms_bot_framework/server.py
index 82a6a9ebb2..3b3400281c 100644
--- a/deeppavlov/utils/ms_bot_framework/server.py
+++ b/deeppavlov/utils/ms_bot_framework/server.py
@@ -27,7 +27,6 @@
 log = getLogger(__name__)
 
 app = Flask(__name__)
-Swagger(app)
 CORS(app)
 
 
@@ -74,6 +73,10 @@ def run_ms_bot_framework_server(agent_generator: callable,
 
     host = server_params['common_defaults']['host']
     port = port or server_params['common_defaults']['port']
+    docs_endpoint = 
server_params['common_defaults']['docs_endpoint']
+
+    Swagger.DEFAULT_CONFIG['specs_route'] = docs_endpoint
+    Swagger(app)
 
     ms_bf_server_params = server_params['ms_bot_framework_defaults']
 
@@ -126,7 +129,7 @@ def run_ms_bot_framework_server(agent_generator: callable,
 
     @app.route('/')
     def index():
-        return redirect('/apidocs/')
+        return redirect(docs_endpoint)
 
     @app.route('/v3/conversations', methods=['POST'])
     def handle_activity():
diff --git a/deeppavlov/utils/server/server.py b/deeppavlov/utils/server/server.py
index 5bfc4a4f70..65a4439925 100644
--- a/deeppavlov/utils/server/server.py
+++ b/deeppavlov/utils/server/server.py
@@ -12,9 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import re
 import ssl
-from logging import getLogger, Filter
+from logging import getLogger, Filter, LogRecord
 from pathlib import Path
 from typing import List, Tuple
 
@@ -34,13 +33,10 @@
 
 
 class PollerFilter(Filter):
-    """
-    PollerFilter class is used to filter POST requests log records to
-    /poller endpoints.
-    """
-    pat = re.compile(r'POST\s/\S*/poller\s')
-    def filter(self, record):
-        return not PollerFilter.pat.search(record.getMessage())
+    """PollerFilter class is used to filter POST requests to the /probe endpoint from logs."""
+    def filter(self, record: LogRecord) -> bool:
+        """To log the record, the method should return True."""
+        return 'POST /probe HTTP' not in record.getMessage()
 
 
 log = getLogger(__name__)
@@ -48,7 +44,6 @@ def filter(self, record):
 werklog.addFilter(PollerFilter())
 
 app = Flask(__name__)
-Swagger(app)
 CORS(app)
 
 dialog_logger = DialogLogger(agent_name='dp_api')
@@ -68,6 +63,9 @@ def get_server_params(server_config_path, model_config):
             if model_defaults[param_name]:
                 server_params[param_name] = model_defaults[param_name]
 
+    server_params['model_endpoint'] = server_params.get('model_endpoint', '/model')
+    server_params['model_args_names'] = server_params['model_args_names'] or model_config['chainer']['in']
+
     return server_params
 
 
@@ -134,13 +132,11 @@ def start_model_server(model_config, https=False, ssl_key=None, ssl_cert=None, p
     host = server_params['host']
     port = port or server_params['port']
     model_endpoint = server_params['model_endpoint']
+    docs_endpoint = server_params['docs_endpoint']
     model_args_names = server_params['model_args_names']
 
-    if model_endpoint == '/':
-        e = ValueError('"/" endpoint is reserved, please provide correct endpoint in model_endpoint'
-                       'param in server configuration file')
-        log.error(e)
-        raise e
+    Swagger.DEFAULT_CONFIG['specs_route'] = docs_endpoint
+    Swagger(app)
 
     https = https or server_params['https']
 
@@ -168,7 +164,7 @@ def start_model_server(model_config, https=False, ssl_key=None, ssl_cert=None, p
 
     @app.route('/')
     def index():
-        return redirect('/apidocs/')
+        return redirect(docs_endpoint)
 
     endpoint_description = {
         'description': 'A model endpoint',
@@ -192,8 +188,12 @@ def index():
     def answer():
         return interact(model, model_args_names)
 
-    @app.route(model_endpoint+'/poller', methods=['POST'])
-    def polling():
+    @app.route('/probe', methods=['POST'])
+    def probe():
         return test_interact(model, model_args_names)
 
+    @app.route('/api', methods=['GET'])
+    def api():
+        return jsonify(model_args_names), 200
+
     app.run(host=host, port=port, threaded=False, ssl_context=ssl_context)
diff --git a/deeppavlov/utils/settings/server_config.json b/deeppavlov/utils/settings/server_config.json
index 99dfb174e3..2b63e04e3e 100644
--- a/deeppavlov/utils/settings/server_config.json
+++ 
b/deeppavlov/utils/settings/server_config.json @@ -2,8 +2,8 @@ "common_defaults": { "host": "0.0.0.0", "port": 5000, - "model_endpoint": "/model", - "model_args_names": ["context"], + "docs_endpoint": "/docs/", + "model_args_names": "", "https": false, "https_cert_path": "", "https_key_path": "", @@ -30,55 +30,46 @@ "DstcSlotFillingNetwork": { "host": "", "port": "", - "model_endpoint": "/slot-fill", "model_args_names": "" }, "EcommerceSkill": { "host": "", "port": "", - "model_endpoint": "/ecommerce_skill", - "model_args_names": ["query", "history", "state"] + "model_args_names": "" }, "ErrorModel": { "host": "", "port": "", - "model_endpoint": "/error-model", "model_args_names": "" }, "GoalOrientedBot": { "host": "", "port": "", - "model_endpoint": "/go-bot", "model_args_names": "" }, "KerasIntentModel": { "host": "", "port": "", - "model_endpoint": "/intents", "model_args_names": "" }, "NER": { "host": "", "port": "", - "model_endpoint": "/ner", "model_args_names": "" }, "SquadModel": { "host": "", "port": "", - "model_endpoint": "/squad", - "model_args_names": ["context", "question"] + "model_args_names": "" }, "ODQA": { "host": "", "port": "", - "model_endpoint": "/odqa", "model_args_names": "" }, "Ranker": { "host": "", "port": "", - "model_endpoint": "/ranker", "model_args_names": "" } } diff --git a/deeppavlov/utils/settings/socket_config.json b/deeppavlov/utils/settings/socket_config.json new file mode 100644 index 0000000000..53c6b638dd --- /dev/null +++ b/deeppavlov/utils/settings/socket_config.json @@ -0,0 +1,14 @@ +{ + "common_defaults":{ + "host": "0.0.0.0", + "port": 5001, + "unix_socket_file": "/tmp/deeppavlov_socket.s", + "socket_type": "TCP", + "model_args_names": "", + "bufsize": 1024, + "binding_message": "binding socket to" + }, + "model_defaults": { + "SquadModel": {} + } +} \ No newline at end of file diff --git a/deeppavlov/utils/socket/__init__.py b/deeppavlov/utils/socket/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/deeppavlov/utils/socket/socket.py b/deeppavlov/utils/socket/socket.py new file mode 100644 index 0000000000..c877070d32 --- /dev/null +++ b/deeppavlov/utils/socket/socket.py @@ -0,0 +1,190 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import asyncio +import json +import socket +from logging import getLogger +from pathlib import Path +from typing import Dict, List, Optional, Tuple, Union + +from deeppavlov.core.agent.dialog_logger import DialogLogger +from deeppavlov.core.commands.infer import build_model +from deeppavlov.core.common.chainer import Chainer +from deeppavlov.core.common.paths import get_settings_path +from deeppavlov.core.data.utils import jsonify_data +from deeppavlov.utils.server.server import get_server_params + +SOCKET_CONFIG_FILENAME = 'socket_config.json' + + +class SocketServer: + """Creates socket server that sends the received data to the DeepPavlov model and returns model response. 
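For orientation, a hypothetical client for the socket API implemented below; it assumes a server started with `python -m deeppavlov risesocket <config_path>` using the default TCP settings from `socket_config.json` (port 5001, bufsize 1024), and the argument name `context` is a placeholder that must match the model's `chainer.in`:

```python
import json
import socket

# 'context' is a placeholder argument name; it must match the model's chainer.in
request = {'context': ['Elon Musk launched his cherry Tesla roadster to the Mars orbit']}
with socket.create_connection(('localhost', 5001)) as conn:
    conn.sendall(json.dumps(request).encode('utf-8'))
    # a single recv is enough only for small responses; bufsize defaults to 1024
    response = json.loads(conn.recv(1024))
    print(response['status'], response['payload'])
```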
+
+    The server receives a dictionary serialized to a JSON-formatted byte array and sends it to the model. The
+    dictionary keys should match the model argument names, and the values should be lists or tuples of values
+    to infer on.
+
+    Example:
+        {"context": ["Elon Musk launched his cherry Tesla roadster to the Mars orbit"]}
+
+    The socket server returns a dictionary {'status': status, 'payload': payload} serialized to a JSON-formatted
+    byte array, where:
+        status (str): 'OK' if the model successfully processed the data, otherwise an error message.
+        payload (Optional[List[Tuple]]): The model result if no error has occurred, otherwise None
+
+    """
+    _address_family: socket.AddressFamily
+    _bind_address: Union[Tuple[str, int], str]
+    _launch_msg: str
+    _loop: asyncio.AbstractEventLoop
+    _model: Chainer
+    _params: Dict
+    _socket: socket.socket
+    _socket_type: str
+
+    def __init__(self, model_config: Path, socket_type: str, port: Optional[int] = None,
+                 socket_file: Optional[Union[str, Path]] = None) -> None:
+        """Initialize socket server.
+
+        Args:
+            model_config: Path to the config file.
+            socket_type: Socket family. "TCP" for the AF_INET socket, "UNIX" for the AF_UNIX.
+            port: Port number for the AF_INET address family. If the parameter is not defined, the port number
+                from the model_config is used.
+            socket_file: Path to the file to which a server of the AF_UNIX address family binds. If the parameter
+                is not defined, the path from the model_config is used.
+
+        """
+        socket_config_path = get_settings_path() / SOCKET_CONFIG_FILENAME
+        self._params = get_server_params(socket_config_path, model_config)
+        self._socket_type = socket_type or self._params['socket_type']
+
+        if self._socket_type == 'TCP':
+            host = self._params['host']
+            port = port or self._params['port']
+            self._address_family = socket.AF_INET
+            self._launch_msg = f'{self._params["binding_message"]} http://{host}:{port}'
+            self._bind_address = (host, port)
+        elif self._socket_type == 'UNIX':
+            self._address_family = socket.AF_UNIX
+            bind_address = socket_file or self._params['unix_socket_file']
+            bind_address = Path(bind_address).resolve()
+            if bind_address.exists():
+                bind_address.unlink()
+            self._bind_address = str(bind_address)
+            self._launch_msg = f'{self._params["binding_message"]} {self._bind_address}'
+        else:
+            raise ValueError(f'socket type "{self._socket_type}" is not supported')
+
+        self._dialog_logger = DialogLogger(agent_name='dp_api')
+        self._log = getLogger(__name__)
+        self._loop = asyncio.get_event_loop()
+        self._model = build_model(model_config)
+        self._socket = socket.socket(self._address_family, socket.SOCK_STREAM)
+
+        self._socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+        self._socket.setblocking(False)
+
+    def start(self) -> None:
+        """Binds the socket to the address and enables the server to accept connections"""
+        self._socket.bind(self._bind_address)
+        self._socket.listen()
+        self._log.info(self._launch_msg)
+        try:
+            self._loop.run_until_complete(self._server())
+        except Exception as e:
+            self._log.error(f'got exception {e} while running server')
+        finally:
+            self._loop.close()
+            self._socket.close()
+
+    async def _server(self) -> None:
+        while True:
+            conn, addr = await self._loop.sock_accept(self._socket)
+            self._loop.create_task(self._handle_connection(conn, addr))
+
+    async def _handle_connection(self, conn: socket.socket, addr: Tuple) -> None:
+        self._log.info(f'handling connection from {addr}')
+        conn.setblocking(False)
+        recv_data = b''
+        try:
+            while True:
+                chunk = await self._loop.run_in_executor(None, 
conn.recv, self._params['bufsize'])
+                if chunk:
+                    recv_data += chunk
+                else:
+                    break
+        except BlockingIOError:
+            pass
+        try:
+            data = json.loads(recv_data)
+        except ValueError:
+            await self._wrap_error(conn, f'request "{recv_data}" is not valid JSON')
+            return
+        self._dialog_logger.log_in(data)
+        model_args = []
+        for param_name in self._params['model_args_names']:
+            param_value = data.get(param_name)
+            if param_value is None or (isinstance(param_value, list) and len(param_value) > 0):
+                model_args.append(param_value)
+            else:
+                await self._wrap_error(conn, f"nonempty array expected but got '{param_name}'={repr(param_value)}")
+                return
+        lengths = {len(i) for i in model_args if i is not None}
+
+        if not lengths:
+            await self._wrap_error(conn, 'got empty request')
+            return
+        elif len(lengths) > 1:
+            await self._wrap_error(conn, f'got several different batch sizes: {lengths}')
+            return
+        batch_size = list(lengths)[0]
+        model_args = [arg or [None] * batch_size for arg in model_args]
+
+        # pad with empty values for model inputs that were not provided in the request
+        model_args += [[None] * batch_size for _ in range(len(self._model.in_x) - len(model_args))]
+
+        prediction = await self._loop.run_in_executor(None, self._model, *model_args)
+        if len(self._model.out_params) == 1:
+            prediction = [prediction]
+        prediction = list(zip(*prediction))
+        result = await self._response('OK', prediction)
+        self._dialog_logger.log_out(result)
+        await self._loop.sock_sendall(conn, result)
+
+    async def _wrap_error(self, conn: socket.socket, error: str) -> None:
+        self._log.error(error)
+        await self._loop.sock_sendall(conn, await self._response(error, None))
+
+    @staticmethod
+    async def _response(status: str, payload: Optional[List[Tuple]]) -> bytes:
+        """Puts the arguments into a dict and serializes it to a JSON-formatted byte array.
+
+        Args:
+            status: Response status. 'OK' if no error has occurred, otherwise error message.
+            payload: DeepPavlov model result if no error has occurred, otherwise None.
+
+        Returns:
+            dict({'status': status, 'payload': payload}) serialized to a JSON-formatted byte array.
+
+        """
+        resp_dict = jsonify_data({'status': status, 'payload': payload})
+        resp_str = json.dumps(resp_dict)
+        return resp_str.encode('utf-8')
+
+
+def start_socket_server(model_config: Path, socket_type: str, port: Optional[int],
+                        socket_file: Optional[Union[str, Path]]) -> None:
+    server = SocketServer(model_config, socket_type, port, socket_file)
+    server.start()
diff --git a/docs/_static/ipavlov_footer.png b/docs/_static/ipavlov_footer.png
index fb7d2a1a16..691442d87f 100644
Binary files a/docs/_static/ipavlov_footer.png and b/docs/_static/ipavlov_footer.png differ
diff --git a/docs/_static/social/Medium_Monogram.svg b/docs/_static/social/Medium_Monogram.svg
new file mode 100644
index 0000000000..c8b251ddec
--- /dev/null
+++ b/docs/_static/social/Medium_Monogram.svg
@@ -0,0 +1,13 @@
+
+
+
+    Monogram
+    Created with Sketch. 
+ + + + + + + + \ No newline at end of file diff --git a/docs/_static/social/Twitter_Social_Icon_Circle_Color.svg b/docs/_static/social/Twitter_Social_Icon_Circle_Color.svg new file mode 100755 index 0000000000..6b421ee917 --- /dev/null +++ b/docs/_static/social/Twitter_Social_Icon_Circle_Color.svg @@ -0,0 +1,20 @@ + + + + + + + + + + + diff --git a/docs/_static/social/f_logo_RGB-Blue_58.png b/docs/_static/social/f_logo_RGB-Blue_58.png new file mode 100755 index 0000000000..743ec2d28b Binary files /dev/null and b/docs/_static/social/f_logo_RGB-Blue_58.png differ diff --git a/docs/_static/social/youtube_social_circle_red.png b/docs/_static/social/youtube_social_circle_red.png new file mode 100755 index 0000000000..8dce3e337e Binary files /dev/null and b/docs/_static/social/youtube_social_circle_red.png differ diff --git a/docs/_templates/footer.html b/docs/_templates/footer.html new file mode 100644 index 0000000000..56c16abd8b --- /dev/null +++ b/docs/_templates/footer.html @@ -0,0 +1,64 @@ +{#{% extends '!footer.html' %}#} + +

+ {% if (theme_prev_next_buttons_location == 'bottom' or theme_prev_next_buttons_location == 'both') and (next or prev) %} + + {% endif %} + +
+ +
+ {%- block extrafooter %} +

Problem? Ask a Question or try our Demo

+

+ twitter + facebook + youtube + medium +

+ {% endblock %} +

+ {%- if show_copyright %} + {%- if hasdoc('copyright') %} + {% set path = pathto('copyright') %} + {% set copyright = copyright|e %} + © {% trans %}Copyright{% endtrans %} {{ copyright }} + {%- else %} + {% set copyright = copyright|e %} + © {% trans %}Copyright{% endtrans %} {{ copyright }} + {%- endif %} + {%- endif %} + + {%- if build_id and build_url %} + + {# Translators: Build is a noun, not a verb #} + {% trans %}Build{% endtrans %} + {{ build_id }}. + + {%- elif commit %} + + {% trans %}Revision{% endtrans %} {{ commit }}. + + {%- elif last_updated %} + + {% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %} + + {%- endif %} + +

+
+ + {%- if show_sphinx %} + {% set sphinx_web = 'Sphinx' %} + {% set readthedocs_web = 'Read the Docs' %} + {% trans sphinx_web=sphinx_web, readthedocs_web=readthedocs_web %}Built with {{ sphinx_web }} using a{% endtrans %} {% trans %}theme{% endtrans %} {% trans %}provided by {{ readthedocs_web }}{% endtrans %}. + {%- endif %} + +
diff --git a/docs/apiref/models/classifiers.rst b/docs/apiref/models/classifiers.rst
index d56af15e8d..81bb9f88b3 100644
--- a/docs/apiref/models/classifiers.rst
+++ b/docs/apiref/models/classifiers.rst
@@ -25,3 +25,8 @@ deeppavlov.models.classifiers
     :members:
 
     .. automethod:: __call__
+
+.. autoclass:: deeppavlov.models.classifiers.proba2labels.Proba2Labels
+    :members:
+
+    .. automethod:: __call__
diff --git a/docs/apiref/skills/rasa_skill.rst b/docs/apiref/skills/rasa_skill.rst
new file mode 100644
index 0000000000..c4cef207fb
--- /dev/null
+++ b/docs/apiref/skills/rasa_skill.rst
@@ -0,0 +1,5 @@
+deeppavlov.skills.rasa_skill
+========================================
+
+.. automodule:: deeppavlov.skills.rasa_skill.rasa_skill
+    :members:
diff --git a/docs/conf.py b/docs/conf.py
index eed018f2b7..eec32ec4e5 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -191,7 +191,7 @@
 # -- Extension configuration -------------------------------------------------
 
 autodoc_mock_imports = ['tensorflow', 'tensorflow_hub', 'fastText', 'nltk', 'gensim', 'kenlm', 'spacy', 'lxml',
-                        'sortedcontainers', 'russian_tagsets', 'bert_dp', 'aiml']
+                        'sortedcontainers', 'russian_tagsets', 'bert_dp', 'aiml', 'rasa']
 
 extlinks = {
     'config': (f'https://github.com/deepmipt/DeepPavlov/blob/{release}/deeppavlov/configs/%s', None)
diff --git a/docs/devguides/contribution_guide.rst b/docs/devguides/contribution_guide.rst
index 010ddac1a1..8ea975255e 100644
--- a/docs/devguides/contribution_guide.rst
+++ b/docs/devguides/contribution_guide.rst
@@ -26,7 +26,7 @@ How to contribute:
    Accompany code with **clear comments** to let other people understand the
    flow of your mind.
 
-   If you create new models, refer to the :doc:`Registry your model
+   If you create new models, refer to the :doc:`Register your model
    ` section to add it to the DeepPavlov registry
    of models.
 
diff --git a/docs/features/models/bert.rst b/docs/features/models/bert.rst
index c8a847ee02..50e85ac8d0 100644
--- a/docs/features/models/bert.rst
+++ b/docs/features/models/bert.rst
@@ -15,11 +15,12 @@ There are several pre-trained BERT models released by Google Research, more deta
 
 - BERT-base, multilingual, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: download from `[google] `__, `[deeppavlov] `__
 - BERT-base, Chinese, cased, 12-layer, 768-hidden, 12-heads, 110M parameters: download from `[google] `__
 
-We have trained BERT-base model for other languages:
+We have trained BERT-base models for other languages and domains:
 
-- RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: `[deeppavlov] `__
-- SlavicBERT, Slavic (bg, cs, pl, ru), cased, 12-layer, 768-hidden, 12-heads, 180M parameters: `[deeppavlov] `__
-- Conversational BERT, English, cased, 12-layer, 768-hidden, 12-heads, 110M parameters: `[deeppavlov] `__
+- RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: `[deeppavlov] `__
+- SlavicBERT, Slavic (bg, cs, pl, ru), cased, 12-layer, 768-hidden, 12-heads, 180M parameters: `[deeppavlov] `__
+- Conversational BERT, English, cased, 12-layer, 768-hidden, 12-heads, 110M parameters: `[deeppavlov] `__
+- Conversational RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters: `[deeppavlov] `__
 
 RuBERT was trained on the Russian part of Wikipedia and news data. We used this training data to build vocabulary of Russian subtokens and took
 multilingual version of BERT-base as initialization for RuBERT [1]_.
@@ -31,6 +32,9 @@ Conversational BERT was trained on the English part of Twitter, Reddit, DailyDia
 We used this training data to build the vocabulary of English subtokens and took
 English cased version of BERT-base as initialization for English Conversational BERT.
 
+Conversational RuBERT was trained on OpenSubtitles [4]_, Dirty, Pikabu, and the Social Media segment of the Taiga corpus [7]_.
+We assembled a new vocabulary for the Conversational RuBERT model on this data and initialized the model with RuBERT.
+
 Here, in DeepPavlov, we made it easy to use pre-trained BERT for downstream tasks like
 classification, tagging, question answering and ranking. We also provide pre-trained models
 and examples on how to use BERT with DeepPavlov.
@@ -107,3 +111,4 @@ the :doc:`config ` file must be changed to match new BERT
 .. [4] P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016)
 .. [5] Justine Zhang, Ravi Kumar, Sujith Ravi, Cristian Danescu-Niculescu-Mizil. Proceedings of NAACL, 2016.
 .. [6] J. Schler, M. Koppel, S. Argamon and J. Pennebaker (2006). Effects of Age and Gender on Blogging in Proceedings of 2006 AAAI Spring Symposium on Computational Approaches for Analyzing Weblogs.
+.. [7] Shavrina T., Shapovalova O. (2017). To the Methodology of Corpus Construction for Machine Learning: "Taiga" Syntax Tree Corpus and Parser. In Proc. of the "CORPORA 2017" International Conference, Saint-Petersburg, 2017.
diff --git a/docs/features/models/classifiers.rst b/docs/features/models/classifiers.rst
index 12a77e7ee9..b2a77d6256 100644
--- a/docs/features/models/classifiers.rst
+++ b/docs/features/models/classifiers.rst
@@ -313,6 +313,8 @@ Therefore, this model is available only for interaction.
 | | | | :config:`ELMo ` | | 0.7519 | 0.7875 | 700 Mb |
 + + + +-------------------------------------------------------------------------------------------------+ +--------+--------+-----------+
 | | | | :config:`Multi-language BERT ` | | 0.6809 | 0.7193 | 1900 Mb |
++ + + +-------------------------------------------------------------------------------------------------+ +--------+--------+-----------+
+| | | | :config:`Conversational RuBERT ` | | 0.7548 | 0.7742 | 657 Mb |
 +------------------+--------------------+ +-------------------------------------------------------------------------------------------------+-------------+--------+--------+-----------+
 | Intent |Ru like`Yahoo-L31`_ | | :config:`Conversational vs Informational on ELMo ` | ROC-AUC | 0.9412 | -- | 700 Mb |
 +------------------+--------------------+------+-------------------------------------------------------------------------------------------------+-------------+--------+--------+-----------+
diff --git a/docs/features/overview.rst b/docs/features/overview.rst
index a8917f4ed9..e71a0fa68b 100644
--- a/docs/features/overview.rst
+++ b/docs/features/overview.rst
@@ -99,6 +99,8 @@ Several pre-trained models are available and presented in Table below.
| | | | :config:`ELMo ` | | 0.7519 | 0.7875 | 700 Mb | + + + +-------------------------------------------------------------------------------------------------+ +--------+--------+-----------+ | | | | :config:`Multi-language BERT ` | | 0.6809 | 0.7193 | 1900 Mb | ++ + + +-------------------------------------------------------------------------------------------------+ +--------+--------+-----------+ +| | | | :config:`Conversational RuBERT ` | | 0.7548 | 0.7742 | 657 Mb | +------------------+--------------------+ +-------------------------------------------------------------------------------------------------+-------------+--------+--------+-----------+ | Intent |Ru like`Yahoo-L31`_ | | :config:`Conversational vs Informational on ELMo ` | ROC-AUC | 0.9412 | -- | 700 Mb | +------------------+--------------------+------+-------------------------------------------------------------------------------------------------+-------------+--------+--------+-----------+ @@ -444,9 +446,9 @@ Available pre-trained models and their comparison with existing benchmarks: +----------------+------+-------------------------------------------------------------------------------------+---------------+---------+------------+------------------+ | Dataset | Lang | Model | Metric | Valid | Test | Downloads | +================+======+=====================================================================================+===============+=========+============+==================+ -| `DSTC 2`_ [*]_ | En | :config:`bot with slot filler ` | Turn Accuracy | 0.521 | 0.529 | 400 Mb | +| `DSTC 2`_ [*]_ | En | :config:`bot with slot filler ` | Turn Accuracy | 0.544 | 0.542 | 400 Mb | + + +-------------------------------------------------------------------------------------+ +---------+------------+------------------+ -| | | :config:`bot with slot filler & intents & attention ` | | 0.555 | **0.561** | 8.5 Gb | +| | | :config:`bot with slot filler & intents & attention ` | | 0.548 | **0.553** | 8.5 Gb | +----------------+ +-------------------------------------------------------------------------------------+ +---------+------------+------------------+ | `DSTC 2`_ | | Bordes and Weston (2016) | | -- | 0.411 | -- | + + +-------------------------------------------------------------------------------------+ +---------+------------+------------------+ diff --git a/docs/features/pretrained_vectors.rst b/docs/features/pretrained_vectors.rst index 7b78658024..1bb4b9477f 100644 --- a/docs/features/pretrained_vectors.rst +++ b/docs/features/pretrained_vectors.rst @@ -9,6 +9,7 @@ We are publishing several pre-trained BERT models: * RuBERT for Russian language * Slavic BERT for Bulgarian, Czech, Polish, and Russian * Conversational BERT for informal English +* and Conversational BERT for informal Russian Description of these models is available in the :doc:`BERT section ` of the docs. @@ -23,15 +24,17 @@ Downloads The models can be run with the original `BERT repo `_ code. 
The download links are: -+----------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+ -| Description | Model parameters | Download link | -+======================+====================================================+============================================================================================================================================+ -| RuBERT | vocab size = 120K, parameters = 180M, size = 700MB | `[rubert_cased_L-12_H-768_A-12] `__ | -+----------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+ -| Slavic BERT | vocab size = 120K, parameters = 180M, size = 700MB | `[bg_cs_pl_ru_cased_L-12_H-768_A-12] `__ | -+----------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+ -| Conversational BERT | vocab size = 30K, parameters = 110M, size = 400MB | `[conversational_cased_L-12_H-768_A-12] `__ | -+----------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+ ++------------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+ +| Description | Model parameters | Download link | ++========================+====================================================+==================================================================================================================================================+ +| RuBERT | vocab size = 120K, parameters = 180M, size = 632MB | `[rubert_cased_L-12_H-768_A-12] `__ | ++------------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+ +| Slavic BERT | vocab size = 120K, parameters = 180M, size = 632MB | `[bg_cs_pl_ru_cased_L-12_H-768_A-12] `__ | ++------------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+ +| Conversational BERT | vocab size = 30K, parameters = 110M, size = 385MB | `[conversational_cased_L-12_H-768_A-12] `__ | ++------------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+ +| Conversational RuBERT | vocab size = 120K, parameters = 180M, size = 630MB | `[conversational_cased_L-12_H-768_A-12] `__ | ++------------------------+----------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+ ELMo ---- diff --git a/docs/features/skills/aiml_skill.rst b/docs/features/skills/aiml_skill.rst index 6038b4689c..2f2c09944e 100644 --- 
a/docs/features/skills/aiml_skill.rst
+++ b/docs/features/skills/aiml_skill.rst
@@ -20,7 +20,7 @@ parameter `path_to_aiml_scripts`.
 You can download bunch of free and ready for use AIML scripts from pandorabots
 repo: https://github.com/pandorabots/Free-AIML
 
-DeepPavlov library has default config for AIMLSkill here: :config:`configs/aiml_skill/aiml_skill.json `
+DeepPavlov library has default config for AIMLSkill here: :config:`configs/skills/aiml_skill.json `
 
 Usage
 ^^^^^^^^
diff --git a/docs/features/skills/dsl_skill.rst b/docs/features/skills/dsl_skill.rst
index c705b92c33..575558cfc6 100644
--- a/docs/features/skills/dsl_skill.rst
+++ b/docs/features/skills/dsl_skill.rst
@@ -1,14 +1,11 @@
 DSL Skill
 ======================
 
-An :doc:`DSL implementation`. DSL helps to easily create user-defined
- skills for dialog systems.
+A :doc:`DSL implementation`. DSL helps to easily create user-defined skills for dialog systems.
 
-For the case when DSL skill matched utterance and found response it outputs response with confidence
-value.
+When the DSL skill matches an utterance and finds a response, it outputs the response with a confidence value.
 
-For the case when no match occurred DSL skill returns the argument `on_invalid_command` ("Простите, я вас не понял" by delault)
- as utterance and sets confidence to `null_confidence` attribute (0 by default).
+When no match occurs, the DSL skill returns the `on_invalid_command` argument ("Простите, я вас не понял" by default) as the utterance and sets the confidence to the `null_confidence` attribute (0 by default).
 
 `on_invalid_command` and `null_confidence` can be changed in model config
diff --git a/docs/features/skills/go_bot.rst b/docs/features/skills/go_bot.rst
index bfcffabb95..b1b3079fef 100644
--- a/docs/features/skills/go_bot.rst
+++ b/docs/features/skills/go_bot.rst
@@ -325,11 +325,9 @@ Scores for different modifications of our bot model and comparison with existing
 +================+======+===========================================+======================================================================+===============+===========+===============+
 | `DSTC 2`_ [*]_ | En | basic bot | :config:`gobot_dstc2_minimal.json ` | Turn Accuracy | 0.380 | 10 Mb |
 + + +-------------------------------------------+----------------------------------------------------------------------+ +-----------+---------------+
-| | | bot with slot filler | :config:`gobot_dstc2.json ` | | 0.529 | 400 Mb |
+| | | bot with slot filler | :config:`gobot_dstc2.json ` | | 0.542 | 400 Mb |
 + + +-------------------------------------------+----------------------------------------------------------------------+ +-----------+---------------+
-| | | bot with slot filler & intents | | | 0.531 | -- |
-+ + +-------------------------------------------+----------------------------------------------------------------------+ +-----------+---------------+
-| | | bot with slot filler, intents & attention | :config:`gobot_dstc2_best.json ` | | **0.561** | 8.5 Gb |
+| | | bot with slot filler, intents & attention | :config:`gobot_dstc2_best.json ` | | **0.553** | 8.5 Gb |
 +----------------+ +-------------------------------------------+----------------------------------------------------------------------+ +-----------+---------------+
 | `DSTC 2`_ | | Bordes and Weston (2016) [3]_ | -- | | 0.411 | -- |
 + + +-------------------------------------------+----------------------------------------------------------------------+ +-----------+---------------+
diff --git a/docs/features/skills/rasa_skill.rst b/docs/features/skills/rasa_skill.rst
new file mode 100644
index 0000000000..0e9427a119
--- /dev/null
+++ b/docs/features/skills/rasa_skill.rst
@@ -0,0 +1,51 @@
+Rasa Skill
+======================
+
+A :class:`Rasa wrapper implementation` that reads a folder with Rasa models
+(provided by the ``path_to_models`` argument), initializes a Rasa Agent with this configuration and responds to incoming
+utterances with the responses predicted by Rasa. Each response has a confidence value estimated as the product of the
+scores of the actions executed by the Rasa system in the current prediction step (a prediction step in Rasa usually consists of
+multiple actions). If Rasa responds with multiple ``BotUttered`` actions, the phrases are merged into one utterance
+separated by ``'\n'``.
+
+Quick Start
+-----------
+To set up a Rasa Skill, you need a working Rasa project at some path. Then you can specify the path to Rasa's
+models (usually a folder named ``models`` inside the project path) at initialization of the Rasa Skill class
+by providing the ``path_to_models`` attribute.
+
+Dummy Rasa project
+------------------
+DeepPavlov library has :config:`a template config for RASASkill`.
+This project is in essence a working Rasa project created with the ``rasa init`` and ``rasa train`` commands
+with minimal additions. The Rasa bot can greet, tell what it can do and detect the user's mood sentiment.
+
+The template DeepPavlov config specifies only one component (RASASkill) in :doc:`a pipeline`.
+The configuration also specifies ``metadata.requirements``, the file with the Rasa dependency, and
+``metadata.download``, which tells DeepPavlov to download and unpack the gzipped template project into the
+``{DOWNLOADS_PATH}`` subdirectory.
+
+If you create a configuration for a Rasa project hosted on your machine, you don't need to specify ``metadata.download``
+and just need to set ``path_to_models`` of the ``rasa_skill`` component correctly.
+``path_to_models`` needs to be a path to your Rasa's ``models`` directory.
+
+See `Rasa's documentation `_ for an explanation of how
+to create a project.
+
+Usage without DeepPavlov configuration files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code:: python
+
+    from deeppavlov.agents.default_agent.default_agent import DefaultAgent
+    from deeppavlov.agents.processors.highest_confidence_selector import HighestConfidenceSelector
+    from deeppavlov.skills.rasa_skill.rasa_skill import RASASkill
+
+    rasa_skill_config = {
+        'path_to_models': ...,  # set to the path of your Rasa ``models`` directory
+    }
+
+    rasa_skill = RASASkill(**rasa_skill_config)
+    agent = DefaultAgent([rasa_skill], skills_selector=HighestConfidenceSelector())
+    responses = agent(["Hello"])
+    print(responses)
diff --git a/docs/index.rst b/docs/index.rst
index f3da1da9f2..867822c649 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -51,6 +51,7 @@ Welcome to DeepPavlov's documentation!
    Frequently Asked Questions Answering
    eCommerce Bot
    AIML
+   Rasa
    DSL
 
@@ -60,6 +61,7 @@ Welcome to DeepPavlov's documentation!
    :caption: Integrations
 
    REST API
+   Socket API
    Telegram integration
    Yandex Alice integration
    Amazon Alexa integration
@@ -74,7 +76,7 @@ Welcome to DeepPavlov's documentation!
    :caption: Developer Guides
 
    Contribution guide
-   Registry your model
+   Register your model
 
 .. toctree::
diff --git a/docs/integrations/amazon_alexa.rst b/docs/integrations/amazon_alexa.rst
index 4385bd344c..856511637b 100644
--- a/docs/integrations/amazon_alexa.rst
+++ b/docs/integrations/amazon_alexa.rst
@@ -174,7 +174,7 @@ Alexa sends request to the https endpoint which was set in the **Endpoint** sect
 You should deploy DeepPavlov skill/model REST service on this endpoint or
 redirect it to your REST service. Full REST endpoint URL
-can be obtained by the swagger ``apidocs/`` endpoint. We remind you that Alexa requires https endpoint
+can be obtained at the swagger ``docs/`` endpoint. We remind you that Alexa requires https endpoint
 with valid certificate from CA. `Here is the guide `__ for running custom skill service with self-signed certificates in test mode.
 
diff --git a/docs/integrations/ms_bot.rst b/docs/integrations/ms_bot.rst
index 4f1443ea50..4954f656bb 100644
--- a/docs/integrations/ms_bot.rst
+++ b/docs/integrations/ms_bot.rst
@@ -75,7 +75,7 @@ which was set in the **Web App Bot connection configuration** section.
 You should deploy DeepPavlov skill/model REST service on this endpoint or
 terminate it to your REST service. Full REST endpoint URL
-can be obtained by the swagger ``apidocs/`` endpoint. We remind you that Microsoft Bot Framework requires https endpoint
+can be obtained at the swagger ``docs/`` endpoint. We remind you that Microsoft Bot Framework requires https endpoint
 with valid certificate from CA.
 
 Each DeepPavlov skill/model can be made available for MS Bot Framework
diff --git a/docs/integrations/rest_api.rst b/docs/integrations/rest_api.rst
index 98a93ec442..d91ce0b810 100644
--- a/docs/integrations/rest_api.rst
+++ b/docs/integrations/rest_api.rst
@@ -14,18 +14,42 @@ inference as a REST web service. The general method is:
    settings from ``deeppavlov/utils/settings/server_config.json``. The command
    will print the used host and port.
 
 Default web service properties
-(host, port, model endpoint, GET request arguments) can be modified via changing
+(host, port, POST request arguments) can be modified by changing the
 ``deeppavlov/utils/settings/server_config.json`` file.
 
+API routes
+----------
+
+/model
+""""""
+Send a POST request to ``<host>:<port>/model`` to infer the model. See details at
+:ref:`rest_api_docs`.
+
+/probe
+""""""
+Send a POST request to ``<host>:<port>/probe`` to check that the API is working. The
+server will send the response ``["Test passed"]`` if it is. Requests to
+``/probe`` are not logged.
+
+/api
+""""
+To get model argument names, send a GET request to ``<host>:<port>/api``. The server
+will return a list of argument names.
+
+.. _rest_api_docs:
+
+/docs
+"""""
+
 To interact with the REST API via graphical interface open
-``<host>:<port>/apidocs`` in a browser (Flasgger UI).
+``<host>:<port>/docs`` in a browser (Flasgger UI).
+
 
 Advanced configuration
-~~~~~~~~~~~~~~~~~~~~~~
+----------------------
 
 By modifying ``deeppavlov/utils/settings/server_config.json`` you can change
-host, port, model endpoint, GET request arguments and other properties of the
-API service.
+host, port, POST request arguments and other properties of the API service.
 
 Properties from ``common_defaults`` section are used by default unless they
 are overridden by model-specific properties, provided in
@@ -38,35 +62,38 @@ match with properties key from ``model_defaults`` section of
 
 For example, ``metadata/labels/server_utils`` tag from
 ``go_bot/gobot_dstc2.json`` references to the *GoalOrientedBot* section
-of ``server_config.json``. 
Therefore, ``model_endpoint`` parameter in -``common_defaults`` will be will be overridden with the same parameter -from ``model_defaults/GoalOrientedBot``. +of ``server_config.json``. Therefore, all parameters with non empty (i.e. not +``""``, not ``[]`` etc.) values from ``model_defaults/GoalOrientedBot`` will +overwrite the parameter values in ``common_defaults``. -Model argument names are provided as list in ``model_args_names`` -parameter, where arguments order corresponds to model API. +If ``model_args_names`` parameter of ``server_config.json`` is empty string, +then model argument names are provided as list from ``chainer/in`` section of +the model config file, where arguments order corresponds to model API. When inferencing model via REST api, JSON payload keys should match -model arguments names from ``model_args_names``. -Default argument name for one argument models is *"context"*. -Here are POST requests examples for some of the library models: - -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Model | POST request JSON payload example | -+=========================================+=================================================================================================================================================+ -| **One argument models** | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| NER model | {"context":["Elon Musk launched his cherry Tesla roadster to the Mars orbit"]} | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Intent classification model | {"context":["I would like to go to a restaurant with Asian cuisine this evening"]} | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Automatic spelling correction model | {"context":["errror"]} | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Ranking model | {"context":["What is the average cost of life insurance services?"]} | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Goal-oriented bot | {"context":["Hello, can you help me to find and book a restaurant this evening?"]} | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| **Multiple arguments models** | -+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ -| Question Answering model | | {"context":["After 1765, growing philosophical and political differences strained the relationship between Great Britain and its colonies."], | -| | |  "question":["What strained the relationship between Great Britain and its colonies?"]} | 
-+-----------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+ +model arguments names from ``chainer/in`` section. +If ``model_args_names`` parameter of ``server_config.json`` is list, its values +are used as model argument names instead of the list from model config's +``chainer/in`` section. +Here are POST request payload examples for some of the library models: + ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Model | POST request JSON payload example | ++=========================================+=====================================================================================================================================================+ +| **One argument models** | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| NER model | {"x":["Elon Musk launched his cherry Tesla roadster to the Mars orbit"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Intent classification model | {"x":["I would like to go to a restaurant with Asian cuisine this evening"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Automatic spelling correction model | {"x":["errror"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Ranking model | {"x":["What is the average cost of life insurance services?"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Goal-oriented bot | {"x":["Hello, can you help me to find and book a restaurant this evening?"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| **Multiple arguments models** | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ +| Question Answering model | | {"context_raw":["After 1765, growing philosophical and political differences strained the relationship between Great Britain and its colonies."], | +| | |  "question_raw":["What strained the relationship between Great Britain and its colonies?"]} | ++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+ diff --git a/docs/integrations/socket_api.rst b/docs/integrations/socket_api.rst new file mode 100644 index 0000000000..c4526d94e1 --- /dev/null +++ b/docs/integrations/socket_api.rst @@ -0,0 +1,125 @@ +Socket API +========== + +Each DeepPavlov model can be made 
available as a socket server. The general
+method is:
+
+.. code:: bash
+
+    python -m deeppavlov risesocket <config_path> [-d] [--socket-type <socket_type>] [-p <port>] \
+    [--socket-file <socket_file>]
+
+
+* ``-d``: downloads model specific data before starting the service.
+* ``--socket-type <socket_type>``: sets the socket address family to ``AF_INET``
+  if ``<socket_type>`` is ``TCP`` or to ``AF_UNIX`` if ``<socket_type>``
+  is ``UNIX``. Overrides default settings from
+  ``deeppavlov/utils/settings/socket_config.json``.
+* ``-p <port>``: sets the port to ``<port>`` if the socket address family is
+  ``AF_INET``. Overrides default settings from
+  ``deeppavlov/utils/settings/socket_config.json``.
+* ``--socket-file <socket_file>``: sets the file for socket binding to
+  ``<socket_file>`` if the socket address family is ``AF_UNIX``. Overrides
+  default settings from ``deeppavlov/utils/settings/socket_config.json``.
+
+The command will print the binding address: host and port for the ``AF_INET``
+socket family and the path to the UNIX socket file for the ``AF_UNIX`` socket family.
+Default service properties (socket address family, host, port, path to the UNIX
+socket file, socket buffer size, binding message) can be modified by changing the
+``deeppavlov/utils/settings/socket_config.json`` file.
+
+Advanced configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+By modifying ``deeppavlov/utils/settings/socket_config.json`` you can change the
+socket address family, host, port, path to the UNIX socket file and other
+properties of the API service.
+
+Properties from the ``common_defaults`` section are used by default unless they are
+overridden by model-specific properties, provided in the ``model_defaults`` section
+of the ``socket_config.json``. Model-specific properties are bound to the model
+by the ``server_utils`` label in the ``metadata/labels`` section of the model config.
+The value of the ``server_utils`` label from the model config should match a properties
+key from the ``model_defaults`` section of ``socket_config.json``.
+
+For example, the ``metadata/labels/server_utils`` tag from
+``deeppavlov/configs/squad/squad.json`` refers to the *SquadModel* section
+of ``socket_config.json``. Therefore, all parameters with non-empty (i.e. not
+``""``, not ``[]`` etc.) values from ``model_defaults/SquadModel`` will
+overwrite the parameter values in ``common_defaults``.
+
+If the ``model_args_names`` parameter of ``socket_config.json`` is an empty string,
+then model argument names are taken as a list from the ``chainer/in`` section of
+the model config file, where the argument order corresponds to the model API.
+When inferring a model via the socket API, serialized JSON payload keys should match
+the model argument names from the ``chainer/in`` section.
+If the ``model_args_names`` parameter of ``socket_config.json`` is a list, its values
+are used as model argument names instead of the list from the model config's
+``chainer/in`` section.
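+
+For illustration, here is a minimal sketch of what ``socket_config.json`` could
+look like. The key names follow the properties described above; the values
+(including the ``SquadModel`` override) are hypothetical examples, not the
+shipped defaults:
+
+.. code:: json
+
+    {
+        "common_defaults": {
+            "host": "0.0.0.0",
+            "port": 5000,
+            "socket_type": "TCP",
+            "unix_socket_file": "/tmp/deeppavlov_socket.s",
+            "bufsize": 1024,
+            "binding_message": "binding socket to",
+            "model_args_names": ""
+        },
+        "model_defaults": {
+            "SquadModel": {
+                "model_args_names": ["context_raw", "question_raw"]
+            }
+        }
+    }
+
+With such a file, a model whose ``metadata/labels/server_utils`` label is
+*SquadModel* would expect the ``context_raw`` and ``question_raw`` payload keys,
+while all other models would fall back to their ``chainer/in`` argument names.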
+
+Request payload examples for some of the library models:
+
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Model | Request JSON payload example |
++=========================================+=====================================================================================================================================================+
+| **One argument models** |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| NER model | {"x":["Elon Musk launched his cherry Tesla roadster to the Mars orbit"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Intent classification model | {"x":["I would like to go to a restaurant with Asian cuisine this evening"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Automatic spelling correction model | {"x":["errror"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Ranking model | {"x":["What is the average cost of life insurance services?"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Goal-oriented bot | {"x":["Hello, can you help me to find and book a restaurant this evening?"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Multiple arguments models** |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+| Question Answering model | | {"context_raw":["After 1765, growing philosophical and political differences strained the relationship between Great Britain and its colonies."], |
+| | |  "question_raw":["What strained the relationship between Great Britain and its colonies?"]} |
++-----------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
+
+Socket client example (Python)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A socket client for the :doc:`SQuAD ` model with a batch of
+two elements:
+
+.. code-block:: python
+
+    # squad-client.py
+
+    import json
+    import socket
+
+    socket_payload = {
+        "context_raw": [
+            "All work and no play makes Jack a dull boy",
+            "I used to be an adventurer like you, then I took an arrow in the knee"
+        ],
+        "question_raw": [
+            "What makes Jack a dull boy?",
+            "Who I used to be?"
+        ]
+    }
+    dumped_socket_payload = json.dumps(socket_payload)
+
+    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+        s.connect(('0.0.0.0', 5000))
+        s.sendall(dumped_socket_payload.encode('utf-8'))
+        serialized_payload = s.recv(1024)
+        json_payload = json.loads(serialized_payload)
+
+    print(json_payload)
+
+To start a socket server with the ``squad_bert`` model, run:
+
+.. code:: bash
+
+    python -m deeppavlov risesocket -d squad_bert --socket-type TCP -p 5000
+
+
+To start the socket client in another terminal, run:
+
+.. code:: bash
+
+    python squad-client.py
diff --git a/docs/intro/quick_start.rst b/docs/intro/quick_start.rst
index c2fb7195a3..db70fdc193 100644
--- a/docs/intro/quick_start.rst
+++ b/docs/intro/quick_start.rst
@@ -67,6 +67,8 @@ There are even more actions you can perform with configs:
 
     * ``interact`` to interact via CLI,
     * ``riseapi`` to run a REST API server (see :doc:`docs `),
+    * ``risesocket`` to run a socket API server (see :doc:`docs
+      `),
     * ``interactbot`` to run as a Telegram bot (see :doc:`docs `),
     * ``interactmsbot`` to run a Miscrosoft Bot Framework server (see
diff --git a/examples/classification_tutorial.ipynb b/examples/classification_tutorial.ipynb
index b80dcb5918..b0c7915b1d 100644
--- a/examples/classification_tutorial.ipynb
+++ b/examples/classification_tutorial.ipynb
@@ -48,7 +48,7 @@
     " * [Vocabulary](#Vocabulary)\n",
     "3. [Featurization](#Featurization): [docs link](https://deeppavlov.readthedocs.io/en/latest/components/data_processors.html), [pre-trained embeddings link](https://deeppavlov.readthedocs.io/en/latest/intro/pretrained_vectors.html)\n",
     " * [Bag-of-words embedder](#Bag-of-words)\n",
-    " * [TF-IDF vectorizer](#TF-IDF Vectorizer)\n",
+    " * [TF-IDF vectorizer](#TF-IDF-Vectorizer)\n",
     " * [GloVe embedder](#GloVe-embedder)\n",
     " * [Mean GloVe embedder](#Mean-GloVe-embedder)\n",
     " * [GloVe weighted by TF-IDF embedder](#GloVe-weighted-by-TF-IDF-embedder)\n",
diff --git a/examples/gobot_tutorial.ipynb b/examples/gobot_tutorial.ipynb
new file mode 100644
index 0000000000..8d7e6f14a1
--- /dev/null
+++ b/examples/gobot_tutorial.ipynb
@@ -0,0 +1,1729 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### You can also run the notebook in [COLAB](https://colab.research.google.com/github/deepmipt/DeepPavlov/blob/master/examples/gobot_tutorial.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install deeppavlov"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Goal-oriented bot in DeepPavlov"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This tutorial is focused on building a goal-oriented dialogue system:\n",
+    "\n",
+    "0. [Data preparation](#0.-Data-Preparation)\n",
+    "1. [Build database of items](#1.-Build-database-of-items)\n",
+    "2. [Build Slot Filler](#2.-Build-Slot-Filler)\n",
+    "3. [Train bot](#3.-Train-bot)\n",
+    "\n",
+    "An example of the final model served as a Telegram bot:\n",
+    "\n",
+    "![gobot_example.png](img/gobot_example.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 0. Data Preparation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This tutorial's dialogue system works in the domain of restaurant booking. The [Dialogue State Tracking Challenge 2 (DSTC-2)](http://camdial.org/~mh521/dstc/) dataset provides dialogues of a human talking to a booking system labelled with slots and dialogue actions. The labels will be used for training a dialogue policy network.\n",
+    "\n",
+    "See below a small chunk of the data. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2019-09-04 14:40:33.370 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 269: [PosixPath('my_data/simple-dstc2-val.json'), PosixPath('my_data/simple-dstc2-tst.json')]]\n",
+      "2019-09-04 14:40:33.371 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 270: [downloading data from http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz to my_data]\n",
+      "2019-09-04 14:40:33.399 INFO in 'deeppavlov.core.data.utils'['utils'] at line 63: Downloading from http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz to my_data/simple_dstc2.tar.gz\n",
+      "100%|██████████| 497k/497k [00:00<00:00, 67.5MB/s]\n",
+      "2019-09-04 14:40:33.410 INFO in 'deeppavlov.core.data.utils'['utils'] at line 201: Extracting my_data/simple_dstc2.tar.gz archive into my_data\n",
+      "2019-09-04 14:40:33.442 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from my_data/simple-dstc2-trn.json]\n",
+      "2019-09-04 14:40:33.534 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from my_data/simple-dstc2-val.json]\n",
+      "2019-09-04 14:40:33.604 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from my_data/simple-dstc2-tst.json]\n",
+      "2019-09-04 14:40:33.652 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 282: There are 9115 samples in train split.\n",
+      "2019-09-04 14:40:33.652 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 283: There are 6231 samples in valid split.\n",
+      "2019-09-04 14:40:33.653 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 284: There are 6345 samples in test split.\n"
+     ]
+    }
+   ],
+   "source": [
+    "from deeppavlov.dataset_readers.dstc2_reader import SimpleDSTC2DatasetReader\n",
+    "\n",
+    "data = SimpleDSTC2DatasetReader().read('my_data')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "simple-dstc2-templates.txt simple-dstc2-tst.json\r\n",
+      "simple-dstc2-trn.json\t simple-dstc2-val.json\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!ls my_data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The training/validation/test data is stored in json files (`simple-dstc2-trn.json`, `simple-dstc2-val.json` and `simple-dstc2-tst.json`):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[\r\n",
+      " [\r\n",
+      " {\r\n",
+      " \"speaker\": 2,\r\n",
+      " \"text\": \"Hello, welcome to the Cambridge restaurant system. You can ask for restaurants by area, price range or food type. 
How may I help you?\",\r\n", + " \"slots\": [],\r\n", + " \"act\": \"welcomemsg\"\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 1,\r\n", + " \"text\": \"cheap restaurant\",\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"pricerange\",\r\n", + " \"cheap\"\r\n", + " ]\r\n", + " ]\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 2,\r\n", + " \"text\": \"What kind of food would you like?\",\r\n", + " \"slots\": [],\r\n", + " \"act\": \"request_food\"\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 1,\r\n", + " \"text\": \"any\",\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"this\",\r\n", + " \"dontcare\"\r\n", + " ]\r\n", + " ]\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 2,\r\n", + " \"text\": \"What part of town do you have in mind?\",\r\n", + " \"slots\": [],\r\n", + " \"act\": \"request_area\"\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 1,\r\n", + " \"text\": \"south\",\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"area\",\r\n", + " \"south\"\r\n", + " ]\r\n", + " ]\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 2,\r\n", + " \"text\": \"api_call area=\\\"south\\\" food=\\\"dontcare\\\" pricerange=\\\"cheap\\\"\",\r\n", + " \"db_result\": {\r\n", + " \"food\": \"chinese\",\r\n", + " \"pricerange\": \"cheap\",\r\n", + " \"area\": \"south\",\r\n", + " \"addr\": \"cambridge leisure park clifton way cherry hinton\",\r\n", + " \"phone\": \"01223 244277\",\r\n", + " \"postcode\": \"c.b 1, 7 d.y\",\r\n", + " \"name\": \"the lucky star\"\r\n", + " },\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"area\",\r\n", + " \"south\"\r\n", + " ],\r\n", + " [\r\n", + " \"pricerange\",\r\n", + " \"cheap\"\r\n", + " ],\r\n", + " [\r\n", + " \"food\",\r\n", + " \"dontcare\"\r\n", + " ]\r\n", + " ],\r\n", + " \"act\": \"api_call\"\r\n", + " },\r\n", + " {\r\n", + " \"speaker\": 2,\r\n", + " \"text\": \"The lucky star is a nice place in the south of town serving tasty chinese food.\",\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"area\",\r\n", + " \"south\"\r\n", + " ],\r\n", + " [\r\n", + " \"pricerange\",\r\n", + " \"cheap\"\r\n", + " ],\r\n", + " [\r\n", + " \"name\",\r\n", + " \"the lucky star\"\r\n", + " ],\r\n", + " [\r\n", + " \"food\",\r\n", + " \"chinese\"\r\n", + " ]\r\n", + " ],\r\n", + " \"act\": \"inform_area+inform_food+offer_name\"\r\n", + " },\r\n" + ] + } + ], + "source": [ + "!head -n 101 my_data/simple-dstc2-trn.json" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from deeppavlov.dataset_iterators.dialog_iterator import DialogDatasetIterator\n", + "\n", + "iterator = DialogDatasetIterator(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can now iterate over batches of preprocessed DSTC-2 dialogs:" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "User utterances:\n", + "----------------\n", + "\n", + "[ {'prev_resp_act': None, 'text': ''},\n", + " { 'prev_resp_act': 'welcomemsg',\n", + " 'slots': [['pricerange', 'moderate'], ['area', 'north']],\n", + " 'text': 'im looking for a moderately priced restaurant in the north '\n", + " 'part of town'},\n", + " { 'db_result': { 'addr': '7 milton road chesterton',\n", + " 'area': 'north',\n", + " 'food': 'indian',\n", + " 'name': 'the nirala',\n", + " 'phone': '01223 360966',\n", + " 'postcode': 'c.b 4, 1 u.y',\n", + " 'pricerange': 'moderate'},\n", + " 'prev_resp_act': 'api_call',\n", + " 'slots': [['pricerange', 
'moderate'], ['area', 'north']],\n", + " 'text': 'im looking for a moderately priced restaurant in the north '\n", + " 'part of town'},\n", + " { 'prev_resp_act': 'inform_area+inform_pricerange+offer_name',\n", + " 'slots': [['slot', 'phone']],\n", + " 'text': 'what is the phone number'},\n", + " {'prev_resp_act': 'inform_phone+offer_name', 'text': 'thank you goodbye'}]\n", + "\n", + "System responses:\n", + "-----------------\n", + "\n", + "[ { 'act': 'welcomemsg',\n", + " 'text': 'Hello, welcome to the Cambridge restaurant system. You can '\n", + " 'ask for restaurants by area, price range or food type. How '\n", + " 'may I help you?'},\n", + " { 'act': 'api_call',\n", + " 'slots': [['pricerange', 'moderate'], ['area', 'north']],\n", + " 'text': 'api_call area=\"north\" food=\"dontcare\" pricerange=\"moderate\"'},\n", + " { 'act': 'inform_area+inform_pricerange+offer_name',\n", + " 'slots': [ ['pricerange', 'moderate'],\n", + " ['area', 'north'],\n", + " ['name', 'the nirala']],\n", + " 'text': 'The nirala is a nice place in the north of town and the '\n", + " 'prices are moderate.'},\n", + " { 'act': 'inform_phone+offer_name',\n", + " 'slots': [['phone', '01223 360966'], ['name', 'the nirala']],\n", + " 'text': 'The phone number of the nirala is 01223 360966.'},\n", + " {'act': 'bye', 'text': 'You are welcome!'}]\n" + ] + } + ], + "source": [ + "from pprint import pprint\n", + "\n", + "for dialog in iterator.gen_batches(batch_size=1, data_type='train'):\n", + " turns_x, turns_y = dialog\n", + " \n", + " print(\"User utterances:\\n----------------\\n\")\n", + " pprint(turns_x[0], indent=4)\n", + " print(\"\\nSystem responses:\\n-----------------\\n\")\n", + " pprint(turns_y[0], indent=4)\n", + " \n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "!cp my_data/simple-dstc2-trn.json my_data/simple-dstc2-trn.full.json" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train set is reduced to 50 dialogues (out of 967).\n" + ] + } + ], + "source": [ + "import json\n", + "\n", + "NUM_TRAIN = 50\n", + "\n", + "with open('my_data/simple-dstc2-trn.full.json', 'rt') as fin:\n", + " data = json.load(fin)\n", + "with open('my_data/simple-dstc2-trn.json', 'wt') as fout:\n", + " json.dump(data[:NUM_TRAIN], fout, indent=2)\n", + "print(f\"Train set is reduced to {NUM_TRAIN} dialogues (out of {len(data)}).\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Build database of items" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " \n", + "![gobot_database.png](img/gobot_database.png)\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For a valid goal-oriented bot there should be a `database` of relevant items. 
In the case of restaurant booking it will contain all available restaurants and their info.\n", + "\n", + " >> database([{'pricerange': 'cheap', 'area': 'south'}])\n", + " \n", + " Out[1]: \n", + " [[{'name': 'the lucky star',\n", + " 'food': 'chinese',\n", + " 'pricerange': 'cheap',\n", + " 'area': 'south',\n", + " 'addr': 'cambridge leisure park clifton way cherry hinton',\n", + " 'phone': '01223 244277',\n", + " 'postcode': 'c.b 1, 7 d.y'},\n", + " {'name': 'nandos',\n", + " 'food': 'portuguese',\n", + " 'pricerange': 'cheap',\n", + " 'area': 'south',\n", + " 'addr': 'cambridge leisure park clifton way',\n", + " 'phone': '01223 327908',\n", + " 'postcode': 'c.b 1, 7 d.y'}]]\n", + " \n", + "The dialogues in the training dataset should contain a `\"db_result\"` dictionary key. It is required for turns where system performs a special type of external action: an api call to the database of items. `\"db_result\"` should contain the result of the api call:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " {\r\n", + " \"speaker\": 2,\r\n", + " \"text\": \"api_call area=\\\"south\\\" food=\\\"dontcare\\\" pricerange=\\\"cheap\\\"\",\r\n", + " \"db_result\": {\r\n", + " \"food\": \"chinese\",\r\n", + " \"pricerange\": \"cheap\",\r\n", + " \"area\": \"south\",\r\n", + " \"addr\": \"cambridge leisure park clifton way cherry hinton\",\r\n", + " \"phone\": \"01223 244277\",\r\n", + " \"postcode\": \"c.b 1, 7 d.y\",\r\n", + " \"name\": \"the lucky star\"\r\n", + " },\r\n", + " \"slots\": [\r\n", + " [\r\n", + " \"area\",\r\n", + " \"south\"\r\n", + " ],\r\n", + " [\r\n", + " \"pricerange\",\r\n", + " \"cheap\"\r\n", + " ],\r\n", + " [\r\n", + " \"food\",\r\n", + " \"dontcare\"\r\n", + " ]\r\n", + " ],\r\n", + " \"act\": \"api_call\"\r\n", + " },\r\n" + ] + } + ], + "source": [ + "!head -n 78 my_data/simple-dstc2-trn.json | tail +51" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2019-09-04 14:40:49.312 WARNING in 'deeppavlov.core.models.serializable'['serializable'] at line 47: No load path is set for Sqlite3Database in 'infer' mode. Using save path instead\n", + "2019-09-04 14:40:49.313 INFO in 'deeppavlov.core.data.sqlite_database'['sqlite_database'] at line 70: Initializing empty database on /home/vimary/code-projects/Pilot/examples/my_bot/db.sqlite.\n" + ] + } + ], + "source": [ + "from deeppavlov.core.data.sqlite_database import Sqlite3Database\n", + "\n", + "database = Sqlite3Database(primary_keys=[\"name\"],\n", + " save_path=\"my_bot/db.sqlite\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Set `primary_keys` to a list of slot names that have unique values for different items (common SQL term). 
For the case of DSTC-2, the primary slot is restaurant name.\n", + "\n", + "Let's find all `\"db_result\"` api call results and add it to our database of restaurants:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2019-09-04 14:40:50.332 INFO in 'deeppavlov.core.data.sqlite_database'['sqlite_database'] at line 145: Created table with keys {'food': 'text', 'postcode': 'text', 'pricerange': 'text', 'area': 'text', 'phone': 'text', 'name': 'text', 'addr': 'text'}.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Adding 3016 items.\n" + ] + } + ], + "source": [ + "db_results = []\n", + "\n", + "for dialog in iterator.gen_batches(batch_size=1, data_type='all'):\n", + " turns_x, turns_y = dialog\n", + " db_results.extend(x['db_result'] for x in turns_x[0] if x.get('db_result'))\n", + "\n", + "print(f\"Adding {len(db_results)} items.\")\n", + "if db_results:\n", + " database.fit(db_results)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Interacting with database" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now play with the database and make requests to it:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[[{'food': 'chinese',\n", + " 'postcode': 'c.b 1, 7 d.y',\n", + " 'pricerange': 'cheap',\n", + " 'area': 'south',\n", + " 'phone': '01223 244277',\n", + " 'name': 'the lucky star',\n", + " 'addr': 'cambridge leisure park clifton way cherry hinton'},\n", + " {'food': 'portuguese',\n", + " 'postcode': 'c.b 1, 7 d.y',\n", + " 'pricerange': 'cheap',\n", + " 'area': 'south',\n", + " 'phone': '01223 327908',\n", + " 'name': 'nandos',\n", + " 'addr': 'cambridge leisure park clifton way'}]]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "database([{'pricerange': 'cheap', 'area': 'south'}])" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "db.sqlite\r\n" + ] + } + ], + "source": [ + "!ls my_bot" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. 
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "db.sqlite\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!ls my_bot"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Build Slot Filler"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![gobot_slotfiller.png](img/gobot_slotfiller.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A slot filler is a component that takes text as input and outputs a dictionary of slot names and their values:\n",
+    "\n",
+    "    >> slot_filler(['I would like some chineese food'])\n",
+    "    \n",
+    "    Out[1]: [{'food': 'chinese'}]\n",
+    "\n",
+    "To implement a slot filler you need to provide:\n",
+    "\n",
+    "- **slot types**,\n",
+    "- all possible **slot values**,\n",
+    "- optionally, examples of mentions for every value of each slot.\n",
+    "\n",
+    "The data should be put into a `slot_vals.json` file with the following format:\n",
+    "\n",
+    "    {\n",
+    "        'food': {\n",
+    "            'chinese': ['chinese', 'chineese', 'chines'],\n",
+    "            'french': ['french', 'freench'],\n",
+    "            'dontcare': ['any food', 'any type of food']\n",
+    "        }\n",
+    "    }\n",
+    "\n",
+    "Let's use a simple non-trainable slot filler that relies on Levenshtein distance."
+   ]
+  },
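+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To build some intuition first, here is a minimal sketch of such a fuzzy matcher in plain Python (illustrative only: this is not DeepPavlov's actual implementation; `difflib`'s similarity ratio stands in for Levenshtein distance, and the tiny `SLOT_VALS` dictionary is made up):\n",
+    "\n",
+    "    from difflib import SequenceMatcher\n",
+    "    \n",
+    "    SLOT_VALS = {'food': {'chinese': ['chinese', 'chineese', 'chines'],\n",
+    "                          'french': ['french', 'freench']}}\n",
+    "    \n",
+    "    def fill_slots(utterance, threshold=0.8):\n",
+    "        # compare every token with every known mention variant\n",
+    "        slots = {}\n",
+    "        for token in utterance.lower().split():\n",
+    "            for slot, values in SLOT_VALS.items():\n",
+    "                for value, variants in values.items():\n",
+    "                    if any(SequenceMatcher(None, token, v).ratio() >= threshold\n",
+    "                           for v in variants):\n",
+    "                        slots[slot] = value\n",
+    "        return slots\n",
+    "    \n",
+    "    >> fill_slots('i would like some chineese food')\n",
+    "    \n",
+    "    Out[1]: {'food': 'chinese'}\n",
+    "\n",
+    "A real component must additionally handle multi-word mentions such as 'any type of food', which means matching over spans of tokens rather than single tokens. DeepPavlov provides such a component; let's download its database of slot values:"
+   ]
+  },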
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2019-09-04 14:40:53.225 INFO in 'deeppavlov.core.data.utils'['utils'] at line 63: Downloading from http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz to my_bot/slotfill/dstc_slot_vals.tar.gz\n",
+      "100%|██████████| 1.62k/1.62k [00:00<00:00, 11.1MB/s]\n",
+      "2019-09-04 14:40:53.227 INFO in 'deeppavlov.core.data.utils'['utils'] at line 201: Extracting my_bot/slotfill/dstc_slot_vals.tar.gz archive into my_bot/slotfill\n"
+     ]
+    }
+   ],
+   "source": [
+    "from deeppavlov.download import download_decompress\n",
+    "\n",
+    "download_decompress(url='http://files.deeppavlov.ai/deeppavlov_data/dstc_slot_vals.tar.gz',\n",
+    "                    download_path='my_bot/slotfill')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "dstc_slot_vals.json\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!ls my_bot/slotfill"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{\r\n",
+      "    \"food\": {\r\n",
+      "        \"caribbean\": [\r\n",
+      "            \"carraibean\",\r\n",
+      "            \"carribean\",\r\n",
+      "            \"caribbean\"\r\n",
+      "        ],\r\n",
+      "        \"kosher\": [\r\n",
+      "            \"kosher\"\r\n",
+      "        ],\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!head -n 10 my_bot/slotfill/dstc_slot_vals.json"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Metric scores on valid & test"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's check the performance of our slot filler on the DSTC-2 dataset:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from deeppavlov import configs\n",
+    "from deeppavlov.core.common.file import read_json\n",
+    "\n",
+    "slotfill_config = read_json(configs.ner.slotfill_simple_dstc2_raw)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We take the [original DSTC-2 slot-filling config](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/ner/slotfill_dstc2_raw.json) and change the variables that determine the data paths:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "slotfill_config['metadata']['variables']['DATA_PATH'] = 'my_data'\n",
+    "slotfill_config['metadata']['variables']['SLOT_VALS_PATH'] = 'my_bot/slotfill/dstc_slot_vals.json'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2019-09-04 14:40:55.992 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from /home/vimary/code-projects/Pilot/examples/my_data/simple-dstc2-trn.json]\n",
+      "2019-09-04 14:40:55.999 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from /home/vimary/code-projects/Pilot/examples/my_data/simple-dstc2-val.json]\n",
+      "2019-09-04 14:40:56.105 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 290: [loading dialogs from /home/vimary/code-projects/Pilot/examples/my_data/simple-dstc2-tst.json]\n",
+      "2019-09-04 14:40:56.150 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 282: There are 479 samples in train split.\n",
+      "2019-09-04 14:40:56.151 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 283: There are 6231 samples in valid split.\n",
+      "2019-09-04 14:40:56.151 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 284: There are 6345 samples in test split.\n",
+      "[nltk_data] Downloading package punkt to /home/vimary/nltk_data...\n",
+      "[nltk_data]   Package punkt is already up-to-date!\n",
+      "[nltk_data] Downloading package stopwords to /home/vimary/nltk_data...\n",
+      "[nltk_data]   Package stopwords is already up-to-date!\n",
+      "[nltk_data] Downloading package perluniprops to\n",
+      "[nltk_data]     /home/vimary/nltk_data...\n",
+      "[nltk_data]   Package perluniprops is already up-to-date!\n",
+      "[nltk_data] Downloading package nonbreaking_prefixes to\n",
+      "[nltk_data]     /home/vimary/nltk_data...\n",
+      "[nltk_data]   Package nonbreaking_prefixes is already up-to-date!\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{\"valid\": {\"eval_examples_count\": 1253, \"metrics\": {\"slots_accuracy\": 0.933}, \"time_spent\": \"0:00:34\"}}\n",
+      "{\"test\": {\"eval_examples_count\": 1190, \"metrics\": {\"slots_accuracy\": 0.9487}, \"time_spent\": \"0:00:31\"}}\n"
+     ]
+    }
+   ],
+   "source": [
+    "from deeppavlov import evaluate_model\n",
+    "\n",
+    "slotfill = evaluate_model(slotfill_config);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We've got a slot accuracy of **93.3% on the valid set** and **94.9% on the test set**."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Interacting with the slot filler"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from deeppavlov import build_model\n",
+    "\n",
+    "slotfill = build_model(slotfill_config)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[{'food': 'chinese', 'pricerange': 'cheap'}]"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "slotfill(['i want cheap chinee food'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Dumping the slot filler's config"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's save the slotfill config file to disk (we will need its path later):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import json\n",
+    "\n",
+    "with open('my_bot/slotfill_config.json', 'wt') as f:\n",
+    "    json.dump(slotfill_config, f)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "db.sqlite  slotfill  slotfill_config.json\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!ls my_bot"
+   ]
+  },
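+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick sanity check (optional), the dumped file can be loaded back with the same `read_json` helper we used earlier and should contain the paths we set above:\n",
+    "\n",
+    "    >> from deeppavlov.core.common.file import read_json\n",
+    "    >> read_json('my_bot/slotfill_config.json')['metadata']['variables']['SLOT_VALS_PATH']\n",
+    "    \n",
+    "    Out[1]: 'my_bot/slotfill/dstc_slot_vals.json'"
+   ]
+  },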
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Train bot"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's assemble all the modules together and train the final one: the dialogue policy network.\n",
+    "\n",
+    "![gobot_policy.png](img/gobot_policy.png)\n",
+    "\n",
+    "The policy network decides which action the system should take at each turn of a dialogue: should it say goodbye, request the user's location, or make an API call to a database?\n",
+    "\n",
+    "The policy network is a recurrent neural network (recurrent over utterances represented as bags of words) with a dense softmax layer on top. The network classifies each user utterance into one of the predefined system actions.\n",
+    "\n",
+    "![gobot_templates.png](img/gobot_templates.png)\n",
+    "\n",
+    "All actions available to the system should be listed in a `simple-dstc2-templates.txt` file. Each action should be associated with the string of the corresponding system response.\n",
+    "\n",
+    "Templates should be in the format `TAB