From 7c2b4684340f6d45a31aeea4180afd0c2b87c5a5 Mon Sep 17 00:00:00 2001
From: cjer <9987721+cjer@users.noreply.github.com>
Date: Tue, 7 Sep 2021 10:59:06 +0300
Subject: [PATCH 1/5] Update api_usage.ipynb
---
api/api_usage.ipynb | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/api/api_usage.ipynb b/api/api_usage.ipynb
index a59f969..f12e8b0 100644
--- a/api/api_usage.ipynb
+++ b/api/api_usage.ipynb
@@ -656,8 +656,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Working with sequence labels\n",
- "`iobes` can be used to parse the predictions (`pip install iobes`)"
+ "## Display and view ents\n",
]
},
{
From 53c43fbd0f2c08f009b18d2a289efc011f7d09b7 Mon Sep 17 00:00:00 2001
From: cjer <9987721+cjer@users.noreply.github.com>
Date: Tue, 7 Sep 2021 11:00:40 +0300
Subject: [PATCH 2/5] Update api_usage.ipynb
---
api/api_usage.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/api/api_usage.ipynb b/api/api_usage.ipynb
index f12e8b0..aa18355 100644
--- a/api/api_usage.ipynb
+++ b/api/api_usage.ipynb
@@ -656,7 +656,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Display and view ents\n",
+ "## Display and view ents\n"
]
},
{
From 96a980805e4c42be5b248fb343da748cd061c6ab Mon Sep 17 00:00:00 2001
From: cjer <9987721+cjer@users.noreply.github.com>
Date: Thu, 9 Sep 2021 13:20:23 +0300
Subject: [PATCH 3/5] Update README.md
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 1b3b92f..2ba0f4f 100644
--- a/README.md
+++ b/README.md
@@ -106,7 +106,7 @@ Finally, to get our desired output (tokens/morphemes), we can choose between dif
|
|
|
|
|`run_ner_model token-single` | `multi_to_single` | `morph_hybrid_align_tokens` |
-* Note: while the `morph_hybrid*` scenarios offer the best performance, they are less efficient since they requires running both `morph` and `token-multi` NER models.
+* Note: while the `morph_hybrid*` scenarios offer the best performance, they are slightly less efficient since they require running both `morph` and `token-multi` NER models (yap calls take up most of the runtime anyway, so this is not extremely significant).
## Important Notes
From 8d1217c70ceee004842e5902872d846af8fb8176 Mon Sep 17 00:00:00 2001
From: cjer <9987721+cjer@users.noreply.github.com>
Date: Sun, 12 Sep 2021 11:51:52 +0300
Subject: [PATCH 4/5] Update to MIT Press citation
---
README.md | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/README.md b/README.md
index 2ba0f4f..e7d7e9e 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ Table of Contents
## Introduction
-Code and models for neural modeling of Hebrew NER. Described in the TACL paper ["*Neural Modeling for Named Entities and Morphology (NEMO2)"*](https://arxiv.org/abs/2007.15620) along with extensive experiments on the different modeling scenarios provided in this repository.
+Code and models for neural modeling of Hebrew NER. Described in the TACL paper ["*Neural Modeling for Named Entities and Morphology (NEMO2)"*](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00404/107206/Neural-Modeling-for-Named-Entities-and-Morphology) along with extensive experiments on the different modeling scenarios provided in this repository.
## Main Features
@@ -155,19 +155,19 @@ In our NEMO2 paper we also evaluate our models on the [Ben-Mordecai H
If you use any of the NEMO2 code, models, embeddings or the NEMO corpus, please cite the NEMO2 paper:
```bibtex
-@article{DBLP:journals/corr/abs-2007-15620,
- author = {Dan Bareket and
- Reut Tsarfaty},
- title = {Neural Modeling for Named Entities and Morphology (NEMO{\^{}}2)},
- journal = {CoRR},
- volume = {abs/2007.15620},
- year = {2020},
- url = {https://arxiv.org/abs/2007.15620},
- archivePrefix = {arXiv},
- eprint = {2007.15620},
- timestamp = {Mon, 03 Aug 2020 14:32:13 +0200},
- biburl = {https://dblp.org/rec/journals/corr/abs-2007-15620.bib},
- bibsource = {dblp computer science bibliography, https://dblp.org}
+@article{10.1162/tacl_a_00404,
+ author = {Bareket, Dan and Tsarfaty, Reut},
+ title = "{Neural Modeling for Named Entities and Morphology (NEMO2)}",
+ journal = {Transactions of the Association for Computational Linguistics},
+ volume = {9},
+ pages = {909-928},
+ year = {2021},
+ month = {09},
+ abstract = "{Named Entity Recognition (NER) is a fundamental NLP task, commonly formulated as classification over a sequence of tokens. Morphologically rich languages (MRLs) pose a challenge to this basic formulation, as the boundaries of named entities do not necessarily coincide with token boundaries, rather, they respect morphological boundaries. To address NER in MRLs we then need to answer two fundamental questions, namely, what are the basic units to be labeled, and how can these units be detected and classified in realistic settings (i.e., where no gold morphology is available). We empirically investigate these questions on a novel NER benchmark, with parallel token-level and morpheme-level NER annotations, which we develop for Modern Hebrew, a morphologically rich-and-ambiguous language. Our results show that explicitly modeling morphological boundaries leads to improved NER performance, and that a novel hybrid architecture, in which NER precedes and prunes morphological decomposition, greatly outperforms the standard pipeline, where morphological decomposition strictly precedes NER, setting a new performance bar for both Hebrew NER and Hebrew morphological decomposition tasks.}",
+ issn = {2307-387X},
+ doi = {10.1162/tacl_a_00404},
+ url = {https://doi.org/10.1162/tacl\_a\_00404},
+ eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00404/1962472/tacl\_a\_00404.pdf},
}
```
From 566633677cb56ddc718ae901c673e9208d502b10 Mon Sep 17 00:00:00 2001
From: cjer <9987721+cjer@users.noreply.github.com>
Date: Tue, 14 Sep 2021 13:13:52 +0300
Subject: [PATCH 5/5] Bump API version to 0.2.0
---
api_main.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/api_main.py b/api_main.py
index 9918888..242a1a8 100644
--- a/api_main.py
+++ b/api_main.py
@@ -453,7 +453,7 @@ def get_spans(doc, token_fields=None, morph_fields=None):
app = FastAPI(
title="NEMO",
description=description,
- version="0.1.0",
+ version="0.2.0",
terms_of_service="https://github.com/OnlpLab/NEMO",
contact={
"name": "Dan Bareket",
@@ -739,4 +739,4 @@ def morph_hybrid_align_tokens(q: NEMOQuery,
include_yap_outputs: Optional[bool]=False):
return morph_hybrid(q, multi_model_name, morph_model_name, align_tokens=True,
verbose=verbose, include_yap_outputs=include_yap_outputs)
-
\ No newline at end of file
+