Effective Idiomaticity Detection with Consideration at Different Levels of Contextualization

NAACL SemEval 2022 [Paper]

Summary

This research presents a solution to SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding. The task is to determine, in a multilingual setting, whether a Multi-Word Expression (MWE) within a sentence is used idiomatically. Identifying idiomatic expressions in sentences with Large Language Models (LLMs) remains a challenging problem. This study explores how to build sentence embeddings that address this task effectively.

Core Idea

Read literally, word by word, the expression “wet blanket” means “a soaked cover.” In the context of a sentence, however, it is often used idiomatically to mean “a person who spoils the mood.” In other words, if the meaning derived from the sentence context differs from the meaning obtained by simply combining the individual words, the expression can be identified as idiomatic. Building on this idea, we ask: given a Multi-Word Expression (MWE), can idiomaticity be captured effectively by generating semantic embeddings that combine the contextual embedding of each word with its static embedding (which represents the literal combination of word meanings)?
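The intuition can be illustrated with a toy comparison. This is only a sketch, not the model from the paper: the 3-dimensional vectors are made up for illustration, and the helper names `cosine` and `mean_vector` are hypothetical. It averages the static vectors of "wet" and "blanket" to approximate the literal meaning, then checks how close two invented contextual vectors are to that literal composition.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Made-up static (context-free) word vectors, for illustration only.
static = {"wet": [1.0, 0.0, 0.0], "blanket": [0.0, 1.0, 0.0]}

# Literal meaning approximated as the mean of the static word vectors.
literal = mean_vector([static["wet"], static["blanket"]])

# Made-up contextual MWE vectors for a literal and an idiomatic sentence.
contextual_literal = [0.6, 0.4, 0.1]  # e.g. "He dried the wet blanket."
contextual_idiom = [0.1, 0.0, 0.9]    # e.g. "Don't be such a wet blanket."

sim_literal = cosine(contextual_literal, literal)
sim_idiom = cosine(contextual_idiom, literal)

print(f"literal usage:   {sim_literal:.3f}")
print(f"idiomatic usage: {sim_idiom:.3f}")
```

With these toy numbers the literal usage stays close to the static composition while the idiomatic usage drifts away from it, which is the gap the core idea aims to exploit.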

Methods

We propose a framework for embedding MWEs and their related sentences that utilizes both contextualized and static representations to maximize semantic information. For more details, please refer to the paper.
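As a rough picture of what "utilizing both contextualized and static representations" could look like, the sketch below concatenates a mean-pooled contextual MWE vector with a mean-pooled static one and scores the result with a logistic layer. Everything here is an assumption made for illustration: the pooling, the concatenation, the function names `mean_pool`, `combine`, and `predict_idiomatic`, and the hand-picked weights are hypothetical and are not taken from the paper or from `modeling.py`.

```python
import math


def mean_pool(token_vectors):
    """Average token vectors into a single fixed-size vector."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]


def combine(contextual_tokens, static_tokens):
    """Concatenate pooled contextual and static MWE vectors (assumed design)."""
    return mean_pool(contextual_tokens) + mean_pool(static_tokens)


def predict_idiomatic(features, weights, bias):
    """Logistic score in (0, 1); higher means 'more likely idiomatic'."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Made-up 2-d token vectors for a two-word MWE.
contextual_tokens = [[0.2, 0.8], [0.4, 0.6]]  # would come from a contextual encoder
static_tokens = [[1.0, 0.0], [0.0, 1.0]]      # would come from a static embedding table

features = combine(contextual_tokens, static_tokens)

# Hand-picked weights for illustration; a real model would learn these.
weights = [1.0, -1.0, 0.5, 0.5]
prob = predict_idiomatic(features, weights, bias=0.0)

print(len(features), round(prob, 3))
```

Keeping the two views side by side in one feature vector lets a classifier learn from their disagreement, but the actual architecture is described in the paper.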

Results

Run

You can download the data from here.

python main.py
