We invite participants to benchmark systems for word sense induction (WSI) across multiple languages. Unlike traditional approaches, this task evaluates WSI without relying on predefined sense inventories, offering a more theoretically plausible framework for understanding word meanings.

Participants will be provided with:

  • A set of polysemous target headwords.
  • Sentences containing these words in diverse contexts.

The goal is to cluster the sentences according to the sense in which the target word is used.
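
To make the clustering goal concrete, here is a toy sketch in Python. It is not the task's baseline or an endorsed method: the announcement does not prescribe any approach. The function names, the stopword list, the similarity threshold, and the example sentences are all assumptions made for illustration. It represents each sentence as a bag of context words, then merges the most similar clusters until no pair is similar enough.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not part of the task data).
STOPWORDS = {"the", "a", "an", "of", "on", "in", "at", "to", "and",
             "she", "he", "we", "by", "was", "is"}

def context_vector(sentence, target):
    """Bag-of-words counts of the context words, excluding the target itself.
    Only the exact lowercase form of `target` is excluded (no lemmatization)."""
    tokens = re.findall(r"\w+", sentence.lower())
    return Counter(t for t in tokens if t != target and t not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def induce_senses(target, sentences, threshold=0.2):
    """Repeatedly merge the two most similar clusters until no pair exceeds
    `threshold` (a hypothetical value). Returns one label per sentence."""
    clusters = [([i], context_vector(s, target)) for i, s in enumerate(sentences)]
    while len(clusters) > 1:
        best, pair = threshold, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = cosine(clusters[a][1], clusters[b][1])
                if sim > best:
                    best, pair = sim, (a, b)
        if pair is None:
            break
        a, b = pair
        merged = (clusters[a][0] + clusters[b][0], clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in pair] + [merged]
    labels = [0] * len(sentences)
    for label, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Made-up example contexts for the headword "bank".
sentences = [
    "She deposited money at the bank.",
    "The bank approved the loan and the money arrived.",
    "We sat on the river bank fishing.",
    "Fishing from the muddy bank of the river.",
]
labels = induce_senses("bank", sentences)
print(labels)
```

Note that no sense inventory appears anywhere: the number of clusters falls out of the threshold rather than being fixed in advance. A realistic system would likely swap the word counts for contextual embeddings and handle inflected forms of the headword, which matters for languages such as Czech or Estonian.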

A different set of headwords and contexts will be provided for each of the following languages: English, Czech, German, Spanish, Estonian, and Chinese.

For each language, there will be approximately 25 headwords, each with at least 1,500 contexts.

