Program
Day 1: Thursday, December 5, 2024
Chair: Heike Zinsmeister |
|
09:30-10:00 |
TLT Opening |
10:00-11:00 |
Invited Speaker: Anna Nedoluzhko Multilingual Coreference and Treebanking: Benefits of Interaction [slides] |
11:00-11:30 | Coffee break |
Chair: Kiril Simov |
|
11:30-12:15 | Symmetric Dependency Structure of Coordination: Crosslinguistic Arguments from Dependency Length Minimization [slides]
Adam Przepiórkowski, Magdalena Borysiak, Adam Okrasiński, Bartosz Pobożniak, Wojciech Stempniak, Kamil Tomaszek, Adam Głowacki |
12:15-13:00 |
Dependency Structure of Coordination in Head-final Languages: a Dependency-Length-Minimization-Based Study [slides] Wojciech Stempniak |
13:00-14:30 | Lunch break |
Chair: Stefanie Dipper |
|
14:30-15:15 |
Developing the Egyptian-UJaen Treebank [slides] Roberto A. Diaz Hernandez, Marco Carlo Passarotti |
15:15-16:00 |
A First Look at the Ugaritic Poetic Text Corpus Tillmann Dönicke, Clemens Steinberger, Max-Ferdinand Zeterberg, Noah Kröll |
16:00-16:30 | Coffee break |
Chair: Sandra Kübler |
|
16:30-17:15 |
Building a Universal Dependencies Treebank for Georgian Irina Lobzhanidze, Erekle Magradze, Svetlana Berikashvili, Anzor Gozalishvili, Tata Jalaghonia |
17:15-18:00 |
LuxBank: The First Universal Dependency Treebank for Luxembourgish Alistair Plum, Caroline Döhmer, Emilia Milano, Anne-Marie Lutgen, Christoph Purschke |
18:45 |
Conference dinner Trattoria da Mario, Thadenstr. 1 Ecke Wohlwillstraße, Hamburg St. Pauli |
Day 2: Friday, December 6, 2024
Chair: Sarah Jablotschkin |
|
09:15-10:00 |
Stefanie Dipper, Ronja Laarmann-Quante |
10:00-10:45 |
Andrew Thomas Dyer, Ruveyda Betül Bahçeci, Maryam Rajestari, Andreas Rouvalis, Aarushi Singhal, Yuliya Stodolinska, Syahidah Asma Umniyati |
10:45-11:15 | Coffee break |
Chair: Daniel Dakota |
|
11:15-12:00 |
Introducing Shallow Syntactic Information within the Graph-based Dependency Parsing [slides] Nikolay Paev, Kiril Ivanov Simov, Petya Osenova |
12:00-13:00 |
Invited Speaker: Marcel Bollmann Increasing language diversity in NLP: Insights from CreoleVal [slides] |
13:00-14:30 | Lunch break |
Chair: Heike Zinsmeister |
14:30-16:30 |
Panel: Treebanks and linguistic annotation in the area of LLMs Marcel Bollmann, Daniel Dakota, Sandra Kübler, Anna Nedoluzhko, Juri Opitz |
16:30-16:45 | Closing |
Invited speaker
Multilingual Coreference and Treebanking: Benefits of Interaction
Anna Nedoluzhko (Charles University, Prague)
Several years ago, we created CorefUD, a harmonized collection of coreference datasets for multiple languages. This collection has grown steadily, with new languages and datasets added each year. Currently, CorefUD 1.2 includes 21 datasets across 15 languages. CorefUD is compatible with morphosyntactic annotations in the Universal Dependencies (UD) framework, highlighting the close relationship between two types of linguistic annotation: coreference and syntax. But how do these annotations interact? Do UD tree structures correspond to mention spans in coreference annotations? Are syntactic heads in UD equivalent to the head mentions in coreference annotation? Can reconstructed empty nodes in enhanced UD effectively align with zero anaphora? And how do zeros in coreference relate to syntactic structures across the diverse languages in the collection? In the talk, I will address these questions with a specific focus on zero anaphora, which was the special topic of the recent CRAC shared task on multilingual coreference resolution. [slides]
Invited speaker
Increasing language diversity in NLP: Insights from CreoleVal
Marcel Bollmann (Linköping University)
Linguistic diversity in NLP remains an important challenge, with many languages lagging behind in terms of available data and resources for training and evaluation of NLP models. In this talk, I will present CreoleVal, a project aimed at providing an evaluation benchmark for several Creole languages. I will discuss why we chose to work on Creoles in particular, what kinds of data and annotations we produced for CreoleVal, and what challenges we encountered in the process. Finally, I will give an outlook on challenges around data and data annotation in the TrustLLM project, an ongoing EU-funded project on creating trustworthy LLMs for the Germanic languages. [slides]
Panel
Treebanks and linguistic annotation in the area of LLMs
Panelists: Marcel Bollmann (Linköping University), Daniel Dakota (Indiana University), Sandra Kübler (Indiana University), Anna Nedoluzhko (Charles University, Prague), Juri Opitz (University of Zurich)
Chair: Heike Zinsmeister (University of Hamburg)
Guiding questions:
- Do LLMs make treebanks redundant?
- What can we learn from treebanks that we can’t learn from LLMs?
- Is it still justified to spend money on creating and maintaining treebanks?
90 min. statements and comments by the panelists, 30 min. general discussion.