This event has passed.

Journée “apprentissage des représentations de la parole et du langage”

Name: Journée “apprentissage des représentations de la parole et du langage”
Start: 2022-04-01T00:00:00+02:00
End: 2022-04-01T23:59:59+02:00
Location: Auditorium IMAG

April 1, 2022

Hello everyone (English version below),

As part of the GDR Traitement automatique des langues (GDR TAL), we are organizing a scientific day on “representation learning for speech and language”. It will take place in Grenoble on Friday, April 1, 2022. The goal is to bring together researchers from the written, spoken and signed language processing communities around the study of the representations that deep neural models extract from massive data. These models are now used in most classic NLP tasks, both in supervised learning, to map between the different modalities of language and spoken communication (e.g. ASR, text-to-speech), and in self-supervised learning, to extract representations that transfer to related tasks (for example, low-resource ones).

The day will be organized around 4 oral presentations and a poster session. The invited speakers are:

Pr. James Glass (MIT, USA) – https://www.csail.mit.edu/person/jim-glass

James Glass is a Senior Research Scientist at MIT where he leads the Spoken Language Systems Group in the Computer Science and Artificial Intelligence Laboratory. He is also a member of the Harvard University Program in Speech and Hearing Bioscience and Technology. Since obtaining his S.M. and Ph.D. degrees at MIT in Electrical Engineering and Computer Science, his research has focused on a wide range of speech and natural language processing topics. His current research activities include self-supervised or weakly supervised learning for speech and language processing, cross-modal learning between speech and vision, speech and language processing as a biomarker for health, and conversational interaction. He is an IEEE Fellow, and a Fellow of the International Speech Communication Association, and is currently an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence.
Dr. Diane Larlus (Naver Labs Europe) – https://dlarlus.github.io

Diane Larlus is a principal research scientist at Naver Labs Europe and leads a Chair on Lifelong representation learning within the MIAI research institute of Grenoble. Her research mainly focuses on learning representations with weak supervision, continual learning, and visual search. As a PhD student, she worked at INRIA Grenoble, France. After a postdoctoral experience at TU Darmstadt, Germany, she joined the European research center of Xerox. She now works at Naver Labs Europe.
Dr. Lucas Ondel (LISN, Paris) – https://lucasondel.github.io

Lucas Ondel is a post-doctoral researcher at LISN and holds a PhD from Brno University of Technology. His research focuses on unsupervised learning methods applied to the speech signal with the aim of automatically documenting languages. He has been a very active member of the “unsupervised speech” community (JSALT 2016, 2017, Zero Resource Challenge 2019) and he is the creator of BEER (the Bayesian spEEch Recognizer), a toolkit to automatically learn acoustic units from unlabeled speech.
Dr. Guillaume Wisniewski (LLF, Paris) – https://gw17.github.io

Guillaume Wisniewski is an Assistant Professor of Computational Linguistics at Université Paris Cité and a member of the Laboratoire de Linguistique Formelle (LLF). His research interests lie in the application of Machine Learning to study language, especially in a multilingual and low-resource context. His recent work focuses on explaining the ability of neural networks to solve an impressive number of NLP tasks and, more precisely, on finding whether the contextualized distributed representations they are able to uncover from raw texts are consistent with models derived from linguistic theories.

The poster session is open to all but is particularly intended for doctoral students; it will be an opportunity to exchange and make contacts with the other participants. We invite submissions on the following topics:

self-supervised learning algorithms for speech representations (wav2vec, CPC, HuBERT, etc.)
multimodal representation learning (e.g. text/vision, audiovisual speech) and transfer of representations between modalities (e.g. use of language models in speech recognition or speech synthesis systems)
disentanglement techniques applied to speech and language
interpretability of representations and their relations with the different linguistic levels
control of representations (latent dimensions) in performative systems
links with human language learning (discovery of language units, sensorimotor learning)
and any other topic related to the theme of the day.

Registration for the day and the lunch is closed (but if you have any problem, do not hesitate to contact us). The day will be partially broadcast via Zoom (https://grenoble-inp.zoom.us/j/93740876038 / Passcode: 988791)

(Provisional) program of the day:

9h-9h15: Welcome of the participants
9h15-9h30: Introduction to the day by the organizing committee
9h30-10h15: Talk by Dr. Lucas Ondel (LISN, Paris): “Using Continuums to Learn the Phonetic Structure of Speech”
10h15-10h45: Coffee break
10h45-11h30: Talk by Dr. Guillaume Wisniewski (LLF, Paris): “The long distance agreement task: assessing the capacity of NLM to uncover different representation for superficially similar but syntactically different structures”
11h30-12h: Beatrice Daille (LS2N, head of the GDR TAL): GDR TAL news; Vincent Claveau (IRISA) and DGA/AID: call for cooperative projects DGA/AID and GDR TAL CNRS
12h-14h: Lunch
14h-15h: Talk by Pr. James Glass (MIT, by videoconference from the USA): “Towards Unsupervised Speech Processing”
15h-16h15: Poster session (and coffee break)
16h15-17h: Talk by Dr. Diane Larlus (Naver Labs Europe): “Learning transferable visual representations”
17h: End of the day

We hope to see many of you there!

The organizing committee

Thomas Hueber (GIPSA-lab, Grenoble)

Frédéric Béchet (LIS, Marseille)

Marco Dinarelli (LIG, Grenoble)

Benoit Favre (LIS, Marseille)

Olivier Perrotin (GIPSA-lab, Grenoble)

———————

Dear colleagues,

In the framework of the GDR « Traitement automatique des langues » (GDR TAL), we are organizing a scientific day on “representation learning for speech and language”. It will take place in Grenoble on Friday, April 1st, 2022. The goal is to gather researchers from the speech and natural language processing communities around the topic of representation learning with end-to-end deep neural networks. These models are now used in most NLP and speech processing tasks to map between different modalities of spoken language (e.g. ASR, text-to-speech), or to extract high-level representations that can be transferred to downstream tasks (e.g. low-resource tasks).

The day will be organized around 4 oral presentations and a poster session. The invited speakers are:

Pr. James Glass (MIT, USA) – https://www.csail.mit.edu/person/jim-glass

James Glass is a Senior Research Scientist at MIT where he leads the Spoken Language Systems Group in the Computer Science and Artificial Intelligence Laboratory. He is also a member of the Harvard University Program in Speech and Hearing Bioscience and Technology. Since obtaining his S.M. and Ph.D. degrees at MIT in Electrical Engineering and Computer Science, his research has focused on a wide range of speech and natural language processing topics. His current research activities include self-supervised or weakly supervised learning for speech and language processing, cross-modal learning between speech and vision, speech and language processing as a biomarker for health, and conversational interaction. He is an IEEE Fellow, and a Fellow of the International Speech Communication Association, and is currently an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence.
Dr. Diane Larlus (Naver Labs Europe) – https://dlarlus.github.io

Diane Larlus is a principal research scientist at Naver Labs Europe and leads a Chair on Lifelong representation learning within the MIAI research institute of Grenoble. Her research mainly focuses on learning representations with weak supervision, continual learning, and visual search. As a PhD student, she worked at INRIA Grenoble, France. After a postdoctoral experience at TU Darmstadt, Germany, she joined the European research center of Xerox. She now works at Naver Labs Europe.
Dr. Lucas Ondel (LISN, Paris) – https://lucasondel.github.io

Lucas Ondel is a post-doctoral researcher at LISN and holds a PhD from Brno University of Technology. His research focuses on unsupervised learning methods applied to the speech signal with the aim of automatically documenting languages. He has been a very active member of the “unsupervised speech” community (JSALT 2016, 2017, Zero Resource Challenge 2019) and he is the creator of BEER (the Bayesian spEEch Recognizer), a toolkit to automatically learn acoustic units from unlabeled speech.
Dr. Guillaume Wisniewski (LLF, Paris) – https://gw17.github.io

Guillaume Wisniewski is an Assistant Professor of Computational Linguistics at Université Paris Cité and a member of the Laboratoire de Linguistique Formelle (LLF). His research interests lie in the application of Machine Learning to study language, especially in a multilingual and low-resource context. His recent work focuses on explaining the ability of neural networks to solve an impressive number of NLP tasks and, more precisely, on finding whether the contextualized distributed representations they are able to uncover from raw texts are consistent with models derived from linguistic theories.

The poster session is open to all but is particularly intended for doctoral students, in order to encourage exchanges with the other participants. We welcome contributions addressing, but not limited to, the following topics:

self-supervised learning of speech representations (wav2vec, CPC, HuBERT, etc.)
multimodal representation learning (e.g. text/vision, audiovisual speech) and transfer learning between modalities (e.g. use of language models in speech recognition or speech synthesis systems)
disentanglement techniques applied to speech and language
interpretability of latent representations and their relations with the different linguistic levels
control of learned representations in performative systems
relationships with human representation learning (speech unit discovery, sensory-motor learning).

Registration for the day and the lunch is now closed (but if you have any concerns, do not hesitate to contact us). The day will be partially broadcast via Zoom (https://grenoble-inp.zoom.us/j/93740876038 / Passcode: 988791)

Program:

9h-9h15: Welcome of the participants
9h15-9h30: Introductory talk by the organizers
9h30-10h15: Talk by Dr. Lucas Ondel (LISN, Paris): “Using Continuums to Learn the Phonetic Structure of Speech”
10h15-10h45: Coffee break
10h45-11h30: Talk by Dr. Guillaume Wisniewski (LLF, Paris): “The long distance agreement task: assessing the capacity of NLM to uncover different representation for superficially similar but syntactically different structures”
11h30-12h: Beatrice Daille (LS2N, head of the GDR TAL): GDR TAL news; DGA/AID and Vincent Claveau (IRISA): call for projects DGA/AID / GDR TAL CNRS
12h-14h: Lunch
14h-15h: Talk by Pr. James Glass (MIT, USA): “Towards Unsupervised Speech Processing” (by videoconference)
15h-16h15: Poster session (and coffee break)
16h15-17h: Talk by Dr. Diane Larlus (Naver Labs Europe): “Learning transferable visual representations”
17h: End of the day

Looking forward to seeing you there!

The organizing committee

Organizers:

Thomas Hueber (GIPSA-lab, Grenoble)

Frédéric Béchet (LIS, Marseille)

Marco Dinarelli (LIG, Grenoble)

Benoit Favre (LIS, Marseille)

Olivier Perrotin (GIPSA-lab, Grenoble)

——————————

Details

Date: April 1, 2022

Venue: Auditorium IMAG, 700 Av. Centrale, 38400 Saint-Martin-d'Hères, France
