Tutorials
Analysis of representations emerging in multilayer recurrent neural networks
Grzegorz Chrupała
Tilburg University, Netherlands
Mail: g.a.chrupala@uvt.nl
Abstract: In this lecture I will review a family of approaches to the computational study of language which rely on modeling language acquisition by means of grounding in perception. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of indirect and noisy clues, including information from perceptual modalities co-occurring with spoken utterances, most crucially the visual modality.
The dominant modeling approach for this problem in recent years has relied on deep learning. A central concern for these architectures is understanding the nature and localization of the representations they learn. I will therefore also discuss in some detail the analytical techniques that have been proposed for this purpose, as well as the main findings resulting from their application.
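One widely used family of such analytical techniques is probing with diagnostic classifiers: a simple classifier is trained to predict a linguistic property from a network's hidden states, and its accuracy is read as evidence of what the representation encodes and where. Below is a minimal sketch of this idea; the synthetic data, dimensions, and scikit-learn setup are illustrative assumptions, not the lecture's actual experiments.

# Minimal probing-classifier sketch. The hidden states and labels are
# synthetic stand-ins: in practice the states would come from a trained
# multilayer RNN (one vector per token, per layer) and the labels from
# a linguistic annotation such as part-of-speech tags.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_tokens, hidden_dim = 2000, 128
hidden_states = rng.normal(size=(n_tokens, hidden_dim))
labels = rng.integers(0, 2, size=n_tokens)   # a binary property, e.g. noun vs. verb

# Plant a weak linear signal so the probe has something to find.
hidden_states[:, 0] += 0.8 * labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0)

# If a simple linear probe predicts the property well above chance, the
# representation is taken to encode that property; repeating the exercise
# per layer localizes where the information lives.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")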
Symbolic, Distributed and Distributional Representations in the Era of Deep Learning
Fabio Massimo Zanzotto
University of Rome Tor Vergata, Italy
Abstract: Natural and artificial languages are inherently discrete symbolic representations of human knowledge. Recent advances in machine learning (ML) and in natural language processing (NLP) seem to contradict this intuition: discrete symbols are fading away, erased by vectors or tensors called distributed and distributional representations. However, there is a strict link between distributed/distributional representations and discrete symbols, the former being an approximation of the latter. A clearer understanding of this strict link may well lead to radically new deep learning networks. In this talk I present a survey that aims to renew the link between symbolic representations and distributed/distributional representations. This is the right time to revitalize the area of interpreting how discrete symbols are represented inside neural networks.
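To make the sense of "approximation" concrete, the sketch below encodes discrete symbols as one-hot vectors and derives a distributed representation by random projection, in the spirit of Johnson-Lindenstrauss embeddings; the vocabulary size, dimensionality, and similarity-based recovery step are illustrative assumptions, not the talk's specific construction.

# Discrete symbols as one-hot vectors, compressed into a lower-dimensional
# distributed representation by a random projection. Each symbol can still
# be recovered from its distributed vector by similarity against the
# projected symbol directions, which is the sense in which the distributed
# representation approximates the symbolic one.
import numpy as np

rng = np.random.default_rng(0)
n_symbols, dim = 1000, 256                 # 1000 symbols in 256 dimensions

one_hot = np.eye(n_symbols)                # discrete, symbolic representation
projection = rng.normal(size=(n_symbols, dim)) / np.sqrt(dim)
distributed = one_hot @ projection         # distributed approximation

sims = distributed @ projection.T          # similarity to every symbol direction
recovered = sims.argmax(axis=1)
print("fraction of symbols recovered exactly:",
      (recovered == np.arange(n_symbols)).mean())

With these sizes the printed fraction is typically 1.0; shrinking dim makes the recovery lossy, which illustrates the approximate, rather than exact, nature of the link.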
Modeling Language Variation and Universals: Cross-Lingual NLP for Low-Resource and Typologically Diverse Languages
Ivan Vulić
University of Cambridge, UK
Mail: iv250@cam.ac.uk
Abstract: A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language. However, this ambition is hindered by the large variation in the structural and semantic properties of the world’s languages. As a consequence, existing language technology is still largely limited to a handful of resource-rich languages. In this tutorial, we introduce and discuss a range of techniques that deal with such cross-language variation to build robust multilingual and cross-lingual NLP models that work across typologically diverse languages, with the long-term goal of enabling language technology in low-resource languages as well. We provide an extensive overview of typologically informed and cross-lingual NLP transfer methods, focusing on: 1) the characterization of linguistic typology and the impact of semantic and syntactic variation on the performance of cross-lingual transfer and multilingual models; 2) techniques that integrate available discrete typological knowledge into neural NLP architectures to guide multilingual learning; 3) methods that attempt to capture cross-lingual variation implicitly, directly from data, and leverage it to guide cross-lingual and multilingual NLP models; and 4) recent efforts in neural representation learning that aim to construct widely portable cross-lingual representations and transfer methods with minimal cross-lingual supervision in zero-shot and few-shot learning setups, including adapter-based approaches, hypernetworks, target-specific tuning, and other potential solutions.
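As a concrete illustration of the adapter-based approaches mentioned above, here is a minimal bottleneck-adapter sketch in PyTorch; the class name, dimensions, and toy usage are assumptions for illustration, not the tutorial's implementation.

# A small adapter module: a bottleneck with a residual connection that is
# inserted into a frozen pretrained encoder and trained per language or
# task, so only a few parameters are updated during cross-lingual transfer.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)   # project down
        self.up = nn.Linear(bottleneck_dim, hidden_dim)     # project back up

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual keeps the frozen model's behaviour as the default;
        # the bottleneck learns a small language- or task-specific correction.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

# Toy usage on hypothetical 768-dimensional encoder states.
adapter = Adapter(hidden_dim=768)
states = torch.randn(2, 10, 768)          # (batch, sequence, hidden)
print(adapter(states).shape)              # torch.Size([2, 10, 768])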
Short Bio: Ivan Vulić is a Senior Research Associate in the Language Technology Lab, University of Cambridge, and a Senior Scientist at PolyAI. He holds a PhD in Computer Science from KU Leuven, awarded summa cum laude. His core expertise is in representation learning, cross-lingual learning, human language understanding, distributional, lexical, multi-modal, and knowledge-enhanced semantics in monolingual and multilingual contexts, transfer learning for enabling cross-lingual NLP applications such as conversational AI in low-resource languages, and machine learning for (cross-lingual) NLP. He has published more than 100 papers at top-tier NLP and IR conferences and journals. He co-lectured a tutorial on word vector space specialization at EACL 2017, ESSLLI 2018, and EMNLP 2019, and tutorials on cross-lingual representation learning and cross-lingual NLP at EMNLP 2017 and ACL 2019. He also co-lectured tutorials on conversational AI at NAACL 2018 and EMNLP 2019. He co-authored a book on cross-lingual word representations for the Morgan & Claypool Handbook series, published in June 2019, and has started writing a book on NLP methods for low-resource languages. He serves as an area chair and regularly reviews for all major NLP and machine learning conferences and journals. Ivan has given invited talks in academia and industry, including at Apple Inc., University of Cambridge, UCL, University of Copenhagen, Paris-Saclay, Bar-Ilan University, Technion IIT, University of Helsinki, UPenn, KU Leuven, University of Stuttgart, TU Darmstadt, the London REWORK summit, and University of Edinburgh. He has co-organised a number of NLP workshops, served as the publication chair for ACL 2019, and currently serves as the tutorial chair for EMNLP 2021 and the program chair for *SEM 2021.
Dialogue and Conversational Agents, Natural Language Generation via Neural Models
Ioannis Konstas
Heriot-Watt University, UK
Mail: i.konstas@hw.ac.uk
Short Bio: Ioannis Konstas is a lecturer at Heriot-Watt University and a member of the Interaction Lab. His main research interests focus on Natural Language Processing (NLP) and Natural Language Generation (NLG), with an emphasis on data-driven machine learning methods.
In particular, he studies the generation of coherent document-level text from meaning representations, programming-language code, and multiple documents, as well as fact-based text summaries. He is also interested in modeling open-domain dialogue, with an emphasis on Natural Language Generation, and in grounded language acquisition through visual perception and language interaction. He was also the faculty advisor for Team Alana, the HWU entry to the Alexa Prize Challenge 2018, which finished 3rd (out of ~200 participants).