Dante or not Dante? That is the question

Dante and artificial intelligence: ever thought about them together? In this workshop you will get the chance to familiarize yourself with Digital Humanities and Computational Linguistics by playing around with the output of statistical language generation models that try to imitate the writing style of Dante Alighieri. Will you be able to tell the real Dante from the robotic one? Or will you be fooled?

This workshop is organized in collaboration with AIUCD

This workshop was presented here:

By | 2024-12-04T17:42:41+01:00 | 21 Oct, 2021 | BLOG, LABORATORY, POP |

But Does a Computer Understand Me? What Computational Linguistics Is and What It Is Used For

Tools based on natural language processing and artificial intelligence, such as recommendation systems on social media, automatic translators, and voice assistants, are now part of our daily lives, both in personal and professional contexts.

These technologies rely on the representation of linguistic knowledge, the research object of a discipline often little known outside its narrow specialist field: Computational Linguistics.

The more pervasive these tools become, the more we take them for granted, without questioning how they were created, precisely how they work, and, above all, what the consequences of their widespread, massive, and largely unconscious use might be. More info here.

Ludovica Pannitto, University of Trento

Malvina Nissim, University of Trento

By | 2024-12-04T17:05:40+01:00 | 21 May, 2021 | BLOG, POP, SEMINARS |

Bid for Lectures on Computational Linguistics 2022

The Italian Association for Computational Linguistics (AILC) invites bids to host
the 2022 edition of the “Lectures on Computational Linguistics”.

The Lectures are an annual AILC initiative devoted to training in the field of
Computational Linguistics, and are the result of close collaboration with university higher
education, in particular with doctoral schools. Information on the format of the Lectures and
on previous editions is available here.
Candidate venues must submit a document containing the following
information:

  • Local organizing group: list the people involved in the local organization
    of the Lectures, including one person proposed to serve on the Scientific Committee of the
    Lectures for two years; describe the local organizers' previous experience
    in organizing training events.
  • Venue characteristics: indicate the location of the venue, the number of rooms available
    and their capacity, space for a poster session, availability of audio-video equipment,
    and the possibility of lunch in a canteen for participants.
  • Doctoral school and university courses linked to the venue: indicate the doctoral
    school(s) involved in the organization, any degree programmes concerned, and the
    corresponding number of students potentially interested in the Lectures.
  • Scientific profile of the venue: indicate the venue's profile (humanities-oriented,
    computer-science-oriented, or mixed); optionally indicate
    some scientific topics that the host venue intends to propose to the Scientific Committee of the
    Lectures, should the venue be selected.
  • Accommodation: indicate the availability of accommodation, in particular low-cost options such as,
    where present, student residences and university guest facilities.
  • Social event: indicate options and costs for a dinner or other social event.
  • Transport: indicate the connections (air, rail) for reaching the venue, and urban
    transport options with travel times to the venue.
  • Budget: indicate the estimated costs for the rooms, canteen costs, and any other
    fixed costs required by the host venue.
  • Sponsorships: indicate any sponsorship from university institutions
    (department, doctoral school).
  • Dates: indicate the possible dates for the Lectures (three days) in the period May-June 2022.

The host venue will be selected, in a joint session, by the Scientific Committee
of the Lectures and the AILC Board of Directors.
Bids must be sent by email to the AILC President (Bernardo
Magnini – magnini@fbk.eu) and to the Coordinator of the Scientific Committee of the Lectures (Elisabetta
Jezek – jezek@unipv.it) by 8 October 2021.
Contacts: Bernardo Magnini (magnini@fbk.eu) and Elisabetta Jezek (jezek@unipv.it)

By | 2022-07-20T23:30:09+02:00 | 16 May, 2021 | BLOG, EDUCATION, EVENTS, NEWS |

Computational Linguistics and the COVID-19 Outbreak

This page is maintained by AILC (the Italian Association for Computational Linguistics). It groups some of the initiatives that the Computational Linguistics community is carrying out to contribute to the fight against COVID-19. Everyone is invited to collaborate by reporting new initiatives. Please do so through our contact form.

Datasets


  • CORD-19 – The Allen Institute COVID-19 Open Research Dataset, a collection of COVID-19 scientific papers, updated weekly (March 2020)
  • Processed CORD-19 – The Allen Institute corpus processed with Sketch Engine (March 2020)
  • 40wita – A dataset of tweets in Italian collected daily by the University of Turi
  • Corona Corpus – A corpus of texts from online newspapers and magazines in 20 different English-speaking countries, part of the English-Corpora.org suite of corpora

Tools


Shared Tasks and Events


  • CLEF 2020: CheckThat! Lab Task 1 Tweet Check-Worthiness – The task asks participants to rank a stream of tweets on a number of topics, including COVID-19, according to their check-worthiness (March 2020)
  • Kaggle Tasks – Several tasks on COVID-19 (March 2020)
  • NLP COVID-19 Workshop, an emergency workshop at ACL 2020 – Authors are invited to submit papers related to NLP applied to combat the COVID-19 pandemic (July 2020)
  • TREC-COVID program – Launched by NIST and OSTP, the challenge will follow the TREC assessment process to evaluate search systems, based on the CORD-19 documents

Publications


By | 2020-05-18T12:47:59+02:00 | 2 Apr, 2020 | BLOG, HOME, RESOURCES |

COVID-19 Browser: Using Natural Language Processing to Fight the Pandemic

Our society is facing an unprecedented crisis due to the recent COVID-19 outbreak, which is putting health systems under severe strain all around the world. Recently, dozens of countries announced the shutdown of all non-essential activities for the foreseeable future, and scientists worldwide are striving to find cures and vaccines able to stop the ongoing pandemic.

In these hard times, everyone should put their expertise to use in the fight against the virus. For Gabriele Sarti, a Data Science student at the University of Trieste and a young member of the Italian Association for Computational Linguistics (AILC), this meant applying his expertise in Natural Language Processing (NLP) to develop the COVID-19 Browser, a system that leverages state-of-the-art NLP techniques to extract meaningful information and guide scientists towards a better understanding of COVID-19.

As of today, more than 32,000 scientific papers have been published by research laboratories worldwide on the new coronavirus SARS-CoV-2 and the disease COVID-19. In such a large quantity of text a lot of useful information is very likely to be lost, leaving our knowledge of the subject too scattered to be exploited to its full potential. COVID-19 Browser allows users to browse a large collection of those articles directly in their console, matching article abstracts with user queries formulated in natural language to delve deeper into our current knowledge of the subject.

The model underlying COVID-19 Browser is SciBERT-NLI, a cutting-edge language model trained by the American nonprofit AI2 on a corpus of 1.14M scientific papers and subsequently fine-tuned by Gabriele for the retrieval task.
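To give a rough idea of what matching abstracts against a natural-language query involves, here is a minimal sketch in Python. It is not the COVID-19 Browser code: a simple bag-of-words vector stands in for the SciBERT-NLI sentence encoder, and the function names (`embed`, `cosine`, `search`) and the three abstracts are invented for illustration. Only the overall retrieval pattern (encode the query, encode each abstract, rank by similarity) is assumed to be similar.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: a real system would call a neural model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, abstracts, top_k=2):
    """Return the top_k (score, title) pairs, best match first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(text)), title)
              for title, text in abstracts.items()]
    return sorted(scored, reverse=True)[:top_k]


# Hypothetical abstracts, invented for illustration only.
ABSTRACTS = {
    "paper_a": "Incubation period of the novel coronavirus in hospital patients",
    "paper_b": "Protein structure prediction with deep neural networks",
    "paper_c": "Transmission of the coronavirus among household contacts",
}

if __name__ == "__main__":
    query = "What is the incubation period of the coronavirus?"
    for score, title in search(query, ABSTRACTS):
        print(f"{score:.2f}  {title}")
```

Replacing `embed` with a neural sentence encoder is what would let such a system match a query to an abstract even when the two share few exact words.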

Gabriele Sarti is a student in the Data Science master's programme at the University of Trieste (https://dssc.units.it/), and is affiliated with SISSA (https://www.sissa.it) and the CNR ItaliaNLP Lab in Pisa (http://www.italianlp.it). He is a member of the Italian Association for Computational Linguistics (https://www.ai-lc.it/en/) and plays an active role in its Dissemination Team.

Links

By | 2020-04-06T10:33:03+02:00 | 24 Mar, 2020 | BLOG, RESEARCH |

Affective lexica and other resources for Italian

By | 2017-10-04T16:45:23+02:00 | 2 Oct, 2017 | BLOG, RESOURCES |

The usefulness of research for companies

Innovation and research in Italian computational linguistics companies.

At the beginning of the 90s, when the young people of my generation were studying Computational Linguistics (or Natural Language Processing) at university, the Center for the Study of Language and Information at Stanford University was one of the most coveted places to be. Many of us were in love with Head-Driven Phrase Structure Grammar (HPSG), developed by Carl Pollard and Ivan A. Sag in California. It seemed that HPSG could be the definitive word on formal grammars of natural languages, because it combined some universal linguistic principles (inspired by Noam Chomsky's linguistics) with a powerful computational framework. The approach, however, had two problems: the rather complex rules were difficult to create and maintain, and parsing was not as fast as we would have liked. We devoted ourselves to research but could not build effective commercial services based on this or any other computational linguistic framework.

Some years have passed since then. In October 2016 I read an interview with Andrew Ng on the occasion of the release, by the Chinese company Baidu, of a chatbot for medical diagnosis: “As Melody has more conversations, it will also learn and keep getting better. This is just the start of a much larger, AI-driven transformation of the healthcare industry.” In 1990, Andrew Ng was 14 years old. After a couple of degrees and doctorates, in 2002 he began working at Stanford University. In 2011 he founded the Google Brain project at Google. Also in 2011 he taught an online Machine Learning course at Stanford University, which was followed by about 100,000 students around the world. In 2012 he co-founded Coursera. In 2014 Ng joined Baidu as chief scientist, where he still works at the time of writing. This exceptional man is a brilliant example of how the worlds of research, education and business can nourish one another through continuous exchange.

Computational Linguistics, and Artificial Intelligence in general, are experiencing a period of incredible acceleration, with rapid transfers of research results into practical services and, in the other direction, of issues raised by real cases into subjects of study.
This lively exchange also takes place in Italian companies working in computational linguistics. Just as Italian researchers in this field have always been at the forefront globally, Italian computational linguistics companies have also established themselves internationally. For example, Expert System, a public limited company based in Modena, Naples and Rovereto, has been present in the United States for a number of years and has grown in Europe. CELI, an SME with offices in Turin and Milan, provides Natural Language Processing technologies and consulting to international companies, from Korea to California. Euregio, based in Bolzano, uses NLP to provide media intelligence services. Interactive Media SpA, with offices in Rome, Trento and Brazil, specializes in speech solutions. The Apulian startup QuestionCube focuses on question answering and uses the main machine learning tools.
Almawave, part of the Almaviva Group, has also been integrating NLP technologies for some years. Other companies, both smaller and larger, are integrating these technologies to provide their services, combining machine learning with standard NLP technologies.

What services do they offer to customers? The main service is “Natural Language Understanding”, that is, the automatic analysis and understanding of written texts and speech.

This understanding is obviously partial compared to human understanding, but it is much faster, which makes it possible to do things that would otherwise not be feasible, or to greatly simplify complex activities.
In the next posts of this blog we will describe in more detail the issues and problems of Computational Linguistics addressed in universities and companies.
One of the purposes of AILC is to facilitate exchanges between universities, research centers and companies in this sector. This blog is therefore a place to share some of the findings, the results obtained, the ongoing projects, and the problems encountered in the various areas of this discipline.

CELI, Expert System, Euregio and QuestionCube are already members of the Italian Association for Computational Linguistics. We hope that in the coming months other companies will join and contribute to the creation of an Italian ecosystem for Computational Linguistics and Artificial Intelligence.

By | 2017-04-04T16:26:40+02:00 | 12 Dec, 2016 | BLOG, INDUSTRY |