Towards abstractive summarization in Hungarian

Bibliographic Details
Main Authors: Makrai Márton
Tündik Máté Ákos
Indig Balázs
Szaszák György
Corporate Author: Magyar számítógépes nyelvészeti konferencia (18.) (2022) (Szeged)
Format: Article
Published: 2022
Series: Magyar Számítógépes Nyelvészeti Konferencia 18
Keywords: Linguistics - computer applications
Online Access: http://acta.bibl.u-szeged.hu/75896
Description
Summary: We publish an abstractive summarizer for Hungarian: an encoder-decoder model initialized with huBERT and fine-tuned on the ELTE.DH corpus of former Hungarian news portals. The model produces fluent, on-topic output, but it hallucinates frequently. Our quantitative evaluation on automatic and human transcripts of news (with automatic and human-made punctuation) shows that the model is robust to errors in both automatic speech recognition and automatic punctuation restoration.
Physical Description: pp. 505-519
ISBN: 978-963-306-848-9