Analysing the semantic content of static Hungarian embedding spaces

Bibliographic details
Authors: Ficsor Tamás
Berend Gábor
Corporate author: Magyar számítógépes nyelvészeti konferencia (17.) (2021) (Szeged)
Document type: Book part
Published: 2021
Series: Magyar Számítógépes Nyelvészeti Konferencia 17
Keywords: Linguistics - computer applications
Subjects:
Online Access: http://acta.bibl.u-szeged.hu/73360
Description
Abstract: Word embeddings can encode semantic features and have recently been applied successfully to many NLP tasks. Although word embeddings perform well on several downstream tasks, there is no trivial approach to extracting lexical information from them. We propose a transformation that amplifies desired semantic features in the basis of the embedding space. We generate these semantic features with a distantly supervised approach, which makes them applicable to Hungarian embedding spaces. We propose using the Hellinger distance to perform the transformation to an interpretable embedding space. Furthermore, we extend our research to sparse word representations, since sparse representations are considered to be highly interpretable.
Extent/Physical description: 91-105
ISBN: 978-963-306-781-9