NER in archival finding aids: extended

doi:10.3390/make4010003

Use this identifier to cite this record: https://hdl.handle.net/1822/76687

Full record

DC Field	Value	Language
dc.contributor.author	Cunha, Luís Filipe da Costa	por
dc.contributor.author	Ramalho, José Carlos	por
dc.date.accessioned	2022-03-29T11:56:39Z	-
dc.date.available	2022-03-29T11:56:39Z	-
dc.date.issued	2022-01-17	-
dc.identifier.citation	Cunha, L.F.d.C.; Ramalho, J.C. NER in Archival Finding Aids: Extended. Mach. Learn. Knowl. Extr. 2022, 4, 42-65. https://doi.org/10.3390/make4010003	por
dc.identifier.uri	https://hdl.handle.net/1822/76687	-
dc.description.abstract	The amount of information preserved in Portuguese archives has increased over the years. These documents represent a national heritage of high importance, as they portray the country’s history. Currently, most Portuguese archives have made their finding aids available to the public in digital format; however, these data do not have any annotation, so it is not always easy to analyze their content. In this work, Named Entity Recognition solutions were created that allow the identification and classification of several named entities from the archival finding aids. These named entities translate into crucial information about their context and, with high confidence results, they can be used for several purposes, for example, the creation of smart browsing tools by using entity linking and record linking techniques. In order to achieve high result scores, we annotated several corpora to train our own Machine Learning algorithms in this context domain. We also used different architectures, such as CNNs, LSTMs, and Maximum Entropy models. Finally, all the created datasets and ML models were made available to the public with a developed web platform, NER@DI.	por
dc.language.iso	eng	por
dc.publisher	Multidisciplinary Digital Publishing Institute	por
dc.rights	openAccess	por
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	por
dc.subject	named entity recognition	por
dc.subject	archival search aids	por
dc.subject	machine learning	por
dc.subject	deep learning	por
dc.subject	maximum entropy	por
dc.title	NER in archival finding aids: extended	eng
dc.type	article	por
dc.peerreviewed	yes	por
dc.relation.publisherversion	https://www.mdpi.com/2504-4990/4/1/3	por
oaire.citationStartPage	42	por
oaire.citationEndPage	65	por
oaire.citationIssue	1	por
oaire.citationVolume	4	por
dc.date.updated	2022-03-24T14:47:06Z	-
dc.identifier.eissn	2504-4990	-
dc.identifier.doi	10.3390/make4010003	por
dc.subject.fos	Natural Sciences::Computer and Information Sciences	por
dc.subject.wos	Science & Technology	por
sdum.journal	Machine Learning and Knowledge Extraction (MAKE)	por
oaire.version	VoR	por
Appears in collections:	CCTC - Articles in international journals