DREAM. A project about non-Latin script data

Antonella Fallerini; Agnese Galeffi; Andrea Ribichini; Mario Santanché; Mattia Vallania

doi:10.4403/jlis.it-12727

Vol. 13 No. 1 (2022): The Bibliographic Control in the Digital Ecosystem

Articles

Antonella Fallerini
Sapienza Università di Roma, Biblioteca Dipartimento ISO

Agnese Galeffi
Sapienza Università di Roma, Sistema Bibliotecario

Andrea Ribichini
Sapienza Università di Roma, Dipartimento DIAG (Ingegneria informatica, automatica e gestionale)

Mario Santanché
Sapienza Università di Roma, Sistema Bibliotecario

Mattia Vallania
Sapienza Università di Roma, Sistema Bibliotecario

Published 2022-01-13

Keywords

Romanization, MARC records, Cataloguing, Transliteration

How to Cite

Fallerini, Antonella, Agnese Galeffi, Andrea Ribichini, Mario Santanché, and Mattia Vallania. 2022. “DREAM. A Project about Non-Latin Script Data.” JLIS.it 13 (1): 347-55. https://doi.org/10.4403/jlis.it-12727.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

The DREAM project is a large research project funded by Sapienza University of Rome that deals with bibliographic data in non-Latin scripts. Because the National Bibliographic Service catalogue (SBN) does not yet manage data in non-Latin scripts, DREAM aims to offer researchers a catalogue that can be searched in the original scripts (such as Arabic, Chinese, and Cyrillic). One of the project's most notable features is an ILS-independent working environment in which the cataloguer, starting from existing romanized records, can find and retrieve the corresponding original-script data from authoritative catalogues. From a technical standpoint, the ever-increasing Unicode support offered by modern operating systems, DBMSs and indexing engines makes rapid development of the relevant software tools a concrete possibility. This in turn shifts the scientific focus towards the (often subtle) record linkage operations between different data sources. The authors hope that other Italian libraries with the same needs will join the DREAM project. Furthermore, as soon as SBN supports the management of data in non-Latin scripts, the DREAM project partners will be able to contribute their data.
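As an illustration of the record linkage step mentioned above, the following minimal Python sketch matches a locally romanized title against candidate records that carry both a romanized and an original-script form. It is not the DREAM implementation: the record layout, the sample data, the 0.85 threshold and the choice of a normalized Levenshtein similarity are all assumptions made for the example.

```python
import unicodedata


def normalize(text):
    """Fold a romanized string into a comparison key (ASCII letters/digits, single spaces)."""
    decomposed = unicodedata.normalize("NFKD", text).lower()
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop diacritics such as macrons, breves and underdots
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif ch.isspace() or ch == "-":
            out.append(" ")
        # everything else (apostrophes, ayn/hamza signs, punctuation) is dropped
    return " ".join("".join(out).split())


def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions), row by row."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Edit distance rescaled to the range 0..1 (1.0 means identical keys)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(local_title, candidates, threshold=0.85):
    """Return (record, score) for the closest candidate, or (None, score) below the threshold."""
    key = normalize(local_title)
    best, best_score = None, 0.0
    for record in candidates:
        score = similarity(key, normalize(record["romanized"]))
        if score > best_score:
            best, best_score = record, score
    return (best, best_score) if best_score >= threshold else (None, best_score)


# Hypothetical candidate records, as they might come from an external catalogue.
CANDIDATES = [
    {"id": "ext-1", "romanized": "Kitāb al-ʿIbar", "original": "كتاب العبر"},
    {"id": "ext-2", "romanized": "Voĭna i mir", "original": "Война и мир"},
]

record, score = best_match("Vojna i mir", CANDIDATES)
print(record["original"] if record else "no match", round(score, 3))
```

The two sample titles are romanized under different conventions ("Vojna" versus "Voĭna"), which is why the sketch compares folded keys with a tolerance rather than requiring exact equality. In practice a match on title alone would likely be too weak, and further fields (author, date, identifiers) would need to be compared as well.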
