DANES Newsletter - April 2025

Spring heralds a season of renewed scholarly activity in the field of Digital Ancient Near Eastern Studies, with this month’s newsletter reflecting the vibrant interdisciplinary research currently underway.

A special volume in it-Information Technology presents multiple approaches to digital analysis of ancient material culture, while several articles demonstrate the effectiveness of computational methods for understanding Neo-Assyrian texts and social structures, and a breakthrough method for cuneiform optical character recognition (OCR) from 2D images.

Particularly noteworthy is the release of lemmatized datasets of Babylonian texts alongside the BabyLemmatizer 2.1 model, providing resources for future computational linguistic research. The publication of archaeobotanical data from Italy and desert kite mapping data offers new possibilities for quantitative spatial analysis.

This month features several educational opportunities, from DH workshops at Guelph to the Ancient Language Processing conference at NAACL2025, where integration of AI with traditional philological methods will be a central topic.

Recent Academic Publications

A Special volume in the journal it-Information Technology was published this month: Kleine Fächer Digital, edited by Tessa Gengnagel, Hubert Mara and Christian Schröter. This volume includes several publications that are of relevance to the DANES community: Discrete Morse theory segmentation on high-resolution 3D lithic artifacts, by Jan Philipp Bullenkamp, Theresa Kaiser, Florian Linsel, Susanne Krömker, and Hubert Mara, study the origins of tool making by modern humans and coexisting Neanderthals in the Paleolithic from 3D models of stone tools which allow for more accurate shape classification; Signed, sealed, delivered – digital approaches to Byzantine sigillography, by Claes Neuefeind et al., present two current projects at University of Cologne to digitize Byzantine seals with RTI technology and create digital scholarly editions of the inscriptions on the seals; A recursive encoding for cuneiform signs, by Daniel M. Stelzer, developed another method (in its early stages) to encode cuneiform signs based on wedge configuration; A case study of the use of logical data analysis in the Workmen’s Village in Tell el-Amarna, Egypt, by Sarah M. Klasse and Marcus Weber, utilizes archaeobotanical and archaeological data with data analysis methods to study spatial patterns in the distribution of kitchen tools to better understand food production practices in state-planned settlements.

Material philology and Syriac excerpting practices: A computational-quantitative study of the digitized catalog of the Syriac manuscripts in the British Library (article, PLOS ONE), by Noam Maeir, studies almost 20,000 excerpts found in Syriac manuscripts from the British Library’s online collection to introduce the Excerpts Per Manuscript (EPM) metric. It shows that while most manuscripts have fewer than 20 excerpts, a small number has much higher precentage of excerpting, particularly in the 6th-9th centuries CE, which corresponds with a period of intense literary compilation.

Neo-Assyrian Imperial Religion Counts: a Quantitative Approach to the Affiliations of Kings and Queens with Their Gods and Goddesses (article, Journal of Ancient Near Eastern Religions), by Amy Gansell, Tero Alstola, Heidi Jauhiainen, and Saana Svärd, investigates co-occurance of royal and divine pairings in Neo-Assyrian texts. They interpret and rank chronological trends in royal-divine affiliations, with equal attention and contrast between deities associated with kings and those with queens.

Neo-Assyrian Metaphors through the Telescope: Linguistic Patterns involving Body Part Constructions in the State Archives Letter Corpus (article, Asia Anteriore Antica), by Matthew Ong and Shai Gordin, present findings from a semi-automated linguistic analysis of the Neo-Assyrian letter corpus, focusing on compound prepositional phrases involving body parts. They show these metaphors are particulalry common in expressing directed motion, and that there are dialectical differences in some constructions between the Neo-Assyrian and Neo-Babylonian dialects of Akkadian.

ProtoSnap: Prototype Alignment for Cuneiform Signs (Conference Proceedings, ICLR 2025), by Rachel Mikulinsky, Morris Alper, Shai Gordin, Enrique Jiménez, Yoram Cohen, Hadar Averbuch-Elor, use an unsupervised generative machine learning approach to identify and represent the high-variability of wedge configuration of cuneiform signs. By using prototypical sign images generated from Unicode cuneiform fonts, they snap the skeleton-based template to photographed cuneiform signs. Their period and language-agnostic method also allows for generating synthetic data with more attested wedge configurations, which can boost performance of cuneiform sign recognition models. See also their GitHub webpage.

Special Mention

Demographic interactions between the last hunter-gatherers and the first farmers (article, PNAS), by Alfredo Cortell-Nicolau, Javier Rivas, Enrico R. Crema, Stephen Shennan, Oreto García-Puchol, Jan Kolář, Robert Staniuk, and Adrian Timpson, adapt Lotka–Volterra (LV) models, originally used to model interactions between different population groups competing for the same space and resources, to model interactions between early migrant communities of farmers and incumubent populations of hunter-gatherers. They use three archaeological case-studies from Eastern Iberia, southern Scandinavia, and the island of Kyushu (Japan).

Archaeological Artefact Database of Finland (AADA) (article, Nature Scientific Data), by P. Pesonen, U. Moilanen, M. Roose, et al., presents a dataset of prehistoric (covering period of almost 11,000 years) artefacts in Finland or held in Finnish collections. They are categorised by type and are accompanied with photos of the artefacts, as well as with spatio-temporal context. There are ca. 38,000 artefacts.

Datasets Published

Linguistically Annotated Achemenet Babylonian Texts (Zenodo dataset), by Tero Alstola, Aleksi Sahala, Jonathan Valk, and Matthew Ong, publishes lemmatized and POS-tagged Neo-Babylonian and later texts from the Achemenet project. The annotations took place with the help of the BabyLemmatizer 2.1 model, see below.

BALT: Babylonian Administrative and Legal Texts (Zenodo dataset), by Tero Alstola, Aleksi Sahala, Jonathan Valk, and Matthew Ong, published another annotated dataset of Neo-Babylonian and later texts made available by scholars in the field, most of them through the pioneering digitization work of János Everling. It also includes basic metadata on these texts gathered from CDLI and NaBuCCo. The annotations took place with the help of the BabyLemmatizer 2.1 model, see below.

Late Babylonian Model for BabyLemmatizer 2.1 (Zenodo model), by Aleksi Sahala, Tero Alstola, and Jonathan Valk, is a repository that hold the BabyLemmatizer 2.1 model that was used to create the above two datasets. The repository includes the training and test data used for developing this model, which were taken from ORACC.

Archaeobotanical Data from the Italian Peninsula in the 1st Millennium CE (Journal of Open Archaeology Data), by Roberto Ragno, includes raw counts of archaeobotanical remains discovered in Italy from the 1^st century BCE to the 11^th century CE, focusing on 40 plant taxa that were most common. These were gathered from published sources and reports and incorporated in a new dataset with relevant metadata.

Desert Kites and Related Constructions: Data from the Globalkites Project (Journal of Open Archaeology Data), by Olivier Barge, Emmanuelle Régagnon, Wael Abu-Azizeh, Sofiane Bouzid, Jacques Élie Brochier, Rémy Crassard, provides the locations of all identified desert kites, along with detailed morphological information for a sample representing approximately 10% of them. Desert kites are a wide phenomenon, probably used in hunting, attested in the Middle East across 3,500 km and their chronological distribution is estimated to be between 7000 BCE–1800 CE. These data were gathered through the analysis of high-resolution satellite imagery and are organized into feature classes.

Towards Robust Demographic Models: A Systematic Approach to 14C Data Aggregation and Analysis: Lessons from the Southern Levant (Journal of Open Archaeology Data), by Magdalena Maria Elisabeth Bunbury, presents a robust seven-step framework for curating and analysing extensive radiocarbon (14C) datasets, optimised to construct accurate Summed Probability Distribution (SPD) models. It uses a dataset of 4,657 14C dates from 582 archaeological sites in the Southern Levant spanning the last 50,000 years.

Events

Talks and Conferences

The Dutch Symposium of the Ancient Near East 2025: ‘Go down in flames’ that is taking place at Leiden University on April 17 will feature a talk by Gustav Ryberg Smidt (Ghent University), Assyriology in flames? which will discuss applications of AI technology in traditional assyriological research and its implications for the field.

The second workshop on Ancient Language Processing, co-located with NAACL2025, will take place on April 29-May 4 in Alburquerque, New Mexico and online. The program includes talks on implementation of machine learning and large and small language models for ancient languages including cuneiform-based languages, Egyptian, ancient Chinese, Hebrew, Persian and more. It will also include the results of the EvaCun task force for creating lemmatization and masked word restoration for Akkadian and Sumerian. Participation (for a fee) is required to attend (in-person or remote), but all accepted talks will be published online immediately after the conference as open access publications.

Training Opportunities

The University of Guelph hosts a series of four-day workshops (DH@Guelph) on a variety of topics related to DH theory and method. This year the workshops will be in person on May 12th-15th, 2025. Other events include ‘dine-around’ dinners for mingling and networking. This year eight workshops are offered, including making digital scholarly editions, visualizing data, minimal web design with Jekyll, making play-based digital archives, and more. Early bird registration ends on April 15.

The Institute of Classical Studies at the University of London is organizing an online research training titled An introduction to essential resources for Classics and Archaeology research on 23 April, which will introduce participants to relevant online resources and how to use them.

The Digital Humanities at Oxford summer school, taking place at 4-8 August, is annual training event for gaining digital humanities skills. This years event include six workshops, four taking place in person (Applied Data Analysis, Text Encoding Initiative, Humanities Data, From Text to Tech) and two online (Introduction to Digital Humanities, AI in Research Libraries). Early bird registration fees end on 18 April.

The European Summer University in Digital Humanities “Culture and Technology” will take place at the Université Marie et Louis Pasteur in Besançon, France from 21 July to 2 August. It is meant for both students from the humanities and computer and engineering sciences, the former gaining practical knowledge of the application of computational method, and the latter acquiring insights into humanities data and the computational and humanities approaches to study them. Workshops are in parallel for two weeks or one week. Students are encouraged to bring their own datasets or be in the process of a technology-based research project. Application is open until 18 May.

Call for Papers

The Special Interest Group in Digital Literary Studies (SIG-DLS) organizes a pre-conference event at the upcoming DH2025 Conference in Lisbon, Portugal on Monday 14 July, 13:30-20:00 WET, this year in collaboration with the ICLA Digital Comparative Literature Research Committee and the Computational Literary Studies Infrastructure. It is dedicated to all applications of digital and computational methods in the study of literature. The program will include a series of lightning talks and demos. It welcomes contributions on (but not limited to) distant reading, multilingual literary archives and digitization, born-digital literature, GIS, data visualization, machine translation, and Language models. Submission is open via this submission form until 25 April 2025.

The IEEE International Conference on Cyber Humanities 2025 will take place in Florence, Italy on 8-10 September. It focuses on theoretical and practical aspects of technologies applied to social science and humanities (SSH) that includes arts, heritage, history, archeology, linguistics, libraries, and so forth. It invites papers covering nine related topics: digitization, processing and curation, preservation, protection and security, retrieval and analysis, valorization and applications, SSH research infrastructures, ethical and legal aspects, and societal aspects. Paper submission deadline is 5 May.

Did we miss relevant articles published in the previous month? Did we miss upcoming events in the next month? Would you like to ensure your news will appear in the next newsletter? Please send us an email at digpasts@gmail.com! Corrections to published Newsletters will be sent via the DANES mailing list.

On this page