Unsilencing the VOC testaments is an example of how digital methods can open new avenues of access to records from the past and contribute to current debates on decolonising archives, especially those concerning silences in the archive (Trouillot 1995).

This research project was supervised by Mrinalini Luthra, pre-PhD fellow at CREATE, and Prof. Charles Jeurgens, Archival Studies, Department of Media Studies, University of Amsterdam. It was carried out by three master's students in archival studies: Thijs Vorstenburg, Saskia Virginia Noot and Clare Schutt.

This project won the National Archives’ Innovation in Archival Research Prize!

The Dutch National Archives hold large collections of archival material that were the administrative output of colonial state agencies from the 17th to the 20th century. In the current discourse, archival institutions are being challenged to rethink (decolonise) how they engage with these archives and how the archives are made accessible and interpretable.

This project drew up a proof of concept to improve the accessibility of one such collection held by the National Archives: the VOC Testaments. This was done by creating additional finding aids to the testaments through automatic transcription, close reading, annotation and natural language processing. To learn how this project was received by the Dutch National Archives, read their blog post: Vrouwen en tot slaafgemaakten VOC niet langer stille getuigen.

The Dutch East India Company (VOC) offered jobs to thousands of people as sailors, soldiers and servants working in the trading posts. Every VOC employee was obliged to have a will drawn up. These testaments are important sources for a glimpse into the private lives of common people. The historical value of these documents was already recognised in the 19th century by archivists, who created an index to the wills that listed only the names of the male testators. Recently, these testaments were digitised and can now be accessed online, but the 19th-century index is still the main tool for finding a will. As a result, the earlier view of white male dominance, and the obfuscation of marginalised groups that came with it, is preserved and continued in the twenty-first-century search infrastructure.

With this as our starting point, the main question we wanted to investigate was whether we could identify the traces of women and non-European people in the documents using digital methods. This option became realistic through the application of Handwritten Text Recognition (henceforth HTR), using Transkribus, to these documents as part of the National Archives’ project ijsberg zichtbaar maken.

To understand the historical representation of marginalised peoples, we began by close-reading the testaments (both scans and HTR texts) to identify the terminology used to denote these groups. We took this approach because marginalised groups are very often mentioned without a name and only in relation to the male agent (for example: “wife”, “sister”, “the slave”, “widow”). Using the online annotation tool brat, we created annotations that identify women and non-European peoples. Based on these annotations, we used natural language processing (GitHub link) to create identifier-based indices for these marginalised groups. An excerpt of the index based on classifiers for women can be seen here.
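As a rough illustration of what an identifier-based index can look like, the sketch below scans page texts for relational terms and records the pages on which they occur. It is not the project's code: the group labels, term lists, page identifiers and the function name `build_index` are hypothetical placeholders, and a real term list would be derived from the annotations rather than written by hand.

```python
import re
from collections import defaultdict

# Hypothetical term lists, seeded with the examples quoted in the text.
TERMS = {
    "women": ["wife", "sister", "widow"],
    "enslaved_people": ["slave"],
}


def build_index(pages, terms=TERMS):
    """Map each group label to {page_id: [matched terms in order of appearance]}."""
    # One case-insensitive, whole-word pattern per group.
    patterns = {
        group: re.compile(
            r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
            re.IGNORECASE,
        )
        for group, words in terms.items()
    }
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for group, pattern in patterns.items():
            hits = [match.group(0).lower() for match in pattern.finditer(text)]
            if hits:
                index[group][page_id] = hits
    return dict(index)


# Invented sample pages, standing in for HTR output.
pages = {
    "page_001": "He leaves his goods to his wife and to his sister.",
    "page_002": "The testator frees the slave named in the deed.",
    "page_003": "All wages owed go to his brother.",
}
index = build_index(pages)
```

Here `index["women"]` points to the pages that mention a woman by role, so a person who never appears in a name index can still be found through the term that refers to them.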

The results of this project showed that at least 60% of the pages of the testaments contain at least one mention of a woman, whilst nearly 30% of these documents contain at least one mention of a non-European person. The project thus demonstrates the richness of these documents for studying the lives of marginalised peoples, and how the use of digital methods in archival studies can support efforts to decolonise the archive.
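The figures above are shares of pages with at least one mention. A minimal sketch of that calculation follows; the page identifiers and counts are invented for illustration and are not the project's data, and the function name `coverage` is hypothetical.

```python
def coverage(pages_with_mention, all_pages):
    """Return the percentage of pages that contain at least one mention."""
    total = set(all_pages)
    if not total:
        return 0.0
    # Count each page once, however many mentions it contains.
    mentioned = set(pages_with_mention) & total
    return 100 * len(mentioned) / len(total)


# Ten invented pages, six of which mention a woman at least once.
all_pages = [f"page_{i:03d}" for i in range(1, 11)]
pages_with_women = [
    "page_001", "page_002", "page_004",
    "page_005", "page_007", "page_009",
]
share = coverage(pages_with_women, all_pages)
```

Counting pages rather than individual mentions keeps the measure comparable across pages of very different length.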

The supervisors and Dr. Giovanni Colavizza are currently continuing this project in the form of an annotation campaign. The resulting annotations will not only help improve the identifier-based indices but will also be essential for training named entity recognition models that can produce a name-based index for the entire archive. Furthermore, this dataset will be a useful resource for researchers working with cultural heritage material in the Netherlands.

Trouillot, Michel-Rolph. Silencing the Past: Power and the Production of History. Beacon Press, 1995.