In the sixth CREATE Salon we followed up on the topic of Performing Arts data. This time, we invited speakers to reflect on using datasets for analysis of, for instance, patterns of spatial distribution and network formation in cinema and theatre. Two datasets took center stage: Cinema Context, with historical data on films, screenings, individuals, distributors, and cinemas, and Theaterencyclopedie, which contains data on all premieres in the Netherlands since 1900, numerous biographies as well as visual material.

Thunnis van Oort presented part of his FWO Pegasus project Movie-going at the docks. A media historical comparative analysis of cinema cultures in Antwerp (Flanders) and Rotterdam (Netherlands) (1910-1990). Within the field of new cinema history, researchers such as Thunnis focus on the circulation and consumption of films rather than on their content, and this has stimulated the creation of new datasets on cinema. In the Netherlands and Belgium this trend is embodied by projects and datasets such as Cinema Context and ‘The Enlightened City’, as well as the memory archive Antwerpen Kinemastad. The Cinema Context database was developed in the 2000s by Karel Dibbets and can be searched through a web interface. The dataset was tied to the research project Cinema, modern life and cultural identity, 1896–1940, funded by NWO and carried out at Utrecht University and the University of Amsterdam between 2002 and 2006. The Enlightened City project analyzes screen culture between ideology, economics and experience through a study of the social role of film exhibition and film consumption in Flanders (1895-2004) in interaction with modernity and urbanization.

In his project, Thunnis explores ways to compare and connect such datasets across borders. The cases of the Netherlands and Belgium are particularly interesting: Flanders boasted some of the highest cinema attendance per capita on the continent, while its Dutch neighbors were among the least frequent moviegoers of all Europeans. How can these differences be explained? Thunnis’s exploration of the two datasets showed that comparing programming data was almost impossible, as the Dutch data runs up to 1948, whereas the Belgian data covers the second half of the century. Biographical data on firms, cinemas and entrepreneurs, however, is available for both countries, and in his presentation Thunnis showed one possible method of analysis: GIS mapping. With this data one can map the distribution of cinemas across space and time, but Thunnis also discussed the obstacles he encountered. Zooming in on the port cities of Antwerp and Rotterdam, he located fixed cinemas in different years on historical maps, and it is easy to see how this may result in a neat visualization of differences between the cities, as well as changes over time. However, the analytical part of the job proved much more difficult, as data on explanatory variables such as population density, social class, politics and religion, as well as other dimensions of cultural infrastructure, is not readily available at the neighborhood level, across time, or for both countries. Even so, Thunnis’s project shows how historians may employ GIS mapping techniques to go beyond simple visualization and establish correlations between historical outcomes and social, economic, political, and cultural variables. Collecting and converting the contextual data needed for more complex GIS querying therefore definitely warrants our collective attention, as it would also provide us with instruments for research on other (twentieth-century) cultural industries.

The presentation by Kathleen Lotze and Rosa Merino Claros expanded on the use of data in Cinema Context for advanced querying on spatial distribution and networks in the early Amsterdam film world. Building on a month-long internship by student Ingmar Meeboer with professor Julia Noordegraaf, Kathleen highlighted some methodological challenges in using digital tools and data on cinema for historical network analysis. In the internship, Ingmar had set out to identify key actors in early Amsterdam cinema networks, but encountered several technical challenges. Although the Cinema Context dataset is incredibly rich, and the web interface allows for basic querying, limits on the number of records that can be downloaded at once make it impossible to access the data quickly. Moreover, the downloads are XML files that are not always conveniently structured or easy for researchers to use. Rosa Merino Claros, one of the CREATE pre-PhD fellows, recently wrote a blog post on our site on building dynamic maps, using Cinema Context data as a sample. During the Salon, she demonstrated how she extracted the relevant information (the coordinates of each cinema, its name and the years when it was active) and stored it in a CSV file. Ideally, she explained, the Cinema Context interface (and similar databases) would in the future offer the option of downloading selected data as either XML or CSV files, so that researchers can choose depending on their aims and technical skills. That way historians could download only the information relevant to their research purposes and not worry too much about reading XML files.
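To give an impression of what such an extraction step involves, here is a minimal Python sketch. It is not Rosa's actual script, and the XML layout is an assumption made purely for illustration: the element names (`cinema`, `name`, `lat`, `lon`, `opened`, `closed`) and the two sample records are hypothetical, and a real Cinema Context export will be structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout and placeholder records, for illustration only.
# A real Cinema Context download uses its own element names and structure.
SAMPLE_XML = """<cinemas>
  <cinema>
    <name>Example Cinema A</name>
    <lat>52.3702</lat>
    <lon>4.8952</lon>
    <opened>1912</opened>
    <closed>1930</closed>
  </cinema>
  <cinema>
    <name>Example Cinema B</name>
    <lat>52.3667</lat>
    <lon>4.9000</lon>
    <opened>1925</opened>
    <closed>1940</closed>
  </cinema>
</cinemas>"""

FIELDS = ["name", "lat", "lon", "opened", "closed"]


def cinemas_to_rows(xml_text):
    """Pull the selected fields out of every <cinema> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for cinema in root.iter("cinema"):
        # A missing element becomes an empty string instead of raising an error.
        rows.append({f: (cinema.findtext(f) or "").strip() for f in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def active_in(rows, year):
    """Return the cinemas open in a given year (rows lacking a year are skipped)."""
    return [
        r for r in rows
        if r["opened"] and r["closed"]
        and int(r["opened"]) <= year <= int(r["closed"])
    ]


rows = cinemas_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
```

The resulting CSV text can be written to a file and loaded into a spreadsheet or GIS tool, and a helper such as `active_in(rows, 1920)` selects the cinemas to plot for a single year on a map.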

Finally, Ad Aerts, from Special Collections at the University Library, provided an excellent example of the ways in which databases originally designed as encyclopedias or for collection administration are now being opened up for research purposes. After the closure of Theater Instituut Nederland (TIN) in 2012, the collection was taken in by the University Library. One of the hallmarks of the collection is the digital Theaterencyclopedie, which was initially based on the yearbooks initiated by Luisa Treves in the 1950s, focusing on the simple question ‘what is going on in the Dutch theater world?’. This question, as it turned out, inspired what is now an immensely rich Adlib database with over 100,000 performances. However, many of the entries are still incomplete, and it is simply impossible to complete them without substantial additional funding. The Theaterencyclopedie was therefore further developed into a crowdsourcing project that allows TIN to share its information while at the same time extending it with the help of users, by linking its database to a wiki. Although it still functions primarily as an online encyclopedia, important steps are being taken that will enable historical data analysis. One of these is the use of the semantic wiki model, which lists not only entities but also properties and relations between entities, and which can be processed in a relational database. This means that the Theaterencyclopedie is developing from a list of pages with names, to put it bluntly, into a web of linked pages with properties. The wiki section already has a public API, and hopefully soon an RDF export function as well. The general Adlib data, however, cannot be opened up completely at this point, due to privacy issues regarding certain fields in the database. A solution for this is under construction, so hopefully we can also use their API in the near future.
Ad showed that the TIN has a clear ambition: to make information as widely useful as possible; make tools as widely useful as possible. And this is excellent news for researchers on historical creative industries.
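
To make the difference between a list of pages and a web of linked pages with properties a little more concrete, here is a small Python sketch of the general idea behind a semantic model. It does not reflect the actual schema or API of the Theaterencyclopedie: the entity names and property names below are hypothetical placeholders, chosen only to show how statements about entities can be stored and queried.

```python
from collections import namedtuple

# Each statement links a subject to a value or to another entity via a property.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical placeholder data; these are not real wiki entries or properties.
TRIPLES = [
    Triple("Production X", "instance_of", "Production"),
    Triple("Production X", "premiere_year", "1956"),
    Triple("Production X", "performed_by", "Company Y"),
    Triple("Production X", "directed_by", "Person Z"),
    Triple("Person Z", "member_of", "Company Y"),
]


def properties_of(triples, subject):
    """Collect every property recorded for one entity."""
    return {t.predicate: t.obj for t in triples if t.subject == subject}


def subjects_linked_to(triples, predicate, obj):
    """Find all entities that point to a given value through a given property."""
    return [t.subject for t in triples if t.predicate == predicate and t.obj == obj]


production = properties_of(TRIPLES, "Production X")
staged_by_company = subjects_linked_to(TRIPLES, "performed_by", "Company Y")
```

Because relations are stored explicitly, a question such as "which productions were performed by this company?" becomes a simple lookup rather than a matter of reading through individual pages.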

Although the datasets discussed in this Salon have different origins, with Cinema Context having been developed for research purposes and the Theaterencyclopedie as a reference work, the presentations also revealed marked similarities: both are true treasure troves for research on the history of cultural industries, and neither is yet quite as easily accessible for advanced querying as we would like.