Posts by Collection

portfolio

publications

Coca-Cola: An Icon of the American Way of Life. An Iterative Text Mining Workflow for Analyzing Advertisements in Dutch Twentieth-Century Newspapers.

Published in Digital Humanities Quarterly, 2017

Recommended citation: Melvin Wevers, Jesper Verhoef (2017). “Coca-Cola: An Icon of the American Way of Life. An Iterative Text Mining Workflow for Analyzing Advertisements in Dutch Twentieth-Century Newspapers.” Digital Humanities Quarterly, 11:4. http://digitalhumanities.org:8081/dhq/vol/11/4/000338/000338.html

Constructing a Recipe Web from Historical Newspapers

Published in ISWC 2018, 2018

Historical newspapers provide a lens on customs and habits of the past. For example, recipes published in newspapers highlight what and how we ate and thought about food. The challenge here is that newspaper data is often unstructured and highly varied. Digitised historical newspapers add an additional challenge, namely that of fluctuations in OCR quality. Therefore, it is difficult to locate and extract recipes from them. We present our approach based on distant supervision and automatically extracted lexicons to identify recipes in digitised historical newspapers, to generate recipe tags, and to extract ingredient information. We provide OCR quality indicators and their impact on the extraction process. We enrich the recipes with links to information on the ingredients. Our research shows how natural language processing, machine learning, and semantic web can be combined to construct a rich dataset from heterogeneous newspapers for the historical analysis of food culture.

Recommended citation: van Erp M., Wevers M., Huurdeman H. (2018) Constructing a Recipe Web from Historical Newspapers. In: Vrandečić D. et al. (eds) The Semantic Web – ISWC 2018. ISWC 2018. Lecture Notes in Computer Science, vol 11136. Springer, Cham. https://doi.org/10.1007/978-3-030-00671-6_13 https://link.springer.com/chapter/10.1007/978-3-030-00671-6_13

The visual digital turn: Using neural networks to study historical images

Published in Digital Scholarship in the Humanities, 2019

Digital humanities research has focused primarily on the analysis of texts. This emphasis stems from the availability of technology to study digitized text. Optical character recognition allows researchers to use keywords to search and analyze digitized texts. However, archives of digitized sources also contain large numbers of images. This article shows how convolutional neural networks (CNNs) can be used to categorize and analyze digitized historical visual sources. We present three different approaches to using CNNs for gaining a deeper understanding of visual trends in an archive of digitized Dutch newspapers. These include detecting medium-specific features (separating photographs from illustrations), querying images based on abstract visual aspects (clustering visually similar advertisements), and training a neural network based on visual categories developed by domain experts. We argue that CNNs allow researchers to explore the visual side of the digital turn. They allow archivists and researchers to classify and spot trends in large collections of digitized visual sources in radically new ways.

Recommended citation: Melvin Wevers, Thomas Smits, The visual digital turn: Using neural networks to study historical images, Digital Scholarship in the Humanities, Volume 35, Issue 1, April 2020, Pages 194–207, https://doi.org/10.1093/llc/fqy085 https://academic.oup.com/dsh/article-pdf/35/1/194/32976784/fqy085.pdf

Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990

Published in ACL - LangChang Workshop, 2019

Contemporary debates on filter bubbles and polarization in public and social media raise the question to what extent news media of the past exhibited biases. This paper specifically examines bias related to gender in six Dutch national newspapers between 1950 and 1990. We measure bias related to gender by comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds. We demonstrate clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, we see the bias moving toward women, whereas, generally, the bias shifts in the direction of men, despite growing female employment numbers and feminist movements. Even though Dutch society became less stratified ideologically (depillarization), we found an increasing divergence in gender bias between religious and social-democratic newspapers on the one hand and liberal newspapers on the other. Methodologically, this paper illustrates how word embeddings can be used to examine historical language change. Future work will investigate how fine-tuning deep contextualized embedding models, such as ELMo, might be used for similar tasks with greater contextual information.

Recommended citation: Melvin Wevers. "Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990." arXiv preprint arXiv:1907.08922 (2019). https://arxiv.org/pdf/1907.08922

Digital begriffsgeschichte: Tracing semantic change using word embeddings

Published in Historical Methods: A Journal of Quantitative and Interdisciplinary History, 2020

Recently, the use of word embedding models (WEM) has received ample attention in the natural language processing community. These models can capture semantic information in large corpora of text by learning distributional properties of words, that is, how often particular words appear in specific contexts. Scholars have pointed out the potential of WEMs for historical research. In particular, their ability to capture semantic change might assist historians studying conceptual change or specific discursive formations over time. Concurrently, others voiced their criticism and pointed out that WEMs require large amounts of training data, that they are challenging to evaluate, and that they lack the specificity looked for by historians. The ability to examine semantic change resonates with the goals of historians such as Reinhart Koselleck, whose research focused on the formation of concepts and the transformation of semantic fields. However, word embeddings can only be used to study particular types of semantic change, and the model’s use is dependent on the size, quality, and bias in training data. In this article, we examine what is required of historical data to produce reliable WEMs, and we describe the types of questions that can be answered using WEMs.

Recommended citation: Melvin Wevers & Marijn Koolen (2020) Digital begriffsgeschichte: Tracing semantic change using word embeddings, Historical Methods: A Journal of Quantitative and Interdisciplinary History, DOI: 10.1080/01615440.2020.1760157 https://www.tandfonline.com/doi/full/10.1080/01615440.2020.1760157?scroll=top&needAccess=true

Tracking the Consumption Junction: Temporal Dependencies between Articles and Advertisements in Dutch Newspapers

Published in Digital Humanities Quarterly, 2020

Historians have regularly debated whether advertisements can be used as a viable source to study the past. One of their main concerns centered on the question of agency. Were advertisements a reflection of historical events and societal debates, or were ad makers instrumental in shaping society and the ways people interacted with consumer goods? Using techniques from econometrics (Granger causality test) and complexity science (Adaptive Fractal Analysis), this paper analyzes to what extent advertisements shaped or reflected society. We found evidence that indicates a fundamental difference between the dynamic behavior of word use in articles and advertisements published in a century of Dutch newspapers. Articles exhibit persistent trends. Contrary to this, advertisements show more irregular behavior characterized by short bursts and fast decay, which, in part, mirrors the dynamic through which advertisers introduced terms into public discourse. On the issue of whether advertisements shaped or reflected society, we found particular product types that seemed to be collectively driven by a Granger causality going from advertisements to articles. Generally, we found support for a complex interaction pattern, analogous to Cowan’s concept of the consumption junction. Finally, we discovered noteworthy patterns in terms of Granger causality and long-range dependencies for specific product groups. All in all, this study shows how methods from econometrics and complexity science can be applied to humanities data to improve our understanding of complex cultural-historical phenomena such as the role of advertising in society.

Recommended citation: Melvin Wevers, Jianbo Gao, Kristoffer Nielbo (2020). "Tracking the Consumption Junction: Temporal Dependencies between Articles and Advertisements in Dutch Newspapers." Digital Humanities Quarterly, 14:2. http://digitalhumanities.org/dhq/vol/14/2/000445/000445.html

talks

teaching