Welfare State Analytics. Text Mining and Modeling Swedish Politics, Media & Culture, 1945-1989 (WeStAc) is a digital humanities research project with three co-operating partners: Umeå University, Uppsala University and the National Library of Sweden. The project will digitise literature, curate already digitised collections, and perform research using probabilistic methods and text mining models. WeStAc will digitise and curate three massive textual datasets (in all, Big Data of almost four billion tokens) from the domains of Swedish politics, news media and literary culture during the second half of the 20th century. The “Politics” dataset contains already digitised Swedish Governmental Reports (SOU) and material from the Swedish Parliament; “Media” contains two digitised Swedish newspapers, Aftonbladet and Dagens Nyheter; and “Culture”, which will be digitised, contains a literary journal, Bonniers Litterära magasin, and all Swedish novels from 1945 to 1989. WeStAc will establish a scholarly ecosystem of digitisation, curation and research with a twofold objective: (A) to develop digital curation work, including the preparation of massive datasets for research at the library, and (B) to develop digital history scholarship and perform DH-inspired textual research. WeStAc will trace discursive changes on a scale hitherto unexplored by Swedish scholars. Given the possibility of processing large amounts of data with methods such as probabilistic topic models, named entity recognition (NER) or word embeddings, WeStAc will analyse how societal transformations can be empirically measured, for example by distant reading the notion of globalisation, or by data modeling ideas of emancipation and individualisation.
The project design of WeStAc is organised into three distinct but parallel work packages: WP1 Digitisation & data curation, WP2 Text mining & modeling, and WP3 Welfare state analytics & research. Data-driven humanistic research, assisted by an open notebook environment, is often explorative. Ingesting datasets and working on them with different computational methods usually results in a scholarly practice where researchers learn about, and gradually familiarise themselves with, the data at hand. As a data-driven research proposal, WeStAc will indeed give rise to explorative scholarly practices. However, research within WP3 will also examine more specific matters. Accordingly, the work package is organised into six research tasks, including a number of subtasks, where the first three focus on general tendencies that cut across all three datasets, and the latter three on particular research issues within each dataset. Tasks 3.1-3.3 will scrutinise three broad Swedish post-war themes central to all three datasets: globalisation, emancipation and individualisation. Tasks 3.4-3.6 will approach the three macro spheres and datasets of politics, media and culture with more specific questions and methods, designed to match the characteristics of each dataset. Given WeStAc’s combination of general and particular research questions, the project will give new perspectives on well-researched topics, and explore novel ways of analysing historical Big Data.