perldoc Stefan::Evert
Computational Corpus Linguistics
My computational corpus linguistics group at FAU Erlangen-Nürnberg carries out foundational methodological research on the quantitative analysis of large text corpora. The algorithms and software tools developed by the group support innovative studies in the digital humanities and social sciences as well as practical applications in language technology. A particular focus lies on understanding cooccurrence phenomena and their application in corpus-based discourse analysis.
Methodological foundations – Corpus tools – Cooccurrence phenomena
Corpus research in linguistics, as well as in the digital humanities and social sciences, relies on a wide range of statistical techniques and visualizations. A central goal of my research is to develop sound methodological foundations for corpus linguistics that address key problems and ensure that quantitative analyses are both reliable and meaningful.
The FAU sub-project was concerned with methodological issues and the interpretation of quantitative measures in literary stylometry, focussing on authorship attribution (phase 1) and lexical/syntactic complexity (phase 2).
My group develops algorithms and software tools for the automatic linguistic annotation, efficient indexing, flexible querying and quantitative analysis of large text corpora. These tools form the basis of innovative research in the digital humanities as well as practical and commercial applications in language technology.
A corpus-linguistic approach to argumentation mining in social media, combined with representation and inference in a powerful logical framework.
Cooccurrence patterns – such as collocations, multiword expressions, valency and distributional semantics – play a central role not only in corpus linguistics but also in the study of public discourse and political propaganda. My research in this area focuses on improving and refining the underlying analytical techniques as well as developing new interactive methods for multi-modal corpus-based discourse analysis.
“Attitudes and Opinions towards Nuclear Power and Renewable Energy and the Emergence of a Transnational Algorithmic Public Sphere.” A key contribution of this project is the development of the innovative MMDA methodology and software toolkit for corpus-assisted discourse analysis.