Concept Mover’s Distance
In collaboration with Marshall A. Taylor, we propose a method for measuring a text’s engagement with a focal concept using distributional representations of word meaning. More specifically, the measure relies on Word Mover’s Distance, which uses word embeddings to determine the similarity between two documents. In our approach, which we call Concept Mover’s Distance, a document’s engagement is measured by the minimum distance its words need to travel to arrive at the position of an ideal “pseudo document” consisting of words denoting a specified concept. This approach captures the prototypical structure of concepts, can be used even when terms denoting concepts are absent from the corpora, and is fairly robust to pruning sparse terms as well as variation in text lengths within a corpus. The paper is forthcoming in the Journal of Computational Social Science, and a pre-print of “Concept Mover’s Distance: Measuring Concept Engagement in Texts via Word Embeddings” can be found on SocArXiv.
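To make the "distance the words need to travel" idea concrete, here is a minimal Python sketch. It is not the code from the paper: the two-dimensional embeddings are invented for illustration, and the function name `concept_distance` is hypothetical. It computes the relaxed form of Word Mover's Distance, in which each word in the document travels to its nearest concept word. When the pseudo document contains a single word, this equals the exact Word Mover's Distance, since all of the document's mass has to arrive at that one word.

```python
import math
from collections import Counter

# Toy 2-d embeddings, invented purely for illustration (an assumption,
# not vectors from any trained model).
EMBEDDINGS = {
    "death": (1.0, 0.0),
    "grave": (0.9, 0.1),
    "funeral": (0.8, 0.2),
    "party": (0.0, 1.0),
    "music": (0.1, 0.9),
}


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms


def concept_distance(tokens, concept_words, embeddings=EMBEDDINGS):
    """Relaxed mover's distance from a document to a concept pseudo document.

    Each document word carries mass equal to its relative frequency and
    moves to the closest concept word. Smaller values mean the document
    sits closer to the concept.
    """
    known = [t for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token in the document has an embedding")
    counts = Counter(known)
    total = sum(counts.values())
    return sum(
        (n / total)
        * min(cosine_distance(embeddings[w], embeddings[c]) for c in concept_words)
        for w, n in counts.items()
    )


# A document about funerals should be closer to "death" than one about parties.
d_funeral = concept_distance(["funeral", "grave", "funeral"], ["death"])
d_party = concept_distance(["party", "music"], ["death"])
print(d_funeral < d_party)
```

Note that the word "death" never appears in either toy document, which illustrates how the measure can still register engagement when the concept term itself is absent.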
In collaboration with Marshall A. Taylor, we propose a measure of textual spanning, which increases when a document is similar to documents that are not themselves similar to each other, and decreases when those documents are similar to each other. This measure is particularly well-suited to the unique properties of text networks built from document similarity matrices when considered as dense weighted graphs. The paper “Textual Spanning: Finding Discursive Holes in Text Networks” can be found in Socius.
You can check out the reproduction materials, including the code for the function, at github.com/dustinstoltz/textual_spanning_socius. The R-based package, textSpan, is currently under development in collaboration with Marshall A. Taylor.
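For readers who want an intuition for the spanning idea before opening the repository, the following Python sketch may help. It is an assumption-laden illustration, not the function from the paper or the textSpan package: it uses Burt's network constraint on a row-normalized similarity matrix as a stand-in, and negates it so that higher scores mean more spanning. The exact formula in the paper may differ, and the names `network_constraint` and `spanning_scores` are hypothetical.

```python
def network_constraint(sim):
    """Burt's constraint for each document in a symmetric similarity matrix.

    `sim` is a list of lists; diagonal entries are ignored.
    """
    n = len(sim)
    # Proportional tie strengths: each row of similarities (without the
    # diagonal) rescaled to sum to 1.
    p = []
    for i in range(n):
        row = [sim[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        p.append([x / total if total > 0 else 0.0 for x in row])

    scores = []
    for i in range(n):
        c = 0.0
        for j in range(n):
            if j == i:
                continue
            # Similarity to j that is also reachable through third documents q.
            indirect = sum(p[i][q] * p[q][j] for q in range(n) if q not in (i, j))
            c += (p[i][j] + indirect) ** 2
        scores.append(c)
    return scores


def spanning_scores(sim):
    """Negated constraint, so that a higher score means more spanning."""
    return [-c for c in network_constraint(sim)]


# Document 0 is similar to documents 1 and 2 in both toy matrices.
# In `bridge`, documents 1 and 2 are dissimilar; in `clique`, they are similar.
bridge = [[1.0, 0.8, 0.8], [0.8, 1.0, 0.0], [0.8, 0.0, 1.0]]
clique = [[1.0, 0.8, 0.8], [0.8, 1.0, 0.8], [0.8, 0.8, 1.0]]

# Document 0 spans more in `bridge` than in `clique`.
print(spanning_scores(bridge)[0] > spanning_scores(clique)[0])
```

In the toy example, document 0 scores higher when its two neighbors are unlike each other, which is the pattern the measure is meant to reward.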