Point and click data mining

Barbara Stephenson
Idaho State University

While last year's Institute was truly transformative for me, both in how I think about my research and in the work of the grant project, one aspect of the software we were introduced to really bothered me: the text mining. It works well for scholars working on literary texts, but it left a lot to be desired for those working with other types of sources. One of the biggest drawbacks was that these programs do not work for texts that are not in English. I remarked more than once that it would be great if there were a program that could take a text file in any language, let the user select text and define it as a node or an attribute, and create an edgelist without requiring a spreadsheet or some other predefined format. At the time, I thought such software was a pipe dream; when I talked to the programmers on the grant project, however, I was surprised to be told that such a program would be easy! So I would like to demonstrate the prototype of this software. As many will remember, the material I am collecting on the Parlement of Bordeaux contains a great deal of data that is difficult to store in an edgelist, and it includes complex relationships that change over time. One of the problems in dealing with this material is that it is in early modern French, and even modern French dictionaries do not recognize many of the terms. This software sidesteps that problem: it treats the text as a string of characters rather than as dictionary words, and it allows the user to define what each selected term represents and what its relationships are to other terms in the data.
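To make the idea concrete, here is a minimal Python sketch of the approach described above. It is not the prototype's actual code. The class and method names (`Annotator`, `select`, `relate`, `edgelist_csv`), the sample sentence, the personal name, and the date are all invented for illustration. The sketch assumes only that a selection is a span of characters in the text, that the user labels it as a node or an attribute, and that relationships may carry optional dates because they change over time.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Selection:
    """A user-selected span of the text, stored as character offsets."""
    start: int
    end: int
    kind: str    # "node" or "attribute", chosen by the user
    label: str   # what the user says the span represents


@dataclass
class Edge:
    """A user-defined relationship between two selections."""
    source: int           # index of a Selection
    target: int           # index of a Selection
    relation: str
    valid_from: str = ""  # optional, for relationships that change over time
    valid_to: str = ""


class Annotator:
    """Hypothetical stand-in for the point-and-click tool (illustration only)."""

    def __init__(self, text):
        self.text = text
        self.selections = []
        self.edges = []

    def select(self, substring, kind, label):
        # Plain string search: no dictionary or language model is consulted,
        # so the language and spelling of the text do not matter.
        start = self.text.find(substring)
        if start == -1:
            raise ValueError(f"{substring!r} not found in text")
        self.selections.append(
            Selection(start, start + len(substring), kind, label)
        )
        return len(self.selections) - 1

    def span(self, index):
        s = self.selections[index]
        return self.text[s.start:s.end]

    def relate(self, source, target, relation, valid_from="", valid_to=""):
        self.edges.append(Edge(source, target, relation, valid_from, valid_to))

    def edgelist_csv(self):
        # Write the edgelist directly, so the user never opens a spreadsheet.
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(
            ["source", "target", "relation", "valid_from", "valid_to"]
        )
        for e in self.edges:
            writer.writerow([
                self.span(e.source), self.span(e.target),
                e.relation, e.valid_from, e.valid_to,
            ])
        return buf.getvalue()


# Invented sample sentence, name, and date; not taken from the Bordeaux data.
text = "Maistre Jehan Dubois, conseiller du roy en sa court de Parlement de Bourdeaulx"
a = Annotator(text)
person = a.select("Jehan Dubois", "node", "person")
court = a.select("Parlement de Bourdeaulx", "node", "institution")
a.relate(person, court, "conseiller", valid_from="1550")
print(a.edgelist_csv())
```

Because each selection is only a pair of character offsets plus a user-supplied label, nothing in the sketch depends on the spelling or language of the source, which is the property that matters for early modern French.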

Back to Networks and Network Analysis for the Humanities: Reunion Conference