SONIC lab is proud to welcome Jana Diesner, who will present a talk on Thursday, Mar 07, 2013 (12:00-01:15pm) in Frances Searle Building Room 1.483 on Northwestern’s Evanston Campus. All are welcome to attend.

About The Talk: From Words to Networks: Relevance of methodological choices for real-world applications

Coding texts as socio-technical networks – a process also known as relation extraction – can be used to collect network data on hard-to-access groups and organizations. This process requires people to choose appropriate methods and parameter settings. The impact of these choices on the resulting data and findings can be strong, but is poorly understood. I discuss our findings from addressing this problem:

We applied four common relation extraction methods – from fairly qualitative to fully automated (including probabilistic, machine-learning based techniques) – to large-scale, open-source corpora from the business, science and geopolitical domains, and compared the retrieved networks. I will report on where the methods agree and disagree about network structure and behavior, and show how these methods can be combined to gain a more robust and comprehensive understanding of a network.

Another factor limiting the reliability of relation extraction methods is the propagation of errors throughout multi-step analysis procedures and pipelines. I will present our findings from a series of empirical experiments that we conducted to find answers to the following questions: How much variation in network structure and properties is due to the error rates of the involved sub-routines? Does increasing the accuracy of these techniques actually matter for network analysis results?

About Jana Diesner

Jana Diesner is an Assistant Professor at the iSchool (a.k.a. Graduate School of Library and Information Science) at the University of Illinois at Urbana-Champaign. She earned her PhD from Carnegie Mellon University, School of Computer Science. Jana conducts research at the nexus of network science, natural language processing and machine learning. With her work, she aims to advance the understanding and computational analysis of the interplay and co-evolution of information and socio-technical networks. She develops, analyzes and applies methods and technologies for extracting information about networks from text data and for considering the substance of information in network analysis. In her empirical work, she studies networks from the business, science and geopolitical domains. She is particularly interested in covert information and covert networks. For more information see http://people.lis.illinois.edu/~jdiesner/.
