Methodology
The main idea of our research is to combine natural language processing (NLP) methods with network-based methods to analyse and characterise information spreading patterns in social media related to COVID-19. We will propose a multilayer framework that defines a set of approaches and methods capturing three aspects of information spreading analysis: content, context and dynamics.
The core of the proposed framework is a multilayer network that integrates heterogeneous networks into one complex structure, which enables an analysis of the various layers and their interconnections. The general models of multilayer networks were initially proposed in (S. Boccaletti et al., 2014; M. Kivelä et al., 2014). According to Boccaletti et al., multilayer networks explicitly incorporate multiple aspects of connectivity to describe systems interconnected through different categories of connections. Within this structure, each aspect is represented by a separate layer, and the same node or entity may have different kinds of interactions on different layers.
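To make the layered structure concrete, the following is a minimal sketch of one possible in-memory representation, not the framework's actual implementation. The class `MultilayerNetwork`, its method names, the layer names (`retweet`, `reply`, `mention`) and the account names are all hypothetical, chosen only for illustration. The sketch assumes undirected edges and that a node keeps the same identifier on every layer.

```python
from collections import defaultdict


class MultilayerNetwork:
    """Undirected multilayer network: one adjacency map per layer, shared node ids."""

    def __init__(self):
        # layer name -> node -> set of neighbours within that layer
        self.layers = defaultdict(lambda: defaultdict(set))

    def add_edge(self, layer, u, v):
        """Add an intra-layer edge between u and v."""
        self.layers[layer][u].add(v)
        self.layers[layer][v].add(u)

    def degree(self, layer, node):
        """Number of neighbours of node within one layer (0 if absent)."""
        if layer not in self.layers:
            return 0
        return len(self.layers[layer].get(node, ()))

    def layers_of(self, node):
        """Names of the layers in which the node has at least one interaction."""
        return sorted(name for name, adj in self.layers.items() if node in adj)


# Hypothetical layers and accounts, for illustration only.
net = MultilayerNetwork()
net.add_edge("retweet", "alice", "bob")
net.add_edge("retweet", "alice", "carol")
net.add_edge("reply", "alice", "bob")
net.add_edge("mention", "bob", "dave")

print(net.degree("retweet", "alice"))
print(net.layers_of("bob"))
```

Keeping one adjacency map per layer lets the same entity interact differently on each layer, while the shared identifier provides the implicit coupling between layers.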
Within the multilayer framework we will define a set of approaches for the thorough quantitative and qualitative analysis of textual data crawled from social media, namely a content-based analysis, a context-based analysis and a dynamic-based analysis. The goal of these analyses is to identify which properties differentiate between various information spreading patterns. This is the first step toward the definition of algorithms for the recognition of information spreading patterns.
The content-based analysis will focus on the content of textual information and will rely on NLP approaches. We will apply a set of methods and techniques based on neural networks and deep learning (R. Collobert et al., 2011; Q. Le & T. Mikolov, 2013; LeCun, Bengio & Hinton, 2015), complemented by standard NLP techniques (C. Manning & H. Schütze, 1999). The tasks will include keyword extraction, measuring the semantic similarity of texts and text classification. Additionally, we will apply methods from descriptive statistics to describe and visualise the properties of texts and corpora.
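As an illustration of two of the tasks named above, keyword extraction and text similarity, the sketch below uses TF-IDF weighting with cosine similarity. This is a simple classical baseline swapped in for brevity, not the neural methods the analysis will rely on, and it captures lexical overlap rather than deeper semantics. The function names, the smoothed IDF variant and the example posts are assumptions made for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Raw term count times a smoothed inverse document frequency.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_keywords(vector, k=3):
    """The k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Invented example posts, for illustration only.
posts = [
    "Masks reduce COVID transmission indoors",
    "COVID transmission is reduced by masks",
    "Vaccine rollout starts in January",
]
vectors = tfidf_vectors(posts)
print(top_keywords(vectors[0], 2))
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In this toy corpus the first two posts share several terms and therefore score higher than the first and third, which share none. An embedding-based model could replace `tfidf_vectors` while leaving the similarity step unchanged.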
The context-based analysis of information spreading will be performed in terms of network analysis at the global, meso and local scales (R. Albert & A.-L. Barabási, 2002; S. Boccaletti et al., 2006). We will also analyse the intra-layer connections within each layer and the inter-layer connections between layers of the multilayer network.
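The sketch below illustrates the kinds of measures such an analysis might start from: density as a global measure of one layer, node degree as a local measure, and the Jaccard overlap of edge sets as one simple way to compare two layers. The choice of these particular measures, the function names and the toy layers are assumptions made for illustration; edges are treated as undirected and unweighted.

```python
from itertools import chain


def edge(u, v):
    """Undirected edge, stored so that (u, v) and (v, u) are identical."""
    return frozenset((u, v))


def density(edges, n_nodes):
    """Global measure: fraction of possible node pairs that are connected."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


def degrees(edges):
    """Local measure: number of edges incident to each node in one layer."""
    deg = {}
    for e in edges:
        for node in e:
            deg[node] = deg.get(node, 0) + 1
    return deg


def edge_overlap(edges_a, edges_b):
    """Layer comparison: Jaccard index of the two layers' edge sets."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Hypothetical two-layer network over the same set of accounts.
layers = {
    "retweet": {edge("a", "b"), edge("b", "c"), edge("a", "c")},
    "reply": {edge("a", "b"), edge("c", "d")},
}
# Flatten layers -> edges -> nodes to collect every node that appears anywhere.
nodes = set(chain.from_iterable(chain.from_iterable(layers.values())))

for name, edges in layers.items():
    print(name, density(edges, len(nodes)), degrees(edges))
print(edge_overlap(layers["retweet"], layers["reply"]))
```

Density is computed against the node set of the whole multilayer network, so that layers are compared on the same basis even when some nodes are inactive in a given layer.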
The dynamic-based analysis will include the analysis of cascade effects and network dynamics (M. Jalili & M. Perc, 2017; D. Brien et al., 2019). Here we will apply standard measures to characterise the dynamics of information spreading, combined with the dynamics of the multilayer network itself.
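As an example of what such measures can look like, the sketch below computes three commonly used descriptors of a single cascade: its size, depth and maximum breadth. It assumes that a cascade is available as a tree of reshares, given as (child, parent) pairs with a known root post. The function name, the input format and the post identifiers are hypothetical and serve only as illustration.

```python
from collections import defaultdict, deque


def cascade_stats(reshares, root):
    """Size, depth and maximum breadth of a reshare cascade.

    reshares: iterable of (child, parent) pairs, one per reshare.
    root: identifier of the original post.
    """
    children = defaultdict(list)
    for child, parent in reshares:
        children[parent].append(child)

    # Breadth-first traversal from the root assigns each post its depth.
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)

    # Count how many posts sit at each depth level.
    per_level = defaultdict(int)
    for d in depth.values():
        per_level[d] += 1

    return {
        "size": len(depth),                       # posts reachable from the root
        "depth": max(depth.values()),             # longest reshare chain
        "max_breadth": max(per_level.values()),   # widest level of the tree
    }


# Hypothetical cascade: p0 is the original post.
reshares = [("p1", "p0"), ("p2", "p0"), ("p3", "p1"), ("p4", "p3")]
stats = cascade_stats(reshares, "p0")
print(stats)
```

Posts that are not reachable from the root are ignored, so the same function can be applied to noisy data in which some reshare links are missing.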