WP10 - JRA3 - Exploratories

Objectives

This WP's cross-disciplinary social mining research will be structured in vertical thematic environments, called ‘exploratories’, aimed at creating new datasets and services to be integrated within the SoBigData++ research infrastructure. Each exploratory will have two major figures: (i) a Scientific Leader, a domain expert responsible for fostering the integration of research activities and for supervising their scientific soundness and validity; (ii) a User Community Activist, a researcher of the domain tasked with sharing information among the partners and collaborating with WP3, WP4, WP5 and the other Exploratories. She/he will also help in organizing events, disseminating the results, and fostering interest among the e-infrastructure users through social network activities (e.g. posts and user interactions). The Scientific Leader and User Community Activist roles will be assigned at the beginning of the project, and monitoring their activities will be part of the task leader's duties. The WP activity may include additional emerging Exploratories, arising from the interaction between associated partners or other users, which will be included in the SoBigData++ platform; these new Exploratories will be evaluated by the consortium through interest groups.

Tasks

T10.1 Societal Debates and Misinformation Analysis
Task leader: USFD
Participants: IMT, BSC, CNR, ETHZ, UT, CNRS, UNIPI, UvA, CSD, CEU
By analysing discussions on social media and newspaper articles, this exploratory aims to develop methods and datasets for studying online public debates in (near) real-time and at scale, e.g. during election campaigns or on controversial topics such as vaccination, abortion, or LGBT rights. The starting point will be the identification of key themes and points of view in these debates. The discussions on these topics will then be analysed and visualised, and their evolution through time and space (e.g. in different countries or regions) will be assessed. The central focus will be misinformation, a field where we will develop new methods for detecting, analysing, and tracking online misinformation and propaganda across social media platforms, countries, and over time. A key aim is to improve the accuracy of the methods through more data, experimentation with semi-supervised and unsupervised methods, and the integration of the latest advances in deep learning. We will also study the effect of different social relationships on opinion formation. A multi-disciplinary approach will be adopted, going beyond computer science to involve social and political scientists, as well as end-users and practitioners, such as the Centre for Study of Democracy (CSD), which will focus specifically on Russian propaganda and misinformation in Eastern and Central Europe. T10.1 will result in new tools for the infrastructure, thus empowering researchers from outside the consortium to work on these topics.

T10.2 Demography, Economy & Finance 2.0
Task leader: ETHZ
Participants: CNR, SNS, IMT, UT, CEU, UNIROMA1, PSE, SSSA
The aim is to combine statistical methods and traditional economic data (typically low-frequency) with high-frequency data from non-traditional sources, such as the web and supermarkets, for now-casting economic, socio-economic and well-being indicators. This will make it possible to study and measure real-life costs through price variation and the inference of socioeconomic status. Furthermore, this task's activity is expected to support studies on the correlation between people's well-being and their social and mobility data, aiming at discovering whether they change in less affluent areas. This exploratory will study traditional complex socio-economic financial systems in conjunction with emerging ones, in particular blockchain and cryptocurrency markets and their applications, such as smart property, the Internet of Things (IoT), energy trading, and smart contracts. In the field of finance, different aspects will be studied, such as risk and liquidity estimation, microstructure dynamics and market predictions, as well as connections to social media and news. Particular emphasis will be placed on the stability and “fairness” properties of cryptocurrency markets (for example, the absence of manipulations such as wash trades). The Swiss partner ETHZ, in collaboration with the other partners, has already started collecting data in order to attract the research community.

T10.3 Sustainable Cities for Citizens
Task leader: IMT
Participants: CNR, FRH, CEU, UAQ, CRA, KTH, URV, Eli
This exploratory will focus on narrating stories about cities, the sustainability of their flows of energy and materials, and the people living in them. Data scientists describe those territories from an industrial ecology perspective driven by data, statistics and models. This allows citizens and local administrators to better understand cities and how to improve them. In this task, we analyse data describing a city at different spatial and temporal scales. On city-wide scales, we analyse the main energy and material flows (i.e. water, energy and material consumption) in order to give insights on the sustainability of the transformation processes occurring in cities (the so-called ‘urban metabolism’), pointing out the circularity of flows and the main polluting/GHG emission sectors and factors. On a smaller scale, data related to electric mobility services in major EU cities will allow the characterisation of the demand of dynamic users, enabling the derivation of operational models to optimize the electric mobility charging and relocation service while minimising its impact on the power grid.

T10.4 Migration Studies
Task leader: PSE
Participants: UNIPI, CNR, AALTO, CEU
Could Big Data help to understand the migration phenomenon? In this exploratory, our scientists will try to answer various questions about migration in Europe and in the world. Several studies are ongoing, including developing economic models of migration, now-casting migration stocks and flows, and identifying the perception of migration and its effect on the communities of origin and the receiving communities. We will also study the effect of migrants' personal networks (through the ego network graph abstraction) on the different migration phases (i.e. migration choices as well as cultural assimilation and transnationalism).

T10.5 Sports Data Science
Task leader: UNIPI
Participants: CNR, FRH
The task will provide massive heterogeneous dynamic data describing several sports (especially soccer, cycling and rugby) to construct an interpretable, explainable and easy-to-use tool for a variety of stakeholders in sports: coaches and managers, athletes, scouts, journalists and the general public. These studies open an interesting perspective on how to understand and explain the factors influencing success in sports and how to build simulation tools for boosting both individual and collective performance.

T10.6 Social Impacts of AI and Explainable Machine Learning
Task leader: UNIPI
Participants: CNR, FRH, USFD, LUH, KTH, UNIROMA1, ETHZ, UPF, UT, KCL
The task will investigate the foreseeable impact of AI and Big Data on society, developing analytical and simulation tools. It will also integrate a vast repertoire of practical tools for explainable AI, in particular methods for deriving meaningful explanations of black-box decision systems based on machine learning.

T10.7 SoBigData Interest groups
Task leader: UNIROMA1
Participants: ALL
Interest groups are possible future Exploratories, which will be investigated by the consortium to understand whether there is interest and experience that may be transformed into services. These interest groups will organize meetings with experts in the field, researchers and industries, with a view to eventually becoming Exploratories in SoBigData++. The first interest group will be Network Medicine: it will include collaboration with the Department of Computer, Control, and Management Engineering Antonio Ruberti (“DIAG”) at Sapienza University of Rome, Professor Albert László Barabási and other experts in the medical field. Since network medicine deals with sensitive data that raise privacy concerns in the field of medical Big Data, the ethical and legal framework developed by WP2 will be deployed to ensure compliance with the EU ethics, principles and privacy regulations.