Methods for the description and analysis of processes in real-life networks

The roots of graph theory lead back to the puzzle of Königsberg's bridges. In 1736 Leonhardt Euler published a paper detailing this problem, and also proposed a solution for it. Since then much has been learned about the mathematical properties of graphs, and the field has a long history of app...

Teljes leírás

Elmentve itt :
Bibliográfiai részletek
Szerző: Bóta András
További közreműködők: Krész Miklós (Témavezető)
Pluhár András (Témavezető)
Dokumentumtípus: Disszertáció
Megjelent: 2015-05-07
Tárgyszavak:
doi:10.14232/phd.2510

mtmt:2993200
Online Access:http://doktori.ek.szte.hu/2510
Leíró adatok
Tartalmi kivonat:The roots of graph theory lead back to the puzzle of Königsberg's bridges. In 1736 Leonhardt Euler published a paper detailing this problem, and also proposed a solution for it. Since then much has been learned about the mathematical properties of graphs, and the field has a long history of applications including sociology, biology or operations research and optimization. Things changed when computers became accessible and affordable to most researchers, allowing them to collect, store, share and study large amounts of data on graphs observed in real-life. This gave rise to the interdisciplinary field of network science, which is dedicated to the description and analysis of real-life graphs borrowing methods from mathematics, physics, computer science and sociology. The topics of network science include the study of the organization principles of networks, for example their degree distribution and the formation of groups, as well as several processes taking place on the networks themselves. The goal of the dissertation is to provide an overview on the authors' works focusing on three major topics of network science: overlapping community detection, dynamic community detection and the study of infection processes. Based on our experiences with known community detection algorithms and the works of Csizmadia et al. we have developed an overlapping community detection method, the hub percolation method, that is capable of handling various types of networks or discover different layers of the community structure of single network. We have tested our method on several benchmarks including Newman's networks and Lancichinetti's community based graph generator. We have given two real-life case-studies: one of them examines the properties of an ownership network of Hungarian companies. The other one compares the community structures of a Hungarian and an English word association graph. Motivation for examining another aspect of community detection -- dynamic community detection -- came from applications on other social and economic networks. Based on the works of Palla et al. we have developed an algorithm to efficiently match the communities of two neighboring networks in a time series of graphs. Our method distinguishes eleven community events, and its mechanism does not depend on the specifics of the used community detection algorithm. We have evaluated our method on two real-life networks. The last topic of this dissertation is the study of infection processes in networks with a focus on economic applications. A common problem of these applications is, that the edge infection probabilities required to compute these processes are often not available, missing values are either estimated or constants are used. In order to provide a more systematic approach we have proposed the Inverse Infection Problem, its task being the computation of the edge infection probabilities with the help of observations on the beginning and the end of the process. We have proposed the Generalized Cascade model to help with the computations required for this model, as well as several heuristics to further improve performance. We have given a learning method based on Particle Swarm Optimization as a solution. Finally we have tested the performance of our solution on benchmark networks, and we have also published a case-study dealing with the application of our method in the estimation of credit default in banking.