
Clustering and assortativity in configuration graphs

M. M. Leri¹

¹ Institute of Applied Mathematical Research of the Karelian Research Centre of the Russian Academy of Sciences, Petrozavodsk, Russia,

leri@krc.karelia.ru

Since the end of the 20th century, the study of random graphs whose node degrees are independent identically distributed random variables following a common power-law distribution has gained momentum. The reason is clear: observations of real-world complex communication networks showed (see e.g. Faloutsos et al. [1], Hofstad [2]) that these models could be used to describe them. However, as networks grew, it became evident that knowing the node degree distribution and its parameters is not enough to obtain a well-fitting model of a real network; certain numerical characteristics have to match as well.

© Leri M. M., 2018

In this work we consider configuration graphs, introduced by Bollobás [3], with the following power-law node degree distribution (see Reittu and Norros [4]):

$$P\{\xi = k\} = k^{-\tau} - (k+1)^{-\tau}, \quad \tau > 1, \quad k = 1, 2, \ldots, \tag{1}$$

where $\xi$ is a random variable equal to the degree of an arbitrary node.

Each node's degree gives the number of semiedges incident to it. The semiedges are numbered in an arbitrary order, and the graph is constructed by joining all semiedges to one another equiprobably to form links. This construction requires the sum of node degrees to be even; if the sum is odd, one semiedge is added to an equiprobably chosen node to complete the missing link. The configuration model allows loops and multiple links in the graph.
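The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the author's simulation code: the function names `sample_degree` and `configuration_graph` are hypothetical, and the degree sampler relies on the fact that distribution (1) gives P{ξ ≥ k} = k^(-τ), so ξ can be drawn as the integer part of U^(-1/τ) for U uniform on (0, 1].

```python
import random


def sample_degree(tau, rng):
    """Draw one degree from P{xi = k} = k**(-tau) - (k + 1)**(-tau), k >= 1.

    Inverse-transform sampling: P{xi >= k} = k**(-tau), hence
    xi = floor(U**(-1/tau)) with U uniform on (0, 1].
    """
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
    return int(u ** (-1.0 / tau))


def configuration_graph(n, tau, seed=None):
    """Return (degrees, edges) of a configuration graph with n nodes.

    Loops and multiple links are kept, as the configuration model allows them.
    """
    rng = random.Random(seed)
    degrees = [sample_degree(tau, rng) for _ in range(n)]
    # An odd degree sum cannot be paired up: add one semiedge to a
    # uniformly chosen node.
    if sum(degrees) % 2 == 1:
        degrees[rng.randrange(n)] += 1
    # One list entry per semiedge, labelled by the node it belongs to.
    semiedges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Shuffling and pairing neighbours yields a uniform random matching.
    rng.shuffle(semiedges)
    edges = list(zip(semiedges[0::2], semiedges[1::2]))
    return degrees, edges
```

For values of τ close to 1 the degrees are very heavy-tailed, so the semiedge list can become large; the sketch makes no attempt to guard against that.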

Recent works (see e.g. Bianconi and Barabási [5], Pavlov [6]) show that the node degree distribution may not only change as the network grows but may even be random, which means that the graph is constructed in a so-called random environment. We therefore consider two types of configuration graphs. In the first, the parameter $\tau$ of the distribution (1) is a fixed value. In the second, the value of $\tau$ is drawn separately for each node from either a uniform or a truncated normal distribution on a predefined interval $(a, b)$, $1 < a < b < \infty$, so that the graph is formed in a random environment.
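A possible way to draw node degrees in a random environment is sketched below. The function names are hypothetical. The truncated normal distribution is sampled by simple rejection, which is an assumption of this sketch rather than a method stated in the text, and its mean (a + b)/2 and standard deviation (b - a)/6 are the values given in the simulation setup later in the paper.

```python
import random


def sample_tau_uniform(a, b, rng):
    """Draw tau uniformly; 1 - rng.random() lies in (0, 1], so tau is in (a, b]."""
    return a + (b - a) * (1.0 - rng.random())


def sample_tau_truncnorm(a, b, rng):
    """Draw tau from a normal law truncated to (a, b) by rejection.

    Mean (a + b) / 2 and standard deviation (b - a) / 6 follow the
    three-sigma rule, so only about 0.3% of proposals are rejected.
    """
    mu = (a + b) / 2.0
    sigma = (b - a) / 6.0
    while True:
        t = rng.gauss(mu, sigma)
        if a < t < b:
            return t


def sample_degrees_random_env(n, a, b, kind="uniform", seed=None):
    """Draw n degrees, each with its own tau taken from the chosen law."""
    rng = random.Random(seed)
    draw = sample_tau_uniform if kind == "uniform" else sample_tau_truncnorm
    degrees = []
    for _ in range(n):
        tau = draw(a, b, rng)
        u = 1.0 - rng.random()  # uniform on (0, 1]
        # Inverse transform for P{xi >= k} = k**(-tau).
        degrees.append(int(u ** (-1.0 / tau)))
    return degrees
```

The resulting degree list can then be paired into links exactly as in the fixed-τ case.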

Along with the node degree distribution, the description of real-world complex networks includes the study of various numerical characteristics that capture both local and global network properties. The best known among them are the global and local clustering coefficients and the assortativity coefficient.

The assortativity coefficient $A$ estimates the correlation between the degrees of nodes joined by a link; for this purpose it has been proposed (see e.g. Newman [7]) to use the Pearson correlation coefficient. If high-degree nodes connect mostly to other high-degree nodes, the coefficient $A$ is positive and the network is called assortative; if high-degree nodes connect mostly to low-degree nodes, $A$ is negative and the network is called disassortative.
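As an illustration, the Pearson correlation between the degrees at the two ends of a link can be computed as follows. This is a sketch under stated assumptions: the function name `assortativity` is hypothetical, every link is counted in both directions so that the measure is symmetric, and the edge list is used as given (the text does not say how loops and multiple links are treated for this coefficient).

```python
def assortativity(edges, degrees):
    """Pearson correlation of the degrees at the two ends of each link.

    edges   -- list of (u, v) node pairs
    degrees -- degrees[v] is the degree of node v
    Returns NaN when all end degrees are equal (zero variance).
    """
    xs, ys = [], []
    for u, v in edges:
        # Count each link in both directions to symmetrize the sample.
        xs += [degrees[u], degrees[v]]
        ys += [degrees[v], degrees[u]]
    m = len(xs)
    mean = sum(xs) / m  # xs and ys share the same mean and variance
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    return cov / var if var > 0 else float("nan")
```

For a star graph, where one hub is linked to several nodes of degree 1, the function returns -1, the extreme disassortative case.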

To estimate the degree of clustering in a graph we used the following global $C_o$ and network-average $C_L$ clustering coefficients (see Newman [7]):

$$C_o = \frac{3 \times \text{number of graph triangles}}{\text{number of connected triples of nodes}},$$

$$C_L = \frac{1}{N} \sum_{i=1}^{N} C_i,$$

where

$$C_i = \frac{\text{number of triangles connected to node } i}{\text{number of triples centered on node } i}.$$

Here a "triple" means a single node connected by links to two others, and $C_i$ is the local clustering coefficient of node $i$ (Newman [7]). Since configuration graphs may have loops and multiple links, loops are not counted and multiple links are treated as a single link when calculating the clustering coefficients.
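The two clustering coefficients can be computed from an edge list as sketched below. The function name `clustering_coefficients` is hypothetical. Loops are dropped and multiple links are collapsed, as the text prescribes; setting the local coefficient to 0 for nodes with fewer than two neighbours is an assumption of this sketch that follows a common convention.

```python
from itertools import combinations


def clustering_coefficients(n, edges):
    """Return (global coefficient, network-average coefficient, local list)."""
    # Simple-graph adjacency: skip loops, sets collapse multiple links.
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            nbrs[u].add(v)
            nbrs[v].add(u)

    triangles_at = []  # triangles passing through each node
    triples_at = []    # triples centered on each node
    for i in range(n):
        k = len(nbrs[i])
        triples_at.append(k * (k - 1) // 2)
        # A triangle at i is a pair of neighbours of i that are linked.
        triangles_at.append(
            sum(1 for u, v in combinations(nbrs[i], 2) if v in nbrs[u])
        )

    # Each triangle is seen from its 3 corners, so the sum over nodes
    # already equals 3 * (number of triangles).
    total_triples = sum(triples_at)
    c_global = sum(triangles_at) / total_triples if total_triples else 0.0
    # Assumed convention: local coefficient is 0 when no triple exists.
    c_local = [t / p if p else 0.0 for t, p in zip(triangles_at, triples_at)]
    c_average = sum(c_local) / n
    return c_global, c_average, c_local
```

Enumerating neighbour pairs costs time quadratic in the degree, which is acceptable for a sketch but can be slow for the hubs that appear when τ is close to 1.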

The results were obtained by simulation. We considered configuration graphs with the number of nodes $100 < N < 10000$ in two cases of the node degree distribution: with fixed values $1.01 < \tau < 2.5$, and in a random environment, where $\tau$ was either uniformly distributed on a predefined interval $[a, b]$ or followed a truncated normal distribution on the same interval $(a, b)$, with the expectation on each interval set to the midpoint $(a + b)/2$ and the standard deviation $\sigma = (b - a)/6$ in accordance with the three-sigma rule. The intervals $(a, b)$ considered were $(1, 2)$, which corresponds to a well-known property of communication networks (Hofstad [2]); $(2, 3)$, connected with forest fire modeling (Leri and Pavlov [8]); and $(1, 3)$ as a generalization of the first two.

Based on the results obtained, we derived regression dependencies of the coefficients $A$, $C_o$ and $C_L$ on the graph size $N$ and the parameter $\tau$ of the node degree distribution in the first case, where $\tau$ was fixed. The general form of the resulting equations was as follows (here and in what follows $CF$ denotes any of the three coefficients considered):

$$CF = c \cdot N^{-d + h/\tau},$$

where the coefficient $c$ was negative in the relation for the assortativity coefficient $A$, which means that configuration graphs should be used for modeling only disassortative networks, and positive for the clustering coefficients $C_o$ and $C_L$. The coefficients $d$ and $h$ were always positive.

Determination coefficients for all models were greater than 0.95.
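The text does not say how the regressions were fitted. One possible approach, sketched here with hypothetical function names, assumes the model is read as CF = c · N^(-d + h/τ) and that all observed values of CF share one sign. Taking logarithms gives ln|CF| = ln|c| - d·ln N + h·(ln N)/τ, which is linear in ln|c|, d and h and can be fitted by ordinary least squares.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x


def fit_cf(samples):
    """Fit CF = c * N**(-d + h / tau) to a list of (N, tau, CF) triples.

    Assumes every CF is nonzero and has the same sign as the first one.
    Least squares is applied to ln|CF| via the normal equations.
    """
    sign = 1.0 if samples[0][2] > 0 else -1.0
    rows = [(1.0, -math.log(n), math.log(n) / tau) for n, tau, _ in samples]
    ys = [math.log(abs(cf)) for _, _, cf in samples]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    ln_c, d, h = solve3(A, b)
    return sign * math.exp(ln_c), d, h


def r_squared(samples, c, d, h):
    """Coefficient of determination of the fitted model on the raw scale."""
    obs = [cf for _, _, cf in samples]
    pred = [c * n ** (-d + h / tau) for n, tau, _ in samples]
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

Fitting on the logarithmic scale weights relative rather than absolute errors, so a nonlinear least-squares fit on the raw scale may give slightly different estimates.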

In the case of a random environment we also obtained regression relations of the coefficients $A$, $C_o$ and $C_L$ on the graph size $N$. The general form of these equations was as follows:

$$CF = p \cdot N^{-q},$$

where the coefficient $p$ was negative in the relation for the coefficient $A$ and positive for $C_o$ and $C_L$. The coefficient $q$ was positive in all cases, and $R^2 > 0.97$ for all models.

We believe that these results will be helpful in constructing models of specific networks in the form of configuration graphs with the power-law node degree distribution (1), either by choosing the best-fitting value of the parameter $\tau$ or by choosing the distribution of a random $\tau$, so as to match the observed values of the assortativity and clustering coefficients of these networks.

Moreover, we compared the values of $A$, $C_o$ and $C_L$ calculated for real-world networks and given by Newman [7] with the same coefficients for configuration graphs of the same size obtained from our equations. The results showed that configuration graphs with $1.02 < \tau < 1.17$ give the best fit for modeling the Internet at the AS level, while for modeling some social networks the value of $\tau$ must be greater than 2.

The study was supported by the Russian Foundation for Basic Research, grant 16-01-00005. The research was carried out using the equipment of the Core Facility of the Karelian Research Centre of the Russian Academy of Sciences.

References

1. C. Faloutsos, P. Faloutsos, M. Faloutsos, On power-law relationships of the Internet topology, Comp. Comm. Rev. 29 (1999) 251-262.

2. R. Hofstad, Random Graphs and Complex Networks, Cambridge Univ. Press, Cambridge, 2017, Vol. 1.

3. B. A. Bollobás, A probabilistic proof of an asymptotic formula of the number of labelled regular graphs, European J. Combin. 1:4 (1980) 311-316.

4. H. Reittu, I. Norros, On the power-law random graph model of massive data networks, Performance Evaluation 55:1-2 (2004) 3-23.

5. G. Bianconi, A.-L. Barabási, Bose-Einstein condensation in complex networks, Physical Review Letters 86:24 (2001) 5632-5635.

6. Yu. L. Pavlov, Conditional configuration graphs with discrete power-law distribution of vertex degrees, Sbornik: Mathematics 209:2 (2018) 258-275.

7. M. E. J. Newman, The structure and function of complex networks, SIAM Rev. 45:2 (2003) 167-256.

8. M. Leri, Yu. Pavlov, Forest fire models on configuration random graphs, Fundamenta Informaticae 145:3 (2016) 313-322.
