• Nenhum resultado encontrado

A PageRank Algorithm based on Asynchronous Gauss-Seidel Iterations

N/A
N/A
Protected

Academic year: 2021

Share "A PageRank Algorithm based on Asynchronous Gauss-Seidel Iterations"

Copied!
20
0
0

Texto

(1)

A PageRank Algorithm based on Asynchronous

Gauss-Seidel Iterations

D. Silvestre, J. Hespanha and C. Silvestre

2018 American Control Conference Milwaukee

(2)

Outline

1 Introduction 2 Problem Statement 3 Proposed Solution 4 Results 5 Simulation Results 6 Concluding Remarks

(3)

Motivation

Traditional PageRank - Compute relative importance of each webpage in the network.

Ranking of Researchers - PageRank has also been proposed to be an alternative to citation count. General Ranking - Since PageRank is equivalent to the relative time spent in a node for a random walk on the graph many more

applications are possible.

1

2 3

4

High-impact peers

(4)

Intuition behind PageRank

The ranks should translate to the relative time one would spent on that node for a random walk.

It should be a steady-state vector.

Dangling nodes must not appear observant and the random walk must be able to jump to different partitions.

Problem: x?∈ [0, 1]n, Pn

(5)

Intuition behind PageRank

The ranks should translate to the relative time one would spent on that node for a random walk.

It should be a steady-state vector.

Dangling nodes must not appear observant and the random walk must be able to jump to different partitions.

Problem: x? = Ax?, x? ∈ [0, 1]n, Pn

(6)

Intuition behind PageRank

The ranks should translate to the relative time one would spent on that node for a random walk.

It should be a steady-state vector.

Dangling nodes must not appear observant and the random walk must be able to jump to different partitions.

Problem: x? = M x?, x? ∈ [0, 1]n, n X i=1 x?i = 1 M = (1 − m)A + m nS, m = 0.15, S := 1n1 | n

(7)

Some related work

Defining distributed randomized versions of the power method:

C. Ravazzi et al. (2015). “Ergodic Randomized Algorithms

and Dynamics Over Networks”. In: IEEE Transactions on

Control of Network Systems

Use of sequential Gauss-Seidel with no projection:

A. Arasu et al. (2002). “PageRank computation and the

structure of the Web: experiments and algorithms”. In: In

The Eleventh International WWW Conference. ACM Press

Using stochastic approximation of the gradient:

J. Lei and H. F. Chen (2015). “Distributed Randomized PageRank Algorithm Based on Stochastic Approximation”.

(8)

Equivalent formulations for PageRank

Using the power method it becomes:

x(k + 1) = M x(k) = (1 − m)Ax(k) +m

n1n Intuitively, the Power Method searches for the eigenvector; It can be solved as an optimization problem;

The rank vector is also a eigenvector:

(In− (1 − m)A)x =

m n1n.

(9)

Problem Statement

Find an algorithm that can exploit the fact that subsets of pages are stored in the same processor;

The iteration should involve A and not M to keep sparsity; Solve the equation:

(In− (1 − m)A)x = m n1n (1) with constraints: x?∈ [0, 1]n, n X i=1 x?i = 1 (2)

PageRank Algorithm Problem

(10)

Power Method as the Jacobi Method

For a linear equation Ax = b partition A = D + R where D = diag(A) and R = A − diag(A);

The Jacobi Method is given by the iteration:

x(k + 1) = D−1(b − Rx(k)) (3)

If we take D = In, R = −(1 − m)A and b = mn1n the Jacobi

Method is equivalent to the Power Method;

Jacobi Method is typically used for solving diagonally dominant equations.

(11)

PageRank using Gauss-Seidel

Partitioning A = L + D + U , for lower and upper triangular matrices L and U and diagonal D;

The Gauss-Seidel Method becomes:

x(k + 1) = (L + D)−1(b − U x(k)) (4)

The method can be written to take advantage of updated values xi(k + 1) = 1 Aii  bi− i−1 X j=1 Aijxj(k + 1) − n X j=i+1 Aijxj(k)  . (5) The Gauss-Seidel is also an iterative algorithm for linear equations but corresponding to a different partitioning of the matrix.

(12)

Results

Theoretical result: ρ(Tgs) < ρ(Tp)

Main challenge: computed x? using the Gauss-Seidel might

not sum to one. It is simulated:

Projection on the n-simplex; Normalization;

(13)

Simulation Results (1/2)

Setup: Simple case of four pages.

Gauss-Seidel is faster than the Power Method.

Performance is affected with no projection.

Normalization and projection are similar. Asynchronous version has an intermediary performance.

1

2 3

(14)

Simulation Results (1/2)

Setup: Simple case of four pages.

Gauss-Seidel is faster than the Power Method.

Performance is affected with no projection.

Normalization and projection are similar. Asynchronous version has an intermediary performance. 0 5 10 15 20 25 30 35 40 45 50 iterations 10-18 10-16 10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 k (M ! In )x (k )k2 Power Method Gauss-Seidel

(15)

Simulation Results (1/2)

Setup: Simple case of four pages.

Gauss-Seidel is faster than the Power Method.

Performance is affected with no projection.

Normalization and projection are similar.

Asynchronous version has an intermediary performance. 0 5 10 15 20 25 30 35 40 45 50 iterations 10-18 10-16 10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 k (M ! In )x (k )k2 Simplex Projection Divide by the sum No projection

(16)

Simulation Results (1/2)

Setup: Simple case of four pages.

Gauss-Seidel is faster than the Power Method.

Performance is affected with no projection.

Normalization and projection are similar.

Asynchronous version has an intermediary performance. 0 5 10 15 20 25 30 35 40 45 50 iterations 10-18 10-16 10-14 10-12 10-10 10-8 10-6 10-4 10-2 100 k (M ! In )x (k )k2 Power Method Gauss-Seidel Asynchronous GS

(17)

Simulation Results (2/2)

Setup: 1000 Monte Carlo experiment for a 500-page network

generated by Barab´asi-Albert algorithm to simulate a power law

network as the World-Wide Web. In a more realistic scenario the projection is key to get very small errors;

Fast initial convergence with normalization;

The problem of having Gauss-Seidel going to zero can be avoided by normalization. 0 5 10 15 iterations 10-8 10-7 10-6 10-5 10-4 10-3 10-2 10-1 k (M ! In )x (k )k2 Power Method Gauss-Seidel Gauss-Seidel w/ projection Gauss-Seidel w/normalization

(18)

Concluding Remarks

Contributions:

We have shown that Gauss-Seidel is faster than the Power Method

Gauss-Seidel does not keep the sum of the state Two approaches:

projection - implies two communication rounds; normalization - no added complexity.

Shown how it can be made Asynchronous and Randomized with performance between the Power Method and the Synchronous Gauss-Seidel.

(19)
(20)

D. Silvestre, J. Hespanha and C. Silvestre

2018 American Control Conference Milwaukee

Referências

Documentos relacionados

Semiologia do aparelho Locomotor parte 1 29/09 Carolina Semiologia do aparelho Locomotor parte 2 06/10 Carolina Semiologia sistema hemolinfopoiético 13/10 Felipe Semiologia

E depois estávamos na dúvida sobre como é que fazíamos… Muitas das vezes nos momentos que nós tínhamos quando as crianças descansavam, ou noutros momentos em que

Dado alguns destes sujeitos também terem níveis altos de ansiedade (revelados pelos resultados obtidos no S.T.A.I.), torna-se difícil fazer interpretações que não caiam

Na fundamentação teórica são abordados os aspetos teóricos relativamente à vivência da prematuridade e da parentalidade pela família, as necessidades sentidas pelos pais durante a

O Endomarketing é uma ferramenta da área de administração que usa estratégias do marketing direcionadas ao público interno das empresas, é uma ferramenta que

Segundo Martins e Duarte (2010) de acordo com as mudanças nas políticas e práticas de ensino, o educador deve ensinar visando o aluno em questão de sua

Descrição: Este é o processo pelo qual o cliente pode escolher a forma de pagamento do pacote de viagem e suas reservas (hotéis, passagens, guia turístico, passeios

Organizar e articular o Setor Juventude Diocesano, quando houver mais de um grupo ou seguimento articulado no trabalho de evangelização da Juventude: Pastorais