srwr

Python Implementation for Signed Random Walk with Restart (SRWR)

This repository aims to implement Signed Random Walk with Restart (SRWR) model which was proposed by the following paper:

Personalized Ranking in Signed Networks using Signed Random Walk with Restart, Jinhong Jung, Woojung Jin, Lee Sael, and U Kang, IEEE International Conference on Data Mining (ICDM) 2016, Barcelona, Spain.

The following materials (slides, matlab codes, and datasets) can be useful.

You can download the slides for the paper here.
The MATLAB version used in the paper can be downloaded here.
The datasets used in the paper are available at here.

Installation

To install python packages used in this repository, type the followings:

cd srwr
pip install -r requirements.txt

Usage

We provide the following simple command line usage:

python -m srwr --input-path data/sample.tsv --output-path output/scores.tsv --output-type rd --seed 3942

This command will compute the SRWR score vector w.r.t. the seed node given by --seed in the input graph specified by --input-path, and write the result vector into the target file in --output-path.

Input and Output Format

Input Format

The default input of srwr represents the edge list of a graph with the following format (tab separated):

# format: source (int) \t target (int) \t sign (int)
1	2   +1
1	4   -1
2	3   +1
...

Although we assume the sign is +1 or -1, any real number can be used as a signed weight of an edge in the input.

Output Format

The default output of srwr contains an SRWR score vector (--output-type is among rp, rn, and rd) w.r.t. the given seed node as follows:

# format : an SRWR score of i-th node
0.1232e-3
0.0349e-4
...

When --output-type is both, then the output consists of rp and rn as follows:

# format : a positive SRWR score \t a negative SRWR score of i-th node
0.1232e-3   0.2322e-9
0.0349e-9   0.2939e-4
...

How to Use `srwr` in My Codes?

The following example shows how to import srwr and compute an SRWR query.

from srwr.srwr import SRWR
srwr = SRWR()
srwr.read_graph(input_path) # read graph from input_path
srwr.normalize() # do semi row-normalization
rd, rp, rn, residuals = srwr.query(seed, c, epsilon, beta, gamma, max_iters, handles_deadend) # compute an SRWR query w.r.t. seed

Note that rp is a positive SRWR score vector, rn is a negative SRWR score vector, and rd is called a relative trusthworthiness score vector (i.e., rd = rp - rn) indicating that the negative rd score of node u means seed considers node u as enemy while the positive one means seed considers node u as friend.

Arguments of `srwr`

We summarize the input arguments of srwr in the following table:

Arguments	Explanation	Default
`input-path`	Input path for a graph	`None`
`output-path`	Output path for storing a query result	`None`
`output-type`	Type of output vector {`rd`, `rp`, `rn`, `both`}	`None`
`seed`	A single seed (query) node id	`None`
`c`	Restart probablity	`0.15`
`beta`	Balance attenuation factor	`0.5`
`gamma`	Balance attenuation factor	`0.5`
`epsilon`	Error tolerance for power iteration	`1e-9`
`max-iters`	Maximum number of iterations for power iteration	`100`
`handles-deadend`	If true, handles the deadend issue	`True`
`verbose`	If true, it will show a progress bar over iterations	`True`

Note the followings:

For directed graphs, there might be deadend nodes whose outdegree is zero. In this case, a naive power iteration would incur leaking out scores. handles_deadend exists for such issue handling deadend nodes. With handles_deadend, you can guarantee that the sum of rp + rn vector is 1. Otherwise, the sum would less than 1 in directed graphs. The strategy srwr exploits is that whenever a random surfer visits a deadend node, go back to a seed node (or one of seed nodes), and restart.

How to test this repository?

We add a simple script to test codes in this repository.

python3 -m unittest test.srwr_test

This command will test the codes on two testcases located in data/testcases-*, and show the below messages if it passes the tests.

----------------------------------------------------------------------
Ran 2 tests in 0.186s

OK

Citation

@inproceedings{conf/icdm/JungJSK16,
  author    = {Jinhong Jung and
               Woojeong Jin and
               Lee Sael and
               U. Kang},
  title     = {Personalized Ranking in Signed Networks Using Signed Random Walk with
               Restart},
  booktitle = {{IEEE} 16th International Conference on Data Mining, {ICDM} 2016,
               December 12-15, 2016, Barcelona, Spain},
  pages     = {973--978},
  year      = {2016},
}

Todo

Add a notebook for SRWR tutorial
Add an option for global ranking of SRWR
Add a link for describing the SRWR's global ranking
Add a link for describing the deadend issue

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

srwr

Installation

Usage

Input and Output Format

Input Format

Output Format

How to Use `srwr` in My Codes?

Arguments of `srwr`

How to test this repository?

Citation

Todo

Files

README.md

Latest commit

History

README.md

File metadata and controls

srwr

Installation

Usage

Input and Output Format

Input Format

Output Format

How to Use srwr in My Codes?

Arguments of srwr

How to test this repository?

Citation

Todo

How to Use `srwr` in My Codes?

Arguments of `srwr`