Community Detection for Twitter follower network of 40 million users using mapreduce

derdewey/TwitterCommunityDetection

Scalable Community Detection using Label Propagation and Map Reduce

Author: Akshay Bhat 
Contact: akshaybhat [at] gmail.com

Please visit http://www.akshaybhat.com/LPMR for more information

Organization:
Folder				Description
lp				Code for community detection

pre				Code for preprocessing the edge list file

twitter			Code for automating everything for twitter dataset
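
To illustrate the general idea behind the lp folder, here is a minimal in-memory sketch of label propagation written as a map step and a reduce step. It is not the code from this repository: the function names, the tie-breaking rule (smallest label wins), the choice to treat the edge list as undirected, and the max_iters cutoff are all assumptions made for this example. Synchronous updates can oscillate on some graphs, which is why the loop is capped.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Turn (u, v) pairs into an undirected adjacency map (assumption: direction is ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def map_phase(adj, labels):
    """Map: every node emits its current label to each of its neighbours."""
    for node, neighbours in adj.items():
        for nb in neighbours:
            yield nb, labels[node]


def reduce_phase(messages, labels):
    """Reduce: each node adopts the most frequent label it received.

    Ties are broken by taking the smallest label (an assumption for this sketch).
    """
    received = defaultdict(Counter)
    for node, label in messages:
        received[node][label] += 1
    new_labels = dict(labels)
    for node, counts in received.items():
        best = max(counts.values())
        new_labels[node] = min(l for l, c in counts.items() if c == best)
    return new_labels


def label_propagation(edges, max_iters=20):
    """Repeat map/reduce rounds until labels stop changing or max_iters is reached."""
    adj = build_adjacency(edges)
    labels = {n: n for n in adj}  # every node starts in its own community
    for _ in range(max_iters):
        new_labels = reduce_phase(map_phase(adj, labels), labels)
        if new_labels == labels:
            break
        labels = new_labels
    return labels
```

For example, calling label_propagation on the edges of two disconnected triangles assigns one shared label to each triangle. On a real cluster each round would be a separate MapReduce job reading and writing files rather than Python dictionaries.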

Usage:
Note that this is experimental code, not a library, so it relies on several hacks.

You will need a working Hadoop installation. This code was tested on a cluster running Hadoop 0.19, so it should also work with later versions.
You will still need to change the path to the Hadoop streaming jar file to match your installation.
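
As a rough illustration of where the streaming jar path enters, the sketch below assembles a Hadoop Streaming command line. It is not taken from the scripts in this repository: the helper name build_streaming_command and all paths and script names in the usage note are hypothetical. The options -input, -output, -mapper, -reducer and -file are standard Hadoop Streaming options; -file ships a local script with the job in older releases.

```python
import shlex


def build_streaming_command(streaming_jar, input_path, output_path, mapper, reducer):
    """Return the argument list for one Hadoop Streaming job.

    streaming_jar is the value you would need to adapt to your own installation.
    """
    return [
        "hadoop", "jar", streaming_jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
        "-file", mapper,   # ship the mapper script with the job
        "-file", reducer,  # ship the reducer script with the job
    ]


def format_command(args):
    """Render the argument list as a shell-safe string for inspection."""
    return " ".join(shlex.quote(a) for a in args)
```

For instance, format_command(build_streaming_command("/path/to/hadoop-streaming.jar", "edges", "labels", "mapper.py", "reducer.py")) yields a single line that can be checked by eye before running it; every value in that call is a placeholder.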
 
Download twitter_rv.net from http://an.kaist.ac.kr/traces/WWW2010.html

Download numeric2users.tar.gz from the same website, extract it, rename the extracted file to Users.txt, and place it outside the LPMR folder. (Sorry if this sounds weird; will fix this soon.)

cd into the twitter directory and execute:
./run-twitter.sh twitter_rv.net

[you will most likely get errors due to hadoop not being ]

License: Research purpose only
