diff --git a/.gitignore b/.gitignore
index ea4e8eb..1ce9df9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,3 +41,4 @@ cred.R
 *.rds
 *.csv
 *.graphml
+.Rproj.user
diff --git a/README.md b/README.md
index 050d5bc..1e8dbbf 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 ## What does this package do?
 
-`vosonSML` is an R package that provides a suite of tools for collecting and constructing networks from social media data. It provides easy-to-use functions for collecting data across popular platforms (Instagram, Facebook, Twitter, and YouTube) and generating different types of networks for analysis.
+`vosonSML` is an R package that provides a suite of tools for collecting and constructing networks from social media data. It provides easy-to-use functions for collecting data across popular platforms (Twitter, YouTube, Reddit, Instagram and Facebook) and generating different types of networks for analysis.
 
 `vosonSML` is the `SocialMediaLab` package, renamed. We decided that `SocialMediaLab` was a bit too generic and also we wanted to indicate the connection to the [Virtual Observatory for the Study of Online Networks Lab](http://vosonlab.net), where this package was conceived and created.
 
@@ -18,7 +18,7 @@ If you are having trouble getting data from Facebook it is probably due to a num
 
 #### Twitter
 
-If you are getting the error `Error in check_twitter_oauth( )`, please find a [solution here](https://github.com/geoffjentry/twitteR/issues/90).
+If you are getting the error `Error in check_twitter_oauth()`, please find a [solution here](https://github.com/geoffjentry/twitteR/issues/90).
 
 #### Instagram
 
@@ -26,7 +26,7 @@ Instagram API access is severely limited if you do not have an authorised app, w
 
 ### Special thanks
 
-This package would not be possible without key packages by other authors in the R community, particularly: [igraph](https://github.com/igraph/rigraph), [Rfacebook](https://github.com/pablobarbera/Rfacebook), [instaR](https://github.com/pablobarbera/instaR), [twitteR](https://github.com/geoffjentry/twitteR), [data.table](https://github.com/Rdatatable/data.table), [tm](https://cran.r-project.org/web/packages/tm/index.html), and [httr](https://github.com/hadley/httr).
+This package would not be possible without key packages by other authors in the R community, particularly: [igraph](https://github.com/igraph/rigraph), [twitteR](https://github.com/geoffjentry/twitteR), [RedditExtractoR](https://github.com/ivan-rivera/RedditExtractoR), [instaR](https://github.com/pablobarbera/instaR), [Rfacebook](https://github.com/pablobarbera/Rfacebook), [data.table](https://github.com/Rdatatable/data.table), [tm](https://cran.r-project.org/web/packages/tm/index.html), and [httr](https://github.com/hadley/httr).
 
 ## Getting started
 
@@ -36,23 +36,37 @@ The [vosonSML page on the VOSON website](http://vosonlab.net/vosonSML) also has
 
 ## Using Magrittr's pipe interface
 
-The process of authentication, data collection and creating social network can be expressed with the 3 verb functions: *Authenticate*, *Collect* and *Create*. The following are some of the examples from the package documentation expressed with the pipe interface.
+The process of authentication, data collection and creating social network is now expressed with the 3 verb functions: *Authenticate*, *Collect* and *Create*. The following are some of the examples from the package documentation using the pipe interface.
 
-```{r}
-require(magrittr)
-# Authenticate with youtube, Collect data from youtube and Create an actor network
-Authenticate("youtube", apiKey= apiKey) %>% Collect(videoIDs = videoIDs) %>% Create("Actor")
-
-# Authenticate with facebook, archive the API credential, Collect data about Starwars Page and Create a bimodal network
-# You can use facebook, FaCebooK or Facebook in the datasource field
-Authenticate("Facebook", appID = appID, appSecret = appSecret) %>% SaveCredential("FBCredential.RDS") %>% Collect(pageName="StarWars", rangeFrom="2015-05-01",rangeTo="2015-06-03") %>% Create("Bimodal")
+```R
+library(magrittr)
+library(vosonSML)
 
-# Authenticate with Twitter, Collect data about #auspol and Create a semantic network
-Authenticate("twitter", apiKey=myapikey, apiSecret=myapisecret,accessToken=myaccesstoken, accessTokenSecret=myaccesstokensecret) %>% Collect(searchTerm="#auspol", numTweets=150) %>% Create("Semantic")
-
-# Create Instagram Ego Network
-myUsernames <-
-Authenticate("instagram", appID = myAppId, appSecret = myAppSecret) %>% Collect(ego = TRUE, username = c("adam_kinzinger","senatorreid")) %>% Create
+# Authenticate with youtube, Collect data from youtube and Create an actor network
+actorNetwork <- Authenticate("youtube", apiKey = myYoutubeAPIKey) %>%
+ Collect(videoIDs = myYoutubeVideoIds) %>% Create("actor")
+
+# Authenticate with twitter, Collect 150 tweets for the "#auspol" hashtag and Create a semantic network
+semanticNetwork <- Authenticate("twitter", apiKey = myTwitAPIKey, apiSecret = myTwitAPISecret,
+ accessToken = myTwitAccessToken,
+ accessTokenSecret = myTwitAccessTokenSecret) %>%
+ Collect(searchTerm = "#auspol", numTweets = 150) %>% Create("semantic")
+
+# Collect reddit threads and Create an actor network with comment text as edge attribute
+actorCommentsNetwork <- Authenticate("reddit") %>%
+ Collect(threadUrls = myThreadUrls, waitTime = 5) %>%
+ Create("actor", includeTextData = TRUE)
+
+# Authenticate with facebook, archive the
API credential, Collect data about the "Starwars" Page and
+# Create a bimodal network
+bimodalNetwork <- Authenticate("facebook", appID = myFacebookAppId, appSecret = myFacebookAppSecret) %>%
+ SaveCredential("FBCredential.RDS") %>%
+ Collect(pageName = "StarWars", rangeFrom = "2015-05-01", rangeTo = "2015-06-03") %>%
+ Create("bimodal")
+
+# Create an instagram ego network for provided users
+egoNetwork <- Authenticate("instagram", appID = myInstaAppId, appSecret = myInstaAppSecret) %>%
+ Collect(ego = TRUE, username = c("adam_kinzinger", "senatorreid")) %>% Create()
 ```
 
 ## Example networks
diff --git a/vosonSML/.lintr b/vosonSML/.lintr
deleted file mode 100644
index 0980422..0000000
--- a/vosonSML/.lintr
+++ /dev/null
@@ -1,3 +0,0 @@
-linters: with_defaults(#camel_case_linter = NULL,
- object_usage_linter = NULL,
- line_length_linter(120))
diff --git a/vosonSML/DESCRIPTION b/vosonSML/DESCRIPTION
index b4afb81..62acbc5 100644
--- a/vosonSML/DESCRIPTION
+++ b/vosonSML/DESCRIPTION
@@ -1,12 +1,18 @@
 Package: vosonSML
-Version: 0.23.5
-Date: 2018-11-01
+Version: 0.24.0
 Title: Tools for Collecting Social Media Data and Generating Networks for Analysis
-Description: A suite of tools for collecting and constructing networks from social media data. Provides easy-to-use functions for collecting data across popular platforms (Instagram, Facebook, Twitter, and YouTube) and generating different types of networks for analysis.
+Description: A suite of tools for collecting and constructing networks from social media data.
+ Provides easy-to-use functions for collecting data across popular platforms (Instagram,
+ Facebook, Twitter, YouTube and Reddit) and generating different types of networks for analysis.
 Type: Package
-Imports: tm, stringr, twitteR, RCurl, bitops, rjson, plyr, igraph, Rfacebook (>= 0.6.15), Hmisc, data.table, httpuv, instaR, methods, httr
-Suggests: magrittr, testthat
-Author: Timothy Graham & Robert Ackland with contributions from Bryan Gertzel & Chung-hong Chan
+Imports: tm, stringr, twitteR, RCurl, bitops, rjson, plyr, igraph (>= 1.2.2), Rfacebook (>= 0.6.15),
+ Hmisc, data.table, httpuv, instaR, methods, httr, RedditExtractoR (>= 2.1.2), magrittr,
+ dplyr (>= 0.7.8), rlang (>= 0.3.0.1)
+Depends: R (>= 3.2.0)
+Suggests: testthat
+Encoding: UTF-8
+Author: Timothy Graham, Robert Ackland, Chung-hong Chan, Bryan Gertzel
 Maintainer: Bryan Gertzel
 License: GPL (>= 2)
-RoxygenNote: 6.1.0
+RoxygenNote: 6.1.1
+NeedsCompilation: no
diff --git a/vosonSML/NAMESPACE b/vosonSML/NAMESPACE
index 4fd240f..831488c 100644
--- a/vosonSML/NAMESPACE
+++ b/vosonSML/NAMESPACE
@@ -32,14 +32,30 @@ import(methods)
 import(rjson)
 import(tm)
 importFrom(Hmisc,escapeRegex)
+importFrom(RedditExtractoR,reddit_content)
+importFrom(RedditExtractoR,user_network)
 importFrom(Rfacebook,fbOAuth)
 importFrom(Rfacebook,getPage)
 importFrom(Rfacebook,getPost)
 importFrom(Rfacebook,getUsers)
+importFrom(dplyr,coalesce)
+importFrom(dplyr,filter)
+importFrom(dplyr,group_by)
+importFrom(dplyr,left_join)
+importFrom(dplyr,mutate)
+importFrom(dplyr,rename)
+importFrom(dplyr,row_number)
+importFrom(dplyr,select)
+importFrom(dplyr,summarise)
+importFrom(dplyr,ungroup)
 importFrom(igraph,'V<-')
 importFrom(igraph,V)
 importFrom(igraph,delete.vertices)
+importFrom(igraph,delete_vertex_attr)
 importFrom(igraph,graph.data.frame)
+importFrom(igraph,graph_from_data_frame)
+importFrom(igraph,set.graph.attribute)
+importFrom(igraph,set_graph_attr)
 importFrom(igraph,simplify)
 importFrom(igraph,write.graph)
 importFrom(instaR,getComments)
@@ -49,7 +65,9 @@ importFrom(instaR,getLikes)
 importFrom(instaR,getUser)
 importFrom(instaR,instaOAuth)
 importFrom(instaR,searchInstagram)
+importFrom(magrittr,'%>%')
 importFrom(plyr,ldply)
+importFrom(rlang,'.data')
 importFrom(stats,'na.omit')
 importFrom(stringr,str_extract)
 importFrom(stringr,str_match_all)
diff --git a/vosonSML/R/Authenticate.R b/vosonSML/R/Authenticate.R
index b2b61d7..6d48a35 100644
--- a/vosonSML/R/Authenticate.R
+++ b/vosonSML/R/Authenticate.R
@@ -21,20 +21,21 @@
 #' \code{Collect}, \code{Create} workflow.
 #'
 #' @param socialmedia character string, social media API to authenticate,
-#' currently supports "facebook", "youtube", "twitter" and "instagram"
+#' currently supports "facebook", "youtube", "twitter", "instagram" and "reddit"
 #' @param ... additional parameters for authentication
 #' \code{facebook}: appID, appSecret
 #' \code{youtube}: apiKey
 #' \code{twitter}: apiKey, apiSecret, accessToken, accessTokenSecret
 #' \code{instagram}: appID, appSecret
-#'
+#' \code{reddit}: appName, appKey, appSecret, useTokenCache
+#'
 #' @return credential object with authentication information
-#'
+#'
 #' @note Currently, \code{Authenticate} with socialmedia = "twitter" generates
 #' oauth information to be used in the current active session only (i.e.
 #' "side-effect") and no authentication-related information will be stored in
 #' the returned \code{credential} object.
-#'
+#'
 #' @author Chung-hong Chan
 #' @seealso \code{\link{AuthenticateWithFacebookAPI}},
 #' \code{\link{AuthenticateWithInstagramAPI}},
@@ -64,17 +65,18 @@
 #' }
 #' @export
 Authenticate <- function(socialmedia, ...) {
- authenticator <- switch(tolower(socialmedia),
- facebook = facebookAuthenticator,
- youtube = youtubeAuthenticator,
- twitter = twitterAuthenticator,
- instagram = instagramAuthenticator,
- stop("Unknown socialmedia")
- )
- auth <- authenticator(...)
- credential <- list(socialmedia = tolower(socialmedia), auth = auth)
- class(credential) <- append(class(credential), "credential")
- return(credential)
+ authenticator <- switch(tolower(socialmedia),
+ facebook = facebookAuthenticator,
+ youtube = youtubeAuthenticator,
+ twitter = twitterAuthenticator,
+ instagram = instagramAuthenticator,
+ reddit = redditAuthenticator,
+ stop("Unknown socialmedia")
+ )
+ auth <- authenticator(...)
+ credential <- list(socialmedia = tolower(socialmedia), auth = auth)
+ class(credential) <- append(class(credential), "credential")
+ return(credential)
 }
 
 ### For the side effect of saving the credential into a file.
@@ -114,19 +116,19 @@ Authenticate <- function(socialmedia, ...) {
 #' }
 #' @export
 SaveCredential <- function(credential, filename = "credential.RDS") {
- if (credential$socialmedia == "twitter") {
- warning("Credential created for Twitter will not be saved.")
- } else {
- saveRDS(credential, filename)
- }
- return(credential)
+ if (credential$socialmedia == "twitter") {
+ warning("Credential created for Twitter will not be saved.")
+ } else {
+ saveRDS(credential, filename)
+ }
+ return(credential)
 }
 
 #' @rdname SaveCredential
 #' @export
 LoadCredential <- function(filename = "credential.RDS") {
- credential <- readRDS(filename)
- return(credential)
+ credential <- readRDS(filename)
+ return(credential)
 }
 
 ### *Authenticator functions should not be exported. It is just a bunch of helper functions to bridge the AuthenticateWith* functions with Authenticate(), but with datasource as the first argument and always return an auth object
@@ -134,7 +136,7 @@ LoadCredential <- function(filename = "credential.RDS") {
 ### As a convention, function starts with lower case shouldn't be exported.
 
 youtubeAuthenticator <- function(apiKey) {
- return(authenticateWithYoutubeAPI(apiKey))
+ return(authenticateWithYoutubeAPI(apiKey))
 }
 
 ### Currently, this Authenticator will return nothing, only for its side effect
@@ -142,14 +144,19 @@ youtubeAuthenticator <- function(apiKey) {
 ### i.e. cannot use SaveCredential and LoadCredential!
 
 twitterAuthenticator <- function(apiKey, apiSecret, accessToken, accessTokenSecret, createToken) {
- AuthenticateWithTwitterAPI(api_key = apiKey, api_secret = apiSecret, access_token = accessToken, access_token_secret = accessTokenSecret, createToken = createToken) # ah, only for its side effect, really bad design decision, twitteR!
- return(NULL)
+ AuthenticateWithTwitterAPI(api_key = apiKey, api_secret = apiSecret, access_token = accessToken, access_token_secret = accessTokenSecret, createToken = createToken) # ah, only for its side effect, really bad design decision, twitteR!
+ return(NULL)
 }
 
 facebookAuthenticator <- function(appID, appSecret, extendedPermissions = FALSE) {
- return(AuthenticateWithFacebookAPI(appID, appSecret, extended_permissions = extendedPermissions, useCachedToken = FALSE))
+ return(AuthenticateWithFacebookAPI(appID, appSecret, extended_permissions = extendedPermissions, useCachedToken = FALSE))
 }
 
 instagramAuthenticator <- function(appID, appSecret) {
- return(AuthenticateWithInstagramAPI(appID, appSecret))
+ return(AuthenticateWithInstagramAPI(appID, appSecret))
+}
+
+redditAuthenticator <- function(appName, appKey, appSecret, useTokenCache) {
+ # return(AuthenticateWithRedditAPI(appName, appKey, appSecret, useTokenCache))
+ return(NULL)
 }
diff --git a/vosonSML/R/AuthenticateWithRedditAPI.R b/vosonSML/R/AuthenticateWithRedditAPI.R
new file mode 100644
index 0000000..cbdfe71
--- /dev/null
+++ b/vosonSML/R/AuthenticateWithRedditAPI.R
@@ -0,0 +1,50 @@
+#' Reddit API authentication.
+#'
+#' OAuth2 based authentication with the Reddit API that returns an authentication token.
+#'
+#' The httr package has a known OAuth2 issue with its parameter "use_basic_auth", The default value is set to FALSE
+#' and is missing parameter pass through meaning it can not be set to TRUE as required by reddit oauth2 authentication.
+#' The point patch devtools::install_github("r-lib/httr#485") fixes this issue.
+#' Further information: https://github.com/r-lib/httr/issues/482
+#'
+#' Reddit oauth tokens are only valid for one hour and using cached token will subsequently produce 401 errors.
+#'
+#' @param appName character string containing the reddit app name associated with the API key.
+#' @param appKey character string containing the app key.
+#' @param appSecret character string containing the app secret.
+#' @param useTokenCache logical. Use cached authentication token if found.
+#'
+#' @return a reddit authentication token
+#'
+AuthenticateWithRedditAPI <- function(appName, appKey, appSecret, useTokenCache) {
+
+ if (missing(appName)) {
+ appName <- "reddit"
+ }
+
+ if (missing(appKey) | missing(appSecret)) {
+ cat("Error.
One or more API credentials are missing.\nPlease specify these.\n")
+ return()
+ }
+
+ if (missing(useTokenCache)) {
+ useTokenCache <- FALSE
+ }
+
+ # sets up oauth2 for reddit
+ reddit_endpoint <- httr::oauth_endpoint(
+ authorize = "https://www.reddit.com/api/v1/authorize",
+ access = "https://www.reddit.com/api/v1/access_token"
+ )
+
+ reddit_app <- httr::oauth_app(appName, key = appKey, secret = appSecret)
+
+ reddit_token <- httr::oauth2.0_token(reddit_endpoint, reddit_app,
+ user_params = list(duration = "permanent"),
+ scope = c("read"),
+ use_basic_auth = TRUE,
+ config_init = user_agent("httr oauth"),
+ cache = useTokenCache)
+
+ return(reddit_token)
+}
diff --git a/vosonSML/R/AuthenticateWithYoutubeAPI.R b/vosonSML/R/AuthenticateWithYoutubeAPI.R
new file mode 100644
index 0000000..af818f1
--- /dev/null
+++ b/vosonSML/R/AuthenticateWithYoutubeAPI.R
@@ -0,0 +1,18 @@
+#' YouTube API Authentication
+#'
+#' OAuth based authentication with the Google API.
+#'
+#' In order to collect data from YouTube, the user must first authenticate with Google's Application Programming
+#' Interface (API). Users can obtain a Google Developer API key at: https://console.developers.google.com.
+#'
+#' @param apiKeyYoutube character string specifying your Google Developer API key.
+#'
+#' @return This is called for its side effect.
+#'
+#' @note In the future this function will enable users to save the API key in working directory, and the function will
+#' automatically look for a locally stored key whenever it is called without apiKeyYoutube argument.
+#'
+#' @noRd
+authenticateWithYoutubeAPI <- function(apiKeyYoutube) {
+ return(apiKeyYoutube)
+}
diff --git a/vosonSML/R/Collect.R b/vosonSML/R/Collect.R
index a036061..ea13fa3 100644
--- a/vosonSML/R/Collect.R
+++ b/vosonSML/R/Collect.R
@@ -17,18 +17,20 @@
 #' \code{facebook}: pageName, rangeFrom, rangeTo, verbose, n, writeToFile, dynamic
 #' \code{youtube}: videoIDs, verbose, writeToFile, maxComments
 #' \code{twitter}: searchTerm, numTweets, verbose, writeToFile, language
-#' \code{instagram}: credential, tag, n, lat, lng, distance, folder, mindate, maxdate, verbose, sleep, writeToFile,
+#' \code{instagram}: credential, tag, n, lat, lng, distance, folder, mindate, maxdate, verbose, sleep, writeToFile,
 #' waitForRateLimit
-#'
+#' \code{reddit}: threadUrls, waitTime, writeToFile
+#'
 #' \code{instagram} with \code{ego} = TRUE: username, userid, verbose,
 #' degreeEgoNet, waitForRateLimit, getFollows
 #' @return A data.frame object of class \code{dataSource.*} that can be used
 #' with \code{Create}.
 #' @author Chung-hong Chan
-#' @seealso \code{CollectDataFromFacebook},
-#' \code{CollectDataFromInstagram},
-#' \code{CollectDatFromTwitter},
-#' \code{CollectEgoInstagram}
+#' @seealso \code{CollectDataFacebook},
+#' \code{CollectDataInstagram},
+#' \code{CollectDataTwitter},
+#' \code{CollectEgoInstagram},
+#' \code{CollectDataReddit},
 #' @examples
 #'
 #' \dontrun{
@@ -50,6 +52,7 @@
 #' Authenticate("youtube",
 #' apiKey = my_apiKeyYoutube) %>% Collect(videoIDs = videoIDs) %>% Create('actor')
 #' }
+#'
 #' @export
 Collect <- function(credential, ego = FALSE, ...) {
 if (ego) {
@@ -63,6 +66,7 @@ Collect <- function(credential, ego = FALSE, ...)
{
 youtube = youtubeCollector,
 twitter = twitterCollector,
 instagram = instagramCollector,
+ reddit = redditCollector,
 stop("Unsupported socialmedia")
 )
 }
@@ -92,3 +96,7 @@ instagramCollector <- function(credential, tag, n, lat, lng, distance, folder, m
 instagramEgo <- function(credential, username, userid, verbose, degreeEgoNet, waitForRateLimit, getFollows) {
 return(CollectEgoInstagram(username, userid, verbose, degreeEgoNet, waitForRateLimit, getFollows, credential))
 }
+
+redditCollector <- function(credential, threadUrls, waitTime, writeToFile) {
+ return(CollectDataReddit(threadUrls, waitTime, writeToFile))
+}
diff --git a/vosonSML/R/CollectDataReddit.R b/vosonSML/R/CollectDataReddit.R
new file mode 100644
index 0000000..40278e9
--- /dev/null
+++ b/vosonSML/R/CollectDataReddit.R
@@ -0,0 +1,48 @@
+#' Collect reddit thread data
+#'
+#' Uses RedditExtractoR::reddit_content to collect user and comment data for thread urls.
+#'
+#' @param threadUrls character string vector. Reddit thread url's to collect data from.
+#' @param waitTime numeric integer. Time in seconds to wait in-between url collection requests.
+#' @param writeToFile logical. If the data should be written to file.
+#'
+#' @note The reddit API endpoint used for thread collection has maximum limit of 500 comments per thread url.
+#'
+#' @return A data frame object of class dataSource.reddit that can be used for creating unimodal
+#' networks (CreateActorNetwork).
+#'
+CollectDataReddit <- function(threadUrls, waitTime = 5, writeToFile) {
+
+ if (missing(threadUrls)) {
+ cat("Error. Argument `threadUrls` is missing.\nPlease provide a reddit thread url.\n")
+ return(NA)
+ }
+
+ if (!is.vector(threadUrls) || length(threadUrls) < 1) {
+ cat("Error.
Please provide a vector of one or more reddit thread urls.\n")
+ return(NA)
+ }
+
+ if (missing(writeToFile)) {
+ writeToFile <- FALSE
+ }
+
+ cat("\nCollecting thread data for reddit urls:\n")
+
+ # make the get request for the reddit thread url
+ threads_df <- RedditExtractoR::reddit_content(threadUrls, waitTime)
+
+ # add thread id to df, extracted from url
+ threads_df$thread_id <- gsub("^(.*)?/comments/([0-9A-Za-z]{6})?/.*?(/)?$", "\\2",
+ threads_df$URL, ignore.case = TRUE, perl = TRUE)
+
+ if (isTrueValue(writeToFile)) {
+ writeOutputFile(threads_df, "csv", "RedditData")
+ }
+
+ class(threads_df) <- append(class(threads_df), c("dataSource", "reddit"))
+
+ cat("\nDone!\n")
+
+ return(threads_df)
+}
\ No newline at end of file
diff --git a/vosonSML/R/Create.R b/vosonSML/R/Create.R
index a72c4f5..8b00e1e 100644
--- a/vosonSML/R/Create.R
+++ b/vosonSML/R/Create.R
@@ -1,60 +1,60 @@
 #' Create networks from social media data
 #'
-#' This function creates networks from social media data (i.e. from data frames
-#' of class \code{dataSource}. \code{Create} is the final step of the
-#' \code{Authenticate}, \code{Collect}, \code{Create} workflow. This function is
-#' a convenient UI wrapper to the core Create*Network family of functions.
-#'
-#' Note: when creating Twitter networks, the user information
-#' can be collected separately using the \code{\link{PopulateUserInfo}} function
-#' and stored into the network as vertex attributes (this involves additional
+#' This function creates networks from social media data (i.e. from data frames of class \code{dataSource}.
+#' \code{Create} is the final step of the \code{Authenticate}, \code{Collect}, \code{Create} workflow. This function is
+#' a convenient UI wrapper to the core create*Network family of functions.
+#'
+#' Note: when creating Twitter networks, the user information can be collected separately using the
+#' \code{\link{PopulateUserInfo}} function and stored into the network as vertex attributes (this involves additional
 #' calls to the Twitter API).
 #'
 #' @param dataSource a data frame of class \code{dataSource}
-#' @param type character, type of network to be created, currently supports
-#' "actor", "bimodal", "dynamic", "semantic" and "ego"
-#' @param ... additional parameters for Create*Network functions
-#' @return An igraph graph object
+#' @param type character, type of network to be created, currently supports "actor", "bimodal", "dynamic", "semantic"
+#' and "ego"
+#' @param ... additional parameters for create*Network functions
+#' @return an igraph graph object
+#'
 #' @author Chung-hong Chan
-#' @seealso \code{\link{CreateActorNetwork}},
-#' \code{\link{CreateBimodalNetwork}}, \code{\link{CreateDynamicNetwork}},
-#' \code{\link{CreateSemanticNetwork}}, \code{\link{CreateEgoNetworkFromData}}
+#'
 #' @examples
-#'
 #' \dontrun{
 #' require(magrittr)
-#' ## Instagram ego network example
-#' myAppID <- "123456789098765"
-#' myAppSecret <- "abc123abc123abc123abc123abc123ab"
-#' myUsernames <- c("senjohnmccain","obama")
-#'
-#' Authenticate("instagram",
-#' appID = myAappId,
-#' appSecret = myAppSecret) %>% Collect(ego = TRUE,
-#' username = myUsernames) %>% Create
-#'
-#' ## YouTube actor network example
-#' my_apiKeyYoutube <- "314159265358979qwerty"
-#' videoIDs <- c("W2GZFeYGU3s","mL27TAJGlWc")
-#'
-#' Authenticate("youtube",
-#' apiKey = my_apiKeyYoutube) %>% Collect(videoIDs = videoIDs) %>% Create('actor')
+#'
+#' ## instagram ego network example
+#'
+#' my_app_id <- "123456789098765"
+#' my_app_secret <- "abc123abc123abc123abc123abc123ab"
+#' my_usernames <- c("senjohnmccain", "obama")
+#'
+#' my_ego_network <- Authenticate("instagram", appID = my_app_id, appSecret = my_app_secret) %>%
+#' Collect(ego = TRUE, username = my_usernames) %>% Create
+#'
+#'
## youtube actor network example
+#'
+#' my_api_key <- "314159265358979qwerty"
+#' my_video_ids <- c("W2GZFeYGU3s","mL27TAJGlWc")
+#'
+#' my_actor_network <- Authenticate("youtube", apiKey = my_api_key) %>%
+#' Collect(videoIDs = my_video_ids) %>% Create('actor')
+#'
 #' }
 #' @export
-Create <- function(dataSource, type = "Actor", ...) {
- if (inherits(dataSource, "ego")) {
- return(CreateEgoNetworkFromData(dataSource)) ## you cannot create actor out of ego data
- }
- creator <- switch(tolower(type),
- actor = CreateActorNetwork,
- bimodal = CreateBimodalNetwork,
- dynamic = CreateDynamicNetwork,
- semantic = CreateSemanticNetwork,
- ego = CreateEgoNetworkFromData,
- stop("Unknown Type")
- )
- # return()
- networkToReturn <- creator(dataSource, ...)
- class(networkToReturn) <- append(class(networkToReturn),c("vosonSML"))
- return(networkToReturn)
+Create <- function(dataSource, type = "actor", ...) {
+
+ if (inherits(dataSource, "ego")) {
+ return(CreateEgoNetworkFromData(dataSource)) ## you cannot create actor out of ego data
+ }
+
+ creator <- switch(tolower(type),
+ actor = CreateActorNetwork,
+ bimodal = CreateBimodalNetwork,
+ dynamic = CreateDynamicNetwork,
+ semantic = CreateSemanticNetwork,
+ ego = CreateEgoNetworkFromData,
+ stop("Unknown Type"))
+
+ network_to_return <- creator(dataSource, ...)
+ class(network_to_return) <- append(class(network_to_return), c("vosonSML"))
+
+ return(network_to_return)
 }
diff --git a/vosonSML/R/CreateActorNetwork.R b/vosonSML/R/CreateActorNetwork.R
index 437d34e..daf6e7b 100644
--- a/vosonSML/R/CreateActorNetwork.R
+++ b/vosonSML/R/CreateActorNetwork.R
@@ -1,91 +1,46 @@
-#' Note: this function is DEPRECATED and will be removed in a future release.
-#' Please use the \code{Create} function
-#'
-#' Create 'actor' networks from social media data
-#'
-#' This function creates a unimodal 'actor' network from social media data
-#' (i.e.
from data frames of class \code{dataSource}, or for Twitter data it is
-#' also possible to provide a *list* of data frames). In this actor network,
-#' edges represent relationships between actors of the same type (e.g.
-#' interactions between Twitter users). For example, with Twitter data an
-#' interaction is defined as a 'mention' or 'reply' or 'retweet' from user i to
-#' user j, given 'tweet' m. With YouTube comments, an interaction is defined as
-#' a 'reply' or 'mention' from user i to user j, given 'comment' m.
-#'
-#' This function creates a (weighted and directed) unimodal 'actor' network
-#' from a data frame of class \code{dataSource} (which are created using the
-#' `CollectData` family of functions in the vosonSML package), or a
-#' *list* of Twitter data frames collected using \code{CollectDataTwitter}
-#' function.
-#'
-#' The resulting network is an igraph graph object. This graph object is
-#' unimodal because edges represent relationships between vertices of the same
-#' type (read: 'actors'), such as replies/retweets/mentions between Twitter
-#' users. Edges are directed and weighted (e.g. if user i has replied n times
-#' to user j, then the weight of this directed edge equals n).
-#'
-#' @param x a data frame of class \code{dataSource}. For Twitter data, it is
-#' also possible to provide a *list* of data frames (i.e. data frames that
-#' inherit class \code{dataSource} and \code{twitter}). Only lists of Twitter
-#' data frames are supported at this time. If a list of data frames is
-#' provided, then the function binds these row-wise and computes over the
-#' entire data set.
-#' @param writeToFile logical. If \code{TRUE} then the network is saved to file
-#' in current working directory (GRAPHML format), with filename denoting the
-#' current date/time and the type of network.
-#' @return An igraph graph object, with directed and weighted edges.
-#' @note Not all data sources in vosonSML can be used for creating actor
-#' networks.
-#'
-#' Currently supported data sources are:
-#'
-#' - YouTube - Twitter
-#'
-#' Other data sources (e.g. Facebook) will be implemented in the future. The
-#' user is notified if they try to create actor networks for incompatible data
-#' sources.
-#'
-#' For Twitter data, actor networks can be created from multiple data frames
-#' (i.e. datasets collected individually using CollectDataTwitter). Simply
-#' create a list of the data frames that you wish to create a network from. For
-#' example, \code{myList <- list(myTwitterData1, myTwitterData2,
-#' myTwitterData3)}.
-#' @author Timothy Graham & Robert Ackland
-#'
-#' @seealso See \code{CollectDataYoutube} and \code{CollectDataTwitter} to
-#' collect data sources for creating actor networks in vosonSML.
-#' @keywords SNA unimodal network igraph social media
-#' @examples
-#'
-#' \dontrun{
-#' ## This example shows how to collect YouTube comments data and create an actor network
-#'
-#' # Use your own Google Developer API Key here:
-#' myApiKey <- "1234567890"
-#'
-#' # Authenticate with the Google API
-#' apiKeyYoutube <- AuthenticateWithYoutubeAPI(apiKeyYoutube=myApiKey)
-#'
-#' # Generate a vector of YouTube video IDs to collect data from
-#' # (or use the function `GetYoutubeVideoIDs` to automatically
-#' # generate from a plain text file of video URLs)
-#' videoIDs <- c("W2GZFeYGU3s","mL27TAJGlWc")
-#'
-#' # Collect the data using function `CollectDataYoutube`
-#' myYoutubeData <- CollectDataYoutube(videoIDs,apiKeyYoutube,writeToFile=FALSE)
-#'
-#' # Create an 'actor' network using the function `CreateActorNetwork`
-#' g_actor_youtube <- CreateActorNetwork(myYoutubeData)
-#'
-#' # Description of actor network
-#' g_actor_youtube
-#' }
-#'
-CreateActorNetwork <-
-function(x,writeToFile)
- {
- if (missing(writeToFile)) {
- writeToFile <- FALSE # default = not write to file
- }
- UseMethod("CreateActorNetwork",x)
- }
+#' Create actor networks from social media data
+#'
+#' This function creates a unimodal
'actor' network from social media data (i.e. from data frames of class dataSource,
+#' or for Twitter data it is also possible to provide a list of data frames). In this actor network, edges represent
+#' relationships between actors of the same type (e.g. interactions between Twitter users). For example, with Twitter
+#' data an interaction is defined as a 'mention' or 'reply' or 'retweet' from user i to user j, given 'tweet' m. With
+#' YouTube comments, an interaction is defined as a 'reply' or 'mention' from user i to user j, given 'comment' m.
+#'
+#' This function creates a (weighted and directed) unimodal 'actor' network from a data frame of class dataSource
+#' (which are created using the CollectData family of functions in the vosonSML package), or a list of Twitter data
+#' frames collected using CollectDataTwitter function.
+#'
+#' The resulting network is an igraph graph object. This graph object is unimodal because edges represent relationships
+#' between vertices of the same type (read: actors), such as replies/retweets/mentions between Twitter users. Edges are
+#' directed and weighted (e.g. if user i has replied n times to user j, then the weight of this directed edge equals n).
+#'
+#' @param x a data frame of class dataSource. For Twitter data, it is also possible to provide a list of data frames
+#' (i.e. data frames that inherit class dataSource and twitter). Only lists of Twitter data frames are supported at
+#' this time. If a list of data frames is provided, then the function binds these row-wise and computes over the entire
+#' data set.
+#' @param writeToFile logical. If TRUE then the network is saved to file in current working directory (GRAPHML format),
+#' with filename denoting the current date/time and the type of network
+#' @param ...
additional parameters to pass to the network creation method +#' @return an igraph graph object, with directed and weighted edges +#' +#' @note Not all data sources in vosonSML can be used for creating actor networks. +#' Currently supported data sources are: YouTube, Twitter +#' +#' Other data sources (e.g. Facebook) will be implemented in the future. The user is notified if they try to create +#' actor networks for incompatible data sources. +#' +#' For Twitter data, actor networks can be created from multiple data frames (i.e. datasets collected individually +#' using CollectDataTwitter). Simply create a list of the data frames that you wish to create a network from. For +#' example: my_list <- list(my_twitter_data_1, my_twitter_data_2, my_twitter_data_3) +#' +#' @author Timothy Graham , Robert Ackland +#' +#' @noRd +CreateActorNetwork <- function(x, writeToFile, ...) { + + if (missing(writeToFile)) { + writeToFile <- FALSE + } + + UseMethod("CreateActorNetwork", x) +} diff --git a/vosonSML/R/CreateActorNetwork.reddit.R b/vosonSML/R/CreateActorNetwork.reddit.R new file mode 100644 index 0000000..aac8d77 --- /dev/null +++ b/vosonSML/R/CreateActorNetwork.reddit.R @@ -0,0 +1,136 @@ +#' Creates a reddit actor network from collected threads +#' +#' Uses RedditExtractoR::user_network to create an igraph directed actor network with comment ids as edge attribute. +#' +#' @param x a dataframe as vosonSML class object containing collected social network data +#' @param weightEdges logical. Combines and weights directed edges. Can't be used with includeTextData. +#' @param includeTextData logical. If the igraph network edges should include the comment text as attribute. +#' @param cleanText logical. If non-alphanumeric, non-punctuation, and non-space characters should be removed from the +#' included text attribute data. Default is TRUE +#' @param writeToFile logical. If the igraph network graph should be written to file. 
+#' +#' @note Can create three types of network graphs: +#' * Directed graph with subreddit, thread_ids and comment ids as edge attributes - default option +#' * Directed graph with weighted edges (without comment ids) - weightEdges = TRUE +#' * Directed graph with comment text included as edge attribute - includeTextData = TRUE +#' +#' Comment ids as edge attributes in graphs refer to the Collect dataframe comment id not reddits comment id +#' If "Forbidden control character 0x19 found in igraph_i_xml_escape, Invalid value" then set cleanText = TRUE +#' +#' @return an igraph object of the actor network +#' +CreateActorNetwork.reddit <- function(x, weightEdges, includeTextData, cleanText, writeToFile) { + + if (missing(writeToFile) || writeToFile != TRUE) { + writeToFile <- FALSE + } + + if (missing(weightEdges) || weightEdges != TRUE) { + weightEdges <- FALSE + } + + # if weightEdges then includeTextData set FALSE + if (missing(includeTextData) || includeTextData != TRUE || weightEdges == TRUE) { + includeTextData <- FALSE + } + + # default cleanText = TRUE as reddit comments often contain forbidden XML control characters + if (missing(cleanText) || cleanText != FALSE) { + cleanText <- TRUE + } else { + cleanText <- FALSE + } + + if (includeTextData == FALSE) { + cleanText <- FALSE + } + + # append string to file name to indicate different graph types, only used if writeToFile = TRUE + appendToName <- "" + + thread_df <- x + + # actor_network <- RedditExtractoR::user_network(thread_df, include_author = TRUE, agg = FALSE) + + # modified from RedditExtractoR::user_network to include the df comment id, subreddit and thread id as edge + # attributes to support post-processing. author of sender_receiver_df, node_df, and edge_df @ivan-rivera. 
+ include_author <- TRUE + sender_receiver_df <- + thread_df %>% + dplyr::select(.data$id, .data$subreddit, .data$thread_id, .data$structure, .data$user, .data$author, + .data$comment) %>% + dplyr::rename("comment_id" = .data$id, "sender" = .data$user) %>% + dplyr::mutate(response_to = ifelse(!grepl("_", .data$structure), "", gsub("_\\d+$", "", .data$structure))) %>% + dplyr::left_join(thread_df %>% + dplyr::select(.data$structure, .data$user) %>% + dplyr::rename("response_to" = .data$structure, "receiver" = .data$user), + by = "response_to") %>% + dplyr::mutate(receiver = dplyr::coalesce(.data$receiver, ifelse(include_author, .data$author, ""))) %>% + dplyr::filter(.data$sender != .data$receiver, + !(.data$sender %in% c("[deleted]", "")), + !(.data$receiver %in% c("[deleted]", ""))) %>% + dplyr::mutate(count = 1) %>% + dplyr::select(.data$sender, .data$receiver, .data$comment_id, .data$subreddit, .data$thread_id, .data$comment, + .data$count) + + node_df <- data.frame(user = with(sender_receiver_df, {unique(c(sender, receiver))}), + stringsAsFactors = FALSE) %>% + dplyr::mutate(id = as.integer(dplyr::row_number() - 1)) %>% + dplyr::select(.data$id, .data$user) + + edge_df <- sender_receiver_df %>% + dplyr::left_join(node_df %>% + dplyr::rename("sender" = .data$user, "from" = .data$id), + by = "sender") %>% + dplyr::left_join(node_df %>% + dplyr::rename("receiver" = .data$user, "to" = .data$id), + by = "receiver") %>% + dplyr::rename("weight" = .data$count, "title" = .data$comment) %>% + dplyr::select(.data$from, .data$to, .data$weight, .data$comment_id, .data$subreddit, .data$thread_id, + .data$title) + + # edge_df <- actor_network$edges + + # weight edges network graph + if (weightEdges) { + edge_df$comment_id <- edge_df$title <- NULL + edge_df <- edge_df %>% dplyr::group_by(.data$from, .data$to) %>% + dplyr::summarise(weight = sum(.data$weight)) %>% dplyr::ungroup() + + appendToName <- "Weighted" + # include comment text as edge attribute network graph + } else 
if (includeTextData) { + edge_df$weight <- NULL + + # rename the edge attribute containing the thread comment + edge_df <- edge_df %>% dplyr::rename("vosonTxt_comment" = .data$title) + + # problem control characters encountered in reddit text + # edge_df$vosonTxt_comment <- gsub("[\x01\x05\x18\x19\x1C]", "", edge_df$vosonTxt_comment, perl = TRUE) + appendToName <- "Txt" + + if (cleanText) { + edge_df$vosonTxt_comment <- gsub("[^[:punct:]^[:alnum:]^\\s]", "", edge_df$vosonTxt_comment, perl = TRUE) + appendToName <- "CleanTxt" + } + } else { + edge_df$title <- edge_df$weight <- NULL + } + + g <- graph_from_data_frame(d = edge_df, vertices = node_df, directed = TRUE) + + # set name to actors user name + V(g)$name <- V(g)$user + g <- delete_vertex_attr(g, "user") + g <- set_graph_attr(g, "type", "reddit") + + if (writeToFile) { + name <- paste0("RedditActorNetwork", appendToName) + writeOutputFile(g, "graphml", name) + } + + cat("\nDone!\n") + flush.console() + + return(g) +} diff --git a/vosonSML/R/vosonSML-package.R b/vosonSML/R/vosonSML-package.R index 9cd44e2..0fd95d3 100644 --- a/vosonSML/R/vosonSML-package.R +++ b/vosonSML/R/vosonSML-package.R @@ -1,27 +1,25 @@ #' Collection and network analysis of social media data #' #' The goal of the vosonSML package is to provide a suite of easy-to-use tools for collecting data from social media -#' sources (Instagram, Facebook, Twitter, and Youtube) and generating different types of networks suited to +#' sources (Instagram, Facebook, Twitter, Youtube, and Reddit) and generating different types of networks suited to #' Social Network Analysis (SNA) and text analytics. It offers tools to create unimodal, multimodal, semantic, and -#' dynamic networks. It draws on excellent packages such as \pkg{twitteR}, \pkg{instaR}, \pkg{Rfacebook}, and -#' \pkg{igraph} in order to provide an integrated 'work flow' for collecting different types of social media data and -#' creating different types of networks out of these data. 
Creating networks from social media data is often -#' non-trivial and time consuming. This package simplifies such tasks so users can focus on analysis. +#' dynamic networks. It draws on excellent packages such as \pkg{twitteR}, \pkg{instaR}, \pkg{Rfacebook}, +#' \pkg{RedditExtractoR} and \pkg{igraph} in order to provide an integrated 'work flow' for collecting different types +#' of social media data and creating different types of networks out of these data. Creating networks from social media +#' data is often non-trivial and time consuming. This package simplifies such tasks so users can focus on analysis. #' #' vosonSML uses a straightforward S3 class system. Data collected with this package produces \code{data.table} objects #' (extension of class \code{data.frame}), which are assigned the class \code{dataSource}. Additionally, -#' \code{dataSource} objects are assigned a class identifying the source of data, e.g. \code{facebook} or -#' \code{youtube}. In this way, \code{dataSource} objects are fast, easy to work with, and can be used as input to -#' easily construct different types of networks. For example, the function \code{\link{Collect}} can be used to collect -#' Twitter data, which is then 'piped' to the \code{\link{Create}} function, resulting in a network (an igraph object) -#' that is ready for analysis. +#' \code{dataSource} objects are assigned a class identifying the source of data, e.g. \code{facebook} or \code{youtube} +#' . In this way, \code{dataSource} objects are fast, easy to work with, and can be used as input to easily construct +#' different types of networks. For example, the function \code{\link{Collect}} can be used to collect Twitter data, +#' which is then 'piped' to the \code{\link{Create}} function, resulting in a network (an igraph object) that is ready +#' for analysis. 
#' #' @name vosonSML-package #' @aliases vosonSML-package vosonSML #' @docType package -#' @author Timothy Graham & Robert Ackland, with contribution Chung-hong Chan & Bryan Gertzel -#' -#' Maintainer: Bryan Gertzel +#' @author Created by Timothy Graham and Robert Ackland, with major contributions by Chung-hong Chan and Bryan Gertzel. #' @import tm #' @import RCurl #' @import bitops @@ -31,7 +29,8 @@ #' @import methods #' @import httr #' @importFrom Hmisc escapeRegex -#' @importFrom igraph delete.vertices graph.data.frame simplify write.graph V 'V<-' +#' @importFrom igraph delete.vertices graph.data.frame simplify write.graph V 'V<-' set.graph.attribute +#' graph_from_data_frame delete_vertex_attr set_graph_attr #' @importFrom Rfacebook fbOAuth getPost getPage getUsers #' @importFrom instaR getComments getLikes instaOAuth searchInstagram getUser getFollowers getFollows #' @importFrom plyr ldply @@ -39,4 +38,8 @@ #' @importFrom stringr str_extract str_replace_all str_match_all #' @importFrom stats 'na.omit' #' @importFrom utils "flush.console" head "install.packages" "read.table" "write.csv" "read.csv" +#' @importFrom RedditExtractoR reddit_content user_network +#' @importFrom magrittr '%>%' +#' @importFrom dplyr rename group_by summarise ungroup left_join select mutate filter coalesce row_number +#' @importFrom rlang '.data' NULL diff --git a/vosonSML/man/Authenticate.Rd b/vosonSML/man/Authenticate.Rd index 30f1500..7affa01 100644 --- a/vosonSML/man/Authenticate.Rd +++ b/vosonSML/man/Authenticate.Rd @@ -8,13 +8,14 @@ Authenticate(socialmedia, ...) 
} \arguments{ \item{socialmedia}{character string, social media API to authenticate, -currently supports "facebook", "youtube", "twitter" and "instagram"} +currently supports "facebook", "youtube", "twitter", "instagram" and "reddit"} \item{...}{additional parameters for authentication \code{facebook}: appID, appSecret \code{youtube}: apiKey \code{twitter}: apiKey, apiSecret, accessToken, accessTokenSecret -\code{instagram}: appID, appSecret} +\code{instagram}: appID, appSecret +\code{reddit}: appName, appKey, appSecret, useTokenCache} } \value{ credential object with authentication information diff --git a/vosonSML/man/AuthenticateWithRedditAPI.Rd b/vosonSML/man/AuthenticateWithRedditAPI.Rd new file mode 100644 index 0000000..db949a2 --- /dev/null +++ b/vosonSML/man/AuthenticateWithRedditAPI.Rd @@ -0,0 +1,31 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/AuthenticateWithRedditAPI.R +\name{AuthenticateWithRedditAPI} +\alias{AuthenticateWithRedditAPI} +\title{Reddit API authentication.} +\usage{ +AuthenticateWithRedditAPI(appName, appKey, appSecret, useTokenCache) +} +\arguments{ +\item{appName}{character string containing the reddit app name associated with the API key.} + +\item{appKey}{character string containing the app key.} + +\item{appSecret}{character string containing the app secret.} + +\item{useTokenCache}{logical. Use cached authentication token if found.} +} +\value{ +a reddit authentication token +} +\description{ +OAuth2 based authentication with the Reddit API that returns an authentication token. +} +\details{ +The httr package has a known OAuth2 issue with its parameter "use_basic_auth". The default value is set to FALSE +and there is no parameter pass-through, meaning it cannot be set to TRUE as required by reddit OAuth2 authentication. +The point patch devtools::install_github("r-lib/httr#485") fixes this issue.
+Further information: https://github.com/r-lib/httr/issues/482 + +Reddit OAuth tokens are only valid for one hour and using a cached token will subsequently produce 401 errors. +} diff --git a/vosonSML/man/Collect.Rd b/vosonSML/man/Collect.Rd index 2c6e8e2..dbc207e 100644 --- a/vosonSML/man/Collect.Rd +++ b/vosonSML/man/Collect.Rd @@ -19,8 +19,9 @@ CollectDataFrom* and CollectEgo* functions) \code{facebook}: pageName, rangeFrom, rangeTo, verbose, n, writeToFile, dynamic \code{youtube}: videoIDs, verbose, writeToFile, maxComments \code{twitter}: searchTerm, numTweets, verbose, writeToFile, language -\code{instagram}: credential, tag, n, lat, lng, distance, folder, mindate, maxdate, verbose, sleep, writeToFile, +\code{instagram}: credential, tag, n, lat, lng, distance, folder, mindate, maxdate, verbose, sleep, writeToFile, waitForRateLimit +\code{reddit}: threadUrls, waitTime, writeToFile \code{instagram} with \code{ego} = TRUE: username, userid, verbose, degreeEgoNet, waitForRateLimit, getFollows} @@ -57,12 +58,14 @@ videoIDs <- c("W2GZFeYGU3s","mL27TAJGlWc") Authenticate("youtube", apiKey = my_apiKeyYoutube) \%>\% Collect(videoIDs = videoIDs) \%>\% Create('actor') } + } \seealso{ -\code{CollectDataFromFacebook}, -\code{CollectDataFromInstagram}, -\code{CollectDatFromTwitter}, -\code{CollectEgoInstagram} +\code{CollectDataFacebook}, +\code{CollectDataInstagram}, +\code{CollectDataTwitter}, +\code{CollectEgoInstagram}, +\code{CollectDataReddit} } \author{ Chung-hong Chan diff --git a/vosonSML/man/CollectDataReddit.Rd b/vosonSML/man/CollectDataReddit.Rd new file mode 100644 index 0000000..028db5e --- /dev/null +++ b/vosonSML/man/CollectDataReddit.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CollectDataReddit.R +\name{CollectDataReddit} +\alias{CollectDataReddit} +\title{Collect reddit thread data} +\usage{ +CollectDataReddit(threadUrls, waitTime = 5, writeToFile) +} +\arguments{ +\item{threadUrls}{character string
vector. Reddit thread URLs to collect data from.} + +\item{waitTime}{numeric integer. Time in seconds to wait between URL collection requests.} + +\item{writeToFile}{logical. If the data should be written to file.} +} +\value{ +A data frame object of class dataSource.reddit that can be used for creating unimodal +networks (CreateActorNetwork). +} +\description{ +Uses RedditExtractoR::reddit_content to collect user and comment data for thread URLs. +} +\note{ +The reddit API endpoint used for thread collection has a maximum limit of 500 comments per thread URL. +} diff --git a/vosonSML/man/CollectDataTwitter.Rd b/vosonSML/man/CollectDataTwitter.Rd index bd151ea..abaf665 100644 --- a/vosonSML/man/CollectDataTwitter.Rd +++ b/vosonSML/man/CollectDataTwitter.Rd @@ -5,8 +5,9 @@ \title{Note: this function is DEPRECATED and will be removed in a future release. Please use the \code{Collect} function} \usage{ -CollectDataTwitter(searchTerm, numTweets, verbose, writeToFile, language, since, - until, locale, geocode, sinceID, maxID, resultType, retryOnRateLimit) +CollectDataTwitter(searchTerm, numTweets, verbose, writeToFile, language, + since, until, locale, geocode, sinceID, maxID, resultType, + retryOnRateLimit) } \arguments{ \item{searchTerm}{character string, specifying a search term or phrase (e.g. diff --git a/vosonSML/man/CollectEgoInstagram.Rd b/vosonSML/man/CollectEgoInstagram.Rd index d3d6af6..815421a 100644 --- a/vosonSML/man/CollectEgoInstagram.Rd +++ b/vosonSML/man/CollectEgoInstagram.Rd @@ -5,8 +5,8 @@ \title{Note: this function is DEPRECATED and will be removed in a future release.
Please use the \code{Collect} function} \usage{ -CollectEgoInstagram(username, userid, verbose, degreeEgoNet, waitForRateLimit, - getFollows, credential = NULL) +CollectEgoInstagram(username, userid, verbose, degreeEgoNet, + waitForRateLimit, getFollows, credential = NULL) } \arguments{ \item{username}{character vector, specifying a set of usernames who will be diff --git a/vosonSML/man/Create.Rd b/vosonSML/man/Create.Rd index 3206837..6442628 100644 --- a/vosonSML/man/Create.Rd +++ b/vosonSML/man/Create.Rd @@ -4,57 +4,51 @@ \alias{Create} \title{Create networks from social media data} \usage{ -Create(dataSource, type = "Actor", ...) +Create(dataSource, type = "actor", ...) } \arguments{ \item{dataSource}{a data frame of class \code{dataSource}} -\item{type}{character, type of network to be created, currently supports -"actor", "bimodal", "dynamic", "semantic" and "ego"} +\item{type}{character, type of network to be created, currently supports "actor", "bimodal", "dynamic", "semantic" +and "ego"} -\item{...}{additional parameters for Create*Network functions} +\item{...}{additional parameters for create*Network functions} } \value{ -An igraph graph object +an igraph graph object } \description{ -This function creates networks from social media data (i.e. from data frames -of class \code{dataSource}. \code{Create} is the final step of the -\code{Authenticate}, \code{Collect}, \code{Create} workflow. This function is -a convenient UI wrapper to the core Create*Network family of functions. +This function creates networks from social media data (i.e. from data frames of class \code{dataSource}). +\code{Create} is the final step of the \code{Authenticate}, \code{Collect}, \code{Create} workflow. This function is +a convenient UI wrapper to the core create*Network family of functions.
} \details{ -Note: when creating Twitter networks, the user information -can be collected separately using the \code{\link{PopulateUserInfo}} function -and stored into the network as vertex attributes (this involves additional +Note: when creating Twitter networks, the user information can be collected separately using the +\code{\link{PopulateUserInfo}} function and stored into the network as vertex attributes (this involves additional calls to the Twitter API). } \examples{ - \dontrun{ require(magrittr) -## Instagram ego network example -myAppID <- "123456789098765" -myAppSecret <- "abc123abc123abc123abc123abc123ab" -myUsernames <- c("senjohnmccain","obama") -Authenticate("instagram", -appID = myAappId, -appSecret = myAppSecret) \%>\% Collect(ego = TRUE, -username = myUsernames) \%>\% Create +## instagram ego network example -## YouTube actor network example -my_apiKeyYoutube <- "314159265358979qwerty" -videoIDs <- c("W2GZFeYGU3s","mL27TAJGlWc") +my_app_id <- "123456789098765" +my_app_secret <- "abc123abc123abc123abc123abc123ab" +my_usernames <- c("senjohnmccain", "obama") + +my_ego_network <- Authenticate("instagram", appID = my_app_id, appSecret = my_app_secret) \%>\% + Collect(ego = TRUE, username = my_usernames) \%>\% Create + +## youtube actor network example + +my_api_key <- "314159265358979qwerty" +my_video_ids <- c("W2GZFeYGU3s","mL27TAJGlWc") + +my_actor_network <- Authenticate("youtube", apiKey = my_api_key) \%>\% + Collect(videoIDs = my_video_ids) \%>\% Create('actor') -Authenticate("youtube", -apiKey = my_apiKeyYoutube) \%>\% Collect(videoIDs = videoIDs) \%>\% Create('actor') -} } -\seealso{ -\code{\link{CreateActorNetwork}}, -\code{\link{CreateBimodalNetwork}}, \code{\link{CreateDynamicNetwork}}, -\code{\link{CreateSemanticNetwork}}, \code{\link{CreateEgoNetworkFromData}} } \author{ Chung-hong Chan diff --git a/vosonSML/man/CreateActorNetwork.Rd b/vosonSML/man/CreateActorNetwork.Rd deleted file mode 100644 index 5016a60..0000000 --- 
a/vosonSML/man/CreateActorNetwork.Rd +++ /dev/null @@ -1,108 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/CreateActorNetwork.R -\name{CreateActorNetwork} -\alias{CreateActorNetwork} -\title{Note: this function is DEPRECATED and will be removed in a future release. -Please use the \code{Create} function} -\usage{ -CreateActorNetwork(x, writeToFile) -} -\arguments{ -\item{x}{a data frame of class \code{dataSource}. For Twitter data, it is -also possible to provide a *list* of data frames (i.e. data frames that -inherit class \code{dataSource} and \code{twitter}). Only lists of Twitter -data frames are supported at this time. If a list of data frames is -provided, then the function binds these row-wise and computes over the -entire data set.} - -\item{writeToFile}{logical. If \code{TRUE} then the network is saved to file -in current working directory (GRAPHML format), with filename denoting the -current date/time and the type of network.} -} -\value{ -An igraph graph object, with directed and weighted edges. -} -\description{ -Create 'actor' networks from social media data -} -\details{ -This function creates a unimodal 'actor' network from social media data -(i.e. from data frames of class \code{dataSource}, or for Twitter data it is -also possible to provide a *list* of data frames). In this actor network, -edges represent relationships between actors of the same type (e.g. -interactions between Twitter users). For example, with Twitter data an -interaction is defined as a 'mention' or 'reply' or 'retweet' from user i to -user j, given 'tweet' m. With YouTube comments, an interaction is defined as -a 'reply' or 'mention' from user i to user j, given 'comment' m. 
- -This function creates a (weighted and directed) unimodal 'actor' network -from a data frame of class \code{dataSource} (which are created using the -`CollectData` family of functions in the vosonSML package), or a -*list* of Twitter data frames collected using \code{CollectDataTwitter} -function. - -The resulting network is an igraph graph object. This graph object is -unimodal because edges represent relationships between vertices of the same -type (read: 'actors'), such as replies/retweets/mentions between Twitter -users. Edges are directed and weighted (e.g. if user i has replied n times -to user j, then the weight of this directed edge equals n). -} -\note{ -Not all data sources in vosonSML can be used for creating actor -networks. - -Currently supported data sources are: - -- YouTube - Twitter - -Other data sources (e.g. Facebook) will be implemented in the future. The -user is notified if they try to create actor networks for incompatible data -sources. - -For Twitter data, actor networks can be created from multiple data frames -(i.e. datasets collected individually using CollectDataTwitter). Simply -create a list of the data frames that you wish to create a network from. For -example, \code{myList <- list(myTwitterData1, myTwitterData2, -myTwitterData3)}. 
-} -\examples{ - -\dontrun{ - ## This example shows how to collect YouTube comments data and create an actor network - - # Use your own Google Developer API Key here: - myApiKey <- "1234567890" - - # Authenticate with the Google API - apiKeyYoutube <- AuthenticateWithYoutubeAPI(apiKeyYoutube=myApiKey) - - # Generate a vector of YouTube video IDs to collect data from - # (or use the function `GetYoutubeVideoIDs` to automatically - # generate from a plain text file of video URLs) - videoIDs <- c("W2GZFeYGU3s","mL27TAJGlWc") - - # Collect the data using function `CollectDataYoutube` - myYoutubeData <- CollectDataYoutube(videoIDs,apiKeyYoutube,writeToFile=FALSE) - - # Create an 'actor' network using the function `CreateActorNetwork` - g_actor_youtube <- CreateActorNetwork(myYoutubeData) - - # Description of actor network - g_actor_youtube -} - -} -\seealso{ -See \code{CollectDataYoutube} and \code{CollectDataTwitter} to -collect data sources for creating actor networks in vosonSML. -} -\author{ -Timothy Graham & Robert Ackland - -} -\keyword{SNA} -\keyword{igraph} -\keyword{media} -\keyword{network} -\keyword{social} -\keyword{unimodal} diff --git a/vosonSML/man/CreateActorNetwork.reddit.Rd b/vosonSML/man/CreateActorNetwork.reddit.Rd new file mode 100644 index 0000000..6633d69 --- /dev/null +++ b/vosonSML/man/CreateActorNetwork.reddit.Rd @@ -0,0 +1,36 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/CreateActorNetwork.reddit.R +\name{CreateActorNetwork.reddit} +\alias{CreateActorNetwork.reddit} +\title{Creates a reddit actor network from collected threads} +\usage{ +\method{CreateActorNetwork}{reddit}(x, weightEdges, includeTextData, + cleanText, writeToFile) +} +\arguments{ +\item{x}{a dataframe as vosonSML class object containing collected social network data} + +\item{weightEdges}{logical. Combines and weights directed edges. Can't be used with includeTextData.} + +\item{includeTextData}{logical. 
If the igraph network edges should include the comment text as attribute.} + +\item{cleanText}{logical. If non-alphanumeric, non-punctuation, and non-space characters should be removed from the +included text attribute data. Default is TRUE} + +\item{writeToFile}{logical. If the igraph network graph should be written to file.} +} +\value{ +an igraph object of the actor network +} +\description{ +Uses RedditExtractoR::user_network to create an igraph directed actor network with comment ids as edge attribute. +} +\note{ +Can create three types of network graphs: +* Directed graph with subreddit, thread_ids and comment ids as edge attributes - default option +* Directed graph with weighted edges (without comment ids) - weightEdges = TRUE +* Directed graph with comment text included as edge attribute - includeTextData = TRUE + +Comment ids as edge attributes in graphs refer to the Collect dataframe comment id not reddits comment id +If "Forbidden control character 0x19 found in igraph_i_xml_escape, Invalid value" then set cleanText = TRUE +} diff --git a/vosonSML/man/SaveCredential.Rd b/vosonSML/man/SaveCredential.Rd index aa0d26a..1dbfb61 100644 --- a/vosonSML/man/SaveCredential.Rd +++ b/vosonSML/man/SaveCredential.Rd @@ -3,7 +3,6 @@ \name{SaveCredential} \alias{SaveCredential} \alias{LoadCredential} -\alias{LoadCredential} \title{Save and load credential information} \usage{ SaveCredential(credential, filename = "credential.RDS") diff --git a/vosonSML/man/vosonSML-package.Rd b/vosonSML/man/vosonSML-package.Rd index bb98dad..2cd30f6 100644 --- a/vosonSML/man/vosonSML-package.Rd +++ b/vosonSML/man/vosonSML-package.Rd @@ -7,24 +7,22 @@ \title{Collection and network analysis of social media data} \description{ The goal of the vosonSML package is to provide a suite of easy-to-use tools for collecting data from social media -sources (Instagram, Facebook, Twitter, and Youtube) and generating different types of networks suited to +sources (Instagram, Facebook, Twitter, 
Youtube, and Reddit) and generating different types of networks suited to Social Network Analysis (SNA) and text analytics. It offers tools to create unimodal, multimodal, semantic, and -dynamic networks. It draws on excellent packages such as \pkg{twitteR}, \pkg{instaR}, \pkg{Rfacebook}, and -\pkg{igraph} in order to provide an integrated 'work flow' for collecting different types of social media data and -creating different types of networks out of these data. Creating networks from social media data is often -non-trivial and time consuming. This package simplifies such tasks so users can focus on analysis. +dynamic networks. It draws on excellent packages such as \pkg{twitteR}, \pkg{instaR}, \pkg{Rfacebook}, +\pkg{RedditExtractoR} and \pkg{igraph} in order to provide an integrated 'work flow' for collecting different types +of social media data and creating different types of networks out of these data. Creating networks from social media +data is often non-trivial and time consuming. This package simplifies such tasks so users can focus on analysis. } \details{ vosonSML uses a straightforward S3 class system. Data collected with this package produces \code{data.table} objects (extension of class \code{data.frame}), which are assigned the class \code{dataSource}. Additionally, -\code{dataSource} objects are assigned a class identifying the source of data, e.g. \code{facebook} or -\code{youtube}. In this way, \code{dataSource} objects are fast, easy to work with, and can be used as input to -easily construct different types of networks. For example, the function \code{\link{Collect}} can be used to collect -Twitter data, which is then 'piped' to the \code{\link{Create}} function, resulting in a network (an igraph object) -that is ready for analysis. +\code{dataSource} objects are assigned a class identifying the source of data, e.g. \code{facebook} or \code{youtube} +. 
In this way, \code{dataSource} objects are fast, easy to work with, and can be used as input to easily construct +different types of networks. For example, the function \code{\link{Collect}} can be used to collect Twitter data, +which is then 'piped' to the \code{\link{Create}} function, resulting in a network (an igraph object) that is ready +for analysis. } \author{ -Timothy Graham & Robert Ackland, with contribution Chung-hong Chan & Bryan Gertzel - -Maintainer: Bryan Gertzel +Created by Timothy Graham and Robert Ackland, with major contributions by Chung-hong Chan and Bryan Gertzel. }
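The sender/receiver derivation added in CreateActorNetwork.reddit can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy values are made up, only the column names (id, structure, user, author) are taken from the diff, and `match()` stands in for the dplyr `left_join`. It also assumes a single thread, so that `structure` values are unique.

```r
# Toy thread data (hypothetical values); column names follow those selected
# in CreateActorNetwork.reddit.
thread_df <- data.frame(
  id        = 1:6,
  structure = c("1", "1_1", "1_2", "1_3", "2", "2_1"),
  user      = c("alice", "bob", "alice", "bob", "carol", "[deleted]"),
  author    = "dave",
  stringsAsFactors = FALSE
)

# Parent position: strip the trailing "_<n>"; top-level comments have no parent.
response_to <- ifelse(grepl("_", thread_df$structure),
                      sub("_\\d+$", "", thread_df$structure),
                      "")

# Look up who wrote the parent comment. Top-level comments fall back to the
# thread author, mirroring include_author = TRUE in the diff.
receiver <- thread_df$user[match(response_to, thread_df$structure)]
no_parent <- is.na(receiver)
receiver[no_parent] <- thread_df$author[no_parent]

edges <- data.frame(sender = thread_df$user, receiver = receiver,
                    comment_id = thread_df$id, stringsAsFactors = FALSE)

# Drop self-replies and deleted or empty accounts.
keep <- edges$sender != edges$receiver &
  !(edges$sender %in% c("[deleted]", "")) &
  !(edges$receiver %in% c("[deleted]", ""))
edges <- edges[keep, ]

# Rough equivalent of weightEdges = TRUE: collapse parallel edges and count them.
edges$weight <- 1
weighted <- aggregate(weight ~ sender + receiver, data = edges, FUN = sum)

print(edges)
print(weighted)
```

With data collected from several threads, the same `structure` value can occur once per thread, so a lookup keyed on `structure` alone would presumably also need `thread_id` to avoid matching comments across threads.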