5-read.qmd

---
title: "Reading Data"
author: "Thomas W. Valente and George G. Vega Yon"
---

```{r setup, echo=FALSE, warning=FALSE, message=FALSE}
knitr::opts_chunk$set(comment = "#")
library(netdiffuseR)
library(igraph)
library(networkDynamic)
```

# Diffusion Network Object (diffnet)

- __netdiffuseR__ has its own class of objects: `diffnet`.

- Most of the package's functions accept different types of graphs:
    * Static: `matrix`, `dgCMatrix` (from the __Matrix__ pkg), 
    * Dynamic: `list` + `dgCMatrix`, `array`, `diffnet`
  
- But `diffnet` is the class from which you get the most.

- From __netdiffuseR__'s perspective, network data comes in three classes:

    1. Raw R network data: Datasets with edgelist, attributes, survey data, etc.
    2. Already R data: already read into R using igraph, statnet, etc. (`igraph_to_diffnet`, `network_to_diffnet`, etc.)
    3. Graph files: DL, UCINET, pajek, etc. (`read_pajek`, `read_dl`, `read_ucinet`, etc.)

- In this presentation we will show focus on 1.

# What is a `diffnet` object

A diffusion network, a.k.a. `diffnet` object, is a `list` that holds the following objects:

- `graph`: A `list` with $t$ `dgCMatrix` matrices of size $n\times n$,
- `toa`: An integer vector of length $n$,
- `adopt`: A matrix of size $n\times t$,
- `cumadopt`: A matrix of size $n\times t$,
- `vertex.static.attrs`: A `data.frame` of size $n\times k$,
- `vertex.dyn.attrs`: A list with $t$ dataframes of size $n\times k$,
- `graph.attrs`: Currently ignored..., and
- `meta`: A list with metadata about the object.

These are created using `new_diffnet` (or its wrappers).


# Basic Diffusion Network

*   To create `diffnet` objects we only need a network and times of adoption:

    ```{r 2-static-net}
    set.seed(9)
    
    # Network
    net <- rgraph_ws(500, 4, .2)
    
    # Times of adoption
    toa <- sample(c(NA, 2010:2014), 500, TRUE)
    
    diffnet_static <- as_diffnet(net, toa)
    summary(diffnet_static)
    ```

# Static survey

*   netdiffuseR can also read survey (nomination) data:
    
    ```{r 2-static-survey}
    data("fakesurvey")
    fakesurvey
    ```

*   In group one 4 nominates id 6, who does not show in the data, and in group two 1 nominates 3, 4, and 8, also individuals who don't show up in the survey.
    
    ```{r 2-static-survey-cont1}
    d1 <- survey_to_diffnet(
      dat      = fakesurvey,                # Dataset
      idvar    = "id",                      # The name of the idvar
      netvars  = c("net1", "net2", "net3"), # Name of the nomination variables
      groupvar = "group",                   # Group variable (if any)
      toavar   = "toa"                      # Name of the time of adoption variable
      ); d1
    ```
    
*   If you want to include those, you can use the option `no.unsurveyed`
    
    ```{r}
    d2 <- survey_to_diffnet(
      dat      = fakesurvey,
      idvar    = "id",
      netvars  = c("net1", "net2", "net3"),
      groupvar = "group",
      toavar   = "toa",
      no.unsurveyed = FALSE
      ); d2
    ```

*   We can also check the difference
    
    
    ```{r}
    d2 - d1
    rownames(d2 - d1)
    ```

# Dynamic survey

```{r 2-static-dynsurvey}
data("fakesurveyDyn")
fakesurveyDyn
```

```{r 2-static-dynsurvey-cont1}
diffnet_dynsurvey <- survey_to_diffnet(
  dat      = fakesurveyDyn,
  idvar    = "id",
  netvars  = c("net1", "net2", "net3"),
  groupvar = "group",
  toavar   = "toa",
  timevar  = "time"
  )

plot_diffnet(diffnet_dynsurvey)
```

# Other formats

*   The package also supports working with other network formats.

*   Besides of `.net` (Pajek), and `ml` (UCINET), netdiffuseR can actually
    convert between classes: `igraph`, `network`, and `networkDynamic`.
    
        
    ```{r foreign, warnings=FALSE, messages=FALSE, cache=TRUE}
    data("medInnovationsDiffNet")
    dn_ig  <- diffnet_to_igraph(medInnovationsDiffNet)
    # dn_ig # For some issue with lazy eval, knitr won't print this
    
    dn_net <- diffnet_to_network(medInnovationsDiffNet)
    dn_net[[1]]
    
    dn_ndy <- diffnet_to_networkDynamic(medInnovationsDiffNet)
    dn_ndy
    ```
    
    First two examples it creates a list of objects, the later actually
    creates a single object
      
    ```{r networkDyn}
    networkDynamic_to_diffnet(dn_ndy, toavar = "toa")
    ```

# Problems

1.  Using the rda file [read.rda](files/read.rda), read in the edgelist `net_edgelist` and the adjacency
    matrix `net_list` as a diffnet objects. In both cases you should use the data.frame `X`
    which has the time of adoption variable.
    (<a href="files/read-solutions.r" target="_blank">solution script</a>)
    
```{r 2-generates-data, echo=FALSE, cache=TRUE, warning=FALSE, message=FALSE, results='hide'}
library(igraph)
library(networkDynamic)
# source("diffnet_to_network.r")
set.seed(81231)

# Generating random data
net_list     <- rgraph_er(20, 5)
net_edgelist <- adjmat_to_edgelist(net_list)

X <- data.frame(
  idnum     = rep(1:20, 5),
  var1      = rnorm(20*5),
  YearAdopt = rep(sample(c(NA, 1:5), 20, TRUE), 5),
  year      = sort(rep(1:5, 20))
  )


# Storing the data
save(X, net_edgelist, net_list, file = "files/read.rda")
```

<!-- 2.  With the new diffnet object, apply the same analysis as before. -->
<!--     Which strategy maximizes adoption? -->