---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-"
)
```
# rtauargus <a href='https://inseefrlab.github.io/rtauargus/'><img src='man/figures/rtauargus_logo_small.png' align="right" width="120" /></a>
<!-- badges: start -->
<!-- [![pipeline status](https://gitlab.insee.fr/outilsconfidentialite/rtauargus/badges/master/pipeline.svg)](https://gitlab.insee.fr/outilsconfidentialite/rtauargus/-/pipelines) -->
<!-- badges: end -->
<!--![](vignettes/R_logo_small.png) ![](vignettes/TauBall2_small.png)-->
## Run τ-Argus from R
The *rtauargus* package provides an **R** interface for **τ-Argus**.
It allows you to:
- create inputs (rda, arb, hst and tab files) from data in R format;
- generate the sequence of instructions to be executed in batch mode (arb file);
- launch a τ-Argus batch from the command line;
- retrieve the results in R.
These operations can be executed in one go or separately, in a modular way.
This makes it possible to integrate the tasks performed by τ-Argus into a processing chain written in R.
The package also offers **additional functionalities**, such as:
- managing the protection of several tables at once;
- creating a hierarchical variable from a correspondence table.
Both a tabular and a microdata approach are available, but the tabular
one is now the recommended approach.
## Installation
* **most recent stable version** (recommended)
- For Insee agents:
```{r eval=FALSE}
install.packages(
"rtauargus",
repos = "https://nexus.insee.fr/repository/r-public",
type = "source"
)
```
- Elsewhere:
```{r eval=FALSE}
install.packages("remotes")
remotes::install_github(
"InseeFrLab/rtauargus",
build_vignettes = FALSE,
upgrade = "never"
)
```
* **version in development**
To install a specific version, append a reference
([commit](https://github.com/inseefrlab/rtauargus/commits/master) or
[tag](https://github.com/inseefrlab/rtauargus/tags))
to the repository name, for example `"inseefrlab/rtauargus@<ref>"`, where `<ref>` stands for the chosen commit hash or tag name.
## Simple example
When loading the package, the console displays some information:
```{r}
library(rtauargus)
```
In particular, a plausible location for the τ-Argus software is
predefined. This can be changed for the duration of the R session, as follows:
```{r}
loc_tauargus <- "Y:/Logiciels/TauArgus/TauArgus4.2.2b1/TauArgus.exe"
options(rtauargus.tauargus_exe = loc_tauargus)
```
With this small adjustment done, the package is ready to be used.
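Before overriding the option, it can help to confirm that the executable really exists at the chosen path. The sketch below uses only base R. `set_tauargus_exe()` is a hypothetical helper written for this illustration, not a function of the package.

```{r}
# Hypothetical helper (not part of rtauargus): set the option only if the
# file exists, and return TRUE/FALSE invisibly to say whether it was set.
set_tauargus_exe <- function(path) {
  if (!file.exists(path)) {
    message("Tau-Argus executable not found: ", path)
    return(invisible(FALSE))
  }
  options(rtauargus.tauargus_exe = path)
  invisible(TRUE)
}

set_tauargus_exe("Y:/Logiciels/TauArgus/TauArgus4.2.2b1/TauArgus.exe")
getOption("rtauargus.tauargus_exe")  # location used for the current session
```

If the file is missing, the option is left untouched and a message is printed instead.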
For the following demonstration, a fictitious table will be used:
```{r}
act_size <-
data.frame(
ACTIVITY = c("01","01","01","02","02","02","06","06","06","Total","Total","Total"),
SIZE = c("tr1","tr2","Total","tr1","tr2","Total","tr1","tr2","Total","tr1","tr2","Total"),
VAL = c(100,50,150,30,20,50,60,40,100,190,110,300),
N_OBS = c(10,5,15,2,5,7,8,6,14,20,16,36),
MAX = c(20,15,20,20,10,20,16,38,38,20,38,38)
)
```
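The table contains its own margins (the `"Total"` rows and columns), so it can be useful to check that they are consistent with the inner cells before protecting it. The following base-R sketch does this for `VAL` and `N_OBS`. `is_additive()` is a hypothetical helper, not a function of the package.

```{r}
# Same fictitious table as above, repeated so that the chunk runs on its own.
act_size <- data.frame(
  ACTIVITY = rep(c("01", "02", "06", "Total"), each = 3),
  SIZE     = rep(c("tr1", "tr2", "Total"), times = 4),
  VAL      = c(100, 50, 150, 30, 20, 50, 60, 40, 100, 190, 110, 300),
  N_OBS    = c(10, 5, 15, 2, 5, 7, 8, 6, 14, 20, 16, 36),
  MAX      = c(20, 15, 20, 20, 10, 20, 16, 38, 38, 20, 38, 38)
)

# Hypothetical helper: TRUE when every "Total" margin of column `col`
# equals the sum of the corresponding inner cells.
is_additive <- function(col) {
  m <- xtabs(reformulate(c("ACTIVITY", "SIZE"), response = col), data = act_size)
  inner_rows <- setdiff(rownames(m), "Total")
  inner_cols <- setdiff(colnames(m), "Total")
  rows_ok <- all(rowSums(m[, inner_cols, drop = FALSE]) == m[, "Total"])
  cols_ok <- all(colSums(m[inner_rows, , drop = FALSE]) == m["Total", ])
  rows_ok && cols_ok
}

sapply(c("VAL", "N_OBS"), is_additive)
```

`MAX` is deliberately left out of the check: it holds the largest contribution of each cell and is not expected to add up.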
As primary rules, we use the following two:
- the n-k dominance rule with n = 1 and k = 85;
- the minimum frequency rule with n = 3 and a safety range of 10.
To get the results for the dominance rule, we need to specify the largest
contributor to each cell, corresponding to the `MAX` variable in the tabular data.
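To make the two rules concrete, they can be applied by hand in base R. This is only a sketch and not the τ-Argus implementation. It assumes that the frequency rule flags cells with fewer than 3 contributors, and that the dominance rule flags cells whose largest contributor exceeds 85 % of the cell value. The safety range plays no part in this check.

```{r}
# Same fictitious table as above, repeated so that the chunk runs on its own.
act_size <- data.frame(
  ACTIVITY = rep(c("01", "02", "06", "Total"), each = 3),
  SIZE     = rep(c("tr1", "tr2", "Total"), times = 4),
  VAL      = c(100, 50, 150, 30, 20, 50, 60, 40, 100, 190, 110, 300),
  N_OBS    = c(10, 5, 15, 2, 5, 7, 8, 6, 14, 20, 16, 36),
  MAX      = c(20, 15, 20, 20, 10, 20, 16, 38, 38, 20, 38, 38)
)

# Frequency rule (assumed reading): fewer than n = 3 contributors.
freq_unsafe <- act_size$N_OBS < 3

# Dominance rule with n = 1, k = 85 (assumed reading): the largest
# contributor accounts for more than 85 % of the cell value.
dom_unsafe <- act_size$MAX / act_size$VAL > 0.85

# Cells flagged by at least one of the two hand-made checks.
act_size[freq_unsafe | dom_unsafe, ]
```

On this table the frequency check flags the cell (`02`, `tr1`), which has 2 contributors, and the dominance check flags the cell (`06`, `tr2`), where the largest contributor holds 38 out of 40.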
```{r}
ex1 <- tab_rtauargus(
act_size,
dir_name = "tauargus_files",
files_name = "ex1",
explanatory_vars = c("ACTIVITY","SIZE"),
safety_rules = "FREQ(3,10)|NK(1,85)",
value = "VAL",
freq = "N_OBS",
maxscore = "MAX",
totcode = c(ACTIVITY="Total",SIZE="Total")
)
```
By default, the function prints the content of the logbook to the console, so that the
user can read every step run by τ-Argus. The same content is saved in the `logbook.txt` file.
Setting `verbose = FALSE` silences this output.
By default, the function returns the original dataset with one additional variable,
called `Status`, which comes directly from τ-Argus and describes the status of
each cell as follows:
- `A`: primary secret cell because of the frequency rule;
- `B`: primary secret cell because of the dominance rule (1st contributor);
- `C`: primary secret cell because of the dominance rule (more contributors, when n > 1);
- `D`: secondary secret cell;
- `V`: valid cell, no need to mask.
```{r}
ex1
```
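The `Status` variable can then be used to derive a table fit for publication. The sketch below blanks every value whose status is not `V`. `mask_cells()` is a hypothetical helper, and the small `toy` data frame is purely illustrative: it is not output produced by τ-Argus.

```{r}
# Hypothetical helper (not part of rtauargus): replace the value of every
# cell whose status differs from "V" with NA.
mask_cells <- function(tab, value = "VAL", status = "Status") {
  tab[[value]][tab[[status]] != "V"] <- NA
  tab
}

# Illustrative input only, with made-up statuses.
toy <- data.frame(
  ACTIVITY = c("01", "02"),
  VAL      = c(100, 30),
  Status   = c("V", "A")
)

mask_cells(toy)
```

With the real result, the same helper would be called as `mask_cells(ex1)`, and `table(ex1$Status)` gives a quick count of cells per status.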
All the files generated by the function are written in the specified directory
(`dir_name` argument).
The default format for the protected table is csv but it can be changed.
All the τ-Argus files (.tab, .rda, .arb and .txt) are written to the
same directory, too. To go further, you can consult the latest version of the
τ-Argus manual, which can be downloaded here:
[https://research.cbs.nl/casc/Software/TauManualV4.1.pdf](https://research.cbs.nl/casc/Software/TauManualV4.1.pdf).
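To see what was written to the output directory, the files can be listed and grouped by extension with base R. `files_by_ext()` is a hypothetical helper. The directory name is the `dir_name` value used in the example above, and the actual file names depend on `files_name`.

```{r}
# Hypothetical helper (not part of rtauargus): list the files of a
# directory, grouped by file extension.
files_by_ext <- function(dir) {
  f <- list.files(dir)
  split(f, tools::file_ext(f))
}

files_by_ext("tauargus_files")
```

If the directory does not exist yet, the helper simply returns an empty list.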
**A detailed overview is available via `vignette("rtauargus")`.**
## Important notes
The functions of *rtauargus* that call τ-Argus require this software to be accessible from the workstation. τ-Argus can be downloaded from the [dedicated page](https://github.com/sdcTools/tauargus/releases) of the *sdcTools* git repository.
_The package was developed on the basis of open source versions of τ-Argus
(versions 4.2 and above); in particular, this release of the package was developed with
τ-Argus 4.2.3. **It is not compatible with version 3.5.**_