anonymizer

Go package for anonymizing text. It removes all kinds of PII: names, places, phone numbers, etc.

The main design principle is "better safe than sorry": if the package isn't sure whether a word should be anonymized, it anonymizes it. This covers all non-dictionary words and all words that start with a capital letter, unless they are at the beginning of a sentence.
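The rule above can be sketched as a small predicate. This is only an illustration of the stated principle, not the package's actual implementation: the helper name `shouldAnonymize`, its parameters, and the tiny in-memory dictionary are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// shouldAnonymize is a hypothetical helper (not part of the anonymizer API).
// It reports whether a word should be masked under the "better safe than
// sorry" rule: mask anything that is not a known dictionary word, and mask
// capitalized words unless they open a sentence.
func shouldAnonymize(word string, sentenceStart bool, dict map[string]bool) bool {
	// Unknown words are always masked.
	if !dict[strings.ToLower(word)] {
		return true
	}
	// Known words are masked only when capitalized in mid-sentence.
	first, _ := utf8.DecodeRuneInString(word)
	return unicode.IsUpper(first) && !sentenceStart
}

func main() {
	// A toy dictionary; a real one would come from a system word list.
	dict := map[string]bool{"my": true, "name": true, "is": true, "gram": true}

	fmt.Println(shouldAnonymize("My", true, dict))         // sentence start, known word
	fmt.Println(shouldAnonymize("name", false, dict))      // lowercase, known word
	fmt.Println(shouldAnonymize("Gram", false, dict))      // capitalized mid-sentence
	fmt.Println(shouldAnonymize("amsterdam", false, dict)) // not in the dictionary
}
```

Note that "Gram" is masked even though "gram" is a dictionary word: the capital letter in the middle of a sentence is enough to make it suspicious.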

Example

Input:

Good morning, doctor. My name is Gram. I live in amsterdam, at kerkstraat 42. My social number is 123-456.

Output:

Good morning, doctor. My name is █▄▄▄. I live in ▄▄▄▄▄▄▄▄▄, at ▄▄▄▄▄▄▄▄▄▄ 00. My social number is 000-000.
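The output above preserves the shape of each masked word: one placeholder per character, with digits replaced by zeros and punctuation left alone. A minimal sketch of such a shape-preserving mask is shown below. The function name `mask` and the exact character mapping are inferred from this single example and are assumptions; the package may choose its replacement characters differently.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// mask is a hypothetical helper (not part of the anonymizer API).
// It replaces every character with a placeholder of the same class,
// so the length and rough shape of the original text stay visible.
func mask(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case unicode.IsUpper(r):
			return '█' // uppercase letter
		case unicode.IsLower(r):
			return '▄' // lowercase letter
		case unicode.IsDigit(r):
			return '0' // digit
		default:
			return r // punctuation and spaces are kept as-is
		}
	}, s)
}

func main() {
	fmt.Println(mask("Gram"))          // █▄▄▄
	fmt.Println(mask("kerkstraat 42")) // ▄▄▄▄▄▄▄▄▄▄ 00
	fmt.Println(mask("123-456"))       // 000-000
}
```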

Installation

go get github.com/orsinium-labs/anonymizer

Make sure you have dictionaries installed for the language you're going to anonymize. For example, for American English:

sudo apt install wamerican

To list dictionaries that you already have installed:

ls /usr/share/dict

To list all dictionaries that can be installed:

sudo apt install aptitude
aptitude search '?provides(wordlist)'

If no dictionary is found for the requested language, or no language is given, the system default dictionary is used. Run sudo select-default-wordlist to change the system default.

Usage

package main

import (
    "fmt"

    "github.com/orsinium-labs/anonymizer"
)

func main() {
    input := "Hi, my name is Gram."
    dict, err := anonymizer.LoadDict("en")
    if err != nil {
        panic(err)
    }
    a := anonymizer.New(dict)
    a.Language = "en"
    output := a.Anonymize(input)
    fmt.Println(output)
    // Output: Hi, my name is Xxxx.
}