Merge branch 'master' of https://github.com/c2huc2hu/utek-2018-progra…
Showing 1 changed file with 33 additions and 0 deletions.
@@ -0,0 +1,33 @@
# Automated Cipher Cracking

This code implements the Caesar, Vigenere and substitution ciphers and automatically decrypts text encrypted with them.
[Katz's back-off model](https://en.wikipedia.org/wiki/Katz%27s_back-off_model) is used to score candidate decryptions.
The Caesar cipher can be brute-forced easily, whereas the Vigenere and substitution ciphers are cracked with simulated annealing.
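
The brute-force step for the Caesar cipher can be sketched in a few lines. This is an illustrative Python sketch, not the repository's code: it swaps in a simple chi-squared letter-frequency score in place of the Katz back-off model, and the names `shift_text`, `chi_squared` and `crack_caesar`, as well as the approximate frequency table, are assumptions made for the example.

```python
import string

# Approximate English letter frequencies in percent (assumed values,
# used here only as a stand-in for a trained language model).
ENGLISH_FREQ = {
    'A': 8.17, 'B': 1.49, 'C': 2.78, 'D': 4.25, 'E': 12.70, 'F': 2.23,
    'G': 2.02, 'H': 6.09, 'I': 6.97, 'J': 0.15, 'K': 0.77, 'L': 4.03,
    'M': 2.41, 'N': 6.75, 'O': 7.51, 'P': 1.93, 'Q': 0.10, 'R': 5.99,
    'S': 6.33, 'T': 9.06, 'U': 2.76, 'V': 0.98, 'W': 2.36, 'X': 0.15,
    'Y': 1.97, 'Z': 0.07,
}


def shift_text(text, shift):
    """Shift every uppercase letter back by `shift`; leave other characters alone."""
    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord('A') - shift) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)


def chi_squared(text):
    """Lower is better: distance between observed and expected letter counts."""
    letters = [c for c in text if c in ENGLISH_FREQ]
    n = len(letters)
    if n == 0:
        return float('inf')
    total = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = n * pct / 100
        observed = letters.count(letter)
        total += (observed - expected) ** 2 / expected
    return total


def crack_caesar(ciphertext):
    """Try all 26 shifts and keep the one whose decryption scores best."""
    best_shift = min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, s)))
    return best_shift, shift_text(ciphertext, best_shift)
```

Because there are only 26 candidate keys, every one can be scored exhaustively. A frequency-based score like this one can be unreliable on very short inputs, which is one reason a stronger language model is attractive.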

This was a prototype to investigate the feasibility of [UTEK 2018's programming challenge](https://github.com/utek-2018/competition-package), hence the simplicity.

## Example
```
DWQI IOKDOKTO QI L ELQZJY DYBQTLJ GKO LKU IWGSJU VO OLIY DG UOTZYBD. ->
LVTUOE*WQ**J*KGB*ZIDS***Y* | THIS SENTENCE IS A FAIRLY TYPICAL ONE AND SHOULD BE EASY TO DECRYPT.
```
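
Reading the example output, the 26-character string before the `|` appears to be the recovered key: position *i* holds the ciphertext letter that stands for the *i*-th plaintext letter (A to Z), and `*` marks plaintext letters that never occur in the message. That interpretation is inferred from the example rather than stated by the source. Under that assumption, applying the key looks roughly like this (the function name `decrypt_with_key` is hypothetical):

```python
def decrypt_with_key(ciphertext, key, unknown='*'):
    """Decrypt a substitution cipher.

    Assumes key[i] is the ciphertext letter for the i-th plaintext letter
    (A..Z) and that `unknown` marks plaintext letters with no known mapping.
    """
    cipher_to_plain = {}
    for i, cipher_letter in enumerate(key):
        if cipher_letter != unknown:
            cipher_to_plain[cipher_letter] = chr(ord('A') + i)
    # Characters without a mapping (spaces, punctuation) pass through unchanged.
    return ''.join(cipher_to_plain.get(ch, ch) for ch in ciphertext)
```

Calling `decrypt_with_key` on the ciphertext and key shown in the example reproduces the plaintext on the right-hand side of the `|`.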

## Notes
* For the Caesar and Vigenere ciphers, surprisingly little training data is needed.
  For the Caesar cipher, a few lines of text were enough. For the Vigenere cipher, a couple of paragraphs sufficed.
* Seeding the substitution cipher with a frequency attack didn't help much.
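
For reference, a frequency-attack seed of the kind mentioned in the last note could look like the sketch below. It is an assumption about what such a seed might be, not the repository's implementation: the most frequent ciphertext letter is paired with the most frequent English letter, and so on down the ranking. The ordering string `ENGLISH_BY_FREQUENCY` is a commonly quoted approximation, and the key layout (position *i* holds the ciphertext letter for the *i*-th plaintext letter) is likewise assumed.

```python
from collections import Counter
import string

# English letters from most to least frequent (an approximate, assumed ordering).
ENGLISH_BY_FREQUENCY = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def frequency_seed(ciphertext):
    """Build an initial key by matching letter-frequency ranks.

    Returns a 26-character key where key[i] is the ciphertext letter
    guessed for the i-th plaintext letter (A..Z).
    """
    counts = Counter(c for c in ciphertext if c in string.ascii_uppercase)
    # Rank all 26 letters by descending count; ties and unseen letters
    # fall back to alphabetical order so the result is deterministic.
    ranked = sorted(string.ascii_uppercase, key=lambda c: (-counts[c], c))
    key = [''] * 26
    for cipher_letter, plain_letter in zip(ranked, ENGLISH_BY_FREQUENCY):
        key[ord(plain_letter) - ord('A')] = cipher_letter
    return ''.join(key)
```

On short messages the observed ranking is noisy, so only a few letters of such a seed tend to be right, which is consistent with the observation that it made little difference.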

## Possible Improvements
* Optimize hyperparameters (number of iterations, rate at which the temperature drops)
* Try heuristic-based search: favor mapping frequent ciphertext letters to frequent English letters
* Try more organized search: swap keys at sequential indices instead of at random
* Improve the model: use a dictionary, a neural network, etc.
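
To make the hyperparameters in the first item concrete, here is a minimal simulated-annealing sketch for a substitution key. It is written under stated assumptions and is not the repository's code: the function names, the default values, the geometric cooling schedule and the random-swap neighbour move are all illustrative, and `score` stands for whatever scoring function is plugged in (higher is better).

```python
import math
import random


def swap_two(key, rng):
    """Neighbour move: swap two randomly chosen positions of the key."""
    i, j = rng.sample(range(len(key)), 2)
    chars = list(key)
    chars[i], chars[j] = chars[j], chars[i]
    return ''.join(chars)


def anneal(initial_key, score, iterations=5000, start_temp=1.0,
           cooling=0.999, seed=0):
    """Maximize `score(key)` by simulated annealing.

    `iterations` and `cooling` are the two hyperparameters to tune:
    the number of steps and the factor by which the temperature
    shrinks after each step.
    """
    rng = random.Random(seed)
    current, current_score = initial_key, score(initial_key)
    best, best_score = current, current_score
    temp = start_temp
    for _ in range(iterations):
        candidate = swap_two(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept a worse key with
        # probability exp(delta / temp), which shrinks as temp drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        # Geometric cooling, floored to avoid division by zero.
        temp = max(temp * cooling, 1e-9)
    return best, best_score
```

A more organized search, as in the third item, would replace `swap_two` with a move that walks through index pairs in a fixed order; the acceptance rule and cooling schedule would stay the same.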

## Running
The program expects test cases in the `input` folder and writes its results to the `output` folder.

```
npm install
git clone https://github.com/utek-2018/competition-package
python3 competition-package/tester.py --ref=competition-package/
```