Replace timer based RNG seed with ADC reference based one #116
…duino.h" dependency
I don't think that that function actually generates 8 bits of entropy, so it should probably do more than 32 calls. I tried 256 calls and it was still much much faster than the clock-based code, so a large number of calls definitely wouldn't be problematic. It might also be a good idea to skip the ad-hoc shifting and masking code used in that function and feed everything directly into the hashing function. |
Unfortunately I got no AVR board to test the entropy. |
When I tried it, I didn't mask |
I had a look into the hydro_hash_update function, it seems that the input is XORed with the state and then fed into the |
I agree that we should feed everything to the hash function instead of shifting and masking. How many unbiased bits can we safely assume here? With If you feel like it's not, |
Having an unsafe option is not a good idea. |
I can't give a definite answer on the number of unbiased bits from the code alone. |
The Uno's serial USB connection is unfortunately very slow, but I have a few hundred KiB of output data so far. Testing with nConversions=20, nConversions=100, nConversions=1, and in a bit I'll change the code to remove the shifting and looping completely and try that. |
A basic test could be to simply compute the average of all random uint8_t outputs of the generator function and observe whether it converges towards 127.5, the mean of a uniform distribution over 0..255. |
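The averaging check described above can be sketched in a few lines of C. This is a minimal host-side sketch, not code from the PR; the name `mean_of_bytes` is hypothetical. A mean close to 127.5 is only a necessary condition: a heavily patterned sequence can still average out correctly.

```c
#include <stddef.h>
#include <stdint.h>

/* Arithmetic mean of n byte samples; returns 0.0 for an empty buffer.
 * Hypothetical helper for offline analysis of a captured dump. */
double mean_of_bytes(const uint8_t *samples, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return n ? (double) sum / (double) n : 0.0;
}
```

Feeding it a buffer that contains every byte value exactly once gives 127.5, which is the value a uniform generator should approach.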
That is true. Right now I'm testing it with a global instead so that it still pulls up and down for each call, without looping internally. Also, seems like the main slowdown was actually random_rbyte, not the serial interface, so I have a very large amount of data for N=1 and what I described in the previous sentence. |
Doing some more testing, any prescaler option other than 64 seems to produce fewer unique readings from the ADC. Also, regardless of settings, at least half of all readings are 0. |
I'd say that the 0 readings are expected: every second iteration the ADC input is grounded so that it can be recharged. |
I generated 17MiB with this code:

    uint8_t
    random_rbyte()
    {
        uint8_t rnd = 0;
        ADMUX = 0b01001111; // Low (0V)
        ADMUX = 0b01001110; // High (1.1V)
        ADCSRA |= 1 << ADSC; // Start a conversion
        // ADSC is cleared when the conversion finishes
        while ((ADCSRA >> ADSC) & 1) {
            // wait for the conversion to complete
        }
        rnd = ADCL; // do not swap this sequence. Low has to be fetched first.
        rnd ^= ADCH; // the value is always between 0-3
        return rnd;
    }

and got the following byte value distribution
|
How frequently do you get the same value twice in a row? What distribution do you get with only the 2 or 4 last bits? And with a Von Neumann extractor? |
What kind of tools are good for this sort of analysis? I've just been stringing commands together in bash, but that's starting to feel quite inadequate. Looking at every individual pair of bytes, 158261 have the same byte twice in a row. Here's the actual file I generated https://0x0.st/-tyR.txt (it's raw binary data, not sure why 0x0 added the .txt) if you want to do your own analysis on it. |
Last 4 bits:
Last 2 bits:
|
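For anyone who wants to redo the repeat and low-bit counts without stringing shell commands together, here is a minimal C sketch. The function names are hypothetical. It counts overlapping adjacent repeats, which may differ from how the 158261 figure above was counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of positions i (1 <= i < n) where samples[i] == samples[i - 1]. */
size_t count_adjacent_repeats(const uint8_t *samples, size_t n)
{
    size_t repeats = 0;
    for (size_t i = 1; i < n; i++) {
        if (samples[i] == samples[i - 1]) {
            repeats++;
        }
    }
    return repeats;
}

/* Histogram of the lowest `bits` bits (1..8) of each sample.
 * `counts` must have room for (1 << bits) entries; it is zeroed first. */
void low_bits_histogram(const uint8_t *samples, size_t n, unsigned bits,
                        size_t *counts)
{
    const unsigned mask = (1u << bits) - 1u;
    for (unsigned v = 0; v <= mask; v++) {
        counts[v] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        counts[samples[i] & mask]++;
    }
}
```

Calling `low_bits_histogram` with `bits` set to 2 or 4 gives the "last 2 bits" and "last 4 bits" distributions asked about above.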
I doubt that one instruction is enough to discharge the internal capacitor and give a sufficiently long charge time and range for a good distribution
|
I changed it so that it sets it to low right at the end of the function, and sets it to high right before sampling. This seems to increase the number of unique readings. |
I generated 1.6GiB. It has 100 distinct byte values. I'm currently writing a tool in Zig to extract and analyze it. |
I ran a Von Neumann extractor on just the lowest bit of each byte, and the output was pretty even according to
It also passes a few dieharder tests, but the size (30M) is not big enough for most of the tests. I'm currently running another Arduino sketch that generates 4096 samples and then resets, to see if there are any right-after-boot correlations between runs. |
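For reference, a Von Neumann extractor over a single bit position can be sketched as follows. This is a minimal illustration, not the tool used for the numbers in this thread; the function name, the MSB-first packing and the (0,1) -> 0 / (1,0) -> 1 convention are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Von Neumann extraction on bit `bit` (0..7) of each sample.
 * Samples are consumed in non-overlapping pairs: a pair (0,1) emits 0,
 * a pair (1,0) emits 1, and equal pairs are discarded.
 * Output bits are packed MSB-first into `out`.
 * Returns the number of bits produced (at most out_cap_bits).
 */
size_t von_neumann_extract(const uint8_t *samples, size_t n, unsigned bit,
                           uint8_t *out, size_t out_cap_bits)
{
    size_t produced = 0;
    for (size_t i = 0; i + 1 < n && produced < out_cap_bits; i += 2) {
        const unsigned a = (samples[i] >> bit) & 1u;
        const unsigned b = (samples[i + 1] >> bit) & 1u;
        if (a == b) {
            continue; /* equal pair carries no usable bit */
        }
        const size_t   byte  = produced / 8;
        const unsigned shift = 7u - (unsigned) (produced % 8);
        if (shift == 7u) {
            out[byte] = 0; /* first bit written into this output byte */
        }
        out[byte] |= (uint8_t) (a << shift);
        produced++;
    }
    return produced;
}
```

Passing `bit = 0` processes only the lowest bit of each byte; other values of `bit` run the same procedure on the higher bit positions. The extractor removes bias only if the input bits are independent, so it does not by itself address correlation between successive readings.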
I have now done Von Neumann extraction on each bit position.
According to ent [1] each of these files is high entropy. Combined with the decrease in file sizes, I interpret this as meaning that while the high bits are very biased, they are still quite decorrelated. This doesn't mean that they're independent, but I think it's a good sign. I think an iterated Von Neumann extractor might be useful to get more examinable data out of the raw data, but I'm not sure. [1] https://paste.sr.ht/~lonjil/b6814c2139c2f55c9d37abece27aab44bc568823 |
This is great! How long does it take to extract 256 bits from the lower bit? |
On my phone so my numbers are inexact, but I believe it took around 1707
input bits for Von Neumann to produce 256 bits.
|
As a point of reference regarding performance, this hydro_random_init [1] implementation takes ~517 milliseconds to run [1] https://paste.sr.ht/~lonjil/4a2cd787141af107fb5ac3b7eb3914b9d2034eef |
I see that you removed the ADC conversion time "randomisation". |
I'm not sure it would. Changing the prescaler to 2 makes it not random at all, and changing it to 128 makes it much slower while also reducing the entropy per reading (at a glance anyway, I didn't do a rigorous test). I think all the selections other than 64 produced less entropy as well, so any extra randomness from an unpredictable ADC clock might just be lost to more predictable sample output in some of the cases.
Also, the clock prescaler needs to be set to a different value depending on the chip clock rate. An ATmega328P at 8 MHz should probably use 32, one at 4 MHz should probably use 16, and so on, and one at 20 MHz might do best at 128. The random selection would make that aspect harder to reason about, and not possible at all on some configurations.
|
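The prescaler values suggested above (64 on the 16 MHz Uno, 32 at 8 MHz, 16 at 4 MHz, 128 at 20 MHz) are all consistent with picking the smallest power-of-two prescaler that keeps the ADC clock at or below 250 kHz. That rule is an inference from those numbers, not something measured here, and the function name is hypothetical. A minimal sketch:

```c
/*
 * Smallest power-of-two ADC prescaler in 2..128 such that
 * f_cpu_hz / prescaler <= 250 kHz. The 250 kHz target is an assumption
 * inferred from the values quoted in this thread.
 */
unsigned adc_prescaler_for(unsigned long f_cpu_hz)
{
    const unsigned long target_hz = 250000UL;
    unsigned prescaler = 2;
    while (prescaler < 128 && f_cpu_hz / prescaler > target_hz) {
        prescaler *= 2;
    }
    return prescaler;
}
```

On a real build the result would normally be derived at compile time from `F_CPU` and then mapped to the ADPS bits; whether the resulting setting gives the best entropy on a given chip would still have to be measured.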
I ran an implementation of the NIST 800-90B non-IID min-entropy estimator on 2 million samples, and it gave 0.315568 bits per sample. |
I've tested the code with an ATmega2561 in a commercial appliance I had at hand.
|
Since different AVR micros differ so strongly on so many points, it may be necessary to have bespoke routines for every model :/ |
This is sad :( And we won't be able to test on every AVR model. |
I guess the code will have to have preprocessor directives for the target chip, and an error instructing people to open an issue for un-implemented models. Also, I wonder if there is a published in-depth description of the ADC hardware anywhere, so that someone with the appropriate physics knowledge could figure out a lower bound on the min-entropy, rather than the upper bound we get from sample testing. |
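The per-chip gating could look roughly like this. It is a sketch only: `__AVR__` and `__AVR_ATmega328P__` are macros predefined by avr-gcc, while `ADC_ENTROPY_PROFILE` and `adc_entropy_profile` are hypothetical names, and the ATmega328P is listed only because it is the chip tested in this thread.

```c
/* Select a per-model ADC entropy routine at compile time. */
#if defined(__AVR_ATmega328P__)
#  define ADC_ENTROPY_PROFILE "atmega328p"
#elif defined(__AVR__)
#  error "No ADC entropy routine for this AVR model yet; please open an issue."
#else
#  define ADC_ENTROPY_PROFILE "none" /* not an AVR build */
#endif

/* Name of the profile selected above (hypothetical helper). */
const char *adc_entropy_profile(void)
{
    return ADC_ENTROPY_PROFILE;
}
```

Unknown AVR models fail the build with an explicit message instead of silently falling back to an untested routine.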
I think we won't be able to determine a definite min-entropy bound for that type of source. I'm currently testing another (WDT + Timer based) RNG which would be independent of external / user-influenceable circumstances. The randomness here will come from the "low accuracy" of the internal 128kHz RC oscillator for the WDT.
|
@DeeFuse would you mind posting the code for that? |
Old PR and discussion, but still relevant :) What should we do? |
I could clean up my GIST with the updated code and push it to my PR. |
The implementation looks fine, but I don't have the hardware to test it on. |
If you want to have a look into the randomness of this implementation I can send you a bin file for analysis. |
I have a lot of different boards available and can help out if needed. |
I created a 10k number sequence as well as 10 shorter dumps after booting the MCU: random.zip |
Shall we back up the states of the used peripherals (WDT / TIM1) and restore them after the random number generation? |
That sounds like a good idea. Applications may not expect them to be changed. |
I got a question about the coding style: |
Runtime. |
I'm thinking about disabling the function 'hydro_random_reseed' when building on an AVR unless the user explicitly enables it. |
Remove "Arduino.h" dependency
Fixes #115