diff options
Diffstat (limited to 'doc/rand.doc')
-rw-r--r-- | doc/rand.doc | 141 |
1 files changed, 141 insertions, 0 deletions
diff --git a/doc/rand.doc b/doc/rand.doc new file mode 100644 index 0000000000..da02a07f64 --- /dev/null +++ b/doc/rand.doc @@ -0,0 +1,141 @@ +My Random number library. + +These routines can be used to generate pseudo random numbers and can be +used to 'seed' the pseudo random number generator (RNG). The RNG make no +effort to reproduce the same random number stream with each execution. +Various other routines in the SSLeay library 'seed' the RNG when suitable +'random' input data is available. Read the section at the end for details +on the design of the RNG. + +void RAND_bytes( +unsigned char *buf, +int num); + This routine puts 'num' random bytes into 'buf'. One should make + sure RAND_seed() has been called before using this routine. + +void RAND_seed( +unsigned char *buf, +int num); + This routine adds more 'seed' data the RNG state. 'num' bytes + are added to the RNG state, they are taken from 'buf'. This + routine can be called with sensitive data such as user entered + passwords. This sensitive data is in no way recoverable from + the RAND library routines or state. Try to pass as much data + from 'random' sources as possible into the RNG via this function. + Also strongly consider using the RAND_load_file() and + RAND_write_file() routines. + +void RAND_cleanup(); + When a program has finished with the RAND library, if it so + desires, it can 'zero' all RNG state. + +The following 3 routines are convenience routines that can be used to +'save' and 'restore' data from/to the RNG and it's state. +Since the more 'random' data that is feed as seed data the better, why not +keep it around between executions of the program? Of course the +application should pass more 'random' data in via RAND_seed() and +make sure no-one can read the 'random' data file. + +char *RAND_file_name( +char *buf, +int size); + This routine returns a 'default' name for the location of a 'rand' + file. The 'rand' file should keep a sequence of random bytes used + to initialise the RNG. The filename is put in 'buf'. Buf is 'size' + bytes long. Buf is returned if things go well, if they do not, + NULL is returned. The 'rand' file name is generated in the + following way. First, if there is a 'RANDFILE' environment + variable, it is returned. Second, if there is a 'HOME' environment + variable, $HOME/.rand is returned. Third, NULL is returned. NULL + is also returned if a buf would overflow. + +int RAND_load_file( +char *file, +long number); + This function 'adds' the 'file' into the RNG state. It does this by + doing a RAND_seed() on the value returned from a stat() system call + on the file and if 'number' is non-zero, upto 'number' bytes read + from the file. The number of bytes passed to RAND_seed() is returned. + +int RAND_write_file( +char *file), + RAND_write_file() writes N random bytes to the file 'file', where + N is the size of the internal RND state (currently 1k). + This is a suitable method of saving RNG state for reloading via + RAND_load_file(). + +What follows is a description of this RNG and a description of the rational +behind it's design. + +It should be noted that this RNG is intended to be used to generate +'random' keys for various ciphers including generation of DH and RSA keys. + +It should also be noted that I have just created a system that I am happy with. +It may be overkill but that does not worry me. I have not spent that much +time on this algorithm so if there are glaring errors, please let me know. +Speed has not been a consideration in the design of these routines. + +First up I will state the things I believe I need for a good RNG. +1) A good hashing algorithm to mix things up and to convert the RNG 'state' + to random numbers. +2) An initial source of random 'state'. +3) The state should be very large. If the RNG is being used to generate + 4096 bit RSA keys, 2 2048 bit random strings are required (at a minimum). + If your RNG state only has 128 bits, you are obviously limiting the + search space to 128 bits, not 2048. I'm probably getting a little + carried away on this last point but it does indicate that it may not be + a bad idea to keep quite a lot of RNG state. It should be easier to + break a cipher than guess the RNG seed data. +4) Any RNG seed data should influence all subsequent random numbers + generated. This implies that any random seed data entered will have + an influence on all subsequent random numbers generated. +5) When using data to seed the RNG state, the data used should not be + extractable from the RNG state. I believe this should be a + requirement because one possible source of 'secret' semi random + data would be a private key or a password. This data must + not be disclosed by either subsequent random numbers or a + 'core' dump left by a program crash. +6) Given the same initial 'state', 2 systems should deviate in their RNG state + (and hence the random numbers generated) over time if at all possible. +7) Given the random number output stream, it should not be possible to determine + the RNG state or the next random number. + + +The algorithm is as follows. + +There is global state made up of a 1023 byte buffer (the 'state'), a +working message digest ('md') and a counter ('count'). + +Whenever seed data is added, it is inserted into the 'state' as +follows. + The input is chopped up into units of 16 bytes (or less for + the last block). Each of these blocks is run through the MD5 + message digest. The data passed to the MD5 digest is the + current 'md', the same number of bytes from the 'state' + (the location determined by in incremented looping index) as + the current 'block' and the new key data 'block'. The result + of this is kept in 'md' and also xored into the 'state' at the + same locations that were used as input into the MD5. + I believe this system addresses points 1 (MD5), 3 (the 'state'), + 4 (via the 'md'), 5 (by the use of MD5 and xor). + +When bytes are extracted from the RNG, the following process is used. +For each group of 8 bytes (or less), we do the following, + Input into MD5, the top 8 bytes from 'md', the byte that are + to be overwritten by the random bytes and bytes from the + 'state' (incrementing looping index). From this digest output + (which is kept in 'md'), the top (upto) 8 bytes are + returned to the caller and the bottom (upto) 8 bytes are xored + into the 'state'. + Finally, after we have finished 'generation' random bytes for the + called, 'count' (which is incremented) and 'md' are fed into MD5 and + the results are kept in 'md'. + I believe the above addressed points 1 (use of MD5), 6 (by + hashing into the 'state' the 'old' data from the caller that + is about to be overwritten) and 7 (by not using the 8 bytes + given to the caller to update the 'state', but they are used + to update 'md'). + +So of the points raised, only 2 is not addressed, but sources of +random data will always be a problem. + |