From: Chris Hanson Date: Wed, 26 Nov 2003 07:00:40 +0000 (+0000) Subject: Fix broken behavior of RANDOM when given modulus that exceeds B. The X-Git-Tag: 20090517-FFI~1748 X-Git-Url: https://birchwood-abbey.net/git?a=commitdiff_plain;h=6df95968d8077af63d2e55a8c5a9d389d94bbda5;p=mit-scheme.git Fix broken behavior of RANDOM when given modulus that exceeds B. The old implementation just scaled a random element (uniformly distributed integer between 0 and B-1 inclusive) into the given range; this strategy works fine for a modulus <= B but breaks pretty badly for larger B. In addition, RANDOM now generates an error if the modulus is a real number but neither an exact positive integer nor an inexact real. The old behavior in this case was arbitrary, not terribly useful, and likely to be at odds with the user's expectations. Here are some tests using the "ent" program that show the problem with the old RANDOM implementation. The first example is a 128MB file generated by repeatedly calling (RANDOM (EXPT 2 64)), slicing each random number into bytes, and writing the bytes to the file. The result is appalling: Entropy = 7.500650 bits per byte. Optimum compression would reduce the size of this 134217728 byte file by 6 percent. Chi square distribution for 134217728 samples is 515675588.87, and randomly would exceed this value 0.01 percent of the times. Arithmetic mean value of data bytes is 111.9516 (127.5 = random). Monte Carlo value for Pi is 3.365650585 (error 7.13 percent). Serial correlation coefficient is -0.031868 (totally uncorrelated = 0.0). In contrast, here is the result from a file of the same length generated using (RANDOM 256). This throws away 75% of each random element, but shows the quality of the underlying generator: Entropy = 7.999999 bits per byte. Optimum compression would reduce the size of this 134217728 byte file by 0 percent. Chi square distribution for 134217728 samples is 235.11, and randomly would exceed this value 75.00 percent of the times. Arithmetic mean value of data bytes is 127.5060 (127.5 = random). Monte Carlo value for Pi is 3.141120183 (error 0.02 percent). Serial correlation coefficient is -0.000131 (totally uncorrelated = 0.0). The new design uses enough random elements to guarantee a uniform distribution, no matter what the size of the modulus, by iteratively adding and scaling the elements. This preserves the quality of the underlying generator, as shown by this result: Entropy = 7.999999 bits per byte. Optimum compression would reduce the size of this 134217728 byte file by 0 percent. Chi square distribution for 134217728 samples is 263.59, and randomly would exceed this value 50.00 percent of the times. Arithmetic mean value of data bytes is 127.5114 (127.5 = random). Monte Carlo value for Pi is 3.141132700 (error 0.01 percent). Serial correlation coefficient is -0.000044 (totally uncorrelated = 0.0). --- diff --git a/v7/src/runtime/random.scm b/v7/src/runtime/random.scm index db79b939f..6fc8e3ab3 100644 --- a/v7/src/runtime/random.scm +++ b/v7/src/runtime/random.scm @@ -1,6 +1,6 @@ #| -*-Scheme-*- -$Id: random.scm,v 14.30 2003/09/19 00:39:21 cph Exp $ +$Id: random.scm,v 14.31 2003/11/26 07:00:40 cph Exp $ Copyright 1988,1989,1993,1994,1995,1996 Massachusetts Institute of Technology Copyright 1998,1999,2000,2001,2003 Massachusetts Institute of Technology @@ -28,7 +28,7 @@ USA. ;;; package: (runtime random-number) (declare (usual-integrations)) - + ;;; A "subtract-with-borrow" RNG, based on the algorithm from "A New ;;; Class of Random Number Generators", George Marsaglia and Arif ;;; Zaman, The Annals of Applied Probability, Vol. 1, No. 3, 1991. @@ -44,29 +44,54 @@ USA. ;;; here are taken from the article and are claimed to represent ;;; "hundreds of hours" of compute time. The period of this generator ;;; is (- (EXPT B R) (EXPT B S)), which is approximately (EXPT 10 414). - + (define-integrable r 43) (define-integrable s 22) (define-integrable b 4294967291 #|(- (expt 2 32) 5)|#) (define-integrable b. 4294967291. #|(exact->inexact b)|#) (define (random modulus #!optional state) - (let ((element - (flo:random-unit - (guarantee-random-state (if (default-object? state) #f state) - 'RANDOM)))) + (let ((state + (guarantee-random-state (if (default-object? state) #f state) + 'RANDOM))) ;; Kludge: an exact integer modulus means that result is an exact ;; integer. Otherwise, the result is a real number. - (cond ((and (flo:flonum? modulus) (flo:< 0. modulus)) - (flo:* element modulus)) - ((and (int:integer? modulus) (int:< 0 modulus)) - (flo:truncate->exact (flo:* element (int:->flonum modulus)))) - ((and (real? modulus) (< 0 modulus)) - (* (inexact->exact element) modulus)) + (cond ((int:integer? modulus) + ;; This complexity guarantees that we use enough random + ;; bits to uniformly cover the output interval. + (if (int:> modulus 0) + (let loop ((modulus modulus)) + (if (int:> modulus b) + (let ((qr (int:divide modulus b))) + (int:+ (int:* (car qr) + (flo:truncate->exact + (flo:random-element state))) + (loop (cdr qr)))) + (flo:truncate->exact + ;; This operation order minimizes error. + (flo:/ (flo:* (flo:random-element state) + (int:->flonum modulus)) + b.)))) + (error:bad-range-argument modulus 'RANDOM))) + ((flo:flonum? modulus) + (if (flo:> modulus 0.) + (flo:* (flo:random-unit state) modulus) + (error:bad-range-argument modulus 'RANDOM))) + ((real? modulus) + ;; I can't think of the correct thing to do here. The old + ;; code scaled a random element into the appropriate range, + ;; which gave one of B evenly-distributed values. But this + ;; is arbitrary and not necessarily what the caller wants. + ;; If you have an idea what should happen here, let me + ;; know. -- cph + (error "Unsupported modulus:" modulus)) (else - (error:wrong-type-argument modulus "positive real" 'RANDOM))))) + (error:wrong-type-argument modulus "real number" 'RANDOM))))) (define (flo:random-unit state) + (flo:/ (flo:random-element state) b.)) + +(define (flo:random-element state) (let ((mask (set-interrupt-enables! interrupt-mask/gc-ok))) (let ((index (random-state-index state)) (vector (random-state-vector state))) @@ -90,7 +115,7 @@ USA. 0 (fix:+ index 1)))) (set-interrupt-enables! mask) - (flo:/ element b.))))) + element)))) (define (random-byte-vector n #!optional state) (let ((state (if (default-object? state) #f state))