NOTE: Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits. Those p-values are obtained by
p=F(X), where F is the assumed distribution of the sample
random variable X, often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst
in the tails. Thus you should not be surprised by occasional
p-values near 0 or 1, such as .0012 or .9983. When a bit
stream really FAILS BIG, you will get p's of 0 or 1 to six
or more places. By all means, do not, as a Statistician
might, think that a p < .025 or p > .975 means that the RNG
has "failed the test at the .05 level". Such p's happen
among the hundreds that DIEHARD produces, even with good
RNG's. So keep in mind that "p happens".

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BIRTHDAY SPACINGS TEST ::
:: Choose m birthdays in a year of n days. List the spacings ::
:: between the birthdays. If j is the number of values that ::
:: occur more than once in that list, then j is asymptotically ::
:: Poisson distributed with mean m^3/(4n). Experience shows n ::
:: must be quite large, say n>=2^18, for comparing the results ::
:: to the Poisson distribution with that mean. This test uses ::
:: n=2^24 and m=2^9, so that the underlying distribution for j ::
:: is taken to be Poisson with lambda=2^27/(2^26)=2. A sample ::
:: of 500 j's is taken, and a chi-square goodness of fit test ::
:: provides a p value. The first test uses bits 1-24 (counting ::
:: from the left) from integers in the specified file. ::
:: Then the file is closed and reopened. Next, bits 2-25 are ::
:: used to provide birthdays, then 3-26 and so on to bits 9-32. ::
:: Each set of bits provides a p-value, and the nine p-values ::
:: provide a sample for a KSTEST. ::
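The statistic j described for the birthday spacings test can be sketched in a few lines of Python. This is a minimal illustration of the description only, not DIEHARD's own code: the function names birthday_spacings_j and poisson_mean, and the use of Python's random module as a stand-in generator, are assumptions made for the example.

```python
import random
from collections import Counter


def birthday_spacings_j(birthdays):
    """Count the spacing values that occur more than once (the statistic j)."""
    days = sorted(birthdays)
    # Spacings between consecutive birthdays in sorted order.
    spacings = [b - a for a, b in zip(days, days[1:])]
    counts = Counter(spacings)
    return sum(1 for c in counts.values() if c > 1)


def poisson_mean(m, n):
    """Asymptotic Poisson mean m^3/(4n) quoted in the test description."""
    return m ** 3 / (4 * n)


if __name__ == "__main__":
    # Hypothetical stand-in generator; the real test reads bits from the file.
    rng = random.Random(12345)
    m, n = 2 ** 9, 2 ** 24
    sample = [
        birthday_spacings_j([rng.randrange(n) for _ in range(m)])
        for _ in range(500)
    ]
    print("lambda =", poisson_mean(m, n))
    print("sample mean of j =", sum(sample) / len(sample))
```

With m=2^9 and n=2^24 the Poisson mean evaluates to 2. A sample mean of j from a sound generator should land near that value, which is the yardstick for the per-bit-range means printed in the results.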
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
BIRTHDAY SPACINGS TEST, M= 512 N=2**24 LAMBDA= 2.0000
Results for taco.bin

For a sample of size 500:     mean
taco.bin   using bits  1 to 24   3.002
duplicate   number     number
spacings    observed   expected
    0          30.      67.668
    1          66.     135.335
    2         131.     135.335
    3         104.      90.224
    4          77.      45.112
    5          40.      18.045
6 to INF       52.       8.282
Chisquare with 6 d.o.f. = 338.77 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  2 to 25   2.660
duplicate   number     number
spacings    observed   expected
    0          46.      67.668
    1          79.     135.335
    2         129.     135.335
    3          95.      90.224
    4          88.      45.112
    5          38.      18.045
6 to INF       25.       8.282
Chisquare with 6 d.o.f. = 127.53 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  3 to 26   2.518
duplicate   number     number
spacings    observed   expected
    0          47.      67.668
    1          93.     135.335
    2         130.     135.335
    3         121.      90.224
    4          51.      45.112
    5          29.      18.045
6 to INF       29.       8.282
Chisquare with 6 d.o.f. = 89.51 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  4 to 27   2.616
duplicate   number     number
spacings    observed   expected
    0          30.      67.668
    1          99.     135.335
    2         132.     135.335
    3         106.      90.224
    4          73.      45.112
    5          33.      18.045
6 to INF       27.       8.282
Chisquare with 6 d.o.f. = 105.51 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  5 to 28   2.708
duplicate   number     number
spacings    observed   expected
    0          28.      67.668
    1          95.     135.335
    2         135.     135.335
    3          94.      90.224
    4          82.      45.112
    5          37.      18.045
6 to INF       29.       8.282
Chisquare with 6 d.o.f. = 137.34 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  6 to 29   2.736
duplicate   number     number
spacings    observed   expected
    0          33.      67.668
    1          97.     135.335
    2         108.     135.335
    3         114.      90.224
    4          80.      45.112
    5          33.      18.045
6 to INF       35.       8.282
Chisquare with 6 d.o.f. = 165.98 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  7 to 30   2.604
duplicate   number     number
spacings    observed   expected
    0          24.      67.668
    1         110.     135.335
    2         127.     135.335
    3         117.      90.224
    4          59.      45.112
    5          43.      18.045
6 to INF       20.       8.282
Chisquare with 6 d.o.f. = 96.75 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  8 to 31   2.900
duplicate   number     number
spacings    observed   expected
    0          27.      67.668
    1          84.     135.335
    2         109.     135.335
    3         119.      90.224
    4          75.      45.112
    5          44.      18.045
6 to INF       42.       8.282
Chisquare with 6 d.o.f. = 252.63 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500:     mean
taco.bin   using bits  9 to 32   3.078
duplicate   number     number
spacings    observed   expected
    0          31.      67.668
    1          67.     135.335
    2         105.     135.335
    3         106.      90.224
    4          94.      45.112
    5          54.      18.045
6 to INF       43.       8.282
Chisquare with 6 d.o.f. = 334.10 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
The 9 p-values were
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000
A KSTEST for the 9 p-values yields 1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE OVERLAPPING 5-PERMUTATION TEST ::
:: This is the OPERM5 test. It looks at a sequence of one mill- ::
:: ion 32-bit random integers. Each set of five consecutive ::
:: integers can be in one of 120 states, for the 5! possible or- ::
:: derings of five numbers. Thus the 5th, 6th, 7th,...numbers ::
:: each provide a state. As many thousands of state transitions ::
:: are observed, cumulative counts are made of the number of ::
:: occurrences of each state. ::
:: Then the quadratic form in the ::
:: weak inverse of the 120x120 covariance matrix yields a test ::
:: equivalent to the likelihood ratio test that the 120 cell ::
:: counts came from the specified (asymptotically) normal dis- ::
:: tribution with the specified 120x120 covariance matrix (with ::
:: rank 99). This version uses 1,000,000 integers, twice. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
OPERM5 test for file taco.bin
For a sample of 1,000,000 consecutive 5-tuples,
chisquare for 99 degrees of freedom=163.828; p-value= .999953
OPERM5 test for file taco.bin
For a sample of 1,000,000 consecutive 5-tuples,
chisquare for 99 degrees of freedom=197.946; p-value=1.000000

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 31x31 matrices. The leftmost ::
:: 31 bits of 31 random integers from the test sequence are used ::
:: to form a 31x31 binary matrix over the field {0,1}. The rank ::
:: is determined. That rank can be from 0 to 31, but ranks< 28 ::
:: are rare, and their counts are pooled with those for rank 28. ::
:: Ranks are found for 40,000 such random matrices and a chisqua- ::
:: re test is performed on counts for ranks 31,30,29 and <=28. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Binary rank test for taco.bin
Rank test for 31x31 binary matrices:
rows from leftmost 31 bits of each 32-bit integer
rank   observed   expected   (o-e)^2/e         sum
  28       1055      211.4  **********    3365.988
  29      11976     5134.0  **********   12484.170
  30      23269    23103.0    1.192070   12485.360
  31       3700    11551.5  **********   17822.010
chisquare=****** for 3 d. of f.; p-value=1.000000
--------------------------------------------------------------

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 32x32 matrices. A random 32x ::
:: 32 binary matrix is formed, each row a 32-bit random integer. ::
:: The rank is determined. ::
:: That rank can be from 0 to 32, but ranks ::
:: less than 29 are rare, and their counts are pooled with those ::
:: for rank 29. Ranks are found for 40,000 such random matrices ::
:: and a chisquare test is performed on counts for ranks 32,31, ::
:: 30 and <=29. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Binary rank test for taco.bin
Rank test for 32x32 binary matrices:
rows from leftmost 32 bits of each 32-bit integer
rank   observed   expected   (o-e)^2/e         sum
  29       1112      211.4  **********    3836.229
  30      11790     5134.0  **********   12465.390
  31      23200    23103.0     .406869   12465.800
  32       3898    11551.5  **********   17536.680
chisquare=****** for 3 d. of f.; p-value=1.000000
--------------------------------------------------------------

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 6x8 matrices. From each of ::
:: six random 32-bit integers from the generator under test, a ::
:: specified byte is chosen, and the resulting six bytes form a ::
:: 6x8 binary matrix whose rank is determined. That rank can be ::
:: from 0 to 6, but ranks 0,1,2,3 are rare; their counts are ::
:: pooled with those for rank 4. Ranks are found for 100,000 ::
:: random matrices, and a chi-square test is performed on ::
:: counts for ranks 6,5 and <=4. ::
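The rank computation behind the 6x8 test can be illustrated with Gaussian elimination over the field {0,1}, where row addition is XOR. The sketch below is an assumption-laden illustration, not DIEHARD's routine: the helper names extract_byte, gf2_rank and rank_category are invented for the example, and rows are held as Python integers used as bit masks.

```python
def extract_byte(word, first_bit):
    """Return bits first_bit..first_bit+7 of a 32-bit word.

    Bits are numbered 1..32 from the left, matching the report's
    "bits 1 to 8" ... "bits 25 to 32" labels.
    """
    if not 1 <= first_bit <= 25:
        raise ValueError("first_bit must be in 1..25")
    return (word >> (25 - first_bit)) & 0xFF


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integers (bit masks)."""
    rows = list(rows)
    rank = 0
    for i in range(len(rows)):
        pivot = rows[i]
        if pivot == 0:
            continue  # row already reduced to zero, contributes nothing
        rank += 1
        low = pivot & -pivot  # lowest set bit serves as the pivot column
        for j in range(i + 1, len(rows)):
            if rows[j] & low:
                rows[j] ^= pivot  # XOR is row addition over GF(2)
    return rank


def rank_category(rank):
    """Pool ranks 0..4 into a single cell, as the description specifies."""
    return max(rank, 4)


if __name__ == "__main__":
    # Six arbitrary 32-bit words, chosen only for illustration.
    words = [0x12345678, 0x9ABCDEF0, 0x0F0F0F0F, 0xDEADBEEF, 0x01020304, 0xFFFFFFFF]
    rows = [extract_byte(w, 1) for w in words]
    print(rank_category(gf2_rank(rows)))
```

Tallying rank_category over 100,000 such matrices gives the three observed counts (r<=4, r=5, r=6) that the chi-square test compares with the expected values.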
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Binary Rank Test for taco.bin

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  1 to  8
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1605      944.3     462.271    462.271
r =5         25971    21743.9     821.765   1284.035
r =6         72424    77311.8     309.017   1593.052
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  2 to  9
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1595      944.3     448.383    448.383
r =5         25862    21743.9     779.931   1228.314
r =6         72543    77311.8     294.153   1522.467
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  3 to 10
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1588      944.3     438.788    438.788
r =5         25746    21743.9     736.611   1175.399
r =6         72666    77311.8     279.175   1454.574
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  4 to 11
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1581      944.3     429.297    429.297
r =5         25982    21743.9     826.047   1255.344
r =6         72437    77311.8     307.375   1562.719
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  5 to 12
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1574      944.3     419.909    419.909
r =5         25586    21743.9     678.891   1098.799
r =6         72840    77311.8     258.654   1357.454
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  6 to 13
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1599      944.3     453.913    453.913
r =5         25670    21743.9     708.900   1162.813
r =6         72731    77311.8     271.417   1434.231
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  7 to 14
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1591      944.3     442.888    442.888
r =5         25634    21743.9     695.960   1138.847
r =6         72775    77311.8     266.228   1405.075
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  8 to 15
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1599      944.3     453.913    453.913
r =5         25535    21743.9     660.987   1114.900
r =6         72866    77311.8     255.655   1370.555
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits  9 to 16
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1608      944.3     466.478    466.478
r =5         25694    21743.9     717.594   1184.072
r =6         72698    77311.8     275.342   1459.414
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 10 to 17
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1637      944.3     508.134    508.134
r =5         25854    21743.9     776.904   1285.038
r =6         72509    77311.8     298.362   1583.400
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 11 to 18
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1643      944.3     516.975    516.975
r =5         25881    21743.9     787.145   1304.119
r =6         72476    77311.8     302.477   1606.596
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 12 to 19
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1691      944.3     590.446    590.446
r =5         26018    21743.9     840.140   1430.586
r =6         72291    77311.8     326.063   1756.649
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 13 to 20
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1602      944.3     458.082    458.082
r =5         25924    21743.9     803.592   1261.675
r =6         72474    77311.8     302.727   1564.401
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 14 to 21
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1610      944.3     469.294    469.294
r =5         25896    21743.9     792.863   1262.157
r =6         72494    77311.8     300.229   1562.386
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 15 to 22
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1566      944.3     409.307    409.307
r =5         25812    21743.9     761.107   1170.414
r =6         72622    77311.8     284.488   1454.902
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 16 to 23
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1633      944.3     502.282    502.282
r =5         25916    21743.9     800.519   1302.802
r =6         72451    77311.8     305.612   1608.414
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 17 to 24
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1627      944.3     493.569    493.569
r =5         25944    21743.9     811.300   1304.869
r =6         72429    77311.8     308.385   1613.254
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 18 to 25
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1603      944.3     459.476    459.476
r =5         25544    21743.9     664.129   1123.605
r =6         72853    77311.8     257.153   1380.758
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 19 to 26
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1628      944.3     495.016    495.016
r =5         25657    21743.9     704.214   1199.229
r =6         72715    77311.8     273.317   1472.546
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 20 to 27
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1636      944.3     506.668    506.668
r =5         25951    21743.9     814.007   1320.675
r =6         72413    77311.8     310.409   1631.084
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 21 to 28
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1645      944.3     519.939    519.939
r =5         25852    21743.9     776.148   1296.087
r =6         72503    77311.8     299.108   1595.195
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 22 to 29
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1597      944.3     451.144    451.144
r =5         25927    21743.9     804.746   1255.890
r =6         72476    77311.8     302.477   1558.367
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 23 to 30
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1587      944.3     437.426    437.426
r =5         25892    21743.9     791.336   1228.762
r =6         72521    77311.8     296.873   1525.635
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 24 to 31
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1570      944.3     414.591    414.591
r =5         26165    21743.9     898.924   1313.516
r =6         72265    77311.8     329.448   1642.964
p=1-exp(-SUM/2)=1.00000

Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG taco.bin
b-rank test for bits 25 to 32
          OBSERVED   EXPECTED   (O-E)^2/E        SUM
r<=4          1589      944.3     440.152    440.152
r =5         26033    21743.9     846.048   1286.200
r =6         72378    77311.8     314.860   1601.060
p=1-exp(-SUM/2)=1.00000

TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices
These should be 25 uniform [0,1] random variables:
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000
brank test summary for taco.bin
The KS test for those 25 supposed UNI's yields KS p-value=1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE BITSTREAM TEST ::
:: The file under test is viewed as a stream of bits. Call them ::
:: b1,b2,... . Consider an alphabet with two "letters", 0 and 1 ::
:: and think of the stream of bits as a succession of 20-letter ::
:: "words", overlapping. Thus the first word is b1b2...b20, the ::
:: second is b2b3...b21, and so on. The bitstream test counts ::
:: the number of missing 20-letter (20-bit) words in a string of ::
:: 2^21 overlapping 20-letter words. There are 2^20 possible 20 ::
:: letter words. ::
:: For a truly random string of 2^21+19 bits, the ::
:: number of missing words j should be (very close to) normally ::
:: distributed with mean 141,909 and sigma 428. Thus ::
:: (j-141909)/428 should be a standard normal variate (z score) ::
:: that leads to a uniform [0,1) p value. The test is repeated ::
:: twenty times. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
THE OVERLAPPING 20-tuples BITSTREAM TEST, 20 BITS PER WORD, N words
This test uses N=2^21 and samples the bitstream 20 times.
No. missing words should average 141909. with sigma=428.
---------------------------------------------------------
tst no  1: 150363 missing words, 19.75 sigmas from mean, p-value=1.00000
tst no  2: 151714 missing words, 22.91 sigmas from mean, p-value=1.00000
tst no  3: 150572 missing words, 20.24 sigmas from mean, p-value=1.00000
tst no  4: 150806 missing words, 20.79 sigmas from mean, p-value=1.00000
tst no  5: 150757 missing words, 20.67 sigmas from mean, p-value=1.00000
tst no  6: 151316 missing words, 21.98 sigmas from mean, p-value=1.00000
tst no  7: 151265 missing words, 21.86 sigmas from mean, p-value=1.00000
tst no  8: 151201 missing words, 21.71 sigmas from mean, p-value=1.00000
tst no  9: 150986 missing words, 21.21 sigmas from mean, p-value=1.00000
tst no 10: 151127 missing words, 21.54 sigmas from mean, p-value=1.00000
tst no 11: 150887 missing words, 20.98 sigmas from mean, p-value=1.00000
tst no 12: 151253 missing words, 21.83 sigmas from mean, p-value=1.00000
tst no 13: 150389 missing words, 19.81 sigmas from mean, p-value=1.00000
tst no 14: 151847 missing words, 23.22 sigmas from mean, p-value=1.00000
tst no 15: 150319 missing words, 19.65 sigmas from mean, p-value=1.00000
tst no 16: 150644 missing words, 20.41 sigmas from mean, p-value=1.00000
tst no 17: 150753 missing words, 20.66 sigmas from mean, p-value=1.00000
tst no 18: 151493 missing words, 22.39 sigmas from mean, p-value=1.00000
tst no 19: 150420 missing words, 19.88 sigmas from mean, p-value=1.00000
tst no 20: 151420 missing words, 22.22 sigmas from mean, p-value=1.00000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: The tests OPSO, OQSO and DNA ::
:: OPSO means Overlapping-Pairs-Sparse-Occupancy ::
:: The OPSO test considers 2-letter words from an alphabet of ::
:: 1024 letters. Each letter is determined by a specified ten ::
:: bits from a 32-bit integer in the sequence to be tested. OPSO ::
:: generates 2^21 (overlapping) 2-letter words (from 2^21+1 ::
:: "keystrokes") and counts the number of missing words---that ::
:: is 2-letter words which do not appear in the entire sequence. ::
:: That count should be very close to normally distributed with ::
:: mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should ::
:: be a standard normal variable. The OPSO test takes 32 bits at ::
:: a time from the test file and uses a designated set of ten ::
:: consecutive bits. It then restarts the file for the next de- ::
:: signated 10 bits, and so on. ::
:: ::
:: OQSO means Overlapping-Quadruples-Sparse-Occupancy ::
:: The test OQSO is similar, except that it considers 4-letter ::
:: words from an alphabet of 32 letters, each letter determined ::
:: by a designated string of 5 consecutive bits from the test ::
:: file, elements of which are assumed 32-bit random integers. ::
:: The mean number of missing words in a sequence of 2^21 four- ::
:: letter words, (2^21+3 "keystrokes"), is again 141909, with ::
:: sigma = 295. The mean is based on theory; sigma comes from ::
:: extensive simulation. ::
:: ::
:: The DNA test considers an alphabet of 4 letters: C,G,A,T, ::
:: determined by two designated bits in the sequence of random ::
:: integers being tested. ::
:: It considers 10-letter words, so that ::
:: as in OPSO and OQSO, there are 2^20 possible words, and the ::
:: mean number of missing words from a string of 2^21 (over- ::
:: lapping) 10-letter words (2^21+9 "keystrokes") is 141909. ::
:: The standard deviation sigma=339 was determined as for OQSO ::
:: by simulation. (Sigma for OPSO, 290, is the true value (to ::
:: three places), not determined by simulation.) ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
OPSO test for generator taco.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                            mw        z        p
OPSO for taco.bin using bits 23 to 32   193952  179.457   1.0000
OPSO for taco.bin using bits 22 to 31   189933  165.599   1.0000
OPSO for taco.bin using bits 21 to 30   188694  161.326   1.0000
OPSO for taco.bin using bits 20 to 29   188457  160.509   1.0000
OPSO for taco.bin using bits 19 to 28   189189  163.033   1.0000
OPSO for taco.bin using bits 18 to 27   191188  169.926   1.0000
OPSO for taco.bin using bits 17 to 26   246718  361.409   1.0000
OPSO for taco.bin using bits 16 to 25   246885  361.985   1.0000
OPSO for taco.bin using bits 15 to 24   194474  181.257   1.0000
OPSO for taco.bin using bits 14 to 23   190101  166.178   1.0000
OPSO for taco.bin using bits 13 to 22   189195  163.054   1.0000
OPSO for taco.bin using bits 12 to 21   189517  164.164   1.0000
OPSO for taco.bin using bits 11 to 20   189277  163.337   1.0000
OPSO for taco.bin using bits 10 to 19   191136  169.747   1.0000
OPSO for taco.bin using bits  9 to 18   246320  360.037   1.0000
OPSO for taco.bin using bits  8 to 17   246173  359.530   1.0000
OPSO for taco.bin using bits  7 to 16   193392  177.526   1.0000
OPSO for taco.bin using bits  6 to 15   189313  163.461   1.0000
OPSO for taco.bin using bits  5 to 14   187794  158.223   1.0000
OPSO for taco.bin using bits  4 to 13   187411  156.902   1.0000
OPSO for taco.bin using bits  3 to 12   188061  159.144   1.0000
OPSO for taco.bin using bits  2 to 11   190821  168.661   1.0000
OPSO for taco.bin using bits  1 to 10   246508  360.685   1.0000

OQSO test for generator taco.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                            mw        z        p
OQSO for taco.bin using bits 28 to 32   147273   18.182   1.0000
OQSO for taco.bin using bits 27 to 31   146774   16.490   1.0000
OQSO for taco.bin using bits 26 to 30   147288   18.233   1.0000
OQSO for taco.bin using bits 25 to 29   222402  272.857   1.0000
OQSO for taco.bin using bits 24 to 28   220792  267.399   1.0000
OQSO for taco.bin using bits 23 to 27   221293  269.097   1.0000
OQSO for taco.bin using bits 22 to 26   220700  267.087   1.0000
OQSO for taco.bin using bits 21 to 25   221112  268.484   1.0000
OQSO for taco.bin using bits 20 to 24   147122   17.670   1.0000
OQSO for taco.bin using bits 19 to 23   147260   18.138   1.0000
OQSO for taco.bin using bits 18 to 22   147055   17.443   1.0000
OQSO for taco.bin using bits 17 to 21   223211  275.599   1.0000
OQSO for taco.bin using bits 16 to 20   221857  271.009   1.0000
OQSO for taco.bin using bits 15 to 19   221496  269.785   1.0000
OQSO for taco.bin using bits 14 to 18   221581  270.073   1.0000
OQSO for taco.bin using bits 13 to 17   222408  272.877   1.0000
OQSO for taco.bin using bits 12 to 16   146993   17.233   1.0000
OQSO for taco.bin using bits 11 to 15   147460   18.816   1.0000
OQSO for taco.bin using bits 10 to 14   146955   17.104   1.0000
OQSO for taco.bin using bits  9 to 13   221837  270.941   1.0000
OQSO for taco.bin using bits  8 to 12   220367  265.958   1.0000
OQSO for taco.bin using bits  7 to 11   220340  265.867   1.0000
OQSO for taco.bin using bits  6 to 10   219588  263.318   1.0000
OQSO for taco.bin using bits  5 to  9   220815  267.477   1.0000
OQSO for taco.bin using bits  4 to  8   146703   16.250   1.0000
OQSO for taco.bin using bits  3 to  7   146854   16.762   1.0000
OQSO for taco.bin using bits  2 to  6   147532   19.060   1.0000
OQSO for taco.bin using bits  1 to  5   224613  280.351   1.0000

DNA test for generator taco.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                            mw        z        p
DNA for taco.bin using bits 31 to 32    144080    6.403   1.0000
DNA for taco.bin using bits 30 to 31    142869    2.831    .9977
DNA for taco.bin using bits 29 to 30    143872    5.790   1.0000
DNA for taco.bin using bits 28 to 29    142723    2.400    .9918
DNA for taco.bin using bits 27 to 28    144113    6.501   1.0000
DNA for taco.bin using bits 26 to 27    143000    3.217    .9994
DNA for taco.bin using bits 25 to 26    232167  266.247   1.0000
DNA for taco.bin using bits 24 to 25    232753  267.975   1.0000
DNA for taco.bin using bits 23 to 24    144173    6.677   1.0000
DNA for taco.bin using bits 22 to 23    142468    1.648    .9503
DNA for taco.bin using bits 21 to 22    144040    6.285   1.0000
DNA for taco.bin using bits 20 to 21    143903    5.881   1.0000
DNA for taco.bin using bits 19 to 20    143576    4.916   1.0000
DNA for taco.bin using bits 18 to 19    143227    3.887    .9999
DNA for taco.bin using bits 17 to 18    233801  271.067   1.0000
DNA for taco.bin using bits 16 to 17    233451  270.034   1.0000
DNA for taco.bin using bits 15 to 16    143604    4.999   1.0000
DNA for taco.bin using bits 14 to 15    142145     .695    .7565
DNA for taco.bin using bits 13 to 14    143105    3.527    .9998
DNA for taco.bin using bits 12 to 13    143868    5.778   1.0000
DNA for taco.bin using bits 11 to 12    143905    5.887   1.0000
DNA for taco.bin using bits 10 to 11    143105    3.527    .9998
DNA for taco.bin using bits  9 to 10    231141  263.220   1.0000
DNA for taco.bin using bits  8 to  9    232023  265.822   1.0000
DNA for taco.bin using bits  7 to  8    143367    4.300   1.0000
DNA for taco.bin using bits  6 to  7    143019    3.273    .9995
DNA for taco.bin using bits  5 to  6    144096    6.450   1.0000
DNA for taco.bin using bits  4 to  5    144097    6.453   1.0000
DNA for taco.bin using bits  3 to  4    144069    6.371   1.0000
DNA for taco.bin using bits  2 to  3    143958    6.043   1.0000
DNA for taco.bin using bits  1 to  2    234859  274.188   1.0000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the COUNT-THE-1's TEST on a stream of bytes. ::
:: Consider the file under test as a stream of bytes (four per ::
:: 32 bit integer). Each byte can contain from 0 to 8 1's, ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let ::
:: the stream of bytes provide a string of overlapping 5-letter ::
:: words, each "letter" taking values A,B,C,D,E. The letters are ::
:: determined by the number of 1's in a byte: 0,1, or 2 yield A, ::
:: 3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus ::
:: we have a monkey at a typewriter hitting five keys with vari- ::
:: ous probabilities (37,56,70,56,37 over 256). There are 5^5 ::
:: possible 5-letter words, and from a string of 256,000 (over- ::
:: lapping) 5-letter words, counts are made on the frequencies ::
:: for each word. The quadratic form in the weak inverse of ::
:: the covariance matrix of the cell counts provides a chisquare ::
:: test: Q5-Q4, the difference of the naive Pearson sums of ::
:: (OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Test results for taco.bin
Chi-square with 5^5-5^4=2500 d.of f. for sample size:2560000
                             chisquare  equiv normal  p-value
Results for COUNT-THE-1's in successive bytes:
byte stream for taco.bin      53436.09       720.345  1.000000
byte stream for taco.bin      53105.73       715.673  1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the COUNT-THE-1's TEST for specific bytes. ::
:: Consider the file under test as a stream of 32-bit integers. ::
:: From each integer, a specific byte is chosen, say the left- ::
:: most: bits 1 to 8. Each byte can contain from 0 to 8 1's, ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let ::
:: the specified bytes from successive integers provide a string ::
:: of (overlapping) 5-letter words, each "letter" taking values ::
:: A,B,C,D,E. ::
The letters are determined by the number of 1's, :: :: in that byte:: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D,:: :: and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter :: :: hitting five keys with with various probabilities:: 37,56,70,:: :: 56,37 over 256. There are 5^5 possible 5-letter words, and :: :: from a string of 256,000 (overlapping) 5-letter words, counts :: :: are made on the frequencies for each word. The quadratic form :: :: in the weak inverse of the covariance matrix of the cell :: :: counts provides a chisquare test:: Q5-Q4, the difference of :: :: the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5- :: :: and 4-letter cell counts. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Chi-square with 5^5-5^4=2500 d.of f. for sample size: 256000 chisquare equiv normal p value Results for COUNT-THE-1's in specified bytes: bits 1 to 8 6629.47 58.400 1.000000 bits 2 to 9 6135.87 51.419 1.000000 bits 3 to 10 6186.48 52.135 1.000000 bits 4 to 11 5972.80 49.113 1.000000 bits 5 to 12 5932.05 48.536 1.000000 bits 6 to 13 5973.81 49.127 1.000000 bits 7 to 14 6305.69 53.821 1.000000 bits 8 to 15 6189.37 52.176 1.000000 bits 9 to 16 5826.49 47.044 1.000000 bits 10 to 17 6297.90 53.710 1.000000 bits 11 to 18 6639.71 58.544 1.000000 bits 12 to 19 6578.83 57.683 1.000000 bits 13 to 20 6818.56 61.074 1.000000 bits 14 to 21 6832.88 61.276 1.000000 bits 15 to 22 6683.43 59.163 1.000000 bits 16 to 23 6776.84 60.484 1.000000 bits 17 to 24 6557.44 57.381 1.000000 bits 18 to 25 6419.29 55.427 1.000000 bits 19 to 26 6712.82 59.578 1.000000 bits 20 to 27 6516.66 56.804 1.000000 bits 21 to 28 6748.00 60.076 1.000000 bits 22 to 29 6555.39 57.352 1.000000 bits 23 to 30 6476.00 56.229 1.000000 bits 24 to 31 6269.48 53.308 1.000000 bits 25 to 32 6075.75 50.569 1.000000 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: THIS IS A PARKING LOT TEST :: :: In a square of side 
100, randomly "park" a car---a circle of :: :: radius 1. Then try to park a 2nd, a 3rd, and so on, each :: :: time parking "by ear". That is, if an attempt to park a car :: :: causes a crash with one already parked, try again at a new :: :: random location. (To avoid path problems, consider parking :: :: helicopters rather than cars.) Each attempt leads to either :: :: a crash or a success, the latter followed by an increment to :: :: the list of cars already parked. If we plot n: the number of :: :: attempts, versus k:: the number successfully parked, we get a:: :: curve that should be similar to those provided by a perfect :: :: random number generator. Theory for the behavior of such a :: :: random curve seems beyond reach, and as graphics displays are :: :: not available for this battery of tests, a simple characteriz :: :: ation of the random experiment is used: k, the number of cars :: :: successfully parked after n=12,000 attempts. Simulation shows :: :: that k should average 3523 with sigma 21.9 and is very close :: :: to normally distributed. Thus (k-3523)/21.9 should be a st- :: :: andard normal variable, which, converted to a uniform varia- :: :: ble, provides input to a KSTEST based on a sample of 10. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CDPARK: result of ten tests on file taco.bin Of 12,000 tries, the average no. of successes should be 3523 with sigma=21.9 Successes: 3481 z-score: -1.918 p-value: .027568 Successes: 3342 z-score: -8.265 p-value: .000000 Successes: 3407 z-score: -5.297 p-value: .000000 Successes: 3432 z-score: -4.155 p-value: .000016 Successes: 3399 z-score: -5.662 p-value: .000000 Successes: 3406 z-score: -5.342 p-value: .000000 Successes: 3416 z-score: -4.886 p-value: .000001 Successes: 3351 z-score: -7.854 p-value: .000000 Successes: 3380 z-score: -6.530 p-value: .000000 Successes: 3428 z-score: -4.338 p-value: .000007 square size avg. no. parked sample sigma 100. 
3404.200 38.335
KSTEST for the above 10: p= 1.000000
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE MINIMUM DISTANCE TEST ::
:: It does this 100 times:: choose n=8000 random points in a ::
:: square of side 10000. Find d, the minimum distance between ::
:: the (n^2-n)/2 pairs of points. If the points are truly inde- ::
:: pendent uniform, then d^2, the square of the minimum distance ::
:: should be (very close to) exponentially distributed with mean ::
:: .995 . Thus 1-exp(-d^2/.995) should be uniform on [0,1) and ::
:: a KSTEST on the resulting 100 values serves as a test of uni- ::
:: formity for random points in the square. Test numbers=0 mod 5 ::
:: are printed but the KSTEST is based on the full set of 100 ::
:: random choices of 8000 points in the 10000x10000 square. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
This is the MINIMUM DISTANCE test
for random integers in the file taco.bin
Sample no.     d^2      avg     equiv uni
      5       .9333    .3705    .608585
     10       .0218    .2620    .021636
     15       .4004    .2421    .331277
     20       .1033    .2458    .098608
     25      1.6287    .2845    .805413
     30      2.0371    .3521    .870924
     35       .0085    .3225    .008523
     40       .0314    .3189    .031052
     45       .0856    .3002    .082419
     50       .0337    .3311    .033343
     55       .7027    .3243    .506522
     60       .0047    .3159    .004666
     65       .0082    .3183    .008167
     70      1.0758    .3313    .660821
     75       .1899    .3340    .173776
     80       .7066    .3508    .508453
     85       .9153    .3500    .601441
     90       .0004    .3446    .000423
     95       .0517    .3362    .050627
    100       .3550    .3321    .300048
MINIMUM DISTANCE TEST for taco.bin
Result of KS test on 20 transformed mindist^2's:
p-value=1.000000
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE 3DSPHERES TEST ::
:: Choose 4000 random points in a cube of edge 1000. At each ::
:: point, center a sphere large enough to reach the next closest ::
:: point.
Then the volume of the smallest such sphere is (very ::
:: close to) exponentially distributed with mean 120pi/3. Thus ::
:: the radius cubed is exponential with mean 30. (The mean is ::
:: obtained by extensive simulation). The 3DSPHERES test gener- ::
:: ates 4000 such spheres 20 times. Each min radius cubed leads ::
:: to a uniform variable by means of 1-exp(-r^3/30.), then a ::
:: KSTEST is done on the 20 p-values. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
The 3DSPHERES test for file taco.bin
sample no:  1    r^3=  30.123    p-value= .63362
sample no:  2    r^3=  24.531    p-value= .55855
sample no:  3    r^3=  56.597    p-value= .84841
sample no:  4    r^3=   1.141    p-value= .03733
sample no:  5    r^3=  34.142    p-value= .67956
sample no:  6    r^3=  12.991    p-value= .35146
sample no:  7    r^3=   1.812    p-value= .05862
sample no:  8    r^3=  18.368    p-value= .45789
sample no:  9    r^3=  72.677    p-value= .91131
sample no: 10    r^3=  24.705    p-value= .56111
sample no: 11    r^3=  11.812    p-value= .32547
sample no: 12    r^3=  12.105    p-value= .33202
sample no: 13    r^3=  26.635    p-value= .58846
sample no: 14    r^3=   8.519    p-value= .24720
sample no: 15    r^3=   9.899    p-value= .28105
sample no: 16    r^3=  40.203    p-value= .73818
sample no: 17    r^3=  34.110    p-value= .67922
sample no: 18    r^3=    .437    p-value= .01446
sample no: 19    r^3=   3.985    p-value= .12438
sample no: 20    r^3=  53.355    p-value= .83111
A KS test is applied to those 20 p-values.
---------------------------------------------------------
3DSPHERES test for file taco.bin p-value= .204112
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the SQEEZE test ::
:: Random integers are floated to get uniforms on [0,1). Start- ::
:: ing with k=2^31-1=2147483647, the test finds j, the number of ::
:: iterations necessary to reduce k to 1, using the reduction ::
:: k=ceiling(k*U), with U provided by floating integers from ::
:: the file being tested.
Such j's are found 100,000 times, ::
:: then counts for the number of times j was <=6,7,...,47,>=48 ::
:: are used to provide a chi-square test for cell frequencies. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
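The reduction described in the box above can be sketched in a few lines of Python. This is an illustration only, not the DIEHARD source: the function names squeeze_count and squeeze_cells are hypothetical, the uniforms come from whatever callable is passed in rather than from a file of integers, and clamping k to 1 when a draw is exactly 0 is an assumption. The expected cell probabilities needed for the chi-square step are not reproduced here.

```python
import math

K_START = 2147483647  # starting value of k quoted in the test description


def squeeze_count(next_uniform, k=K_START):
    """Return j, the number of steps k = ceiling(k*U) needed to reduce k to 1."""
    j = 0
    while k > 1:
        # Assumption: a draw of exactly 0.0 would give k = 0, so clamp to 1.
        k = max(1, math.ceil(k * next_uniform()))
        j += 1
    return j


def squeeze_cells(next_uniform, trials):
    """Tally j over many trials into the 43 cells <=6, 7, ..., 47, >=48.

    The returned dict is keyed 6..48; key 6 holds every j <= 6 and
    key 48 holds every j >= 48.
    """
    counts = {cell: 0 for cell in range(6, 49)}
    for _ in range(trials):
        j = squeeze_count(next_uniform)
        counts[min(max(j, 6), 48)] += 1
    return counts
```

For a quick look at the shape of the cell counts, something like squeeze_cells(random.Random(1).random, 100000) could be used, with the standard library generator standing in for the file under test.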