NOTE: Most of the tests in DIEHARD return a p-value, which should be
uniform on [0,1) if the input file contains truly independent random
bits. Those p-values are obtained by p=F(X), where F is the assumed
distribution of the sample random variable X---often normal. But that
assumed F is just an asymptotic approximation, for which the fit will
be worst in the tails. Thus you should not be surprised by occasional
p-values near 0 or 1, such as .0012 or .9983. When a bit stream really
FAILS BIG, you will get p's of 0 or 1 to six or more places. By all
means, do not, as a Statistician might, think that a p < .025 or
p > .975 means that the RNG has "failed the test at the .05 level".
Such p's happen among the hundreds that DIEHARD produces, even with
good RNG's. So keep in mind that "p happens".
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::              This is the BIRTHDAY SPACINGS TEST               ::
:: Choose m birthdays in a year of n days. List the spacings     ::
:: between the birthdays. If j is the number of values that      ::
:: occur more than once in that list, then j is asymptotically   ::
:: Poisson distributed with mean m^3/(4n). Experience shows n    ::
:: must be quite large, say n>=2^18, for comparing the results   ::
:: to the Poisson distribution with that mean. This test uses    ::
:: n=2^24 and m=2^9, so that the underlying distribution for j   ::
:: is taken to be Poisson with lambda=2^27/(2^26)=2. A sample    ::
:: of 500 j's is taken, and a chi-square goodness of fit test    ::
:: provides a p value. The first test uses bits 1-24 (counting   ::
:: from the left) from integers in the specified file.           ::
:: Then the file is closed and reopened. Next, bits 2-25 are     ::
:: used to provide birthdays, then 3-26 and so on to bits 9-32.  ::
:: Each set of bits provides a p-value, and the nine p-values    ::
:: provide a sample for a KSTEST.
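The arithmetic described for the birthday spacings test can be sketched in Python as follows. This is an illustration only, not DIEHARD's own code: the function names are hypothetical, the pooling of j>=6 into one cell is taken from the results table, and the observed counts are the ones printed for bits 1 to 24. The closed-form chi-square CDF used here holds only for an even number of degrees of freedom.

```python
import math

def poisson_expected_counts(lam=2.0, sample_size=500, pooled_from=6):
    """Expected counts for j = 0..pooled_from-1, plus one pooled tail cell."""
    pmf = [math.exp(-lam) * lam**j / math.factorial(j) for j in range(pooled_from)]
    tail = 1.0 - sum(pmf)  # probability of j >= pooled_from
    return [sample_size * p for p in pmf + [tail]]

def pearson_chisquare(observed, expected):
    """Naive Pearson sum of (O-E)^2/E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chisquare_cdf_even_dof(x, dof):
    """Chi-square CDF in closed form; valid only when dof is even."""
    half = x / 2.0
    terms = sum(half**k / math.factorial(k) for k in range(dof // 2))
    return 1.0 - math.exp(-half) * terms

expected = poisson_expected_counts()
observed = [28, 95, 122, 110, 69, 45, 31]  # bits 1 to 24, copied from the results
chisq = pearson_chisquare(observed, expected)
p_value = chisquare_cdf_even_dof(chisq, dof=6)
print([round(e, 3) for e in expected])
print(round(chisq, 2), round(p_value, 6))
```

With 7 cells and no estimated parameters the statistic has 6 degrees of freedom, which is why the even-dof shortcut applies here.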
:: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: BIRTHDAY SPACINGS TEST, M= 512 N=2**24 LAMBDA= 2.0000 Results for tacopm.bin For a sample of size 500: mean tacopm.bin using bits 1 to 24 2.782 duplicate number number spacings observed expected 0 28. 67.668 1 95. 135.335 2 122. 135.335 3 110. 90.224 4 69. 45.112 5 45. 18.045 6 to INF 31. 8.282 Chisquare with 6 d.o.f. = 156.16 p-value= 1.000000 ::::::::::::::::::::::::::::::::::::::::: For a sample of size 500: mean tacopm.bin using bits 2 to 25 2.472 duplicate number number spacings observed expected 0 50. 67.668 1 101. 135.335 2 119. 135.335 3 114. 90.224 4 61. 45.112 5 33. 18.045 6 to INF 22. 8.282 Chisquare with 6 d.o.f. = 62.28 p-value= 1.000000 ::::::::::::::::::::::::::::::::::::::::: For a sample of size 500: mean tacopm.bin using bits 3 to 26 2.446 duplicate number number spacings observed expected 0 48. 67.668 1 106. 135.335 2 133. 135.335 3 98. 90.224 4 57. 45.112 5 32. 18.045 6 to INF 26. 8.282 Chisquare with 6 d.o.f. = 64.62 p-value= 1.000000 ::::::::::::::::::::::::::::::::::::::::: For a sample of size 500: mean tacopm.bin using bits 4 to 27 2.522 duplicate number number spacings observed expected 0 52. 67.668 1 93. 135.335 2 127. 135.335 3 93. 90.224 4 73. 45.112 5 45. 18.045 6 to INF 17. 8.282 Chisquare with 6 d.o.f. = 84.15 p-value= 1.000000 ::::::::::::::::::::::::::::::::::::::::: For a sample of size 500: mean tacopm.bin using bits 5 to 28 2.360 duplicate number number spacings observed expected 0 46. 67.668 1 111. 135.335 2 136. 135.335 3 95. 90.224 4 71. 45.112 5 23. 18.045 6 to INF 18. 8.282 Chisquare with 6 d.o.f. = 39.19 p-value= .999999 ::::::::::::::::::::::::::::::::::::::::: For a sample of size 500: mean tacopm.bin using bits 6 to 29 2.292 duplicate number number spacings observed expected 0 57. 67.668 1 102. 135.335 2 131. 135.335 3 111. 90.224 4 62. 45.112 5 23. 18.045 6 to INF 14. 8.282 Chisquare with 6 d.o.f. 
= 26.45 p-value= .999816
:::::::::::::::::::::::::::::::::::::::::
           For a sample of size 500:     mean
           tacopm.bin using bits  7 to 30   2.428
  duplicate       number       number
  spacings       observed     expected
         0          39.        67.668
         1         117.       135.335
         2         124.       135.335
         3         102.        90.224
         4          71.        45.112
         5          29.        18.045
  6 to INF          18.         8.282
 Chisquare with 6 d.o.f. = 50.03 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
           For a sample of size 500:     mean
           tacopm.bin using bits  8 to 31   2.484
  duplicate       number       number
  spacings       observed     expected
         0          46.        67.668
         1         110.       135.335
         2         122.       135.335
         3         102.        90.224
         4          61.        45.112
         5          31.        18.045
  6 to INF          28.         8.282
 Chisquare with 6 d.o.f. = 76.38 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
           For a sample of size 500:     mean
           tacopm.bin using bits  9 to 32   2.754
  duplicate       number       number
  spacings       observed     expected
         0          46.        67.668
         1          89.       135.335
         2         110.       135.335
         3          99.        90.224
         4          68.        45.112
         5          50.        18.045
  6 to INF          38.         8.282
 Chisquare with 6 d.o.f. = 203.24 p-value= 1.000000
:::::::::::::::::::::::::::::::::::::::::
   The 9 p-values were
        1.000000   1.000000   1.000000   1.000000    .999999
         .999816   1.000000   1.000000   1.000000
  A KSTEST for the 9 p-values yields 1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::              THE OVERLAPPING 5-PERMUTATION TEST               ::
:: This is the OPERM5 test. It looks at a sequence of one mill-  ::
:: ion 32-bit random integers. Each set of five consecutive      ::
:: integers can be in one of 120 states, for the 5! possible or- ::
:: derings of five numbers. Thus the 5th, 6th, 7th,...numbers    ::
:: each provide a state. As many thousands of state transitions  ::
:: are observed, cumulative counts are made of the number of     ::
:: occurrences of each state.
Then the quadratic form in the :: :: weak inverse of the 120x120 covariance matrix yields a test :: :: equivalent to the likelihood ratio test that the 120 cell :: :: counts came from the specified (asymptotically) normal dis- :: :: tribution with the specified 120x120 covariance matrix (with :: :: rank 99). This version uses 1,000,000 integers, twice. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: OPERM5 test for file tacopm.bin For a sample of 1,000,000 consecutive 5-tuples, chisquare for 99 degrees of freedom=155.308; p-value= .999735 OPERM5 test for file tacopm.bin For a sample of 1,000,000 consecutive 5-tuples, chisquare for 99 degrees of freedom=108.581; p-value= .760336 ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: This is the BINARY RANK TEST for 31x31 matrices. The leftmost :: :: 31 bits of 31 random integers from the test sequence are used :: :: to form a 31x31 binary matrix over the field {0,1}. The rank :: :: is determined. That rank can be from 0 to 31, but ranks< 28 :: :: are rare, and their counts are pooled with those for rank 28. :: :: Ranks are found for 40,000 such random matrices and a chisqua-:: :: re test is performed on counts for ranks 31,30,29 and <=28. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Binary rank test for tacopm.bin Rank test for 31x31 binary matrices: rows from leftmost 31 bits of each 32-bit integer rank observed expected (o-e)^2/e sum 28 1066 211.4********** 3454.343 29 12052 5134.0**********12776.210 30 23129 23103.0 .02915512776.240 31 3753 11551.5**********18041.090 chisquare=****** for 3 d. of f.; p-value=1.000000 -------------------------------------------------------------- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: This is the BINARY RANK TEST for 32x32 matrices. A random 32x :: :: 32 binary matrix is formed, each row a 32-bit random integer. :: :: The rank is determined. 
That rank can be from 0 to 32, ranks :: :: less than 29 are rare, and their counts are pooled with those :: :: for rank 29. Ranks are found for 40,000 such random matrices :: :: and a chisquare test is performed on counts for ranks 32,31, :: :: 30 and <=29. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Binary rank test for tacopm.bin Rank test for 32x32 binary matrices: rows from leftmost 32 bits of each 32-bit integer rank observed expected (o-e)^2/e sum 29 1074 211.4********** 3519.320 30 11895 5134.0**********12422.880 31 23232 23103.0 .71977112423.600 32 3799 11551.5**********17626.520 chisquare=****** for 3 d. of f.; p-value=1.000000 -------------------------------------------------------------- $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: This is the BINARY RANK TEST for 6x8 matrices. From each of :: :: six random 32-bit integers from the generator under test, a :: :: specified byte is chosen, and the resulting six bytes form a :: :: 6x8 binary matrix whose rank is determined. That rank can be :: :: from 0 to 6, but ranks 0,1,2,3 are rare; their counts are :: :: pooled with those for rank 4. Ranks are found for 100,000 :: :: random matrices, and a chi-square test is performed on :: :: counts for ranks 6,5 and <=4. 
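The steps described for the 6x8 rank test can be sketched in Python as below. This is a hedged illustration, not the DIEHARD implementation: all function names are hypothetical, bit positions are assumed to count from the left (as the birthday spacings description states), and the expected counts and the first observed triple are copied from the results that follow. The p-value formula p=1-exp(-SUM/2) is the one printed in those results; it is the chi-square CDF for 2 degrees of freedom.

```python
import math

def select_byte(word, first_bit):
    """Bits first_bit..first_bit+7 of a 32-bit word, bit 1 being the leftmost (assumed)."""
    return (word >> (25 - first_bit)) & 0xFF

def gf2_rank(rows):
    """Rank over the field {0,1} of a matrix whose rows are integer bit masks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low_bit = pivot & -pivot  # lowest set bit serves as the pivot column
            rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank

# Expected counts for r<=4, r=5, r=6 out of 100,000 matrices, as printed in the results.
EXPECTED_6X8 = (944.3, 21743.9, 77311.8)

def rank_test_p_value(observed, expected=EXPECTED_6X8):
    """Return (SUM, p) with SUM the Pearson sum and p = 1 - exp(-SUM/2)."""
    total = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return total, 1.0 - math.exp(-total / 2.0)

total, p = rank_test_p_value((1576, 25678, 72746))  # observed counts for bits 1 to 8
print(round(total, 3), p)
```

A matrix for the test would be built from six calls to select_byte on six successive integers, then passed to gf2_rank.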
:: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Binary Rank Test for tacopm.bin Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 1 to 8 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1576 944.3 422.580 422.580 r =5 25678 21743.9 711.792 1134.373 r =6 72746 77311.8 269.643 1404.016 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 2 to 9 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1625 944.3 490.681 490.681 r =5 26011 21743.9 837.391 1328.072 r =6 72364 77311.8 316.650 1644.722 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 3 to 10 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1558 944.3 398.841 398.841 r =5 25988 21743.9 828.388 1227.229 r =6 72454 77311.8 305.235 1532.464 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 4 to 11 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1665 944.3 550.043 550.043 r =5 25784 21743.9 750.666 1300.709 r =6 72551 77311.8 293.167 1593.876 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 5 to 12 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1585 944.3 434.708 434.708 r =5 25970 21743.9 821.376 1256.083 r =6 72445 77311.8 306.367 1562.451 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 6 to 13 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1580 944.3 427.949 427.949 r =5 26114 21743.9 878.305 1306.254 r =6 72306 77311.8 324.117 1630.371 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 7 to 14 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1634 944.3 503.742 503.742 r =5 26130 21743.9 884.748 1388.490 r =6 72236 77311.8 333.245 1721.735 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary 
matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 8 to 15 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1723 944.3 642.138 642.138 r =5 26015 21743.9 838.961 1481.099 r =6 72262 77311.8 329.840 1810.939 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 9 to 16 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1702 944.3 607.971 607.971 r =5 26136 21743.9 887.170 1495.141 r =6 72162 77311.8 343.033 1838.174 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 10 to 17 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1622 944.3 486.366 486.366 r =5 26012 21743.9 837.783 1324.149 r =6 72366 77311.8 316.394 1640.543 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 11 to 18 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1597 944.3 451.144 451.144 r =5 25934 21743.9 807.442 1258.586 r =6 72469 77311.8 303.353 1561.938 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 12 to 19 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1660 944.3 542.438 542.438 r =5 26021 21743.9 841.320 1383.758 r =6 72319 77311.8 322.436 1706.194 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 13 to 20 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1629 944.3 496.465 496.465 r =5 25878 21743.9 786.003 1282.468 r =6 72493 77311.8 300.354 1582.822 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 14 to 21 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1589 944.3 440.152 440.152 r =5 25777 21743.9 748.067 1188.219 r =6 72634 77311.8 283.034 1471.253 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 15 to 22 OBSERVED EXPECTED (O-E)^2/E 
SUM r<=4 1644 944.3 518.456 518.456 r =5 26029 21743.9 844.470 1362.926 r =6 72327 77311.8 321.403 1684.329 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 16 to 23 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1526 944.3 358.332 358.332 r =5 26025 21743.9 842.894 1201.227 r =6 72449 77311.8 305.864 1507.090 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 17 to 24 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1612 944.3 472.118 472.118 r =5 25959 21743.9 817.106 1289.224 r =6 72429 77311.8 308.385 1597.608 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 18 to 25 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1556 944.3 396.246 396.246 r =5 25523 21743.9 656.809 1053.055 r =6 72921 77311.8 249.369 1302.424 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 19 to 26 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1610 944.3 469.294 469.294 r =5 25496 21743.9 647.458 1116.751 r =6 72894 77311.8 252.445 1369.197 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 20 to 27 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1602 944.3 458.082 458.082 r =5 25947 21743.9 812.460 1270.542 r =6 72451 77311.8 305.612 1576.154 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 21 to 28 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1555 944.3 394.951 394.951 r =5 25660 21743.9 705.294 1100.245 r =6 72785 77311.8 265.056 1365.301 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 22 to 29 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1607 944.3 465.074 465.074 r =5 25868 21743.9 782.205 1247.279 r =6 72525 77311.8 296.378 1543.657 
p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 23 to 30 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1616 944.3 477.792 477.792 r =5 25791 21743.9 753.269 1231.061 r =6 72593 77311.8 288.017 1519.078 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 24 to 31 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1604 944.3 460.872 460.872 r =5 25721 21743.9 727.437 1188.310 r =6 72675 77311.8 278.094 1466.404 p=1-exp(-SUM/2)=1.00000 Rank of a 6x8 binary matrix, rows formed from eight bits of the RNG tacopm.bin b-rank test for bits 25 to 32 OBSERVED EXPECTED (O-E)^2/E SUM r<=4 1644 944.3 518.456 518.456 r =5 25780 21743.9 749.180 1267.636 r =6 72576 77311.8 290.096 1557.732 p=1-exp(-SUM/2)=1.00000 TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices These should be 25 uniform [0,1] random variables: 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 brank test summary for tacopm.bin The KS test for those 25 supposed UNI's yields KS p-value=1.000000 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: THE BITSTREAM TEST :: :: The file under test is viewed as a stream of bits. Call them :: :: b1,b2,... . Consider an alphabet with two "letters", 0 and 1 :: :: and think of the stream of bits as a succession of 20-letter :: :: "words", overlapping. Thus the first word is b1b2...b20, the :: :: second is b2b3...b21, and so on. The bitstream test counts :: :: the number of missing 20-letter (20-bit) words in a string of :: :: 2^21 overlapping 20-letter words. There are 2^20 possible 20 :: :: letter words. 
For a truly random string of 2^21+19 bits, the :: :: number of missing words j should be (very close to) normally :: :: distributed with mean 141,909 and sigma 428. Thus :: :: (j-141909)/428 should be a standard normal variate (z score) :: :: that leads to a uniform [0,1) p value. The test is repeated :: :: twenty times. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: THE OVERLAPPING 20-tuples BITSTREAM TEST, 20 BITS PER WORD, N words This test uses N=2^21 and samples the bitstream 20 times. No. missing words should average 141909. with sigma=428. --------------------------------------------------------- tst no 1: 150330 missing words, 19.67 sigmas from mean, p-value=1.00000 tst no 2: 151374 missing words, 22.11 sigmas from mean, p-value=1.00000 tst no 3: 151353 missing words, 22.06 sigmas from mean, p-value=1.00000 tst no 4: 150704 missing words, 20.55 sigmas from mean, p-value=1.00000 tst no 5: 151262 missing words, 21.85 sigmas from mean, p-value=1.00000 tst no 6: 150003 missing words, 18.91 sigmas from mean, p-value=1.00000 tst no 7: 151092 missing words, 21.45 sigmas from mean, p-value=1.00000 tst no 8: 150869 missing words, 20.93 sigmas from mean, p-value=1.00000 tst no 9: 150682 missing words, 20.50 sigmas from mean, p-value=1.00000 tst no 10: 151202 missing words, 21.71 sigmas from mean, p-value=1.00000 tst no 11: 150746 missing words, 20.65 sigmas from mean, p-value=1.00000 tst no 12: 151312 missing words, 21.97 sigmas from mean, p-value=1.00000 tst no 13: 151364 missing words, 22.09 sigmas from mean, p-value=1.00000 tst no 14: 151685 missing words, 22.84 sigmas from mean, p-value=1.00000 tst no 15: 152002 missing words, 23.58 sigmas from mean, p-value=1.00000 tst no 16: 151515 missing words, 22.44 sigmas from mean, p-value=1.00000 tst no 17: 150591 missing words, 20.28 sigmas from mean, p-value=1.00000 tst no 18: 151179 missing words, 21.66 sigmas from mean, p-value=1.00000 tst no 19: 151342 missing words, 22.04 sigmas from mean, 
p-value=1.00000 tst no 20: 151192 missing words, 21.69 sigmas from mean, p-value=1.00000 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: The tests OPSO, OQSO and DNA :: :: OPSO means Overlapping-Pairs-Sparse-Occupancy :: :: The OPSO test considers 2-letter words from an alphabet of :: :: 1024 letters. Each letter is determined by a specified ten :: :: bits from a 32-bit integer in the sequence to be tested. OPSO :: :: generates 2^21 (overlapping) 2-letter words (from 2^21+1 :: :: "keystrokes") and counts the number of missing words---that :: :: is 2-letter words which do not appear in the entire sequence. :: :: That count should be very close to normally distributed with :: :: mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should :: :: be a standard normal variable. The OPSO test takes 32 bits at :: :: a time from the test file and uses a designated set of ten :: :: consecutive bits. It then restarts the file for the next de- :: :: signated 10 bits, and so on. :: :: :: :: OQSO means Overlapping-Quadruples-Sparse-Occupancy :: :: The test OQSO is similar, except that it considers 4-letter :: :: words from an alphabet of 32 letters, each letter determined :: :: by a designated string of 5 consecutive bits from the test :: :: file, elements of which are assumed 32-bit random integers. :: :: The mean number of missing words in a sequence of 2^21 four- :: :: letter words, (2^21+3 "keystrokes"), is again 141909, with :: :: sigma = 295. The mean is based on theory; sigma comes from :: :: extensive simulation. :: :: :: :: The DNA test considers an alphabet of 4 letters:: C,G,A,T,:: :: determined by two designated bits in the sequence of random :: :: integers being tested. 
It considers 10-letter words, so that ::
:: as in OPSO and OQSO, there are 2^20 possible words, and the   ::
:: mean number of missing words from a string of 2^21 (over-     ::
:: lapping) 10-letter words (2^21+9 "keystrokes") is 141909.     ::
:: The standard deviation sigma=339 was determined as for OQSO   ::
:: by simulation. (Sigma for OPSO, 290, is the true value (to    ::
:: three places), not determined by simulation.)                 ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
 OPSO test for generator tacopm.bin
 Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                                  mw        z       p
    OPSO for tacopm.bin using bits 23 to 32   194013  179.668  1.0000
    OPSO for tacopm.bin using bits 22 to 31   189616  164.506  1.0000
    OPSO for tacopm.bin using bits 21 to 30   188588  160.961  1.0000
    OPSO for tacopm.bin using bits 20 to 29   188536  160.782  1.0000
    OPSO for tacopm.bin using bits 19 to 28   189222  163.147  1.0000
    OPSO for tacopm.bin using bits 18 to 27   191136  169.747  1.0000
    OPSO for tacopm.bin using bits 17 to 26   245029  355.585  1.0000
    OPSO for tacopm.bin using bits 16 to 25   245018  355.547  1.0000
    OPSO for tacopm.bin using bits 15 to 24   192409  174.137  1.0000
    OPSO for tacopm.bin using bits 14 to 23   188676  161.264  1.0000
    OPSO for tacopm.bin using bits 13 to 22   188205  159.640  1.0000
    OPSO for tacopm.bin using bits 12 to 21   187842  158.389  1.0000
    OPSO for tacopm.bin using bits 11 to 20   188221  159.695  1.0000
    OPSO for tacopm.bin using bits 10 to 19   190467  167.440  1.0000
    OPSO for tacopm.bin using bits  9 to 18   247649  364.620  1.0000
    OPSO for tacopm.bin using bits  8 to 17   247479  364.033  1.0000
    OPSO for tacopm.bin using bits  7 to 16   195961  186.385  1.0000
    OPSO for tacopm.bin using bits  6 to 15   192062  172.940  1.0000
    OPSO for tacopm.bin using bits  5 to 14   190935  169.054  1.0000
    OPSO for tacopm.bin using bits  4 to 13   190589  167.861  1.0000
    OPSO for tacopm.bin using bits  3 to 12   191203  169.978  1.0000
    OPSO for tacopm.bin using bits  2 to 11   193134  176.637  1.0000
    OPSO for tacopm.bin using bits  1 to 10   249326  370.402  1.0000
 OQSO test for generator tacopm.bin
 Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                                  mw        z       p
    OQSO for tacopm.bin using bits 28 to 32   147239   18.067  1.0000
    OQSO for tacopm.bin using bits 27 to 31   147097   17.585  1.0000
    OQSO for tacopm.bin using bits 26 to 30   147194   17.914  1.0000
    OQSO for tacopm.bin using bits 25 to 29   222134  271.948  1.0000
    OQSO for tacopm.bin using bits 24 to 28   220947  267.924  1.0000
    OQSO for tacopm.bin using bits 23 to 27   220824  267.507  1.0000
    OQSO for tacopm.bin using bits 22 to 26   221364  269.338  1.0000
    OQSO for tacopm.bin using bits 21 to 25   221761  270.684  1.0000
    OQSO for tacopm.bin using bits 20 to 24   146394   15.202  1.0000
    OQSO for tacopm.bin using bits 19 to 23   147007   17.280  1.0000
    OQSO for tacopm.bin using bits 18 to 22   146638   16.029  1.0000
    OQSO for tacopm.bin using bits 17 to 21   220424  266.151  1.0000
    OQSO for tacopm.bin using bits 16 to 20   219501  263.023  1.0000
    OQSO for tacopm.bin using bits 15 to 19   219415  262.731  1.0000
    OQSO for tacopm.bin using bits 14 to 18   218867  260.873  1.0000
    OQSO for tacopm.bin using bits 13 to 17   219291  262.311  1.0000
    OQSO for tacopm.bin using bits 12 to 16   146722   16.314  1.0000
    OQSO for tacopm.bin using bits 11 to 15   146968   17.148  1.0000
    OQSO for tacopm.bin using bits 10 to 14   147001   17.260  1.0000
    OQSO for tacopm.bin using bits  9 to 13   225834  284.490  1.0000
    OQSO for tacopm.bin using bits  8 to 12   223626  277.006  1.0000
    OQSO for tacopm.bin using bits  7 to 11   223615  276.968  1.0000
    OQSO for tacopm.bin using bits  6 to 10   224713  280.690  1.0000
    OQSO for tacopm.bin using bits  5 to  9   224609  280.338  1.0000
    OQSO for tacopm.bin using bits  4 to  8   146420   15.290  1.0000
    OQSO for tacopm.bin using bits  3 to  7   147043   17.402  1.0000
    OQSO for tacopm.bin using bits  2 to  6   147219   17.999  1.0000
    OQSO for tacopm.bin using bits  1 to  5   221111  268.480  1.0000
 DNA test for generator tacopm.bin
 Output: No. missing words (mw), equiv normal variate (z), p-value (p)
                                                  mw        z       p
     DNA for tacopm.bin using bits 31 to 32   143590    4.958  1.0000
     DNA for tacopm.bin using bits 30 to 31   142330    1.241   .8927
     DNA for tacopm.bin using bits 29 to 30   143978    6.102  1.0000
     DNA for tacopm.bin using bits 28 to 29   144038    6.279  1.0000
     DNA for tacopm.bin using bits 27 to 28   143163    3.698   .9999
     DNA for tacopm.bin using bits 26 to 27   143617    5.037  1.0000
     DNA for tacopm.bin using bits 25 to 26   232116  266.096  1.0000
     DNA for tacopm.bin using bits 24 to 25   232981  268.648  1.0000
     DNA for tacopm.bin using bits 23 to 24   142925    2.996   .9986
     DNA for tacopm.bin using bits 22 to 23   142601    2.040   .9793
     DNA for tacopm.bin using bits 21 to 22   144016    6.214  1.0000
     DNA for tacopm.bin using bits 20 to 21   143466    4.592  1.0000
     DNA for tacopm.bin using bits 19 to 20   143611    5.020  1.0000
     DNA for tacopm.bin using bits 18 to 19   143602    4.993  1.0000
     DNA for tacopm.bin using bits 17 to 18   230904  262.521  1.0000
     DNA for tacopm.bin using bits 16 to 17   231076  263.029  1.0000
     DNA for tacopm.bin using bits 15 to 16   144019    6.223  1.0000
     DNA for tacopm.bin using bits 14 to 15   142427    1.527   .9366
     DNA for tacopm.bin using bits 13 to 14   142619    2.093   .9818
     DNA for tacopm.bin using bits 12 to 13   143281    4.046  1.0000
     DNA for tacopm.bin using bits 11 to 12   143713    5.321  1.0000
     DNA for tacopm.bin using bits 10 to 11   143733    5.380  1.0000
     DNA for tacopm.bin using bits  9 to 10   236172  278.061  1.0000
     DNA for tacopm.bin using bits  8 to  9   236227  278.223  1.0000
     DNA for tacopm.bin using bits  7 to  8   144086    6.421  1.0000
     DNA for tacopm.bin using bits  6 to  7   141559   -1.033   .1507
     DNA for tacopm.bin using bits  5 to  6   143738    5.394  1.0000
     DNA for tacopm.bin using bits  4 to  5   143765    5.474  1.0000
     DNA for tacopm.bin using bits  3 to  4   144116    6.509  1.0000
     DNA for tacopm.bin using bits  2 to  3   142826    2.704   .9966
     DNA for tacopm.bin using bits  1 to  2   231671  264.784  1.0000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
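The z and p columns of the OPSO, OQSO and DNA results above can be reproduced approximately with the short Python sketch below. It assumes the rounded mean 141909 and the sigmas 290, 295 and 339 quoted in the test description, and it assumes that p is the standard normal CDF evaluated at z. The function names are hypothetical. The printed z values differ from this sketch in the third decimal, which suggests the program carries a more precise mean than the rounded one quoted in the text.

```python
import math

MEAN_MISSING = 141909.0  # rounded mean from the test description
SIGMA = {"OPSO": 290.0, "OQSO": 295.0, "DNA": 339.0}  # sigmas from the description

def missing_words_z(test, missing_words):
    """Equivalent normal variate z = (mw - mean) / sigma for the named test."""
    return (missing_words - MEAN_MISSING) / SIGMA[test]

def normal_cdf(z):
    """Standard normal CDF, written with erfc so large |z| stays accurate."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

z = missing_words_z("DNA", 142330)  # DNA, bits 30 to 31
p = normal_cdf(z)
print(round(z, 3), round(p, 4))
```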
::     This is the COUNT-THE-1's TEST on a stream of bytes.      ::
:: Consider the file under test as a stream of bytes (four per   ::
:: 32 bit integer). Each byte can contain from 0 to 8 1's,       ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let   ::
:: the stream of bytes provide a string of overlapping 5-letter  ::
:: words, each "letter" taking values A,B,C,D,E. The letters are ::
:: determined by the number of 1's in a byte: 0,1,or 2 yield A,  ::
:: 3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus ::
:: we have a monkey at a typewriter hitting five keys with vari- ::
:: ous probabilities (37,56,70,56,37 over 256). There are 5^5    ::
:: possible 5-letter words, and from a string of 256,000 (over-  ::
:: lapping) 5-letter words, counts are made on the frequencies   ::
:: for each word. The quadratic form in the weak inverse of      ::
:: the covariance matrix of the cell counts provides a chisquare ::
:: test: Q5-Q4, the difference of the naive Pearson sums of      ::
:: (OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts.    ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
   Test results for tacopm.bin
 Chi-square with 5^5-5^4=2500 d.of f. for sample size:2560000
                               chisquare  equiv normal  p-value
  Results for COUNT-THE-1's in successive bytes:
 byte stream for tacopm.bin     53987.56     728.144    1.000000
 byte stream for tacopm.bin     52400.63     705.701    1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::      This is the COUNT-THE-1's TEST for specific bytes.       ::
:: Consider the file under test as a stream of 32-bit integers.  ::
:: From each integer, a specific byte is chosen, say the left-   ::
:: most: bits 1 to 8. Each byte can contain from 0 to 8 1's,     ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let   ::
:: the specified bytes from successive integers provide a string ::
:: of (overlapping) 5-letter words, each "letter" taking values  ::
:: A,B,C,D,E. The letters are determined by the number of 1's    ::
:: in that byte: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D,  ::
:: and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter    ::
:: hitting five keys with various probabilities: 37,56,70,       ::
:: 56,37 over 256. There are 5^5 possible 5-letter words, and    ::
:: from a string of 256,000 (overlapping) 5-letter words, counts ::
:: are made on the frequencies for each word. The quadratic form ::
:: in the weak inverse of the covariance matrix of the cell      ::
:: counts provides a chisquare test: Q5-Q4, the difference of    ::
:: the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5-    ::
:: and 4-letter cell counts.                                     ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
 Chi-square with 5^5-5^4=2500 d.of f. for sample size: 256000
                      chisquare  equiv normal  p value
  Results for COUNT-THE-1's in specified bytes:
       bits  1 to  8    5810.39      46.816    1.000000
       bits  2 to  9    6462.01      56.031    1.000000
       bits  3 to 10    6855.96      61.603    1.000000
       bits  4 to 11    6584.09      57.758    1.000000
       bits  5 to 12    6485.87      56.369    1.000000
       bits  6 to 13    6530.28      56.997    1.000000
       bits  7 to 14    6650.90      58.703    1.000000
       bits  8 to 15    6731.42      59.841    1.000000
       bits  9 to 16    6216.08      52.553    1.000000
       bits 10 to 17    6502.87      56.609    1.000000
       bits 11 to 18    6447.87      55.831    1.000000
       bits 12 to 19    6594.12      57.900    1.000000
       bits 13 to 20    6701.60      59.420    1.000000
       bits 14 to 21    6644.01      58.605    1.000000
       bits 15 to 22    6649.01      58.676    1.000000
       bits 16 to 23    6572.08      57.588    1.000000
       bits 17 to 24    6256.97      53.132    1.000000
       bits 18 to 25    6114.81      51.121    1.000000
       bits 19 to 26    6465.08      56.075    1.000000
       bits 20 to 27    6095.99      50.855    1.000000
       bits 21 to 28    6252.26      53.065    1.000000
       bits 22 to 29    6140.93      51.490    1.000000
       bits 23 to 30    6498.32      56.545    1.000000
       bits 24 to 31    6033.50      49.971    1.000000
       bits 25 to 32    6064.94      50.416    1.000000

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::                  THIS IS A PARKING LOT TEST                   ::
:: In a square of side
100, randomly "park" a car---a circle of :: :: radius 1. Then try to park a 2nd, a 3rd, and so on, each :: :: time parking "by ear". That is, if an attempt to park a car :: :: causes a crash with one already parked, try again at a new :: :: random location. (To avoid path problems, consider parking :: :: helicopters rather than cars.) Each attempt leads to either :: :: a crash or a success, the latter followed by an increment to :: :: the list of cars already parked. If we plot n: the number of :: :: attempts, versus k:: the number successfully parked, we get a:: :: curve that should be similar to those provided by a perfect :: :: random number generator. Theory for the behavior of such a :: :: random curve seems beyond reach, and as graphics displays are :: :: not available for this battery of tests, a simple characteriz :: :: ation of the random experiment is used: k, the number of cars :: :: successfully parked after n=12,000 attempts. Simulation shows :: :: that k should average 3523 with sigma 21.9 and is very close :: :: to normally distributed. Thus (k-3523)/21.9 should be a st- :: :: andard normal variable, which, converted to a uniform varia- :: :: ble, provides input to a KSTEST based on a sample of 10. :: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CDPARK: result of ten tests on file tacopm.bin Of 12,000 tries, the average no. of successes should be 3523 with sigma=21.9 Successes: 3390 z-score: -6.073 p-value: .000000 Successes: 3408 z-score: -5.251 p-value: .000000 Successes: 3485 z-score: -1.735 p-value: .041356 Successes: 3363 z-score: -7.306 p-value: .000000 Successes: 3412 z-score: -5.068 p-value: .000000 Successes: 3428 z-score: -4.338 p-value: .000007 Successes: 3418 z-score: -4.795 p-value: .000001 Successes: 3414 z-score: -4.977 p-value: .000000 Successes: 3478 z-score: -2.055 p-value: .019949 Successes: 3421 z-score: -4.658 p-value: .000002 square size avg. no. parked sample sigma 100. 
3421.700 34.645
       KSTEST for the above 10: p= 1.000000
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::                   THE MINIMUM DISTANCE TEST                   ::
:: It does this 100 times: choose n=8000 random points in a      ::
:: square of side 10000. Find d, the minimum distance between    ::
:: the (n^2-n)/2 pairs of points. If the points are truly inde-  ::
:: pendent uniform, then d^2, the square of the minimum distance ::
:: should be (very close to) exponentially distributed with mean ::
:: .995. Thus 1-exp(-d^2/.995) should be uniform on [0,1) and    ::
:: a KSTEST on the resulting 100 values serves as a test of uni- ::
:: formity for random points in the square. Test numbers=0 mod 5 ::
:: are printed but the KSTEST is based on the full set of 100    ::
:: random choices of 8000 points in the 10000x10000 square.      ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
       This is the MINIMUM DISTANCE test
   for random integers in the file tacopm.bin
   Sample no.      d^2      avg     equiv uni
          5      .0000    .1140     .000000
         10      .0000    .1143     .000000
         15      .1038    .1906     .099071
         20      .2233    .1731     .200983
         25      .0232    .2208     .023067
         30      .4038    .2465     .333549
         35      .4469    .2347     .361810
         40      .7118    .2774     .511000
         45      .0876    .2802     .084235
         50      .0129    .2743     .012924
         55      .2905    .3127     .253223
         60      .3651    .2932     .307143
         65     1.4298    .3328     .762368
         70      .1215    .3140     .114989
         75      .5172    .3158     .405333
         80      .2115    .3104     .191453
         85      .1824    .3234     .167533
         90      .9449    .3337     .613140
         95      .1468    .3283     .137208
        100      .1223    .3223     .115666
   MINIMUM DISTANCE TEST for tacopm.bin
       Result of KS test on 20 transformed mindist^2's:
                 p-value=1.000000
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::                      THE 3DSPHERES TEST                       ::
:: Choose 4000 random points in a cube of edge 1000. At each     ::
:: point, center a sphere large enough to reach the next closest ::
:: point.
Then the volume of the smallest such sphere is (very ::
:: close to) exponentially distributed with mean 120pi/3. Thus   ::
:: the radius cubed is exponential with mean 30. (The mean is    ::
:: obtained by extensive simulation). The 3DSPHERES test gener-  ::
:: ates 4000 such spheres 20 times. Each min radius cubed leads  ::
:: to a uniform variable by means of 1-exp(-r^3/30.), then a     ::
:: KSTEST is done on the 20 p-values.                            ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
       The 3DSPHERES test for file tacopm.bin
 sample no:  1     r^3=  14.014     p-value= .37321
 sample no:  2     r^3=  30.496     p-value= .63815
 sample no:  3     r^3= 108.887     p-value= .97347
 sample no:  4     r^3=    .028     p-value= .00095
 sample no:  5     r^3=  16.520     p-value= .42343
 sample no:  6     r^3=  12.419     p-value= .33898
 sample no:  7     r^3=  17.148     p-value= .43537
 sample no:  8     r^3=   1.538     p-value= .04999
 sample no:  9     r^3=  28.532     p-value= .61367
 sample no: 10     r^3=   7.272     p-value= .21526
 sample no: 11     r^3=   6.753     p-value= .20157
 sample no: 12     r^3=   1.587     p-value= .05152
 sample no: 13     r^3=   2.628     p-value= .08388
 sample no: 14     r^3=  17.306     p-value= .43834
 sample no: 15     r^3=  10.310     p-value= .29082
 sample no: 16     r^3=  57.462     p-value= .85272
 sample no: 17     r^3=   3.523     p-value= .11080
 sample no: 18     r^3=  17.524     p-value= .44241
 sample no: 19     r^3=  17.649     p-value= .44473
 sample no: 20     r^3=    .136     p-value= .00452
  A KS test is applied to those 20 p-values.
---------------------------------------------------------
       3DSPHERES test for file tacopm.bin      p-value= .982408
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::                   This is the SQUEEZE test                    ::
:: Random integers are floated to get uniforms on [0,1). Start-  ::
:: ing with k=2^31-1=2147483647, the test finds j, the number of ::
:: iterations necessary to reduce k to 1, using the reduction    ::
:: k=ceiling(k*U), with U provided by floating integers from     ::
:: the file being tested.
Such j's are found 100,000 times, ::
:: then counts for the number of times j was <=6,7,...,47,>=48   ::
:: are used to provide a chi-square test for cell frequencies.   ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
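
The reduction described in the box above can be sketched in Python. This is an illustrative sketch only, not DIEHARD's own code: the names squeeze_iterations, cell_index and squeeze_counts are hypothetical, Python's random module stands in for the file under test, and the guard for a draw of exactly U=0 is an assumption. The expected cell probabilities needed for the chi-square statistic are not given in this output, so the sketch stops at the cell counts.

```python
import math
import random

K_START = 2147483647  # starting k quoted in the test description


def squeeze_iterations(uniform, k=K_START):
    """Return j, the number of steps k = ceiling(k*U) takes to bring k down to 1."""
    j = 0
    # "> 1" rather than "!= 1" so that a draw of exactly U = 0.0 (k becomes 0)
    # still terminates; this guard is an assumption of the sketch.
    while k > 1:
        k = math.ceil(k * uniform())
        j += 1
    return j


def cell_index(j):
    """Map j to one of 43 cells: 0 is j<=6, 1..41 are j=7..47, 42 is j>=48."""
    return min(max(j, 6), 48) - 6


def squeeze_counts(uniform, trials=100_000):
    """Tally cell counts for `trials` independent values of j."""
    counts = [0] * 43
    for _ in range(trials):
        counts[cell_index(squeeze_iterations(uniform))] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(12345)  # stand-in for uniforms floated from the file
    print(squeeze_counts(rng.random, trials=10_000))
```

With a constant U=0.5 the first step takes k to 2^30 and each later step halves it, so j is 31; that makes a convenient sanity check before feeding in real data.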