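Knowing full well that this is completely pointless, I keep fiddling with it anyway. As for what it actually does: decipher the shell script yourself.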
nitobe@debian64:~/itchyny$ cat count1.sh
#!/bin/sh
start=1000001
step=1000000
end=10000001
for i in `seq $start $step $end`
do
    echo -n "$i, "
    for k in `seq 0 9`
    do
        sed -n -e "2, $i p" pi1g.txt| tr -dc $k| wc -c| tr '\n' ','
    done
    echo
done
nitobe@debian64:~/itchyny$ chmod +x count1.sh
nitobe@debian64:~/itchyny$ time ./count1.sh
1000001, 9999922,10002475,10001092,9998442,10003863,9993478,9999417,9999610,10002180,9999521,
2000001, 19997437,20003774,20002185,20001410,19999846,19993031,19999161,20000287,20002307,20000562,
3000001, 29998356,30000582,30006337,29999867,29999810,29993099,29998913,29999071,30003683,30000282,
4000001, 39996048,39997375,40011791,39995030,40001014,39992123,40001899,40000314,40005735,39998671,
5000001, 49995279,50000437,50011436,49992409,50005121,49990678,49998820,50000320,50006632,49998868,
6000001, 59991725,59997597,60008591,59992558,60007991,59990211,60003895,59998772,60010958,59997702,
7000001, 69989891,69997755,70006497,69994028,70009581,69994537,70003795,69997014,70005161,70001741,
8000001, 79991897,79997003,80003316,79989651,80016073,79996120,80004148,79995109,80002933,80003750,
9000001, 89991208,89998381,90000968,89990083,90013132,89996086,90006412,89995658,90001979,90006093,
10000001, 99993942,99997334,100002410,99986911,100011958,99998885,100010387,99996061,100001839,100000273,

real    4m12.549s
user    5m20.528s
sys     1m45.579s
nitobe@debian64:~/itchyny$

Behold: the cumulative number of occurrences of each digit (0 through 9, left to right), sampled every 100 million digits. I suppose you would call that a cumulative frequency distribution?
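For anyone who would rather not decipher it cold, here is a toy-sized sketch of the inner pipeline. The function name `count_digit` and the tiny sample file are made up for illustration and are not part of count1.sh. It also assumes, judging from the totals (1,000,000 lines yield exactly 100,000,000 digits), that pi1g.txt holds "3." on line 1 and 100 digits on every following line.

```shell
#!/bin/sh
# count_digit FILE FIRST LAST DIGIT   (hypothetical helper, not in count1.sh)
# Prints how many times DIGIT occurs in lines FIRST..LAST of FILE.
count_digit() {
    sed -n -e "$2,$3 p" "$1" |   # keep only the requested line range
        tr -dc "$4" |            # delete every character except DIGIT
        wc -c |                  # one byte is left per occurrence
        tr -d ' '                # some wc implementations pad with spaces
}

# Usage on a made-up sample with the same assumed layout as pi1g.txt:
#   printf '%s\n' '3.' '1415926535' '8979323846' > sample.txt
#   count_digit sample.txt 2 3 9
```

count1.sh always starts its range at line 2 and only moves the end of the range, which is why its counts keep growing from row to row.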
nitobe@debian64:~/itchyny$ cat count2.sh
#!/bin/sh
size=1000000
for i in `seq 2 $size 10000001`
do
    j=$((i+size-1))
    echo -n "$i, $j, "
    for k in `seq 0 9`
    do
        sed -n -e "$i,$j p" pi1g.txt| tr -dc $k| wc -c| tr '\n' ','
    done
    echo
done
nitobe@debian64:~/itchyny$ chmod +x count2.sh
nitobe@debian64:~/itchyny$ time ./count2.sh
2, 1000001, 9999922,10002475,10001092,9998442,10003863,9993478,9999417,9999610,10002180,9999521,
1000002, 2000001, 9997515,10001299,10001093,10002968,9995983,9999553,9999744,10000677,10000127,10001041,
2000002, 3000001, 10000919,9996808,10004152,9998457,9999964,10000068,9999752,9998784,10001376,9999720,
3000002, 4000001, 9997692,9996793,10005454,9995163,10001204,9999024,10002986,10001243,10002052,9998389,
4000002, 5000001, 9999231,10003062,9999645,9997379,10004107,9998555,9996921,10000006,10000897,10000197,
5000002, 6000001, 9996446,9997160,9997155,10000149,10002870,9999533,10005075,9998452,10004326,9998834,
6000002, 7000001, 9998166,10000158,9997906,10001470,10001590,10004326,9999900,9998242,9994203,10004039,
7000002, 8000001, 10002006,9999248,9996819,9995623,10006492,10001583,10000353,9998095,9997772,10002009,
8000002, 9000001, 9999311,10001378,9997652,10000432,9997059,9999966,10002264,10000549,9999046,10002343,
9000002, 10000001, 10002734,9998953,10001442,9996828,9998826,10002799,10003975,10000403,9999860,9994180,

real    2m49.034s
user    2m44.614s
sys     0m36.742s
nitobe@debian64:~/itchyny$

And this one is the number of occurrences of each digit within each block of 100 million digits. An interval frequency distribution?
Not wild in the slightest.
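The two outputs are related: each count2.sh row is the difference between consecutive count1.sh rows. For digit 0, for instance, 19997437 - 9999922 = 9997515, which is the first count in the second count2.sh row. A minimal sketch of that subtraction follows; `cum_to_interval` is a hypothetical name, and the code assumes input in exactly the count1.sh format (a line number, then ten comma-separated counts with a trailing comma).

```shell
#!/bin/sh
# cum_to_interval   (hypothetical helper, not one of the original scripts)
# Reads cumulative rows in count1.sh format on stdin and prints, for each
# row, the last line number followed by the ten per-interval counts.
cum_to_interval() {
    awk -F',' '
    {
        printf "%s", $1                 # last line number of this block
        for (f = 2; f <= 11; f++) {     # fields 2..11 hold digits 0..9
            cur = $f + 0                # "+ 0" drops the leading space
            printf ",%d", cur - prev[f] # prev[f] is 0 for the first row
            prev[f] = cur
        }
        printf "\n"
    }'
}

# Usage (assumes count1.sh and pi1g.txt are in the current directory):
#   ./count1.sh | cum_to_interval
```

Run this way, the second script would not strictly be needed, although its output labels each row with both ends of the line range rather than just the last line.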
I copied and pasted the output into a text file on Windows, imported it into Excel, and turned it into graphs. It could use one more twist. Homework.
Added 2013/01/10: difference from the mean
Note: the leading 3 of 3.14... is not counted, so add 1 for it.
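The difference from the mean can also be computed without Excel. The sketch below assumes that "mean" refers to the average of the ten counts in a row (for the count2.sh rows that is exactly 10,000,000, since each row totals 100,000,000 digits); the spreadsheet may define it differently. `dev_from_mean` is a hypothetical name. It accepts rows from either script by treating the last ten non-empty fields as the counts.

```shell
#!/bin/sh
# dev_from_mean   (hypothetical helper, not one of the original scripts)
# Reads count1.sh or count2.sh rows on stdin and replaces each of the ten
# digit counts with its difference from that row's mean.
dev_from_mean() {
    awk -F',' '
    {
        first = NF - 10                  # trailing comma leaves $NF empty
        sum = 0
        for (f = first; f < NF; f++) sum += $f
        mean = sum / 10
        for (f = 1; f < first; f++)      # copy the line-number labels
            printf "%s,", $f
        for (f = first; f < NF; f++)
            printf "%.1f%s", $f - mean, (f < NF - 1 ? "," : "\n")
    }'
}

# Usage (assumes count2.sh and pi1g.txt are in the current directory):
#   ./count2.sh | dev_from_mean
```

On the first count2.sh row this gives -78.0 for digit 0 and -6522.0 for digit 5, the largest deviation in that block.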
I'm leaving the xlsx files here. Do whatever you like with them. 2013/01/10: replaced count1.xlsx.
Attachments: count1.xlsx
count2.xlsx