Penultimate donut post: the function compute() is safe
By pascal on Monday, September 19 2011, 14:35 - Permalink
Do two jobs, and do them well
In the previous post, I used the command:
$ cat log? | grep "N " | sort | uniq
This may be an inefficient way to get a list of unique lines containing "N ". The command
sort does not know that it is sending its results to
uniq, so it has to keep all duplicates in memory, or at least keep a count for each of them.
As pointed out by my colleague Benjamin Monate, you can make
sort's job easier by telling it directly that you are only interested in unique lines. The option is
sort -u. This can only make
sort's job simpler, even if it has an optimization for dealing with duplicate lines efficiently.
Update on the donut verification
The computations for the remaining difficult 5% of the search space are now finished. I was hoping not to have to do a third pass, so I split each side of the examined subcube in 4. In other words, I divided each subcube in 4⁴ or 256 subsubcubes. A little birdie told me that splitting each side in two wouldn't be sufficient. That birdie had been wrong before, but this time, it was absolutely sure— it got the tip straight from the subcube's mouth.
I have kept the entire logs for this second pass, since I didn't know yet whether I would need the additional information they contain. I would have needed them if a third pass had been necessary, but it won't be. It is now official: function
compute() is safe for all arguments between -3.25 and 3.25.
Now I must write the concluding post quickly, because the full logs are taking 85GB on our computations server.