42: The Answer To Life, The Universe and Everything
The story started with a Twitter Post about
the JDK method bitCount
which is available for Long
and Integer
types.
If you take a look into the time line, there was a reply https://twitter.com/ScottSelikoff/status/1254185742760280064 by
@ScottSelikoff (funny comment of course) who stated 42
as the answer to life, the universe and
everything to the magic method which has been followed by
Lukas Eder:
For what percentage of longs would that be the correct result?
This thread has inspired me to this article.
So let us summarize that question:
For what percentage of long values is
bitCount()
== 42?
The first question which arises: What does bitCount
do? Let us take a look into the java doc of the
function (Integer variant):
Returns the number of one-bits in the two's complement binary representation of the specified int value. This function is sometimes referred to as the population count.
So in the end bitCount
counts the number (as the name implies) of 1
s which are in a given value.
Here are some exemplary values for the type Integer
:
Integer i = 0b1100_0000_0000_0000_0000_0000_0000_0000
bitCount=2
Integer i = 0b0000_0000_0000_0000_0000_0000_0000_0011
bitCount=2
Integer i = 0b0000_0000_0000_0000_0000_0000_0000_1111
bitCount=4
Integer i = 0b1111_1111_1111_0000_0000_0000_0000_0000
bitCount=12
Integer i = 0b1010_1010_1010_1010_1010_1010_0000_0000
bitCount=12
Integer i = 0b1111_1111_1111_1111_1111_1111_1111_1111
bitCount=32
A very naive way of trying to solve that question would be to write code like this:
1BigDecimal pow = BigDecimal.valueOf(2L).pow(64);
2
3Map<Integer, Long> collect = LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE)
4 .boxed()
5 .collect(groupingBy(Long::bitCount, counting()));
6
7collect.forEach((k, v) -> System.out.printf("k = %4s v=%15s %7.5f\n", k, v, BigDecimal.valueOf(v)
8 .divide(pow)
9 .multiply(BigDecimal.valueOf(100L))));
The first line will create 2^64
which is equal to 100% and the
BigDecimal.valueOf(v).divide(pow).multiply(BigDecimal.valueOf(100L))
will just calculate the percentage
of the value (number of values for appropriate bit count) of the resulting map which contains the number.
Ok now we can try that.... Really? Never. This code will run a very very long time...
A long time ago in a galaxy far, far away...
That might be a little long to wait for the result. Ok let's think a bit about the code. Ah! Yes of course I got it.
We should use parallel()
for the stream to get it faster.
1BigDecimal pow = BigDecimal.valueOf(2L).pow(64);
2
3Map<Integer, Long> collect = LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE)
4 .boxed()
5 .parallel()
6 .collect(groupingBy(Long::bitCount, counting()));
7
8collect.forEach((k, v) -> System.out.printf("k = %4s v=%15s %7.5f\n", k, v, BigDecimal.valueOf(v)
9 .divide(pow)
10 .multiply(BigDecimal.valueOf(100L))));
That version will be faster than the previous one and will take... Sorry but I simply don't know cause I have not tested it ;-).
Can we make a simpler and faster solution to get a result in the end?
Ok we change the Long
into Integer
? Now the code looks like this. As you see I already added the parallel()
in the code to get faster also you see I'm using BigDecimal.valueOf(2L).pow(32)
(2^32) instead of 2^64
based on the usage of Integer
:
1BigDecimal pow = BigDecimal.valueOf(2L).pow(32);
2
3Map<Integer, Long> collect = IntStream.range(Integer.MIN_VALUE, Integer.MAX_VALUE)
4 .boxed()
5 .parallel()
6 .collect(groupingBy(Integer::bitCount, counting()));
7
8collect.forEach((k, v) -> System.out.printf("k = %4s v=%15s %7.3f\n", k, v, BigDecimal.valueOf(v)
9 .divide(pow)
10 .multiply(BigDecimal.valueOf(100L))));
So this code will run in ca. 15 seconds (Hexa Core CPU's) with the following result:
1k = 0 v= 1 0.000
2k = 1 v= 32 0.000
3k = 2 v= 496 0.000
4k = 3 v= 4960 0.000
5k = 4 v= 35960 0.001
6k = 5 v= 201376 0.005
7k = 6 v= 906192 0.021
8k = 7 v= 3365856 0.078
9k = 8 v= 10518300 0.245
10k = 9 v= 28048800 0.653
11k = 10 v= 64512240 1.502
12k = 11 v= 129024480 3.004
13k = 12 v= 225792840 5.257
14k = 13 v= 347373600 8.088
15k = 14 v= 471435600 10.976
16k = 15 v= 565722720 13.172
17k = 16 v= 601080390 13.995
18k = 17 v= 565722720 13.172
19k = 18 v= 471435600 10.976
20k = 19 v= 347373600 8.088
21k = 20 v= 225792840 5.257
22k = 21 v= 129024480 3.004
23k = 22 v= 64512240 1.502
24k = 23 v= 28048800 0.653
25k = 24 v= 10518300 0.245
26k = 25 v= 3365856 0.078
27k = 26 v= 906192 0.021
28k = 27 v= 201376 0.005
29k = 28 v= 35960 0.001
30k = 29 v= 4960 0.000
31k = 30 v= 496 0.000
32k = 31 v= 31 0.000
33k = 32 v= 1 0.000
The first column k=
is the bitCount
so the first line means:
We have a single value v=1
where the bitCount==0
and 0.000
percent.
Let use take a value on the line for k=16
and v=601,080,390
and 13,995
percent of the values of Integer.
So it means for bitCount=16
we have 601,080,390
values which are 13,995
percent of all integer values.
One interesting observation which can be made here is that the number of values is rising up to a maximum value
at k=16
(bitCount=16
) which you might not have expected. Another thing is that the number of values is going
down to the higher number with bitCount=32
etc.
Based on the runtime of this example you could roughly estimate the runtime for the variant with Long:
15 s * 2^32 = 64.424.509.440 seconds / 86400 seconds / day = 745.654 days / 365 days / year
which
results in ca. 2,000 years. So having a machine which has 1000 times more CPUs it could be dropped down
to about 2 years (theoretically) or maybe you could use more power by using GPU's on AWS cloud but in the end
no realizable solution.
So we need to reconsider if there is a faster or easier solution to answer the question? Yes there is one.
The answer can be found by using some mathematics.
Let us take a known example from the above result output:
We have 32 bits (Integer). Now we have 16 bits which should be 0
and 16 bits which should be 1
. By expressing that
like this:
$$ \frac{32!}{16!\cdot 16!} = 601080390$$
You can calculate that via WolframAlpha and the result
is: 601080390
which is exact the number in the above example. Let us check some other values:
$$\frac{32!}{9!\cdot 23!} = 28048800$$
The resulting value will be two times in the table one for k=9
(bitCount=9
) and for k=23
(bitCount=23
).
The bitCount=9
means 9 1
s (and 23 0
s) are in the integer value and bitCount=23
means
23 1
s (and 9 0
) are in the integer values.
So based on the mathematics we could really answer the question via:
$$\frac{64!}{42!\cdot 22!} = 80,347,448,443,237,920$$
So this means we have 80,347,448,443,237,920 numbers where the bitCount=42
and this means
$$ \frac{80,347,448,443,237,920}{2^{64}}\cdot 100 = 0,435 \% $$
So 0,435
percent of all long values having a bitCount=42
.