This is a follow-on from the original “Math Genome Fun” thread here
I think it’s time to test out GAs on a bigger problem, the old one was at the boundary of home computation for an exhaustive search (OMD I said “search!” get on that Mung) – but I’m going to propose a *much* bigger search / smaller target.
Friends, you target is Pi
I’ve chosen Pi because of its qualities, being irrational and transcendental you cannot describe it in an equation, only approximate it.
Our target is the first 150 digits of Pi.
Our vocabulary is the same numbers and operators as before: [1,2,3,4,5,6,7,8,9,+,-,*,/], but this time they can all be used as often as required and the genome can be anywhere from 1 to 1000 characters in length.
The design approximation is now harder due to the decimal place and the absence of zeros in our vocabulary (but there’s quite a few in Pi).
I’ve done no computation on this myself, can a GA find that proverbial needle in a haystack?
Maybe only practically but not formally since it can be algorithmically described. It’s best not to use idiosyncratic definitions that no one else uses if one is trying to communicate.
In the literature, Pi is a computable number but not terminating.
Sorry to disagree old friend.
Nice work Dazz. What mutations are you using? Your genome isn’t getting longer…
Rich,
I don’t think Dazz is using a GA. He/she wrote:
I have two versions of the GA, pi_hard and pi_easy.
The pi_hard version approximates pi (or any other constant) using the (harder) rules of the previous thread — genomes are exactly thirteen characters long, with one instance of each of [1,2,3,4,5,6,7,8,9,+,-,*,/].
Using pi_hard, the best genome I’ve found is 312/678*9+4-5 . The GA usually finds it in less than five minutes.
The pi_easy version approximates pi (or any other constant) using genomes of 1-1000 characters taken from [1,2,3,4,5,6,7,8,9,+,-,*,/].
It finds a solution in seconds, and it achieves the necessary precision by adding digits to the numerator and denominator of a fraction (which seems to be different each time). Example:
Winning genome: 8385577787772322/2669212311211131
value = 3.14159265358979311599796346854418516159057617187500000000000000000
target = 3.14159265358979311599796346854418516159057617187500000000000000000
Note that the target is pi, represented as a 64-bit double-precision floating-point number.
keiths,
Great work KeithS. There seem to be no mathematical operators in your winning genome?
What was your GA population?
Rich,
There’s a “/”. In general, the GA produces an expression that involves a fraction, then fine-tunes (OMG – I said “fine tune”! Get on that, Mung) the numerator and denominator to achieve the needed precision.
Here are some winning genomes from successive runs:
2*2/72389288135998399*56854413951661833
582926767643923993/92775676531111219/2
176923666738774596/56316552222838257
5/1-64131322211214211*7/241561278987437498
8/23*75417246669186771/8349932244717811
Ten, with four survivors each generation. I didn’t try to fine-tune (OMG) it, though.
The continued fraction of π gives the best approximation as a fraction of integers:
[3,7,15,1,292,1,1,1,2,1,3,1,14,2,1,1,2,2,2,2,1,84,2,1,1,15,3,13] =
5,706,674,932,067,741/1,816,491,048,114,374
The difference to π is less than 2.5E-31:
Fraction ∼3.141 592 653 589 793 238 462 643 383 279 736…
π ∼3.141 592 653 589 793 238 462 643 383 279 502…
My aim was going to be to get to enough decimal places (specitivity) for the UPB – 150. It looks like the evaluation part of the procedure will make that impossible, although the GAs may also have a hard time. At some point I suspect the GA may destroy more precision than it incrementally adds, but I don’t know. Of course PI is selected as the only winning hand before the fact, which is a problem for ID.
Rich,
It’s quite possible with the use of an arbitrary-precision math library, but there’s no point in going through the exercise. The same algorithm that takes you from…
21/7 = 3.0 to
219/71 = 3.085
…can also take you from…
1828/582 = 3.1409 to
18286/5821 = 3.1414
…and can also take you from…
1642172919/522719875 = 3.14159265324 to
16421729195/5227198751 = 3.14159265359
Same mutation type. Different precision.
The GA wouldn’t have any trouble. It would just keep adding digits to the numerator and denominator until it reached the desired precision. Every step is an incremental improvement. It’s quite easy for a GA.
The limiting factor is the precision of the arithmetic, not the ability of the GA.
So how much information was smuggled in Richard?
Mung,
You tell us, Mung. And show your work.
Ah yes, anti-skeptic keiths and his alternate reality.
Surely he can show where I claimed that information was being smuggled in.
Or not.
Moved some comments to guano.
Huh?
That was easy!
From the OP:
Silly question. Of course a GA can find that proverbial needle in a haystack, and if you’d ever actually written a GA you’d know it. I’m probably wrong though, because I don’t know squat about GA’s.
Here’s the truth about GA’s:
1. If there is only one needle in the haystack it can’t be found by a GA.
2. GA’s are not search algorithms anyways, therefore a GA cannot find a needle in a haystack.
Mung,
Insightful. Do they work? Can they create information?
Ah, but Mung did not make a direct, explicit claim to the effect that information had been smuggled in. Rather, Mung asked a question which was built on the implied, unstated presupposition that information had been smuggled in. Clearly not the same thing at all.
In much the same way that the question “So how much information was smuggled in Richard?” does not constitute a claim that information was smuggled in, so, too, does the question “Mung, have you stopped selling crack to high-school students?” not constitute a claim that Mung sells illicit drugs to teenagers.
Keiths is correct, I just re-used the threaded script for the previous exercise to brute force this thing, for comparison, and to learn more about threading in Python.
Interestingly, using eval concurrently my threads dead-locked after a while, not sure why. I managed to fix it using a custom “eval” function I found at stackoverflow using the ast module.
Anyway, I may write my own GA at some point, but for the time being, all my results are brute-force, sequential evaluations of combinations of symbols.
The longest I’ve run it is 8 hours and a half and this is what I got after 14108060638 expressions checked: not much improvement after the first few minutes
Best = [’15/339*71′, ‘355/565*5’, ‘426/678*5’, ‘5/226*142’, ‘5/565*355′, ’71/113*5′, ’25/565*71’, ‘284/452*5’, ‘355/791*7’, ‘4/452*355’, ‘5/452*284’, ‘2/226*355′, ’35/791*71’, ‘3/339*355’, ‘497/791*5’, ‘5/339*213’, ‘5/678*426’, ‘5/113*71’, ‘142/226*5’, ‘1/113*355’, ‘213/339*5’, ‘5/791*497’, ‘6/678*355’, ‘7/791*355’]
Current PI = 3.141592920353982076875354 found at: 00:13:35
Mininum difference = 0.000000266764188960877391
Total Expressions = 14108060638 – 656.957768117 %
Valid Expressions = 756879
Expressions/s ( /core ) = 461456 ( 115364 )
Avg. Expressions/s ( /core ) = 461456 ( 115364 )
Last Expression = 143-2*6831
Dead Threads = []
I gave my GA a harder problem to work on, which is to approximate pi using only the digit ‘1’ plus the four operators +,-,*,/.
It’s been going for a couple of hours. The best genome so far is
1+1+11*11*11*11/1111111*11+1
…for an error of 0.0034 .
keiths,
Awesome! I see you remembered to smuggle the CSI in.
ichRay,
ixNay on the ugglesmay. I think Mung is on to us.
Tried it using twos instead of ones.
Best genome after two hours:
2+2/2+2222222/2/22222222+2/22
Error = 0.0006
If I keep smuggling information into the GA, will it eventually fill up?
Best genome using threes, after nine and a half hours:
3/333+3+33*33/33333+333/3333
Error = 0.000003
Best genome using fours, after four hours:
4-4/4+4444444/444/444/444+4/44
Error 0.00009
keiths,
Beds will be shat when you get to “6”.
I should probably restrict that one to using combinations of 666, for maximum evilutionist effect, but I can’t be arsed. Yet.
Indeed I did not. Nor was I the one to raise the question. But it’s ok with me if you can’t be bothered with the facts. I never assumed otherwise.
ETA: But at least you managed to utter a true statement, even if it was an attempt at implying the opposite.
This is false. I was not the one to raise the question of “smuggled in” information. I was merely following up on a claim made by someone else. Once you figure out who that was, you will understand.
We gave up on you learning GAs just after you did, Mung.
5-555555/5555/55-555/555/5/5
Error = 0.000046
6-66/6/6-6/6/6/6-6/6
Error = 0.0027
Interestingly, the error with the sixes is considerably higher than most of the others so far, perhaps due to Satanic influences.
7/7/7-7-7/77-77/777/77+777/77
Error = 0.00002
88/8-8+88/8/8/888/8*88+8/8/8
Error = 0.00044
999/9/9-9+9/9999*999-999/99+9
Error = 0.000021