Sunday, April 24, 2016

Appendix 7.3 Modelling Evolution

The coin toss simulation is not a model of evolution itself but a model of the algorithm of evolution. It allows exploration of how strongly natural selection can overcome the normally destructive effects of random mutation. Since natural selection does change the probability of a required outcome, it would be nice to find an equation for that probability. In Appendix 5.1 I discovered a number which is effectively a boundary condition that the Second Law imposes on the random assembly of any coded string from a finite alphabet. An equation for the probability of evolving a given 'gene' after any number of generations, including the effect of natural selection, would allow me to relate the improbability of that state to the limit imposed by the Second Law.
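As a point of reference for the no-selection case, here is a minimal sketch of the textbook probability of randomly assembling one specific string from a finite alphabet. It is not the equation or the Appendix 5.1 number discussed here; the function names and the assumption of independent, uniformly random symbols are mine, for illustration only. The logarithmic form is used because the raw probabilities become too small to handle directly.

```python
import math

def log10_prob_single_draw(alphabet_size: int, length: int) -> float:
    """log10 of the chance that ONE uniformly random string of `length`
    symbols matches a specific target (assumes independent symbols)."""
    return -length * math.log10(alphabet_size)

def prob_at_least_one_hit(alphabet_size: int, length: int, trials: float) -> float:
    """Chance that at least one of `trials` independent random strings
    matches the target: 1 - (1 - p)**trials, evaluated with log1p/expm1
    so that a very small p is not lost to rounding."""
    p = math.exp(-length * math.log(alphabet_size))  # single-draw probability
    if p >= 1.0:
        return 1.0  # alphabet of one symbol: every draw matches
    return -math.expm1(trials * math.log1p(-p))

# Example: a 100-symbol string over a 4-letter alphabet, a billion random draws.
print(log10_prob_single_draw(4, 100))
print(prob_at_least_one_hit(4, 100, 1e9))
```

For small probabilities the second function is very close to trials times the single-draw probability, which is why the number of trials available matters so much in the no-selection case.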

I have an equation which, on preliminary testing, shows agreement with some statistical modelling, with Fred Hoyle's results [The Mathematics of Evolution], and with recently published papers such as this one:

[http://dx.doi.org/10.1371/journal.pone.0000096]

The paper notes:
"Although a great deal is known about the landscape structure near the fitness peaks of native proteins [5][7][9][15], little is known about structures near the bottom, which contain information regarding primordial protein evolution."
and:
"Although it was shown to be possible for a single arbitrarily chosen polypeptide to evolve infectivity, the evolution stagnated after the 7th generation, which was probably due to the small mutant library size at each generation."

While such modelling does show some increase in fitness as complexity (sequence length) increases, the fitness effectively stagnates at a limiting value that depends on the "mutant library size". This is precisely what my Second Law boundary condition predicts.

The equation is proving difficult to verify, and I initially had problems with software handling very large and very small numbers (now solved). I started by defining selection success 'Sn' as the probability that positive mutations will, on average, succeed, i.e. that their carriers do not die or get eaten before they can reproduce. Sn varies like this:

Sn = 0  (no selection)   to   Sn = 1  (100% selection)

Some program runs confirmed stagnation for all values of Sn, given large enough gene lengths. However, the result was extremely sensitive as Sn dropped even minutely below 1, which urges caution, so I'll hold that result until it is fully verified.
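For readers who want to experiment with Sn themselves, here is a minimal sketch of a coin toss simulation with a selection parameter. It is not my actual program, and the mechanism is an assumption made for illustration: each generation one random site is re-tossed, and a harmful toss (one that breaks an existing match with the target) is removed by selection with probability Sn. So Sn = 0 gives a pure random walk and Sn = 1 gives a perfect ratchet. The names simulate, length, sn and generations are hypothetical.

```python
import random

def simulate(length: int, sn: float, generations: int, seed: int = 0) -> list:
    """Return the number of sites matching the target after each generation.

    Assumed mechanism (illustrative only): the 'gene' is a list of coin
    tosses where 1 means the site matches the target and 0 means it does not.
    """
    rng = random.Random(seed)
    genome = [rng.randint(0, 1) for _ in range(length)]  # random starting gene
    history = [sum(genome)]
    for _ in range(generations):
        site = rng.randrange(length)   # mutation hits one random site
        new_value = rng.randint(0, 1)  # fresh coin toss for that site
        harmful = genome[site] == 1 and new_value == 0
        if harmful and rng.random() < sn:
            pass  # selection removes the harmful mutant; gene unchanged
        else:
            genome[site] = new_value   # mutation is kept
        history.append(sum(genome))
    return history

# Compare full selection, near-full selection and no selection.
for sn in (1.0, 0.99, 0.0):
    final = simulate(length=200, sn=sn, generations=20000, seed=1)[-1]
    print(f"Sn = {sn}: {final} of 200 sites match after 20000 generations")
```

Varying length and sn in this toy gives a feel for how quickly progress falls off once harmful tosses are allowed through, but it says nothing by itself about the equation, which remains unverified.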