• Nenhum resultado encontrado

New insights on neutral binary representations for evolutionary optimization

N/A
N/A
Protected

Academic year: 2021

Share "New insights on neutral binary representations for evolutionary optimization"

Copied!
22
0
0

Texto

(1)

Contents lists available atScienceDirect

Theoretical

Computer

Science

www.elsevier.com/locate/tcs

New

insights

on

neutral

binary

representations

for

evolutionary

optimization

Marisol

B. Correia

a

,

b

,

aESGHT,UniversityoftheAlgarve,Faro,Portugal bCEG-IST,UniversidadedeLisboa,Lisbon,Portugal

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory: Received21May2015

Receivedinrevisedform7March2016 Accepted21May2016

Availableonline26May2016 CommunicatedbyI.Petre Keywords:

Redundantbinaryrepresentations Neutrality

Errorcontrolcodes Propertiesofrepresentations Evolutionaryalgorithms NK fitnesslandscapes

ThispaperstudiesafamilyofredundantbinaryrepresentationsNNg(,k),whicharebased onthemathematicalformulationoferrorcontrolcodes,inparticular,onlinearblockcodes, whichareusedtoaddredundancyandneutralitytotherepresentations.Theanalysisofthe propertiesofuniformity,connectivity,synonymity, localityand topologyofthe NNg(,k) representationsis presented,aswellas theway an(1+1)-ES canbemodeledusingMarkov chainsandappliedtoNK fitnesslandscapeswithadjacentneighborhood.

The results show that it is possible todesign synonymouslyredundant representations that allow an increase of the connectivity between phenotypes. For easy problems, synonymouslyNNg(,k)representations,withhighlocality,andwhereitisnotnecessary topresenthigh valuesofconnectivityarethemostsuitableforanefficient evolutionary search. On the contrary, for difficult problems, NNg(,k) representations with low locality,whichpresentconnectivitybetweenintermediatetohighandwithintermediate values ofsynonymity are thebest ones.Theseresults allow toconclude thatNNg(,k) representationswithbetterperformanceinNK fitnesslandscapeswithadjacent neighbor-hood do not exhibit extreme values of any of the properties commonly considered in the literature ofevolutionary computation. This conclusion is contrary to what one would expectwhen takinginto accountthe literature recommendations. Thismay help understandthecurrentdifficultytoformulateredundantrepresentations,whichareproven tobesuccessfulinevolutionarycomputation.

©2016ElsevierB.V.Allrightsreserved.

1. Introduction

Theinfluenceofredundancy andneutralityon the behavioroftheevolutionarysearch hasbeendiscussedinthe evo-lutionary computation (EC) field. Some studies consider that the capacity that random mutations can contribute to the improvementoftheevolutionarysearch canbeachievedusingredundantrepresentations

[1,2]

,whileothersalertthatthe additionofrandomredundancydoesnotappeartoimprovethatcapacity

[3]

.

TheneutraltheoryofmolecularevolutionproposedbytheJapanesescientistKimura

[4]

andtheexistenceofredundancy and neutrality in thegenetic code which appears in several waysare considered to be responsible for the evolutionary capacityoflivingbeings.Kimuraobservedthatinnaturethenumberofdifferentgenotypeswhichstorethegeneticmaterial

*

Correspondenceto:ESGHT,UniversityoftheAlgarve,Faro,Portugal. E-mailaddress:mcorreia@ualg.pt.

http://dx.doi.org/10.1016/j.tcs.2016.05.033 0304-3975/©2016ElsevierB.V.Allrightsreserved.

(2)

of an individual greatly exceeds the number of differentphenotypes, thus, the representation which describeshow the genotypesareassignedtothephenotypesmustberedundant,andneutralmutationsbecomepossible.

Amutationisneutralifitsapplicationtoagenotypedoesnotresultinachangeofthecorrespondingphenotype.Aslarge parts ofthegenotype havenoactual effecton thephenotype,evolution canuse themasa storeforgenetic information thatwas necessaryforsurvivalinthepast,andasaplaygroundfordevelopingnewpropertiesintheindividualthatcould become advantageous in the future [5]. This has motivated the proposals of redundant representations in evolutionary computation

[6–8]

.

Thedynamicsofevolutionarychangeinbiologyandofsearchinevolutionaryoptimizationcanbevisualizedasa popu-lationclimbingthepeaksofafitnesslandscape

[9]

.Geneticchangessuchasmutationscanbevisualizedasmovementsona dimensionalgeographicalmapscaleduptoadimensionalitycorrespondingtothenumberofgenes.Somefitnesslandscapes maycontain ‘ridges’orlevelconnectedpathwaysofthesame fitness,whichareconnectedvia single potentialmutations andthatareconsideredneutralnetworks

[10]

.

There is a great interest in how redundantrepresentations andneutral search spaces influencethe behavior andthe evolvability ofevolutionary algorithms (EAs). Diverseredundant representations were usedand, withsome ofthese, the performance oftheEAs improvedandthe evolvability,whichisdefinedastheabilityofrandomvariations tosometimes produce improvements andasthegenomes abilityto produceadaptive variantswhen actedupon by thegenetic system

[11],isincreased

[2,1,5]

.Withothers,asexplainedin

[12,3]

,addingredundancyappearedtoreduceperformancewithmost oftheproblemsapplied.Thelossofdiversityinthepopulation,thelargersizeofthegenotypicsearchspaceandthefact thatdifferentgenotypesthatrepresentthesamephenotypecompeteagainsteachother,aresomeofthereasonsstatedfor the deterioration ofEAsperformance. Onthe contrary,theability ofapopulation to adaptafterchanges (redundantand neutralrepresentationsallowapopulationtochangethegenotypewithoutchangingthephenotype)isoneofthereasons givenfortheincreasedperformanceofEAs.

Theresearchexplainedinthispaperisanimprovementoftheresearchpresentedin

[13,14]

,wheretheeffectof redun-dancy andneutrality inthebehavior ofa simpleevolutionaryalgorithmusingtwo differentfamilies ofredundantbinary representations were studied. These families of representations proposed in [15–17], consist ofneutral binary represen-tations, denoted as NNg

(,

k

)

, based on the mathematical formulation of errorcontrol codes and of non-neutral binary

representations, denotedasNonNNZ

(



,

k

)

,basedonlineartransformationsandwhichdevelopment wasperformed

incre-mentally, startingwitha non-codingredundantrepresentation,andincorporatingadifferentproperty ineachphase, first polygeny,followedbypleiotropy.

ThisresearchgivesnewinsightsintothewaytheenumerationofNNg

(,

k

)

representationscanbedonethroughthe

clo-sureofan initialcanonicalrepresentationunderaneighborhoodrelationship,aswell aspresentsastudyoftheproperties ofrepresentations,suchas,uniformity,connectivity,synonymity,localityandtopology,whicharecommonlyanalyzedinthe literatureofECfield,andwhicharealsostudiedforthefamilyofNNg

(,

k

)

representations.Theperformanceofthese

repre-sentationsinNK fitnesslandscapeswithadjacentneighborhoodareanalyzedusingasimple(1+1)-ES (EvolutionaryStrategy)

[18,19],whichismodeledusingMarkovchains.Thiscanbedonebecause(1+1)-ES correspondstoastochastichillclimbing

[20],whereanindividualparentgeneratesanindividualchildandthebestofbothbecomestheparentinthenextiteration. Theoutlineofthispaperisasfollows.Section2explainshowredundancyandneutralityaremanifestedinbiology,how they areusedinECfield andthewaysasredundancy andneutralitycanbeincorporatedinredundantbinary representa-tions. Insection3somepropertiesofthistype ofrepresentationsareexposed.Theerrorcontrolcodesusedto definethe neutral binaryrepresentations arepresented insection 4.The familyofneutralbinary representationsNNg

(,

k

)

are

pre-sentedinsection5,includingthealgorithmsofcodinganddecodingandthedefinitionofcanonicalneutralrepresentation. Section 6explainsthedefinitionsandalgorithmsusedtoenumeratethedifferentneutralrepresentationsthatcanbe gen-erated.The propertiesexplainedinsection3are extendedtotheNNg

(,

k

)

representationsandtherelationshipsbetween

thesepropertiesare exhibitedinsection7.A briefexplanationaboutthe NK fitnesslandscapesisgiveninsection 8.The waysasNNg

(,

k

)

representationscanchangetheruggednessofNK fitnesslandscapesareanalyzedinsection9.Section10

presents how Markovchains can beused to modelan (1+1)-ES,presenting thetransitionmatrix foreach type of repre-sentation andtheprobability of reaching theglobaloptimum ofthe NK fitness landscapes.Section 11 presentshow the performance ofan(1+1)-ES is consideredasthe expectedvalue toachievethebestsolutionandshowsan analysisofthe performance ofNNg

(,

k

)

representations versus theproperties oftheserepresentations. Finally,section 12 discusses the

conclusionsandpresentssomeguidelinesforfuturework.

2. Redundantbinaryrepresentations

CharlesDarwin’stheoryofnaturalselection

[21]

assumesthatnaturalselectionisthedrivingforceofevolutionandthat randomlyadvantageousmutationsarefixedduetonaturalselectionandcanbepropagatedfromgenerationtogeneration. Ontheother hand,theneutraltheoryofmolecularevolutionassumesthatthedrivingforceofmolecularevolutionisthe randomfixationofneutralmutationratherthanthefixationofadvantageousmutationsbynaturalselection

[5]

.

Inbiology,thegenotypecorresponds totheinformationstoredinchromosomes,allowing todescribe individualsatthe levelofgenes,whilethephenotypedescribestheappearanceoftheindividual.Inthegeneticcode,therearemanifestations of redundancy andneutrality at severallevels. Forexample,the genetic codeis considered redundantbecause thereare morecombinationsofnucleotidesthanaminoacidsformed,thisiswhatiscalledthedegeneracyofthegeneticcode

[22]

.

(3)

Also, single secondary structure isgenerally representedby manydifferent sequences, indicatinga highly neutral redun-dantgenotype–phenotypemappingbecausemutations indifferentpositions inthesequencesdonot affectthesecondary structure

[23,24]

.

Themappinggenotype–phenotypeorrepresentationusesthegenotypicinformationtobuildthephenotype. Representa-tionsareconsideredredundantifthenumberofgenotypesexceedsthenumberofphenotypes

[5]

.Thenotionofneutrality andneutralnetworkshaveattractedtheattentionofseveralresearchers,duetothepotentialthatneutralnetworkshavein establishingalternativewaysfortheevolutionofthepopulation,allowingtheimprovementofthequalitysearch.

Thereisnotasingledefinitionofneutralityanditisimplicitlydefinedasthepresenceofneutralnetworksinthesearch space. Aneutralnetworkcan bedefinedasapath throughgenotypespacevia mutations,whichleave fitnessunchanged

[10], oraspoints inthesearch spacethat are connectedthrough neutralpoint-mutations,where thefitness isthesame forallthepointsinsuchnetwork

[7]

,orasasetofgenotypesconnectedby singlepointmutationsthatmap tothesame phenotype

[2]

.Thelatterperspectiveistheoneusedinthisresearch.

Whenusingabinaryrepresentation,wheregenotypesandphenotypes arecodedasstringswithlength



,agenotype– phenotypemapping fgofanon-redundantrepresentationisdefinedasfollows:

fg

: {

0

,

1

}



→ {

0

,

1

}



where:

∀(

x0, . . . ,x−1)

∈ {

0

,

1

}



,

∀(

y0, . . . ,y−1)

∈ {

0

,

1

}



,

fg(x0, . . . ,x−1)

=

fg(y0, . . . ,y−1)

⇒ (

x0, . . . ,x−1)

= (

y0, . . . ,y−1)

A metric hasto be definedon the search space



when usingsearch algorithms. Based on the metric, the distance

d

(

xa

,

xb

)

betweentwoindividualsxa

∈ 

andxb

∈ 

describeshowsimilarthetwoindividualsare.Thelargerthedistance,

themore differenttwo individualsare [5].In general, differentmetricscan be definedforthesame searchspace, where differentmetrics result indifferentdistances anddifferent measurements ofthe similarityof solutions.In thisway, two individualsareneighborsifthedistancebetweentwoindividualsisminimal.Forexample,whenusingtheHammingmetric

[25] for binary strings, theminimal distancebetweentwo individuals isd

=

1.Therefore, two individuals xa and xb are

neighborsifthedistanced

(

xa

,

xb

)

=

1.

A waytoproduce a redundantcode isto increase the numberof genotypes,allowing for each phenotypeto be rep-resentedby severalgenotypes. Tousea redundantrepresentation forsolving anoptimization problem, itis necessaryto define agenotype–phenotypemapping fg anda phenotype–fitnessmapping fp.Denoting



g asthegenotypicspaceand



p asthephenotypicspace,agenotype–phenotypemapping fg determineswhichphenotypesresultfromthedecodingof

eachgenotypesandcanbedefinedasfollows:

fg(xg)

: 

g

→ 

p

xg

→

xp

=

fg(xg)

wherexg

∈ 

g denotesagenotypeandxp

∈ 

p denotesthecorresponding phenotype.The genotype–phenotypemapping

fgonlyallowstodeterminewhichgenotypesrepresentaparticularphenotype,butdoesnotgiveanyinformationaboutthe

similaritybetweenthesolutions.Thephenotype–fitnessmappingassignsafitnessvalue fp

(

xp

)

toeachphenotypexp

∈ 

p:

fp

(

xp)

: 

p

→ R

Themetricsusedmaybedifferentinthegenotypicsearchspaceandinthephenotypicsearchspace.Themetricofthe phenotypicsearchspaceisdeterminedbytheproblemtobesolved,i.e.,dependsonthenatureofthedecisionvariables.For example,ifthe problemtobe optimizedistheNK fitnesslandscape,theHammingmetriccan beusedinthephenotypic search space. Onthe contrary, thegenotypic search spacemetricdependsnot only onthe space, butalso onthe search operatorthatisusedonthegenotypes.The searchoperatorandthemetricusedinthegenotypicsearch spacedetermine eachotherandcannotbechosenindependentlyofoneanother.

3. Propertiesofredundantbinaryrepresentations

Inorderto clarifysome ofthequestionsin theliteratureaboutunderwhichcircumstances whichtypesofredundant representationscanbebeneficialforEA,Rothlauf

[5]

identifiesthefollowingproperties:

Uniformity Arepresentationisuniformlyredundantifallphenotypesarerepresentedbythesamenumberofgenotypes; Connectivity Arepresentationhashighconnectivityifthenumberofphenotypeswhichareaccessiblefromagiven

phe-notypebysinglebitmutationsishigh;

Synonymity Arepresentation issynonymouslyredundantif thegenotypes that are assignedto thesame phenotypeare similartoeachother;

(4)

To understand theproperties presented,it is importantto clarify thenotion of phenotypicneighborhood. For a non-redundant binary representation,wherein the phenotype is of length k and inthat the reachable phenotypes from any phenotype are all different, the connectivityis k. While inredundantrepresentations, the phenotypicneighborhood of a phenotype corresponds to thephenotypes that are reachable froma particularphenotype by single bitmutations of the genotypes thatrepresentit,connectivitycorrespondsto thenumberofdifferentphenotypes, whichconstitutethe pheno-typic neighborhoodofaphenotype.Forexample,foranuniformredundantrepresentation,wheregenotypeshavelength



andphenotypeshavelengthk,ifphenotypexp

=

fg

(

xg

)

isrepresentedbythefollowingm genotypes,

{

xg1

,

. . . ,

xgm

}

andas

each genotypehas



neighbors,thenthesetofneighborsofthem genotypesis

{

xg11

,

. . . ,

xg1

,

. . . ,

xgm1

,

. . . ,

xgm

}

.As each

genotyperepresentaphenotype,thesetofphenotypesofthissetis:

fg

(

xg11

), . . . ,

fg

(

xg1

), . . . ,

fg

(

xgm1

), . . . ,

fg

(

xgm

)

Thenumberofdifferentelementsofthissetcorrespondstotheconnectivity.

Accordingto

[5]

,arepresentationissynonymouslyredundantifthegenotypicdistancesbetweenall xg

∈ 

xgp aresmall

forall differentxp.Inotherwords,ifforall phenotypesthesumoverthedistancesbetweenallgenotypes thatrepresent

thesamephenotypeisreasonablysmall,i.e.:



xp∈p 1 2

 

xg1∈xpg



xg2∈xpg d

(

xg1

,

xg2

)



(1)

wherexg1

=

xg2,d

(

xg1

,

xg2

)

representsthedistancebetweenthegenotypexg1 andgenotypexg2 and



xp

g denotesthesetof

genotypesthatrepresentthephenotypexp.The 12 fractionarisesbecausetheHammingdistanceissymmetric.

Furthermore, thelocality ofa representationdescribeshow well thegenotypic neighborhoodcorresponds to the phe-notypic neighborhood.Thelocalityofarepresentationisapropertythatcharacterizes bothredundantandnon-redundant representations.Thelocalitydependsontherepresentationused fg,onthemetricdefinedin



gandonthemetricdefined

in



p.Rothlauf

[5]

proposesthefollowingexpressionforthelocalitydm:

dm

=



d(xg1,xg2)=dg min





d

(

xp1

,

xp2

)

d p min





(2)

where d

(

xp1

,

xp2

)

denotes the distance betweenthe xp1 and xp2 phenotypes,d

(

xg1

,

xg2

)

describes the distancebetween the corresponding genotypes xg1 and xg2,d

g

min isthe minimumdistancebetweentwo neighboringgenotypesanddpmin

indicates theminimumdistancebetweentwo neighboringphenotypes.When theHammingmetricisusedinphenotypic andgenotypicspaces,dgmin

=

1 anddpmin

=

1,respectively. Furthermore,dm

=

0 indicatesthat forall genotypesthat are

neighbors,theirphenotypesarealsoneighborsandtherepresentationisconsideredperfectaccordingtothelocality.When

dm is low, therepresentation hashigh locality andthis meansthat smallchanges in thegenotypic spacecorrespond to

smallchangesinthephenotypicspaceandvice-versa.

It is fairto say that theimportance and therelationships between theseproperties arenot yet fullyunderstood. For example,Rothlauf

[5]

statedthat usingneutralsearch spaces,wheretheconnectivitybetweenthephenotypes isstrongly increased by the use ofredundant representation,allows many differentphenotypes to be reachedin one single search step, however, using non-synonymously redundant representations results in random search which decreases the effi-ciency of EAs. The author claimed that when using synonymously redundant representations, the connectivity between the phenotypes is not increased. Thisconclusion is not consistent withthe results obtainedwith theneutral redundant representations presented in [14], ascan be verified in section 7.6of thispaper, where synonymouslyredundant repre-sentations allowincreasingthe connectivitybetweenphenotypes, whencomparedwiththecorresponding non-redundant representation.

Next,theerrorcontrolcodes

[26,27]

areexplainedaswell asthe wayasthese codesareusedtoadd redundancyand neutralitytoabinaryrepresentation.

4. Errorcontrolcodesusedinbinaryrepresentations

Inthebinarycase,thedivisionofthegenotypicspaceintosubspacesinwhichtheminimumHammingdistancebetween genotypes inthe samespaceis,atleast, two,wouldguarantee thatsingle bit mutationsproducegenotypes insubspaces other thanthat oftheoriginal genotype.Thisis anessential characteristicoferrorcontrol codesdesign,andresultsfrom thisareawouldallowtoobtainneutralbinaryrepresentationswiththefollowingcharacteristics:

1. Splittheredundantgenotypicspaceofcardinality2into2k subspaces,eachwithcardinality2k,insuchawaythat asinglebitchangeinthegenotypecausesatransitionfromonesubspacetoanother;

2. Definebijectivemappingsbetweeneachgenotypicsubspaceandthephenotypicspace,insuchawaythatallgenotypes whichrepresentthesamephenotypeformaconnectedneutralnetwork.

(5)

Table 1

Generationofthecodepolynomialsusingtheg(x)=x3+x+1=1011 generatorpolynomial.

U U(x) x3U(x) C(x)=x3U(x)mod g(x) V 0000 0 0 0 0000000 0001 1 x3 x+1 0001011 0010 x x4 x2+x 0010110 0011 x+1 x4+x3 x2+1 0011101 0100 x2 x5 x2+x+1 0100111 0101 x2+1 x5+x3 x2 0101100 0110 x2+x x5+x4 1 0110001 0111 x2+x+1 x5+x4+x3 x 0111010 1000 x3 x6 x2+1 1000101 1001 x3+1 x6+x3 x2+x 1001110 1010 x3+x x6+x4 x+1 1010011 1011 x3+x+1 x6+x4+x3 0 1011000 1100 x3+x2 x6+x5 x 1100010 1101 x3+x2+1 x6+x5+x3 1 1101001 1110 x3+x2+x x6+x5+x4 x2 1110100 1111 x3+x2+x+1 x6+x5+x4+x3 x2+x+1 1111111

Theidea ofusingerrorcontrol codesistoadd redundancyto themessagetobetransmittedinawaythat allowsthe receivertodetectand,ifpossible,tocorrecterrorsthatmayhaveoccurredduringthetransmission.Amongthemanyerror controlcodesavailable, C

(,

k

)

blockcodesaredefinedonabinaryalphabetwhichconsistsofasetof2k codewordswith fixedlength



,that resultfromtheencoding ofthek bits ofthe datatobetransmitted,aftertheadditionof



k check

bitsandwherethecodewordsforma vectorsubspaceofdimensionk.Inparticular,linearblock codesmaybedefinedas follows

[26]

:

Definition1(Linearblockcode).Ablockcodeoflength



and2kcodewordsiscalledalinear

(,

k

)

codeifandonlyifits2k codewordsformak-dimensionalsubspaceofthevectorspaceofallthe



tuples overthefieldG F

(

2

)

.

Cycliccodes

[27]

areanimportantsubclassoflinearblockcodesandaregenerallydefinedusingthepolynomialnotation. Inthiscase,acycliccodeC

(,

k

)

canbesetasfollows:

V

(

x

)

=

uk−1x−1

. . .

+

u0xk

+

ck−1xk−1

+ . . . +

c0

=

xkU

(

x

)

+

C

(

x

)

(3)

whereV

(

x

)

correspondstothecodepolynomial,U

(

x

)

isthemessagepolynomialconsistingofthek bitsofthemessageto betransmittedandC

(

x

)

correspondstotheparity-checkpolynomial.Themessagepolynomial U

(

x

)

hasdegreeuptok

1 andbecauseitisabinarycode,itscoefficientsare0 or1.Thecodepolynomial V

(

x

)

hasdegreeupto



1.

Acycliccodecanbedefinedusingthegeneratorpolynomial g

(

x

)

whichhavetobeafactorofx

+

1 andthedegreeof

g

(

x

)

isequalto



k,thenumberofparity-checkbits.Asaconsequenceoftheseproperties,anycodepolynomialismultiple ofthe generatorpolynomial andthegenerator polynomialis alsoa codepolynomial.On theother hand,anypolynomial factorofx

+

1 withdegree



k couldbeusedasageneratorpolynomial,however,notallpolynomialsintheseconditions seemtoproducecodeswiththedesirablefeatures.Todefineanon-systematiccycliccode,thecodepolynomialsV

(

x

)

are obtainedmultiplyingthemessagepolynomialsU

(

x

)

bythegeneratorpolynomial g

(

x

)

,i.e.:

V

(

x

)

=

U

(

x

)

g

(

x

)

(4)

Amorepopularapproachconsistsingeneratingthecodepolynomialsbymultiplying themessagepolynomialsbyxk

anddividingtheresultsby g

(

x

)

,whichallowsthecodetobegeneratedinsystematic form(when themessagepolynomial canbeobtainedbydiscardinggiven



k componentsfromthecorrespondingcodepolynomial):

V

(

x

)

=

xkU

(

x

)

+

xkU

(

x

)

mod g

(

x

)

(5)

Anexampleofthegenerationofthecodepolynomialsusingthegeneratorpolynomial g

(

x

)

=

x3

+

x

+

1

=

1011 isgiven in

Table 1

.

Using equation (4), it is possible to verify that only the code polynomialsare divisible by the generator polynomial. When the received polynomial, which corresponds to the word received after transmission, is divided by the generator polynomial,iftheremainderisnull,thereceivedpolynomialcorrespondstoacodepolynomial,otherwise,thatpolynomial doesnotcorrespondtoacodepolynomialandanerroroccurredduringtransmission.

Incycliccodes,thesyndromepolynomialcanbedefinedas:

(6)

Table 2

TableofsyndromepolynomialsanderrorpolynomialsofthecycliccodedefinedinTable 1.

e e(x) s S(x) 1000000 x6 101 x2+1 0100000 x5 111 x2+x+1 0010000 x4 110 x2+x 0001000 x3 011 x+1 0000100 x2 100 x2 0000010 x 010 x 0000001 1 001 1

where Z

(

x

)

correspondstothereceivedpolynomial,whichmightnotbeequaltoanyofthecodepolynomialsV

(

x

)

ifsome errorsoccurredduringtransmission.Inthiscase:

Z

(

x

)

=

V

(

x

)

+

e

(

x

)

(7)

wheree

(

x

)

istheerrorpolynomial.Thesyndromepolynomialcanbewrittenas:

S

(

x

)

= [

V

(

x

)

+

e

(

x

)

]

mod g

(

x

)

(8)

When Z

(

x

)

corresponds to a code polynomial, the errorpolynomial isnull, i.e., e

(

x

)

=

0.As V

(

x

)

mod g

(

x

)

=

0, the syndromepolynomialisequaltotheremainderofthedivisionoftheerrorpolynomiale

(

x

)

by g

(

x

)

:

S

(

x

)

= [

V

(

x

)

+

e

(

x

)

]

mod g

(

x

)

=

V

(

x

)

mod g

(

x

)

+

e

(

x

)

mod g

(

x

)

=

e

(

x

)

mod g

(

x

)

(9)

The syndromeofcycliccodesdependsonthe errorpatternanddoesnotdepend onthereceived polynomial Z

(

x

)

.As

g

(

x

)

hasdegreeof



k,thedegreeofthesyndromeis,atmost,



k

1,andcanbedefinedas:

S

(

x

)

=

sk−1xk−1

+

sk−2xk−2

. . .

+

s1x

+

s0 (10)

Itisalsopossibletousea tableofsyndromepolynomialsanderrorpolynomialsfordecoding.

Table 2

showsthistable forthecycliccodepresentedin

Table 1

.Usingthistable,thesyndromepolynomialhastobecalculatedusingequation

(6)

, theerrorpolynomiale

(

x

)

iscalculatedfromthesyndromepolynomialandusingequation

(7)

thecodepolynomialcan be determined.

Providedthat g

(

x

)

isa primitive(oreven merelyirreducible) polynomial,it iseasyto showthat thecodepolynomial

V

(

x

)

cannot be oftype xi (otherwise g

(

x

)

itselfwouldhaveto be ofthesametype andwouldnot be primitiveoreven irreducible). Dueto the linearity of the code,neither can the difference betweentwo code polynomialshave that form, whichimpliesthattheminimumdistancebetweendifferentcodepolynomialsmustbe,atleast,two.

Thus, expression

(4)

provides a methodofmappingarbitraryphenotypesonto alinearclass C of codepolynomialsof degreeupto



1.Furthermore,itsuggeststhatanother2k

1 classesofcodepolynomialsmaybedefinedasfollows:

V

(

x

)

=

U

(

x

)

g

(

x

)

+

zi

(

x

)

(11)

foreverynon-zeropolynomialzi

(

x

)

ofdegreeupto



k

1.Clearly,theseclassesarenolongerlinearbecausethesumof

two polynomialsinclass Ci isnot inCi,butsincetheycan beobtainedfromthelinearclassC throughtheadditionofa

constant polynomial,theyarecalledthecosets of C .Moreover,changingasingle coefficientinapolynomial V

(

x

)

C will

havetheeffectofproducingamemberofoneofthecosetsCi.Thisisthepropertywhichisexploredinerrorcontrolcodes

andprovidesasuitablemechanismformakingthesetsofreachablephenotypesvaryfordifferentgenotypicrepresentations ofthesamephenotype.

5. NeutralbinaryrepresentationNNg

(,

k

)

Ifagenotypeisseenasthecodepolynomial V

(

x

)

,ofdegree



1,ofalinearcodeC

(,

k

)

,generatedbythegenerator polynomial g

(

x

)

,andthephenotypeasthemessagepolynomial U

(

x

)

,ofdegreek

1,expression

(5)

allowstodefine the genotypebasedonthephenotype.Asgenotype V

(

x

)

belongstothe0 cosetdefinedbythegeneratorpolynomial g

(

x

)

,the representationofU

(

x

)

ineachofthe2kcosetscanbedefinedbyselecting2kcosetleadersz

i

(

x

)

:

Vi(x

)

=

V

(

x

)

+

zi(x

)

(12)

ToallowthatthegenotypesVi

(

x

)

,whichrepresentaU

(

x

)

,formaneutralconnectednetwork,itisenoughthatthecoset

leaders zi

(

x

)

definethemselves aconnectedgraphby linksresultingfromone bitchange.When U

(

x

)

corresponds tothe

0 phenotype, V

(

x

)

isalsoequaltozeroandthe2k polynomialsV

i

(

x

)

correspondtothe2k cosetleaderszi

(

x

)

.Inthis

context,andforthisreason,thecosetleaderswillbedesignatedbythezeros oftherepresentationandtheneutralnetwork definedbythecosetleadersformsaneutralrepresentationoftheNNg

(,

k

)

(NeutralNetwork)family:

(7)

Definition2(Neutralrepresentation).Aneutralrepresentationinthe NNg

(,

k

)

familyconsistsofamapping fromasetof

binarystringsoflength



(thegenotypicspace)tothesetofbinarystringsoflengthk (thephenotypicspace),withk

< 

, whichischaracterizedbythegeneratorpolynomial, g

(

x

)

,ofdegree



k,andbythesetof2kgenotypes:

R

= {

z0,z1, . . . ,z2k1

},

zi

Ci,i

=

0

, . . . ,

2k

1

whereeach Ci denotesthesubsetofallgenotypeswithsyndromei,andeach zi isconsideredthecosetleaderofCiandis

calledzero.Foreachpair

(

zi

,

zj

)

,eitherd

(

zi

,

zj

)

=

1,orthereisasetofgenotypes

{

w1

,

. . . ,

wp

}

R, p

=

d

(

zi

,

zj

)

1,such

that:

d

(

zi,w1)

=

1

d

(

wp,zj)

=

1

d

(

wt

,

wt+1)

=

1

,

t

,

1

t

<

p

ThecodingofphenotypeU

(

x

)

tothegenotypes Vi

(

x

)

,where0

i

2k

1,giventhegeneratorpolynomial g

(

x

)

and

theneutralrepresentation

{

z0

,

. . .

,z2−k1

}

ofafamilyNNg

(,

k

)

,isexplainedin

Algorithm 1

: Algorithm1Coding.

1: Input:U(x),g(x),{z0,. . . ,z2−k1}∈NNg(,k)

2: Output:AllVi(x)

3: CalculateV0(x)=xkU(x)+xkU(x)mod g(x)whichcorrespondstothecodepolynomial(‘coset’0 withsyndrome0). 4: for i=0 to2k1 do

5: CalculateVi(x)whichresultsfromtheadditionofzero zi(x)withpolynomialV0(x) 6: end for

Theprocess ofdecoding fromgenotype V

(

x

)

tophenotype U

(

x

)

,giventhegenerator polynomial g

(

x

)

andthe neutral representation

{

z0

,

. . .

,z2−k1

}

ofNNg

(,

k

)

,isperformedby

Algorithm 2

:

Algorithm2Decoding.

1: Input:V(x),g(x),{z0,. . . ,z2−k1}∈NNg(,k)

2: Output:U(x)

3: DeterminethesyndromeS(x)=V(x)mod g(x)

4: AddV(x)withthezero zi(x)oftheneutralrepresentationwhichhassyndromeS(x)

5: CalculatethequotientofthedivisionofV(x)+zi(x)byxktoobtainthephenotypeU(x)anddismisstheremainder

As an example,it ispresented thedecoding ofgenotype V

(

x

)

=

x5

+

x4

+

x2 or v

=

0110100

=

52, usingthe neutral representation

{

0

,

1

,

9

,

77

,

4

,

14

,

6

,

73

}

ofthefamilyNNg

(

7

,

4

)

,generatedbythegeneratorpolynomial g

(

x

)

=

x3

+

x

+

1.The

syndromeis S

(

x

)

=

V

(

x

)

mod g

(

x

)

=

x2

+

1,andthezero withthesamesyndrome is zi

(

x

)

=

x3

+

x2

+

x.Thepolynomial

additionofV

(

x

)

+

zi

(

x

)

= (

x5

+

x4

+

x2

)

+ (

x3

+

x2

+

x

)

=

x5

+

x4

+

x3

+

x andthequotientofthedivisionofV

(

x

)

+

zi

(

x

)

by

xk allowstoobtainthephenotypeU

(

x

)

=

x2

+

x

+

1.

5.1. Equivalentrepresentations

Havinginmindthat thevalues ofthegenotypesare not evaluateddirectly, itisassumedthat thevalue ofeach gene (0or1) isnotinherentlyimportant.Inthissense,itispossibletoestablishan equivalencerelationshipbetweenthe rep-resentationdefined by R1

= {

z0

,

z1

,

. . . ,

z2−k1

}

NNg

(,

k

)

and R2

= {

z0

+

c

,

z1

+

c

,

. . . ,

z2−k1

+

c

}

NNg

(,

k

)

, forany

constant c.In fact,foreach genotype xg1 thatrepresents xp

=

fg1

(

xg1

)

, thereisa corresponding xg2 genotype,such that

xp

=

fg2

(

xg2

)

,andanymutationofxg1 correspondstothesamephenotypethat thesamemutationapplied toxg2,which meansthatthephenotypicneighborhoodofthetworepresentationsareequal.

Torepresent theset ofrepresentations whosezeros differ only by a constant c,it was decided to choosea canonical representation.

Definition3(Canonicalneutralrepresentation).Therepresentationdefinedbythesetofzeros R

= {

z0

,

z1

,

. . . ,

z2−k1

}

,where the index i corresponds to thesyndrome of each zi, is a canonical representationof NNg

(,

k

)

if andonly ifthe vector

(

z0

,

z1

,

. . . ,

z2−k1

)

istheminimumlexicographicofthesetofvectorswhichcorrespondtoall equivalentrepresentations

{

z0

+

c

,

z1

+

c

,

. . . ,

z2−k1

+

c

}

, c

∈ {

0

,

1

}

. The syndrome of zi

+

c can be different from the syndrome of zi and the

componentsofthevectorsareorderedbythesyndromeofzi

+

c.

GivenarepresentationR

= {

z0

,

z1

,

. . . ,

z2−k1

}

,thecanonicalequivalentrepresentationcanbeeasilydeterminedamong

the2krepresentationsconstructedasR

i

= {

z0

+

zi

,

z1

+

zi

,

. . . ,

z2−k1

+

zi

}

,foreachzi

R.Forexample,

Table 3

presents

therepresentationequivalentsto

{

0

,

1

,

9

,

77

,

4

,

14

,

6

,

73

}

and

Table 4

showstheequivalentsrepresentationswithzeros

(8)

Table 3 Equivalentrepresentationsto{0,1,9,77,4,14,6,73}∈NNg(7,4). {0,1,9,77,4,14,6,73} zi 0 {0,1,9,77,4,14,6,73} z0 1 {1,0,8,76,5,15,7,72} z1 9 {9,8,0,68,13,7,15,64} z2 77 {77,76,68,0,73,67,75,4} z3 4 {4,5,13,73,0,10,2,77} z4 14 {14,15,7,67,10,0,8,71} z5 6 {6,7,15,75,2,8,0,79} z6 73 {73,72,64,4,77,71,79,0} z7 Table 4 Equivalentrepresentationsto{0,1,9,77,4,14,6,73}∈ NNg(7,4)withzeros orderedbysyndrome.

Req1 {0,1,9,77,4,14,6,73} Req2 {0,1,76,8,15,5,72,7} Req3 {0,68,9,8,15,64,13,7} Req4 {0,68,76,77,4,75,67,73} Req5 {0,10,2,77,4,5,13,73} Req6 {0,10,71,8,15,14,67,7} Req7 {0,79,2,8,15,75,6,7} Req8 {0,79,71,77,4,64,72,73}

6. Enumerationofneutralrepresentations

Although each representationof NNg

(,

k

)

has beendefinedby the choice oftheir set ofzeros,it isimportant not to

forget that the elements in this set have restrictions of syndrome and connectivity. For this reason, it is important to considerhowrepresentationsofsuchafamilycanbegenerated.

The proposed approach is to define a neighborhood relationship between representations of the same family and to calculateasetofcanonicalrepresentationsastheclosure

[28]

ofaninitialcanonicalrepresentationunderthatrelationship. The closure of an initial canonical representation of a family of representations under the neighborhood relationship is obtainedbyreplacingonezero totherepresentation,untilnomorerepresentationscanbecreatedforthatfamily.

Definition4(Neighboringneutralrepresentations).Twoneutralrepresentationsofa NNg

(,

k

)

family definedby thesets of

zeros R1

= {

z1,0

,

z1,1

,

. . . ,

z1,2−k1

}

andR2

= {

z2,0

,

z2,1

,

. . . ,

z2,2−k1

}

areconsideredneighboringifandonlyifthesets R1

and R2 differbyoneelement,i.e.,

i

∈ {

0

,

. . . ,

2k

1

}

:

j

∈ {

0

,

. . . ,

2k

1

}

, j

=

i

z1,j

=

z2,j.

It is importantto note that neighboringrepresentations ofa canonical representation willnot necessarily have tobe canonicalrepresentations.However,itispossibletoadoptadefinitionofalternativeneighborhood,appliedtothecanonical representationsset.Itshouldbenotedthattheadditionofthesameconstantc tothezeros oftwoneighboring representa-tionsresultsintotwootherrepresentations,eachoneneighboringofeachother.

Definition5 (Neighboringcanonicalneutralrepresentations). Twocanonical neutral representations of the NNg

(,

k

)

family

definedbythesetsofzeros R1

= {

z1,0

,

z1,1

,

. . . ,

z1,2−k1

}

andR2

= {

z2,0

,

z2,1

,

. . . ,

z2,2−k1

}

areconsideredneighborsifand

onlyifthereisathirdrepresentationR3,notnecessarilycanonical,thatisneighborsofR1 accordingto

Definition 4

,which

isequivalenttoR2.

Algorithm 3calculatestheneighboringrepresentationsofacanonicalneutralrepresentation.Oneneutralrepresentation ofNNg

(,

k

)

willhaveamaximumof2k



neighboringrepresentations,wheresomeofthemcanbeequaltothe

represen-tationinquestionby symmetryissues.Itisimportanttonote thataneighboringrepresentationisvalidwhentheneutral networkisconnectedanditisonlyanalyzedafteritscanonicalequivalentrepresentationhasbeenfound.

The neighboringrepresentationsof

{

0

,

1

,

2

,

3

}

NNg

(

4

,

2

)

andtheequivalentcanonical representationsareindicatedin Table 5.

6.1. ClosureofNNg

(,

k

)

underNeighborhoodrelationship

The determination of the canonical representations of the NNg

(,

k

)

family can be performed on the basis of the

calculationoftheclosureofaninitialcanonicalrepresentationundertheNeighborhood relationshipbetweencanonical rep-resentation(Definition 5). Assumingthatthe canonicalrepresentations setisclosedundertheneighborhood relationship, thiscalculationwillenumerateall representations.Intheabsenceofademonstrationofthevalidity ofthisassumption,it ispossibletoconsiderthat,atleast,apartofthedesiredrepresentationswillbefound.

(9)

Algorithm3CalculateneighboringrepresentationsNeighRep.

1: Input:Neutralrepresentation{0,z1,. . . ,z2−k−1}∈NNg(,k)

2: Output:NeighboringrepresentationsofneutralrepresentationNeighRep 3: R= {z0,z1,. . . ,z2−k1}

4: NeighboringrepresentationssetisemptyNeighRep= ∅ 5: while therearezeros ziR do

6: Calculateneighboringofzero zi,suchthatNeigh(zi)= {neigh1(zi),neigh2(zi),. . . ,neigh(zi)}

7: while thereareneighmNeigh(zi)do

8: Calculatesyndromes(neighm)

9: Searchthezero withthesamesyndromes(zj)=s(neighm)

10: Replacethezero foundzjwithneighm

11: if representationR isconnected then

12: DeterminethecanonicalrepresentationequivalenttoR 13: AddtoNeighRep 14: end if 15: end while 16: end while Table 5 Neighboringrepresentationsof{0,1,2,3}∈NNg(4,2)

andtheequivalentcanonicalrepresentations. R1 {0,1,2,4} ⇔ {0,1,2,4} R2 {0,8,2,3} ⇔ {0,1,2,10} R3 {0,1,5,3} ⇔ {0,1,2,4} R4 {9,1,2,3} ⇔ {0,1,2,10} R5 {0,6,2,3} ⇔ {0,1,2,4} R6 {0,1,2,10} ⇔ {0,1,2,10} R7 {7,1,2,3} ⇔ {0,1,2,4} R8 {0,1,11,3} ⇔ {0,1,2,10}

Thealgorithmstartswiththe

{

0

,

1

,

. . . ,

2k

1

}

representation.ThisisacanonicalrepresentationofNNg

(,

k

)

,whatever



and k areconsideredandcorrespondstotherepresentationswithnon-codingbits.Thus,thestartingsetis:

A

= {{

0

,

z1, . . . ,z2k1

}}

(13)

ThesetD ofneighboringrepresentationsofeachrepresentationiscalculatedby

Algorithm 3

.

Algorithm 4

performsthe calculationoftheclosure A undertheNeighborhood relationship:

Algorithm4CalculatetheclosureA underNeighborhood relationship.

1: Input:A= {0,1,. . . ,2k1}NN g(,k)

2: Output:Closure A 3: Closure A=A

4: while therearerepresentationstoanalyzeinClosure A do

5: CalculateD,thesetofneighboringcanonicalrepresentationsoftherepresentationanalyzed 6: for eachneighboringrepresentationinD do

7: if neighboringrepresentationdoesnotbelongtoClosure A then 8: AddtoClosure A

9: end if

10: end for

11: end while

Table 6presentsthenumberofcanonicalneutralrepresentationsofNNg

(,

k

)

foundfor0

<

k

11,whenthenumberof

redundantbitsisin0

< 

k

4 andthegeneratorpolynomial g

(

x

)

hasdegree0

< 

k

4.

Asobserved,the numberofneutralrepresentationsgrowsrapidlywithk andvery rapidlywiththenumberof redun-dantbits



k. Thishuge numbercreatesdifficulties notonly onexecutiontime butalso onthe storagecapacityofthe generatedrepresentations.Forexample,fora

|

p

|

=

22

=

4 phenotypic space, if3 redundantbits areadded,266 neutral

representationsareobtained,while1 845 628arereachedwhen4 redundantbitsarejoined.

As thenumber ofneutralrepresentations obtainedforsome NNg

(,

k

)

is very high, itwas decided to usea freeware

library, Oracle Berkeley DB that provides a high performance bases from embedded data and can be used with the C programminglanguage,C++,Java,Perl, Python,amongothers.Inthiscase, theClanguage wasused.Thisdatabaseallows youtomanipulatedataintheorderofterabytesinvarioussystemsincludingUNIXandMicrosoftWindows.

7. PropertiesofNNg

(,

k

)

Toanalyzetherelationshipsbetweenthepropertiesoftherepresentationsandtheperformance ofNNg

(,

k

)

appliedto

(10)

Table 6

NumberofneutralrepresentationsofNNg(,k).

x+1 x2+x+1 x3+x+1 x4+x+1 g(x) 1 2 4 19 1489 2 3 9 266 1 845 628 3 4 18 1828 4 5 32 8128 5 6 50 25 100 6 7 75 66 750 7 8 108 158 784 8 9 147 342 701 9 10 196 690 748 10 11 256 1 307 528 11 12 324 2 350 336 k

Fig. 1. Hypergraph representing a genotypic space with|g| =2=27.

Fig. 2. Hypergraph representing the genotypic space with=4 and the phenotypes of the neutral representation{0,1,2,4} ∈NNg(4,2). 7.1. Uniformity

The NNg

(,

k

)

familyisconsidereduniformbecauseall phenotypesarerepresentedby thesamenumberofgenotypes.

IneachrepresentationofNNg

(,

k

)

,eachphenotypeisrepresentedby2kgenotypes.

7.2. Phenotypicneighborhoodandconnectivity

Using hypergraphsisa wayofvisualizingthephenotypicneighborhood ofa representation.

Fig. 1

displays agenotypic spacewithcardinality

|

g

|

=

2

=

27,whereforthesakeofsimplicity,onlythe1100101 genotype(



)andits7neighbors

(

#

)arepresented,whileforallothergenotypes,onlythefirst4bitsareindicated.

Forexample,

Fig. 2

presentsahypergraphofthegenotypicspaceoftheneutralrepresentation

{

0

,

1

,

2

,

4

}

NNg

(

4

,

2

)

.In

(11)

Table 7

Genotypesandcorrespondingphenotypes[]ofneutralrepresentation{0,1,2,4}∈NNg(4,2).

syndrome 0 syndrome 1 syndrome 2 syndrome 3

0000 [00] 0001 [00] 0010 [00] 0100 [00]

0111 [01] 0110 [01] 0101 [01] 0011 [01]

1001 [10] 1000 [10] 1011 [10] 1101 [10]

1110 [11] 1111 [11] 1100 [11] 1010 [11]

Table 8

Neighboringgenotypesandcorrespondingphenotypesoftheneutralnetworkthatrepresents phenotype01 of{0,1,2,4}∈NNg(4,2). Neigh. gen. 1 2 3 4 0111 0110 [01] 0101 [01] 0011 [01] 1111 [11] 0110 0111 [01] 0100 [00] 0010 [00] 1110 [11] 0101 0100 [00] 0111 [01] 0001 [00] 1101 [10] 0011 0010 [00] 0001 [00] 0111 [01] 1011 [10] Table 9

MaximumconnectivityforeachNNg(,k).

x+1 x2+x+1 x3+x+1 x4+x+1 x5+x2+1 g(x) 1 1 1 1 1 1 2 2 3 3 3 (3) 3 3 5 7 (7) (7) 4 4 7 14 (15) (15) 5 5 9 18 (30) (31) 6 6 11 22 (43) (59) 7 7 13 26 (55) (98) 8 8 15 30 (63) (125) 9 9 17 34 (–) 10 10 19 38 (–) 11 11 21 42 (91) k

Table 7presentsthegenotypesandthecorrespondingphenotypes[]obtainedusingtheneutralrepresentationpresented in

Fig. 2

.Astherepresentationisuniform,each phenotypeisrepresentedby 2k

=

22

=

4 genotypes,whichcorresponds

tothenumberofzeros oftheneutralrepresentation.Furthermore,aseachphenotypeisrepresentedby aneutralnetwork with 2k genotypes, each ofwhich has



neighboring genotypes, including, atleast, a genotype which decodes to the

samephenotype,thephenotypicneighborhoodofeachphenotypehas,atmost,2k

(

1

)

differentphenotypes.Theactual

numberofdifferentphenotypesinthisneighborhooddeterminestheconnectivityoftherepresentation,whichisthesame forallphenotypes definedbythe representation,although thereachedphenotypescan bedifferent. Ascan beconfirmed with

Table 8

,theconnectivityof

{

0

,

1

,

2

,

4

}

NNg

(

4

,

2

)

is3,becauseitispossibletoreach3differentphenotypes(excluding

itself). When comparing thisrepresentation witha non-redundant representation withphenotype length of2, in which eachphenotypecanonlyreach2 differentphenotypes,theneutralrepresentation

{

0

,

1

,

2

,

4

}

NNg

(

4

,

2

)

canreachalarger

numberofdifferentphenotypes,inthiscase 3.

Inconclusion,thewayastheneutralnetworksoftheserepresentationsarestructuredallowedthat:

1. AllneutralrepresentationsNNg

(,

k

)

haveaconnectivityhigherorequaltotheconnectivityofthecorresponding

non-redundantrepresentation,i.e.,higherthanorequaltok;

2. The connectivity of anyrepresentation of NNg

(,

k

)

does not vary with the phenotype andthe number ofdifferent

reachablephenotypesisequalforallphenotypesofthesameneutralrepresentation; 3. Differentgenotypesofaneutralnetworkcanreachdifferentphenotypes.

Table 9presentsthemaximumconnectivityforeachNNg

(,

k

)

,wherefortheneutralrepresentationswhosenumeration

wasnotpossibleduetothelimitationsexplainedinsection5,alowerlimitforthemaximumconnectivitywasobtainedby generatingneutralrepresentationsfromtheneutralrepresentationwiththegreatestconnectivityfounduntilthemoment, andleavingthealgorithmrunninguntilnomoreincreasinginconnectivitywasdetectedforagivenperiodoftime. These valuesappearinside()in

Table 9

.Notethatthemaximumconnectivityappearstoincreaseexponentiallywiththenumber ofredundantbits



k.

(12)

Fig. 3. MDS of some NNg(7,4)representations.

Fig. 4. MDS of some NNg(14,11)representations.

7.3. Topology

As the NNg

(

7

,

4

)

representations proposed exhibit neutral networks, the topology is an appropriated property to be

analyzed.Inordertovisualizethetopology ofaneutralnetwork,atechnique calledMDS(Multi-DimensionalScaling)was used [30]. This technique consists in determining the positions of a set of points in the Euclidean space, such that the distances between them approximate a given measure of dissimilarity of the objects that are to be visualized. Neutral network topologiesmaybe visualizedinthisway,bytakingtheHammingdistancebetweenthegenotypesoftheneutral networksastheMDSdissimilaritymeasure.

Fig. 3

showsthetopologiesofsome neutralnetworksoftheNNg

(

7

,

4

)

family.

Whilethenon-codingredundantrepresentations,inwhichthegenotypicredundantbitsdonotaffectthephenotype,display thetopologyofacube,therepresentationswithmaximumconnectivity(14)intheNNg

(

7

,

4

)

familyexhibitaYtopology.

Fig. 4 showsthetopologies ofsome neutralnetworksoftheNNg

(

14

,

11

)

representations,whichpresentsthe greatest

connectivity(42),while

Fig. 5

displaysthetopologiesofsomerepresentationsoftheNNg

(

6

,

2

)

family.

7.4. Synonymity

The calculationofthe synonymityfollowstheequivalence classesapproachproposed by [5],inwhich ifallgenotypes

xg

∈ 

xp

(13)

Fig. 5. MDS of some NNg(6,2)representations.

Table 10

MinimumandmaximumsynonymityforeachNNg(,k).

x+1 x2+x+1 x3+x+1 x4+x+1 g (x) 1 1 1.33—1.5 1.71—2.18 2.13—2.67 2 1 1.33—1.67 1.71—2.71 3 1 1.33—1.67 1.71—2.86 4 1 1.33—1.67 1.71—2.86 5 1 1.33—1.67 1.71—3 6 1 1.33—1.67 1.71—3 7 1 1.33—1.67 1.71—3 8 1 1.33—1.67 1.71—3 9 1 1.33—1.67 1.71—3 10 1 1.33—1.67 1.71—3 11 1 1.33—1.67 1.71—3 k

to thesameequivalence classissmall, then therepresentationis consideredsynonymouslyredundant. Thus, synonymity forNNg

(,

k

)

representationscan be simplifiedusing expression

(14)

because thedistances betweenallgenotypes ofthe

neutralnetworkthatrepresentagivenphenotypeareequalforallphenotypes:

Syn

=



2k1 i=0



2k1 j=0 d

(

zi,zj)

(

2k

)

× (

2k

1

)

(14)

where d

(

zi

,

zj

)

corresponds to the Hamming distance between zero zi and zero zj. This expression gives a normalized

measureofsynonymity,sincethetotalnumberofdistanceswereconsidered.Arepresentationisconsideredsynonymously redundantwhenthevalue of Syn islowandnon-synonymouslyredundantwhenthisvalueishigh. Asthesynonymityis

ameasureofhowclosethegenotypesthatrepresentthesamephenotypeare,thesynonymityisrelatedwiththeneutral networkstopology.

Table 10

showstheminimumandmaximumsynonymityforNNg

(,

k

)

.

7.5. Locality

Asthelocality doesnot varywithdifferentphenotypesforNNg

(,

k

)

,thismeasurecanbe calculatedonthebasisofa

singlephenotype,leadingtoasimplificationofexpression

(2)

.Thus,localityfortheNNg

(,

k

)

familycanbecalculatedusing

theaveragedistancefromaphenotypetothecorrespondingneighboringphenotypes,includingthephenotypeinquestion.

(14)

Table 11

MinimumandmaximumlocalityforeachNNg(,k).

x+1 x2+x+1 x3+x+1 x4+x+1 g(x) 1 0.5 0.33—0.5 0.25—0.50 0.2—0.57 2 0.67—1.0 0.5—1.0 0.4—1.0 3 0.75—1.25 0.6—1.5 0.5—1.58 4 0.8—1.4 0.67—1.67 0.57—2.0 5 0.83—1.5 0.71—1.86 0.62—2.38 6 0.86—1.57 0.75—2.0 0.67—2.78 7 0.88—1.62 0.78—2.0 0.7—2.95 8 0.89—1.67 0.8—2.1 0.73—3.14 9 0.9—1.7 0.82—2.18 0.75—3.25 10 0.91—1.73 0.83—2.17 0.77—3.27 11 0.92—1.75 0.85—2.23 0.79—3.36 k

Fig. 6. Relationships between connectivity, synonymity and locality of NNg(14,11).

There areotherpropertiessuchasinnovationrate

[31]

,diameterofconnectedneutralnetworks, accessibilityof neigh-boringneutralnetworks

[32]

andothersthatwillbeanalyzedinafuturework.

7.6. RelationshipsbetweenpropertiesofNNg

(,

k

)

Fig. 6presentsscattergraphsshowingtherelationshipbetweenconnectivity,localityandsynonymityoftheNNg

(

14

,

11

)

family, which corresponds tothe familyofrepresentations withthe highestvalue ofk thatwas completelyenumerated. Althoughthescattergraphsonlyshow thepropertiesfortheNNg

(

14

,

11

)

family,theresultsare similarfortheremaining

familiesofNNg

(,

k

)

.Thesescattergraphsshowthattherearerepresentationswithlowlocalitythatreachhighconnectivity

values.Thechartattheright-topin

Fig. 6

contradictstheideapresentedin

[5]

thatsynonymouslyredundantrepresentations do not allowthe connectivitybetweenthe phenotypesto increase incomparisonwith thecorresponding non-redundant representations. Thecircleinthischart showsthatthere isasetofsynonymouslyneutralrepresentations thathavehigh valuesofconnectivity. TheNNg

(,

k

)

familyhasavariety ofrepresentationswithdifferentpropertiesthatprovidean

im-portantbasis forthestudyoftherelationshipsbetweenthemandwillallowtostudytowhatextentthesepropertiescan affecttheperformanceofanevolutionaryalgorithm.

8. NK fitnesslandscapes

Inordertostudytheeffectofredundancyandneutralityonevolutionaryalgorithmperformance,evolutionarystrategies, originally proposed by [18,19] for integer and continuous spaces, are used in this research for binary spaces and with constantmutationrate.Inparticular,anevolutionarystrategy(1+1)-ES isappliedtoNK fitnesslandscapes

[33]

withdifferent degreesofruggedness.

(15)

Fig. 7. Hypergraph representing an instance of a NK(3,2)fitness landscape.

TheNK fitnesslandscapesproposedby

[33]

arebasedonthefitnesslandscapesproposedby

[34]

.Thisauthorfoundthat whenfitness interactions existbetweengenes,the geneticcompositionofa populationcan evolveinto multipledomains ofattraction

[35]

.Ontheotherhand,genescansuppressorenabletheeffectsofothergenes,andthisinteractioniscalled epistasis.Wrightestablishedaconceptuallinkbetweenthismicroscopicpropertyoftheorganisms(epistasis)anda macro-scopicpropertyoftheevolutionarydynamics(multipleattractorsofthepopulationintheareaofgenotypes),andusedthe analogywithalandscapewithmultiplepeakstoillustratethissituation.Fitnesslandscapesmaybedefinedasfollows

[36]

:

Definition6(Fitnesslandscape). A landscape isa triplet

(

S

,

V

,

f

)

where S is a set ofpotential solutions that matchthe search space, V

:

S

2S is afunction that assigns toeach s

S aset ofneighbors V

(

s

)

S,defining theneighborhood

structure,and f

:

S

→ R

isafitnessfunctionthatassignsavalueoffitnesstoeachsolutions.

The NK fitness landscapes allow toexplore the wayasepistasis can control theruggedness ofthe landscape.Tothis purpose, fitness functions family whose ruggedness could be shaped by a single parameter was developed. NK fitness

landscapesare definedasstochasticallygeneratedfitnessfunctionsofbitstrings withlength N andinteractions between groupsof K

+

1 bits.The fitnessofeach bitdependson its value andthevaluesofthe bitswithwhich itinteracts.For each instanceofthe landscape,fitness valuesassociated witheach bitarerealizationsofarandom variablewithuniform distributionbetween0.0and1.0,andthefitnessofeachchromosomeistheaverageofthebitsfitnessforall N loci.

Theruggednessofthelandscapeis controlledby theparameter K andreachesits maximumwhen K

=

N

1 andits minimumwhen K

=

0.When K

=

0,thereisnoepistasis,thelandscapehasauniquelocaloptimum(theglobaloptimum), andthefitnessofdifferentchromosomesishighlycorrelatedwiththeHammingdistancebetweenthem.When K

=

N

1, thenumberofinteractionsbetweenbitsismaximumandthefunctionispurelyrandom,sothelandscapehasmanylocal optima,andthefitnessofchromosomesisnotcorrelatedwiththeHammingdistancebetweenthem.

The choice of the K bits is known to affect the computational complexity of the NK fitness landscape optimization problem.WhenthesebitsaretheK nearesttothelocusthatisbeingassessed,theproblemispolynomialinN,whereasit becomesNP-hardwhenK

>

1 andsuchbitsarerandomlychosen

[37]

.

Next the definitions of local optimum and global optimum in NK fitness landscapes are given as well as how the

NNg

(,

k

)

representationscanchangetheruggednessoftheNK fitnesslandscapes.

9. RuggednessofNK fitnesslandscapes

Alocaloptimum

[38]

correspondstoapointinthefitnesslandscapethathasnoneighborswithhigherfitness.Alocal optimum willbe theglobaloptimum ifitis thebest solutionfortheproblem. A definitionoflocaloptimum andglobal optimumisgiven

[36]

:

Definition7(Localoptimum).Givenafitnesslandscape

(

S

,

V

,

f

)

,asolutions

S isalocaloptimumifandonlyif:

s

V

(

s

),

f

(

s

)

f

(

s

)

Definition8(Globaloptimum).Givenafitnesslandscape

(

S

,

V

,

f

)

,asolutions

S isalocaloptimumifandonlyif:

s

S

,

f

(

s

)

f

(

s

)

(15)

AstheNK fitnesslandscapecanbegraduallytunedfromsmoothtorugged,itisagoodfitnessmodeltostudydifferent typesofrepresentations.Awaytocharacterizethenatureofthelandscapeistounderstandtheruggednessorsmoothness ofthe landscape basedon thenumber oflocal optima,also knownas modality [38],and thedistribution ofthose local optima.As itis known,a landscapeisinduced by theoperator whichisused todefine neighborhoods[39]. Inthiscase, as the landscape is defined over the binary space, the Hamming metric is used and the neighborhood relation can be representedbyahypergraph.Next,anexampleisgiventhatshowsthat NNg

(,

k

)

representationscanchangethenumber

oflocaloptimaoftheNK fitnesslandscapes,changinginthiswaythedifficultyoftheproblem.

Fig. 7 representsan instanceof aNK

(

3

,

2

)

fitnesslandscape,where N

=

3 and K

=

2. Havinginmind thedefinitions givenin section 9,the globaloptimum inthis instancecorresponds to genotype 100 with fitness0

.

640,while 001, 010

Imagem

Table 3 Equivalent representations to { 0 , 1 , 9 , 77 , 4 , 14 , 6 , 73 } ∈ NN g ( 7 , 4 )
Table 6 presents the number of canonical neutral representations of NN g (, k ) found for 0 &lt; k ≤ 11, when the number of redundant bits is in 0 &lt;  − k ≤ 4 and the generator polynomial g ( x ) has degree 0 &lt;  − k ≤ 4.
Fig. 2. Hypergraph representing the genotypic space with  = 4 and the phenotypes of the neutral representation { 0 , 1 , 2 , 4 } ∈ NN g ( 4 , 2 )
Table 7 presents the genotypes and the corresponding phenotypes [] obtained using the neutral representation presented in Fig
+7

Referências

Documentos relacionados

A tarefa consiste em parafrasear e normalizar, en- tre outras, constru¸c˜ oes como vou-lhe/posso-lhe fazer uma surpresa em vou/posso fazer-lhe uma surpresa, em que o pronome

A presente pesquisa teve como objetivo verificar a incidência de custos transacionais na implementação de programas de modernização administrativa e fiscal, realizando um estudo

Fig 2 shows the average simulated pedestrian throw distance as a function of the collision velocity, for different pedestrians and vehicles.. The pedestrian throw

Para a praia de Carimã foram estabelecidas três classes de oportunidades, representando um continuum de qualidade ambiental, onde a Classe 1 está associada a

As imagens CBERS possibilitaram diagnosticar o uso e ocupação do solo na Microbacia do Riacho Maracajá, Olivedos, PB, sendo constatado que 80,84% da área ainda

Observando a figura 11 verifica-se que as concentrações mais baixas de antifúngico não alteraram a concentração celular viável dos biofilmes das três estirpes, que

The “drop of honey effect” metaphor shows to be clearly adequate to present the way the virus spread from Wuhan to the world, already reaching the generality of the

A análise dos valores deste quadro sugere para esta estrutura algébrica uma maior eficiência das implementações do algoritmo Tietze-SA (V) e do algoritmo SA (+), sendo também de reter