Comparative analysis of strategies for feature extraction and classification in SSVEP BCIs.

Biomedical Signal Processing and Control

Journal homepage: www.elsevier.com/locate/bspc

Sarah N. Carvalho a,b,∗, Thiago B.S. Costa a, Luisa F.S. Uribe a, Diogo C. Soriano c, Glauco F.G. Yared b, Luis C. Coradine d, Romis Attux a

a University of Campinas, UNICAMP, Campinas, Brazil
b Federal University of Ouro Preto, UFOP, Ouro Preto, Brazil
c Federal University of ABC, UFABC, Santo André, Brazil
d Federal University of Alagoas, UFAL, Maceió, Brazil

Article info

Article history:
Received 20 January 2015
Received in revised form 15 April 2015
Accepted 5 May 2015
Available online 1 June 2015

Abstract

Brain–computer interface (BCI) systems based on electroencephalography have been increasingly used in different contexts, engendering applications from entertainment to rehabilitation in a non-invasive framework. In this study, we perform a comparative analysis of different signal processing techniques for each BCI system stage concerning steady state visually evoked potentials (SSVEP), which includes: (1) feature extraction performed by different spectral methods (bank of filters, Welch's method and the magnitude of the short-time Fourier transform); (2) feature selection by means of an incremental wrapper, a filter using Pearson's method and a cluster measure based on the Davies–Bouldin index, in addition to a scenario with no selection strategy; (3) classification schemes using linear discriminant analysis (LDA), support vector machines (SVM) and extreme learning machines (ELM). The combination of such methodologies leads to a representative and helpful comparative overview of robustness and efficiency of classical strategies, in addition to the characterization of a relatively new classification approach (defined by ELM) applied to the BCI-SSVEP systems.

1. Introduction

A brain–computer interface (BCI) is a device that aims to map brain signals onto commands for external devices, defining an alternative communication channel for users in different practical contexts, which can include applications from computer games to assistive technologies [1,2].

BCIs, in general, make use of electroencephalography (EEG) [3], as a consequence of factors like portability, non-invasiveness and cost. EEG signals are acquired with the aid of an electrode cap positioned on the user's scalp, which is connected to pre-processing and sampling modules. The design of a BCI is determined by the chosen paradigm, the main trends in the field [4] being motor imagery, P300 and steady state visually evoked potentials (SSVEP). The last two are approaches based on event-related potentials (ERP). The first of these paradigms relies on the ability of the operator to modify the activity of the motor cortex by imagining the movement of parts of either side of his/her body (e.g. opening or closing the right or the left hand) [5], while the second makes use of a specific event-related potential, the P300 wave, to characterize the interaction between the operator and a command interface [6]. Finally, the SSVEP paradigm, the subject of this study, is based on the analysis of oscillating EEG patterns that are generated in the cortex in response to certain visual stimuli. More specifically, when an individual is visually stimulated by a pattern that flickers repetitively within a certain range of frequencies, a synchronized SSVEP can be detected in his/her brain electrical activity. Hence, if light sources with different flickering rates are used to build a command interface, it is possible to identify on which light the subject focused his/her attention at a given period of time by suitably processing and classifying the EEG signal.

∗ Corresponding author at: Federal University of Ouro Preto, Rua 36, número 115, sala G302, 35931-008, João Monlevade, Minas Gerais, Brazil. Tel.: +55 31 3852 8719.
E-mail addresses: sarah@deelt.ufop.br (S.N. Carvalho), bulhoes@dca.fee.unicamp.br (T.B.S. Costa), lsuarez@dca.fee.unicamp.br (L.F.S. Uribe), diogo.soriano@ufabc.edu.br (D.C. Soriano), attux@dca.fee.unicamp.br (R. Attux).

In general, the structure of an SSVEP-based BCI can be roughly divided into four stages: data acquisition, signal processing, command generation and final application [7]. Fig. 1 shows a block diagram of this structure highlighting the four stages of the signal processing module, which is the focus of this study. The first stage, pre-processing, is based on temporal and spatial filtering and is typically of a more general character. The second and third stages, on the other hand, have a stronger dependence with respect to the features of the selected paradigm. The classifier stage generates the control command based on the input signal.

Fig. 1. Overview of a BCI system.

http://dx.doi.org/10.1016/j.bspc.2015.05.008

In this study, we will perform a comparative analysis of methods for feature extraction, feature selection and classification in SSVEP BCIs. Three feature extraction approaches (spectral estimation using a bank of band-pass filters, Welch's method and the magnitude of the short-time Fourier transform (STFT) calculated at the evoked frequencies) and three classifiers (a linear discriminant, an extreme learning machine (ELM) [8] and a support vector machine (SVM) [9]) will be considered. Furthermore, the performance of each structure will be analyzed under three feature selection approaches: an incremental wrapper [10], a filter using Pearson's method [11] and a strategy based on the Davies–Bouldin index [12], in addition to a case without feature selection. This repertoire of 36 scenarios applied on the same database defines interesting comparative elements: (1) since SSVEP engenders a well-defined spectral response, this study is relevant as a performance analysis of distinct frequency-domain feature extraction methods. (2) The robustness of nonlinear structures, such as ELM and SVM, in handling the required SSVEP classification task is investigated. (3) The process of channel selection is analyzed adopting three strategies with distinct conceptual foundations. (4) Statistical considerations are made about the best configuration of electrodes according to different methods of feature selection.
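The count of 36 scenarios follows from crossing three extraction methods, four selection options (three strategies plus the case without selection) and three classifiers. The short sketch below only enumerates these combinations; the string labels are illustrative names chosen here, not identifiers used by the authors.

```python
from itertools import product

# Illustrative labels for the methods listed in the text (names are assumptions).
EXTRACTORS = ["filter_bank", "welch", "stft"]
SELECTORS = ["none", "pearson", "davies_bouldin", "incremental_wrapper"]
CLASSIFIERS = ["lda", "elm", "svm"]

# Every scenario is one (extractor, selector, classifier) triple.
scenarios = list(product(EXTRACTORS, SELECTORS, CLASSIFIERS))

print(len(scenarios))  # 3 * 4 * 3 = 36
```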

This study will be carried out using a database generated according to the experimental setup described in Section 3. In addition to the contribution that the study as a whole represents, we believe the analysis of the performance of an ELM in SSVEP systems can also be considered as a contribution per se, as an equivalent analysis, to the best of our knowledge, has not been reported so far in the literature.

The remainder of this paper is organized as follows. Section 2 briefly presents the SSVEP paradigm. Section 3 describes the experimental setup and the data recording procedures. Sections 4–7 discuss the four stages of signal processing, i.e., pre-processing, feature extraction approaches, feature selection and the classification criteria, respectively. Section 8 presents the results, while Section 9 contains our conclusions and final remarks.

2. Fundamentals of steady state visually evoked potentials

The neurophysiology of the human visual system indicates that the neuronal activity of the cells of the visual cortex is altered by visual stimulation, and it is possible to identify variations of the brain response related to properties of the visual stimulus, such as luminance, contrast and frequency (between 1 Hz and 100 Hz [13]). Neurons in the visual cortex synchronize their firing to the blinking frequency of the visual stimulus. Steady state visually evoked potentials occur when visual stimuli are presented repeatedly, creating almost sinusoidal oscillations [14,15]. The EEG response presents an increase of energy at the frequency of the blinking stimulus [16]. The strongest response occurs in the primary visual cortex, although other areas of the brain are activated in varying degrees. The SSVEP can be detected within narrow frequency bands (e.g., 0.1 Hz) around the frequency of visual stimulation via signal processing methods that exploit specific characteristics of the signal, such as timing and rhythm.

SSVEP BCI systems use visual stimuli as a way to evoke a certain electrical pattern in the visual cortex. Unlike independent BCI systems, where the implementation is based on voluntary control of the neural activity of the subject [17,18], the operation of SSVEP systems depends on the ability of the subject to focus on, fix and follow the visual stimuli according to an intended action, as well as on the adopted signal processing strategies, which justifies the extensive scenarios analyzed in the present study.

3. Experimental setup

The stimulation interface (see Fig. 2) consists of two square checkerboards with sides of 3.8 cm, displayed on the right and left centers of a black screen, blinking at 12 and 15 Hz, respectively. A 14-in. monitor with a refresh rate of 60 Hz was used. The subject focused his/her gaze for 12 s on each stimulus, repeating this process eight times with rest intervals. The EEG data were collected from seven healthy volunteers, with an average age of 26.3 ± 3.3 years. The acquisition protocol was approved by the Ethics Committee of the University of Campinas (n. 791/2010). The database is composed of 1344 s of EEG data recorded at a sample rate of 256 Hz, using a g.SAHARAsys dry-electrode cap with 16 channels and a g.USBamp biosignal amplifier [19], and recorded in MATLAB® 2012b, using an Application Programming Interface (API) provided by the aforementioned device manufacturer. The acquisitions were only performed after the following procedures regarding the EEG apparatus: channel impedance calibration; verification of the electrode impedance calibration (between 0.5 and 5.0 kΩ); connection of the ground and reference channel sockets, respectively, to common ground and reference; and stabilization of the signal. The ground and reference are positioned on the mastoids. Fig. 3 shows the arrangement of electrodes at O1, O2, Oz, POz, Pz, PO4, PO3, PO8, PO7, P2, P1, Cz, C1, C2, CPz, FCz, according to the international 10–20 system [20].

Fig. 2. Experimental setup. (a) Screen with checkerboards used to generate visual stimuli at 12 and 15 Hz. (b) Configuration of equipment and data collection environment.

Fig. 3. Disposition of electrodes on the scalp for EEG signal acquisition.
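As a quick consistency check, the database duration of 1344 s can be recovered from the protocol figures quoted above, assuming every one of the seven subjects completed eight 12 s gazes on each of the two stimuli. The constant names below are introduced here for illustration only.

```python
# Recording protocol figures quoted in the text.
N_SUBJECTS = 7
N_STIMULI = 2          # 12 Hz and 15 Hz checkerboards
N_REPETITIONS = 8      # gazes per stimulus
GAZE_SECONDS = 12
SAMPLE_RATE_HZ = 256

total_seconds = N_SUBJECTS * N_STIMULI * N_REPETITIONS * GAZE_SECONDS
samples_per_channel = total_seconds * SAMPLE_RATE_HZ

print(total_seconds)        # 1344
print(samples_per_channel)  # 344064
```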

4. Pre-processing

Several interferents are added to the EEG signal during recording. These artifacts compromise the quality of the obtained signal, affecting the BCI performance. The main artifact sources are: the EEG equipment and its connections to the scalp; the electrical power source (60 Hz); and the normal electrical activity of the subject, such as heartbeat, eye blinking, eye movement and muscles in general. Recognition and elimination of artifacts in EEG signals are complex tasks, but essential to the development of practical systems.

In this study, the EEG signal was filtered by an analog Butterworth bandpass filter (5–60 Hz) and a notch filter (58–62 Hz) in order to remove the smooth displacement and electromagnetic artifacts. Next, to remove other artifacts present in the band of interest, such as eye blinking and neck movements, the data are submitted to spatial filtering using the Common Average Reference (CAR) [21] method, defined as:

$$V_i^{CAR} = V_i^{ER} - \frac{1}{n}\sum_{j=1}^{n} V_j^{ER} \qquad (1)$$

where $V_j^{ER}$ is the potential of the j-th electrode measured with respect to the same reference and n is the number of electrodes in the array. The CAR subtracts the average value of the entire array from each electrode, hence eliminating similar artifacts present in most electrodes. Although noise sources are deeply complex and vary across and within subjects, temporal and spatial filtering have been demonstrated to be convenient to maximize the signal-to-noise ratio and to improve the accuracy of the SSVEP BCI system [22,23].
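The CAR operation of Eq. (1) amounts to removing, at every time instant, the mean over all electrodes. A minimal sketch is given below, assuming the recording is stored as a list of time frames with one potential per electrode; the function name and data layout are illustrative choices, not the authors' implementation.

```python
def common_average_reference(frames):
    """Apply CAR to a recording.

    frames: list of time frames, each a list with one potential per electrode.
    Returns a new list in which the array mean was subtracted from each electrode.
    """
    referenced = []
    for frame in frames:
        mean = sum(frame) / len(frame)          # average over the n electrodes
        referenced.append([v - mean for v in frame])
    return referenced


# Two time instants of a hypothetical 3-electrode array.
frames = [[1.0, 2.0, 3.0], [10.0, 10.0, 13.0]]
car = common_average_reference(frames)
print(car)  # [[-1.0, 0.0, 1.0], [-1.0, -1.0, 2.0]]
```

After CAR, the potentials of each time frame sum to zero, which is why components shared by most electrodes are attenuated.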

5. Feature extraction approaches

Features are, in simple terms, elements of a compact and efficient data representation [24]. In the context of a BCI system, it is essential that the features extracted from the brain signals facilitate the discrimination task to be performed at the classification stage. As discussed in Section 1, the SSVEP paradigm is based on the detection of oscillating patterns within EEG waves, hence the use of spectral features is a natural choice [25]. Fig. 4 shows the spectral characteristics of the SSVEP responses observed on channel O2 for the evoked frequencies 12 and 15 Hz. It is noticeable that the spectral content is concentrated around the evoked frequencies.

In fact, the standard technique for identifying the SSVEP response associated with an EEG signal is to analyze the signal in the frequency domain by calculating its power spectral density in all possibly evoked frequency bands. As each of these bands corresponds to the immediate vicinity of one of the interface blink rates, it is possible to identify the desired BCI command. In this study, the underlying spectral content was estimated using three approaches: a filter bank, the short-time Fourier transform and Welch's method.

5.1. Filter bank

An intuitive way to estimate the spectral power of an SSVEP signal is to focus on the frequency range of interest and assess the spectral content of this interval. The filter bank uses this idea, combining a set of bandpass filters that separates the input signal into multiple components [26], each one carrying a single frequency sub-band of the original signal, as shown in Fig. 5.

In our study, the filter bank is designed with two equiripple bandpass filters centered at the evoked frequencies, with 2 Hz bandwidth, attenuation of 40 dB in the stop band and 1 Hz of transition range (see Fig. 6). The output power of the elements of the bank is considered as an estimate of the power spectrum at the central frequencies.

Fig. 4. Feature extraction of the SSVEP response at 12 and 15 Hz: (a) power spectral density (Welch) on channel O2 as a function of frequency (Hz), for the evoked frequencies of 12 Hz and 15 Hz; (b) space of spectral features considering only an occipital channel (spectral features extracted at 12 Hz versus spectral features extracted at 15 Hz).

Fig. 5. Filter bank scheme for two frequencies.

5.2. Short-time Fourier transform

The short-time Fourier transform allows the estimation of the power spectrum via the computation of the Fourier transform on segments of the signal, normally with an overlap to reduce artifacts at the boundary [26]. The obtained complex values provide information concerning the magnitude and phase of each point in time and frequency. The STFT is given by

$$X(m,\omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n-m]\, \exp(-j\omega n) \qquad (2)$$

in which x[n] is the signal, w[n] is the window, m is the time shift that positions the window and ω is the angular frequency. The squared magnitude of the STFT gives the spectrogram:

$$\mathrm{spectrogram} \equiv |X(m,\omega)|^2 \qquad (3)$$

which provides an estimate of the power spectrum of the signal. In our study, the spectrogram is computed around the two evoked frequencies (12 and 15 Hz), using Hamming windows of 3 s with 1 s of overlap.
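Equations (2) and (3) can be illustrated by evaluating the windowed Fourier sum at a single frequency for each 3 s segment. The sketch below is a standard-library illustration under stated assumptions (a synthetic 12 Hz sine, local time indexing inside each segment, which changes only the phase and not the magnitude); the function names are hypothetical and this is not the authors' MATLAB code.

```python
import cmath
import math


def hamming(n_points):
    """Hamming window of n_points samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def stft_power_at(x, fs, freq, win_s=3.0, overlap_s=1.0):
    """Squared STFT magnitude at one frequency (Hz), one value per segment."""
    n_win = int(round(win_s * fs))
    hop = int(round((win_s - overlap_s) * fs))   # 3 s window, 1 s overlap -> 2 s hop
    window = hamming(n_win)
    omega = 2 * math.pi * freq / fs              # angular frequency in rad/sample
    powers = []
    for start in range(0, len(x) - n_win + 1, hop):
        acc = sum(window[n] * x[start + n] * cmath.exp(-1j * omega * n)
                  for n in range(n_win))
        powers.append(abs(acc) ** 2)
    return powers


# Synthetic 12 s trial containing only a 12 Hz oscillation, sampled at 256 Hz.
fs = 256
x = [math.sin(2 * math.pi * 12 * n / fs) for n in range(12 * fs)]
p12 = stft_power_at(x, fs, 12.0)
p15 = stft_power_at(x, fs, 15.0)
print(len(p12))  # 5
```

For this synthetic trial, every segment carries far more power at 12 Hz than at 15 Hz, which is the contrast the spectral features are meant to expose.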

5.3. Welch's method

Welch's method estimates the power spectral density (PSD) applying the fast Fourier transform (FFT) algorithm [26,27]. The method splits the input data into K segments, computes modified periodograms of the segments via FFT and estimates the PSD by averaging these periodograms. The estimation of the PSD can be expressed by

$$\hat{S}(\omega) = \frac{1}{KNU}\sum_{k=1}^{K}\left|\sum_{n=1}^{N} W(n)\, x(n+kD)\, \exp(-j\omega n)\right|^2 \qquad (4)$$

in which the signal is divided into K segments of length N, each shifted by D points with respect to the previous one. W is a window function and U is a constant given by:

$$U = \frac{1}{N}\sum_{n=1}^{N} |W(n)|^2 \qquad (5)$$

In the present study, the data were windowed by Hamming windows of 3 s with 1 s of overlap. The PSD was estimated for each visual stimulus using 1 Hz bands centered on the frequencies of 12 and 15 Hz, with a step of 0.01 Hz.
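The estimator of Eqs. (4) and (5) can be sketched directly, evaluating the modified periodograms at chosen frequencies instead of on an FFT grid. The code below is a standard-library illustration; the function names are hypothetical, and reducing each 1 Hz band to the mean of its PSD samples is an assumption, since the text does not state how the band values are combined.

```python
import cmath
import math


def welch_psd_at(x, fs, freq, win_s=3.0, overlap_s=1.0):
    """Welch PSD estimate at a single frequency (Hz), following Eq. (4)."""
    n_win = int(round(win_s * fs))                 # N, segment length
    shift = int(round((win_s - overlap_s) * fs))   # D, shift between segments
    starts = range(0, len(x) - n_win + 1, shift)
    if len(starts) == 0:
        raise ValueError("signal shorter than one window")
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_win - 1))
              for n in range(n_win)]               # Hamming window W(n)
    u = sum(v * v for v in window) / n_win         # U, Eq. (5)
    omega = 2 * math.pi * freq / fs                # rad/sample
    total = 0.0
    for s in starts:
        acc = sum(window[n] * x[s + n] * cmath.exp(-1j * omega * n)
                  for n in range(n_win))
        total += abs(acc) ** 2                     # one modified periodogram
    return total / (len(starts) * n_win * u)       # average over the K segments


def band_feature(x, fs, center, width=1.0, step=0.01):
    """Mean PSD over a band centered on `center` (aggregation is an assumption)."""
    n_steps = int(round(width / step))
    freqs = [center - width / 2 + i * step for i in range(n_steps + 1)]
    return sum(welch_psd_at(x, fs, f) for f in freqs) / len(freqs)


# Synthetic 12 s trial with a 12 Hz oscillation; a coarser step keeps the demo fast.
fs = 256
x = [math.sin(2 * math.pi * 12 * n / fs) for n in range(12 * fs)]
f12 = band_feature(x, fs, 12.0, step=0.1)
f15 = band_feature(x, fs, 15.0, step=0.1)
```

With the study's settings (`step=0.01`) each band is sampled at 101 frequencies; the coarser step in the demo only reduces run time.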

6. Feature selection

The number of features available to design a classification system is usually large when compared to the restricted number of features required to ensure suitable generalization properties of the classifier, reasonable computational complexity and processing time.

In order to find the most relevant features for designing the classification system, feature selection is usually applied. This technique exploits the mutual (linear and/or nonlinear) correlation among features, selecting those that retain more class-discriminatory information. Strategies for performing this selection follow two approaches: filters or wrappers [10,11]. The first uses statistical measures to quantify the relevance of each feature and is probably the simplest technique to operate on the feature space [11,28]. Filters operate with metrics directly obtained from the features, being, therefore, independent of the classifier when performing the choice. Filters usually rely on statistical functions that return a relevance index relating each attribute to the label. This approach tacitly assumes independence between features and, therefore, ignores the correlation between variables, which can affect the prediction performance. The second approach takes into account the performance of the trained classifier to rank the features. In the following, two filter techniques are described (Pearson and Davies–Bouldin), as well as the forward wrapper algorithm used in this study.

6.1. Pearson's filter

The Pearson correlation coefficient [28,29] defines a kind of filter strategy in which an input vector x_i, associated with a feature, is related to its label y in the form:

$$R_i = \frac{\mathrm{cov}(x_i, y)}{\sqrt{\mathrm{var}(x_i)\,\mathrm{var}(y)}} \qquad (6)$$

where cov(·) is the covariance and var(·) is the variance.

This strategy first evaluates R_i for i = 1, ..., M, M being the number of attributes, and afterwards selects the K features with the largest values of R_i. As correlation is a second-order statistical measure, this coefficient is able to capture only linear dependency. However, due to its computational simplicity, it can be suitably used as a basic metric to understand the feature space.
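A Pearson filter of this kind can be sketched in a few lines. The example below ranks features by the magnitude of R_i, which is an assumption made here so that strongly negative correlations also count as relevant; the text itself only mentions the maximum values of R_i. Function names and the toy data are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation between a feature vector x and a label vector y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0                      # constant vector: no linear relation
    return cov / math.sqrt(vx * vy)     # the 1/n factors cancel out


def rank_features(features, labels, k):
    """Indices of the k features with the largest |R_i| (assumed criterion)."""
    scores = [abs(pearson(f, labels)) for f in features]
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy data: four samples, labels -1/+1, three candidate features.
labels = [-1, -1, 1, 1]
features = [
    [0.1, 0.2, 0.9, 1.0],   # grows with the label
    [0.5, 0.4, 0.5, 0.4],   # unrelated to the label
    [1.0, 0.8, 0.3, 0.1],   # shrinks with the label
]
best = rank_features(features, labels, 2)
print(best)  # [0, 2]
```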

6.2. Davies–Bouldin index

The Davies–Bouldin (DB) index is a cluster measure that attempts to quantify the separability of different classes considering two main relevant aspects of data clustering: the minimization of the distance within a class and the maximization of the distance between the classes. For classes w_i with i = 1, 2, ..., m, the DB index can be described by the ratio:

$$DB = \frac{1}{m}\sum_{i=1}^{m}\max_{j=1,\ldots,m;\ j\neq i}\left(\frac{s_i+s_j}{d_{ij}}\right) \qquad (7)$$

in which s_i is the average distance between each point of class i and the centroid of this class, and s_j is the same for class j. The parameter d_ij is the Euclidean distance between the centroids of classes i and j.

Taking Fig. 4b as an example of a two-dimensional attribute space, it is not difficult to realize that a low class dispersion with far apart centroids contributes to a desirable separable configuration, which implies small DB values and provides an interesting ranking measure. In this case, the inverse of this index (DBinv) was used in order to seek the best channels (electrodes) at the stimulation frequencies and, consequently, to define the feature vector. A detailed description of the DB index can be found in [12].
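Equation (7) can be checked on a small two-class example. The sketch below uses only the standard library (`math.dist`, available from Python 3.8); the function names and the toy points are illustrative and not taken from the study.

```python
import math


def centroid(points):
    """Component-wise mean of a list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin(classes):
    """DB index of Eq. (7); `classes` is a list of point lists, one per class."""
    cents = [centroid(pts) for pts in classes]
    # s_i: average distance between the points of class i and its centroid
    scatter = [sum(math.dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(classes, cents)]
    m = len(classes)
    total = 0.0
    for i in range(m):
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(m) if j != i)
    return total / m


class_a = [[0.0, 0.0], [0.0, 2.0]]      # centroid (0, 1), s = 1
class_b = [[4.0, 0.0], [4.0, 2.0]]      # centroid (4, 1), s = 1
class_far = [[10.0, 0.0], [10.0, 2.0]]  # same scatter, farther centroid

db_near = davies_bouldin([class_a, class_b])    # (1 + 1) / 4 = 0.5
db_far = davies_bouldin([class_a, class_far])   # (1 + 1) / 10 = 0.2
db_inv = 1.0 / db_far                           # larger value = better separation
```

Moving the second class farther away lowers the index, so ranking by its inverse favors the better separated configuration.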

6.3. Wrappers

The wrapper methodology [10,11] performs feature selection in terms of the performance of the classifier. In simple terms, there are three aspects that define its implementation [10]: (i) the search strategy employed in the feature space, (ii) the stopping criterion and (iii) the classifier structure.

The first step relies on performing an efficient search on the feature space, since the number of candidate subsets is of the order of 2^M − 1, M being the number of features. There are many ways to carry out such a search, such as genetic algorithms, simulated annealing or greedy heuristics. In this study, the greedy heuristic based on forward selection was chosen, since it is assumed that the attributes are better correlated by a progressive incorporation. The simplest stopping criterion consists of the rule "if no improvement, then stop". This approach can, however, lead to local convergence. A more robust stopping criterion considers k consecutive steps without performance gain. In this study, k = 2 was adopted. The third aspect, the classifier structure, has a strong influence on feature selection, since the performance of the classifier is constantly evaluated, as described in the algorithm presented in Table 1. It is important to note that wrappers do not guarantee global convergence.
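The forward search with a two-step stopping rule can be sketched as follows. This is one reading of the procedure in Table 1, under stated assumptions: `score` is a hypothetical callback standing in for the cross-validated classifier performance, a step counts as an improvement only if it beats the best score found so far, and a tie is treated as a non-improvement.

```python
def incremental_wrapper(n_features, score):
    """Greedy forward selection that stops after two consecutive non-improvements.

    score(subset) -> float is a user-supplied callback (an assumption of this
    sketch), e.g. the cross-validated accuracy of a classifier on that subset.
    """
    remaining = list(range(n_features))   # set T
    selected = []                         # set S
    observed = []                         # set O, feature kept "on observation"

    # Step 1: start from the best single feature.
    first = max(remaining, key=lambda f: score((f,)))
    selected.append(first)
    remaining.remove(first)
    best_score = score(tuple(selected))

    while remaining:
        base = selected + observed
        # Step 2: test the inclusion of each remaining feature, one by one.
        cand = max(remaining, key=lambda f: score(tuple(base + [f])))
        cand_score = score(tuple(base + [cand]))
        remaining.remove(cand)
        if cand_score > best_score:       # step 3: improvement
            selected.extend(observed)     # 3.1: the observed feature is kept too
            observed.clear()
            selected.append(cand)
            best_score = cand_score
        elif not observed:                # step 4: first non-improvement
            observed.append(cand)
        else:                             # step 5: second consecutive one
            break
    return selected


# Toy score: features 0 and 2 are useful, every extra feature costs a little.
USEFUL = {0, 2}


def toy_score(subset):
    return len(set(subset) & USEFUL) - 0.01 * len(subset)


chosen = incremental_wrapper(4, toy_score)
print(chosen)  # [0, 2]
```

In practice the callback would retrain and cross-validate the chosen classifier for every candidate subset, which is what makes wrappers costlier than filters.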

Table 1
Incremental wrappers algorithm.

Initially, there are k = 0 and three sets: T = {1, 2, ..., M} with all features, S = ∅ with selected features and O = ∅ with features on observation.

1. Evaluate, one by one, the classifier performance by cross validation for all features of set T. Put in S the feature that presented the best performance and remove it from T.
2. Consider all features composed with the elements of sets S and O and test the inclusion, one by one, of the features of set T, evaluating the performance of the classifier by cross validation.
3. If the classifier performance increased, select the feature that gave the best performance, include it in S and remove it from T.
   3.1 If k = 1, put the element of O in S, make O = ∅ and k = 0.
   3.2 If T is not the null set, go to (2). Else, stop.
4. If the classifier performance decreased and k = 0, put in O the new feature that presented the best performance in the last comparisons, remove it from T and make k = 1.
   4.1 If T is not the null set, go to (2). Else, stop.
5. If the classifier performance decreased and k = 1, a second consecutive decrement occurred, so stop.

In the end, the set S has the features selected by the incremental wrappers.

7. Classifiers

The classifier structure is responsible for mapping each input feature vector onto a label corresponding to an element of a discrete set of classes. In simple terms, the mapping performed by a classifier can be understood as engendering a set of partitions of the input space that are delimited by decision boundaries [28,29]. Classifiers can be either linear or nonlinear, depending on the nature of the performed mapping. In the following, we will discuss three classifiers that are interesting options in the BCI context, and shall be, accordingly, adopted for further analysis.

7.1. Linear discriminant analysis

LDA is one of the most used strategies in BCI systems due to its simplicity and low computational cost. In simple terms, it consists in finding the linear combination w that best separates the classes, which implies establishing a decision surface in the form $w^T x + c = 0$, for a constant threshold value c. For instance, if we assume two normal multivariate distributions with means $\mu_1$ and $\mu_2$ and covariance matrices $C_1$ and $C_2$, respectively, the LDA approach aims to establish the w that maximizes the ratio between the inter-class and intra-class variance, which can be mathematically described by:

$$S = \frac{\sigma^2_{between}}{\sigma^2_{within}} = \frac{\left(w^T(\mu_1-\mu_2)\right)^2}{w^T(C_1+C_2)\,w} \qquad (8)$$

It is possible to show that the maximization of S is satisfied for $w \propto (C_1+C_2)^{-1}(\mu_1-\mu_2)$ and $c = -\frac{1}{2}w^T(\mu_1+\mu_2)$ [28]. There are also different criteria that can be used to set w for obtaining linear decision surfaces, such as the one provided by support vector machine strategies with linear kernel functions. When a Gaussian distribution is assumed, the covariance and the mean fully describe the model. However, non-Gaussian random variables can also be handled by this model, as the use of their statistical structure up to second order might be enough to solve the problem at hand.
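For two features, the Fisher solution can be computed in closed form. The sketch below assumes the decision surface is written as wᵀx + c = 0, takes w as the inverse of the summed class covariances applied to the difference of the class means, and places the threshold at c = −½ wᵀ(μ1 + μ2) so that the boundary passes midway between the means. Function names and the toy points are illustrative.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def covariance_2d(points, mu):
    """Unbiased 2x2 sample covariance, returned as (c_xx, c_xy, c_yy)."""
    n = len(points)
    c_xx = sum((p[0] - mu[0]) ** 2 for p in points) / (n - 1)
    c_yy = sum((p[1] - mu[1]) ** 2 for p in points) / (n - 1)
    c_xy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points) / (n - 1)
    return c_xx, c_xy, c_yy


def fit_lda_2d(class1, class2):
    """Return (w, c) of the surface w.x + c = 0 for two 2-D classes."""
    mu1, mu2 = mean_vector(class1), mean_vector(class2)
    a1, b1, d1 = covariance_2d(class1, mu1)
    a2, b2, d2 = covariance_2d(class2, mu2)
    a, b, d = a1 + a2, b1 + b2, d1 + d2          # entries of C1 + C2
    det = a * d - b * b
    diff = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    # w = (C1 + C2)^-1 (mu1 - mu2), using the closed-form 2x2 inverse
    w = [(d * diff[0] - b * diff[1]) / det,
         (a * diff[1] - b * diff[0]) / det]
    c = -0.5 * (w[0] * (mu1[0] + mu2[0]) + w[1] * (mu1[1] + mu2[1]))
    return w, c


def classify(x, w, c):
    """Class 1 on the positive side of the surface, class 2 otherwise."""
    return 1 if w[0] * x[0] + w[1] * x[1] + c > 0 else 2


class1 = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [3.0, 3.0]]
class2 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w, c = fit_lda_2d(class1, class2)   # approximately w = [3, 3], c = -9
```

With these toy classes the boundary is the line x + y = 3, which separates the two squares of points.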

7.2. Extreme learning machines

Structurally, an ELM can be defined as a multilayer perceptron neural network with a single hidden layer and a linear output layer (see Fig. 7). The parameters of the neurons that form the hidden layer are randomly chosen [8], and the process of training the output layer is essentially equivalent to the adaptation of a linear classifier. The choice of the number of neurons in the intermediate layer can be made by cross-validation methods.

The model evokes elements of biological neuron operation: input data are weighted, representing the synaptic efficiency, and the activation function determines the firing (output +1) or the absence of firing (output −1) of the neuron. A typical activation function is the hyperbolic tangent, which presents exactly a nonlinearity of this kind.

In simple terms, the hidden layer generates a number of nonlinear random projections that map the input vector space onto a feature space over which the output layer operates as a linear regressor. The canonical approach is to use the method of least squares, presented in Section 7.3. The ELM is an interesting option in the context of BCI in view of the simplicity of its associated training process and of its inherent regularization properties [30,31].

In our analyses, the number of neurons in the hidden layer of the ELM was fixed at 20 after preliminary tests. The hyperbolic tangent was used as the activation function. The weights of the hidden layer were generated using a random Gaussian function. The performance of the ELM was defined in terms of the average of 20 runs for each subject to account for the random character of the network.
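The random projection performed by the hidden layer can be sketched with the standard library alone. The example below draws Gaussian weights for 20 hidden neurons and applies the hyperbolic tangent; the unit standard deviation, the presence of Gaussian bias terms and the fixed seed are assumptions of this sketch, since the text only states that the weights were generated with a random Gaussian function. Only the hidden layer is shown; the output weights would then be fitted by least squares on the resulting matrix.

```python
import math
import random


def make_hidden_layer(n_inputs, n_hidden=20, seed=0):
    """Draw random Gaussian weights and biases for the hidden neurons."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    return weights, biases


def hidden_output(samples, weights, biases):
    """Matrix H: one row per sample, one tanh activation per hidden neuron."""
    return [[math.tanh(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
             for w, b in zip(weights, biases)]
            for x in samples]


# Three hypothetical two-dimensional feature vectors.
samples = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
weights, biases = make_hidden_layer(n_inputs=2)
H = hidden_output(samples, weights, biases)   # 3 rows, 20 columns
```

Because the hidden parameters are never trained, repeating the draw with different seeds and averaging the results, as done over 20 runs in the study, accounts for the randomness of the projection.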

7.3. Least squares

The method of least squares is often used in regression analysis. In this study, least squares were used in two of the classification methods: the LDA and the output layer of the ELM.

Considering that in a classification problem we have a set of N labeled samples for training and that the vector of the output layer weights is w, the main criterion underlying such a strategy is the following:

$$\min_{w} \|Hw - d\|^2 \qquad (9)$$

where H is the feature matrix, d the label vector used to train the classifier and w the weight vector. The solution to this problem can be calculated as a projection of the label vector d carried out with the aid of an operator based on the Moore–Penrose pseudo-inverse [28]. In the case of an ELM, if the number of neurons in the hidden layer (M) is larger than the number of available data samples, there will be multiple optimal solutions to the problem shown in (9), and the pseudo-inverse has the desirable property, from a regularization perspective, of generating a minimal norm solution. In this study, as already mentioned, the value of M was chosen in preliminary tests (Section 7.2). If the number of data samples (N) is larger than M, the solution is:

$$w = (H^T H)^{-1} H^T d \qquad (10)$$

If M > N, the solution is given by:

$$w = H^T (H H^T)^{-1} d \qquad (11)$$

If M = N, w is the same for both equations, since the matrix H becomes square.
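For the case N > M, the solution of Eq. (10) can be obtained by solving the normal equations (HᵀH)w = Hᵀd. The sketch below does this with a small Gaussian elimination routine written for illustration; the helper names are hypothetical, and a numerical library would normally be preferred for conditioning reasons.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares(H, d):
    """Weights minimizing ||H w - d||^2 via the normal equations (N > M case)."""
    Ht = [list(col) for col in zip(*H)]                       # H^T, M x N
    gram = [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in Ht]
            for row_i in Ht]                                  # H^T H, M x M
    rhs = [sum(a * t for a, t in zip(row, d)) for row in Ht]  # H^T d
    return solve_linear(gram, rhs)


# Three samples, two columns (a constant term and one feature); d = 1 + 2 * feature.
H = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
d = [1.0, 3.0, 5.0]
w = least_squares(H, d)   # approximately [1.0, 2.0]
```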

7.4. Support vector machines

The SVM [9] is a learning structure that can be used to solve classification and regression tasks. In the context of classification, it can be understood as a maximal margin classifier whose linear/nonlinear structure is defined by a kernel function. The design of a classifier of this kind gives rise to a constrained quadratic optimization task that can be solved using a number of efficient computational tools. In a classification system, the SVM follows two stages: training and classification.

In the training stage, labeled data are used in order to determine the hyperplane in a high-dimensional feature space that distinguishes the classes with maximal margin. In practice, the training can be performed in the original data space using different kernel functions, such as linear, quadratic, polynomial, multilayer perceptron (MLP) or Gaussian radial basis (RBF) [32]. In this study, the MLP kernel was selected after preliminary tests with all the methods, in view of its stability over multiple trials. The MLP kernel is defined as:

$$k(x, x_i) = \tanh(P_1\, x_i^T x + P_2) \qquad (12)$$

where x_i is the input data and the kernel parameters were P_1 = 1 and P_2 = −1.

The machines found in the training phase are then used to classify new data in the classification stage.
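The MLP kernel of Eq. (12) is simple to evaluate for a pair of feature vectors. The function below is a minimal sketch with the parameter values quoted in the text as defaults; the function name is an illustrative choice, and the SVM training itself is not shown.

```python
import math


def mlp_kernel(x, xi, p1=1.0, p2=-1.0):
    """MLP (sigmoid) kernel: tanh(p1 * <xi, x> + p2)."""
    inner = sum(a * b for a, b in zip(x, xi))
    return math.tanh(p1 * inner + p2)


k_orthogonal = mlp_kernel([1.0, 0.0], [0.0, 1.0])   # tanh(-1)
k_unit = mlp_kernel([1.0, 0.0], [1.0, 0.0])         # tanh(0) = 0.0
```

The kernel depends on the two vectors only through their inner product, so it is symmetric in its arguments.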

8. Results and discussion

The performance of all classification schemes was evaluated using cross validation, with six trials for training and two for validation. The 36 combinations of different techniques of feature extraction, feature selection and classifiers have been tested for each person, considering windowing of 3 s. Fig. 8 summarizes the average performance of all classifier schemes with the respective standard deviation.

Although the environment and the data acquisition were kept constant, the best BCI performance varies across individuals; in our database we had:

• 1 subject with an accuracy rate of 100%,
• 4 subjects with performance between 90% and 100%,
• 1 subject with performance between 80% and 90%,
• 1 subject with regular performance of about 70%.

Inter-subject variability is a classical characteristic of BCI systems, being commonly reported in the literature (see [33,34], just to cite a few). Such variability is associated with several factors, such as the age of the volunteer, cerebral physiology and ability to concentrate. Furthermore, according to [33], some individuals do not have a visually evoked potential (VEP) response adequate to operate an SSVEP-BCI.

Figs. 8 and 9 show that the performance of the linear, ELM and SVM classifiers was very close for the subjects (p = 0.3992). The ELMs are potentially capable of operating with robustness similar to that of linear classifiers, while providing a useful degree of flexibility. The SVM classifier depended heavily on the selection stage: for instance, using all 16 channels, the performance drops significantly, by about 8%, when compared to the best result achieved using selected attributes. The relatively poor performance of the SVM, in this case, may be because the kernel parameters were fixed: a more systematic selection based on grid search and cross-validation could lead to a better performance and will be investigated in the near future.

Regarding feature extraction, the studied methods presented similar behaviors (see Fig. 8), although Welch's and STFT methods appear to be slightly more effective than the filter bank (p = 0.011).

Feature selection strategies proved to be relevant (p = 0.0001), as the use of different EEG channels had a clear positive impact on the system performance. All the studied strategies led to similar success rates, the incremental wrapper being capable of reaching a slightly better performance (around 3%).

Fig. 8. Average performance of classifier systems with standard deviation.

Fig. 9. Performance of classifier systems for subjects with (a) excellent, (b) good and (c) regular VEP response.

From Fig. 9(c), it is possible to note that, for a low VEP response, some combinations of signal processing methods give a performance gain. In the best case, the system achieves a hit rate of 75% using the linear classifier with the features extracted by the filter bank and selected by wrappers. On the other hand, the system performance drops to just 45% in the worst case, when, for the same features extracted by the filter bank, no feature selection criterion is adopted and the SVM is used (with fixed kernel parameters). Surprisingly, for these subjects, the most informative electrodes are not in the occipital zone, as shown in Fig. 9(c). The channels associated with the motor cortex and parietal zone also included useful information for the classifier and appear earlier in the ranking of the feature selector.

In terms of the best chosen features, Fig. 9 shows the performance of each classifier system, listing the channels used in the best configurations for each case. Interestingly, as mentioned, the selected channels are not always in the occipital zone, which strongly justifies the use of a feature selection stage for SSVEP-BCI systems. Also, it can be noted that each subject is associated with a specific channel configuration, which could vary according to the feature selection strategy and the adopted classifier system. As a rule, there is a gain of information when using channels from different regions; such performance gain could be attributed to the variability between the chosen channels, since choosing electrodes from the same region can lead to an undesirable bias related to highly correlated signals. This fact can be confirmed by the selection performed using wrappers, which does not consider the amount of information present at the channels from a perspective of [...] together to select the electrodes that give more information for the system. A similar dependence between electrode location and feature extraction technique was reported by [35] for motor-imagery-based BCIs.

Fig. 10 ranks the 16 channels in order of the frequency with which they appear in the best configuration for each scenario, considering the seven subjects. The occipital channels (Oz, O1 and O2) are the most frequent, as expected [7], appearing 14%, 11% and 9% of the times, a total of 34%. Next come PO7 (9%) and Cz (8%), these five electrodes being responsible for 51% of the occurrences. The channels Pz, FCz and P2 appear occasionally, but this does not mean that they should not be considered. This frequency ranking is an average among subjects and could be used to initially outline the best channels. However, for each subject, the best configuration is variable: as illustrated in Fig. 9(b), FCz is a relevant channel for this specific volunteer.

9. Conclusions

The results revealed that, for the two-class SSVEP problem, the best structure was the linear classifier using Welch's method for feature extraction and incremental wrappers to carry out feature selection. This configuration obtained an average accuracy around 95%, with windowing of 3 s, for the 7 subjects, reaching 100% for some, which is very satisfactory. The feature extraction techniques proved to be equivalent for estimating the spectral power. The Welch and the STFT methods presented a similar performance, slightly better (6%, approximately) than that attained using filter banks, although this seems to be within the margin of error of the subjects. Feature selection proved to be an extremely important step, indicating the presence of relevant information in the parietal, motor and central zones, in addition to the occipital lobe. The results show that the three classifiers can be efficiently used to build an SSVEP-based BCI. However, the SVM classifier is very sensitive to the feature selection strategy, especially when associated with filter bank feature extraction. The ELMs are promising classifiers in the context of SSVEP, deserving to be considered as part of the current repertoire of BCI system classifiers, as they exhibit a good generalization performance. The obtained results support the use of ELMs, which can be even more efficient and promising when more classes are considered.

Acknowledgements

The authors thank FINEP, FAPESP, CNPq, CAPES, UFABC and UFOP for their financial support, and Prof. Dra. Gabriela Castellano, Dr. Rafael Ferrari and Ms. Harlei Leite for their important technical assistance.

