Contents lists available at ScienceDirect
Biomedical Signal Processing and Control
journal homepage: www.elsevier.com/locate/bspc

Comparative analysis of strategies for feature extraction and classification in SSVEP BCIs

Sarah N. Carvalho a,b,∗, Thiago B.S. Costa a, Luisa F.S. Uribe a, Diogo C. Soriano c, Glauco F.G. Yared b, Luis C. Coradine d, Romis Attux a

a University of Campinas, UNICAMP, Campinas, Brazil
b Federal University of Ouro Preto, UFOP, Ouro Preto, Brazil
c Federal University of ABC, UFABC, Santo André, Brazil
d Federal University of Alagoas, UFAL, Maceió, Brazil

Article info

Article history:
Received 20 January 2015
Received in revised form 15 April 2015
Accepted 5 May 2015
Available online 1 June 2015

Abstract
Brain–computer interface (BCI) systems based on electroencephalography have been increasingly used in different contexts, engendering applications from entertainment to rehabilitation in a non-invasive framework. In this study, we perform a comparative analysis of different signal processing techniques for each BCI system stage concerning steady state visually evoked potentials (SSVEP), which includes: (1) feature extraction performed by different spectral methods (bank of filters, Welch's method and the magnitude of the short-time Fourier transform); (2) feature selection by means of an incremental wrapper, a filter using Pearson's method and a cluster measure based on the Davies–Bouldin index, in addition to a scenario with no selection strategy; (3) classification schemes using linear discriminant analysis (LDA), support vector machines (SVM) and extreme learning machines (ELM). The combination of such methodologies leads to a representative and helpful comparative overview of robustness and efficiency of classical strategies, in addition to the characterization of a relatively new classification approach (defined by ELM) applied to the BCI-SSVEP systems.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
A brain–computer interface (BCI) is a device that aims to map brain signals onto commands for external devices, defining an alternative communication channel for users in different practical contexts, which can include applications from computer games to assistive technologies [1,2].
BCIs, in general, make use of electroencephalography (EEG) [3], as a consequence of factors like portability, non-invasiveness and cost. EEG signals are acquired with the aid of an electrode cap positioned on the user's scalp, which is connected to pre-processing and sampling modules. The design of a BCI is determined by the chosen paradigm, the main trends in the field [4] being motor imagery, P300 and steady state visually evoked potentials (SSVEP). The last
∗ Corresponding author at: Federal University of Ouro Preto, Rua 36, número 115, sala G302, 35931-008, João Monlevade, Minas Gerais, Brazil. Tel.: +55 31 3852 8719.
E-mail addresses: sarah@deelt.ufop.br (S.N. Carvalho), bulhoes@dca.fee.unicamp.br (T.B.S. Costa), lsuarez@dca.fee.unicamp.br (L.F.S. Uribe), diogo.soriano@ufabc.edu.br (D.C. Soriano), attux@dca.fee.unicamp.br (R. Attux).
two are approaches based on event-related potentials (ERP). The first of these paradigms relies on the ability of the operator to modify, by imagining the process of moving parts of both sides of his/her body (e.g. opening or closing the right or the left hand), the activity of the motor cortex [5], while the second makes use of a specific event-related potential, the P300 wave, to characterize the interaction between the operator and a command interface [6]. Finally, the SSVEP paradigm, the subject of this study, is based on the analysis of oscillating EEG patterns that are generated in the cortex in response to certain visual stimuli. More specifically, when an individual is visually stimulated by a pattern that flickers repetitively within a certain range of frequencies, a synchronized SSVEP can be detected in his/her brain electrical activity. Hence, if light sources with different flickering rates are used to build a command interface, it is possible to identify on which light the subject focused his/her attention at a given period of time by suitably processing and classifying the EEG signal.
In general, the structure of an SSVEP-based BCI can be roughly divided into four stages: data acquisition, signal processing, command generation and final application [7]. Fig. 1 shows a block diagram of this structure highlighting the four stages of the signal processing module, which is the focus of this study. The first stage, pre-processing, is based on temporal and spatial filtering and is typically of a more general character. The second and third stages, on the other hand, have a stronger dependence on the features of the selected paradigm. The classifier stage generates the control command based on the input signal.

Fig. 1. Overview of a BCI system.

http://dx.doi.org/10.1016/j.bspc.2015.05.008
In this study, we will perform a comparative analysis of methods for feature extraction, feature selection and classification in SSVEP BCIs. Three feature extraction approaches (spectral estimation using a bank of band-pass filters, Welch's method and the magnitude of the short-time Fourier transform (STFT) calculated at the evoked frequencies) and three classifiers (a linear discriminant, an extreme learning machine (ELM) [8] and a support vector machine (SVM) [9]) will be considered. Furthermore, the performance of each structure will be analyzed under three feature selection approaches: an incremental wrapper [10], a filter using Pearson's method [11] and a strategy based on the Davies–Bouldin index [12], in addition to a case without feature selection. This repertoire of 36 scenarios (3 feature extraction methods × 4 selection options × 3 classifiers) applied to the same database defines interesting comparative elements: (1) since SSVEP engenders a well-defined spectral response, this study is relevant as a performance analysis of distinct frequency-domain feature extraction methods; (2) the robustness of nonlinear structures, such as ELM and SVM, in handling the required SSVEP classification task is investigated; (3) the process of channel selection is analyzed adopting three strategies with distinct conceptual foundations; (4) statistical considerations are made about the best configuration of electrodes according to different methods of feature selection.
This study will be carried out using a database generated according to the experimental setup described in Section 3. In addition to the contribution that the study as a whole represents, we believe the analysis of the performance of an ELM in SSVEP systems can also be considered as a contribution per se, as an equivalent analysis, to the best of our knowledge, has not been reported so far in the literature.
The remainder of this paper is organized as follows. Section 2 briefly presents the SSVEP paradigm. Section 3 describes the experimental setup and the data recording procedures. Sections 4–7 discuss the four stages of signal processing, i.e., pre-processing, feature extraction approaches, feature selection and the classification criteria, respectively. Section 8 presents the results, while Section 9 contains our conclusions and final remarks.
2. Fundamentals of steady state visually evoked potentials
Neurophysiological studies of the human visual system report that the neuronal activity of the cells of the visual cortex is altered by visual stimulation, and it is possible to identify variations of the brain response related to properties of the visual stimulus, such as luminance, contrast and frequency (between 1 Hz and 100 Hz [13]). Neurons in the visual cortex synchronize their firing to the blinking frequency of the visual stimulus. Steady state visually evoked potentials occur when visual stimuli are presented repeatedly, creating almost sinusoidal oscillations [14,15]. The EEG response presents an increase of energy at the frequency of the blinking stimulus [16]. The strongest response occurs in the primary visual cortex, although other areas of the brain are activated in varying degrees. The SSVEP can be detected within narrow frequency bands (e.g., 0.1 Hz) around the frequency of visual stimulation via signal processing methods that exploit specific characteristics of the signal, such as timing and rhythm.
SSVEP BCI systems use visual stimuli as a way to evoke a certain electrical pattern in the visual cortex. Unlike independent BCI systems, where the implementation is based on voluntary control of neural activity of the subject [17,18], the operation of SSVEP systems depends on the ability of the subject to focus on, fix and follow the visual stimuli according to an intended action, as well as on the adopted signal processing strategies, which justifies the extensive scenarios analyzed in the present study.
3. Experimental setup
The stimulation interface (see Fig. 2) consists of two square checkerboards with sides of 3.8 cm, displayed on the right and left centers of a black screen, blinking at 12 and 15 Hz, respectively. A 14-in. monitor with a refresh rate of 60 Hz was used. The subject focused his/her gaze for 12 s on each stimulus, repeating this process eight times with rest intervals. The EEG data were collected from seven healthy volunteers, with an average age of 26.3 ± 3.3 years. The acquisition protocol was approved by the Ethics Committee of the University of Campinas (n. 791/2010). The database is composed of 1344 s of EEG data (7 subjects × 2 stimuli × 8 repetitions × 12 s) recorded at a sample rate of 256 Hz, using a g®.SAHARAsys dry-electrode cap with 16 channels and a g®.USBamp biosignal amplifier [19], and registered in MATLAB® 2012b, using an Application Programming Interface (API) provided by the aforementioned device manufacturer. The acquisitions were only performed after the following procedures regarding the EEG apparatus: channel impedance calibration; verification of the electrode impedance calibration (between 0.5 and 5.0 kΩ); connection of the ground and reference channel sockets, respectively, to common ground and reference; and stabilization of the signal. The ground and reference are positioned on the mastoids. Fig. 3 shows the arrangement of electrodes at O1, O2, Oz, POz, Pz, PO4, PO3, PO8, PO7, P2, P1, Cz, C1, C2, CPz, FCz, according to the international 10–20 system [20].

Fig. 2. Experimental setup. (a) Screen with checkerboards used to generate visual stimuli at 12 and 15 Hz. (b) Configuration of equipment and data collection environment.
Fig. 3. Disposition of electrodes on the scalp for EEG signal acquisition.
4. Pre-processing

Several interferences are added to the EEG signal during the recording. These artifacts compromise the quality of the obtained signal, affecting the BCI performance. The main artifact sources are: the EEG equipment and its connections to the scalp; the electrical power supply (60 Hz); and the normal electrical activity of the subject, such as that of the heart, eye blinking, eye movements and muscles in general. Recognition and elimination of artifacts in EEG signals are complex tasks, but essential to the development of practical systems.
In this study, the EEG signal was filtered by an analog Butterworth bandpass filter (5–60 Hz) and a notch filter (58–62 Hz) in order to remove the smooth displacement and electromagnetic artifacts. In the sequence, to remove other artifacts present in the band of interest, such as eye blinking and neck movements, the data are submitted to spatial filtering using the Common Average Reference (CAR) [21] method, defined as:

$$V_i^{CAR} = V_i^{ER} - \frac{1}{n}\sum_{j=1}^{n} V_j^{ER} \qquad (1)$$

where $V_i^{ER}$ is the potential of the i-th electrode measured with respect to the same reference and n is the number of electrodes in the array. CAR subtracts the average value of the entire array from each electrode, hence eliminating similar artifacts present in most electrodes. Although noise sources are deeply complex and vary across and within subjects, temporal and spatial filtering have been demonstrated to be convenient to maximize the signal-to-noise ratio and to improve the accuracy of the SSVEP BCI system [22,23].
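The CAR operation of Eq. (1) amounts to subtracting, at every time instant, the mean over all electrodes. A minimal NumPy sketch is given below; the function name, the (channels × samples) array layout and the toy data are assumptions made for illustration and are not taken from the study.

```python
import numpy as np

def common_average_reference(eeg):
    """Apply CAR to an array of shape (n_channels, n_samples)."""
    eeg = np.asarray(eeg, dtype=float)
    # Eq. (1): subtract the instantaneous mean over all electrodes.
    return eeg - eeg.mean(axis=0, keepdims=True)

# Toy data (assumed): three channels sharing one common artifact.
rng = np.random.default_rng(0)
common = rng.standard_normal(256)
eeg = np.vstack([common + 0.1 * rng.standard_normal(256) for _ in range(3)])
car = common_average_reference(eeg)
```

After CAR, the mean across channels is zero at every sample, so any component common to all electrodes is removed.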
5. Feature extraction approaches
Features are, in simple terms, elements of a compact and efficient data representation [24]. In the context of a BCI system, it is essential that the features extracted from the brain signals facilitate the discrimination task to be performed at the classification stage. As discussed in Section 1, the SSVEP paradigm is based on the detection of oscillating patterns within EEG waves, hence the use of spectral features is a natural choice [25]. Fig. 4 shows the spectral characteristics of the SSVEP responses observed on channel O2 for the evoked frequencies of 12 and 15 Hz. It is noticeable that the spectral content is concentrated around the evoked frequencies.
In fact, the standard technique for identifying the SSVEP response associated with an EEG signal is to analyze the signal in the frequency domain by calculating its power spectral density in all possibly evoked frequency bands. As each of these bands corresponds to the immediate vicinity of one of the interface blink rates, it is possible to identify the desired BCI command. In this study, the underlying spectral content was estimated using three approaches: a filter bank, the short-time Fourier transform and Welch's method.
5.1. Filter bank
An intuitive way to estimate the spectral power of an SSVEP signal is to focus on the frequency range of interest and assess the spectral content of this interval. The filter bank uses this idea, combining a set of bandpass filters that separate the input signal into multiple components [26], each one carrying a single frequency sub-band of the original signal, as shown in Fig. 5.
Fig. 4. Feature extraction of SSVEP responses at 12 and 15 Hz: (a) power spectral density (Welch) of channel O2 as a function of frequency (Hz); (b) space of spectral features (features extracted at 12 Hz versus features extracted at 15 Hz), considering only one occipital channel.

Fig. 5. Filter bank scheme for two frequencies.

In our study, the filter bank is designed with two equiripple bandpass filters centered at the evoked frequencies, with 2 Hz bandwidth, attenuation of 40 dB in the stop band and 1 Hz of transition range (see Fig. 6). The output power of the elements of the bank is considered as an estimate of the power spectrum at the central frequencies.
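The idea of using the output power of each band-pass branch as a spectral feature can be sketched as follows. Note that this sketch swaps in a Hamming-windowed sinc FIR design, not the equiripple design used in the study, so that it depends only on NumPy; the function names, the filter length of 513 taps and the synthetic test signal are all assumptions.

```python
import numpy as np

def bandpass_fir(f_low, f_high, fs, numtaps=513):
    """Windowed-sinc band-pass FIR (difference of two low-pass filters)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    lowpass = lambda fc: 2 * fc / fs * np.sinc(2 * fc / fs * n)
    return (lowpass(f_high) - lowpass(f_low)) * np.hamming(numtaps)

def filter_bank_power(x, fs, centers=(12.0, 15.0), bandwidth=2.0):
    """Mean output power of one band-pass branch per evoked frequency."""
    powers = []
    for fc in centers:
        h = bandpass_fir(fc - bandwidth / 2, fc + bandwidth / 2, fs)
        y = np.convolve(x, h, mode="valid")  # discard filter edge transients
        powers.append(np.mean(y ** 2))
    return np.array(powers)

# Synthetic 12 s signal (assumed): a 12 Hz oscillation buried in noise.
fs = 256
t = np.arange(12 * fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 12 * t) + 0.5 * rng.standard_normal(t.size)
p12, p15 = filter_bank_power(x, fs)
```

For this synthetic signal the 12 Hz branch carries far more power than the 15 Hz branch, which is the contrast the classifier later exploits.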
5.2. Short-time Fourier transform
The short-time Fourier transform allows the estimation of the power spectrum via the computation of the Fourier transform on segments of the signal, normally with an overlap to reduce artifacts at the boundaries [26]. The obtained complex values provide information concerning the magnitude and phase at each point in time and frequency. The STFT is given by

$$X(m,\omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n-m]\, \exp(-j\omega n) \qquad (2)$$

in which x[n] is the signal, w[n] is the window, m is the time index at which the window is positioned and ω is the angular frequency. The squared magnitude of the STFT gives the spectrogram:

$$\text{spectrogram} \equiv |X(m,\omega)|^2 \qquad (3)$$

which provides an estimate of the power spectrum of the signal. In our study, the spectrogram is computed around the two evoked frequencies (12 and 15 Hz), using Hamming windows of 3 s with 1 s of overlap.
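As an illustration of Eqs. (2) and (3), the sketch below evaluates the squared STFT magnitude directly at the two evoked frequencies, using 3 s Hamming windows with 1 s of overlap. The function name, the direct (non-FFT) evaluation and the test signal are assumptions; the time index n is counted from the start of each segment, which changes only the phase and not the magnitude.

```python
import numpy as np

def stft_power_at(x, fs, freqs=(12.0, 15.0), win_sec=3.0, overlap_sec=1.0):
    """Squared STFT magnitude at given frequencies, one row per segment."""
    n_win = int(round(win_sec * fs))
    hop = n_win - int(round(overlap_sec * fs))
    w = np.hamming(n_win)
    n = np.arange(n_win)
    # One complex exponential exp(-j*omega*n) per evaluated frequency.
    basis = np.exp(-2j * np.pi * np.outer(np.asarray(freqs), n) / fs)
    starts = range(0, len(x) - n_win + 1, hop)
    frames = np.array([x[s:s + n_win] * w for s in starts])
    return np.abs(frames @ basis.T) ** 2  # shape (n_segments, n_freqs)

# Synthetic 12 s recording (assumed) with a 15 Hz oscillation.
fs = 256
t = np.arange(12 * fs) / fs
x = np.sin(2 * np.pi * 15 * t)
S = stft_power_at(x, fs)
```

Each row of `S` would then act as one feature vector (power at 12 Hz, power at 15 Hz) for the corresponding 3 s segment.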
5.3. Welch's method
Welch's method estimates the power spectral density (PSD) by applying the fast Fourier transform (FFT) algorithm [26,27]. The method splits the input data into K segments, computes modified periodograms of the segments via FFT and estimates the PSD by averaging these periodograms. The estimation of the PSD can be expressed by

$$\hat{S}(\omega) = \frac{1}{KNU}\sum_{k=1}^{K}\left|\sum_{n=1}^{N} W(n)\,x(n+kD)\exp(-j\omega n)\right|^2 \qquad (4)$$

in which the signal is divided into K segments of length N, shifted by D points. W is a window function and U is a constant given by:

$$U = \frac{1}{N}\sum_{n=1}^{N} |W(n)|^2 \qquad (5)$$

In the present study, the data were windowed by Hamming windows of 3 s with 1 s of overlap. The PSD was estimated for each visual stimulus using 1 Hz bands centered on the frequencies of 12 and 15 Hz, with a step of 0.01 Hz.
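Welch's estimate, i.e. the average over K segments of windowed periodograms normalized by K·N·U with U the mean squared window value, can be sketched on a fine frequency grid as shown below. The function name, the direct evaluation of the Fourier sum (instead of a zero-padded FFT), zero-based segment indexing and the synthetic signal are assumptions; how the study reduces each 1 Hz band to a feature is not stated, so the band is simply returned as a vector.

```python
import numpy as np

def welch_psd(x, fs, freqs, win_sec=3.0, overlap_sec=1.0):
    """Welch PSD estimate evaluated at arbitrary frequencies (in Hz)."""
    N = int(round(win_sec * fs))              # segment length
    D = N - int(round(overlap_sec * fs))      # shift between segments
    W = np.hamming(N)
    U = np.mean(W ** 2)                       # window power normalization
    basis = np.exp(-2j * np.pi * np.outer(freqs, np.arange(N)) / fs)
    starts = range(0, len(x) - N + 1, D)
    # Modified periodogram of every segment, then the average over segments.
    periodograms = [np.abs(basis @ (W * x[s:s + N])) ** 2 for s in starts]
    K = len(periodograms)
    return np.sum(periodograms, axis=0) / (K * N * U)

# Synthetic 12 s recording (assumed) with a 12 Hz oscillation plus noise.
fs = 256
t = np.arange(12 * fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 12 * t) + 0.1 * rng.standard_normal(t.size)
band_12 = np.linspace(11.5, 12.5, 101)        # 1 Hz band, 0.01 Hz step
psd_12 = welch_psd(x, fs, band_12)
peak_hz = band_12[np.argmax(psd_12)]
```

With this test signal, the estimated PSD peaks at (or very near) 12 Hz inside the analyzed band.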
6. Feature selection
The number of features available to design a classification system is usually large when compared to the restricted number of features required to ensure suitable generalization properties of the classifier, reasonable computational complexity and processing time.
In order to find the most relevant features for designing the classification system, feature selection is usually applied. This technique exploits the mutual (linear and/or nonlinear) correlation among features, selecting those that retain more class-discriminatory information. Strategies for performing this selection follow two approaches: filters or wrappers [10,11]. The first uses statistical measures to quantify the relevance of each feature; filters are probably the simplest techniques to operate on the feature space [11,28]. Filters operate with metrics directly obtained from the features, being, therefore, independent of the classifier when making the choice. Filters usually rely on statistical functions that return a relevance index relating each attribute to the label. This approach tacitly assumes independence between features and, therefore, ignores the correlation between variables, which can affect the prediction performance. The second approach takes into account the performance of the trained classifier to rank the features. In the following, two filter techniques are described (Pearson and Davies–Bouldin), as well as the forward wrapper algorithm used in this study.
6.1. Pearson's filter
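The Pearson correlation coefficient [28,29] defines a kind of filter strategy in which an input vector $x_i$, associated with a feature, is related to the label vector y in the form:

$$R_i = \frac{\mathrm{cov}(x_i, y)}{\sqrt{\mathrm{var}(x_i)\,\mathrm{var}(y)}} \qquad (6)$$

where cov(·) is the covariance and var(·) is the variance.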
This strategy first evaluates $R_i$ for i = 1, ..., M, M being the number of attributes, and afterwards ranks the features, keeping the K features with the largest values of $R_i$. As correlation defines a second-order statistical measure, this coefficient is able to capture only linear dependencies. However, due to its computational simplicity, it can be suitably used as a basic metric to understand the feature space.
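A Pearson filter of this kind, which scores each feature by its correlation with the label and keeps the K best, can be sketched as follows. Ranking by the absolute value of the coefficient is an assumption made here (the text only speaks of maximum values), as are the function names and the toy data.

```python
import numpy as np

def pearson_scores(X, y):
    """Pearson coefficient between each column of X and the label vector y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov = Xc.T @ yc / (len(y) - 1)
    return cov / np.sqrt(X.var(axis=0, ddof=1) * y.var(ddof=1))

def rank_features(X, y, k):
    """Indices of the k features with the largest |R_i| (assumed criterion)."""
    return np.argsort(-np.abs(pearson_scores(X, y)))[:k]

# Toy data (assumed): feature 0 follows the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 50)
X = np.column_stack([y + 0.1 * rng.standard_normal(100),
                     rng.standard_normal(100)])
best = rank_features(X, y, k=1)
```

In this toy case the informative feature (index 0) is ranked first.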
6.2. Davies–Bouldin index

The Davies–Bouldin (DB) index is a cluster measure that attempts to quantify the separability of different classes considering two main relevant aspects of data clustering: the minimization of the distance within a class and the maximization of the distance between the classes. For classes $w_i$ with i = 1, 2, ..., m, the DB index can be described by the ratio:

$$DB = \frac{1}{m}\sum_{i=1}^{m}\max_{\substack{j=1,\dots,m \\ j\neq i}} \frac{s_i+s_j}{d_{ij}} \qquad (7)$$

in which $s_i$ is the average distance between each point of class i and the centroid of this class, and $s_j$ is the same for class j. The parameter $d_{ij}$ is the Euclidean distance between the centroids of classes i and j. Taking Fig. 4b as an example of a two-dimensional attribute space, it is not difficult to realize that a low class dispersion with far apart centroids contributes to a desirable separable configuration, which implies small DB values and an interesting ranking measure. In this case, the inverse of this index (DBinv) was used in order to seek the best channels (electrodes) at the stimulation frequencies and, consequently, to define the feature vector. A detailed description of the DB index can be found in [12].
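The DB index, i.e. the average over classes of the worst-case ratio (s_i + s_j)/d_ij, can be computed as in the sketch below, where s_i is the mean distance of the points of class i to their centroid and d_ij the distance between centroids. The function name and the synthetic clusters are assumptions; a channel would then be scored by the inverse of the returned value.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index of labeled points X with shape (n_points, n_dims)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in classes])
    # s[i]: average distance between the points of class i and its centroid.
    s = np.array([np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
                  for i, c in enumerate(classes)])
    m = len(classes)
    total = 0.0
    for i in range(m):
        total += max((s[i] + s[j]) / np.linalg.norm(centroids[i] - centroids[j])
                     for j in range(m) if j != i)
    return total / m

# Synthetic two-class feature spaces (assumed): far apart vs. overlapping.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
cloud = rng.standard_normal((100, 2))
db_far = davies_bouldin(cloud + 10.0 * labels[:, None], labels)
db_near = davies_bouldin(cloud + 1.0 * labels[:, None], labels)
```

Well-separated classes give a smaller index (larger inverse) than overlapping ones.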
6.3. Wrappers
The wrapper methodology [10,11] performs feature selection in terms of the performance of the classifier. In simple terms, there are three aspects that define its implementation [10]: (i) the search strategy employed in the feature space, (ii) the stopping criterion and (iii) the classifier structure.
The first step relies on performing an efficient search on the feature space, given the large number of possibilities, of the order of $2^M - 1$, M being the number of features. There are many possibilities to realize such a search, such as genetic algorithms, simulated annealing or greedy heuristics. In this study, the greedy heuristic based on forward selection was chosen, since it is supposed that the attributes are better correlated by a progressive incorporation. The simplest stopping criterion consists of the rule “if no improvement, then stop”. This approach can, however, lead to local convergence. A more robust stopping criterion considers k consecutive steps without performance gain. In this study k = 2 was adopted. The third aspect, the classifier structure, has a strong influence on feature selection, since the performance of the classifier is constantly evaluated, as described in the algorithm presented in Table 1. It is important to note that wrappers do not guarantee global convergence.
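A greedy forward wrapper with a patience of two consecutive non-improving steps can be sketched as below. This is a simplified reading of the procedure, not the authors' implementation: `score` is a hypothetical callable that would, in practice, return the cross-validated accuracy of the chosen classifier for a feature subset, and a toy deterministic score is used here so that the behavior is easy to follow.

```python
import numpy as np

def forward_wrapper(score, n_features, patience=2):
    """Greedy forward selection; stops after `patience` steps without gain."""
    selected, best_subset, best_score = [], [], -np.inf
    remaining = set(range(n_features))
    fails = 0
    while remaining and fails < patience:
        # Try adding each remaining feature to the current (tentative) subset.
        cand = {f: score(selected + [f]) for f in sorted(remaining)}
        f_best = max(cand, key=cand.get)
        selected.append(f_best)
        remaining.remove(f_best)
        if cand[f_best] > best_score:
            # Improvement: the tentative features become part of the solution.
            best_score, best_subset, fails = cand[f_best], list(selected), 0
        else:
            # No improvement: keep the feature under observation only.
            fails += 1
    return best_subset, best_score

# Toy score (assumed): features 0 and 2 help, every other feature hurts.
def toy_score(subset):
    useful = {0, 2}
    return len(useful & set(subset)) - 0.1 * len(set(subset) - useful)

best_subset, best_score = forward_wrapper(toy_score, n_features=5)
```

With the toy score, the search keeps features 0 and 2, tolerates one non-improving addition, and stops at the second one.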
7. Classifiers
The classifier structure is responsible for mapping each input feature vector onto a label corresponding to an element of a discrete set of classes. In simple terms, the mapping performed by a classifier can be understood as engendering a set of partitions of the input space that are delimited by decision boundaries [28,29]. Classifiers can be either linear or nonlinear, depending on the nature of the performed mapping. In the following, we will discuss three classifiers that are interesting options in the BCI context, and shall be, accordingly, adopted for further analysis.
7.1. Linear discriminant analysis
Table 1
Incremental wrapper algorithm.

Initially, k = 0 and there are three sets: T = {1, 2, ..., M} with all features, S = ∅ with the selected features and O = ∅ with the features under observation.
1. Evaluate, one by one, the classifier performance by cross-validation for all features of set T. Put in S the feature that presented the best performance and remove it from T.
2. Consider all features composed with the elements of sets S and O and test the inclusion, one by one, of the features of set T, evaluating the performance of the classifier by cross-validation.
3. If the classifier performance increased, select the feature that gave the best performance, include it in S and remove it from T.
   3.1. If k = 1, put the element of O in S, make O = ∅ and k = 0.
   3.2. If T is not the null set, go to (2). Else, stop.
4. If the classifier performance decreased and k = 0, put in O the new feature that presented the best performance in the last comparisons, remove it from T and make k = 1.
   4.1. If T is not the null set, go to (2). Else, stop.
5. If the classifier performance decreased and k = 1, a second consecutive decrement occurred, so stop.
In the end, the set S has the features selected by the incremental wrapper.

LDA is one of the most used strategies in BCI systems due to its simplicity and low computational cost. In simple terms, it consists in finding the linear combination w that best separates the classes, which implies establishing a decision surface in the form $w^T x + c = 0$, for a constant threshold value c. For instance, if we assume two normal multivariate distributions with means $\mu_1$ and $\mu_2$ and covariance matrices $C_1$ and $C_2$, respectively, the LDA approach aims to establish the w that maximizes the ratio between the inter-class and intra-class variance, which can be mathematically described by:

$$S = \frac{\sigma^2_{between}}{\sigma^2_{within}} = \frac{\left(w^T(\mu_1-\mu_2)\right)^2}{w^T(C_1+C_2)\,w} \qquad (8)$$

It is possible to show that the maximization of S is satisfied for $w \propto (C_1+C_2)^{-1}(\mu_1-\mu_2)$ and $c = -\frac{1}{2} w^T(\mu_1+\mu_2)$ [28]. There are also different criteria that can be used to set w for obtaining linear decision surfaces, such as the one provided by support vector machine strategies with linear kernel functions. When a Gaussian distribution is assumed, the covariance and the mean fully describe the model. However, non-Gaussian random variables can be assumed in this model, as the use of their statistical structure up to second order might be enough to solve the problem at hand.
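For the two-class LDA discussed in Section 7.1, the textbook Fisher solution takes w proportional to (C1 + C2)^(-1)(μ1 − μ2) and places the threshold midway between the projected class means. The sketch below follows that standard form; the function names, the 1/2 class coding and the synthetic Gaussian data are assumptions, and the study itself fits its linear classifier by least squares.

```python
import numpy as np

def fit_lda(X1, X2):
    """Fisher discriminant for two classes given as (n_samples, n_dims) arrays."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    C1 = np.cov(X1, rowvar=False)
    C2 = np.cov(X2, rowvar=False)
    w = np.linalg.solve(C1 + C2, mu1 - mu2)   # w ~ (C1 + C2)^-1 (mu1 - mu2)
    c = -0.5 * w @ (mu1 + mu2)                # boundary w.x + c = 0 at midpoint
    return w, c

def predict_lda(X, w, c):
    """Return 1 where w.x + c > 0 and 2 otherwise."""
    return np.where(X @ w + c > 0, 1, 2)

# Synthetic two-dimensional features (assumed) for the two classes.
rng = np.random.default_rng(0)
X1 = rng.standard_normal((100, 2))
X2 = rng.standard_normal((100, 2)) + 4.0
w, c = fit_lda(X1, X2)
accuracy = np.mean(np.concatenate([predict_lda(X1, w, c) == 1,
                                   predict_lda(X2, w, c) == 2]))
```

On these well-separated synthetic classes the linear boundary classifies nearly all training points correctly.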
7.2. Extreme learning machines
Structurally, an ELM can be defined as a multilayer perceptron neural network with a single hidden layer and a linear output layer (see Fig. 7). The parameters of the neurons that form the hidden layer are randomly chosen [8], and the process of training the output layer is essentially equivalent to the adaptation of a linear classifier. The choice of the number of neurons in the intermediate layer can be made by cross-validation methods.
The model evokes elements of biological neuron operation: input data are weighted, representing the synaptic efficiency, and the activation function determines the firing (output +1) or the absence of firing (output −1) of the neuron. A typical activation function is the hyperbolic tangent, which presents exactly a nonlinearity of this kind.
In simple terms, the hidden layer generates a number of nonlinear random projections that map the input vector space onto a feature space over which the output layer operates as a linear regressor. The canonical approach is to use the method of least squares, presented in Section 7.3. The ELM is an interesting option in the context of BCI in view of the simplicity of its associated training process and of its inherent regularization properties [30,31].
In our analyses, the number of neurons in the hidden layer of the ELM was fixed at 20 after preliminary tests. The hyperbolic tangent was used as the activation function. The weights of the hidden layer were generated using a random Gaussian function. The performance of the ELM was defined in terms of the average of 20 runs for each subject to account for the random character of the network.
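An ELM with the configuration described above (20 hidden neurons, hyperbolic tangent activation, Gaussian random hidden weights, least-squares output layer) can be sketched as follows. The function names, the inclusion of random hidden biases, the ±1 label coding and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def train_elm(X, d, n_hidden=20, seed=0):
    """Random tanh hidden layer followed by a least-squares output layer."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases (assumed)
    H = np.tanh(X @ A + b)                           # hidden-layer outputs
    w = np.linalg.pinv(H) @ d                        # pseudo-inverse solution
    return A, b, w

def predict_elm(X, A, b, w):
    """Class decision in {-1, +1} from the sign of the linear output."""
    return np.sign(np.tanh(X @ A + b) @ w)

# Synthetic two-class data (assumed), labels coded as -1 and +1.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((100, 2)) - 2.0,
               rng.standard_normal((100, 2)) + 2.0])
d = np.r_[-np.ones(100), np.ones(100)]
A, b, w = train_elm(X, d)
accuracy = np.mean(predict_elm(X, A, b, w) == d)
```

Because the hidden weights are random, repeating the training with different seeds and averaging the accuracy mirrors the 20-run averaging mentioned in the text.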
7.3. Least squares

The method of least squares is often used in regression analysis. In this study, least squares was used in two of the classification methods: the LDA and the output layer of the ELM.

Considering that in a classification problem we have a set of N labeled samples for training and that the vector of output layer weights is w, the main criterion underlying such a strategy is the following:

$$\min_{w} \|Hw - d\|^2 \qquad (9)$$

where H is the feature matrix, d the label vector used to train the classifier and w the weight vector. The solution to this problem can be calculated as a projection of the label vector d carried out with the aid of an operator based on the Moore–Penrose pseudo-inverse [28]. In the case of an ELM, if the number of neurons in the hidden layer (M) is larger than the number of available data samples, there will be multiple optimal solutions to the problem shown in (9), and the pseudo-inverse has the desirable property, from a regularization perspective, of generating a minimal norm solution. In this study, as already mentioned, the value of M was chosen in [...]

If the number of data samples (N) is larger than M, the solution is:

$$w = (H^T H)^{-1} H^T d \qquad (10)$$

If M > N, the solution is given by:

$$w = H^T (H H^T)^{-1} d \qquad (11)$$

If M = N, w is the same for both equations, since the matrix H becomes square.
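The two closed forms of the least-squares solution, w = (HᵀH)⁻¹Hᵀd for N > M and w = Hᵀ(HHᵀ)⁻¹d for M > N, both coincide with the Moore–Penrose pseudo-inverse applied to d. The short check below illustrates this on random matrices; the matrix sizes and variable names are arbitrary assumptions, and H is assumed to have full rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined case (N = 50 samples, M = 5 weights): w = (H^T H)^-1 H^T d.
H_tall = rng.standard_normal((50, 5))
d_tall = rng.standard_normal(50)
w_over = np.linalg.solve(H_tall.T @ H_tall, H_tall.T @ d_tall)

# Underdetermined case (N = 5 samples, M = 50 weights): w = H^T (H H^T)^-1 d.
H_wide = rng.standard_normal((5, 50))
d_wide = rng.standard_normal(5)
w_under = H_wide.T @ np.linalg.solve(H_wide @ H_wide.T, d_wide)

# Both agree with the pseudo-inverse solution.
same_over = np.allclose(w_over, np.linalg.pinv(H_tall) @ d_tall)
same_under = np.allclose(w_under, np.linalg.pinv(H_wide) @ d_wide)
```

In the underdetermined case the solution reproduces the labels exactly and is the one with minimal norm among all exact solutions.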
7.4. Support vector machines
The SVM [9] is a learning structure that can be used to solve classification and regression tasks. In the context of classification, it can be understood as a maximal margin classifier whose linear/nonlinear structure is defined by a kernel function. The design of a classifier of this kind gives rise to a quadratic constrained optimization task that can be solved using a number of efficient computational tools. In a classification system, the SVM follows two stages: training and classification.
In the training, labeled data are used in order to determine the hyperplane in a high-dimensional feature space that distinguishes the classes with maximal margin. In practice, the training can be performed in the original data space using different kernel functions, such as linear, quadratic, polynomial, multilayer perceptron (MLP) or Gaussian radial basis function (RBF) [32]. In this study, the MLP kernel was selected after preliminary tests with all the methods, in view of its stability over multiple trials. The MLP kernel is defined as:

$$k(x, x_i) = \tanh(P_1\, x_i^T x + P_2) \qquad (12)$$

where $x_i$ is the input data and the kernel parameters were $P_1 = 1$ and $P_2 = -1$.

The machines found in the training phase are then used to classify new data in the classification stage.
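The MLP (sigmoid) kernel k(x, x_i) = tanh(P1·x_iᵀx + P2), with P1 = 1 and P2 = −1 as reported above, can be evaluated for whole data sets at once as in the sketch below. The function name and the example vectors are assumptions; the resulting matrix is what a kernel-based SVM solver would consume.

```python
import numpy as np

def mlp_kernel(X, Z, p1=1.0, p2=-1.0):
    """Kernel matrix with entries tanh(p1 * <x, z> + p2) for rows of X and Z."""
    return np.tanh(p1 * (np.asarray(X) @ np.asarray(Z).T) + p2)

# Two orthonormal example vectors (assumed).
X = np.array([[1.0, 0.0],
              [0.0, 1.0]])
K = mlp_kernel(X, X)
```

Here the diagonal entries are tanh(1 − 1) = 0 and the off-diagonal entries are tanh(−1), which shows that this kernel is not guaranteed to be positive definite for every parameter choice.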
8. Results and discussion
The performance of all classification schemes was evaluated using cross-validation, there being six trials for training and two for validation. The 36 combinations of different techniques of feature extraction, feature selection and classifiers have been tested for each person, considering windowing of 3 s. Fig. 8 summarizes the average performance of all classifier schemes with the respective standard deviation.
Despite the environment and data acquisition having been kept constant, the best BCI performance varies across individuals; in our database we had:
• 1 subject with an accuracy rate of 100%,
• 4 subjects with performance between 90% and 100%,
• 1 subject with performance between 80% and 90%,
• 1 subject with regular performance of about 70%.
The inter-subject variability is a classical characteristic of BCI systems, being commonly reported in the literature (see [33,34], just to cite a few). Such variability is associated with several factors, such as the age of the volunteer, cerebral physiology and ability to concentrate. Furthermore, according to [33], some individuals do not have a visually evoked potential (VEP) response adequate to operate an SSVEP-BCI.
Figs. 8 and 9 show that the performance of the linear, ELM and SVM classifiers was very close for the subjects (p = 0.3992). The ELMs are potentially capable of operating with a robustness similar to that of linear classifiers, while providing a useful degree of flexibility. The SVM classifier depended heavily on the selection stage: for instance, using all 16 channels, the performance drops significantly, by about 8%, when compared to the best result achieved using selected attributes. The relatively poor performance of the SVM, in this case, may be because the kernel parameters were fixed: a more systematic selection based on grid search and cross-validation could lead to a better performance and will be investigated in the near future.

Fig. 8. Average performance of classifier systems with standard deviation.
Regarding feature extraction, the studied methods presented similar behaviors (see Fig. 8), although the use of Welch's and STFT methods appears to be slightly more effective than the use of a filter bank (p = 0.011).
Fig. 9. Performance of classifier systems for subjects with (a) excellent, (b) good and (c) regular VEP response.
Feature selection strategies proved to be relevant (p = 0.0001), as the use of different EEG channels had a clear positive impact on the system performance. All the studied strategies led to similar success rates, with the incremental wrapper being capable of reaching a slightly better performance (around 3%).
From Fig. 9(c), it is possible to note that, for a low VEP response, some combinations of signal processing methods give a performance gain. In the best case, the system achieves a 75% hit rate using the linear classifier with the features extracted by the filter bank and selected by wrappers. On the other hand, the system performance drops to just 45% in the worst case, when, for the same features extracted by the filter bank, no feature selection criterion is adopted and the SVM is used (with fixed kernel parameters). Surprisingly, for these subjects, the most informative electrodes are not in the occipital zone, as shown in Fig. 9(c). The channels associated with the motor cortex and parietal zone also included useful information for the classifier and appear earlier in the ranking of the feature selector.
In terms of the best chosen features, Fig. 9 shows the performance of each classifier system, listing the channels used in the best configurations for each case. Interestingly, as mentioned, the selected channels are not always in the occipital zone, which would strongly justify the use of a feature selection stage for SSVEP-BCI systems. Also, it can be noted that each subject is associated with a specific channel configuration, which could vary according to the feature selection strategy and the adopted classifier system. As a rule, there is a gain of information using channels from different regions; such performance gain could be attributed to the variability between the chosen channels, since choosing electrodes from the same region can lead to an undesirable bias related to highly correlated signals. This fact can be confirmed by the selection performed using wrappers, which does not consider the amount of information present at the channels from a perspective of [...] together to select the electrodes that give more information for the system. A similar dependence between electrode location and feature extraction technique was reported by [35] for motor-imagery-based BCIs.
Fig. 10 ranks the 16 channels by the frequency with which they appear in the best configuration for each scenario, considering the seven subjects. The occipital channels (Oz, O1 and O2) are the most frequent, as expected [7], appearing 14%, 11% and 9% of the time, a total of 34%. Next come PO7 (9%) and Cz (8%), these five electrodes being responsible for 51% of the occurrences. The channels Pz, FCz and P2 appear occasionally, but this does not mean that they should not be considered. This frequency ranking is an average among subjects and could be used to initially outline the best channels. However, for each subject, the best configuration is variable: as illustrated in Fig. 9(b), FCz is a relevant channel for this specific volunteer.
9. Conclusions
The results revealed that, for the two-class SSVEP problem, the best structure was the linear classifier using Welch's method for feature extraction and incremental wrappers to carry out feature selection. This configuration obtained an average accuracy of around 95%, with windowing of 3 s, for the 7 subjects, reaching 100% for some, which is very satisfactory. The feature extraction techniques proved to be nearly equivalent for estimating the spectral power. The Welch and STFT methods presented a similar performance, slightly better (6%, approximately) than that attained using filter banks, although this seems to be within the margin of error of the subjects. Feature selection proved itself to be an extremely important step, indicating the presence of relevant information in the parietal, motor and central zones, in addition to the occipital lobe. The results show that the three classifiers can be efficiently used to build an SSVEP-based BCI. However, the SVM classifier is very sensitive to the feature selection strategy, especially when associated with filter bank feature extraction. The ELMs are promising classifiers in the context of SSVEP, deserving to be considered as part of the current repertoire of BCI system classifiers, as they exhibit a good generalization performance. The obtained results support the use of ELMs, which can be even more efficient and promising when more classes are considered.
Acknowledgements
The authors thank FINEP, FAPESP, CNPq, CAPES, UFABC and UFOP for their financial support, and Prof. Dra. Gabriela Castellano, Dr. Rafael Ferrari and Ms. Harlei Leite for their important technical assistance.
References
[1] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, T.M. Vaughan, Brain–computer interfaces for communication and control, Clin. Neurophysiol. 113 (6) (2002) 767–791.
[2] J.D.R. Millán, et al., Combining brain–computer interfaces and assistive technologies: state-of-the-art and challenges, Front. Neurosci. 4 (2010) 1–15, http://dx.doi.org/10.3389/fnins.2010.00161, Article 161.
[3] A. Nijholt, D. Tan, Brain–computer interfacing for intelligent systems, IEEE Intell. Syst. 23 (3) (2008) 72–79.
[4] G. Dornhege, Toward Brain–Computer Interfacing, MIT Press, United States of America, 2007.
[5] N.F. Ince, F. Goksu, A.H. Tewfik, S. Arica, Adapting subject specific motor imagery EEG patterns in space-time-frequency for a brain computer interface, Biomed. Signal Process. Control 4 (3) (2009) 236–246.
[6] [...] features of P300 event-related potentials (ERPs) for brain–computer interface speller, Biomed. Signal Process. Control 5 (4) (2010) 243–251.
[7] Y. Wang, X. Gao, B. Hong, C. Jia, S. Gao, Brain–computer interfaces based on visual evoked potentials, IEEE Eng. Med. Biol. Mag. 27 (5) (2008) 64–71.
[8] G.B. Huang, D.H. Wang, Y. Lan, Extreme learning machines: a survey, Int. J. Mach. Learn. Cybern. 2 (2) (2011) 107–122.
[9] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discovery 2 (2) (1998) 1–47.
[10] R. Kohavi, G.H. John, Wrappers for feature subset selection, Artif. Intell. 97 (1) (1997) 273–324.
[11] I. Guyon, A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res. 3 (2003) 1157–1182.
[12] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1 (2) (1979) 224–227.
[13] C.S. Herrmann, Human EEG responses to 1–100 Hz flicker: resonance phenomena in visual cortex and their potential correlation to cognitive phenomena, Exp. Brain Res. 137 (3–4) (2001) 346–353, http://dx.doi.org/10.1007/s002210100682
[14] G. Bin, X. Gao, Y. Wang, VEP-based brain–computer interfaces: time, frequency, and code modulations, IEEE Comput. Intell. Mag. 4 (4) (2009) 22–26.
[15] Kian B. Ng, A.P. Bradley, R. Cunnington, Stimulus specificity of a steady-state visual-evoked potential-based brain–computer interface, J. Neural Eng. 9 (3) (2012) 036008.
[16] D. Regan, Human Brain Electrophysiology: Evoked Potentials and Evoked Magnetic Fields in Science and Medicine, Elsevier, New York, NY, 1989.
[17] L.J. Trejo, R. Rosipal, B. Matthews, Brain–computer interfaces for 1-D and 2-D cursor control: designs using volitional control of the EEG spectrum or steady-state visual evoked potentials, IEEE Trans. Neural Syst. Rehabil. Eng. 14 (2) (2006) 225–229.
[18] G. Pfurtscheller, C. Neuper, Motor imagery and direct brain–computer communication, Proc. IEEE 89 (7) (2001) 1123–1134.
[19] G.tec, G.tec Medical Engineering, 2015, Available http://www.gtec.at/ [accessed Jan., 2015].
[20] B. Graimann, B. Allison, G. Pfurtscheller, Brain–computer interfaces: a gentle introduction, in: Brain–Computer Interfaces, Springer, Berlin Heidelberg, 2010, pp. 1–27, http://dx.doi.org/10.1007/978-3-642-02091-9_1
[21] G.G. Molina, D. Zhu, Optimal spatial filtering for the steady state visual evoked potential: BCI application, in: Fifth International IEEE/EMBS Conference on Neural Engineering (NER), 2011, pp. 156–160.
[22] O. Friman, I. Volosyak, A. Graser, Multiple channel detection of steady-state visual evoked potentials for brain–computer interfaces, IEEE Trans. Biomed. Eng. 54 (4) (2007) 742–750.
[23] P. Martinez, H. Bakardjian, A. Cichocki, Fully online multicommand brain–computer interface with visual neurofeedback using SSVEP paradigm, Comput. Intell. Neurosci. (2007) 13–22, http://dx.doi.org/10.1155/2007/94561
[24] C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, New York, 1995.
[25] M.H. Chang, K.S. Park, Frequency recognition methods for dual-frequency SSVEP based brain–computer interface, in: Engineering in Medicine and Biology Society (EMBC), 35th Annual International Conference of the IEEE, 2013, pp. 2220–2223.
[26] S.S. Haykin, Adaptive Filter Theory, Pearson Education India, New Delhi, India, 2008.
[27] P.D. Welch, The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms, IEEE Trans. Audio Electroacoust. AU-15 (June) (1967) 70–73.
[28] S. Theodoridis, K. Koutroumbas, Pattern Recognition, fourth ed., Academic Press, London, UK, 2008.
[29] C.M. Bishop, et al., Pattern Recognition and Machine Learning, Springer, New York, NY, 2006.
[30] A. Bamdadian, C. Guan, K.K. Ang, J. Xu, Improving session-to-session transfer performance of motor imagery-based BCI using adaptive extreme learning machine, in: Engineering in Medicine and Biology Society, 35th Annual International Conference of the IEEE, 2013, pp. 2188–2191.
[31] L. Duan, H. Zhong, J. Miao, Z. Yang, W. Ma, X. Zhang, A voting optimized strategy based on ELM for improving classification of motor imagery BCI data, Cogn. Comput. 6 (3) (2014) 477–483.
[32] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.
[33] B. Allison, et al., BCI demographics: How many (and what kinds of) people can use an SSVEP BCI? IEEE Trans. Neural Syst. Rehabil. Eng. 18 (2) (2010) 107–116.
[34] B.Z. Allison, C. Neuper, Could Anyone Use a BCI?, in: Brain–computer Interfaces, Springer, London, 2010, pp. 35–54.
[35] S.A. Park, H.J. Hwang, J.H. Lim, J.H. Choi, H.K. Jung, C.H. Im, Evaluation of feature extraction methods for EEG-based brain–computer interfaces in terms of robustness to slight changes in electrode locations, Med. Biol. Eng. Comput. 51 (5) (2013) 571–579.