• Nenhum resultado encontrado

A Novel Approach in Text-Independent Speaker Recognition in Noisy Environment

N/A
N/A
Protected

Academic year: 2017

Share "A Novel Approach in Text-Independent Speaker Recognition in Noisy Environment"

Copied!
12
0
0

Texto

(1)

!

"#$ %

) 1 (

) * + +%

) 2 (

) 1 % & ' ( ( ) % * + , ( '

) 2 . % ( / 0 1% *234 % % / 56 % * 76 %

8 9 : , 1 / 5 / 1392 := > 8 9

1 / 3 / 1393

:.$/0

? 7 @>1 A3 6 = . % C., D % E CF 9 3 G.3 % HI.J3 K 1 D C L I3 2 CK I3 G %

MN ' 1% C / O PI9 ?4 F % 2DQ R% S. % % C'% M HE% 1 % / N H3 9

( 3 23 H T 2 . / CU F % U . '

3 , C V ' , R% S. % % .)D C PE W % C'% M 23 2

C X = M T% . 2DQ *

( 3

YZ % [ 2 9 ' MFCC

2DQ ( 3

23 Z ' 9 U % ). % ? 6 3 / \ . .

CIA] ^ % '

MLP 23 ( 6 C'

( K 1 6 = D 2 _ G 5 3

33 / 97 C 5 AJ % 2

2 H 33 / 61 C AJ C' % %

= 23 ( 6 A % .3 .

: 12 3 +12

Z % [ * D 2 MFCC

V ' , * 3 , *C

*( 2 . / * MLP

.

A Novel Approach in Text-Independent Speaker Recognition in

Noisy Environment

Nona Heydari Esfahani(1) - Hamid Mahmoodian(2) (1) MSc – Persian Foolal Company, Isfahan

[email protected]

(2) Assistant Professor - Department of Electrical Engineering, Najafabad Branch, Islamic Azad University [email protected]

In this paper, robust text-independent speaker recognition is taken into consideration. The proposed method performs on manual silence-removed utterances that are segmented into smaller speech units containing few phones and at least one vowel. The segments are basic units for long-term feature extraction. Sub-band entropy is directly extracted in each segment. A robust vowel detection method is then applied on each segment to separate a high energy vowel that is used as unit forpitch frequency and formant extraction. By applying a clustering technique, extracted short-term features namely MFCC coefficients are combined with long term features. Experiments using MLP classifier show that the average speaker accuracy recognition rate is 97.33% for clean speech and 61.33% in noisy environment for -2db SNR, that shows improvement compared to other conventional methods.

Index Terms: Speaker identification, MFCC coefficients, pitch ferequency, formants, Shannon entropy, MLP

4 5 2 ) % 1 : *

0 1% 234 % % / 56 % * /

[email protected]

(2)

1 ) .

.JD 0U.S3 ' 3% C % ,% e S69 C %

2DQ H K C ? 3/ F Q ' * X f X

%

% % ,% e S69 h Id9 G % L % % @ . % C., e S69 ic d3 / .)D 23

= % 27 C'

. % % ,% e S69 M .3 2U % UT C7 % CF 9

.J PX G h P9 L T * D 2 ' X

5K% 2 3/ = C 2 . * % 2U T H I3 L I3

( , 7 3 Hd3 ?% j9 H3 ( D D H3% T Q *

2D '

? T [ 3 % 27 * k ( 3 .)D *2P d3 * I. % U' ldA3 2

23 m Jd3 D .

K , ( ' 9 2DQ 2 .A3

% .Jn' H3 Z % [

2J ' ,

1

L % D 2 1

c % C' %

E 2 / ' ( % 2K AE H

] 1 [ % ( N 1 G % .

Z % [ R% S. % MFCC

H3 & I3 2 .A3 .U , M M %

23 ). % 2F X *

? C .U , M G % .U ,

d9 X% 7 k op9

23 % E 2DQ G % G % . D %

% X H I3 2 + 3 I3 7U T ,% lT .J

c d3 .J 23

.

= % 27 H A9 % ). % o% CAUk % % .3

H7 C MF 3 . % 0U.S3

2012 e S69 .J M

c .3 % ). % D , Y D

CU F N Z % [ % %

3 H A9 (/ Z ' 9 6 *MF

AJ 1 9 C' %

. ' I3 C C 5 AJ r ' 3% % L I3 C % 5 2 23 ,% ? C (/ 2 / ' *H ' ] 2 [ . 2009 M

(/ C' % 6 C L I3 D 2 =

R% S. % 4T Z % [

MFCC CU F N Z % [

* % % H A9 MF 3 ) C.JJD DWT ( ). % 5 G % . %

Z ' 9 5 7 Z % [ Vn %

D e S69 .J

15 % ). % D 2AsT C7A

MLP % ] 3 [ . \ . D e S69 _ A ) % %

2 D 25

2DQ % Z ' 9 G % C AJ

Z % [ MFCC

CU F N Z % [ 23 ( 6 %

. 2010

.J

Z % [ % ). % D e S69 MFCC

% R% S. %

' MF 3 CIA] c 9 Ad 5 '

HMM

. (% 3 C e S69 _ r % ,% \ . 6

/ 0 % %

r % ,% ( c % 4

2 D ) 2P % % %

C 5 AJ O F 20

23 ( 6 H 2

] 4 [ . 2011 Khaled Daqrouq % * 30 ( 2 . / Z [

H A9 % 3/ C MF 3

'

2

C V ' , % C 5

4 (% T C Ad 5 % R% S. % % 3 , V ' ,

2AsT C7A C MLP

G.3 % HI.J3 D 2 F C .)D ?4 F e S69 _ . ' ). %

= c 9 3/

( c % Ct% % 09

/ 91 H A9 ( . % % MF 3

C.JJD

3

O F 2 D ) % 9 C., D 5 % C 5 AJ 2

\ . hA] C' L I3 H 2

e S69 _ G 5 3 MF 3

uP 4 % % C 6 D 58 / 39 C %

% 3/ ]

5 [ . 2DQ Z ' 9 *CK I3 G % @ 2 .A3

, 2DQ rS M % R% S. % YG.3 % HI.J3

9 % .)D op9

D e S69 .J 2 / '

1% M % ). % . r ' G.3 % HI.J3 C.J CT 3 Q (% . C' .)D 2D

MFCC ' R% S. % (/ % %

2DQ R% S. % % 3 % * Z 3 G.3 % HI.J3

1 G % ? I Id9 C' % 2K 1 G % . % C., D % E CF 9 , ZUk% ? AT CU F M H' % % ' 9

23 ). % 2DQ R% S. % % 3 Y .)D 1% % .

23 3 29 .)D C PE ). % CU F M O PI9 % C'

? 7 @>1 23 C

* .)D 1% G % % ). % . /

Z % [ MFCC , 2 .A3 = % C' 23 C

/ 2DQ C 3/ % C PE 29 ( ) , ( H3 Z % [

D ( 2 . / MF 3

N V ' , C V ' , * '

Z ' 9 5 7 % 3 , M .)D rS % 9 %

2DQ Z ' 9 ( 73% w 3 C . H 769 2DQ % 2 .A3

, C X . 5K% % 29 H , 2 .A3 Kmeans

YZ % [ MFCC ). % D

. %

CIA] F 2DQ % C + N ( .n 2AsT C7A C

) (MLP 23 % PX 6. % V . 5K% .

) rS % . % . % X Ct% % ZK P3 CK I3 G % C3% % 2

(

% X 2 D e S69 .J

) rS . 3

2DQ (

C., ' C x% s.S3

% u [ 9 ) rS . %

4 = 2 C (

CIA] C7A MLP rS 6 = . % C.X%

) 5 % u [ 9 ( ) rS %

6 % ( ? 6 3 / \ .

) rS . % X 3 L % 7

. ld C ( C

D

23 ?% 6 Ct% % . % 2 ) 6 7

Ad C' % 2sS e S69 ' X * D e S69 23 sd 3 ? T4]% & % ' C

D * .)D 5 R 3 ,

23 2 % e S69 C. C D e S69 .J . '

D h s9 D 23 JI9

] 6 [ . .J

C e S69 y%

" 23 Ad 2J' CN z '

" 23 8 .

.J C' 2K 1 % h s9

"

( D /

23 T % C' % 2J' z X '

" 23 { P3 .

.J D C. C 5 2 D % D e S69

4

D C.J

5

23 JI9 ]

(3)

D 2 r % C' % % D ' 2, 3 & (% T C

CT 3 C d3 % % .)D (/ C' 2 .J % 23 3 C.J D * % C.X ( D D G | .

% % .)D C7 / & % D e S69 X % L U 3 r %

23 JI9 G.3 % HI.J3 G.3 C C.J % rS C Z 9 9 C ] 6 [ . = 3/ CU1 3 % % D e S69 .J e S69

23 , ( D D Ad 5 = 3/ CU1 3 .

D % 2 F 3 *(/ % 2DQ R% S. % % 23 C.X

.

23 OF 3 G % CT 3 3/ 3 M % 9

. 5K% % %

= 3/ CU1 3 % C' OF 3 5K% CJ I3 & % *e S69 CU1 3 H 1 * Ad 5 % 3/ C 5K% %

23 2 w 3 D

.

3 ) 9 : ; 7

S. % % % ? sS63 ( / C w 3 C 2DQ R%

.3 % % ( D D 2 X C % . C' % Ad 5 2DQ . e S69 % Ad 5 % R% S. %

2t F ?% j9 % % w 3 D % * D CU , 1 G T (/ 4T . C. % ( D D

2DQ % *G.3 % HI.J3 2 % C' ' ). % 2

2 % / % .d3 C J1 G . '

6

. C. % % Ad 5

2DQ % CK I3 G % 2 .A3 % .Jn' Z % [ 3 0U.S3

H3 3

7

D ( 2 . / Z % [ * MF 3

V ' , * '

3 , V ' , C % u [ 9 C3% % C' % ). %

23 .

3 ) 1 ) ) < ;= > 5 ; ?2 @ ;A (

Z % [

MFCC

2DQ % .)D l1 A3 5K% e S69 \ %

f% X % H3 3 2 .A3 % .Jn' Z % [ 2U % % . % D C., D .)D } , ( J % = % ). % . %

AJ V ' , G d3 ? T4]% C .6 = % H3 X 23 V ' , + d3 ? T4]% C e S69 E & I3 G % .

.J % 23 r % ,% D e S69

] 3 [ .

MFCC

n' Z % [ C., A ^ H % 23 % .J

G % C7 % C CF 9

, 2 .A3 9 ' ( 3 2DQ 2T Z % [

8

?% j9 * %

23 ( 6 X % 9 ' ( 3 C U' *R% G 9 %

. % Z 3 D

Z % [ CA d3 %

MFCC

, Ad 5 % . %

23 %>D 9

r ' , . % ^ 25.JJD

23 T% 2UAE CU1 3 C . C , H A9 L D . .

23 H A9 V ' , 1 C ( 3 1 , G % . D

V ' , ( 3% 0 ] W % CA d3 % M .U , % Vn 23 ). % eS63 M .U , G % .

( 3 .U , % 9 H3 N H3 & I3 . % 2PX ? C C' % >D

%

C] 3 .U , ' 3 V ' , % % .U , 5 9 U ' M 1 9 H3 & I3 .H3 & I3 x A I9

V ' , % 2PX 23 2 . 5K 9+

I3 . 9 &

) CP % % ). % 1

23 H A9 H3 & I3 C ( .

f 2595 log 1 (1)

23 AT 2~U~3 >D V ' , M % % 29 5 M .U , .

Z % [ *M .U , % Z % [ ( % >D %

MFCC

CP % % ) 2 (

23 CA d3 .

c ∑ # log S m!"cos %& m ' 0.5!* (2)

(/ C' i=1,2,…,P *

P

Z % [ % 9

MFCC

*

N

.U , % 9

M .U , S(m)

2F X m H3 & I3 M .U , .U , G 3%

Z % [ ( / C % CK I3 G % . %

MF CC

% % CA F %

voicebox % ,% L C.

MATLAB . % ). %

, T%

• ? C T%

2~U~3 ? C H3 3 2 .A3 .U , M .U , C., D w . %

3 ) 2 ) * B ; C @ ;A

H A9 MF 3 C€ 9 C€., € 9 €' M€F 3

( €73% C€' € %

23 % 5 H Ud9 % .6 H Ud9 . '

MF 3 5 *

Z I9 CX C

9

? t F

10

23 JI9 € € G€ % €

23 % 79 Z I9 CX % .

n % € 9 *uP€ n+1

G€7 3 €J3

% F 5 %>D ' C 9 % H Ud9 .

MF 3 * €'

23 % ? t F €' G€ % H€ 1 . €' C€ 9 ? A I9 (% 9

% r C 2 .

n

2 . % 5 %>D ' % ? ).3 =

C 9 I I1 MF 3

C D = C C 9 % % MF 3

. % '

H A9 G % MF 3

( ' K / % . r ' 5

H A9 C AJ MF 3

23 Ct% % .

23 G % R% S. % % (% 9

2DQ ' C 9+ e S69 _ w 3 C 2, [% 3% .

X h T r % ,% MF 3

CIA] % 3 ( 3 * '

23 2PX k % 5 Z 3 C Z % [ r ' C .

3/ % MF 3 Z % [ w 3 G . % [ 3% '

D ( 2 . / MF 3

= % 27 C' ' Z 3

2DQ % r ' % %

] 7*5 [ 23 CA d3 .

C Ad 5 % . % *( 2 . / Z % [ R% S. % w 3 C H A9 CU MF 3 9 2sS63 uP ' Vn C

D C % ( 2 . / CK 3 % ). % uP (

) 3 23 CA d3 ( .

El s! ' ∑ S log S ! (3 ) (/ C' S

Ad 5 Si

H A9 Z % [ MF 3

'

23 . 2DQ F 2DQ G % % Ad 5 U ( 3

2 CA d3 * 9 ' 2 3 H % , x+ 3 .

(4)

3 ) 3 ) 9 D ;E F 2;E . B F 2;E ! !

.)D K 9 2] 29 9 = 9 % V ' , C V ' , 23 C % % ( F C' 23 5 29 9 r6' D% .

23 AT 1 '

ZA cI, 23 * L% / ?

% VK % 9

23 ZA VK G % F . ' K 9 M CA % 3 C'

? sX *C V ' , . ' K 9 % % % @ 1 % . 29 % % @ 1 ( L 5 % D 29 9 ? M

23 ( 6 D 2 %

oy3 23 OE% X .

? ).3 23 9 / .X H K C *C V ' , ( ) sd 3 HI.J3 2DQ M 9 C C V ' , . % D ) sd 3 2DQ M (% T C .)D ? I Id9 C7U % G.3 % oy3

% ' J D J F e S69 ]

8 9 [ .

% Q C % / 2X 0 ] * Ad 5 M R% 3% 6.

C'%

11

23 L% % . H7 .)D

% % 2 3

23 % % % @ 1 K 9 [ 69 % 2I] 3 G % . '

23 H7 29 % 3 69 0 ] 69 G % ( 73 . D

2J ' , 5.J 2 H7 ^ C % 3

29 D %

.

V ' ,

V ' , * % '% .)D ] 29 69 3 3 ,

23 3 , . 23

e S69 % *C V ' , 3 % 9

' C J F ]

10 [ 2.1 W K , *M U3 K 1 G |

% D 21 ?+ 1 23 ( 6

] 11 [ . L + C' % '• C

3 , C V ' , 2DQ F

Ad 5 U ( 3

x+ 3 .J 2DQ Z ' 9

23 ). % .

= V ' , C V ' , ( / C % 2)U.S3

= . % Ct% % 3 , L .Jn'

] 12 [ * X O 9 25.JA

] 13 [ M 3 m ‚U 1 0 ] ]

14 [ G 9 ' CU F %

) 2) ] (% 9 2K 5N . 5K% = % CK I3 G % . .J PSD

12

(

. % ). % 3 , V ' , C V ' , G S9 %

. 5K% PSD % CU1 3 H3 ]

5 [ % . % : PSD ) Pxx

). % (

% = Yule-Walker 2.6D X

13

) AR 23 G S9 (

' 3 L CU1 3 Vn 23 % e S69 2Ud3

= .

Yule-Walker (/ 2F X C' % .3% 0 ] G S9 = M

) Pxx

% 2 S9 ( PSD

5 x 23 .U , 3 = G % .

3 c 9 % 2.6D X 2PX 2t 56 PX ( '

23 AJ 5 C ? 3 HE% 1 H7 C 6 2t 56 .

8 C 3% ? 3 *= G % CU C 3/ C 2 S9 0 ] X 3 G % 2J ' , C7 % % V . % 2.6D

PSD G S9

) CP % % ). % (/ H[ )9 * 4

23 CA d3 ( .

D P.. i 1! ' P.. i! (4) (/ C' Pxx(i)

Pxx(i+1)

, 2J ' , 0 ] 2K 5N % I3

. .J 2K% .3 ' 3 D % E Hd3 Vn

2 F 2 *2Ud3

C' D 23 34T j9 2) 3 C A~3 %

23 2 *

G K % . , ' eS63 CUE

CUE C I C V ' ( 6 Z 9 9 C

3 , . .J L N 9 %

4 ) # 7 = = . >H MLP

C7A

MLP

@ P % H K C 2 % 9 * % >

CIA] C7A . % 5K%

MLP

M CK I3 G % ). %

+ M H3 *C + N C7A C + M 2)S3 C + * C

23 2F X 2 % 9 r % ,% H K C ( C + % ). % .

2DQ % 2A ' 9 5K% C7A D . % 0U.S3

23 % = 3/ PX 6. % V . 5K% A3 C7A G % .

2F X = G % 2F X) 2 E%

(C7A 2F X

23 CJ I3 m UP3 (

C * 6. % V . 5K% CU C

23 w 9 ? w d9 ? 9 D

. / F C Z 3 5K%

^ 3

O 3 * PX

E

* 2F X G m UP3

2F X 2 E% % 23 9

( ) CP % ? C C7A 2F X ƒ + 5

23 ( ( ]

15 [ .

E ∑0# D0' Y0! (5)

C' n

D

n

Y

2F X Z 9 9 C ( 2 E% m UP3

n

L%

.J 2F X

N

. % 2F X ( % 9 @

r '

E

23 w 9 2K ( % D = C 3 % I3 C . D

K 3 ƒ ( ( / C %

= 3/ c % ? C

) 6 ) ( 7 % ( ] 15 [ :

w3 t 1! w3 t! ηΔw3 t! (6 )

Δw3 t! '7:78;<9=!! (7 ) C'

η

% D _ G + 3

0 1 23 m S. % *

wij(t+1)

F ( wij(t)

23 2UAE (

n

E

O 3 ^ 3

5K% % 2F X PX

n

. % L% 23 5 D

23 0E .3 * PX H' ^ 3 C'

E

* 5K% C % % I3 %

H' % 9 . ' G 9 ƒ . / /

= 3 ( C

] 15 [ .

5 ) " B J

6 H1 % Z 3 Z ' 9 % ). % *h Id9 G %

2DQ 2DQ , 2 .A3 2 3 ? PE 2 .A3

%

R% S. % % C' % 6 % . % .)D 2DQ CU F H' C 79 F C * U ( 3 ? PE % ? AT

23 C PE . .J % / % 9 H3 C' ). % .)D % 9

M H3 D Ad T 2 % / % .d3 C C.J C'% C' % 3 . 9 C PE ] ( 9 ' 2U X

23 r % ,% C U' M 2. ? C ?4 F O PI9 .

C 79

. % C., D L % Ad 5 „ H7 29 % .d3 ) H7 1 23 ( 6 % % F rS M CU F M ( .

(5)

) H7 1 G N cX :O PI9 d ( D % F 3 ?4 F % 27 :( M

23 ( 6 % O PI9 C 1 .

Fig. (1): A sample utterance of dataset and segmentation process: dot lines correspond to a segment

2DQ Z % [ h Id9 G % ). % 3 C K % MFCC

. .J

2€DQ G 9 €' (% € T C Z % [ G % € D e S€69 €

23 C.X , % V

5 C PE Y Z % [ Ad

MFCC 3 , %

2€3 CA€ d G€ % % % € 9 C€N C€7 % . €

€3 C€K I3 G€ % € € % Z 3 D e S69 % Z % [ % Z % €[ % 9 . % C., D % E 2 12

% r % €,% \ €9 C€

. % C., D % E 2 3 .J 2 / '

% C., D % E ). % 3 CK I3 G % C' 5 2DQ Z % [

D ( 2 . / Z % [ . % 2 . / MF 3

( '

,

C PE % 5

.)D CA d3 23 . Z % [ G % 23 R% S. % C PE % 2.1% C ? PE % L% ' ( N C. H3 2 % '% % / % % Z % [ G % * .J }% Ad ? sX 2 X C % / C % / M % >D % D 23 r 5 % . % *( 2 . / Z % [ CA d3 % . H A9 Ad MF 3 ^ N uP ' Daubechies ) db1 23 C 9 ( uP 5 C 9 . 4 * 30 C D 23 / C' Z % [ ( 2 . / 23 9 G % D CA d3 23 . 5 ) 1 ) D ;E . B F 2;E : ; 7 ! 2DQ 5 % % 3 , N V ' , C V ' , 3 . 5K% % / CA d3 % C' .J = G % 5 ). %

PSD ). % 23 . G % . 5K% V ' , 3 , V ' , C % % , ( ) .)D C PE ( R% S. % 23 . ' C G % w 3 C L I3 2 W % U C'% M .)D C PE % . % 23 2 3 , C V ' , 9 . R% S. % (/ % x+ 3 € %>€D C .€ / 2T € % % '% 21% 2 % 2DQ 23 ). % V ' , ( 3 ( €3 W % O 9 % ). % . 5 € .' , % .Jn' M C 3% * ) % AT _ * 9 ' = G % . % % .3 ' G % % .)D € o C .€ / € 2 .A3 €w €3 € .3% m S. % C . / I3 h E G 9 .J . % % t 1 J %>D C . / F * % .Jn' M C 3% * W % 2 .3% N Z ' 9 % CK I3 G % % % '% 21% 2 F * .' , ) % AT _ 2 .3% C CF 9 C C . / I3 . % ). % }% CA d3 U 9 ( D % 3/ M C … 3 C . / C7 / G [ . % 2I, ? C % .Jn' 14 23 w 9 M C 3% C' 2 / % . C Vn ? C % . % % '% rS M % .Jn' 23 j9 2K ? * % '% 21% ' 1 ' G % * ' x t% 23 2 C >D ?% j9 9 rS % 1 . † dK (/ C V ' , ) CP % % 9 ' ( 3 W % , % 8 23 CA d3 ( (/ C' En m E W % n *L% W ] s(i) .)D C . % x+ 3 2 21% % 9+ % '% 21% W % uP . % }% E0 ∑ s i!># (8)

2 21% C * % .' , 5 3 ' .)D }% = % .' , CA d3 % . % % ' G % r % ,% Kat'z ] 16 [ ) 3 , % .' , = G % . % ). % 9 ( d3 23 CA : FD0 @A@ABCBCD!E! (9)

(/ C' FDn m E .' , n *L% L 5 2 d 3 H' ] % 2K% .3 … I G CU , ^ 3 ( d CU , 5 C G K % G 2 UE% C CU , G .6 C' % % . % % L d ) c % ? C 10 ) ( 11 23 CA d3 ( : L ∑>J GH s i 1! ' s i!! 1I # (10)

d maxNdistance s 1!, s i!", i 2, … , WT (11)

% 9 ? C C' % ) % AT _ * ' j9 ? , 23 0 9 5 34T * % .Jn' M CA d3 % . 23 CA d3 % .3 = % 5 L .Jn' % . % . CU , % .Jn' M C 3% Vn 5 / 2 9 15 2U 3 2 C o 23 ) c % ? C C . / I3 . 12 ) 9 ( 15 23 0 9 ( : Th1 k1. mean E! (12)

Th2 k2. mean FD! (13)

Th3 k3. median ZCR! (14)

Th4 k4. median CP! (15)

(/ C' mean median ( 6 Z 9 9 C 3 G 5 3 C I3 C . / I3 G % CA d3 d . .J 3/ k1 9 k4 C H Ud9 L % % V % 2 9 ? % 3/

C

2 / % '% 21% e S69 % . % 3/ I3 *}%

CP,

ZCR, FD, E C . / I3 G % E % ). % CA d3

23 CJ I3 G % 2F X .

G % E 2AJN % C' eS63 23 '

2 % '% .)D *? 7 C 1 C … 3 , . % }%

(6)

•If (En<0.5.Th1) ˄ (FDn>Th2) ˄ (ZCRn>Th3)→ Silence

•If (0.5.Th1<En<Th1) ˄ (FDn>Th2) ˄

(CPn<Th4)→Unvoiced Speech

•If (En>Th1) ˄ (FDn<Th2) ˄ (ZCRn<Th3)˄ (CPn>Th4)

If (CPn-1<Th4) ˄ (CPn+1>Th4)→ Start of Voiced

Speech, Th4=2.Th4

If (CPn-1>Th4) ˄ (CPn+1>Th4)→ Voiced Speech

If (CPn-1>Th4) ˄ (CPn+1<Th4)→ End of Voiced

Speech, Th4=0.5.Th4

If (CPn-1<Th4) ˄ (CPn+1<Th4)→ Unvoiced

Speech

rS G % E G % 2F X 23 eS63 % + W % % '%

'

C'% ( 6 C' 23 .J

N V ' , C V ' , (% 9

. 5K% % ). % % % 3 , PSD

C'% % . ' CA d3

% 3 3 C PE C' C'% * C. % C'% M % r w

) H7 . % C., D % E CA d3 A3 .6 ] 2

C PE M (

23 ( 6 % 6 = (/ % R% S. % C'% .)D cX C' G N *H7 G JE . % eS63 T

PSD 23 r % C'% G % 3 .

CA d3 % % M E

M C I E 3 C V ' , 3 , CA d3 %

3

. % C., D % E ). %

) H7 2 ) (/ % R% S. % C'% .)D C PE M (

a

% C (

PSD

(/

) C'%

b

(

Fig. (2): A sample speech segment and the extracted vowel (a) with vowel's PSD (b)

5 ) 2 ) 9 @ 2; . >H !

=

w 3 C Z ' 9 C.

*2DQ 2 2DQ , 2 .A3

2DQ 29 ? PE 2 .A3

% . 5K% C X Kmeans

Z % [ MFCC 23 ). % M .)D C PE % .

23 C., D w C X Y ' 3 1 2 2DQ % G % .

35 H3 29 C PE 2 .A3 2DQ 30

*( 2 . / Z [

C' % % 3 , N V ' , C'% M C V ' , 12

Z [ MFCC 23 Z ' 9 29 C PE (/ C X ' 3 C … 3 .

( D % YrS % C' 2 2DQ % ^ 3 23 R% S. %

% M 47 % ). % ? C' % eS63 . % X

Z [ .6 % 9 MFCC

. , % X r % ,% 2DQ %

23 % 2AsT C7A C (% T C 2DQ % Vn .

CIA] F C., ' C 2AsT C7A ( .n C7A *

C *C +

23 2)S3 C + % 9 .

D % % % % C +

% C' % 2DQ 47

23 ( % 9 . % 2)S3 C +

% % 25.J ( D D ^ 9 C L 15

r 3 / 3 D

23 C 2 9 = h Id9 G % CU1 3 G N % V . /

/ 7U T G . *r 3 50

% 2)S3 C + ( 40

(

. % L 2)S3 C + C 2 . % 2 9 ? 6 3 / \ .

) F 2AsT C7A 7U T G . 1

2 . % 3/ (

2DQ % C … 3 48

. % ( K 1 6 Y

63 C' P 23

( % 9 .6 r % ,% C 3

2 2 _ 2 Jd3 j9

% D ( 3 9

23 r % ,% x [ .

C + % .6 % 9 % ). % C 2)S3

. J 3 * J9 = 3/ CU1 3 ? A d3 A % 1 H K % ( N 2AsT C7A 2F X 23 C'

% 9 16 0U.S3 K 1

% % C A D e S69 .J L% D } U . '

) H7 h P3 3

G % ). % 2AsT C7A .X . % (

) H7 L% D } U 4

(/ C' % % ( 6 ( NH

% 9

( 23 ( % 2)S3 C + PX % . '

C7A = 3/

) H7 2AsT 5

. % % ( 6 (

Table (1): Experimental results for neural network to achieve the best performance

) F 1 ( 3 / \ . :( 7U T G . C 2 . % 2AsT C7A 2 9

C7A .3% C7A .3%

3 D e S69 _ G 5 ZJ1 = 3/ ( 3

C o

C + % 9 2)S3

1 % 2)S3 C + 40

(

L 2)S3 C + 30

(

L 2)S3 C + 20

( 80

98.4756

2 85.33

242.6515

3 86.66

518.0851

( % 9 % 2)S3 C +

40 2)S3 C + % %

2)S3 C + L

30 ( 85.33

242.6515

45 86.66

348.6753

50 88

384.6829

55 89.33

501.5637

( % 9 L 2)S3 C +

30 2)S3 C + % %

% 2)S3 C + 50

( 88

384.6829

35 90

412.0913

40 97.33

498.3161

45 98.66

617.1348

0 50 100 150 200 250 300

-10 -5 0 5 10

speech signal

Time in ms

a

0 20 40 60 80 100 120 140

0 1

x 10-4

b

(7)

) H7 3 .X :( D e S69 .J

Fig. (3): The structure of speaker identification system

) H7 4 2AsT C7A .X :(

Fig. (4): Neural network structure

) H7 5 2AsT C7A = 3/ PX % :(

Fig. (5): Neural network training error curve

6 ) 3 C L

JE G % = % ). % / H Ud9 ? 6 3 / \ .

23 9 . % 3/ D e S69 % .3 = N 6 J9 ) K 1 ( K 1

AJ O F 2 D L % 0U.S3 C 5

. %

6 ) 1 ) !

% / 5 H3 r 3 % % C .)D

15

H3 D

8

( 7 % 5 % C' % 3 TIMIT

R% S. % G % . %

% C V ' , %

16 E 9 U ' bit

16 ? C

H , wave O F / D . % 10

' ( % CU F

C' % 2 CU F } .63 ( D D C G

8 5 CU F

. .J ? ).3 8

… 3 .)D ? PE R% S. % % '>3 CU F

2 3/ ( D % C 2

% E ). % 3 J9 % } .63 CU F

C., D ?4 F .J U 9 % ? ).3 ?4 F % ). % . %

T % 2I E r lT * J9 % } .63 K 1 (/ 7U

23 G.3 % HI.J3 .

(8)

G % CK I3 % R% S. % 2DQ

MFCC , 2 .A3 C'

, ] * .J 40

2U 3 , 2 n C o 30

2U 3 C o

C., D w .J E r % ,% w 3 C 1 G % . %

3/ 2 %

2DQ R% S. % % * 5

JE .)D k

(? 7 ) 5

Ad 2

@>1

2DQ R% S. % % (/ 4T . C'% 2 .A3

CU F %

3 , C V ' , HE% 1 1 .)D C PE C' % L +

PE % ). % C7 / G [ . C'% M 9 ' 1 % r ?

2DQ 2 / ' * U D e S69 % 2 . / 2 .A3

23 r ' 3/ CU1 3 G % .

% Q % % %

. % % X

? 7 CK I3 G % CU F % / CT 3 ? U' G

L % ). % % ,%

15

PRAAT @>1

.)D ? PE % M

. % X• % 3 29 H , M 23 ( 6 .6 2

M9 % F *O PI9 ^ G . C' C.AK% . % 2 2

C'% C' % 3 2X % % 9 ' J

.6 % 9 % % % /

' , R% S. % 9 % ). % C U' . 6 % PX C V

C D c 9 ( % % d (/ % .d3 C CF 9 CU F ) F . % JI9 .)D C PE % 9 2

? PE % 9 (

( 6 J9 2 3/ ( D % % D C … 3 .)D 23 .

Table (2): Number of speech segments for each speaker

) F 2 % .)D ? PE % 9 :( D C … 3

.)D ? PE % 9 J9 ( D %

.)D ? PE % 9

= 3/ ( D % J F ( D D

7 32 ( 1 D

9 46 ( 2 D

12 40 ( 3 D

8 42 ( 4 D

10 37 ( 5 D

9 45 3 6 D

10 47 3 7 D

9 45 3 8 D

11 49 3 9 D

11 38 3 10 D

9 44 ( 11 D

8 50 ( 12 D

10 36 ( 13 D

7 35 3 14 D

6 24 3 15 D

% I3 @>1 %

DC 5 ' 9 r .U , T%

5

) 3 , c 9 Ad 16

23 K 3 (

] 17 [ @ 9 . 5 7 CJ I3 H E C 3% ? )9 % w

S ^;J_

` (16)

C' Si

* i 5 € C€)K 3 G €3% S

* € %

σ

G 5 € 3 Z€ 9 9 C€

% 3 @% d % S

23 . SNi

* i % C)K 3 G 3% 5

SN

. % ( % K 3 % V

6 ) 2 ) . >H 9 : ; 7 =

2DQ CU1 3 G % R% S.€ % 5 € C€ PE € % €w 3

23 € ( € K 1 2 3/ ( D % % 2DQ R% S. % . AJ % 2sS63 % I3 % % C ( ' C, [% % V J9 ( D %

23 L % C 5 . D

12 9 16 Z [ MFCC % V €,

23 R% S. % € % € R% S.€ % Z % [ Vn .

C X M .)D C PE Kmeans

23 T% €' 3 M€ 9

29 € H€ , € % € . € / C 29 H , % C X 30

€ ( 2 . / Z [ C, €[% 2€DQ % € C€ € CA€ d3

23 2 .A3 = % ). % Vn . C . /

JE C' %>D

5 1 ( * d3 M C'% W % 2 .)D C PE

= 3 , N C V ' , * PSD

23 R% S. % .

2DQ % ). % C7A Vn = €3/ €d9 2€ 3/ ( D %

23 J9 C., D % E *@ € 2F X 21% ] d C CF 9 .

A€ (% € 3 G € 9 % € 2€ UE% CU€ , € 3 % J9 L 5 X 23 ). % D 2F X C7A 2F CU€ , G €. ' .

2 D (% T C

23 C., D w CA€ d3 €d .

) CP % ? G % C D e S69 2 UE% CU , 17

% ( ] 5 [ :

C0 100 ' a100 ∑ P0' SR0! / ∑ P0 "c (17 ) C' (/ Cn

C A 3 *2 UE% CU , % ). % 3/

Pn

( D D OF 3 @ 2F X SRn

. € 2€3 C7A€ 2F X

' n 23 ( 6 % D G % € €

n={1,2,…,15}

. € % % € ‚T% M9 C M9 H[ )9 *h )9 34T % w 3 . % X 0 9 d H K C % % € % *C7A€ .X @ 2F

4

C€ % 2€ UE% CU€ , '% G C. 3 , . .J ‚T G T ? 0

9 1 23 ( €T (/ m €[ € C' ' 100

23 ( ? C CU , % % I3 G % J' .

100 3

23 C

A€ ( 6 C' /

2€F X € 2€F X

€ ? € C 2 D . % ( D D % M @ ) CP % 18 23 eS63 ( .

) 18 (

C arg max Cn, n 1, … ,15!!

G€€ % * €€6 .€€J €€7U T 2 €€ % C€€J I3 €€w 3 C€€ r 3 / % e S69 _ ^ 23 CA d3 .J

(9)

€d € % C€' € % € D e S€69 _ € *2 % 3 % r D% D % C' % ? G % C ' G % CA d3 €w €3 € D * € % e S€69 € 29 ? PE 0s % e S69 u d . %

' * D e S69 _ 4T e S69 _ (% T 5

' G % CA d3 % . % C., D w .)D C PE ( D % F 3 Ad 5 ? PE H' ^ 3 % . % r 3 /

23 CA d3 C C7A c 9 C' 2K 5 ? PE % 9 Vn .

ZJN 2. 2

23 CA d3 % C PE H' ^ 3 . D

5 2 2. C C' 2 ? PE H' JI9 %

5 5 e S69 _ 5 * ? C F 3 ) c % . %

19 ) ( 20 C PE e S69 _ CA d3 d Z 9 9 C (

23 ( 6 % D e S69 _ 29 .

) 19 (

Segment Accuracy%

100H∑ Truely Detected Segments! ∑ All segmants!⁄ I

) 20 (

Speaker Accuracy%

100H∑ Truely Detected Speakers! ∑ Speakers!⁄ I

x [ r w 3 C u d

7U T C7A ( C + C CF 9

J9 H1% 3 23 9 *eE D .1% C7A = 3/

F C

= 3/ % V C7A CU1 3 . % L % \ * M % % * (

23 % E J9 3 . D

\ 2] C . G . 2 % .3% G 5 3 = 3/

C7A J9 . % C., D % E ). % 3 2 % .3% (% T C

Table (3): Recognition results in noise-free condition

) F 3 :( K 1 e S69 \ . (

e S69 _ G 5 3 .)D C PE

e S69 _ G . .)D C PE

_ G 5 3 D e S69

_ G .

D e S69 2DQ %

63.08 80.88 62.66 86.66 MFCC Z [12

76.32 85.29 89.33 100 W %+MFCCZ [12

66.76 77.20 76 86.66 2 . /+C V ' ,+ 3 , +MFCCZ [12

82.20 83.82 97.33 100 2 . /+C V ' ,+ 3 ,+ W %+MFCCZ [12

Table (4): Recognition results in -2dB SNR

) F 4 C 5 AJ e S69 \ . :( 2

H 2

e S69 _ G 5 3 .)D C PE

e S69 _ G . .)D C PE

e S69 _ G 5 3 D

_ G . e S69 D

2DQ %

53.38 65.44 54.66 80 MFCC Z [12

42.79 58.08 37.33 53.33 W %+MFCCZ [12

44.85 61.74 46.66 66.66 2 . /

52.05 55.88 54.66 60 3 ,+C V ' ,+2 . /

59.70 72.79 61.33 86.66 2 . /+ 3 ,+C V ' ,+MFCCZ [12

6 ) 3 )

9 @ 2; M1 3N % 3 C L

!

% r 3 / G K % 12

Z [ MFCC Z ' 9

% 2)U.S3

2DQ ? C D e S69 \ . . % ). %

( K 1 C . G . ? C G 5 3

2 2 ) % F Z 9 9 C H 3

) ( 4 C U' . % % ( 6 (

3 r % ,% 2U' K 1 . % ( ? C \ . G 5

23 ( 6 % \ . % A 2T e S69 .

C' 2K 1

e S69 _ G . r % ,% 23 *2 9 C

% ( 6 % 9

.3 D C7A 5

. J9 = 3/ CU1 3

) F C' P 3

23 63 ( e S69 _ G .

% % C D 12

Z [ MFCC C 9 ' ( 3 W % % C 23

2DQ C U' Z ' 9 % % C 2 63 C . . / H 1

23 Z ' 9 % ). % D e S69 _ G 5 3 1 G % .

2DQ 23 ( 6 CUˆJ3 G % . % 9+ J 6

2DQ G % Z ' 9 ? T4]% 1

h E D Ad % 9

% 5 2A 3 d C * U 9 \ C7A G % * % 23 23 r % ,% _ G . ' e S69 _ G 5 3 .

k UT . % F .)D C PE e S69 _ 3 2 63 F YL 0 .)D C PE e S69 _ G . C7 / 0 %

Z ' 9 K 1 5 2s S69 3 C * % . 2' % X/ 2DQ = C AJ 2 Jd3 ] C . .J 9+

(10)

F * 6 = C 3 I3 (% 3 2 w 3 C ) 4 C 5 AJ e S69 3 ( 2

2 ( 6 H

. % % ) F @4X

3 % % C e S69 _ % *(

Z ' 9 MFCC % I3 G . ' C ' ,% ? C W %

23 CUˆJ3 G % H K . %

. C W % J1 % 9

2DQ % C. G % % 2 63 x43 ' G % ( K 1

2 63

Z 3 c % C' % 9

Z % [ Y 6 Z ' 9 . A W % ' H3 2DQ % C AJ % D e S69 _ G . K 1 W % ( = . % % r % ,% 2 Jd3 ] C

\ €. €. H€ Ud9 % € 2 . / 2DQ ( L I3 C CF 9 *

% e S69 3 C' % C, [% F G % C 5 0 € 3 , C€ V €' , (/ Z ' 9 Vn 2 . / 2DQ €

23 ( 6 23 ( 6 3/ C \ . .

?4€ F O PI9 C'

Z ' 9 2DQ 2 .A3 , 2DQ 2 .A3 ? PE .)D

% C' 2€3 .J 7U T A lT % 6

5 C' oy3 MN ' 1% % ). % ?4 F O PI9 ( % 9

= % ).€€ % €€ 29 €€ C€€ PE e S€€69 G 5 €€ 3 . €€ % C€€U F 9 6 70

/ 59 23 € 9 % % C€ (/ € 4T . % C., A %

6 = % ). % 3 . % % A % e S69 _

6

) 4 ) ; OP Q@ ;A < R E MFCC

6 S 7 ;T1+U ;=

" B

D ? T4]% T rS C' % % C. >D ' G CA9 3 Z % [ MFCC

% € % € 9 % ).€ % € % % E

2 e S69 _ A C 3 Z % [ ]

18 [ . C r 3 / G %

2 op9 Z % €[ % € 9 r % €,% MFCC

e S€69 _ € €

% \ €9 C Z % [ % 9 . % C.X% 12

€ % r % €,%

\ . . % CA d3 e S69 _ cA9 3 3 N % 5 AJ % % C ( K 1 ? 6 3 / G % € C

2 2 ) % F Z 9 9 C H 5

) ( 6 C€' €] ( . % 3/ (

% Z % [ % 9 r % ,% % eS63 15

C 16 _ 3 C U' *

23 ,% e S69 C€ € 3 % € 9 G€ % % r r % ,% G % . '

2 e S69 _ A .

) F 2 5

23 ( 6 ( ( K 1

2 9 A

9 Z % [ % 9 r % ,% % % C e S69 3 C 15

F C Z [

23 C€' € % ( J7 ?+ 1 C D e S69 _ G . . /

% 2€I E / C.AK% c

% % C€ . €J .€J €7U T 16

Z €[

23 ,% .J 7U T C … 3 e S69 _ G . G % '

15 & Jd3 ] C G % . % Z [ ) F 9

6 % % C€ (

C 5 AJ 2

2 23 63 H €K 1 C7 % .

) F Y w 3 % \ . 5

\ . G . ( | 3% * % % (

% % C 15 23 C Z [ . /

Table (5): Recognition results with gradually increased number of MFCC coefficients in noise-free condition

) F 5 :( op9 Z % [ % 9 r % ,%

MFCC

( K 1 e S69 _

C PE e S69 _ G 5 3 .)D

C PE e S69 _ G . .)D

e S69 _ G 5 3 D

e S69 _ G . D

% 9 Z % [ MFCC

63.08 80.88 62.66 86.66 12

68.67 78.67 73.33 86.66 13

70.29 78.67 77.33 86.66 14

73.23 80.88 78.66 86.66 15

68.97 80.14 74.66 86.66 16

Table (6): Recognition results for naive and hybrid feature vectors with increased number of MFCC coefficients in -2dB SNR

) F 6 :( op9 Z % [ % 9 r % ,%

MFCC

C 5 AJ 2A ' 9 k 2A ' 9 K 1 2

2 H

e S69 _ G 5 3 .)D C PE

e S69 _ G . .)D C PE

_ G 5 3 D e S69

_ G . D e S69

2DQ %

53.38 65.44 54.66 80 MFCCZ [12

51.47 66.17 54.66 73.33 MFCC Z [13

34.70 43.38 32 40 MFCCZ [14

53.23 73.52 56 80 MFCC Z [15

37.05 52.20 37.33 60 MFCCZ [16

53.52 55.88 50.66 60 2 . /+ 3 ,+C V ' ,+MFCCZ [15

(11)

Table (7): Performance of speaker identification system for three sets of feature vectors

) F 7 2DQ % % C. C % % C D e S69 .J 7U T :(

29 C PE e S69 _ G 5 3 D e S69 _ G 5 3 SNR 2DQ %

74 82 40

6 =

70.58 76.66 30

67.52 74 20

65 69.23 10

59.70 61.33 -2

68 73.9 40

15 Z [ MFCC

65.23 69.80 30

61.88 66 20

57.7 62.66 10

53.23 56 -2

61.77 61.20 40

12 Z [ MFCC

60.01 59.8 30

58 58.50 20

56.47 57.66 10

53.38 54.66 -2

' 9 CU1 3 *\ . G % C CF 9 2€DQ Z

€6 €

G % 15 Z [ MFCC C., D % E r 3 / 3 K 1

) F C . . % 6

\ €. € C€ . G€ % C€J I3 . € % 3/ (

) F 4 23 ( 6 ( % ). % N C'

15 Z [ MFCC C€

€. € €K 1 ( € | €3% % 2K AE H E \ . 2 9 G

Z ' 9 % % C \ . 12

€ 3 , *C€ V ' , Z [ Z % €[ €

23 C 2 . / 23 CUˆJ3 G % H K . /

€1 % r r % ,% % 9

2€3 lT C' 2DQ % % D € 2 € % 9 C7A€ €

3 , 5 ? AT C . % % Z 3 2 % 5 Z % €[

. / D 2 23 + V ' ,

29 T4]% ( 1 % 9

YZ % [ r % ,% C' MFCC

2DQ C 23 C, [% K 1 G % .

% ). % 15 A C 3 3 % 27 9 *2 9 C Z [

. % 6 = C AJ 2' %

6 ) 5 )

.= D * 7 ;=

" B 6 7

€. 2 / €' C' 2DQ % % C. C 7U T d . % r €3 / €3 C 5 AJ % 0U.S3 I3 % % C * %

\ . C., D % E ) €F

7 . € % € % ( €6 ( ) H7€

6 (

C 5 AJ % % C % D e S69 _ G 5 3 €

2 €9

40 2 23 ( 6 H C€PI % .

C€ … € 3 G €N 12

YZ €[

MFCC cX % * C … 3 G N

15 YZ [ MFCC ( 6 cX

C€' % 2.K 1 2 6 = 12

YZ €[ MFCC €

3 , *C V ' , Z ' 9 2 . /

% C€' €] ( € . € %

% eS63 H7 A€J % 0€U.S3 I3 % % C 6 =

G % € . % . ' ,% . e S69 _ % % C 5 L I3 C AJ ) H7 . % 9

7 % € % C €63 ? 6 3 / \ . (

23 ( 6 .)D C PE e S69 _ G 5 3 3 G€ % C€' €

9 % % 6 = H7 . % 5 = C AJ

) H7 6 2DQ % % C. C % % C D e S69 _ G 5 3 :(

AJ 0U.S3 C 5

Fig. (6): Average speaker recognition rate for three sets of feature vectors in different signal to noise ratios

) H7 7 :( % % C. C % % C 29 C PE e S69 _ G 5 3

AJ 2DQ 0U.S3 C 5

Fig. (7): Average speech segment recognition rate for three sets of feature vectors in different signal to noise ratios

7 ) ; .W

DQ Z ' 9 % CK I3 G % 2

, 2 .A3 2DQ

9 ). % .)D rS M % R% S. % YG.3 % HI.J3 op9

r ' G.3 % HI.J3 D e S69 .J 2 / ' = % ). % F C . G 5 3 H~3 % .3

% * D

2DQ % r ' ( 3

A3 i 9 ' = X % , 2 .

Kmeans % 6 = 2 / ' ? 6 3 / \ . . ). %

-5 0 5 10 15 20 25 30 35 40

50 55 60 65 70 75 80 85

SNR ratio

%

M

e

a

n

S

p

e

a

k

e

r

A

c

c

u

ra

c

y

12 MFCC 15 MFCC Proposed

-5 0 5 10 15 20 25 30 35 40

50 55 60 65 70 75

SNR ratio

%

M

e

a

n

S

e

g

m

e

n

ts

A

c

c

u

ra

c

y

(12)

AJ % 0U.S3 I3 % % C ( K 1 5

23 ( 6 C K 1 3/ C \ . C CF 9 .

C 5 AJ 2

. G . H 2 Z ' 9 % % C \

12

3 , *C V ' , Z [ G % C' 3/ C 2 . / Z % [

2DQ % Z ' 9 G 5 3

(% 3 C % D e S69 _ 67

/ 6 %

(% 3 C % .)D C PE e S69 _ G 5 3 32

/ 6 = C AJ %

2 .A3 % .3 12

Z [ MFCC . % % r % ,% %

C' 2 /

, % ). % F C CK I3 G % .)D ? PE %

% ).

23 r ' ? C .)D 1% % 9 * % Z ' 9 G % .

2DQ / ( D % 1 2DQ % r % ,% 3

% J9 =

2 r % ,% 23 6 1 G % .

3 % C H1 F

C 79 O PI9 % *2. ( D % M 9 3 9% O PI9 ). %

. 2

23 C L I3 Z 3 % 2 2 % / N 1% (% 9

2DQ Vn ' % F ?4 F . ' R% S. % % G.3 % HI.J3

B

:D 1- MFCC

2- WPT 3- DWT 4- Open-Set 5- Closed-Set 6- Phonetic context 7- MFCC

8- Frame Based 9- Approximation 10- Detail 11- Wovels

12- Power spectral density 13- Auto-Regressive 14- Addaptive 15- PRAAT

References

[1] R. ShanthaSelvaKumari, S. SelvaNidhyananthan, G. Anand, "Fused Mel feature sets based text-independent speaker identification using Gaussian mixture model", Procedia Engineering, Vol. 30, pp. 319-326, 2012.

[2] K. Daqrouq, K.Y. Al Azzawi, "Average framing linear prediction coding with wavelet transform for text-independent speaker identification system", Computers & Electrical Engineering, Vol. 38, No. 6, pp. 1467-1479, Nov. 2012.

[3] A. Shafik, S.M. Elhalafawy, S.M. Diab, B.M. Sallam, F.E. Abd El-samie, "A wavelet based approach for speaker identification from degraded speech", International Journal of Communication Networks and Information Security (IJCNIS), Vol. 1, No. 3, Dec. 2009.

[4] M.I. Abdalla, S.A. Hanaa, "Wavelet-based mel-frequency cepstral coefficients for speaker identification using hidden markov models", JOURNAL OF TELECOMMUNICATIONS, Vol. 1, No 2, March 2010.

[5] K. Daqrouq, "Wavelet entropy and neural network for text-independent speaker identification", Engineering Applications of Artificial Intelligence, Vol. 24, No 5, pp. 796–802, Aug. 2011.

[6] Md. Murad Hossain, B. Ahmed, M. Asrafi, "A real time speaker identification using artificial neural network", 10th international conference on computer and information technology, iccit, pp.1-5, 27-29 Dec. 2007.

[7] E. Avci, "A new optimum feature extraction and classification method for speaker recognition: GWPNN ", Expert Systems with Applications, Vol. 32, No. 2, pp. 485–498, Feb. 2007.

[8] H. Harb, C. Liming, "Gender identification using a general audio classifier", Proceeding of the IEEE/ICME, Vol. 2, pp. II-733-736, July 2003.

[9] H. Harb, L. Chen, "Voice-based gender identification in multimedia applications", Journal of Intelligent Information Systems, Vol. 24, No. 2-3, pp. 179-198, March 2005.

[10] J.A. Bachorowski, M.J. Owren, "Acoustic correlates of talker sex and individual talker identity are present in a short vowel segment produced in running speech", Journal of the Acoustical Society of America, Vol. 106, No. 2, pp. 1054–1063, Aug. 1999.

[11] A. Cherif, L. Bouafif, T. Dabbabi, "Pitch detection and formants analysis of arabic speech processing", Applied Acoustcs, Vol. 62, No. 10, pp. 1129–1140, Oct. 2001.

[12] A.M. Noll, "Cepstrum pitch determination", Journal of the Acoustical Society of America, Vol. 41, pp. 293-309, 1967.

[13] W. Yutai, L. Bo, J. Xiaoqing, L. Feng, W. Lihao, "Speaker recognition based on dynamic MFCC parameters", Proceeding of the IEEE/IASP, pp. 406-409, April 2009.

[14] S. Chougule, P.P. Rege, "Language independent speaker identification", Proceeding of the IEEE/ICIT, pp. 364-368, 15-17 Dec. 2006.

[15] S. Haykin, "Neural networks", Macmillan College Publishing Company, Section 5.3: The Steepest Descent Method, 1994.

[16] M. Katz, "Fractals and the analysis of waveforms", Computers in Biology and Medicine, Vol. 18, No. 3, pp. 145-156, 1988.

[17] J.D. Wu, B.F. Lin, "Speaker identification using discrete wavelet packet transform technique with irregular decomposition", Expert Systems with Applications, Vol. 36, No. 2, pp. 3136–3143, March 2009.

Referências

Documentos relacionados

[13] Sun-Jin Oh, &#34;The location management scheme using mobility information of mobile users in wireless mobile networks&#34;, IEEE International Conference on Computer Networks

Speech Segmentation, Spoken Word Boundary Detection, TEO, Speaker- Independent, Isolated Words, Speech Recognition System, Mel-frequency Cepstral Coefficients, Artificial

This study aimed to investigate self-reported hearing-related communication difficulties and compare pure-tone thresholds, speech recognition in quiet and noisy conditions,

Yan, &#34;Spectrum sensing methods and dynamic spectrum sharing in cognitive radio networks: A survey,&#34; International Journal of Research and Reviews in

(2006), &#34;A multiple attribute utility theory approach to lean and green supply chain management&#34;, International Journal of Production Economics, Vol. (1996), The

Keywords: Application of Information Technology to the Foundry Industry: Solidification Process: Numerical Tcchniqucs: Sensitivity Analysis; Borzndary Elcmcnt

“Speaker recognition” (the best known commercialized form of Voice Biometric) is the computing task of validating a user's claimed identity using

Owing to the issues in difference in results in simulation and in real time approach which is a very challenging one, this research journal will highlight a novel approach of