
ParaMT: a Paraphraser for Machine Translation


Academic year: 2021


(7)





http://www.nooj4nlp.net/



http://www.linguateca.pt/COMPARA

http://www.linguateca.pt/metra

(8)

mesa,N+FLX=CASA+CO+surf+EN=table

cair,V+FLX=ATRAIR+INMO+IntoType+EN=fall

holandês,A+FLX=INGLÊS+AN+lang+EN=Dutch

actualmente,ADV+FLX=FACILMENTE+TEMP+punc+pres+EN=nowadays

alguém,PRO+IMPERS+INDEF+EN=somebody

porque,RELINT+why+EN=why

e,CONJ+JOIN+EN=and

durante,PREP+TEMP+EN=during

cada,DET+IMPERS+INDEF+SG+EN=each

terceiro,NUM+ord+EN=one third

a curto prazo,ADV+TEMP+EN=in the short run

a favor de,PREP+CAUS+EN=in favor of

cada um,PRO+INDEF+SG+EN=each one

de quem,INT+ThatType+EN=whose

quem quer que seja,REL+WhateverType+EN=whoever

além disso,CONJ+COOR+EN=besides

um quarto,NUM+frac+EN=one fourth
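The sample entries above share a visible layout: a lemma, a comma, a part-of-speech category, then `+`-separated tags, some of which carry a value after `=` (for example `FLX=` for the inflection paradigm and `EN=` for the English equivalent). A minimal Python sketch of reading that layout follows. The layout is inferred from the samples only, and the names `Entry` and `parse_entry` are hypothetical; this is not the dictionary compiler used by the authors.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    lemma: str
    category: str
    features: dict = field(default_factory=dict)  # tags with a value, e.g. EN=table
    flags: list = field(default_factory=list)     # tags without a value, e.g. TEMP


def parse_entry(line: str) -> Entry:
    """Split one 'lemma,CAT+tag+KEY=value' line into its parts (assumed layout)."""
    lemma, sep, info = line.strip().partition(",")
    if not sep or not info:
        raise ValueError(f"no 'lemma,CATEGORY' separator in: {line!r}")
    category, *tags = info.split("+")
    entry = Entry(lemma=lemma, category=category)
    for tag in tags:
        key, eq, value = tag.partition("=")
        if eq:
            entry.features[key] = value
        else:
            entry.flags.append(tag)
    return entry


if __name__ == "__main__":
    for sample in ("mesa,N+FLX=CASA+CO+surf+EN=table",
                   "a curto prazo,ADV+TEMP+EN=in the short run"):
        print(parse_entry(sample))
```

Splitting on the first comma only keeps multiword lemmas such as "a curto prazo" intact. The sketch assumes that values never contain a `+`, which holds for the samples shown here.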

General dictionary sample representing all PoS, variable and invariable forms

Sample of the dictionary of Terms and Multiword Expressions (DicTUM)

Sample of invariable compounds in the general dictionary

Sample of the dictionary of Biomedical Terms

Sample of the dictionary of Proper Names

(9)



Representation abstract language

Hierarchical taxonomy (sets, supersets and (sometimes) subsets)

Based on Logos SAL ontology

Integrated in the dictionary

It represents both meaning (semantics), and structure (syntax)

Over 1,000 categories

(10)

Noun Supersets

concrete

mass

animate

place

information

abstract

process (intr)

process (tr)

measure

time

aspective

Sets and Subsets of the CONCRETE Noun Superset


functionals
  receptacles
  bearing surfaces
  links/bridges
  thresholds, focal points, barriers
  conduits
  fasteners
  devices, tools
  cloth thing
  structural elements
  concretizations of verbals
  concretizations of mass nouns
  undifferentiated functionals
  product/brand names

agentives
  software
  vehicles
  meters
  machines/systems
  communication agents
  concrete chemical agents
  undifferentiated agentives

natural things
  minute flora
  plants
  trees
  trees/wood
  miscellaneous natural things

other concrete sets*
  impulses/lights
  blemishes/marks
  edibles (non-mass)
  edibles/color
  classifiers
  amorphous
  atomistic
  undifferentiated concrete things

*With one exception, these sets have no subsets
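The superset, set and subset levels can be pictured as a small nested mapping. The Python sketch below is only an illustration: it assumes the grouping reads as superset, then set, then subsets, it copies a handful of labels from the CONCRETE listing, and the names `TAXONOMY` and `path_to` are hypothetical rather than part of the SAL ontology or of ParaMT.

```python
# A few labels copied from the CONCRETE listing; the nesting
# (superset -> set -> subsets) is an assumed reading of that listing.
TAXONOMY = {
    "concrete": {
        "functionals": ["receptacles", "bearing surfaces", "fasteners"],
        "agentives": ["software", "vehicles", "machines/systems"],
        "natural things": ["plants", "trees"],
    },
}


def path_to(label, taxonomy=TAXONOMY):
    """Return the chain of categories leading to `label`, or None if absent."""
    for superset, sets in taxonomy.items():
        if label == superset:
            return [superset]
        for set_name, subsets in sets.items():
            if label == set_name:
                return [superset, set_name]
            if label in subsets:
                return [superset, set_name, label]
    return None


if __name__ == "__main__":
    print(path_to("vehicles"))
    print(path_to("functionals"))
```

Returning the whole path rather than a single label lets a rule match at any level, for example on any agentive or on any concrete noun.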

                          


(14)

Translation and bilingual paraphrasing of simple sentences

Graph to translate simple sentences

(15)

[Graph labels: NDRV, ADRV, VSUP, VCOP]

(16)

[Graph labels: AVDRV, Npred]

(17)

Recognition and monolingual paraphrasing of support verb constructions

(support verb construction / morphologically related lexical verb)

(18)

Recognition and paraphrasing of elementary support verb constructions co-occurring with predicate nouns of the biomedical field

(support verb construction / lexical verb or stylistic variant / non-elementary support verb construction)

realizar efectuar

(19)

Interactive ReWriter for word processing applications such as text editing

(20)

Recognition and bilingual paraphrasing of support verb constructions

(Portuguese support verb construction / corresponding English verb)

                          

(21)

Support verb | SVC Recognition Precision | SVC Recognition Recall | SVC Paraphrasing Precision
Pôr          | 73/73 (100%)              | 73/100 (73%)           | 72/73 (98.6%)
Tomar        | 75/75 (100%)              | 75/100 (75%)           | 68/73 (93.1%)
Ter          | 65/65 (100%)              | 65/100 (65%)           | 59/65 (90.7%)
Dar          | 57/60 (95%)               | 57/100 (57%)           | 46/51 (90.1%)
Fazer        | 43/45 (95.5%)             | 43/100 (43%)           | 40/45 (88.8%)
Average      | 62.6/63.6 (98.4%)         | 62.6/100 (62.6%)       | 57/61 (93.4%)

Evaluation of recognition and paraphrasing of support verb constructions
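The Average row appears to divide mean counts (for instance 62.6/63.6) rather than average the per-verb percentages. The Python sketch below pools the raw counts to check this reading. The counts are copied from the evaluation figures; the tuple layout and the name `pooled` are assumptions made for illustration.

```python
# Counts copied from the evaluation figures. Tuple layout (an assumption of
# this sketch): (correctly recognized, recognized, expected,
#                correct paraphrases, paraphrases produced).
COUNTS = {
    "Pôr":   (73, 73, 100, 72, 73),
    "Tomar": (75, 75, 100, 68, 73),
    "Ter":   (65, 65, 100, 59, 65),
    "Dar":   (57, 60, 100, 46, 51),
    "Fazer": (43, 45, 100, 40, 45),
}


def pooled(counts):
    """Return pooled recognition precision, recall and paraphrasing precision."""
    correct, recognized, expected, para_ok, para_total = (
        sum(column) for column in zip(*counts.values())
    )
    return correct / recognized, correct / expected, para_ok / para_total


if __name__ == "__main__":
    precision, recall, para_precision = pooled(COUNTS)
    print(f"recognition precision:  {precision:.1%}")
    print(f"recognition recall:     {recall:.1%}")
    print(f"paraphrasing precision: {para_precision:.1%}")
```

Pooling gives 313/318 for recognition precision and 313/500 for recall, matching the reported 98.4% and 62.6%. For paraphrasing, pooling gives 285/307, about 92.8%; the reported 57/61 (93.4%) is consistent with the mean denominator 61.4 having been rounded to 61.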


(23)

Fundação para a Ciência e a Tecnologia

Fundação para a Computação Científica Nacional

                          
