Translation API source with annotation


#1

I tried the translation API synchronously with plain text files, setting withSource and withAnnotation to true. I also supplied a custom dictionary containing place names, etc. However, the returned translation has no annotations or metadata. How do I get the annotations, so that I can tell which words come from my user-defined dictionary?

I am translating from ja to en.


#2

Hi poonwu,

Annotations are provided as HTML attributes, so you should send your plain text as HTML (and specify format=html):

POST https://api-platform.systran.net/translate?key=xxxxxxx&source=en&target=fr&format=html&rawBody=true&withSource=true&withAnnotations=true HTTP/1.1
Referer: http://www.systransoft.com/
X-User-Agent: SIT/8.1.2.14
Content-Type: text/plain; charset=utf-8
Host: api-platform.systran.net
Content-Length: 42

<html><body>The sky is blue.</body></html>
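As a rough sketch, the same request could be built in Python with the standard library. The function name `build_translate_request` is only illustrative, the key is a placeholder, and the query parameters are copied from the request line above. The optional Referer and X-User-Agent headers are left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the request line above.
API_URL = "https://api-platform.systran.net/translate"


def build_translate_request(html_body: str, api_key: str, source: str, target: str) -> Request:
    """Build (but do not send) a POST request carrying an HTML body."""
    query = urlencode({
        "key": api_key,
        "source": source,
        "target": target,
        "format": "html",           # annotations are returned as HTML attributes
        "rawBody": "true",
        "withSource": "true",
        "withAnnotations": "true",
    })
    return Request(
        f"{API_URL}?{query}",
        data=html_body.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


request = build_translate_request(
    "<html><body>The sky is blue.</body></html>", "xxxxxxx", "en", "fr"
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Content-Length is computed automatically from the encoded body.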

To convert plain text to HTML, you can simply enclose your text in <html><body>...</body></html>. You will receive a multipart body:

  1. one part with part-name: source, containing the annotated source HTML;
  2. one part with part-name: output, containing the annotated target HTML.
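Wrapping the plain text could look roughly like this in Python. The helper name `wrap_plain_text_as_html` is hypothetical. Escaping `&`, `<` and `>` is an extra precaution not mentioned above, added on the assumption that the text may contain characters that would otherwise be read as markup.

```python
import html


def wrap_plain_text_as_html(text: str) -> str:
    """Escape plain text and enclose it in a minimal HTML document."""
    # Escape &, < and > so the text is not interpreted as tags or entities.
    escaped = html.escape(text, quote=False)
    return f"<html><body>{escaped}</body></html>"


body = wrap_plain_text_as_html("The sky is blue.")
print(body)
print(len(body.encode("utf-8")))  # byte length, as used for Content-Length
```

Note that line breaks in the plain text are passed through unchanged. Whether they need to be converted to `<br>` or `<p>` elements depends on how you want the text segmented, and this sketch does not cover it.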

For the sample request above, you should receive:

HTTP/1.1 200 OK
Vary: X-HTTP-Method-Override
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: X-Requested-With,Content-Type,X-HTTP-METHOD-OVERRIDE,X-User-Agent
Content-Type: multipart/mixed; boundary="4011defdcf1a54cf0e19ca6fc1cc4b6123717655"
Content-Length: 1747
Date: Thu, 15 Sep 2016 08:39:37 GMT
Connection: keep-alive

--4011defdcf1a54cf0e19ca6fc1cc4b6123717655
part-name: source

<html>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<body><span class="systran_seg" id="Sp1.s1.1_o"><span class="systran_lemma" value="the" id="p1.t1.1_1"><span class="systran_token_word" value="3c3d/det" id="p1.t1.1_1">The</span></span> <span class="systran_lemma" value="sky" id="p1.t1.1_2"><span class="systran_token_word" value="1010/noun:common" id="p1.t1.1_2">sky</span></span> <span class="systran_lemma" value="be" id="p1.t1.1_3"><span class="systran_token_word" value="4004/aux:plain" id="p1.t1.1_3">is</span></span> <span class="systran_amb_adjective/noun" value="&lt;reference&gt;blue&lt;/reference&gt;&lt;choice value='adjective' default='yes'/&gt;&lt;choice value='noun'/&gt;" id="p1.t1.1_1"><span class="systran_lemma" value="blue" id="p1.t1.1_4"><span class="systran_token_word" value="2020/adj:base" id="p1.t1.1_4">blue</span></span></span><span class="systran_lemma" value="." id="p1.t1.1_5"><span class="systran_token_punctuation" value="cccc/punct" id="p1.t1.1_5">.</span></span></span></body></html>
--4011defdcf1a54cf0e19ca6fc1cc4b6123717655
part-name: output

<html>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<body><span class="systran_seg" id="Sp1.s1.1_o"><span class="systran_token_word" value="*" id="p1.t1.1_0">Le</span> <span class="systran_token_word" value="1010/noun:common" id="p1.t1.1_2">ciel</span> <span class="systran_token_word" value="4004/aux:plain" id="p1.t1.1_3">est</span> <span class="systran_token_word" value="2020/adj:base" id="p1.t1.1_4">bleu</span><span class="systran_token_punctuation" id="p1.t1.1_5">.</span></span></body></html>
--4011defdcf1a54cf0e19ca6fc1cc4b6123717655--
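A minimal sketch of reading such a response in Python: split the body on the boundary, pick out each part by its part-name header, then list the text and value attribute of every span whose class starts with systran_token. The names `split_parts` and `TokenCollector` are illustrative, and the sample body is a shortened version of the response above. In real use the boundary would come from the Content-Type response header. This only lists the annotations seen in the sample; how entries from a user dictionary are marked is not shown here.

```python
import re
from html.parser import HTMLParser


def split_parts(body: str, boundary: str) -> dict[str, str]:
    """Map each part-name header value to that part's content."""
    parts = {}
    for chunk in body.split("--" + boundary):
        chunk = chunk.strip()
        if not chunk or chunk == "--":  # preamble, or the closing delimiter
            continue
        # Headers and content are separated by the first blank line.
        head, _, content = chunk.replace("\r\n", "\n").partition("\n\n")
        match = re.search(r"^part-name:\s*(\S+)", head, re.IGNORECASE | re.MULTILINE)
        if match:
            parts[match.group(1)] = content
    return parts


class TokenCollector(HTMLParser):
    """Collect (text, value) pairs from systran_token_* spans."""

    def __init__(self):
        super().__init__()
        self.tokens = []
        self._value = None
        self._in_token = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and (attrs.get("class") or "").startswith("systran_token"):
            self._in_token = True
            self._value = attrs.get("value")

    def handle_data(self, data):
        # Token spans hold only text, so the first data chunk is the token.
        if self._in_token:
            self.tokens.append((data, self._value))
            self._in_token = False


BOUNDARY = "4011defdcf1a54cf0e19ca6fc1cc4b6123717655"
SAMPLE = (
    f"--{BOUNDARY}\n"
    "part-name: source\n"
    "\n"
    '<html><body><span class="systran_token_word" value="1010/noun:common" '
    'id="p1.t1.1_2">sky</span></body></html>\n'
    f"--{BOUNDARY}\n"
    "part-name: output\n"
    "\n"
    '<html><body><span class="systran_token_word" value="1010/noun:common" '
    'id="p1.t1.1_2">ciel</span></body></html>\n'
    f"--{BOUNDARY}--\n"
)

parts = split_parts(SAMPLE, BOUNDARY)
collector = TokenCollector()
collector.feed(parts["output"])
print(collector.tokens)
```

The id attributes (for example p1.t1.1_2) appear in both parts of the sample, so they look usable for matching source tokens to target tokens, although that is only inferred from this one response.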

This works for me. I hope it helps.
Best regards.

Olivier