Skip to content
Permalink
8bed31719c
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Go to file
 
 
Cannot retrieve contributors at this time
295 lines (240 sloc) 11.6 KB
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"../../../docbook-xml-4.5/docbookx.dtd">
<chapter id="chapter.machine.translate">
<title>Machine Translation<indexterm class="singular">
<primary>Machine Translation</primary>
</indexterm></title>
<section>
<title>Introduction<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Introduction</secondary>
</indexterm></title>
<para>As opposed to user-generated translation memories (as in the case of
<application>OmegaT</application>) Machine translation (MT) tools use
rule-based linguistic tools to create a translation of the source segment
without the need for a translation memory. Statistical learning
techniques, based on source and target texts, are used to build a
translation model. Machine translation services have been achieving good
and steadfastly improving results in research evaluations.</para>
<para>To activate any of the Machine Translation services, go to
<guimenuitem>Options &gt; Machine Translate ...</guimenuitem> and activate
the service desired. Note that they are all web-based: you will have to be
on-line if you want to use them.</para>
</section>
<section id="introduction">
<title>Google Translate<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Google Translate</secondary>
</indexterm></title>
<para>Google Translate is a payable service offered by Google, for
translating sentences, web sites and complete texts between an
ever-growing number of languages. At the time of writing the list includes
more than 50 languages, from Albanian to Yiddish, including of course all
the major languages. The current version of the service is based on usage,
with the price of 20 USD per million characters at the time of writing.
<emphasis role="bold"/></para>
<para><emphasis role="bold">Important: </emphasis>Google Translate API v2
requires billing information for all accounts before you can start using
the service (see <ulink
url="https://developers.google.com/translate/v2/pricing?hl=en-US">Pricing
and Terms of Service</ulink> for more). To identify yourself as a valid
user for the Google services, you use your private unique key sent to you
by Google, when you have registered for the service. See chapter <link
linkend="chapter.installing.and.running">Installing and Running</link>,
section Launch command arguments, for details on how to add this key to
the OmegaT environment.</para>
<para>The quality of the Google Translate translation depends on one side
on the reservoir of target-language texts and the availability of their
bilingual versions, on the other hand on the quality of the models built.
It is pretty much certain that while the quality may be insufficient in
some cases, it will definitely get better with time and not worse.</para>
</section>
<section>
<title><application>OmegaT</application> users and Google
Translate</title>
<para>The <application>OmegaT</application> user is not forced to use
Google Translate. If used, neither the user's decision to accept the
translation nor the final translation are made available to Google. The
following window shows an example of a) the English source b) Spanish and
c) Slovenian Google Translate translation.</para>
<figure id="moby.dick">
<title>Google Translate - example</title>
<mediaobject>
<imageobject role="html">
<imagedata fileref="images/MobyDick.png"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="images/MobyDick.png" width="60%"/>
</imageobject>
</mediaobject>
</figure>
<para>The Spanish translation is better than the Slovenian. Note
<emphasis>interesar</emphasis> and<emphasis> navegar</emphasis> in
Spanish, are correctly translated as the verbs interest and sail
respectively. In the Slovenian version both words have been translated as
nouns. It is actually quite probable that the Spanish translation is based
at least partially on the actual translation of the book.</para>
<para>Once you have activated the service, a suggestion for the
translation will appear in the Machine Translate pane every time a new
source segment is opened. If you find the suggestion acceptable, press
<keycombo>
<keycap><indexterm class="singular">
<primary>Shortcuts</primary>
<secondary>Machine Translate - Ctrl+M</secondary>
</indexterm>Ctrl</keycap>
<keycap>M</keycap>
</keycombo> to replace the target part of the opened segment with the
suggestion. In the above segment, for instance, <keycombo>
<keycap>Ctrl</keycap>
<keycap>M</keycap>
</keycombo> would replace the Spanish version with the Slovenian
suggestion.</para>
<para>If you do not wish <application>OmegaT</application> to send your
source segments to Google to get translated, untick the Google Translate
menu entry in Options.</para>
<para>Note that nothing but your source segment is sent to the MT service.
The online version of Google Translate allows the user to correct the
suggestion and send the corrected segment in. This feature, however, is
not implemented in OmegaT.</para>
</section>
<section>
<title>Belazar<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Belazar</secondary>
</indexterm></title>
<para><ulink url="http://belazar.info/">Belazar</ulink> is a Machine
language translation tool for the Russian-Belarusian language pair.</para>
</section>
<section>
<title>Apertium<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Apertium</secondary>
</indexterm></title>
<para><ulink url="http://www.apertium.org/">Apertium</ulink> is a
free/open-source machine translation platform, initially aimed at
related-language pairs, like CA, ES, GA, PT, OC and FR but recently
expanded to deal with more divergent language pairs (such as
English-Catalan). Check the web site for the latest list of implemented
language pairs.</para>
<para>The platform provides</para>
<itemizedlist>
<listitem>
<para>a language-independent machine translation engine</para>
</listitem>
<listitem>
<para>tools to manage the linguistic data necessary to build a machine
translation system for a given language pair </para>
</listitem>
<listitem>
<para>linguistic data for a growing number of language pairs</para>
</listitem>
</itemizedlist>
<para>Apertium uses a shallow-transfer machine translation engine which
processes the input text in stages, as in an assembly line: de-formatting,
morphological analysis, part-of-speech disambiguation, shallow structural
transfer, lexical transfer, morphological generation, and
re-formatting.</para>
<para>It is possible to use Apertium to build machine translation systems
for a variety of language pairs; to that end, Apertium uses simple
XML-based standard formats to encode the linguistic data needed (either by
hand or by converting existing data), which are compiled using the
provided tools into the high-speed formats used by the engine.</para>
</section>
<section>
<title>Microsoft Translator<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Microsoft Translator</secondary>
</indexterm></title>
<para>In order to get credentials for MS Translator, follow these
steps:</para>
<orderedlist>
<listitem>
<para>Log into Microsoft Azure Marketplace: <ulink
url="http://datamarket.azure.com/">http://datamarket.azure.com/</ulink></para>
<para>If you do not already have an Azure Marketplace account you will
need to register one first.</para>
</listitem>
<listitem>
<para>Click on the My Account option at the top of the page.</para>
</listitem>
<listitem>
<para>Near the bottom you will see entries and values for:
<itemizedlist>
<listitem>
<para>Primary Account Key (corresponds to
<code>microsoft.api.client_secret</code> command-line
parameter)</para>
</listitem>
<listitem>
<para>Customer ID (corresponds to
<code>microsoft.api.client_id</code> command-line
parameter)</para>
</listitem>
</itemizedlist></para>
</listitem>
</orderedlist>
<para>To enable MS Translator in OmegaT edit its launcher or see chapter
<link linkend="chapter.installing.and.running">Installing and
Running</link> to learn how to start OmegaT from the command line.</para>
</section>
<section>
<title>Yandex Translate<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Yandex Translate</secondary>
</indexterm></title>
<para>In order to be able to use Yandex Translate in OmegaT, you need to
<ulink url="http://api.yandex.com/key/form.xml?service=trnsl">obtain an
API key from Yandex</ulink>.</para>
<para>The obtained API key needs to be passed to OmegaT at startup through
<code>yandex.api.key</code> command-line parameter. To do that edit OmegaT
launcher or see chapter <link
linkend="chapter.installing.and.running">Installing and Running</link> to
learn how to start OmegaT from the command line.</para>
</section>
<section id="Google.Translate.troubleshooting">
<title>Machine translation - trouble shooting<indexterm class="singular">
<primary>Machine Translation</primary>
<secondary>Troubleshooting</secondary>
</indexterm></title>
<para>If there's nothing appearing in the Machine Translate pane, then
check the following:</para>
<itemizedlist>
<listitem>
<para>Are you online? You need to be online to be able to use an MT
tool.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>What is the language pair you need? Check if the selected
service offers it.</para>
</listitem>
<listitem>
<para>Google Translate does not work: have you applied <ulink
url="https://developers.google.com/translate/v2/faq">Translate API
service</ulink>? Note that Google Translate service is not free of
charge, see chapter <link
linkend="chapter.installing.and.running">Installing and Running</link>
(runtime parameters) for more on that.</para>
</listitem>
<listitem>
<para>"Google Translate returned HTTP response code: 403 ...": check
that the 38-characters key, entered in the pinfo.list file, is
correct. Check that <ulink
url="https://developers.google.com/translate/v2/faq">Translate API
service </ulink>has been activated.</para>
</listitem>
<listitem>
<para>Google Translate does not work: - with the Google API key
entered as requested. Check in Options &gt; Machine Translate, that
Google Translate V2 is checked.</para>
</listitem>
<listitem>
<para>Google Translate V2 reports "Bad request" - check the source and
target languages for your project. Having no languages defined elicits
this kind or a response.</para>
</listitem>
</itemizedlist>
</section>
</chapter>