Skip to content
Permalink
8bed31719c
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Go to file
 
 
Cannot retrieve contributors at this time
314 lines (254 sloc) 12.1 KB
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"../../../docbook-xml-4.5/docbookx.dtd">
<chapter id="chapter.glossaries">
<title>Glossaries<indexterm class="singular">
<primary>Windows and panes in OmegaT</primary>
<secondary>Glossary pane</secondary>
</indexterm><indexterm class="singular">
<primary>Glossaries</primary>
</indexterm></title>
<para>Glossaries are files created and updated manually for use in
<application>OmegaT</application>.</para>
<para>If an <application>OmegaT</application> project contains one or more
glossaries, any terms in the glossary which are also found in the current
segment will be automatically displayed in the Glossary viewer.</para>
<para>You define its location and name in the project properties dialog. The
extension must be <filename>.txt</filename> or <filename>.utf8</filename>
(if not, it will be added). The location of the file must be within the
<filename>/glossary</filename> folder, but it can be in a deeper folder
(e.g., <filename>glossary/sub/glossary.txt</filename>). The file does not
need to exist when setting it, it will be created (if necessary) when adding
a glossary entry. If the file already exists, no attempt is done to verify
the format or the character set of the file: the new entries will always be
in tab-separated format and UTF-8. As the existing content will not be
touched, damage to an existing file would be limited.</para>
<section>
<title>Usage</title>
<para>To use an existing glossary, simply place it in the<indexterm
class="singular">
<primary>Project files</primary>
<secondary>Glossary subfolder</secondary>
</indexterm> <filename>/glossary</filename> folder after creating the
project. <application>OmegaT</application> automatically detects glossary
files in this folder when a project is opened. Terms in the current
segment which <application>OmegaT</application> finds in the glossary
file(s) are displayed in the Glossary pane:</para>
<indexterm class="singular">
<primary>Glossaries, Glossary pane</primary>
</indexterm>
<figure>
<title>Glossary pane</title>
<mediaobject>
<imageobject role="html">
<imagedata fileref="images/GlossaryPane_25.png"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="images/GlossaryPane_25.png" width="60%"/>
</imageobject>
</mediaobject>
</figure>
<para>The word before the = sign is the source term, and its translation
is (or are) the words after =. The vocabulary entry can have a comment
added. The glossary function only finds exact matches with the glossary
entry (e.g. does not find inflected forms etc.). New terms can be added
manually to the glossary file(s) during translation, for example in a text
editor. Newly added terms will not be recognized once the changes in the
text file have been saved.</para>
<para>The source term does not have to be a single-word item, as the next
example shows:</para>
<figure>
<title>multiple words entries in glossaries - example<indexterm
class="singular">
<primary>Glossaries</primary>
<secondary>Glossary pane</secondary>
<tertiary>multiple-words entries</tertiary>
</indexterm></title>
<mediaobject>
<imageobject role="html">
<imagedata fileref="images/MultiTerm_Glossary_25.png"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="images/MultiTerm_Glossary_25.png" width="80%"/>
</imageobject>
</mediaobject>
</figure>
<para>The underlined item "pop-up menu" can be found in the glossary pane
as "pojavni menu". Highlighting it in the Glossary pane and then
rightclicking insets at the cursor position in the target
segment.<footnote>
<para>Note that in the above case, this is just half (or even less) of
the story, as the target language (Slovenian) uses declension. So the
inserted "pojavni meni" in the nominative form - has to be changed to
"pojavnem meniju" , i.e. to the locative. So it is probably faster to
type the term correctly right away without bothering with the glossary
and its shortcuts.</para>
</footnote></para>
</section>
<section>
<title><indexterm class="singular">
<primary>Glossaries</primary>
<secondary>File format</secondary>
</indexterm>File format<indexterm class="singular">
<primary>Project files</primary>
<secondary>User files</secondary>
<seealso>Glossaries</seealso>
</indexterm></title>
<para>Glossary files are simple plain text files containing three-column,
tab-delimited lists with the source and target terms in the first and
second columns respectively. The third column can be used for additional
information. You can have entries with the target column missing, i.e.
just containing the source term and the comment.</para>
<para>Glossary files can be either in system default encoding (and
indicated by the extension .tab) or in UTF-8 (the extension .utf8 or .txt)
or in UTF-16 LE (the extension .txt). Also supported is the CSV format.
This format is the same as the tab separated one: source term, target
term. Comment fields are separated by a comma ','. Strings can be enclosed
by quotes ", which allows having a comma inside a string:</para>
<para><literal>"This is a source term, which contains a comma","c'est un
terme, qui contient une virgule"</literal></para>
<para>In addition to the plain text format, TBX format <indexterm
class="singular">
<primary>Glossaries</primary>
<secondary>TBX format</secondary>
</indexterm> is also supported as a read-only glossary format. The
location of the <filename>.tbx</filename> file must be within the
<filename>/glossary</filename> folder, but it can be in a deeper folder
(e.g., <filename>glossary/sub/MyGlossary.tbx</filename>).</para>
<para>TBX - Term Base eXchange - is the open, XML-based standard
for exchanging structured terminological data, TBX has been approved as an
international standard by LISA and ISO. If you have an existing
terminology handling system it is quite possible it offers the export of
terminology data via TBX format. <ulink
url="http://www.microsoft.com/Language/en-US/Terminology.aspx">Microsoft
Terminology Collection</ulink> <indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Microsoft Terminology collection</secondary>
</indexterm> can be downloaded in nearly 100 languages and can serve as
a cornerstone IT glossary. </para>
<para>Note: the <filename>.tbx</filename> output of MultiTerm seems to not
be reliable (November 2013), it is better to use the
<filename>.tab</filename> output of MultiTerm instead. </para>
</section>
<section>
<title>How to create glossaries<indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Creating a glossary</secondary>
</indexterm></title>
<para>The project setting allows one can enter a name for a writable
glossary file (see beginning of this chapter). Right-click in the glossary
pane or press <keycap>Ctrl+Shift+G</keycap> to add a new entry. A dialog
opens, allowing you to enter the source term, target term and any comments
you may have:</para>
<mediaobject role="html">
<imageobject>
<imagedata fileref="images/GlossaryEntry_25.png"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="images/GlossaryEntry_25.png" width="80%"/>
</imageobject>
</mediaobject>
<para>The contents of glossary files are kept in memory and are loaded
when the project is opened or reloaded. Updating a glossary file is thus
rather simple: press <keycap>Ctrl+Shift+G</keycap> and enter the new term,
its translation and any comments you may have (ensuring you press tab
between the fields) and save the file. The contents of the glossary pane
will be updated accordingly.</para>
<para><indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Location of the writable glossary file</secondary>
</indexterm>The location of the writable glossary file can be set in the
<guimenuitem>Project &gt; Properties ... </guimenuitem>dialog. The
recognized extensions are <filename>TXT</filename> and
<filename>UTF8</filename></para>
<para><emphasis role="bold">Note: </emphasis>Of course there are other
ways and means to create a simple file with tab delimited entries. Nothing
speaks against using Notepad++ on Windows, GEdit on Linux for instance or
some spreadsheet program for this purpose: any application, that can
handle UTF-8 (or UTF-16 LE) and that can show white space (so that you do
not miss the required <keycap>TAB</keycap> characters) can be used.</para>
</section>
<section>
<title><indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Priorities</secondary>
</indexterm>Priority glossary</title>
<para>The results from the priority glossary (by default,
glossary/glossary.txt) appear in first places in the Glossary pane and in
TransTips.</para>
<para>As entries can mix words from priority and non priority glossaries,
the words from the priority glossary are displayed in bold.</para>
</section>
<section>
<title><indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Trados MultiTerm</secondary>
</indexterm>Using Trados MultiTerm</title>
<para>Data exported from Trados MultiTerm can be used as
<application>OmegaT</application> glossaries without further modification,
provided they are given the file extension <filename>.tab</filename> and
the source and target term fields are the first two fields respectively.
If you export using the system option "Tab-delimited export", you will
need to delete the first 5 columns (Seq. Nr, Date created etc).</para>
</section>
<section>
<title><indexterm class="singular">
<primary>Glossaries</primary>
<secondary>Problems with glossaries</secondary>
</indexterm>Common glossary problems</title>
<para><emphasis role="bold">Problem: No glossary terms are displayed -
possible causes:</emphasis></para>
<itemizedlist>
<listitem>
<para>No glossary file found in the "glossary" folder.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>The glossary file is empty.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>The items are not separated with a TAB character.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>The glossary file does not have the correct extension (.tab,
.utf8 or .txt).</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>There is no EXACT match between the glossary entry and the
source text in your document - for instance plurals.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>The glossary file does not have the correct encoding.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>There are no terms in the current segment which match any terms
in the glossary.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>One or more of the above problems may have been fixed, but the
project has not been reloaded.</para>
</listitem>
</itemizedlist>
<para><emphasis role="bold">Problem: In the glossary pane, some characters
are not displayed properly</emphasis></para>
<itemizedlist>
<listitem>
<para>...but the same characters are displayed properly in the Editing
pane: the extension and the file encoding do not match.</para>
</listitem>
</itemizedlist>
</section>
</chapter>