Translate extension

From Wikimedia Belgium

The Translate extension allows for easy translation of pages on multilingual Wikimedia platforms. It is installed on all major Wikimedia platforms except Wikipedia, which has its own translation tool.

Translated pages start with <languages/>. The core of the system is the pair of <translate> and </translate> tags, which mark the start and the end of translatable text. An empty line after a <translate> tag starts a new translation unit. Paragraphs are the main translation units, alongside headings and lists.
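
As a rough illustration, a minimal source page might look like the sketch below. The heading and paragraph texts are placeholders invented for this example, not taken from an actual page.

```wikitext
<languages/>
<translate>
== Welcome ==

This is the first paragraph.

This is the second paragraph.
</translate>
```

With this layout, the heading and each paragraph would be expected to become separate translation units once the page is marked for translation.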

Note: by community convention, each Wikipedia platform serves a single language. Wikipedia has its own dedicated translation system that works across Wikipedia languages; it automatically translates internal links, references, and templates, and updates the Wikidata sitelink.

Principles

The extension allows Wikimedia platforms to become truly multilingual, in a way that is transparent to the end user and easy for translators to manage.

It is crucial that the master (source) page is well structured with the required translation tags, so that the translation process is straightforward for the translators. This is the responsibility of the authors, and the translation administrator.

Sections should be kept short and should not contain specific wiki syntax. The translator's work should be kept simple, and wiki code or HTML should be hidden from the translator. Once the content of the source page is complete and stable, the translation administrator marks it for translation.

Notes:

  1. This tool is not to be confused with the dedicated Wikipedia article translation tool that is used to translate Wikipedia pages.
  2. The user can choose their GUI language, independent from the content language.
  3. Make sure to set the source language to the correct value; the platform default is often wrong; there is no automatic language detection.

Advantages

  • Easy to use by translators, a bit more difficult for the base page editors, and for the translation administrators.
  • Page sections stay aligned across languages. When the source page changes, translations can easily be updated after the page is marked for translation again.
  • Well integrated in the MediaWiki environment; e.g. with the Special:MyLanguage/ link prefix, a language code suffix is generated automatically for the reader's language.
  • For the reader it is completely transparent.
  • The base language can be any language, but English is the standard (most translators understand English). You should choose the base language that is known to most translators.
  • Translation to any language can be incremental; untranslated sections show the original text (typically English, but it can be any supported language).
  • Work can be distributed amongst volunteers. Nobody knows all languages. A volunteer can be appointed for each language.
  • The history of the translated texts is maintained separately, in the Translations namespace.
  • Translated sections can be reused on other pages (so it is advantageous to keep translation sections short and unique).
  • Translated sections are saved automatically. Caveat: this produces a multitude of small page updates, which arguably should have been aggregated.

Generic requirements

  1. The extension needs to be installed by the site administrator.
  2. Each page to be translated needs to be marked for translation by the translation administrator. This requires a special group right. The page needs to be marked again after one or more amendments.
  3. You need volunteers capable of translating from the source language to the target language.
  4. Preferably choose the language best known in your community as the source language; this gives you more translators and more translations.
  5. Sometimes you have no choice but to take the version that already exists. The source language may simply be that of the first version of the page, whatever language that is. Wikimedia platforms have a default language, and the source page language does not always match it. Make sure to verify or change the language code before starting the translation.

General procedure

  1. First write your text without tagging translations. Add a single empty line before a header, a paragraph, a list, or any other sections in your document. This makes it easier for the parser to automatically create translation units.
  2. Make sure to set the page language to the correct value. Otherwise translation will be impossible.
  3. Start your document with <languages/> on the first line to automatically build a language list, and a language menu option.
  4. The header and the footer of the article must not be translated: place <translate> after the header and </translate> before the footer.
    • Special templates at the top of the page are considered part of the header, so they are not translated.
  5. Add more <translate> and </translate> tags:
    • Add a </translate> tag before and a <translate> tag after any text that should not be translated.
    • Lists could have one <translate> </translate> pair for each line after the * or the #
  6. Make sure to use internal links when possible (unique link prefixes are available).
  7. General cleanup of the Wiki text (make sure you have blank lines before headers, paragraphs, lists, or other sections).
  8. The translation administrator marks the page for translation:
    • Don't do this step until the basic document is completely ready, and stable.
    • Make sure that all necessary translation markup tags are included at their proper place.
  9. Translate the page into any wanted language; this task can be distributed to multiple translators.
    • Anyone can start to translate, after the base page is marked for translation.
  10. Maintain the updates. Verify that each new translation unit starts with an empty line.
  11. When the source text changes, the translation administrator has to mark the page again for translation.
  12. Then editors can filter for changed or new sections.
  13. Readers are warned about obsolete sections. Anyone can solve the problem by adding the missing translation.
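
Steps 3 to 5 could be sketched as follows. The template names {{Event header}} and {{Event footer}} are hypothetical; they stand in for whatever header and footer a page actually uses, and the body text is invented for the example.

```wikitext
<languages/>
{{Event header}}
<translate>
== Programme ==

The meeting is open to all members.
</translate>
{{Event footer}}
```

Only the text between the tags is offered to translators; the header and footer templates stay outside the translated area.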

Languages

  • Use {{PAGELANGUAGE}} to insert the ISO language code, e.g. in URLs or language specific templates.
  • Use the Special:MyLanguage/ prefix to automatically refer to other translated pages.
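
A minimal sketch of both techniques is shown below. The page name Help:Contents is a hypothetical target, and the second line assumes that the page language code matches a Wikipedia subdomain, which holds for most but not all languages. Link labels are left untagged here to keep the focus on the prefix and the magic word.

```wikitext
[[Special:MyLanguage/Help:Contents|Help]]

[https://{{PAGELANGUAGE}}.wikipedia.org/ Wikipedia]
```

The first link should send readers to the translation of the target page in their own language when one exists; the second builds a language-specific URL from the page language code.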

Categories

Categories are considered part of the footer, so they are not translated. They require suffixes to provide language-specific categories.

  • Categories come after a </translate> tag (which marks the start of the page footer).
  • Every category requires a {{#translation:}} suffix.
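
The footer of a source page might then look like the sketch below. The category name Events is hypothetical. It is assumed here that {{#translation:}} expands to a language suffix such as /nl on translated pages and to nothing on the source page.

```wikitext
<translate>
This is the last translated paragraph.
</translate>

[[Category:Events{{#translation:}}]]
```

Under that assumption, the Dutch translation would land in Category:Events/nl, separate from the base category.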

Restrictions

  • After marking the base page for translation, only the base text (source page) can be edited.
  • The translated pages can only be maintained by the tool. They can't be edited in the normal way.
  • There is a suffixed version of the base text (e.g. /en) without translation unit markers, which can be used to copy the wiki text to other platforms (or to a new issue).
  • You can export the translated text as wiki code.
  • You need to be a member of the translation administrator group in order to mark a page for translation. A bureaucrat can grant this group membership.
  • Any user can translate the article to any language once it is marked for translation by the administrator.
  • There is an internal version control, so anyone can verify if a translation is out-of-date.
  • Never remove the space character after the special translation unit comment (translation units typically start at a new line).

Unit markers

Text sections and translation unit markers are bound to each other:

  • Do not manually insert nor remove the translation unit comments.
  • Make sure translation units are separated from each other (e.g. one blank line between the header and the next paragraph, one space after the translation marker = structured comment).
  • When copying text from other (translated) pages, only paste the text, not the "foreign" translation unit comments. When necessary copy wiki code from the /en (base) version of the page (translation unit markers being removed).
  • When you want to remove all text of a translation unit, also remove the structured comment, to avoid conflicts amongst translation units.
  • You can move sections, together with their markers. Caveats:
    • make sure to keep unit markers on a separate line after an empty line;
    • images - put and keep images on separate lines, together with their marker;
    • paragraphs - when moving paragraphs, pay attention to line spacing
  • Tip: use the text editor to move section texts; the unit markers are easily identified.
  • When you merge two paragraphs, remove the second translation marker.
  • Do not add empty lines directly around the <translate> and </translate> tags; the tags themselves imply an empty line in the generated HTML.
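
For orientation, the sketch below shows roughly what a source page looks like after it has been marked for translation. It assumes the usual marker form <!--T:n-->, placed at the end of a heading line and on its own line before a paragraph; the numbers are assigned by the tool and the texts are placeholders.

```wikitext
<languages/>
<translate>
== Welcome == <!--T:1-->

<!--T:2-->
This is the first paragraph.

<!--T:3-->
This is the second paragraph.
</translate>
```

When moving the second paragraph, its <!--T:3--> line moves with it, and the empty line before the marker is kept.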

Specific rules and tips

Translators should only see short textual paragraphs and sentences to translate. Where possible, they should never see complicated wiki code or tables.

For simple documents, i.e. those containing only headings and paragraphs, a single <translate> tag (after the header) and a single </translate> tag (before the footer) are sufficient. <translate> and </translate> must always be balanced. Put <translate> and </translate> on separate lines when possible. Don't add extra blank lines.

Insert one blank line before section headers, or a <translate> or </translate> tag.

Make sure every paragraph starts after a blank line, or after a <translate> or </translate> tag (even the first paragraph after a heading). This lets the software parse headings and paragraphs correctly.

To prevent translation of code, place <syntaxhighlight> ... </syntaxhighlight> blocks between a closing </translate> tag and a reopening <translate> tag.
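
A sketch of this pattern, with an invented command and invented surrounding text:

```wikitext
<translate>
Run the following command:
</translate>

<syntaxhighlight lang="bash">
echo "hello"
</syntaxhighlight>

<translate>
The command prints a greeting.
</translate>
```

The code block sits outside any translated area, so translators never see it.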

Long lists should have one <translate> </translate> pair for each line, after the * or the #. Before the list you need to add </translate>, and after the list <translate>.

URL labels (both internal and external) need to be embedded in <translate> </translate> pairs.

You need to add an extra </translate> tag before, and an extra <translate> tag after, a list or a (group of) URLs.
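
The list and link rules above could be combined as in the following sketch. The URL, the page name Venue, and all texts are hypothetical.

```wikitext
<translate>
Please bring the following items:
</translate>

* <translate>A laptop</translate>
* <translate>A charger</translate>
* [https://example.org/programme <translate>Programme</translate>]
* [[Special:MyLanguage/Venue|<translate>Venue</translate>]]

<translate>
See you there.
</translate>
```

Each list item and each link label becomes its own short translation unit, while the list markup and the link targets stay hidden from the translator.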

Use the {{lwp}} template to embed a Wikipedia page.

Categories require a {{#translation:}} suffix, to discriminate them from the base categories. Categories belong to the footer, so come after the final </translate> tag.

{{PAGELANGUAGE}} can be used everywhere; it expands to the ISO language code of the current page. This way you can automatically insert language-sensitive prefixes or suffixes.

Use the Special:MyLanguage/ prefix to refer to other Wikimedia language pages.

Migrate older translation techniques

Translate an existing page

You might try the Special:PagePreparation interface to automate this:

  1. Check the name of the base page (avoid accents and other non-ASCII characters; choose an English name).
  2. If necessary, rename the page (with or without a redirect, depending on whether other pages refer to it).
  3. The page language should reflect the text language. If necessary change the page language (or update via Page information).
  4. Replace any special language templates by tags supported by the Translate extension (read more below).
  5. Continue with the general procedure above.

Migrate explicit language subpages

When language subpages exist:

  1. Rename the subpages to reflect the ISO language code.
  2. Prepare and mark the source page.
  3. Run the Import subpage tool to align every translated page. The previous version of each translated page is used to perform the alignment.

In-situ translation with LangSwitch

For pages using LangSwitch, the text needs to be split per language. Then remove the LangSwitch templates.

Migrate {{Page translated}} - using LangSwitch

Migrate {{Langmajor}}

Migrate {{Langmenu}} - In-situ with title update

Migrate other translation mechanisms

Specific techniques exist, depending on the templates used. Those templates should be recoded to equivalent language tags.

Migrate {{LangSwitch}} - coding language text sections

Remove the LangSwitch templates.

Migrate {{Pages translated}} - separate pages

Migrate {{Langhead}} - Individual translated pages

Requires renaming the pages to use ISO language code suffixes.

Migrate {{Languages}} - language suffixes

Migrate {{GetFallback}} or {{int:Lang}} => {{PAGELANGUAGE}}

Terms and definitions

Term | Definition | Remarks
Translate extension | Tool to semi-automatically translate pages on Wikimedia platforms | Two mechanisms exist: the Wikipedia one and a generic one (this one).
Translation administrator | Person responsible for translation | The group right is assigned by a bureaucrat, after screening of the volunteer.
Translation tags | Special tags recognized by the translation tool | Examples: <languages/> <translate> </translate>
Translation unit | Piece of text to translate | Should be kept short, and should not include wiki tags, codes, or constructs.
Translation unit marker | Structured comment to indicate a translation unit | Unit markers belong uniquely to their section of text. They are generated automatically. Translated texts are stored in the database for later reuse.

Components and functions

Edit rights | Component | Description | Remarks
Translation administrator | 1. Set source page language | Set up or verify the page language. | Every site has a default language. The language can also be updated via "page information", language.
Translation administrator | 2. Activate a source page for translation | Automatically add <languages/> and <translate> tags | Editing subpages is disabled after marking the page for translation.
Translation administrator | 3. Mark a page for translation | Allow translators to translate a (new version of the) source page | Needs to be repeated after each edit of the source page.
Translation administrator | Import a subpage | Initially align translated page sections | Do it once only for each language subpage. Could be cumbersome.
Translation administrator | Deactivate a page for translation | Deactivate page translation and remove the translation tags | Enables free subpage editing again. Should only be used in emergency situations.
Any user | List translated pages | Show the list of translated pages | Manage the pages.
Any user | Translate | Translate any page ready for translation | The actual translated page is amended.

Tips and tricks

  1. You can copy the complete wiki text of a language page, including the translation units (structured comments), to make the next issue of a series of campaign pages. This greatly simplifies the editing.

Known problems

Out-of-date translations

The translated pages don't show the same content as the source language page, yet the system seems to indicate that the translations are up-to-date.

Cause: The original text has been modified, but the translation administrator did not mark the page for translation again.

Solution: Have the translation administrator mark the page for translation, then update the languages that still need it.

Redundant verbatim translate tags

When you see verbatim <translate> and </translate> tags while saving your updates, the most likely cause is a missing empty line between a header and a paragraph, or a missing space after a translation unit marker. This leads to two "concatenated" and conflicting translation units. Remember that every paragraph has to start with an empty line for the translation engine to work properly. Another possible cause is redundant or missing <translate> and </translate> tags.
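
As an illustration of the first cause, suppose a paragraph was typed directly under its heading, with no empty line in between. A corrected version, with invented text, might look like the sketch below; whether a given wiki shows exactly the symptom described above is not guaranteed.

```wikitext
<translate>
== Welcome ==

This paragraph now starts after an empty line.
</translate>
```

The tags are balanced, and the empty line lets the heading and the paragraph be parsed as two separate units.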

Wrong source language

Any language can be the source language. The default is the wiki's default language. English is used most, since it is the most widely known internationally. You can only translate properly when the page language corresponds to the actual language of the text. Setting the language is restricted to the translation administrator.

  • Use Special:PageLanguage or page information to change the source language when it was wrongly (or not) set during the first mark for translation.

Bot automatically adds untranslated English page

Restrict the list of languages that are required.

Marking for translation

After the source page has been changed, you must refresh the page before you get the option to mark the updated page for translation.

False outdated translation warning

When translating with the filter set to outdated sections, the last translated section is still shown as untranslated after it has been modified.

Solution: Filter again for untranslated sections.

A lot of updates in the page history

The translation tool saves every translated paragraph (translation unit) separately. It makes many changes in the Translations namespace. It can become very confusing when viewing the long history of (user) updates.

It might be needed to explicitly select the (main) namespace when searching for contributions. Otherwise you might not find your contribution.

  • Filter for specific namespaces.
  • Uncheck the Translations namespace.
  • Only show the last update.

Categories not shown in visual editor

Categories having a {{#translation:}} suffix are not shown in the primary page with the visual editor. This can be confusing to the user.

Switch to the source (wikitext) editor to update the categories.

Line spacing anomalies

Redundant whitespace can be generated because of:

  • redundant <br> at end of a section
  • multiple blank lines
  • empty line before </translate>
  • empty line after structured comment

Solution: remove redundant <br> or empty lines.

Obsolete translation tasks after last translation task

The status of the translation is not correct after the last task has been performed; it still shows outdated pending tasks.

Solution: Click on Untranslated or Outdated.

Moving images with translation marker

When moving images, take care to also move the translation marker. Use Wiki code to be sure.

Multiple unit markers

After moving or deleting paragraphs, or images, or other sections, multiple unit markers prevent saving the page.

Indication:

  • A dangling unit marker at the end of a paragraph

Solutions:

  • Only move Wiki text with the code editor; avoid moving text with the visual editor.
  • Take care when removing a paragraph (unit markers might arrive at the end of the previous paragraph).
  • Verify, and compare with the previous version; put the unit marker back in its right place. Keep an empty line before the marker, and keep the restored marker on a separate line.

Redundant mark for translation indication

You get a false mark for translation warning after the page was protected for edits.

Solution: ignore this warning.

Paragraphs and sections

When adding headers and paragraphs, the visual editor doesn't add the necessary empty lines.

As a consequence, the translation units aren't identified properly.

  • Verify and fix the wiki code.

Can't edit with the visual editor without marking for translation

After adding <languages /> and the initial <translate> and </translate> tags, you need to perform a mark for translation.

Until then, only text editing can be done.
