Jul 5, 2013

Translating video captions in memoQ

Since my first explorations of editing video caption files in a text editor last week, I've learned quite a few ways to improve the process. I found a free, cross-platform Open Source tool, Aegisub, for editing the captions. It is particularly helpful when the timing needs to be adjusted, and its use is fairly intuitive. It beats working in Notepad or Microsoft Word by a long shot.

For translating caption files I also discovered a useful resource on Kilgray's Language Terminal: a Regex text filter designed to filter out the segment numbers and time codes in the caption files. Useful exclusion rules to configure for this are as follows:


The resource file for the filter settings (MQRES) and some sample caption files in English can be downloaded here.
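
To get a feel for what exclusion rules of this kind do without opening memoQ, here is a minimal Python sketch. It is an illustration under assumptions, not the content of the MQRES file: it assumes SRT-style captions (a counter line, a time-code line with a period before the milliseconds, then the text), the names `COUNTER`, `TIMECODE` and `translatable_lines` are hypothetical, the sample captions are made up, and memoQ's own regex dialect may differ from Python's.

```python
import re

# Hypothetical patterns, assuming SRT-style caption blocks.
COUNTER = re.compile(r"^\d+$")  # segment number on its own line
TIMECODE = re.compile(
    r"^\d{2}:\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}$"
)  # e.g. 00:00:01.000 --> 00:00:03.500


def translatable_lines(caption_text):
    """Return only the lines a translator should see."""
    kept = []
    for line in caption_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank separator between caption blocks
        if COUNTER.match(stripped) or TIMECODE.match(stripped):
            continue  # excluded: segment number or time code
        kept.append(stripped)
    return kept


# Made-up sample captions, for illustration only.
sample = """1
00:00:01.000 --> 00:00:03.500
Welcome to the show.

2
00:00:03.500 --> 00:00:06.000
Today we talk about
caption files.
"""

print(translatable_lines(sample))
```

One limitation of this simple approach: a caption line that consists only of digits would be mistaken for a segment number and dropped.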

Here is an example (preview) of how text is filtered:


To ensure that the correct filter settings are used for the captions text file, use Import with options... in memoQ. If memoQ does not recognize the file extension, which is likely, set the file type to "All files (*.*)":


If you want to change the text breaks in a given time segment, use the Join function to combine segments, and place the tag for the line break wherever it makes sense to do so:


12 comments:

  1. OK, but there is no Video captions filter configuration in memoQ. I downloaded VideoCaptions.zip, and it does contain a RegexTextConverter#VideoCaptions.mqres file, which I suppose is the memoQ filter, but how do I use it, or get memoQ to see it?

    Replies
    1. Import the filter configuration, then import the file into memoQ, selecting the Regex text filter and the saved configuration you imported for it.

  2. How is it possible to import the filter configuration? I tried several ways, but without success.

    Replies
    1. Use the relevant section in the Resource Console.

    2. can you help me import file srt on notepad in detail ?

    3. I'm sorry, I do not understand what you are asking.

  3. In the memoQ menu there is a Resource Console option. There, under filter configurations, you have to import the .mqres file. You can also upload it to a memoQ server if you work in a collaborative environment. Then, once the filter is in place, import the files through the 'Import with options' link, where you can configure the filters to be used for the particular files being imported.

  4. I have some text like this:
    !_awa="WA"
    %s. Limits.="%s. Limits."
    _***="United States Territorial Waters"
    _a**="Australia"
    _act="Australian Capital Territory"
    _AFG="Afghanistan"
    _AHO="Netherlands Antilles"
    _AIA="Anguilla"

    How can I make a filter that shows me only what is between the apostrophes?

    Replies
    1. In this particular case I would use the JSON filter.

    2. BTW, those are not apostrophes, they are quotation marks.
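
For readers who would rather approach this with a regular expression, the idea of isolating the quoted part of each line can be sketched in Python. This is only an illustration under assumptions: the pattern `ENTRY` and the function `quoted_values` are hypothetical names, the sketch assumes every line has the form key="value" with no equals sign inside the key, and the same idea would still have to be expressed in whatever syntax the memoQ filter expects.

```python
import re

# Hypothetical pattern: a key without '=', then the quoted value to translate.
ENTRY = re.compile(r'^(?P<key>[^=]+)="(?P<value>.*)"$')


def quoted_values(text):
    """Return the text between the quotation marks on each matching line."""
    values = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            values.append(match.group("value"))
    return values


# Lines taken from the question above.
sample = '''!_awa="WA"
%s. Limits.="%s. Limits."
_***="United States Territorial Waters"
_a**="Australia"
'''

print(quoted_values(sample))
```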

  5. I am following the rule that you presented above, but my .srt uses a "comma" instead of a "period" after the timecode and I don't know how to change the rule so that it would work in memoQ. The subtitles below are some examples. Can you help me?
    1
    00:00:06,236 --> 00:00:07,379
    - Hey, welcome back to
    "Right on the Money,"

    2
    00:00:07,379 --> 00:00:09,101
    the show that features financial advisors,

  6. It works! Very useful workaround, thank you Kevin!
    Regards

