There is, of course, another simple way to translate the embedded objects in a Microsoft Office document that does not involve purchasing other software licenses. I don't usually talk about it, because there are a few limitations, and until recently I had not figured out how to avoid corrupting the files when I tried to do things the "easy" way. This approach is not limited to memoQ and will actually work with most CAT tools - so SDL Trados Studio users can do this as well, for example.
It is useful to know that the Microsoft Office 2007/2010 file formats (DOCX, PPTX, XLSX) are really just ZIP files containing XML and a bunch of other stuff. That stuff includes a folder with the embedded objects in formats that can be dealt with directly.
If you have an older, binary MS Office document (DOC, PPT, XLS) with embedded objects, convert it to a 2007/2010 format.
If you rename the file extension DOCX, PPTX or XSLX to ZIP and unpack the ZIP file, inside the folder you will find a folder called "embeddings". The files in that folder can be copied elsewhere and usually handled directly in your CAT tool. But problems usually arise when you put them back, rezip the folder and change back to the original extension. The compression gets screwed up, and the Microsoft Office file is corrupted and won't open.
The only reliable method I have found for avoiding this is to use the Windows Explorer (under Windows 7) to open the ZIP file:
Here's what the "guts" of one DOCX file with a bunch of embedded Excel tables looks like:
Simply copy the embeddings folder somewhere safe, translate its contents, then copy them back to the ZIP file using Windows Explorer. Then rename the ZIP extension to the original extension for the file.
If you open the file and look at it, you'll get a shock. When you see all the objects in their original language, you might think something went wrong. Nothing bad has happened; you merely need to refresh the objects. This can be done by opening each briefly to edit or using a macro to open each object and close it again quickly. In a job with dozens of embedded objects in a long file, this macro is a helpful shortcut.
Given how easily accessible this embedded content actually is, one has to wonder why other major CAT tool providers like SDL and Kilgray have failed to offer the option of importing embedded content in their filters up to now. Let's hope they do soon. In the meantime, this workaround should enable many people to deal with this complex and irritating file format challenge.
Here's a summary of the procedure once again:
- Rename the *.???x file to *.zip
- Under Windows 7, right-click on the ZIP file and open it using the Windows Explorer. Using ZIP tools of any kind risks corruption by changing the compression ratios.
- Find the embeddings folder inside the ZIP structure. Copy this elsewhere and use it as the source for translation. It will contain all the embedded objects as single files.
- Copy the translated content back into the embeddings folder in the ZIP structure.
- Rename the ZIP file to its original extension.
- Open the file and refresh each embedded object (which will initially appear not to have been translated) by right-clicking and opening it from the context menu or running a macro to do that.