Showing posts with label TransPDF. Show all posts

May 10, 2018

Zooming inside iceni InFix for PDF translation: web meeting on 21 June 2018


Over the course of the last nine years, I have published a few articles about ways that I have found the PDF editor iceni InFix useful for my translation and terminology research work. Throughout that time iceni has continued to improve that product as well as develop other technologies for PDF translation assistance, such as the online TransPDF service now integrated with memoQ.
It's one thing to have a tool and often quite another to know how to make the best use of it. This situation is further complicated by the very wide range of scenarios in which an editor like iceni InFix might be useful and by the great differences in client needs and expectations from one translator to another. In the product's early days I followed the commentaries of José Henrique Lamensdorf, a Brazilian engineer with long experience in technical translation, desktop publishing and other fields. I consider him to have been among the most useful sources of good technical information in my early days as a commercial translator, but his project needs were very different from mine. Most of the things he mentioned a decade or more ago, though very relevant to people heavily involved with publishing, weren't a fit for my clientele.
That changed as iceni expanded the feature set over the years and I began to encounter many cases where OCR and a full Adobe Acrobat license did not quite do what I needed in a simple way.


Some weeks ago I had an online meeting scheduled with a client company to discuss the advantages of certain support technologies with that company's translation and project management staff. We tried to use TeamViewer for the discussion, but unfortunately my license could not accommodate the 6+ people involved, and I was reluctant to fork over the extra cash needed for a 15 or 25 participant license, especially because some other clients had issues with TeamViewer which I never clearly understood, leading their IT departments to ban it. And the TVS recording files, while generally quite decent for viewing and of manageable size due to an excellent compression CODEC, are a nightmare to convert cleanly to MP4 or other common video formats. Just as I was caught in this dilemma, my esteemed Portuguese to UK English translation colleague and gifted instructor at Universidade Nova de Lisboa, David Hardisty, enthusiastically re-introduced me to Zoom videoconferencing.

I had seen Zoom before briefly when IAPTI decided to ditch the Citrix conferencing solutions and use it for webinars and staff meetings, but at the time I was too distracted by other matters to remember the name or notice the details. And, as we know, there one finds the Devil.

Zoom is powerful and flexible. For about €13 a month for my Pro license, I can invite up to 100 people for a web meeting, with quite a few useful options that I am still getting a grip on. Being used to the relative simplicity of TeamViewer, I am a little overwhelmed sometimes, and I have had a few recorded client meetings where the video was flawed because I got the screen sharing options mixed up. But the basics are actually dead simple if one pays a bit of attention.

A Zoom "web meeting", by the way, is what I would call a webinar, but that term means something else in Zoomworld, involving up to 50 speakers and something like 10,000 participants for some monthly premium. Not my thing. If the crowd is bigger than 10 in an online or a face-to-face class, I start to feel the constrictions of time and individual attention like an unruly anaconda around me.

But in any case, for someone who has spent many years looking for better teaching tools, Zoom is looking pretty good right now. And it enables me to share what I hope is useful professional information without dealing with the organizational nonsense and politics often associated with platforms licensed by some companies and professional associations. All for the monthly price of a cheap lunch.

So I've decided to do a series of free public talks using Zoom, not only to share some of a considerable backlog of new and exciting technical matters for translators, translation project managers and support staff and language service consumers, but also to get a better handle on how I can use this tool to support friends, colleagues and students around the world. Previously I announced a terminology talk (on May 24th, mostly about memoQ); now I have decided to share some of the ways that iceni InFix helps me in my work and what it might do for you too.

On Thursday, June 21st at 16:00 Central European Time (15:00 Lisbon time) I'll be talking about how you can get your fix of useful PDF handling for a variety of challenging situations. You are welcome to join me for this.

The registration link is here.


Jun 24, 2017

The other sides of Iceni in Translation


The integration of the online TransPDF service from Iceni in memoQ 8.1 has raised the profile of an interesting company whose product, the Infix PDF Editor, has been reviewed before on this blog. TransPDF is a free service which extracts text content from PDF files, converts it to XLIFF for translation in common translation environments, and then re-integrates the target text from the translated XLIFF to create a PDF file in the target language.

This is a nice thing, though its applicability to my personal work is rather limited, as not many of my clients would be enthusiastic if I were to send PDF files as my translation results. Sometimes that fits, sometimes not. And of course, some have raised the question of whether using this online service is compatible with some non-disclosure restrictions.

I think it's a good thing that Kilgray has provided this integration, and I hope others follow suit, but for the cases where TransPDF doesn't meet the requirements of the job, it is useful to remember Iceni's other options for preparing text for translation.
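For readers curious about what the XLIFF step in such a round trip involves, here is a minimal Python sketch that reads source and target segments from an XLIFF string using only the standard library. It rests on assumptions of mine: that the file follows the XLIFF 1.2 layout (trans-unit elements containing source and target), and the sample content, the file name "brochure.pdf" and the function names are all invented for illustration. I have not verified which XLIFF version or inline tags TransPDF actually produces, so treat this as a way to peek inside such a file, not as a description of the service.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def read_segments(xliff_text: str) -> list[tuple[str, str, str]]:
    """Return (id, source, target) for each trans-unit.

    Assumes an XLIFF 1.2-style structure; a missing target yields "".
    """
    root = ET.fromstring(xliff_text)
    segments = []
    for elem in root.iter():
        if local_name(elem.tag) != "trans-unit":
            continue
        source = target = ""
        for child in elem:
            name = local_name(child.tag)
            # itertext() flattens inline tags such as <g> to plain text
            if name == "source":
                source = "".join(child.itertext())
            elif name == "target":
                target = "".join(child.itertext())
        segments.append((elem.get("id", ""), source, target))
    return segments


# Hypothetical sample data, not output from TransPDF
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="brochure.pdf" source-language="de" target-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Sicherheitshinweise</source>
        <target>Safety instructions</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Bitte <g id="b1">vor Gebrauch</g> lesen.</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

if __name__ == "__main__":
    for seg_id, source, target in read_segments(SAMPLE):
        print(seg_id, "|", source, "|", target or "(untranslated)")
```

A listing like this can be handy for a quick check of which segments are still untranslated before the XLIFF goes back for PDF generation. Note that flattening inline tags is fine for reading but would lose formatting if the text were written back.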

Translatable XML or marked-up text export
As long as I can remember, the Infix PDF Editor has offered the option to export text on your local computer (avoiding potential non-disclosure agreement violations) so that it can be translated and then re-imported later to make a PDF in the target language. Only the location of this option in the menus has changed: the menu choices for the current version 7 are shown below.



This solution suffers from the same problem as the TransPDF service: not everyone will be happy with the translation in PDF, as this complicates editing a little. However, I find the XML extract very useful to put the content of PDF files into a LiveDocs corpus for reference or term extraction. The fact that Infix ignores password protection on PDFs is also helpful sometimes.

"Article" export
The Article Tool of the Iceni Infix PDF Editor enables various text blocks on different pages of a PDF file to be marked, linked and extracted in various translatable formats such as RTF or HTML. The quality of the results varies according to the format.


Once "articles" are defined, they are exported via the command in the File menu:


The RTF export has some problems, as this view in Microsoft Word with the format characters made visible reveals:


However, the Simple HTML export opened in Microsoft Word shows no such troubles (and can be saved in RTF, DOCX or other formats):


Use of the article export feature requires a license for the Infix PDF Editor, unlike the XML or marked-up text exports for translation. In demo mode, random characters are replaced by an "X" so that one can see how the function works but not receive any unjust enrichment from it. However, this feature has significant value for the work of translators and is well worth an investment, as the results are typically better than using OCR software on a "live" (text-accessible) PDF file.

But wait... there's more!
Version 7 also has an OCR feature:


I tested it briefly on some scanned Portuguese Help Wanted ads that I'll probably use for a corpus linguistics lesson this summer; the results didn't look too awful, all things considered. This feature is worth a closer look as time permits, though it is unlikely to replace ABBYY FineReader as my tool of choice for "dead" PDFs.