Comments on Translation Tribulations: Cleaning up a crappy OCR job for translation
Kevin Lossner
Feed updated 2024-03-06T02:46:19.929+00:00

2018-03-16T15:08:46.759+00:00
Great article! PDFs are probably among the worst when it comes to transferring information that needs to be worked on.
ChrisGuerra

2014-03-04T12:27:13.740+00:00
I think I figured it out. Just put ‘CodeZapper 2_9_4.dot’ in this folder: ‘C:\Users\usr\AppData\Roaming\Microsoft\Word\STARTUP’

Word now starts with CodeZapper loaded!
Michael Beijer

2014-03-04T12:14:41.778+00:00
Hi Kevin,
Yes, the disappearing CodeZapper toolbar in Word 2013 is really annoying. If anyone reading this has solved this, please post your solution here in the comments! I was wondering if it might be possible to create a macro that would automate the steps to re-attach it…

Michael
Michael Beijer

2014-03-01T19:12:13.981+00:00
Great article! PDFs are probably among the worst when it comes to transferring information that needs to be worked on.

As far as Office goes, one can download an ISO file of the relevant edition (e.g. Home&Business, Professional) in the architecture (32- or 64-bit) and language of one's choice from http://www.heidoc.net/joomla/technology-science/microsoft/73-office-2013-direct-download-links. The license should work just the same. MS is bluntly and rather crudely attempting to drive people to choose their subscription-based plan, and the end justified the means.

Another set of macros similar to CodeZapper can be found at http://www.translatortools.net/, but nothing beats having the original file format.
Anonymous

2014-02-27T17:51:27.741+00:00
Last night I received the Mother of All Disastrous PDFs from someone - the most complicated survey form I have seen in years, with text boxes split between pages, even through the middle of words. Good luck with OCR for that: it is literally impossible with any techniques I know.
The PDF page order does not even display the split boxes in a double-page spread so that I might use something like a screenshot OCR utility. This insane mess was created in InDesign. Here once again, the translation will be simple if the INDD file can be obtained.

We can use techniques like those I describe in this post to clean up a lot of messes. But it is much, much better not to step in these cow patties in the first place and to communicate with our clients about the formats with which we can do our best work. Whether we want to admit it or not, every bit of energy that goes into dealing with messed-up formats somehow subtracts from the energy remaining for quality translation.
Kevin Lossner

2014-02-27T13:21:02.950+00:00
I believe anything if it involves stupidity :-) Actually, if they like PDFs, you can have those generated free on Kilgray's Language Terminal server. I did several InDesign INDD translations this week (the integration with that new tab in memoQ is very nice), and each time I created target files, I logged in to the web interface of Language Terminal and retrieved the full ZIP package (instead of just the IDML available off the tab in the Translations window). It contains a PDF for proofreading purposes.

So if these idiots want to make extra work for themselves, they can send you an InDesign file and you send back just a PDF and let them do an OCR and waste lots of time correcting and importing the text so they feel like they have lots of "work".
And some day they'll probably be fired when the company realizes the waste.
Kevin Lossner

2014-02-27T11:11:40.276+00:00
very interesting ...
well, a couple of years ago I found my 1st final customer, so I think I can add some insight to the matter
:-)

I received an enormous and quite indigestible PDF, as usual, but considering that it was a "final customer" I tried to get the InDesign or PageMaker file instead

then I tested it, finding it very digestible and easy to work with, then I did a mock automatic translation to let them consider the result, adding that using a more digestible file would have translated into lower costs for them

well, can you believe it?

they refused this method because:
the graphic department was accustomed to managing PDFs, even if it meant a lot of troublesome conversions
the IT department, idem
the big boss was accustomed to reading PDFs

the moral of the story is that nothing beats psychological sclerosis!

that is BTW a problem of mine too, considering how much time was needed to switch to memoQ from Trados, or to Office 2013 from Office XP ...
;-)
ClaudioPorcellana

2014-02-25T15:58:27.662+00:00
Hi Kevin,
You did a great cleaning job here.
Re: MS Office language versions. I had the same issue a few weeks ago when I bought Home&Business 2013. I decided not to get the 365 version but the one-off purchase to download on my machine (although it's still click-to-run, not the full MSI installer). MS kept taking me to a page where I could only download the Spanish version, and after much coming and going I managed to reach an English version here:
http://www.microsoftstore.com/store/msusa/en_US/pdp/Office-Home-and-Business-2013/productID.259321600
It's now running happily in English on my Win 8.1 OS (which originally came in Spanish and I was able to change to English by adding a language pack, fairly painlessly).
Emma
Emma Goldsmith

2014-02-24T20:06:44.621+00:00
Oh yes... that update? It wasn't. Same text, just a different file format. But with the previous format chaos those involved could not determine this. One more reason to stick to original formats where possible. I suppose I could use the word "Medienbruch" (http://www.translationtribulations.com/2014/02/medienbruch.html) to describe this problem too.
Kevin Lossner

2014-02-24T20:03:31.966+00:00
Well, there is actually a better solution for all this, one which is usually an option, but which most translators or agency PMs are too afraid to try. Insist on the original file. It works like a charm.

Today there was an "update" to this job. Or so it seemed.
The agency called up in a panic: a new PDF had been received from the client and they wanted me to compare and make changes. Life's too short for such BS. I basically pointed out that this was going to be very expensive and that really, for all concerned, the best thing to do now was to ask the client nicely for the original InDesign file (which I think actually was not in existence before but it clearly was now). Ten minutes later I had it, and the text was the cleanest and most trouble-free of any process yet. Everyone saves time and money. Except the translator, who now makes more money in the time spent. We all win.

So honestly... all these damned converters and techniques are fine things in a real "emergency", but we do everyone involved a great service if we just dig our heels in and insist that original format files be produced wherever possible. This has been the best and most cost-effective solution for more projects than I can count, but in those cases where I am dealing with an agency, it takes longer to convince a PM to ask for the file than it takes for the customer to send it. Such timidity is in nobody's interest in cases like this.
Kevin Lossner

2014-02-24T19:51:04.170+00:00
Wow, what a nightmare. I hope you charged triple for this. Thanks for describing your process, though; I learned a lot!
Anonymous

2014-02-23T10:46:48.661+00:00
Wow, you definitely deserve a good pat on the back for this.

______________
Karolina Karczmarek-Giel
Office Assistant
wantwords.co.uk
Anonymous