Comments on Translation Tribulations: How would you translate the chart in this DOCX file?<br />Kevin Lossner (http://www.blogger.com/profile/14727800526216764023)<br />Feed updated 2024-03-06T02:46:19.929+00:00<br /><br />[2013-07-19T12:37:38.530+01:00]<br /><br />It seems that incessant complaining might actually work. After some back and forth between me and the AnyCount (AIT) support guy (in which I politely reminded him that AnyCount is not cheap (at €85 for the Enterprise edition) and really should be able to do this if it is going to offer us translators any added value), I received the email below. <br /><br />Incidentally, this just goes to show that we should all be more vocal when it comes to features we believe our tools should offer. Because if they don’t give us what we want, we can just choose another solution. There are enough good ones around. For example, Kilgray refused to add a way to remove duplicates from TBs in memoQ pro. Instead, they added it to qTerm. So what did I do, after asking and asking and not being listened to? Wrong. I didn't just do nothing and wait some more. I switched to CafeTran (http://cafetran.wikidot.com/). Igor (the developer of CafeTran) has added more features that his (freelance) users actually want in the last week than Kilgray has in the last year! Have a look at this: http://wordbook.nl/new-features-added-to-CafeTran-in-the-last-week-alone.html<br /><br />---------------------------------------------*<br />Alexander Artamoshkin<br />12:07 PM (1 hour ago)<br /><br />to me <br />Dear Michael,<br /><br />Your suggestion has been reviewed by our lead software developer and entered into our corporate suggestion database. We will consider your suggestion while working on upcoming versions of AnyCount. 
<br /><br />Do not hesitate to contact us if you have any other questions or suggestions.<br /><br />Best regards,<br />Alexander.<br /><br />----------<br />Alexander Artamoshkin,<br />AIT Software Development Team<br /><br />Posted by Michael Beijer (https://www.blogger.com/profile/12826804655385764008)<br /><br />[2013-07-18T17:21:55.712+01:00]<br /><br />So today I decided to count the words in the file in various programs, and finally to do it myself. After all, how hard could it be, right? Here are the results:<br /><br />------------------------------*<br />PractiCount (FAILED):<br /><br />words: 31<br />characters with spaces: 202<br />characters without spaces: 173<br />lines: 3<br />pages: 1<br />------------------------------*<br />AnyCount (FAILED):<br /><br />Text: 32 words <br />Text Boxes: 0<br />Shapes: 0<br />Running Headers: 0 <br />Running Footers: 0 <br />Footnotes: 0 <br />End Notes: 0<br />Embedded Object: 0 <br />Linked Object: 0 <br />Comments: 0 <br />Hidden Text: 0 <br />File Total: 0<br />------------------------------*<br />MS Word (FAILED):<br /><br />words: 32<br />------------------------------*<br />LibreOffice Writer (mangles document; doesn't display chart at all):<br /><br />31 words<br />------------------------------*<br />Michael Beijer (SUCCEEDED):<br /><br />1. normal text (accessible in .docx file):<br /><br />A survey was conducted to determine feelings regarding the best communication strategy.<br />Figure 5: What topics were particularly suited to communicate the content of the fire safety plan at public events?<br /><br />2. 
words in chart (just rename the .docx to .zip, navigate to \zip\Figure_5_chart_eng-US (1)\word\charts and count):<br /><br />Importance of representing specific action proposals (n=64)<br />Importance of representing the draft of the fire safety plan (n=55)<br />Extreme <br />Strong<br />Moderate <br />Somewhat<br />None<br /><br />= 56 words total<br /><br />That is, 56 - 31 = 25 missing words! (or put another way: you just lost 44.6% of your rate for this job)<br /><br />Posted by Michael Beijer (https://www.blogger.com/profile/12826804655385764008)<br /><br />[2013-07-18T17:21:18.019+01:00]<br /><br />So after trying to count the words in this document, and failing (because I stupidly forgot to use the rename to .zip trick), I all of a sudden remembered that I actually bought the Enterprise edition of AnyCount recently, for around €95! To my not-so-great surprise, it also couldn’t get at the words in those pesky charts. I therefore decided to ask AnyCount’s support department what they thought, which went a little something like this:<br /> <br /><br />Michael Beijer (Client) Posted On: 17 July 2013 01:34 PM<br />________________________________________<br /><br />Subject: Could you tell me how to get an accurate word count from the attached file? <br />I was hoping that AnyCount would catch the words in the figure as well. I managed a correct count using ABBYY Screenshot Reader, and was wondering why AnyCount didn't use OCR on the image portion... 
Is there a setting to make it do this?<br /><br />Michael <br /><br /><br />Alexander Artamoshkin<br /> Jul 17 (1 day ago)<br /> <br /> <br />to me<br /> <br /> <br />Dear Michael,<br /><br />Thank you for you question.<br /><br />Unfortunately, there is no possibility to count text in the chart.<br /><br />The only way to do that is to make a printscreen of this chart and to remove all the diagrams and lines leaving only the text.<br /><br />Thereafter you will be able to count the text from the printscreen you have got.<br /><br />We are sorry for inconvenience.<br /><br />Please feel free to contact us if you have any questions.<br /><br />Best regards,<br />Alexander.<br /><br />----------------------------------------------<br />Alexander Artamoshkin,<br />AIT Software Development Team<br /><br /><br /><br />Michael Beijer <br /> 12:37 PM (5 hours ago)<br /> <br /> <br />to support<br /> <br /> <br />Hi Alexander,<br /><br />Two things.<br /><br />1. I was under the impression that AnyCount could use OCR to count words in images. Why can't it figure out that there is text there that it can't get at and apply OCR?<br /><br />2. The text in question is actually present inside the file. Please have a look at Kevin Lossner's blog post on this exact problem (http://www.translationtribulations.com/2013/07/how-would-you-translate-chart-in-this.html) and how to get at it. Couldn't this be programmed into AnyCount? 
It's just a matter of renaming the .docx as a zip file and locating the text displayed in the charts.<br /><br />Michael<br /><br /><br />Michael Beijer<br />Translator & Terminologist<br />(Dutch/Flemish into English)<br />Skype/Twitter: michaelbeijer<br />iMessage: michael@wordbook.nl<br /><br />Alexander Artamoshkin<br /> 1:49 PM (4 hours ago)<br /> <br /> <br />to me<br /> <br /> <br />Hello Michael,<br /><br />Thank you for your questions.<br /><br />>Why can't it figure out that there is text there that it can't get<br />at and apply OCR?<br /><br />AnyCount may recognize the pictures like a text. Because of this it may give you false results.<br /><br />>Please have a look at Kevin Lossner's blog post on this exact problem (http://www.translationtribulations.com/2013/07/how-would-you-translate-chart-in-this.html)<br />and how to get at it<br /><br />I would like to mention that AnyCount uses its own OCR with its own abilities. <br /><br />Please feel free to contact us if you have any additional questions.<br /><br />Best regards,<br />Alexander.<br /><br />----------------------------------------------<br />Alexander Artamoshkin,<br />AIT Software Development Team<br /><br />Posted by Michael Beijer (https://www.blogger.com/profile/12826804655385764008)<br /><br />[2013-07-17T16:41:46.493+01:00]<br /><br />Very close to my procedure, yes. Time to update the blog post now....<br /><br />Posted by Kevin Lossner (https://www.blogger.com/profile/14727800526216764023)<br /><br />[2013-07-17T14:31:35.744+01:00]<br /><br />Hi! This is how I managed to edit the chart text. 
However, I don't know how the file will react when the Excel file becomes accessible again (it might synchronize, and the text elements of the chart might revert to their original values).<br /><br />1. I renamed the .docx file to .zip<br />2. I opened the zip file with WinRar and navigated to the folder \word\charts\<br />3. There is a file there named "chart1.xml" <br />4. I opened the file with a text (or XML) editor. The text elements are present within this file (I used Find/Replace to replace the text safely, because if a tag is altered by mistake, Word will consider the file corrupt)<br />5. After all the text elements had been replaced, I closed WinRar, and it prompted me to update the file in the archive. I selected "Yes".<br />6. Finally, I renamed the file to .docx and it opened correctly with my updated text.<br /><br />It works on my computer. I don't know if it will work for you, but it may be worth a try!<br /><br />Stan<br /><br />Posted by Stanislas Biron<br /><br />[2013-07-17T14:13:21.477+01:00]<br /><br />Sometimes LibreOffice gets it (after breaking the chart or Excel file up), but not here. <br /><br />You can print it to PDF and open it with a suitable PDF reader (PDF X-Change Viewer did and does a brilliant job here, while others did not get it), copy all text and paste it into a Word file to count it. This sounds complicated, but may be the faster solution for huge files. This approach works only for word counting, as the layout gets messed up and words get shortened (!) and separated by tabs. I count 74 words, including numbers.<br /><br />Posted by Torsten (https://www.blogger.com/profile/11115731755158723704)<br /><br />[2013-07-17T11:43:52.225+01:00]<br /><br />ABBYY Screenshot Reader? Good idea. 
That number sounds about right. The much-favored PractiCount that everyone depends on so much told me 17 words. Even Microsoft can do better than that.<br /><br />Posted by Kevin Lossner (https://www.blogger.com/profile/14727800526216764023)<br /><br />[2013-07-17T11:31:29.772+01:00]<br /><br />PS: I count 65 words using ABBYY Screenshot Reader.<br /><br />Michael<br /><br />Posted by Michael Beijer (http://wordbook.nl/)<br /><br />[2013-07-17T11:27:45.561+01:00]<br /><br />The text in the figure is on another computer (in a .xlsx file). Is it possible that your CAT tool would import this text too if it could find it? If you had this file, you could change the source via File | Info | Edit Links to Files | Change Source. You might also be able to break the link, and then manually create the missing .xlsx file and re-link it? Just guessing really.<br /><br />Posted by Michael Beijer (http://wordbook.nl/)
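
The rename-to-.zip trick described in the comments above could probably be automated rather than done by hand. What follows is only a minimal sketch in Python, not anything AnyCount or another tool is known to do. It assumes the chart text sits in `word/charts/*.xml` inside the .docx, either as rich-text runs (`a:t` elements) or as cached label strings (`c:strCache`), which is how DrawingML charts normally store it. The names `chart_strings` and `count_words` are made up for illustration, and the script has not been run against the file from the blog post.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespaces used inside chart parts of a .docx file.
NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "c": "http://schemas.openxmlformats.org/drawingml/2006/chart",
}


def chart_strings(docx_path):
    """Return the text strings found in word/charts/*.xml of a .docx file.

    A .docx is a zip archive, so no renaming is needed to look inside it.
    """
    strings = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in sorted(archive.namelist()):
            if not re.fullmatch(r"word/charts/[^/]+\.xml", name):
                continue
            root = ET.fromstring(archive.read(name))
            # Rich-text runs, e.g. titles typed directly into the chart.
            for node in root.iterfind(".//a:t", NS):
                if node.text and node.text.strip():
                    strings.append(node.text.strip())
            # Cached strings (series names, category labels) copied from
            # the linked workbook; numeric caches are skipped on purpose.
            for node in root.iterfind(".//c:strCache/c:pt/c:v", NS):
                if node.text and node.text.strip():
                    strings.append(node.text.strip())
    return strings


def count_words(strings):
    """Count whitespace-separated words across all strings."""
    return sum(len(s.split()) for s in strings)


if __name__ == "__main__":
    found = chart_strings(sys.argv[1])
    for text in found:
        print(text)
    print("chart words:", count_words(found))
```

Run it as `python chart_words.py yourfile.docx` (the script name is arbitrary). The count may not match a manual count exactly: cached labels can be repeated once per data series, and what counts as a word (for example "(n=64)") depends on the splitting rule, so treat the result as a check on the hand count, not a replacement for it.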