Oct 18, 2013

Computer-aided translation tool survey


A bit over three years ago I conducted a small survey to get an idea of what working tools were popular among visitors to this blog and how widespread the use of multiple tools is. A lot has changed in the meantime, and another look at current patterns of use could be interesting.

The two survey questions will be found in the left margin at the top until the end of this year. If your work habits change between now and then, you can return and change your answers. Let's try to get the best statistical sample we can - spread the word!


UPDATE: The results are here:  http://www.translationtribulations.com/2014/01/the-2013-translation-environment-tools.html

33 comments:

  1. Just a small note: CafeTran (or maybe Cafetran) has been misspelled.

    1. That "or maybe" is telling, Torsten :-) That generation of programmers who began to corrupt writing habits in our languages have surely earned their place in the Eternal Flames. I actually spelled it wrong on purpose in a twist of clever psychology designed to garner the sympathy vote for a rather nice example of single-developer software ;-)

  2. A few comments:
    1) Note the correct spelling: Wordfast, MemSource, CafeTran
    2) Wordfast Anywhere ought to be included IMO, as it's more widely used than quite a few tools that have a separate entry in your survey
    3) You appear to have announced your survey on the mailing lists of only some of the tools listed, which means you will likely get distorted results; of course, not all tools have a dedicated mailing list, but those that have one should be treated on an equal footing.

    1. Thank you, Dominique - I had a few other details I wanted to change, but once a vote has been cast on those poll modules, no editing is allowed. But given the long tradition of bad spelling in IT, I'm not too bothered. Even (or perhaps especially) Trados users are challenged to get the name of their tool(s) right. Given that I posted a notice on the Trados, Déjà Vu, OmegaT and memoQ lists and the fact that there is a significant overlap between those for major tool groups and user groups for other tools, I assume that in the two and a half months remaining for the survey any temporary imbalances will be compensated as word gets around and those passionate Snowball users make their presence felt :-) As for the tools left off the list - Swordfish, Multitrans, etc. - there is that magical catch-all category "Other". If it assumes a very great proportion we'll know that future surveys should be broader in scope.

      When I did the first survey in June 2010 I was mostly interested in getting an idea of the tool usage habits for people involved with technologies I was likely to write about. I didn't expect to find the results of that small sample quoted in discussions of OmegaT, for example. I think that the sample likely to be obtained here will be just as useful as the last one, possibly more; for example, it will give me a rough idea of the relative importance of providing good information sharing guidelines between the major tools and the significant minor ones. This continues to be a challenge for many colleagues and project managers, and the information often ages quickly as the software evolves.

      I would love to see the same poll or a very similar one conducted in different venues. I would not expect the same results, but it would be very interesting to compare the results and the contexts and draw conclusions about the venues used. Perhaps you'd care to take up the challenge?

  3. Nice survey! And I like the immediate results. Well done, Kevin!

  4. All caps, light grey on white, 5 points. Designers only. No people need apply.

    1. Sorry, I don't write Google's CSS, I'm just stuck using it with Blogger software, but I wouldn't dream of defending its ergonomic shortcomings. My eyesight is not great, but I still manage with reading glasses, though the running results at the bottom, being a lighter shade of gray, are an unpleasant challenge.

  5. Hi Kevin. Such a shame that those of us who don't use CAT tools don't have a "0" option for the first question! Though nice to see that we're catered for in the second question.

    1. Major OOPS there. Thank you, Caroline, that helps to explain some past data as well. In the first question I was mostly focused on some ergonomic considerations for those who actually do use CAT tools. With some of the discussions I read, it's hard to tell how widespread the practice of tool-hopping really is, and because this can have some significant implications, I'd like to see just how common it really is.

  6. Good survey, this will be interesting!

  7. 32% other, so it means that some CAT tools are missing from the list, e.g. MetaTexis, Swordfish.

    1. More like 12% at the moment, roughly what it was 3 years ago. Still a significant number, but when I analyze and discuss the data, I'll probably be grouping quite a few of the named options under "Other". For me personally, it's not terribly interesting to know if Tool XYZ has a 3% rate of use and Tool ZYX has 7% and various others have a small share; their cumulative total, however, may indicate strongly the need for standardized project exchange formats such as XLIFF_doc or TIPP. What too many fail to realize is that this is really not a Trados World nor a memoQ one nor a world in which it is really helpful to focus very sharply on a single tool and bludgeon your contractors into using it. My philosophy for more than a dozen years has been that the technology should not stand in the way of matching the right people with the right linguistic and subject matter skills to the project. It is usually not that difficult to adapt to the ergonomic requirements of the translators or editors to do their best work, and forcing them to work with an unfamiliar tool, no matter how "good" that tool may be, can have unfortunate consequences. In a market where good specialists are hard to come by, it's important to be aware of how much one might unnecessarily restrict choices and put quality or delivery times at risk by an unwarranted focus on a single technology platform.

    2. Hi Kevin,

      excellent remark! I could not have worded it better myself. I fully agree with your comment "technology should not stand in the way of matching the right people with the right linguistic and subject matter skills to the project." This is so fundamental because we all know that a fool with a tool remains a fool. Some clients and colleagues tend to forget that a smart translation does not come as a result of pushing a couple of buttons.

      Thanks for the survey, Aniello

  8. Interesting results; I had no idea MemoQ had become THAT popular these days. There is also one dilemma: whether to report that I use Memsource as a CAT tool. The reason is that I use it just to download the project for translation and then to upload it back to the cloud, and MemoQ takes care of the translation and QA part.

    1. Good point to raise. I sort of forgot that's how I've used Ontram in the past. I listed Trados for myself because on occasion I do significant project preparation with that suite of tools, although I refuse to actually translate a single segment in the working environment. Trados filters are an important part of my workflow. If someone else just put SDLXLIFF or SDLPPX files on a server for me to download/upload I would not consider myself an active Trados user, even if I had to log in to some SDL system to grab the files. But it's a matter of judgment - if what you do requires that you actually know something significant about the tool's environment or part of it, then it's probably fair to say that you are a user of that tool.

  9. 1. Are these really "votes" though? I thought it was just a survey. :-)

    2. How will you be using these statistics? Right now, MemoQ is the front-runner, but do you take that to mean that MemoQ is more popular "out there" or simply that most of your blog readers are MemoQ users? And suppose MemoQ gets twice as many votes as anything else, will you be reducing the weight of MemoQ users' votes in the first question by 50%, so as to get a more balanced picture than the one that is now skewed by MemoQ users?

    -- leuce/Samuel

    1. Responses, votes... how boring life would be if people didn't have the opportunity to argue over a choice of words :-) Votes, of course, and whichever tool is elected is the one you will be compelled to use. Enough of this democratic interoperability nonsense!

      If I can't think of a suitably immoral purpose, I'll probably just use the data to get a qualitative idea of where things stand with tool usage among those likely to encounter a discussion here and possibly elsewhere. If I want to discuss a particular tool in some context and its distribution seems very small, then more background may be needed. If there is a greater portion of memoQ users here now that's a contrast to 2010 when I asked more or less the same questions. I had actually forgotten about that survey until I found it again on the OmegaT Wikipedia page as a reference a few days ago. That surprised me, but perhaps there is a shortage of data on the subject of CAT tool usage that drove those editing the page to such an extreme. When one considers the changes in the tools market in the past years, most of the results are qualitatively plausible, but whatever the true picture may be in the wider world, Samuel, don't you think it's useful to have a rough idea of where things stand in the area of one's reach?

      From time to time we get requests from students gathering data for their theses, and I think over the years these have sometimes included CAT tool usage. I don't know that the choices offered were any more comprehensive nor do I know whether the venues in which the surveys were announced were likely to draw a representative sample of professional translators using CAT technology. Perhaps the various large associations have data to share which someone might point us to? Perhaps PrAdZ has a poll or two or ten in this area? It would be interesting to see these data, and it would be equally interesting to hear what better use I might have from them. And if anyone feels sufficiently provoked by my simplistic attempts at information gathering I very much look forward to their more representative data from another survey, which will surely invite much interesting comparison and analysis.

  10. Will "other" include dBase in this survey? Because that's what I ticked.
    Good luck with this survey.
    M.A.T.

  11. Great initiative, Kevin.
    I have a suggestion for a couple more questions, although they might not fall under the scope or purpose of this survey so feel free to ignore them :):
    1) Did you choose your main and/or secondary TEnT or do you feel that you were forced into using them?
    2) Do you use TEnTs although you don't see or understand their value as an independent professional service provider?

    1. Hmmmm. On another occasion perhaps. I do hear that sort of thing often enough, along with other victimized utterances of colleagues who fail to realize that they do have choices and need not act like fallen leaves blown about by the winds of misfortune.

    2. Hi Kevin! In response to "colleagues who fail to realize that they do have choices (etc.)" - I agree, but sometimes it's hard to take action, and not be pushed into doing what you don't want: we tried hard to persuade an agency that we had worked with for a few years to let us continue translating with the tools of our choice (typically, they send us Studio 2011 packages, we translate the files in Deja Vu X2, we QA them in Studio, everyone's happy) - but they insisted we work exclusively with Across for a major customer. We tried, and after 3 weeks we just had to stop, and lost the work. No way round it. We gave them detailed examples of Across "features" that made it inefficient and non-user-friendly (I'm being kind expressing it like that!), but they said it was non-negotiable as they had guaranteed their particular customer that only Across would be used "for reasons of data security". A tough choice to make! Regrets? I have a few ... Duncan

    3. Duncan, although Across is listed among the tools of the questionnaire, anyone seriously involved in translation technology is aware of the fact that it is really not a serious, modern working environment and more of a Satanic plot to return to the Dark Ages of data incompatibility and the ergonomics of Hell. I consider companies who use Across servers as collections of lost or at least endangered souls, and while I might remember them in my prayers from time to time, I draw the line at actually consorting or doing business with them. The German company which produces Across shamelessly represents the worst data practices for incompatibility to be found in the field of translation today, and they sell this to their gullible victims as a "competitive advantage". I turn my back on Across operators with absolutely no regrets. Life's too short to waste on garbage like that :-) Anyone who feels like being ground up into the brand of linguistic sausage with which Across is associated is welcome to stuff themselves on whatever scraps fall from that filthy table. No reason to regret saving yourself from ergonomic torture.

    4. That's fine, Kevin. I assumed that this is more of a plain market-share-oriented survey. However, I think that plain market share statistics are just part of the story, and more data is needed to establish some context. Some colleagues that I know, and a lot more (although I wouldn't necessarily refer to many of those as colleagues), use TEnTs because they were told to and as a way to take advantage of their ignorance. Many of them don't understand the benefits, complain about using these tools and, quite frankly, have no clue about how to use them properly (which accounts for large parts of their grief).

      Across is awful. In my opinion it cannot even be considered a true TEnT because its purpose is allegedly completely different. Any client (that is, any agency) that tries to dictate the technology to be used or take control of one's internal professional work process should be thrown out without even a basic courtesy. The problem is that too many translators, even of the "good" kind, spend too much time in their heads inside their self-created bubble, and they much prefer to keep their heads in the sand. Well, when one's head is stuck in the warm comfort of the sand, guess which body part is left exposed...

      And if you ask me, Duncan (and you didn't :)), you don't have anything to regret and you made the right decision. Sure, losing business is never a good experience in the short term, and yes, even when you know you acted correctly doubts tend to creep in, but the main thing to remember is that a short-term benefit often turns into a long-term pain. For example, just the other day I talked with a colleague of mine who complained that business is slow and that he only gets insulting and ridiculous offers from the lowest of the lowest-tier brokers. He then continued to say that he feels tempted to take some of them because something is better than nothing. I acknowledge the difficulty of the situation, but something is never better than nothing. Something may have a short-term benefit (i.e. paying next month's bills) but cause long-term damage (i.e. it will become harder and harder to pay the bills after that). The same goes for letting clients go for reasons other than the fee alone.

  12. Hi Kevin,

    I also think it's interesting to see what CAT tools translators are currently using. The more people there are who take part in the survey, the more representative it will be and the more revealing the data will be. Ideally, a survey of this kind ought to be done in a number of languages to reach out to as many CAT users as possible. I bet the results would vary! (By the way, I think I'd add Heartsome Translation Suite to the list, which is made in Asia, but is well known elsewhere; your "Other" category contains a number of alternatives, as some readers have already pointed out.)

    Apart from that, wouldn't it also be interesting to find out how many tools we use to pre- and post-process texts that we translate with a CAT tool? I take "CAT/TEnT tools" to mean tools like Verifika, CodeZapper and Xbench as well as memoQ or Studio, for example. But what about software that's not specifically made for that purpose like Notepad, Notepad++, Word, Excel or CSV editors, or even Acrobat Pro to generate a translatable Word file from a PDF file or OmniPage to create one from a scan? These valuable software tools obviously aren't included in a standard definition of "CAT" or "TEnT", are they? But I use them a lot.

    Regards

    Carl

    Amper Translation Service

    1. Well, Carl, I suppose for marketing purposes SDL or Kilgray might want to know how the distribution of CAT tool use falls among Mandarin or Hindi speakers, but if they don't understand English reasonably well, or at least German, it's unlikely that I'll be interacting with them much or sharing information with them. I have enough trouble communicating with the French, and the warnings I received from the translator of my recent article in the SFT journal made it clear that my forms of communication do not translate at all into their language, which might explain why so many of them would go ballistic over little things in the days when I tried to give some assistance on the PrAdZ forums. And as I mentioned, I think that for a minor tool, "other" is as good a classification as any. Look on Wikipedia in the entries for "computer-assisted translation". How many of those tools have you heard of before? Probably half at best. A large share of "other" to me simply emphasizes the need for good data exchange interfaces. It's insane to worry about proprietary formats for small distribution tools. I'm personally inclined to think we should not have to be held hostage by the Dark Lords at SDL and elsewhere with their proprietary formats, but as long as the practical marketing considerations of other toolmakers ensure that much of this proprietary content will be accessible somehow, I'll leave the details of that Holy War to others.

      I do think the staging question is a useful one, and this will depend a lot on the kinds of projects people do and their level of awareness. Many people are simply unaware of what the tools they use can actually do, and if one has a filter, for example for PowerPoint, most are unlikely to compare the performance of the filter in a second tool. The often awful PowerPoint filter in memoQ might be a compelling reason to do just that to cope with excessive line breaks or embedded objects.

      As for the not-uncommon concern many have with the scope of definition for CAT tools, well... if someone is concerned that dBase or Notepad isn't considered a CAT tool by me, I think he may face other, more serious communication challenges in his daily routine ;-) I think instinctively most have the feeling that these TEnTs integrate certain basic functions, and while some of the software you mention might cover some aspects of these, they probably would not be considered integrated working environments by most colleagues with the sort of common sense that doesn't come from a special-interest Advisory. It would make sense to assess auxiliary tool use in some way at some point, however, because these tools do concern a lot of us, and most of those in our field who write about tools discuss them and their possible value. Before I did any sort of a survey like that, however, I would have to decide what my goals are. I'm not an agency looking to fill a database with a wide range of tool info for whatever purpose, so I might instead look at one specific tool category, like OCR tools, and there if I only list 5 and cover 95% of the market by that I'll surely be questioned on why the next 45 that share the all important remaining 5% were left off :-)

  13. Hi Kevin,

    I agree with you when you say: "Many people are simply unaware of what the tools they use can actually do, and if one has a filter, [...] most are unlikely to compare the performance of the filter in a second tool". Perhaps that's something you'd also consider focusing on here or in a short video from time to time - comparing a feature in a small number of well-known CAT tools to show viewers what can be done and what can't, and ultimately which tool does it best. It's knowledge of this kind that's hard for users to come by.

    By the way, what I meant by the tools that can't be defined as CAT tools was simply that they are part of the file-preparation process, too. Again, it would be worth while finding out which tools are used for which purpose and are most popular. All of this data acquisition is time-consuming, but CAT-tool makers would see which pre- and post-processing features they could integrate in their CAT tools. Atril did exactly that with CodeZapper when it developed DVX2, for instance. So data of this kind can be of real value in product development.

    Regards

    Carl

    1. Good point - I think the only time I did that was the PowerPoint video with Trados and memoQ a few months ago, and then only because one particular file was driving me crazy. Not sure if Dominique has done something like that, but his videos have a lot of interesting comparisons. I feel like I'm sitting in a fun and serendipitous master class with a lot of his clips, and while I often take the same themes in a very different direction, that's what I like about seeing others cover this material: most take a different approach, and one can synthesize a more useful personal approach after seeing a few good individual interpretations. The notion that there is one right way to do any of this stuff is usually nonsense.

      I've been trying to get several features for certain kinds of post-processing added to certain CAT tools for years, but in the end I'll probably have to learn how to use watched folders and scripts to handle what I want.

    2. Hi Kevin, hi everyone!

      Just to mention that maybe your experience would be different now. I had a look at your video, and saw the "old" filter used there. The current filter in memoQ does allow you to set what to do with soft breaks, and starting with memoQ 2013 R2, it will no longer even use MS Office, so the preview will be both faster to generate and better looking than the years-old HTML export by Office.

    3. Thanks, Denis. I knew there was a lot in the works with filters for MS Office formats, but I haven't looked at much besides DOCX in the past month, and the change I'm waiting to see there hasn't happened yet. Whatever the actual state of a particular filter for a given tool may be, I think Carl's suggestion of performance comparisons for specific filters in a group of tools is useful. All of us probably have some war story of a particular file which could only pass the filter in one particular tool; given all the crazy variables possible with file options, that's almost inevitable. Sometimes memoQ's filter may be the only working option, sometimes Studio, sometimes OmegaT, another time who knows what? All the more reason to be "armed" with good interoperability strategies for the next step....

  14. 24% of users still using Trados 2007?... And I can personally confirm that - by far the most popular article on my blog keeps on being "How to run Trados 2007 with Word 2010". I wonder how much the dissatisfaction of certain translators with CAT tools in general stems from the fact that they are stuck (for whatever reason) with tools that are obsolete, and that no longer play nice with the newer releases of MS Word.

  15. Hi Kevin,
    Interesting survey, but I think it may be hard to analyse results fairly. I suspect lots of people come to your blog for MemoQ-related issues and along the way they may complete the survey. That would make it clearly biased towards MemoQ users. If I ran the same survey on my blog, I think the results would be quite different;)
    However, the 3-yr comparison with your previous survey will be interesting and I look forward to reading the results.
    Emma

