
Jan 14, 2019

Specialist terminology taxonomies from Cologne Technical University

Click and thou shalt go there!

Early in the last decade, when I lived near Düsseldorf and began translating full time, the nearby technical university in Cologne had an excellent terminology studies program run by Prof. Klaus-Dirk Schmitz, who also had a long history in Saarbrücken back in my exchange student days there. I had the pleasure of meeting this gentleman at various professional events for Passolo (before it was swallowed by SDL) and on other occasions, and I remain impressed by the professional qualities of some of the colleagues he helped to educate. At some point he or one of his students pointed me to an interesting online collection of specialist terminologies created by students at the university as part of their degree work. While student work must be viewed carefully, on the whole I found these collections to be of better quality than quite a few put together by "professionals", and their structured taxonomies were also interesting to people like me who enjoy such things. And occasionally the terminologies were rather helpful for certain technical topics I translate.

But over the years I simply forgot about them for the most part, and when they did come to mind I assumed that the old MultiTerm engine used to handle the data on the site would no longer work. That latter assumption may be partly correct; I found the collection again, noted that the most recent addition to the term library was a bit over a decade ago and that the search functions don't seem to work with Chrome, though I am able to browse the structured taxonomies without difficulty.


Looking through the list of term collections, I saw one that would be particularly useful for a current personal effort: beekeeping. One of my projects for the year ahead is to add some hives to the garden to see if I can improve some of the vegetable, fruit and nut yields. A local Portuguese beekeeper and I have been trading poultry, and he kindly provided me with a copy of his thesis on apiculture and offered assistance to get me started. So I am reading up on the subject in several languages, thinking to put together a good terminology to make cross-referencing the concepts between English, German and Portuguese a little easier.

One thing I had never tried to do before was to extract data from the FH Köln (Cologne Technical University) site into any sort of terminology management tool. I don't think the collections were ever intended to be used that way, and at the time most of them were put together, translation environment tools were much less widely used by professionals and university study programs than they are today. But after a little thought and experimentation, mining the pages proved to be quite simple.

Here's how I did it:
  1. Opened a collection of interest and expanded the folder tree for a particular language completely, then selected and copied all the text in that frame:

  2. Pasted the copied content as plain text (no formatting) into Microsoft Word. The numerical codes were followed directly by the text entries.
  3. Removed parentheses by searching and replacing with nothing.
  4. Inserted a tab between the number codes and the text entries using search and replace with wildcards (regex of a sort).

  5. Switched to the other language in the term collection and repeated steps 1 through 4.
  6. Transferred the contents to Excel (various ways to do this).
  7. Imported the Excel file with the specialist terms into a term base in my translation environment tool of choice.
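
For anyone who would rather script steps 2 through 6 than do them by hand in Word and Excel, here is a minimal sketch in Python. It is not what I actually did, and it rests on an assumption about the copied text: that each line starts with a numeric code (digits, optionally separated by dots) followed directly by the term, sometimes wrapped in parentheses. The function names, the regular expression and the sample terms are all made up for illustration; adjust the pattern to whatever the copied tree really looks like.

```python
import csv
import io
import re

# Assumed line shape (not verified against the site): a numeric code such as
# "1" or "1.2", followed directly by the term, which must not start with a
# digit, a dot or whitespace.
CODE_THEN_TEXT = re.compile(r"^\s*(\d+(?:\.\d+)*)\s*([^\d.\s].*?)\s*$")


def parse_tree_text(pasted: str) -> dict[str, str]:
    """Map each numeric code to its term for one language."""
    entries = {}
    for line in pasted.splitlines():
        # Step 3: remove parentheses by replacing them with nothing.
        cleaned = line.replace("(", "").replace(")", "")
        # Step 4: split the code from the text that follows it.
        match = CODE_THEN_TEXT.match(cleaned)
        if match:
            code, term = match.groups()
            entries[code] = term
    return entries


def merge_languages(source: dict[str, str], target: dict[str, str]):
    """Pair up terms that share the same code in both languages."""
    return [(code, term, target[code])
            for code, term in source.items() if code in target]


def to_tsv(rows, header) -> str:
    """Render the rows as tab-separated text, ready for Excel or a term base import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample lines standing in for the text copied from each language frame.
    german = parse_tree_text("1Bienenhaltung\n1.1(Bienenstock)\n1.2 Königin\n")
    english = parse_tree_text("1Beekeeping\n1.1(Hive)\n1.2 Queen\n")
    print(to_tsv(merge_languages(german, english), ("Code", "German", "English")))
```

Matching the two languages on the numeric code is the one real design choice here: it relies on both language trees using the same codes for the same concepts, which is worth spot-checking before importing anything.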



