Most legal translators rely on glossaries and word lists, but this approach gets awful results and creates documents barely intelligible to lawyers. Such legal translators are incapable of producing anything of value, and clients are best served by avoiding them. Nonetheless, there is hope for people needing accurate legal translation services, since a variety of law databases are available that legal translators can use to determine the correct choice of language. In this article, I will explain the problem with glossaries and describe how the SEC EDGAR database can be used, drawing on my experience with Mandarin Chinese to English translation projects.
Glossaries are Useless for Legal Translation
While glossaries are used throughout the translation industry to ensure consistency across a translation, reliance on them harms a legal translation for several reasons. The first and primary reason is that a glossary is based on the concept of one-to-one relationships, where a translator mindlessly substitutes one word for another. However, terminology describing social systems rarely organizes meaning at the word level, and there is often no one-to-one relationship between different countries’ legal systems and languages. A good example is the Mandarin Chinese vocabulary that corresponds to the word “cousin.” English has only one word; legal English can optionally add qualifiers such as first cousin or second cousin once removed. In Mandarin Chinese, these kinds of distinctions are mandatory even in colloquial language, and there are separate words based on a cousin’s gender, age, and whether they are on the paternal or maternal side. There is no one-to-one relationship; someone relying on a glossary will get the order of succession backwards.
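The many-to-one relationship in the cousin example can be made concrete with a small sketch. The table below is a hand-built, illustrative list of common Mandarin cousin terms, not an excerpt from any particular glossary, and the names `COUSIN_TERMS`, `ZH_TO_EN`, and `invert` are hypothetical.

```python
from collections import defaultdict

# Illustrative only: common Mandarin terms that all collapse to "cousin".
# Each value records distinctions the single English word drops:
# (side of the family, gender, age relative to the speaker).
# "tang" terms cover children of the father's brothers; "biao" terms
# cover the remaining cousins.
COUSIN_TERMS = {
    "堂兄": ("father's brother's child", "male", "older"),
    "堂弟": ("father's brother's child", "male", "younger"),
    "堂姐": ("father's brother's child", "female", "older"),
    "堂妹": ("father's brother's child", "female", "younger"),
    "表哥": ("other cousin", "male", "older"),
    "表弟": ("other cousin", "male", "younger"),
    "表姐": ("other cousin", "female", "older"),
    "表妹": ("other cousin", "female", "younger"),
}

# A glossary-style Chinese-to-English table: every term maps to one word.
ZH_TO_EN = {term: "cousin" for term in COUSIN_TERMS}


def invert(glossary):
    """Turn a source-to-target glossary into target-to-sources lists."""
    reverse = defaultdict(list)
    for source, target in glossary.items():
        reverse[target].append(source)
    return dict(reverse)


EN_TO_ZH = invert(ZH_TO_EN)

# One English word, eight candidate Chinese terms: the glossary alone
# cannot say which one the document needs.
print(len(EN_TO_ZH["cousin"]))  # 8
```

Going from Chinese to English, the table silently discards side, gender, and age. Going the other way, it offers eight candidates with no basis for choosing among them.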
Another good example of how glossaries can cause problems is the translation of the Chinese terms for a Limited Liability Company and a Corporation. Both are commonly translated as “Company Limited” or “Co., Ltd.” This causes huge problems for anyone actually using the translation, such huge problems, in fact, that the United States sent a delegation to China to discuss which entity is a Corporation and which is a Limited Liability Company. The results of that agreement are codified in federal tax regulations, which define the Gufen Youxian Gongsi as a Corporation and allow the Youxian Zeren Gongsi to request Limited Liability Company tax treatment, that is, pass-through taxation. Translators relying on glossaries originally translated the Chinese LLC word-for-word as “Limited Responsibility Company,” because the word Zeren can mean liability in some contexts, such as legal ones, and accountability or responsibility in others. Reliance on glossaries caused earlier translators to miss that the Chinese Limited Liability Company was intended to parallel the American LLC, and that its name was even a word-for-word translation of “LLC” into Mandarin Chinese.
The diplomatic solution was to introduce a regulation requiring government officials in the US to ignore the translation; if an accountant gets the treatment of shares held in a foreign Corporation wrong due to a translation, the IRS is still entitled to impose penalties. This should be especially worrying for clients since the federal government has repeatedly used criminal tax evasion prosecutions as part of the China Initiative. The federal government has also used entity characterization similar to that described above in its case against Huawei’s Meng Wanzhou, who was held in Canada for nearly three years, all because a translator didn’t know to characterize two entities as being in a “partnership,” referring to it instead as a “cooperation.” By contrast, Tencent, which at the time hired native English translators using a non-glossary method, as documented in Laura Brown’s excellent online portfolio, translated that same word as “partnership” throughout its documents and survived persecution in federal courts by Donald Trump.
I would theorize that half of Huawei’s revenue was wiped off the books due to their reliance on glossary-based translators at the time. Huawei responded to this concern, which I raised in 2018, by releasing a video showing interviews with its new translation team demonstrating non-glossary reliant translation approaches. The glossary-reliant translator will most certainly be headed for extinction, and organizations that use glossary-reliant translators have already started melting under international pressure.
For similar reasons, parallel corpora of Chinese legal translations available online are generally unreliable and should be avoided. Most of these were developed based on a glossary and are thus inaccurate, leading to the same problems. Regardless of the quality of the corpora, they will negatively impact translation service quality. Any translator attempting to use such corpora will generally only be able to see one-to-one relationships between words, failing to understand the numerous other choices that go into producing language.
The job of a legal translator is not to rely on “pre-fabricated” translations already done by someone else. The translation service has value because the translator is able to analyze and determine which sentences have a close correspondence to the original. So, if the translation service requested was so simple that it could be provided purely with parallel corpora and glossary lookup, then it would be a job for a machine translation tool. However, machine translation tools are not yet accurate enough for legal documents and rely on bias prediction to provide the illusion of accuracy. An analysis is still required, and this is where databases come into play.
SEC EDGAR Database
Corporate transactional documents make up the bulk of a Chinese legal translator’s work, but the discourse community of corporate lawyers primarily operates behind closed doors in conference rooms and is protected by legal privilege. How can a legal translator even gain basic familiarity with the language of corporate lawyers? You could, of course, get a law degree and practice corporate law first, but this path is not feasible for everyone, and it will not address every question about transactional legal language, especially points still emphasized in Chinese law but no longer major topics in American law, such as statutory representation. Fortunately, the SEC has opened some of these doors by requiring certain corporate documents to be filed and displayed online, and they cover the full spectrum of the relevant topics.
Most Chinese translators working with the SEC database for the first time will attempt to force it to produce answers that conform to what available online glossaries say. In over 100 desktop recordings of novices learning the technique, over 90% of novice translators, even when told explicitly not to, took a Chinese to English glossary entry and simply ran a keyword search of the SEC database for that entry. This inevitably surfaces a document translated from Chinese to English using that keyword, and the novice translator will treat a document that was obviously nothing more than Google Translate output as an authority on the English language when, in reality, no native English documents use that glossary’s terminology.
When translating a contract, it’s virtually impossible that searching for terms from a translation glossary or parallel corpora will take you to an SEC-filed contract having the relevant provision and language. This is because said glossaries and parallel corpora are written in a fake language only superficially related to English. This fictional language was developed many years ago to justify employing incompetent translators; producing a façade of competence was enough to justify translators’ paychecks, and the practice continues to this very day.
If you do not already know the relevant language, finding anything useful in the SEC database will generally require looking up a document in the same genre, for example, a lease agreement, and looking for language that means the same thing as the original. For example, parallel corpora will often refer to a 30-story office tower as a “house.” A native English contract will call this a “building” and a space within a building a “premises.” The parallel corpora will refer to taking a sledgehammer to a retail shop’s interior walls as “decoration.” In contrast, a native English contract will call this the “build-out” process, which involves removing all previous tenant fixtures and building a completely new concept. A novice translator will attempt to use a glossary to run keyword searches on the SEC database and get translated documents about decorating a house, which is illogical because the process involves sledgehammers.
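Once a same-genre document is in hand, one rough way to check candidate wording is to count how often each term actually appears in it. The sketch below is only an illustration: `count_terms` is a hypothetical helper, and `SAMPLE_LEASE` is an invented excerpt written for this example, not text from a real filing. In practice you would paste in the text of a native English lease found in the SEC database.

```python
import re


def count_terms(text, candidates):
    """Count whole-word, case-insensitive occurrences of each candidate term."""
    counts = {}
    for term in candidates:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Invented excerpt for illustration only; not taken from any actual filing.
SAMPLE_LEASE = (
    "Landlord leases to Tenant the Premises located on the tenth floor of the "
    "Building. Tenant shall complete the build-out of the Premises at its own "
    "expense before opening for business."
)

# Glossary-style candidates first, then the native English terms.
CANDIDATES = ["house", "decoration", "building", "premises", "build-out"]

counts = count_terms(SAMPLE_LEASE, CANDIDATES)
for term, n in counts.items():
    print(f"{term}: {n}")
```

A count of zero for a glossary term in native English documents of the right genre is a signal to stop searching for that term and start reading for the provision that does the same legal work.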
Thus, mindset is the biggest challenge for translators using the SEC database. Looking up equivalent legal provisions to determine how English describes various legal concepts is, in itself, not complicated at all. What is complicated for legal translators is forgoing the attachment to word lists and glossaries offering pre-packaged answers. This is hard for people who were once foreign language students because most foreign language programs still revolve around bilingual word list memorization as a crutch. Most legal translators never outgrow the techniques they learned in language school. After learning a form of pidgin English, they continue using the same techniques to produce more broken English in their translations.
In addition to the SEC database, other legal databases can also be searched, in particular Cornell’s database of federal regulations, and several other platforms can be used for case searches. Those will be covered in a future post. In this article, we have seen how the glossary method for legal translation generally produces extremely bad results that mislead readers because they do not accurately reflect how the language is used. For everyday legal documents translated to English, the correct English terms and equivalent language can be identified by simply looking up equivalent legal provisions appearing in native English documents.