Most of us in web dev are familiar with making CMS-based websites, but adding languages is only an occasional task. The good news is that multiple languages don't have to be very hard. The bad news is that each added language increases the complexity of content management, and translations done poorly are a budget killer. Most CMS platforms, like Drupal, WordPress, and ExpressionEngine, can handle translations in some capacity. When I did my research I found that my preferred CMS, Apostrophe 1.5, did a good job, and I was happy with the results. However, my best advice is to pick a good team and do what they suggest... after verifying, of course. If you're looking for an experienced team with competitive U.S. rates, please contact me to discuss. Even if I can't help you, I'll probably be able to point you to someone who can.
Web Site Translation Terms:
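- I18n: short for internationalization (18 letters between the "i" and the "n"). It means building the site so it can support multiple languages, which in practice mostly means translations
- L10n: short for localization (10 letters between the "l" and the "n"). It means changing the content or product mix based on the culture chosen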
Web Site Translation Gotchas:
- The Russian language is the Ivan Drago (Rocky IV) to your layout. It's going to come in and blow stuff out just because it's bigger.
- Every language is likely to need some tweaking. Mitigate this with a layout that has a high tolerance for variable-length content.
- Facilitate language-specific changes by adding the language code to the body tag's class. This allows CSS rules to be written for specific languages. Example: <body class="sinatra en"> OR <body class="sinatra ru">
- Each alphabet is going to need its own set of fonts. Very few fonts have every character of every alphabet. Seriously, don't ignore this. If you do, you'll be wondering why everything looks different in IE9, and I'll forward you this URL and be a jerk about it. Examples of different alphabets would be Cyrillic for Russian, Greek, and every Asian variant of characters.
- A good reference for Chinese Fonts: "... recommendation for a sans-serif font stack is: font-family: arial, 黑体, 微软雅黑, 宋体, sans-serif;"
- Images:
  - Avoid text in images
  - Avoid images in the CSS files that change based on language
  - Depending on the CMS, image captions may be difficult to translate
  - Often images will need to be re-input into the CMS for each language; this is not fun
- Video:
  - Avoid embedded text; it's expensive to re-edit videos
  - Make sure you plan on translating any voice-overs
  - You may find that services like YouTube.com & Vimeo.com are not available in all countries, like China
- Arabic & other right-to-left languages are tough. If you don't need them, call that out to help lower your estimate.
- Start looking for a translator service ASAP, they can be a PITA, perhaps worse than developers ;)
- Try to get your translators to use the CMS for translations, that way they are more likely to fit the space provided & context. Plus you don't have to migrate the content from one format to another and deal with the inevitable human errors in languages you don't know.
- Make sure each language has its own URL and that the language metadata is set correctly. That way search engines, like Google, will index the site correctly.
- You'll need to decide if each translation is representing a language, country, or culture. This is so you can use the correct ISO code in your URLs and database metadata. ISO code resources are linked inline above.
  - Language codes (ISO 639, for example en or ru) identify the language only, regardless of country
  - Country codes (ISO 3166, for example US or GB) are just for the country and imply a site that also has its content tailored to the country
  - Cultures are a combination of language & country (British English, American English). You may prefer using cultures if you'll have different content for different regions using the same language; this is L10n.
- Content change monitoring across languages is going to be tough. Most CMSs don't log when content is changed, so you have to keep track of where each language is yourself, for example when you've changed the English version but not the Spanish version.
- Most of your problems are going to be in CSS. A frontend developer experienced in multiple-language websites will help the project go smoothly.
- Testing is a MUCH bigger problem. Instead of all pages in all browsers, it's all pages in all browsers in ALL LANGUAGES. And the types of issues that come up require eyeballs & discipline to find.
- Avoid flags or other symbols of nationality, and make sure to respect things like Simplified & Traditional Chinese, which imply mainland China or Taiwan. Both sides may have hurt feelings when you mix them up, so don't be rude.
- Google Translate is not very reliable, but it looks good in a pinch.
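The language, country, and culture codes from the list above tend to show up in three places at once: the body class, the URL, and the language metadata. Here is a minimal Python sketch of keeping them in sync from a single tag. The function names, the example.com URLs, and the URL layout are all made up for illustration, and it assumes tags are either a bare language ("ru") or language plus country ("en-GB"). The hreflang link is one common way to set language metadata, not the only one, and your CMS may already handle it.

```python
from typing import List, Optional, Tuple


def parse_locale(tag: str) -> Tuple[str, Optional[str]]:
    """Split 'en-GB' (or 'en_GB') into ('en', 'GB'); a bare 'ru' gives ('ru', None).

    Assumption: only language or language-country tags are passed in.
    Script subtags such as 'zh-Hans' are NOT handled by this sketch.
    """
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return language, region


def body_class(site_class: str, tag: str) -> str:
    """Build the body class, e.g. 'sinatra ru' or 'sinatra en en-gb'."""
    language, region = parse_locale(tag)
    classes = [site_class, language]
    if region:
        # Extra culture class so CSS can target British vs. American English.
        classes.append(f"{language}-{region.lower()}")
    return " ".join(classes)


def alternate_links(base_url: str, path: str, tags: List[str]) -> List[str]:
    """One <link rel="alternate" hreflang=...> per language version of a page.

    Hypothetical URL layout: each language lives under /<code>/.
    """
    links = []
    for tag in tags:
        language, region = parse_locale(tag)
        code = f"{language}-{region}" if region else language
        href = f"{base_url}/{code.lower()}{path}"
        links.append(f'<link rel="alternate" hreflang="{code}" href="{href}">')
    return links


print(body_class("sinatra", "ru"))
for link in alternate_links("https://example.com", "/about", ["en-GB", "ru"]):
    print(link)
```

Deriving everything from one tag means the CSS hooks, the URLs, and what search engines see can't drift apart when a language gets added late in the project.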
All these things will cause you pain and suffering if you are not prepared for them.
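For the content change monitoring problem in particular, even a low-tech check helps. The sketch below assumes you can record or export a last-edited date per language for each page (many CMSs won't give you this for free, so it may be a spreadsheet someone maintains). The function name and the sample dates are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def stale_translations(last_edited: Dict[str, datetime], source_lang: str = "en") -> List[str]:
    """Return the languages whose copy was last edited before the source language's copy."""
    source_time = last_edited[source_lang]
    return sorted(
        lang
        for lang, edited in last_edited.items()
        if lang != source_lang and edited < source_time
    )


# Hypothetical edit dates for a single page.
page = {
    "en": datetime(2024, 3, 10),
    "es": datetime(2024, 3, 1),   # older than English, needs another look
    "ru": datetime(2024, 3, 12),  # edited after English, treated as current
}

print(stale_translations(page))  # ['es']
```

This only tells you a translation is older than the source, not whether the change actually mattered, so it still takes a human to decide what to send back to the translators.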
Got more tips? Please post them in the comments, and thanks :)
Thanks to Betsey Kershaw of Sugarbeet Creative for commiseration on these hard knocks & the headline.