
Finding Translations: Localization and Internationalization

Formal Metadata

Title
Finding Translations: Localization and Internationalization
Title of Series
Part Number
34
Number of Parts
89
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Translation, be it a word, sentence, concept, or idea, for different audiences has always been a challenge. This talk tackles problems of translation, especially those that tend to crop up in building software. We'll dive into the eminently practical—how to design apps for easier localization, common pitfalls, solutions for managing translations, approaches to version control with translations—and the more subjective—possible impacts of cultural differences, and what makes a "good" translation.
Transcript: English (auto-generated)
We're at 10:50, so I will go ahead and get started. I'm going to be giving this talk on localisation, translation, and internationalisation in your Rails apps. It's cleverly entitled "Finding
Translations", although someone yesterday suggested the alternate title of many, many
for this talk as well. We will start out by defining our terms. Localisation is the process of adapting internationalised software for a specific region or language by adding the locale-specific components and actually translating the text. Internationalisation refers to the process of actually setting up your app in such a way that it can be
translated. We will be talking mostly about how to set yourself up for success when you are designing your app. Even if you're not planning on having
it translated right off the bat, some of these processes will be good things to have in mind. In any case, thanks, Wikipedia, for those definitions. Let's get rolling. In plain English, we're talking about translating and, more importantly, getting ready to translate your app. So, what are we talking about when we talk about translation?
There's the most straightforward type, which most people are going to be thinking about when I say the word translation: English to French, French to Arabic, English to German, Spanish to Cantonese, et cetera. Another thing that is important in this translation
process, though, is what country you're actually talking to. So, are you talking to British consumers? Are you talking to American consumers? Are you talking to Portuguese consumers? Are you talking to Brazilian consumers? Those populations are going to speak the same language
but are not necessarily going to have the same content be delivered to them. Another thing you want to be aware of is the register, whether that's formal or informal, professional or like your Twitter feed, AP style or MLA style, I guess, things like formatting,
so on and so forth. We're mostly going to be talking about the first type, but you should always have in mind the second type. When you're doing a translation, you should be thinking about what country the people using that language are going to be coming from. Things
that will be relevant to that are things like units of measurement, government legal terms, date formatting, so on and so forth. The third type is not something we're going to talk about really, but your translators should be aware of where you're coming from in terms
of the register that you want to convey in your app, so that they can write to the same kind of audience. I could also see using these same tools to transpose between registers, so say you had a kids and parents version of your site, you could use some
of these I18n conventions to do that as well. I've never done it, but it could be fun. So who am I? Points if you get this joke, 24601. I'm Valerie Woolard. On Twitter,
I'm @valeriecodes, and you should tweet this talk because @-mentions are my lifeblood. I'm a Rails developer at Panoply, which is a podcasting platform. Yes, podcasts; we're part of Slate, and if you want to geek out about podcasts, please find me after the talk, and I
will give you all my recommendations and take all of your recommendations, and my subscription list will continue to balloon uncontrollably. I have interests in linguistics, translation,
and language studies. Prior to becoming a developer, I was a French major, and I studied cognitive science with an emphasis in linguistics, so I've done a bit of translating in an academic setting. I was hoping I could offer some insight to folks who are getting ready to translate things but are not as familiar
with the translation process. So, without any further ado: so you want to localise your app. You might feel like this dog, especially if you're
not familiar with a lot of the pitfalls that can come along in this process. So, when should you think about localisation? You should think about it now. Even if you don't actually foresee a future in which you want to have localised versions of your app,
if you think about your possible audience, it is probably not just US-based English speakers. If you think about just the number of languages that are spoken just in the United States, if you're limiting yourself only to English speakers, you are really limiting yourself. So
it's something that should be on your radar as a possible thing that might come down the line. Even if you never localise it, you won't be hurt by using some of these conventions, and they can give you other wins in your development process. So, when should you internationalise?
So you want to be thinking about this again before you need to. For example, don't hard-code strings into your views. This is a very easy win. You can use locale keys in your
views, have those reference a YAML file, and that way, you can have all of the copy for your app separated out from the actual code. So, say if you have someone who is not a developer who wants to make changes to copy, you don't have to have them dig through the code and make
those changes. You can have them edit a YAML file which is probably going to be much easier for everyone involved. So there are lots of built-in tools for Rails localisation. There's this i18n guide which goes over the basics of how to use those tools, how
to create keys that you then reference in your app. The default set-up for Rails localisation is to have a YAML file, as I've alluded to a couple of times, where you will have
a key and a string, and, when you reference that key, in your application, that string will be pulled from the YAML file. And then you will have YAML files for each locale, so you will have a French one, and an English one, and so forth, and, based on the locale
setting on your app, the correct string will get pulled in. And there are a lot of things built in for you. There's I18n.t, which is translate,
and that refers more to pulling in the correct strings, and there's also I18n.l, which is localise, and that refers more to, as I was mentioning, date and time formatting, things like that, that will be related to where the person is from. Okay, so you've got some YAML files with
some strings in them. This can get annoying really fast, especially if you have, say, maybe hundreds of strings, or thousands of strings, or tens of thousands of strings in your app. Depending on how complex your app is, these files can get really unmanageable.
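The default set-up described above (one YAML file per locale, with keys that map to strings) can be sketched in plain Ruby. This is only an illustration of the idea, not the Rails I18n implementation: the `lookup` helper, the key names, and the sample strings are all made up for this example, and the two locale files are inlined as strings so the sketch runs on its own. In a Rails app the files would live under config/locales and the view would call the built-in `t` helper instead.

```ruby
require "yaml"

# Hypothetical locale files, inlined so the sketch is self-contained.
# In a Rails app these would be config/locales/en.yml and fr.yml.
EN_YAML = <<~YAML
  en:
    welcome:
      title: "Welcome to the podcast directory"
      subscribe: "Subscribe"
YAML

FR_YAML = <<~YAML
  fr:
    welcome:
      title: "Bienvenue dans l'annuaire des podcasts"
      subscribe: "S'abonner"
YAML

# Merge every locale file into one hash keyed by locale name.
TRANSLATIONS = [EN_YAML, FR_YAML].map { |src| YAML.safe_load(src) }.reduce({}, :merge)

# Look up a dotted key such as "welcome.title" for the given locale.
# A missing key returns a visible placeholder (an illustrative choice).
def lookup(key, locale: "en")
  path = [locale, *key.split(".")]
  TRANSLATIONS.dig(*path) || "translation missing: #{path.join('.')}"
end

puts lookup("welcome.title")
puts lookup("welcome.title", locale: "fr")
puts lookup("welcome.logout", locale: "fr")
```

Because the view only ever mentions the key, someone editing copy touches the YAML and never the code, which is the win described earlier in the talk.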
So you should be thinking about whether this is something that is practical for you and for your app and your organisation. One way you can maybe make it a little easier on yourself is to customise these YAML files to be per feature, or something like that,
so that you don't have one single YAML file that is storing every single string that you use. At the same time, this is still probably not something that you want to use for a super, super complex app. Here are some helpful gems for your
localisation journey. One of them is rails-i18n. This provides tons of translations in different languages and locales for the errors that are baked into
Rails, Active Record messages, default date formatting, things like that, so that you don't have to waste your time doing those things. That is super helpful for getting your localisation off the ground. Localeapp is a gem that provides a web
interface for storing translations that translators can log into. It is a paid service. I'm not sure if there is a free tier, but something worth looking into if you're looking to add translations. It is also tied to paid translation services, so you can pay
someone from there to actually go in and translate all your strings for you. Globalize is a tool you're going to want to use if you're adding translations to your Active Record models. So, if you want to localise attributes on a model, say
you have a blog, and your posts are stored as Active Record models, and you want to have alternate versions of your blog posts in different languages, this would be the tool that you want to use for that. The geocoder gem also does a lot of other things, but one thing that it will let you do is
set a locale based on a user's IP address. So that is also super helpful. i18n-tasks is a gem that will go through and report keys in your YAML files that are
missing or unused. It will optionally remove unused keys, and it can also pre-fill missing keys from Google Translate if you want to play it fast and loose. One possible work-around for this YAML nightmare that we've described is proposed in this
RailsCasts episode that you can look into. It provides a sort of framework for a Redis-based back-end for the locale keys. Another possibility would be an Active
Record or other database-backed back-end. The thing that you want to keep in mind, though, is that there are probably going to be dozens of these keys loaded on every page, so they will need to be accessed all of the time,
and so an in-memory store or some sort of cache is probably going to be preferable to having to do a database lookup at every time a key is referenced. If you decide to just stick with the YAML, you can edit it the usual way, maybe in a graphical
YAML editor. One thing that you might want to keep in mind when you're thinking about how exactly you want this translation back-end to work is that the people who are doing your translations who are going to be entering these keys are not necessarily going to be
developers. So if you've got people from the marketing department, if you've got professional translators, you probably don't want to tell them to boot up Sublime Text and write some keys in. You probably want to provide some sort of GUI or graphical interface for them to make their lives a little bit easier. So these are the things that I feel like
are most important to consider when you're talking about how you're going to localise something. First, you need to know what needs to be translated, kind of the scope of is this a 10,000-line project, is this a 100-line project? Do we need to hire
a professional translator because we need really polished translations, or will a Google translate situation be adequate to our needs? Do we need to translate the attributes of a model?
How long are the strings that we want to translate, and how readable will they be in something like a YAML file? Are there special characters that you will need to check that your database or data store of choice supports? And what information,
in addition to just those strings, is it helpful to provide to your translators? So what tools do they need to give you a really good translation? Maybe some contextual information; maybe a nice GUI, so that they don't have to edit things in Sublime Text
and push them to GitHub, is the best solution there. So this is something that I've come across in looking over apps that people have tried to localise. Anytime you
are concatenating, say, locale keys together to form a single sentence, that's probably a point at which you need to look at your life and look at your choices and find a way not to do that. The reason is the first in my parade of foolish assumptions:
that fragments can be translated with any accuracy. Why that fails is probably clear to you if you've studied a foreign language: syntax differs between languages. The subject-verb-object ordering may be completely different. The verb
may go at the end of the sentence. The context of the full sentence may be needed for conjugation or for gendering of nouns, things like that. Instead, what you will want to do is use a variable in a full sentence; there's an example here. If you need to
pass in the name of a column where you have an error or something like that, or a proper noun, I18n in Rails supports this passing of variables, so you
can just pass in a key and a variable, and the variable will get dropped into that string like so. So this one has the errors variable, and that way, you can provide your translator with the full sentence and context in addition to the variable that will just be replaced.
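To make the difference from concatenation concrete, here is a small plain-Ruby sketch. It assumes two hypothetical strings and a made-up `render_message` helper. It relies on Ruby's built-in `format`, which understands the same `%{name}` placeholder style that Rails locale files use, so it is a stand-in for passing a variable to `t`, not the Rails code itself.

```ruby
# Hypothetical full-sentence strings, one per locale, each with a %{name}
# placeholder. The translator sees the whole sentence and decides where
# the variable goes.
MESSAGES = {
  "en" => "%{name} sent you a message",
  "fr" => "Vous avez un nouveau message de %{name}"
}.freeze

# Fill the placeholders in the whole sentence for the given locale.
def render_message(locale, values)
  format(MESSAGES.fetch(locale), values)
end

puts render_message("en", name: "Valerie")
puts render_message("fr", name: "Valerie")
```

In the English string the name comes first, and in the French one it comes last. Building the sentence as the name plus a translated fragment would force the English word order onto every language.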
Another pitfall is assuming that pluralisation works the same in other languages. I included a link to this very thorough sort of survey of pluralisation rules in tons of different languages, but the gist of it is that, in English, we generally have the same pluralisation
for zero and more than one thing, kind of the same grammatical structure, and then a separate pluralisation rule if there's one thing. Other languages do not necessarily do this in the same way. They may make different distinctions where zero things follows
one rule, one thing follows another rule, and more than one thing follows a third rule. So don't hard-code these strings. Instead, I18n provides you with a very useful count variable: you can define different keys for one, other, and zero, and
pass an integer in as a count, and, based on the value of that integer, the string under the one, other, or zero key will be dropped into your view. Another thing to be aware of when you're
translating is that other languages are not necessarily going to use the same level of specificity. You may need to provide more information to your translator than an English string will provide. Things to be aware of here include gender and, as we talked about,
register: they will need to know whether you're hoping to address your users in a more formal or informal register. Also, there may be more specific words in the other language, so they will need to know exactly what you're talking
about. So, for example, in Korean, there are multiple words for the English word "in", one to denote a snug fit in a container, one to denote a loose fit. So these are just things that you should know, things that your translator might need to be aware
of. Another thing to know is that a message cannot necessarily be conveyed in another language in the same physical space as it can in English. In languages that
use other character sets, the messages may be much shorter. You may have something that takes one line in English and two lines in French. A general rule that I followed when I was translating French was that it always seems to take more words and more
characters to say the same thing in French than it does in English. For that reason, you'll need to think about what you want to do if you need to fit more characters in a space, or if something will look weird if you have a much shorter string than the
English one. Do you want to shrink the text down in certain situations? You probably want to avoid fixed height or width containers unless you're going to test whether that string fits in the fixed height or width container in every language that you support. So this
is something to be aware of when writing CSS and doing more of the front-end work. Another scary thing: the text may not always be left to right, and there are a lot of design implications here. These are screenshots from the BBC's website, one from their Arabic
website and one from their North African French website. So you'll notice it's not just the text that's flipped, it's also the logo and some of the design elements, the search
bar is on the other side, because the idea is that the eye will gravitate toward the right side rather than the left, so there are a lot of other design implications beyond just flipping the alignment of the text in the container. Your character set will not
always be the Roman alphabet necessarily, so you should make sure that other character sets are supported if you're planning to support non-Roman alphabet languages. Okay, so you've got all this. You have all of these pitfalls in mind. You've got
these wonderful YAML files that are filled with wonderful translations that people are making updates to all of the time, and you have merge conflicts constantly because everything
is in a YAML file and everyone is editing it, and it's terrible. You're very likely to get merge conflicts, especially if everything is in one place. And because the people doing your translating and adding your copy are not necessarily developers, you probably
don't want everyone who writes copy to have access to, say, your code repo. So, some things you can do here: you can break up keys based on functionality, based on features. You can
make your YAML files a submodule of your main repo so that you can give people access to just the YAML files rather than giving them access to the whole repo. You can have a database store that people edit, which is then pulled into a YAML file in some way, or you
can just have a database back-end for your translation keys, or you can do something like which is a web-based interface where people can edit the keys, and that can sort of serve as your external source of truth, so you don't have to trust those YAML files that are in your
repo. You can just pull from that source and trust that everything is fine. The concerns that you will have in mind, again, are ease of use based on who is doing your translations. You also probably want some sort of audit trail in case someone accidentally deletes your English YAML
file. You want to have some rollback plan for a catastrophe scenario, so whether that's doing a daily database back-up, whether that's keeping things in version control in some way,
shape, or form, those are things that you should have on your radar. So, another thing that I sort of want to touch on is why do you want to translate your app, which is not to say that I think it's a bad idea. I think it's great to support speakers of other languages
to make our technology accessible in other places to non-English speakers, but when you are deciding to internationalise for a given region, for a given language, there are things
that you want to think about before making that choice, before deciding who you want to translate your app for. An example is other countries have different norms around privacy. In general, I would say in the US, we're probably more fast and loose with some of our private information
than in, say, Europe. There are things that may be considered more taboo to share, such as religion. People are maybe more likely to be sensitive about
tracking, or about their physical location information being shared. So if your app uses those things, you should think about how that's going to be perceived in the language group, the country, and the culture that you're adapting it for. There may
also be legal issues if you're actually targeting another country's market. There are things like Do Not Track in Europe. There are issues around, say, copyrighted information and the trouble that you can get into for sharing copyrighted information or copyrighted content in other
countries. There are issues of defamation. The US actually, I believe, has one of the more lenient stances towards defamatory content. So if there's a chance that defamatory content can be
included or disseminated through your app, be aware of the potential legal repercussions that you can face in any country that you're expanding to. Another issue is that the same
needs may not exist in that place. So if your app centres around something related to, say, the US healthcare market, that's probably not a thing that's going to translate in other countries. So maybe that leads you to the decision to translate your app for US-based
Spanish speakers but not for Mexico-based Spanish speakers, for example. So my hope is that from this talk, you take away the following things. One of them is to think about
localisation now rather than later, and hopefully there are things that you can win even if you don't actually decide to do any translating. When you're translating something,
be aware of the quality of translation that you need, and if you want a quality translation, your translators should have a good understanding of your app, of how it works, of who it's intended for, of what it's intended to do. Also, translation is hard. It's a hard problem that I don't think there are lots of easy solutions to, and it's something that, if you care
about having a quality product for non-English speakers, you should invest your time in thinking about. And also, know your audience when you're deciding who you want to translate
for. And with that, I'm going to be hanging out if anyone has questions. And then there's this slide, which is just all of my information. If anyone wants to get in touch, tweet at me,
look at my GitHub, look at my website. I've got valerie.codes, which I think is awesome. And thank you so much for coming.