What is localization testing and how do you get it done?

Today, we are going to explain how localization testing works, why it should be done, and what happens to the text in the process. Elizaveta Morgacheva, senior project manager here at INLINGO, will tell us all about it.

And now, without further ado, what does localization testing actually check? The answer is a short list of sub-types of localization testing:


Most of the time, testing checks the text: does it convey the correct meaning and style, and is it spelled correctly? We also check how natural the translation sounds and whether the words match the images. For example, a word with multiple meanings, like “bat”, may be a baseball bat in the translation but a vampire bat in the image (this can happen with any homonym, especially if the translator does not have pictures or descriptions on hand).


Here we check how the text is integrated into the game: the position of the words, the readability and consistency of fonts, and so on. We also check that the fonts suit the given languages (for example, some fonts don’t support diacritic marks; our job is to spot that problem and suggest a better font to the client) and that the text fits into the dialog boxes (if not, we can shorten the text or place line breaks appropriately).
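
Parts of these two checks can be pre-screened before anyone opens the build. The sketch below is only an illustration, not our actual tooling: the set of code points a font supports and the per-box character limits are assumed inputs that the developer would have to supply, and a character count is only a rough stand-in for real pixel width.

```python
import unicodedata


def missing_glyphs(text: str, supported: set[int]) -> set[str]:
    """Return the characters in `text` whose code points the font lacks."""
    # NFC composes "u" + combining diaeresis into a single "ü" where possible.
    normalized = unicodedata.normalize("NFC", text)
    return {ch for ch in normalized if not ch.isspace() and ord(ch) not in supported}


def overflowing_strings(strings: dict[str, str], limits: dict[str, int]) -> list[str]:
    """Return IDs of strings longer than the character limit of their box."""
    return [
        text_id
        for text_id, text in strings.items()
        if text_id in limits and len(text) > limits[text_id]
    ]


# Hypothetical font that only covers basic ASCII (code points 32..126).
ascii_only_font = set(range(32, 127))
print(missing_glyphs("Zurück zum Menü", ascii_only_font))  # {'ü'}

# Hypothetical text IDs and box limits, for illustration only.
german = {"btn_back": "Zurück zum Hauptmenü", "btn_ok": "OK"}
box_limits = {"btn_back": 12, "btn_ok": 4}
print(overflowing_strings(german, box_limits))  # ['btn_back']
```

A pre-check like this only narrows down where to look; whether the text really fits and reads well still has to be confirmed on screen.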

We can test using emulators of various devices (for example, an iPad emulator that runs on PC), or on the devices themselves (PC, Android devices, and iOS devices).

What is required for testing?

In order to test a game, we need at the very least a build of the game, but ideally we would also get the following from the developer:


  • Build
    In a perfect world, we get a build with cheats to speed up the process so we can focus on checking the text, not on the game and its mechanics. Or we get a build with a generous currency balance so that we can easily buy new weapons, upgrade buildings, etc. If the build provides both, testing will be a breeze. We will be able to focus on the task at hand and cover a large portion of the story in a relatively short amount of time.
  • Checklist or test plan
    A list of things that we need to check (for example, interface, dialog, chapters of the story, item collections, missions, and so on). A test plan helps organize the process, but testing is still doable without one. In those cases, we check all the windows and texts we can find as we play through the game linearly. Here is a 75-point checklist for game testers; these points can be included in your test plan.

    And here’s a sample from a SlideShare presentation about mobile game testing. We can’t post screenshots of our own checklists because of NDAs 🙁
  • Loc kit or file with the game’s text
    We need this in order to enter the corrections made as a result of the test. We also need a loc kit to check any dubious spots in the game. Sometimes old texts find their way into a game even though they have been updated in the loc kit. That’s one of the things we do our best to catch.

    This is generally what a loc kit looks like. This one is maintained in Google Sheets, but you can just as easily use Excel, Strings, or any table format; MemoQ can convert it (source of screenshot).
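
The check for outdated texts can be sketched as a comparison between the loc kit and the strings seen in the build. This is a minimal sketch under assumptions: the loc kit is taken to be exported as CSV, the column names (`id`, `en`, `de`) and text IDs are hypothetical, and the in-game strings are assumed to have been noted down by ID during the playthrough.

```python
import csv
import io


def load_loc_kit(csv_text: str, language: str) -> dict[str, str]:
    """Map each text ID to its string in the given language column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: row[language] for row in reader}


def find_stale_texts(loc_kit: dict[str, str], in_game: dict[str, str]) -> list[str]:
    """Return IDs whose in-game text no longer matches the loc kit."""
    return [
        text_id
        for text_id, shown in in_game.items()
        if text_id in loc_kit and shown != loc_kit[text_id]
    ]


# Hypothetical loc kit exported as CSV; column names are assumptions.
LOC_KIT_CSV = """id,en,de
quest_01,Find the bat,Finde die Fledermaus
btn_play,Play,Spielen
"""

loc_kit = load_loc_kit(LOC_KIT_CSV, "de")
# Strings as noted down from the build during a playthrough.
seen_in_build = {"quest_01": "Finde den Schläger", "btn_play": "Spielen"}
print(find_stale_texts(loc_kit, seen_in_build))  # ['quest_01']
```

Here the build still shows the older translation of “bat” as a baseball bat, while the loc kit already contains the corrected one.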

Some developers send screenshots and gameplay videos to be checked. That significantly reduces the time spent testing, but it is demanding for the clients themselves (you can never have too many screenshots). One of our clients once sent us a folder of screenshots that was more than 10 GB. Why bother when you can download the 500 MB build?

Who does testing? 

Usually these tasks are taken on by experienced testers who are old salts when it comes to gaming. They play different genres, and they know the specifics of each of them. We’ve run into testers who only like farm games and some who prefer shooters. It’s a question of taste 🙂 But testers can play everything.

Being a good player is not all it takes to be a good localization tester: you also need to be a good linguist and know the specifics, subtleties, and nuances of the language. Of course, cosmetic bugs can be found by those without special knowledge, but one should always strive for testers who can see logical, grammatical, and stylistic mistakes. We split our testers into two groups: natives and non-natives.

Native testers are people for whom the tested language is their first, and non-natives, naturally, are those who have studied the language. These groups have different rates and different “depths” at which they can check the game. Naturally, a native speaker can give a deeper check of the quality of the translation and how it fits into the game. 

A non-native can miss something. But our experience tells us that some non-natives are just as good as a native thanks to their incredible familiarity with the language. Nevertheless, the advantage of a native tester is that they know the facts of life in their country and can see if a translation or aspect of the game (a picture, video, or music) is unsuitable or even offensive to the culture in their homeland.

Ideally, you would have everything checked by natives, but they are noticeably more expensive, so if your budget is tight, you may be better off with a non-native.

What is the process of localization testing?

The process of testing looks like this:

  1. We get a testing task from a client (including the required materials, a statement of work, and any additional comments about what to watch for in particular). We discuss the timeline and deadlines. Often the client tells us how many hours of testing they want; sometimes they give us a testing plan instead (in which case there’s no need for a fixed amount of time).
  2. We assemble a team (natives, non-natives, or sometimes a mix of both).
  3. Then comes the testing itself, during which we create testing reports and introduce changes into the loc kit.
  4. If we translated the project and are then asked to test it, we consult with the translators about the validity of the suggested corrections. Sometimes we want to change an unclear spot in the text, but the translator made it that way on purpose. For example, a tester may want to fix a certain character’s strange speech pattern when, in fact, that odd manner of speaking was part of the author’s intention, a way of showing that the character is not of this world.
  5. Afterward, we give the results of the test to the client. The client implements the changes in the game.
  6. Then the next stage, regression testing, begins.

What is regression testing? 

It’s the final testing at the very end of localization. The goal is to ensure that all the mistakes that were found throughout the process have been fixed before release. It is not part of localization testing; it is a category all its own.

Regression can be done for both linguistic and functional testing. The first stage of testing produces a number of reports and corrections entered into the loc kit. After that, the client implements the changes and provides an updated build, which we use for the second (regression) round: we check that all the errors found in the first pass have been resolved and that no new issues have arisen. For example, if in the first round we found that the text didn’t fit into the dialog boxes and shortened it, we may find in regression testing that the text needs to be even shorter or that the line breaks need to be placed differently. Sometimes certain corrections are not implemented in the build for one reason or another, and in this second round we try to catch that. Regression testing gives us the chance to lock in the results of the first round and ensure that the text of the game is now at its best. This step can be repeated, but that is typically not needed.
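
The part of the second round that looks for corrections which never made it into the build can be sketched as a simple lookup. This is an assumption-laden illustration: it presumes that the corrected strings and the strings from the updated build are both available as dictionaries keyed by a text ID, and the IDs and status names are made up for the example.

```python
def regression_check(
    corrections: dict[str, str], updated_build: dict[str, str]
) -> dict[str, list[str]]:
    """Sort first-round corrections by their state in the updated build."""
    result: dict[str, list[str]] = {"fixed": [], "not_implemented": [], "missing": []}
    for text_id, expected in corrections.items():
        if text_id not in updated_build:
            # The string could not be found in the new build at all.
            result["missing"].append(text_id)
        elif updated_build[text_id] == expected:
            result["fixed"].append(text_id)
        else:
            # The build still shows something other than the correction.
            result["not_implemented"].append(text_id)
    return result


# Hypothetical corrections from the first round and strings from the new build.
corrections = {"quest_01": "Finde die Fledermaus", "btn_back": "Zurück"}
updated_build = {"quest_01": "Finde die Fledermaus", "btn_back": "Zurück zum Hauptmenü"}
print(regression_check(corrections, updated_build))
```

A lookup like this only confirms that corrections were implemented. New issues, such as a shortened text that still does not fit its dialog box, are found by playing through the updated build again.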

What is a bug report?

Bug reports are the result of testing—they are forms explaining the bugs. Each report includes the following basic information:


  1. The type of mistake found, so that it is easy to classify bugs and pass the information on to the appropriate people. Common types are cosmetic and linguistic.
  2. The steps to reproduce the bug (how to find the mistake in the game) or just the ID of the text in the loc kit.
  3. A description of the mistake. 
  4. The current state (how the sentence looks now with the mistake)
  5. How it should be
  6. A screenshot of the mistake or a link to one
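
The six fields above map naturally onto a small record. The sketch below is one possible shape, not the format of any particular bug tracker; the field names and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BugReport:
    bug_type: str                     # e.g. "linguistic" or "cosmetic"
    location: str                     # steps to reproduce, or the text ID in the loc kit
    description: str                  # what is wrong
    current_text: str                 # how the sentence looks now
    expected_text: str                # how it should be
    screenshot: Optional[str] = None  # file path or link, if available

    def summary(self) -> str:
        """One-line form suitable for a spreadsheet row or a ticket title."""
        return (
            f"[{self.bug_type}] {self.location}: "
            f"{self.current_text!r} -> {self.expected_text!r}"
        )


# Hypothetical report for the homonym example.
report = BugReport(
    bug_type="linguistic",
    location="quest_01",
    description="Translation refers to a baseball bat, but the image shows the animal.",
    current_text="Finde den Schläger",
    expected_text="Finde die Fledermaus",
)
print(report.summary())
```

Keeping the type as a separate field is what makes it easy to filter reports and route them to the right people.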

Some clients request a document containing the reports (e.g. an Excel file), while others have a special bug tracking platform for that purpose (e.g. Jira or Redmine).

Now we’ll show a few screenshots of bug reports—unfortunately, they are not ours, because our projects are all under NDAs. 

A sample bug in a Redmine report. The description of the bug includes the steps that lead to the error, how it should be, and a note. It also mentions the version of the game and the status of the bug. Files, such as screenshots, can also be uploaded.
This is how a bug report looks in Jira. Here you can assign the task to specific people, place tags, track the status, and see the date the bug was created.
And while we were preparing the materials for this article, we discovered the program Instabug for Mobile Games. This is a screenshot from their beta test. Here, just like in Jira, you can assign bugs, track their status, and prioritize them.

How do you tell the depth of testing?

The depth of testing is essentially the length and thoroughness of the testing. It depends on the client’s goals. For example: 

  • The game is very big, and testing the whole game would be a long, expensive process. The client may ask for the first 3-5 hours of the game to be tested to ensure that those hours are perfect. By that point, players are hooked and won’t leave because of a few mistakes. Furthermore, users typically write reviews after just a few hours of play, and they are less strict about issues discovered later in the game.
  • The client wants to test the game up to level 30, and the number of hours and the deadlines are calculated from that.
  • A new update is being released, and it needs to be fully tested.
  • Even this: a client has a budget for 15 hours of testing—so we do 15 hours of testing. 

In conclusion: we always recommend testing by a native speaker, even if only for the first 3-5 hours. That will protect you from errors that slip through during translation because of a lack of context, or that you may not notice if you’re not an expert in the target language.

We hope this material has been useful and helped everyone who wanted to learn more about localization testing. If you feel you still have some gaps in your knowledge, never fear! Write us at order@inlingogames.com and we’ll fill them in together. Remember? Less than 30 minutes 😉

