The Text Domain in WordPress Internationalization

In this post I want to address a common question / misunderstanding about the role of the text domain when internationalizing WordPress plugins and themes. This topic has been addressed in the past, but it comes up again from time to time. Time to re-address it!

Some Background

Over the last few months I helped build and shape a new command for WP-CLI that makes it easier for developers to fully internationalize and localize their WordPress plugins and themes. It’s meant as a successor to the makepot.php script, which tries to achieve the same thing and is currently used by thousands of WordPress developers as well as the WordPress.org translation platform.

Unfortunately, makepot.php is outdated, buggy, and not really future-proof (think JavaScript internationalization). That’s why I proposed replacing it with the new WP-CLI command on WordPress.org.

By running wp i18n make-pot /path/to/my/wordpress/wp-content/my-plugin you can create a so-called translation catalog with the .pot file extension. This catalog contains all the strings from your plugin that have been internationalized using the available gettext functions like __(), _n(), and _x().

Check out the plugin developer handbook for a more thorough list of localization functions.

Where The Text Domain Comes Into Play

Let’s take __( 'Translate me', 'my-plugin' ) as an example.

The first argument of this function call is the actual text that should be translatable, the second argument is your text domain. One requirement for plugin developers is that the text domain must match the slug of the plugin.

If your plugin is a single file called my-plugin.php or it is contained in a folder called my-plugin, the text domain should be my-plugin. If your plugin is hosted on WordPress.org, it must be the slug of your plugin URL (wordpress.org/plugins/<slug>).
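
As a rough illustration, a single-file plugin called my-plugin.php could look like the sketch below. The plugin name is made up, and the __() stand-in exists only so the snippet runs outside WordPress; inside WordPress, core already defines that function.

```php
<?php
/**
 * Plugin Name: My Plugin
 * Text Domain: my-plugin
 */

// Stand-in for illustration only: lets the sketch run without WordPress.
// Inside WordPress this branch is skipped because core defines __().
if (!function_exists('__')) {
    function __(string $text, string $domain = 'default'): string
    {
        return $text; // No translations loaded here, so return the original.
    }
}

// The text domain is a string literal that matches the plugin slug.
echo __('Translate me', 'my-plugin'), PHP_EOL;
```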

In the WP-CLI command we automatically try to guess your plugin’s slug (and thus the text domain) from the folder name. After that, it only extracts gettext calls with that text domain. Any other text domain will be ignored. This means it finds and extracts __( 'Translate me', 'my-plugin' ), but skips __( 'Translate me', 'another-plugin' ).

Don’t Repeat Yourself

Now, if you have lots of strings, you might want to save yourself some typing and use a variable or a constant instead of writing 'my-plugin' every time. After all, repetition is bad and using a variable makes sure you don’t make any spelling mistakes.

However, you’re actually still repeating the same variable over and over again, so you don’t really save any time. Also, variables are useful when a value needs to change. But the text domain of a plugin never really changes, especially when it is hosted on WordPress.org where you cannot change it once you’ve submitted the plugin.

If the text domain does change for whatever reason, a simple search and replace takes care of it. There’s no need for a variable. Also, if you fear spelling mistakes, the WordPress Coding Standards for PHP_CodeSniffer have got you covered, as they can detect incorrect text domains.

Most importantly, the WordPress plugin developer handbook explicitly forbids using variables for text domains:

Do not use variable names or constants for the text domain portion of a gettext function. Do not do this as a shortcut: __( 'Translate me.' , $text_domain );

WordPress Plugin Handbook

But why are variables not allowed as text domains? Let’s have a look at how this whole process works to better understand this.

How Localization Works in WordPress

Let’s say we have a WordPress site set up in German (de_DE) and running our plugin (my-plugin) from the previous examples. When WordPress encounters a function call like __( 'Translate me', 'my-plugin' ), the following happens:

  1. If translations for that text domain have already been loaded, WordPress tries to translate the given string.
  2. If translations haven’t been loaded yet, WordPress looks for a file my-plugin-de_DE.mo in the folder wp-content/languages/plugins and loads the translations from there if found.

Since all these PHP files are executed, we could actually use something like __( 'Translate me.', $text_domain ). Given that $text_domain = 'my-plugin', this works exactly the same.
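
The two steps above, and the fact that a variable behaves the same at runtime, can be modeled with a small sketch. Everything in it is hypothetical: the SketchTranslations class is not a WordPress API, an array stands in for the .mo files on disk, and the German string is invented sample data.

```php
<?php
/**
 * Simplified, hypothetical model of the lookup. Real WordPress reads binary
 * .mo files from disk; here an array stands in for the file system.
 */
final class SketchTranslations
{
    /** @var array<string, array<string, string>> fake files: path => catalog */
    private $files;
    /** @var string */
    private $locale;
    /** @var array<string, array<string, string>> catalogs already loaded, keyed by text domain */
    private $loaded = [];

    public function __construct(array $files, string $locale)
    {
        $this->files  = $files;
        $this->locale = $locale;
    }

    public function moPath(string $domain): string
    {
        return sprintf('wp-content/languages/plugins/%s-%s.mo', $domain, $this->locale);
    }

    public function translate(string $text, string $domain): string
    {
        // Step 2: on first use of a domain, load its catalog if the file exists.
        if (!isset($this->loaded[$domain])) {
            $this->loaded[$domain] = $this->files[$this->moPath($domain)] ?? [];
        }
        // Step 1: look the string up; untranslated strings fall back to the original.
        return $this->loaded[$domain][$text] ?? $text;
    }
}

$translations = new SketchTranslations(
    ['wp-content/languages/plugins/my-plugin-de_DE.mo' => ['Translate me' => 'Übersetze mich']],
    'de_DE'
);

$text_domain = 'my-plugin';
echo $translations->translate('Translate me', 'my-plugin'), PHP_EOL;      // translated
echo $translations->translate('Translate me', $text_domain), PHP_EOL;     // same result at runtime
echo $translations->translate('Translate me', 'another-plugin'), PHP_EOL; // no catalog: original text
```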

String Extraction

To really answer the question of why variables as text domains are discouraged, we need to understand the process of how we actually get to this my-plugin-de_DE.mo file.

It all starts with wp i18n make-pot (or makepot.php, for that matter).

As mentioned before, that command looks for all instances of __() and the like in your plugin to extract translatable strings. During that process, the code isn’t executed, but only parsed. That means it has no idea what the value of $text_domain is in __( 'Translate me', $text_domain ). It just knows that it’s a variable.
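
To make the difference between parsing and executing concrete, here is a minimal sketch built on PHP's tokenizer (token_get_all). It is not how wp i18n make-pot or makepot.php is implemented; the function names are hypothetical, only __() is handled, and nested brackets or escaped quotes are ignored for brevity.

```php
<?php
/**
 * Hypothetical mini-extractor, NOT the real make-pot implementation.
 * Lists each __() call with what static parsing can tell about its arguments.
 */
function sketch_find_gettext_calls(string $source): array
{
    // Keep only meaningful tokens: drop whitespace, comments and the open tag.
    $skip   = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT, T_OPEN_TAG];
    $tokens = array_values(array_filter(
        token_get_all($source),
        static fn($t) => !is_array($t) || !in_array($t[0], $skip, true)
    ));

    $calls = [];
    $count = count($tokens);
    for ($i = 0; $i + 1 < $count; $i++) {
        $isCall = is_array($tokens[$i]) && $tokens[$i][0] === T_STRING
            && $tokens[$i][1] === '__' && $tokens[$i + 1] === '(';
        if (!$isCall) {
            continue;
        }

        // Split the argument list on top-level commas.
        $args  = [[]];
        $depth = 1;
        for ($j = $i + 2; $j < $count; $j++) {
            $tok = $tokens[$j];
            if ($tok === '(') {
                $depth++;
            } elseif ($tok === ')' && --$depth === 0) {
                break;
            }
            if ($tok === ',' && $depth === 1) {
                $args[] = [];
                continue;
            }
            $args[count($args) - 1][] = $tok;
        }

        $domain = isset($args[1]) ? sketch_literal($args[1]) : null;
        if (!isset($args[1])) {
            $kind = 'missing';
        } elseif ($domain !== null) {
            $kind = 'literal';
        } elseif (count($args[1]) === 1 && is_array($args[1][0]) && $args[1][0][0] === T_VARIABLE) {
            $kind = 'variable'; // The parser sees the name, never the value.
        } else {
            $kind = 'other'; // For example a constant or an expression.
        }
        $calls[] = ['text' => sketch_literal($args[0]), 'domain' => $domain, 'domain_kind' => $kind];
    }

    return $calls;
}

/** Returns the contents of a single quoted string literal, or null for anything else. */
function sketch_literal(array $arg): ?string
{
    $isString = count($arg) === 1 && is_array($arg[0]) && $arg[0][0] === T_CONSTANT_ENCAPSED_STRING;
    return $isString ? substr($arg[0][1], 1, -1) : null;
}

$source = <<<'CODE'
<?php
__( 'Translate me', 'my-plugin' );
__( 'Translate me', $text_domain );
CODE;

foreach (sketch_find_gettext_calls($source) as $call) {
    printf("%s => domain %s (%s)\n", $call['text'], $call['domain'] ?? 'unknown', $call['domain_kind']);
}
```

For the second call the sketch can only report that the domain is a variable; its value would exist only if the plugin were actually run.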

We could just as well omit the variable entirely and write __( 'Translate me' ) as it provides no additional value. But can we?

A closer look at the makepot.php script reveals that the second argument holding the text domain is actually completely ignored. Let’s say we have a plugin that’s hosted on WordPress.org and contains the following code:

__( 'Translate me', 'my-plugin' );

__( 'Translate me too! Please?', $text_domain );

__( 'Translate me too!', MY_PLUGIN_TEXTDOMAIN );

In this case, all three strings will be extracted and made available for translation on translate.wordpress.org. This seems to support the theory that the text domain doesn’t need to be a string at all.

There is a caveat though.

Multiple Text Domains

Let’s say your plugin bundles a third-party library like TGM Plugin Activation. By default this library contains lots of gettext calls like __( 'Install Plugins', 'tgmpa' ). When running makepot.php, this string would be extracted as well. However, TGMPA provides its own language files and everything, so you don’t want to duplicate efforts there.

The only way to solve this is to limit string extraction to a specific text domain. And for this, the text domain needs to be a string literal, not a variable.
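
A small sketch shows why the filter depends on literals. The records below are hypothetical and only mimic what a static extractor could produce: a literal domain is known, while a variable or constant yields no usable value (null here).

```php
<?php
// Hypothetical records, shaped like the output of a static extractor.
$extracted = [
    ['text' => 'Translate me',      'domain' => 'my-plugin'],
    ['text' => 'Install Plugins',   'domain' => 'tgmpa'],
    ['text' => 'Translate me too!', 'domain' => null], // variable or constant: unknown at parse time
];

/** Keeps only the strings whose text domain is known and matches $domain. */
function sketch_filter_by_domain(array $records, string $domain): array
{
    return array_values(array_filter(
        $records,
        static fn(array $record) => $record['domain'] === $domain
    ));
}

foreach (sketch_filter_by_domain($extracted, 'my-plugin') as $record) {
    echo $record['text'], PHP_EOL;
}
```

Only the first record survives: the TGMPA string is skipped on purpose, and the string with the unknown domain is lost as a side effect.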

Note: You will also run into these issues with tools like node-wp-i18n, as they use makepot.php under the hood. The same applies to Poedit, a popular translation software for WordPress projects. Since gettext wasn’t intended to be used with multiple domains inside a single project/file, the xgettext command line utility doesn’t support limiting the text domain either.

A similar situation arises when adding customized WooCommerce shop templates to your WordPress theme. Usually you don’t need to add these to your theme unless you really need to change the markup.

Since these templates are coming from the WooCommerce plugin, all localizable strings use the woocommerce text domain. And when you don’t change any of these strings you might consider just keeping the text domain so WordPress will still translate these.

However, not changing the WooCommerce text domain is a bad idea. The reasons are simple:

  1. Strings with a different text domain than your theme’s might not be extracted in the future.
  2. It’s unreliable.
    When WooCommerce changes its templates in a new version, your strings might suddenly not be localized anymore.
  3. You take control away from users.
    Users and translators have no way to translate your customized shop templates.
  4. Context might change.
    When you heavily customize the WooCommerce templates, some of the strings in them might not be 100% accurate anymore. At this point you have to rephrase and use your own text domain anyway.

For the same reasons, you shouldn’t reuse WordPress core strings (that is, gettext calls without your project’s text domain) in your plugin or theme either.

Conclusion

To distinguish between strings coming from WordPress core and the different plugins and themes on your site, WordPress uses a so-called text domain.

While it might sound convenient to use a variable for the text domain in order to not repeat it all the time, there are some serious drawbacks to that method when a plugin or theme contains strings with multiple text domains.

As mentioned at the beginning of the article, I proposed replacing makepot.php on WordPress.org with the new WP-CLI command to extract strings from themes and plugins. If that proposed change is made, any string with a text domain that doesn’t match the project’s slug or isn’t a string literal will be ignored.

However, this wouldn’t be an overnight change and we probably would soften that requirement in the beginning until all developers have caught up and fixed their text domains.

Nevertheless, if your plugin or theme is affected, you should make some changes today. Update your plugins and themes now to ensure all internationalized strings use a string literal text domain which matches the plugin’s slug, so that string extraction will continue to work for these in the future.


Comments

2 responses to “The Text Domain in WordPress Internationalization”

  1. nicolas bourdon

    I assumed that .po files were enough for WordPress. But no, WordPress requires .mo files.
    I imagined that .po files would be converted to .mo automatically by WordPress, but that’s not the case, at least on WAMP.
    I lost some time because I didn’t know that. I hope this comment helps.

    1. Just PO files are not enough. This has always been the case, and I think it is clearly documented everywhere.

      PO is the human-readable format of the translation files, whereas MO is the machine-readable format for use by the software (in this case WordPress).
