Localization in ASP.NET Core 1.0: Pluralization Syntax

Mar 06, 2016
Posted in #Localization  #Pluralization 

Pluralization is a complex problem, as different languages have a variety of complex rules for it. English is one of the simplest languages because it has only two forms, one for the singular and another for the plural, which makes pluralization easy to implement for English.

Let us look at an example:

1 apple, 2 apples and 100 apples

As we said before, "1 apple" is the singular form, while the others are plural forms. Some of us will say pluralization is easy to implement, but wait a minute and have a look at the plural forms of other languages. You may not believe that some languages have far more complex pluralization rules, such as Arabic, which is my mother tongue :)

It's code time. Let us simplify the entire process of pluralization. As we know, there's no single solution to this problem, and many programming languages and frameworks offer different flavors, so let us see what I can come up with.

First of all, let us implement simple pluralization for English. As mentioned before, English has two forms, so it's easy to create a simple function that gives us the proper one.

public static string Plural(this IStringLocalizer localizer, bool isPlural, string name, params object[] arguments)
{
    // Look up and format the pipe-separated value, e.g. "1 apple|1 apples".
    string value = localizer[name, arguments];
    // Index 0 is the singular form, index 1 is the plural form.
    int index = (isPlural ? 1 : 0);
    return value.Split('|')[index];
}

I assume that the value of each key in the resource file is separated by "|" to distinguish the singular from the plural form; the same convention applies to the examples below.

In the above example I extend the IStringLocalizer interface with a new method named Plural, which gives us the right form. The resource file will look like this:

<data name="apple" xml:space="preserve">
<value>{0} apple|{0} apples</value>
</data>

After that we can simply call T.Plural(false, "apple", 1) to get the singular form and T.Plural(true, "apple", 2) to get the plural form.
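
To make the expected results concrete, here is a minimal sketch that reproduces the same format-then-split logic on plain strings, without a real IStringLocalizer. The class SimplePluralDemo and its method Pick are hypothetical names used only for illustration; they stand in for the resource lookup.

```csharp
using System;

public static class SimplePluralDemo
{
    // Hypothetical stand-in for the resource lookup: formats the raw
    // pipe-separated value, then picks index 0 (singular) or 1 (plural).
    public static string Pick(string rawValue, bool isPlural, params object[] arguments)
    {
        string value = string.Format(rawValue, arguments);
        return value.Split('|')[isPlural ? 1 : 0];
    }
}
```

With the resource value shown above, SimplePluralDemo.Pick("{0} apple|{0} apples", false, 1) gives "1 apple" and SimplePluralDemo.Pick("{0} apple|{0} apples", true, 2) gives "2 apples".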

Now let us dig into more realistic code, because there are many languages other than English.

In the following sections I will dig into two ways to implement pluralization:

1- Implicit

With this approach the pluralization rules are implicit; all the magic happens behind the scenes. A gettext .po file illustrates the idea:

msgid "%s apple"
msgid_plural "%s apples"
msgstr[0] ""
msgstr[1] ""

msgid ""
msgstr ""
"Project-Id-Version: Space9\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2014-10-24 00:49+0200\n"
"PO-Revision-Date: 2014-10-24 00:49+0200\n"
"Last-Translator: Anastis Sourgoutsidis <anastis@cssigniter.com>\n"
"Language-Team: CSSIgniter LLC <info@cssigniter.com>\n"
"Language: el\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Poedit-SourceCharset: UTF-8\n"
"X-Poedit-KeywordsList: __;_e;__ngettext:1,2;_n:1,2;__ngettext_noop:1,2;"
"_n_noop:1,2;_c,_nc:4c,1,2;_x:1,2c;_nx:4c,1,2;_nx_noop:4c,1,2;_ex:1,2c;"
"esc_attr__;esc_attr_e;esc_attr_x:1,2c;esc_html__;esc_html_e;esc_html_x:1,2c\n"
"X-Poedit-Basepath: .\n"
"X-Textdomain-Support: yes\n"
"X-Generator: Poedit 1.6.10\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"X-Poedit-SearchPath-0: .\n"
"X-Poedit-SearchPath-1: ..\n"

There are some interesting lines:

"Plural-Forms: nplurals=2; plural=(n != 1);\n" which defines the plural forms: there are two of them, and the plural is used whenever n is not 1 (the same rule that applies to English)

msgid "%s apple"
msgid_plural "%s apples"
msgstr[0] ""
msgstr[1] ""

which define the singular and plural keys, and the values, of which there are two in this case.

When I was thinking about how to implement this, I asked myself: should I have n keys per language? The answer is that it depends, but in the general case the resource file would grow large, especially for languages that have more than two plural forms. Again I'm thinking of Arabic, which has six forms :) So I came up with the idea of storing all the values for a key in a single entry, separated by the "|" pipe symbol. This reduces the number of key/value pairs in the resource file regardless of the language.
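
As a small illustration of how such a header drives the lookup, the sketch below evaluates the expression plural=(n != 1) and uses the result as an index into the translated forms. The class PluralFormsDemo and the two placeholder strings are assumptions made for this example; they stand in for msgstr[0] and msgstr[1] and are not part of any gettext API.

```csharp
using System;

public static class PluralFormsDemo
{
    // Hypothetical translations standing in for msgstr[0] and msgstr[1].
    private static readonly string[] Forms = { "{0} apple", "{0} apples" };

    // Mirrors "plural=(n != 1)" from the Plural-Forms header:
    // index 0 only when n is exactly 1, index 1 otherwise.
    public static int PluralIndex(int n) => n != 1 ? 1 : 0;

    // Picks the form by index and substitutes the count.
    public static string Translate(int n) => string.Format(Forms[PluralIndex(n)], n);
}
```

Under this rule, PluralFormsDemo.Translate(1) yields "1 apple", while Translate(0) and Translate(100) yield "0 apples" and "100 apples".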

public static string Plural(this IStringLocalizer localizer, string name, params object[] arguments)
{
    string value = localizer[name, arguments];
    // By convention the first argument is the count that drives the plural form.
    int count = Convert.ToInt32(arguments[0]);
    int plural = GetPluralForms(count);
    return value.Split('|')[plural];
}

The code is quite simple: it uses IStringLocalizer to get the value of the passed key, then calls the magic function GetPluralForms(), which returns the index of the plural form to use for the given count in the current language, as follows:

private static int GetPluralForms(int n)
{
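// Requires System.Threading. TwoLetterISOLanguageName returns codes such as "pt" or "zh",
// so the region-qualified labels below (zh_CN, zh_HK, zh_TW, pt_BR, es_AR) are never matched.
string code = Thread.CurrentThread.CurrentCulture.TwoLetterISOLanguageName;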
int plural=0;
switch (code)
{
// nplural=1
case "ay":
case "bo":
case "cgg":
case "dz":
case "fa":
case "id":
case "ja":
case "jbo":
case "ka":
case "kk":
case "km":
case "ko":
case "ky":
case "lo":
case "ms":
case "my":
case "sah":
case "su":
case "th":
case "tt":
case "ug":
case "vi":
case "wo":
case "zh_CN":
case "zh_HK":
case "zh_TW":
plural = 0;
break;
// nplural=2
case "ach":
case "ak":
case "am":
case "arn":
case "br":
case "fil":
case "fr":
case "gun":
case "ln":
case "mfe":
case "mg":
case "mi":
case "oc":
case "pt_BR":
case "tg":
case "ti":
case "tr":
case "uz":
case "wa":
plural = (n > 1 ? 1 : 0);
break;
case "af":
case "an":
case "anp":
case "as":
case "ast":
case "az":
case "bg":
case "bn":
case "brx":
case "ca":
case "da":
case "de":
case "doi":
case "el":
case "en":
case "eo":
case "es":
case "es_AR":
case "et":
case "eu":
case "ff":
case "fi":
case "fo":
case "fur":
case "fy":
case "gl":
case "gu":
case "ha":
case "he":
case "hi":
case "hne":
case "hu":
case "hy":
case "ia":
case "it":
case "kl":
case "kn":
case "ku":
case "lb":
case "mai":
case "ml":
case "mn":
case "mni":
case "mr":
case "nah":
case "nap":
case "nb":
case "ne":
case "nl":
case "nn":
case "no":
case "nso":
case "or":
case "pa":
case "pap":
case "pms":
case "ps":
case "pt":
case "rm":
case "rw":
case "sat":
case "sco":
case "sd":
case "se":
case "si":
case "so":
case "son":
case "sq":
case "sv":
case "sw":
case "ta":
case "te":
case "tk":
case "ur":
case "yo":
plural = (n != 1 ? 1 : 0);
break;
case "is":
plural = (n % 10 != 1 || n % 100 == 11 ? 1 : 0);
break;
case "jv":
plural = (n != 0 ? 1 : 0);
break;
case "mk":
plural = (n == 1 || n % 10 == 1 ? 0 : 1);
break;
// nplural=3
case "be":
case "bs":
case "hr":
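plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 <10 || n % 100 >= 20) ? 1 : 2);
break;
case "lt":
// Lithuanian: any n % 10 >= 2 outside 10-19 takes form 1 (there is no upper bound of 4).
plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
break;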
case "cs":
plural = ((n == 1) ? 0 : (n >= 2 && n <= 4) ? 1 : 2);
break;
case "csb":
case "pl":
plural = ((n == 1) ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
break;
case "lv":
plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n != 0 ? 1 : 2);
break;
case "mnk":
plural = (n == 0 ? 0 : n == 1 ? 1 : 2);
break;
case "ro":
plural = (n == 1 ? 0 : (n == 0 || (n % 100 > 0 && n % 100 < 20)) ? 1 : 2);
break;
// nplural=4
case "cy":
plural = ((n == 1) ? 0 : (n ==2 ) ? 1 : (n != 8 && n != 11) ? 2 : 3);
break;
case "gd":
plural = ((n == 1 || n == 11) ? 0 : (n == 2 || n == 12) ? 1 : (n > 2 && n < 20) ? 2 : 3);
break;
case "kw":
plural = ((n == 1) ? 0 : (n == 2) ? 1 : (n == 3) ? 2 : 3);
break;
case "mt":
plural = (n == 1 ? 0 : n == 0 || ( n % 100 > 1 && n % 100 < 11) ? 1 : (n % 100 > 10 && n % 100 < 20 ) ? 2 : 3);
break;
case "sl":
plural = (n % 100==1 ? 1 : n % 100 == 2 ? 2 : n % 100 == 3 || n % 100 == 4 ? 3 : 0);
break;
// nplural=3
case "ru":
case "sr":
case "uk":
plural = (n % 10 == 1 && n % 100 != 11 ? 0 : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2);
break;
case "sk":
plural = ((n == 1) ? 0 : (n >= 2 && n <= 4) ? 1 : 2);
break;
// nplural=5
case "ga":
plural = (n == 1 ? 0 : n == 2 ? 1 : (n > 2 && n < 7) ? 2 :(n > 6 && n < 11) ? 3 : 4);
break;
// nplural=6
case "ar":
plural = (n == 0 ? 0 : n == 1 ? 1 : n == 2 ? 2 : n % 100 >= 3 && n % 100 <= 10 ? 3 : n %100 >= 11 ? 4 : 5);
break;
}
return plural;
}
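
To see how the returned index selects one of the pipe-separated values, here is a minimal sketch that copies only the "ar" branch of the switch above. The class ArabicPluralDemo is a hypothetical name, and the labels used in the usage note are placeholders for the six positions, not real Arabic translations.

```csharp
using System;

public static class ArabicPluralDemo
{
    // Copy of the "ar" rule from GetPluralForms (six forms, indexes 0 to 5).
    public static int ArabicIndex(int n) =>
        n == 0 ? 0
        : n == 1 ? 1
        : n == 2 ? 2
        : n % 100 >= 3 && n % 100 <= 10 ? 3
        : n % 100 >= 11 ? 4
        : 5;

    // Picks the matching part of a pipe-separated resource value.
    public static string Pick(string pipeSeparated, int n) =>
        pipeSeparated.Split('|')[ArabicIndex(n)];
}
```

With the placeholder value "zero|one|two|few|many|other", Pick returns "zero" for 0, "few" for 3 and for 103, "many" for 11, and "other" for 100.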

2- Explicit

You may also create more explicit pluralization rules easily:

apples => "[0] There are no apples|[1-19] There are some apples|[20-*] There are many apples"

This technique is inspired by the Laravel framework. Notice that the pluralization rules are stated explicitly in the values in the resource file.

Here there are three cases:

[0] which means that if the count equals zero you will get There are no apples

[1-19] which means that if the count is between one and nineteen you will get There are some apples

[20-*] which means that if the count is greater than or equal to twenty you will get There are many apples

The explicit rules are more powerful, but you need to write them yourself. The code for this technique may look like this:

public static string Plural(this IStringLocalizer localizer, string name, params object[] arguments)
{
    string value = localizer[name, arguments];
    var parts = value.Split('|');
    var plural = "";
    int n = Convert.ToInt32(arguments[0]);
    foreach (var part in parts)
    {
        // The rule is the text between '[' and ']', e.g. "0", "1-19" or "20-*".
        var tmp = part.Substring(1, part.IndexOf(']') - 1);
        if (tmp.Contains("-"))
        {
            // Range rule "min-max", where "*" means no upper bound.
            var tokens = tmp.Split('-');
            int min = Convert.ToInt32(tokens[0]);
            int max = (tokens[1] == "*" ? int.MaxValue : Convert.ToInt32(tokens[1]));
            if (n >= min && n <= max)
            {
                // Trim drops the space that follows ']'.
                plural = part.Split(']')[1].Trim();
                break;
            }
        }
        else if (tmp.Contains(","))
        {
            // List rule: matches any of the comma-separated counts (requires System.Linq).
            var tokens = tmp.Split(',');
            if (tokens.Any(t => Convert.ToInt32(t) == n))
            {
                plural = part.Split(']')[1].Trim();
                break;
            }
        }
        else
        {
            // Single-value rule.
            if (Convert.ToInt32(tmp) == n)
            {
                plural = part.Split(']')[1].Trim();
                break;
            }
        }
    }
    return plural;
}

You may need some sort of caching to avoid processing the string again for every requested key; for the sake of the demo I didn't implement that.
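
One possible shape for such a cache, offered as a sketch rather than the article's implementation: parse each raw resource value once and keep the parsed rules in a ConcurrentDictionary keyed by that raw value. The names ExplicitRuleCache, Rule and Select are hypothetical, and to keep the example short it handles only single-value and range rules, not comma-separated lists.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class ExplicitRuleCache
{
    // One parsed rule: an inclusive range and the text it selects.
    private sealed class Rule
    {
        public int Min;
        public int Max;
        public string Text;
    }

    // Parsed rules, keyed by the raw (unformatted) resource value.
    private static readonly ConcurrentDictionary<string, Rule[]> Cache =
        new ConcurrentDictionary<string, Rule[]>();

    // Returns the text of the first rule whose range contains n, or "" if none does.
    public static string Select(string rawValue, int n)
    {
        Rule[] rules = Cache.GetOrAdd(rawValue, v => Parse(v));
        Rule match = rules.FirstOrDefault(r => n >= r.Min && n <= r.Max);
        return match == null ? "" : match.Text;
    }

    // Parses "[0] text|[1-19] text|[20-*] text" into rules; runs once per raw value.
    private static Rule[] Parse(string rawValue)
    {
        return rawValue.Split('|').Select(part =>
        {
            int close = part.IndexOf(']');
            string[] tokens = part.Substring(1, close - 1).Split('-');
            int min = int.Parse(tokens[0]);
            int max = tokens.Length == 1 ? min
                    : tokens[1] == "*" ? int.MaxValue
                    : int.Parse(tokens[1]);
            return new Rule { Min = min, Max = max, Text = part.Substring(close + 1).Trim() };
        }).ToArray();
    }
}
```

Keying the cache by the raw value means the parsing cost is paid once per distinct resource string; any placeholder substitution would then be applied to the selected text afterwards.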


2 Comments

David (7/12/2016 12:49:22 PM)

Hello! For po localization projects, you can use a translation management platform like https://poeditor.com/ that can automate the workflow and make things easier for users.

Hisham Bin Ateya (9/18/2016 11:15:12 PM)

@David poeditor is great for translation management, but my aim is how to support pluralization syntax in ASP.NET Core
