[8.5] Add locale_is_right_to_left
#527
base: 1.x
30b4e10 to beacebd
src/Php85/Php85.php
Outdated
@@ -33,4 +34,8 @@ public static function get_exception_handler(): ?callable
        return $handler;
    }

    public static function locale_is_right_to_left(string $locale): bool {
        return (bool) preg_match('/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i', $locale);
I don't think this is the right implementation. What is right to left is a script, not a language. And locales might specify a script explicitly which is not the most likely script.
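To make the concern concrete, here is a small sketch that runs the pattern from the diff against a few locales carrying an explicit script subtag. The helper name `regex_guess` is hypothetical and exists only for this illustration. The results in the comments follow from the regex itself; the remarks about direction assume only that Arab is a right-to-left script and Latn a left-to-right one.

```php
<?php
// Hypothetical wrapper around the language-only pattern from the diff.
function regex_guess(string $locale): bool
{
    return (bool) preg_match(
        '/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i',
        $locale
    );
}

var_dump(regex_guess('ar_EG'));   // true: no script given, Arabic language
var_dump(regex_guess('ar_Latn')); // true, although Latn is a left-to-right script
var_dump(regex_guess('az_Arab')); // false, although Arab is a right-to-left script
```

The last two calls show both failure modes: a listed language written in a left-to-right script, and an unlisted language written in a right-to-left one.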
I see, any suggestions? Sadly the original code does not help here: unicode-org/icu@53dcbe6
- A script is right-to-left according to the CLDR script metadata
- which corresponds to whether the script's letters have Bidi_Class=R or AL.
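The Bidi_Class criterion quoted above can be inspected per character. The sketch below is only an illustration and assumes the intl extension's `IntlChar` class is loaded, which a polyfill for this function could not rely on. The helper name `letter_is_rtl` is hypothetical, and it returns `null` when `IntlChar` is unavailable.

```php
<?php
// Hypothetical helper: true when the character's Bidi_Class is R or AL,
// false for any other class, null when ext-intl is missing or the lookup fails.
function letter_is_rtl(string $char): ?bool
{
    if (!class_exists('IntlChar')) {
        return null;
    }

    $direction = IntlChar::charDirection($char);
    if ($direction === null) {
        return null;
    }

    return $direction === IntlChar::CHAR_DIRECTION_RIGHT_TO_LEFT
        || $direction === IntlChar::CHAR_DIRECTION_RIGHT_TO_LEFT_ARABIC;
}

var_dump(letter_is_rtl("\u{05D0}")); // Hebrew letter alef (Bidi_Class R)
var_dump(letter_is_rtl("\u{0627}")); // Arabic letter alef (Bidi_Class AL)
var_dump(letter_is_rtl('a'));        // Latin small letter a (Bidi_Class L)
```

With intl loaded, the first two calls should report right-to-left and the third should not, which matches the script-level rule: a script counts as RTL when its letters fall in those two classes.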
This is what Gemini provides, let's take inspiration from this snippet?
function locale_is_right_to_left(string $locale): bool
{
    // Define the list of RTL scripts within the function scope.
    // These are 4-letter ISO 15924 script codes.
    static $rtlScripts = [
        'Adlm', 'Arab', 'Armi', 'Hebr', 'Mani', 'Mend', 'Nkoo',
        'Phnx', 'Rohg', 'Samr', 'Syrc', 'Thaa',
    ];

    // This is a minimal, hardcoded version of the CLDR "likelySubtags" data.
    // It maps a language code to its most likely SCRIPT code.
    // This list is NOT exhaustive and is the primary weakness of this approach.
    static $languageToLikelyRtlScript = [
        'ar' => 'Arab', // Arabic
        'fa' => 'Arab', // Persian (Farsi)
        'ur' => 'Arab', // Urdu
        'ps' => 'Arab', // Pashto
        'sd' => 'Arab', // Sindhi
        'ug' => 'Arab', // Uyghur
        'ckb' => 'Arab', // Sorani Kurdish
        'he' => 'Hebr', // Hebrew
        'yi' => 'Hebr', // Yiddish
        'dv' => 'Thaa', // Dhivehi
        'nqo' => 'Nkoo', // N'Ko
    ];

    if (empty($locale)) {
        return false;
    }

    // Normalize separators and split the locale into parts.
    $localeParts = preg_split('/[_-]/', $locale);
    $language = strtolower($localeParts[0] ?? '');
    $script = null;

    // Look for an explicit script subtag (always 4 letters).
    foreach ($localeParts as $part) {
        if (strlen($part) === 4 && ctype_alpha($part)) {
            // Capitalize the first letter for standard format (e.g., "Arab", "Latn").
            $script = ucfirst(strtolower($part));
            break;
        }
    }

    // If no explicit script was found, try to infer it from our map.
    // (The original snippet read the undefined $languageToLikelyScript here.)
    if ($script === null) {
        $script = $languageToLikelyRtlScript[$language] ?? null;
    }

    // If we couldn't determine a script, we can't determine direction.
    // Every language the original fallback list named is already in the map above.
    if ($script === null) {
        return false;
    }

    // Check if the determined script is in our list of RTL scripts.
    return in_array($script, $rtlScripts, true);
}
See https://php.watch/versions/8.5/locale_is_right_to_left-Locale-isRightToleft