[8.5] Add locale_is_right_to_left #527

Open. Wants to merge 3 commits into base branch 1.x.

Conversation

alexander-schranz

@alexander-schranz force-pushed the feature/locale_is_right_to_lef branch from 30b4e10 to beacebd on May 26, 2025 11:33
@@ -33,4 +34,8 @@ public static function get_exception_handler(): ?callable

return $handler;
}

public static function locale_is_right_to_left(string $locale): bool {
return (bool) preg_match('/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i', $locale);
Member

I don't think this is the right implementation. Writing direction is a property of the script, not of the language, and a locale may explicitly specify a script that differs from the language's most likely one.
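
To make the concern concrete, here is a minimal sketch (not part of the PR) that runs the proposed pattern against two locales that carry an explicit script subtag. The `$isRtlByLanguage` closure is a hypothetical wrapper introduced only for this illustration, and the two locale identifiers are examples chosen for it.

```php
<?php

// The pattern proposed in this PR, copied verbatim.
$pattern = '/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i';

// Hypothetical wrapper around that pattern, for illustration only.
$isRtlByLanguage = static fn (string $locale): bool => (bool) preg_match($pattern, $locale);

// Sindhi in Devanagari is written left-to-right, but the "sd" prefix still matches.
$sindhiDevanagari = $isRtlByLanguage('sd_Deva_IN');

// Punjabi in Arabic script is written right-to-left, but "pa" is not in the list.
$punjabiArabic = $isRtlByLanguage('pa_Arab_PK');

var_dump($sindhiDevanagari, $punjabiArabic);
```

The first call reports `true` and the second `false`, which is the opposite of the actual writing direction in both cases.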

Author

@alexander-schranz commented May 26, 2025

I see, any suggestions? Sadly the original code does not help here: unicode-org/icu@53dcbe6

  • A script is right-to-left according to the CLDR script metadata, which corresponds to whether the script's letters have Bidi_Class=R or AL.

Member

This is what Gemini suggests. Could we take inspiration from this snippet?

    function locale_is_right_to_left(string $locale): bool
    {
        // Define the list of RTL scripts within the function scope.
        // These are 4-letter ISO 15924 script codes.
        static $rtlScripts = [
            'Adlm', 'Arab', 'Armi', 'Hebr', 'Mani', 'Mend', 'Nkoo',
            'Phnx', 'Rohg', 'Samr', 'Syrc', 'Thaa',
        ];

        // This is a minimal, hardcoded version of the CLDR "likelySubtags" data.
        // It maps a language code to its most likely SCRIPT code.
        // This list is NOT exhaustive and is the primary weakness of this approach.
        static $languageToLikelyRtlScript = [
            'ar' => 'Arab', // Arabic
            'fa' => 'Arab', // Persian (Farsi)
            'ur' => 'Arab', // Urdu
            'ps' => 'Arab', // Pashto
            'sd' => 'Arab', // Sindhi
            'ug' => 'Arab', // Uyghur
            'ckb' => 'Arab', // Sorani Kurdish
            'he' => 'Hebr', // Hebrew
            'yi' => 'Hebr', // Yiddish
            'dv' => 'Thaa', // Dhivehi
            'nqo' => 'Nkoo', // N'Ko
        ];

        if (empty($locale)) {
            return false;
        }

        // Normalize separators and split the locale into parts.
        $localeParts = preg_split('/[_-]/', $locale);
        $language = strtolower($localeParts[0] ?? '');
        $script = null;

        // Look for an explicit script subtag (always 4 letters).
        foreach ($localeParts as $part) {
            if (strlen($part) === 4 && ctype_alpha($part)) {
                // Capitalize the first letter for standard format (e.g., "Arab", "Latn").
                $script = ucfirst(strtolower($part));
                break;
            }
        }

        // If no explicit script was found, try to infer it from our map.
        if ($script === null) {
            $script = $languageToLikelyRtlScript[$language] ?? null;
        }

        // If we still have no script, the language is not in our map,
        // so we assume it is not right-to-left.
        if ($script === null) {
            return false;
        }

        // Check if the determined script is in our list of RTL scripts.
        return in_array($script, $rtlScripts, true);
    }
}
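
One edge case the snippet above does not cover: it treats any four-letter alphabetic subtag as a script, so a Unicode extension value such as the numbering system in `en-u-nu-arab` would be read as the Arabic script. The following is a rough sketch of a stricter lookup, under the assumption that only subtags before the first singleton (`u`, `x`, and so on) can be a script. The helper name `explicit_script_subtag` is hypothetical and does not exist in the PR or in ext-intl.

```php
<?php

/**
 * Hypothetical helper: returns the explicit script subtag of a locale, or null.
 * Only subtags after the language and before the first singleton are inspected.
 */
function explicit_script_subtag(string $locale): ?string
{
    // Drop ICU-style keywords such as "@numbers=arab" before splitting.
    $base = explode('@', $locale, 2)[0];
    $parts = preg_split('/[_-]/', $base);

    // Skip the language subtag at index 0.
    foreach (array_slice($parts, 1) as $part) {
        if (strlen($part) === 1) {
            // A singleton starts an extension or private-use sequence; stop here.
            return null;
        }
        if (strlen($part) === 4 && ctype_alpha($part)) {
            // Normalize to the conventional casing, e.g. "Arab" or "Latn".
            return ucfirst(strtolower($part));
        }
    }

    return null;
}
```

With this lookup, `uz_Arab_AF` yields `Arab`, while `en-u-nu-arab` and `ar-u-nu-latn` yield `null`, so the direction would fall back to the language's likely script rather than to the numbering system.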

@stof
Member

stof commented Jun 24, 2025

Our Locale polyfill in symfony/polyfill-intl-icu should also support Locale::isRightToLeft (copying the same implementation there). That covers projects that require PHP 8.5+ but keep ext-intl optional thanks to the polyfill.

@OskarStark changed the title from "Add locale_is_right_to_left polyfill" to "Add locale_is_right_to_left" on Jun 24, 2025
@OskarStark changed the title from "Add locale_is_right_to_left" to "[8.5] Add locale_is_right_to_left" on Jun 24, 2025

3 participants