Skip to content

gh-82088: Improve performance of PyLong_As*() for multi-digit ints #135585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Jun 16, 2025

This is a continuation of the work in #15363 by @sir-sigurd.

Performance is improved by

  • Unrolling the part of the loop where no overflow can occur
  • Improving the overflow check so no temporary variable is used

Performance: main

pack n dt 460.21 [ms]
pack N dt 476.32 [ms]
pack l dt 479.76 [ms]
pack L dt 451.84 [ms]

PR

pack n dt 401.02 [ms]
pack N dt 391.06 [ms]
pack l dt 408.83 [ms]
pack L dt 402.37 [ms]

Script:

import time
from struct import Struct

if __name__ == "__main__":
    for key in 'nNlL':
        N = 1000
        pack = Struct(key*N).pack
        values = tuple(range(2**30, 2**30 + N))
            
        t0=time.perf_counter()
        for _ in range(50_000):
            pack(*values);
        dt=time.perf_counter()-t0
        
        print(f'pack {key} dt {dt*1e3:.2f} [ms]')

(performance gain is similar to earlier reported improvements)

assert(ULONG_MAX >= ((1UL << PyLong_SHIFT) - 1));
x = digits[i];

#if ((ULONG_MAX >> PyLong_SHIFT)) >= ((1UL << PyLong_SHIFT) - 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also unroll the second digit for the other methods. It gains a bit of performance at the cost of a bit of extra code. The unrolling can be factored out to a helper function (the loop itself is not so easy because there are small differences in how to handle the overflow)

eendebakpt and others added 2 commits June 17, 2025 16:46
Co-authored-by: Victor Stinner <vstinner@python.org>
@eendebakpt eendebakpt changed the title Draft: gh-82088: Improve performance of PyLong_As*() for multi-digit ints gh-82088: Improve performance of PyLong_As*() for multi-digit ints Jun 17, 2025
@eendebakpt eendebakpt requested a review from vstinner June 24, 2025 11:26
@vstinner
Copy link
Member

I rewrote the benchmark using pyperf:

import pyperf
from struct import Struct

N = 1000

runner = pyperf.Runner()
values = tuple(range(2**30, 2**30 + N))
for key in 'nNlL':
    pack = Struct(key*N).pack
    runner.bench_func(f'pack {key}', pack, *values)

Result:

Benchmark ref change
pack n 7.86 us 5.80 us: 1.35x faster
pack N 7.06 us 5.77 us: 1.22x faster
pack l 9.00 us 6.31 us: 1.43x faster
pack L 7.66 us 6.00 us: 1.28x faster
Geometric mean (ref) 1.32x faster

@@ -0,0 +1 @@
Improve performance of ``PyLongObject`` conversion method ``PyLong_AsLongAndOverflow``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should mention other modified functions:

Make the following functions 1.3x faster for integer larger than 2**30:

  • PyLong_AsLongAndOverflow(),
  • PyLong_AsSsize_t(),
  • PyLong_AsUnsignedLong(),
  • PyLong_AsSize_t(),
  • PyLong_AsUnsignedLongMask(),
  • PyLong_AsUnsignedLongLongMask(),
  • PyLong_AsLongLongAndOverflow().

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have two last minor suggestions.

@serhiy-storchaka: Would you mind to double check that the integer overflow checks are correct?

Replace:

            prev = x;
            x = (x << PyLong_SHIFT) | v->long_value.ob_digit[i];
            if ((x >> PyLong_SHIFT) != prev) ...

with:

            if (x > (ULONG_MAX >> PyLong_SHIFT)) ...
            x = (x << PyLong_SHIFT) | v->long_value.ob_digit[i];

Co-authored-by: Victor Stinner <vstinner@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants