
Commit 359a31d

Keno authored and vchuravy committed
Record NI types in add exprs
This fixes a case where loop-reduce introduces ptrtoint/inttoptr for non-integral address space pointers.

Over the past several years, we have gradually improved the SCEVExpander to do something sensible for non-integral pointer types. However, that obviously relies on the expander knowing the type of the SCEV expression. That is usually the case, but there is one important case where it is not: the type of an add expression is simply the type of its last operand, so if the non-integral pointer is not the last operand, later uses of that SCEV may not realize that the add expression contains non-integral pointers and may try to expand it as integers.

One interesting observation is that we get away with this scheme in shockingly many cases. The reason is that SCEV expressions often have an `scUnknown` pointer base, which our sort order on the operands of add expressions places behind basically everything else, so it usually ends up as the last operand.

One situation where this fails is included as a test case, bugpoint-reduced from the issue reported at JuliaLang/julia#31156. What happens there is that the pointer base is an scAddRec from an outer loop, plus an scUnknown integer offset. By our sort order, the scUnknown gets sorted after the scAddRec pointer base, so an add expression of these two operands ends up with integer type. This then confuses the expander into attempting to expand the whole thing as integers, which obviously fails when it reaches the non-integral pointer.

I considered a few options to solve this, but here is what I settled on: the AddExpr class gains a new subclass that explicitly stores the type of the expression. This subclass is used whenever one of the operands is a non-integral pointer.
To reduce the impact on the regular case (where the SCEV expression contains no non-integral pointers), a bit flag is kept in each SCEV expression to indicate whether it is of non-integral pointer type. This should give the same answer as asking whether getType() is non-integral, but performing that query may involve a pointer chase and requires the DataLayout. For add expressions, that flag also indicates whether we are using the subclass. This is slightly inefficient, because the subclass is used even in the (not uncommon) case where the last operand does accurately reflect the non-integral pointer type. However, it did not seem worth an extra flag bit and the added complexity to do this micro-optimization.

I had hoped that we could additionally restrict mul exprs from containing any non-integral pointers, and also require add exprs to have only one operand containing such pointers (but not more), but this turned out not to work. The reason is that SCEV wants to form differences between pointers, which it represents as `A + B*-1`, so we need to allow both multiplication by `-1` and addition with multiple non-integral pointer arguments.

I'm not super happy with that situation, but I think it exposes a more general problem with non-integral pointers in LLVM: we don't actually have a way to express the difference between two non-integral pointers at the IR level. In theory this is a problem for SCEV, because it means we can't materialize such SCEV expressions. In practice, however, these expressions generally have the same base pointer, so SCEV appropriately simplifies them down to just the integer components. Nevertheless, it is a bit unsatisfying. Perhaps we could have an intrinsic that takes the byte difference between two pointers to the same allocated object (in the same sense as is used in getelementptr), which should be a sensible operation even for non-integral pointers.
However, given the practical considerations above, that's a project for another time. For now, simply allowing the existing pointer-diff pattern for non-integral pointers seems to work ok.

Differential Revision: https://reviews.llvm.org/D75072
1 parent ae2638d commit 359a31d

File tree

4 files changed: +155 -23 lines changed


llvm/include/llvm/Analysis/ScalarEvolution.h

Lines changed: 19 additions & 2 deletions
@@ -120,6 +120,19 @@ class SCEV : public FoldingSetNode {
     NoWrapMask = (1 << 3) - 1
   };
 
+  /// HasNonIntegralPointerFlag are bitfield indices into SubclassData.
+  ///
+  /// When constructing SCEV expressions for LLVM expressions with non-integral
+  /// pointer types, some additional processing is required to ensure that we
+  /// don't introduce any illegal transformations. However, non-integral pointer
+  /// types are a very rarely used feature, so we want to make sure to only do
+  /// such processing if they are actually used. To ensure minimal performance
+  /// impact, we memoize that fact in using these flags.
+  enum HasNonIntegralPointerFlag {
+    FlagNoNIPointers = 0,
+    FlagHasNIPointers = (1 << 3)
+  };
+
   explicit SCEV(const FoldingSetNodeIDRef ID, SCEVTypes SCEVTy,
                 unsigned short ExpressionSize)
       : FastID(ID), SCEVType(SCEVTy), ExpressionSize(ExpressionSize) {}
@@ -156,6 +169,10 @@ class SCEV : public FoldingSetNode {
     return ExpressionSize;
   }
 
+  bool hasNonIntegralPointers() const {
+    return SubclassData & FlagHasNIPointers;
+  }
+
   /// Print out the internal representation of this scalar to the specified
   /// stream. This should really only be used for debugging purposes.
   void print(raw_ostream &OS) const;
@@ -745,7 +762,7 @@ class ScalarEvolution {
                             const BasicBlock *ExitingBlock);
 
   /// The terms "backedge taken count" and "exit count" are used
-  /// interchangeably to refer to the number of times the backedge of a loop
+  /// interchangeably to refer to the number of times the backedge of a loop
   /// has executed before the loop is exited.
   enum ExitCountKind {
     /// An expression exactly describing the number of times the backedge has
@@ -758,7 +775,7 @@ class ScalarEvolution {
   };
 
   /// Return the number of times the backedge executes before the given exit
-  /// would be taken; if not exactly computable, return SCEVCouldNotCompute.
+  /// would be taken; if not exactly computable, return SCEVCouldNotCompute.
   /// For a single exit loop, this value is equivelent to the result of
   /// getBackedgeTakenCount. The loop is guaranteed to exit (via *some* exit)
   /// before the backedge is executed (ExitCount + 1) times. Note that there

llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h

Lines changed: 70 additions & 11 deletions
@@ -228,6 +228,13 @@ class Type;
      return getNoWrapFlags(FlagNW) != FlagAnyWrap;
    }
 
+    void setHasNIPtr(bool HasNIPtr) {
+      if (HasNIPtr)
+        SubclassData |= FlagHasNIPointers;
+      else
+        SubclassData &= ~FlagHasNIPointers;
+    }
+
    /// Methods for support type inquiry through isa, cast, and dyn_cast:
    static bool classof(const SCEV *S) {
      return S->getSCEVType() == scAddExpr || S->getSCEVType() == scMulExpr ||
@@ -264,26 +271,63 @@ class Type;
 
    Type *Ty;
 
+  protected:
    SCEVAddExpr(const FoldingSetNodeIDRef ID, const SCEV *const *O, size_t N)
        : SCEVCommutativeExpr(ID, scAddExpr, O, N) {
-      auto *FirstPointerTypedOp = find_if(operands(), [](const SCEV *Op) {
-        return Op->getType()->isPointerTy();
-      });
-      if (FirstPointerTypedOp != operands().end())
-        Ty = (*FirstPointerTypedOp)->getType();
-      else
-        Ty = getOperand(0)->getType();
+
    }
 
  public:
-    Type *getType() const { return Ty; }
+    // Returns the type of the add expression, by looking either at the last operand
+    // or deferring to the SCEVAddNIExpr subclass.
+    Type *getType() const;
 
    /// Methods for support type inquiry through isa, cast, and dyn_cast:
    static bool classof(const SCEV *S) {
      return S->getSCEVType() == scAddExpr;
    }
  };
 
+  /// This node represents an addition of some number of SCEVs, one which
+  /// is a non-integral pointer type, requiring us to know the type exactly for
+  /// correctness.
+  class SCEVAddNIExpr : public SCEVAddExpr {
+    friend class ScalarEvolution;
+    PointerType *NIType;
+
+    SCEVAddNIExpr(const FoldingSetNodeIDRef ID, const SCEV *const *O, size_t N,
+                  PointerType *NIType)
+        : SCEVAddExpr(ID, O, N), NIType(NIType) {
+      SubclassData |= FlagHasNIPointers;
+    }
+
+  public:
+    Type *getType() const { return NIType; }
+
+    /// Methods for support type inquiry through isa, cast, and dyn_cast:
+    static bool classof(const SCEV *S) {
+      return S->getSCEVType() == scAddExpr && S->hasNonIntegralPointers();
+    }
+  };
+
+  inline Type *SCEVAddExpr::getType() const {
+    // In general, use the type of the last operand, which is likely to be a
+    // pointer type, if there is one. This doesn't usually matter, but it can
+    // help reduce casts when the expressions are expanded. In the (unusual)
+    // case that we're working with non-integral pointers, we have a subclass
+    // that stores that type explicitly.
+    if (hasNonIntegralPointers())
+      return cast<SCEVAddNIExpr>(this)->getType();
+
+    auto *FirstPointerTypedOp = find_if(operands(), [](const SCEV *Op) {
+      return Op->getType()->isPointerTy();
+    });
+    if (FirstPointerTypedOp != operands().end())
+      return (*FirstPointerTypedOp)->getType();
+    else
+      return getOperand(0)->getType();
+  }
+
  /// This node represents multiplication of some number of SCEVs.
  class SCEVMulExpr : public SCEVCommutativeExpr {
    friend class ScalarEvolution;
@@ -293,6 +337,18 @@ class Type;
        : SCEVCommutativeExpr(ID, scMulExpr, O, N) {}
 
  public:
+    Type *getType() const {
+      // In general, we can't form SCEVMulExprs with non-integral pointer types,
+      // but for the moment we need to allow a special case: Multiplying by
+      // -1 to be able express the difference between two pointers. In order
+      // to maintain the invariant that SCEVs with the NI flag set should have
+      // a type corresponding to the contained NI ptr, we need to return the
+      // type of the pointer here.
+      if (hasNonIntegralPointers())
+        return getOperand(getNumOperands() - 1)->getType();
+      return SCEVCommutativeExpr::getType();
+    }
+
    /// Methods for support type inquiry through isa, cast, and dyn_cast:
    static bool classof(const SCEV *S) {
      return S->getSCEVType() == scMulExpr;
@@ -531,9 +587,12 @@ class Type;
    /// instances owned by a ScalarEvolution.
    SCEVUnknown *Next;
 
-    SCEVUnknown(const FoldingSetNodeIDRef ID, Value *V,
-                ScalarEvolution *se, SCEVUnknown *next) :
-      SCEV(ID, scUnknown, 1), CallbackVH(V), SE(se), Next(next) {}
+    SCEVUnknown(const FoldingSetNodeIDRef ID, Value *V, ScalarEvolution *se,
+                SCEVUnknown *next, bool ValueIsNIPtr)
+        : SCEV(ID, scUnknown, 1), CallbackVH(V), SE(se), Next(next) {
+      if (ValueIsNIPtr)
+        SubclassData |= FlagHasNIPointers;
+    }
 
    // Implement CallbackVH.
    void deleted() override;

llvm/lib/Analysis/ScalarEvolution.cpp

Lines changed: 32 additions & 9 deletions
@@ -389,12 +389,13 @@ Type *SCEV::getType() const {
  case scSignExtend:
    return cast<SCEVCastExpr>(this)->getType();
  case scAddRecExpr:
-  case scMulExpr:
  case scUMaxExpr:
  case scSMaxExpr:
  case scUMinExpr:
  case scSMinExpr:
    return cast<SCEVNAryExpr>(this)->getType();
+  case scMulExpr:
+    return cast<SCEVMulExpr>(this)->getType();
  case scAddExpr:
    return cast<SCEVAddExpr>(this)->getType();
  case scUDivExpr:
@@ -2679,16 +2680,27 @@ ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
                                    SCEV::NoWrapFlags Flags) {
  FoldingSetNodeID ID;
  ID.AddInteger(scAddExpr);
-  for (const SCEV *Op : Ops)
-    ID.AddPointer(Op);
+  bool HasNIPtr = false;
+  PointerType *NIPtrType = nullptr;
+  for (unsigned i = 0, e = Ops.size(); i != e; ++i) {
+    ID.AddPointer(Ops[i]);
+    if (Ops[i]->hasNonIntegralPointers()) {
+      HasNIPtr = true;
+      NIPtrType = cast<PointerType>(Ops[i]->getType());
+    }
+  }
  void *IP = nullptr;
  SCEVAddExpr *S =
      static_cast<SCEVAddExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
  if (!S) {
    const SCEV **O = SCEVAllocator.Allocate<const SCEV *>(Ops.size());
    std::uninitialized_copy(Ops.begin(), Ops.end(), O);
-    S = new (SCEVAllocator)
-        SCEVAddExpr(ID.Intern(SCEVAllocator), O, Ops.size());
+    if (HasNIPtr)
+      S = new (SCEVAllocator)
+          SCEVAddNIExpr(ID.Intern(SCEVAllocator), O, Ops.size(), NIPtrType);
+    else
+      S = new (SCEVAllocator)
+          SCEVAddExpr(ID.Intern(SCEVAllocator), O, Ops.size());
    UniqueSCEVs.InsertNode(S, IP);
    addToLoopUseLists(S);
  }
@@ -2701,8 +2713,10 @@ ScalarEvolution::getOrCreateAddRecExpr(ArrayRef<const SCEV *> Ops,
                                       const Loop *L, SCEV::NoWrapFlags Flags) {
  FoldingSetNodeID ID;
  ID.AddInteger(scAddRecExpr);
-  for (unsigned i = 0, e = Ops.size(); i != e; ++i)
+  for (unsigned i = 0, e = Ops.size(); i != e; ++i) {
+    assert(i == 0 || !Ops[i]->hasNonIntegralPointers());
    ID.AddPointer(Ops[i]);
+  }
  ID.AddPointer(L);
  void *IP = nullptr;
  SCEVAddRecExpr *S =
@@ -2716,6 +2730,7 @@ ScalarEvolution::getOrCreateAddRecExpr(ArrayRef<const SCEV *> Ops,
    addToLoopUseLists(S);
  }
  setNoWrapFlags(S, Flags);
+  S->setHasNIPtr(Ops[0]->hasNonIntegralPointers());
  return S;
 }
 
@@ -2724,8 +2739,11 @@ ScalarEvolution::getOrCreateMulExpr(ArrayRef<const SCEV *> Ops,
                                    SCEV::NoWrapFlags Flags) {
  FoldingSetNodeID ID;
  ID.AddInteger(scMulExpr);
-  for (unsigned i = 0, e = Ops.size(); i != e; ++i)
+  bool HasNIPtr = false;
+  for (unsigned i = 0, e = Ops.size(); i != e; ++i) {
+    HasNIPtr |= Ops[i]->hasNonIntegralPointers();
    ID.AddPointer(Ops[i]);
+  }
  void *IP = nullptr;
  SCEVMulExpr *S =
      static_cast<SCEVMulExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -2738,6 +2756,7 @@ ScalarEvolution::getOrCreateMulExpr(ArrayRef<const SCEV *> Ops,
    addToLoopUseLists(S);
  }
  S->setNoWrapFlags(Flags);
+  S->setHasNIPtr(HasNIPtr);
  return S;
 }
 
@@ -3615,8 +3634,11 @@ const SCEV *ScalarEvolution::getMinMaxExpr(SCEVTypes Kind,
    return ExistingSCEV;
  const SCEV **O = SCEVAllocator.Allocate<const SCEV *>(Ops.size());
  std::uninitialized_copy(Ops.begin(), Ops.end(), O);
-  SCEV *S = new (SCEVAllocator)
+  SCEVMinMaxExpr *S = new (SCEVAllocator)
      SCEVMinMaxExpr(ID.Intern(SCEVAllocator), Kind, O, Ops.size());
+  // For MinMaxExprs it's sufficient to see if the first Op has NI data, as the
+  // operands all need to be of the same type.
+  S->setHasNIPtr(Ops[0]->hasNonIntegralPointers());
 
  UniqueSCEVs.InsertNode(S, IP);
  addToLoopUseLists(S);
@@ -3716,8 +3738,9 @@ const SCEV *ScalarEvolution::getUnknown(Value *V) {
           "Stale SCEVUnknown in uniquing map!");
    return S;
  }
+  bool ValueIsNIPtr = getDataLayout().isNonIntegralPointerType(V->getType());
  SCEV *S = new (SCEVAllocator) SCEVUnknown(ID.Intern(SCEVAllocator), V, this,
-                                            FirstUnknown);
+                                            FirstUnknown, ValueIsNIPtr);
  FirstUnknown = cast<SCEVUnknown>(S);
  UniqueSCEVs.InsertNode(S, IP);
  return S;

llvm/test/Transforms/LoopStrengthReduce/nonintegral.ll

Lines changed: 34 additions & 1 deletion
@@ -2,7 +2,7 @@
 
 ; Address Space 10 is non-integral. The optimizer is not allowed to use
 ; ptrtoint/inttoptr instructions. Make sure that this doesn't happen
-target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:10:11:12"
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:10:11:12:13"
 target triple = "x86_64-unknown-linux-gnu"
 
 define void @japi1__unsafe_getindex_65028(i64 addrspace(10)* %arg) {
@@ -43,3 +43,36 @@ if38: ; preds = %L119
 done: ; preds = %if38
   ret void
 }
+
+; This is a bugpoint-reduced regression test - It doesn't make too much sense by itself,
+; but creates the correct SCEV expressions to reproduce the issue. See
+; https://github.com/JuliaLang/julia/issues/31156 for the original bug report.
+define void @"japi1_permutedims!_4259"(i64 %a, i64 %b, i64 %c, i64 %d, i64 %e, i64 %f, i1 %g, i8 addrspace(13)* %base) #0 {
+; CHECK-NOT: inttoptr
+; CHECK-NOT: ptrtoint
+; CHECK: getelementptr i8, i8 addrspace(13)* {{.*}}, i64 {{.*}}
+top:
+  br label %L42.L46_crit_edge.us
+
+L42.L46_crit_edge.us: ; preds = %L82.us.us.loopexit, %top
+  %value_phi11.us = phi i64 [ %a, %top ], [ %2, %L82.us.us.loopexit ]
+  %0 = sub i64 %value_phi11.us, %b
+  %1 = add i64 %0, %c
+  %spec.select = select i1 %g, i64 %d, i64 0
+  br label %L62.us.us
+
+L82.us.us.loopexit: ; preds = %L62.us.us
+  %2 = add i64 %e, %value_phi11.us
+  br label %L42.L46_crit_edge.us
+
+L62.us.us: ; preds = %L62.us.us, %L42.L46_crit_edge.us
+  %value_phi21.us.us = phi i64 [ %6, %L62.us.us ], [ %spec.select, %L42.L46_crit_edge.us ]
+  %3 = add i64 %1, %value_phi21.us.us
+  %4 = getelementptr inbounds i8, i8 addrspace(13)* %base, i64 %3
+  %5 = load i8, i8 addrspace(13)* %4, align 1
+  %6 = add i64 %f, %value_phi21.us.us
+  br i1 %g, label %L82.us.us.loopexit, label %L62.us.us, !llvm.loop !1
+}
+
+!1 = distinct !{!1, !2}
+!2 = !{!"llvm.loop.isvectorized", i32 1}
