[mlir][tensor] Add a PadOp::FoldReifiedShape canonicalization #145732

Draft · nicolasvasilache wants to merge 1 commit into main from users/nico/pad-reify-canonicalize

Conversation

@nicolasvasilache (Contributor) commented Jun 25, 2025

…and add a PadOp::FoldReifiedShape canonicalization

@nicolasvasilache force-pushed the users/nico/pad-reify-canonicalize branch 2 times, most recently from 59ca35d to 0862760 on June 25, 2025 16:04
@nicolasvasilache marked this pull request as ready for review on June 25, 2025 16:06
@llvmbot (Member) commented Jun 25, 2025

@llvm/pr-subscribers-mlir

Author: Nicolas Vasilache (nicolasvasilache)

Changes

…and add a PadOp::FoldReifiedShape canonicalization


Full diff: https://github.com/llvm/llvm-project/pull/145732.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td (+1)
  • (modified) mlir/include/mlir/Dialect/Utils/StaticValueUtils.h (+3)
  • (modified) mlir/lib/Dialect/Tensor/IR/TensorOps.cpp (+67-1)
  • (modified) mlir/test/Dialect/Tensor/canonicalize.mlir (+18)
diff --git a/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td b/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
index 35d0b16628417..821384eb7d15a 100644
--- a/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
+++ b/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
@@ -1256,6 +1256,7 @@ def Tensor_CollapseShapeOp : Tensor_ReassociativeReshapeOp<"collapse_shape"> {
 
 def Tensor_PadOp : Tensor_Op<"pad", [
     DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>,
+     DeclareOpInterfaceMethods<ReifyRankedShapedTypeOpInterface>,
     AttrSizedOperandSegments,
     Pure,
     SingleBlockImplicitTerminator<"mlir::tensor::YieldOp">]> {
diff --git a/mlir/include/mlir/Dialect/Utils/StaticValueUtils.h b/mlir/include/mlir/Dialect/Utils/StaticValueUtils.h
index 77c376fb9973a..c66110f6915e9 100644
--- a/mlir/include/mlir/Dialect/Utils/StaticValueUtils.h
+++ b/mlir/include/mlir/Dialect/Utils/StaticValueUtils.h
@@ -98,6 +98,9 @@ OpFoldResult getAsOpFoldResult(Value val);
 SmallVector<OpFoldResult> getAsOpFoldResult(ValueRange values);
 /// Convert `arrayAttr` to a vector of OpFoldResult.
 SmallVector<OpFoldResult> getAsOpFoldResult(ArrayAttr arrayAttr);
+// TODO: implement a mixed form of this and deprecate getMixedPadImpl.
+// SmallVector<OpFoldResult> getAsOpFoldResult(ArrayAttr arrayAttr, ValueRange
+// values);
 
 /// Convert int64_t to integer attributes of index type and return them as
 /// OpFoldResult.
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index 72144ec71c5d2..95468caa87f18 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "mlir/Dialect/Affine/IR/AffineOps.h"
+#include "mlir/Dialect/Affine/Utils.h"
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/Arith/Utils/Utils.h"
 #include "mlir/Dialect/Complex/IR/Complex.h"
@@ -3791,13 +3792,78 @@ struct FoldConsecutiveConstantPadding : public OpRewritePattern<tensor::PadOp> {
   }
 };
 
+struct FoldReifiedShape : public OpRewritePattern<tensor::PadOp> {
+  using OpRewritePattern<tensor::PadOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(tensor::PadOp padOp,
+                                PatternRewriter &rewriter) const override {
+    if (padOp.getNofold()) {
+      return rewriter.notifyMatchFailure(padOp, "skipping unfoldable pad");
+    }
+
+    ReifiedRankedShapedTypeDims reifiedResultShapes;
+    if (failed(reifyResultShapes(rewriter, padOp, reifiedResultShapes)))
+      return failure();
+
+    SmallVector<int64_t> newShape;
+    for (const auto &[s, ofr] : llvm::zip_equal(
+             padOp.getResultType().getShape(), reifiedResultShapes.front())) {
+      std::optional<int64_t> maybeCst = getConstantIntValue(ofr);
+      // Reification does not add static information, just use existing shape.
+      if (!maybeCst.has_value()) {
+        newShape.push_back(s);
+        continue;
+      }
+      int64_t cst = *maybeCst;
+      assert((ShapedType::isDynamic(s) || s == cst) && "constants must agree!");
+      newShape.push_back(cst);
+    }
+    if (newShape == padOp.getResultType().getShape())
+      return failure();
+
+    Type oldType = padOp.getResultType();
+    Type newType =
+        RankedTensorType::Builder(padOp.getResultType()).setShape(newShape);
+    Location loc = padOp->getLoc();
+    Operation *newPad = rewriter.clone(*padOp);
+    newPad->getResult(0).setType(newType);
+    rewriter.replaceOpWithNewOp<tensor::CastOp>(padOp, oldType,
+                                                newPad->getResult(0));
+    return success();
+  }
+};
+
 } // namespace
 
+LogicalResult
+PadOp::reifyResultShapes(OpBuilder &b,
+                         ReifiedRankedShapedTypeDims &reifiedReturnShapes) {
+  reifiedReturnShapes.resize(1, SmallVector<OpFoldResult>(getType().getRank()));
+  SmallVector<OpFoldResult> lp = getMixedLowPad();
+  SmallVector<OpFoldResult> hp = getMixedHighPad();
+  for (int64_t i = 0; i < getResultType().getRank(); ++i) {
+    if (!getType().isDynamicDim(i)) {
+      reifiedReturnShapes[0][i] = b.getIndexAttr(getType().getDimSize(i));
+      continue;
+    }
+    Location loc = getLoc();
+    Value dim = b.createOrFold<tensor::DimOp>(
+        loc, getSource(), b.create<arith::ConstantIndexOp>(loc, i));
+
+    affine::AffineBuilder ab(b, loc);
+    AffineExpr d0, d1, d2;
+    bindDims(b.getContext(), d0, d1, d2);
+    reifiedReturnShapes[0][i] = affine::makeComposedFoldedAffineApply(
+        b, loc, {d0 + d1 + d2}, {dim, lp[i], hp[i]});
+  }
+  return success();
+}
+
 void PadOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                         MLIRContext *context) {
   results.add<FoldStaticZeroPadding, FoldSourceTensorCast, FoldTargetTensorCast,
               FoldOrthogonalPaddings, FoldStaticPadding,
-              FoldConsecutiveConstantPadding>(context);
+              FoldConsecutiveConstantPadding, FoldReifiedShape>(context);
 }
 
 /// Return the padding value of the PadOp if it constant. In this context,
diff --git a/mlir/test/Dialect/Tensor/canonicalize.mlir b/mlir/test/Dialect/Tensor/canonicalize.mlir
index 3251c5a4a2bfd..358c1c214a3b1 100644
--- a/mlir/test/Dialect/Tensor/canonicalize.mlir
+++ b/mlir/test/Dialect/Tensor/canonicalize.mlir
@@ -2543,3 +2543,21 @@ func.func @partial_sink_expand_of_cast(%arg0 : tensor<10x10xf32>, %arg1 : index,
 //       CHECK:   %[[RES:.+]] = tensor.cast %[[EXPAND]]
 //  CHECK-SAME:     tensor<?x?x10xf32> to tensor<?x?x?xf32>
 //       CHECK:   return %[[RES]]
+
+// -----
+
+// CHECK-LABEL:  func.func @pad_reification
+func.func @pad_reification(%cst : f32, %idx : index, %t: tensor<64x?x64xf32>)
+    -> tensor<1x?x64xf32> {
+  %pad_amt = affine.apply affine_map<(d0) -> (-d0 + 256)>(%idx)
+  %es = tensor.extract_slice %t[0, 0, 0] [1, %idx, 64] [1, 1, 1] : tensor<64x?x64xf32> to tensor<1x?x64xf32>
+
+//       CHECK: tensor.pad
+//       CHECK:   : tensor<1x?x64xf32> to tensor<1x256x64xf32>
+  %padded = tensor.pad %es low[0, 0, 0] high[0, %pad_amt, 0] {
+  ^bb0(%a: index, %b: index, %c: index):
+    tensor.yield %cst : f32
+  } : tensor<1x?x64xf32> to tensor<1x?x64xf32>
+
+  return %padded : tensor<1x?x64xf32>
+}

@llvmbot (Member) commented Jun 25, 2025

@llvm/pr-subscribers-mlir-tensor

(Same changes summary and full diff as in the previous comment.)

@fabianmcg (Contributor) left a comment

LGTM. Please update the commit message.

@fabianmcg changed the title from "[mlir][tensor] Make tensor::PadOp a ReifyRankedShapedTypeOpInterface …" to "[mlir][tensor] Make tensor::PadOp a ReifyRankedShapedTypeOpInterface and add a PadOp::FoldReifiedShape canonicalization" on Jun 25, 2025
@fabianmcg (Contributor) left a comment

See comment in FoldReifiedShape:

 void PadOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                         MLIRContext *context) {
   results.add<FoldStaticZeroPadding, FoldSourceTensorCast, FoldTargetTensorCast,
               FoldOrthogonalPaddings, FoldStaticPadding,
-              FoldConsecutiveConstantPadding>(context);
+              FoldConsecutiveConstantPadding, FoldReifiedShape>(context);
A Contributor commented:

Please don't add this as a canonicalization. I think we decided that we could just use dim-op folding patterns as and when needed, instead of running it as a canonicalization without control. I think this pattern is unnecessary, really.

A Contributor commented:

If I'm not mistaken (I might be missing something), the test added by Nico is not covered by any upstream transforms, passes, or canonicalizers.

A Contributor commented:

Ok, --linalg-fold-unit-extent-dims --canonicalize without the folder is getting there. So I might be missing another transform.

A Contributor commented:

Ok no, --linalg-fold-unit-extent-dims --canonicalize doesn't work in all cases. I can't find a transform that simplifies this without Nico's canonicalization pattern.

A Contributor commented:

Try --resolve-ranked-shaped-type-result-dims.

A Contributor commented:

I also tried that one and it doesn't do the trick.

A Contributor commented:

Tomorrow I'll take a look at whether I can augment the pass to get the desired behavior, because I agree this doesn't really fit as a canonicalization. The main thing we want from this transform is that, in some circumstances, tiling followed by this rewrite yields static shapes; this is shown in the test (see the sketch below).
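For reference, a rough sketch of how the added test becomes static (an informal reading of the pattern using the names from the test, not an exact trace):

    size of result dim 1 = dim(%es, 1) + low[1] + high[1]
                         = %idx        + 0      + (-%idx + 256)
                         = 256

reifyResultShapes builds this sum with makeComposedFoldedAffineApply, which composes the affine.apply defining %pad_amt into the expression and folds it to the constant 256; FoldReifiedShape can then tighten the pad's result type from tensor<1x?x64xf32> to tensor<1x256x64xf32>, with a tensor.cast back to the original type.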

A Member commented:

@MaheshRavishankar is there any written discussion about that decision? I'd like to better understand the needs here, and why we keep adding specialized passes, and hopefully figure out a more general upstream solution.

@nicolasvasilache (Contributor, Author) commented Jun 26, 2025

@MaheshRavishankar @matthias-springer I think we may need a longer alignment session here but TL;DR my position atm is:

  • it is not a problem to create new IR in canonicalization patterns as long as it can be cleaned up with a scope guard on failure. There is a world of difference between creating eventually unused IR that we can clean up and putting the IR in an invalid state.
  • re. what should and should not be a canonicalization: there is generally not "a canonicalization" but a multitude of normal forms whose lattice is not well defined and whose effects on "system entropy" are really unclear. However, folding dynamic into static dimensions is always fair game in my book.

If I wanted to be creative, I could slash at the APIs to make this not generate IR and just pop out the constants (I am unsure those API changes would be desired, though). However, the tensor.cast is still needed to make the types agree and has to propagate forward, so this can never become a true folder.
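Concretely, on the test case from the diff, the rewrite produced by FoldReifiedShape looks roughly like this (a sketch; %new and %cast are illustrative names, not code from the PR):

    // Before: the pad's result type is dynamic in dim 1.
    %padded = tensor.pad %es low[0, 0, 0] high[0, %pad_amt, 0] {
    ^bb0(%a: index, %b: index, %c: index):
      tensor.yield %cst : f32
    } : tensor<1x?x64xf32> to tensor<1x?x64xf32>

    // After: the cloned pad carries the refined static type, and a cast
    // restores the original type for existing users.
    %new = tensor.pad %es low[0, 0, 0] high[0, %pad_amt, 0] {
    ^bb0(%a: index, %b: index, %c: index):
      tensor.yield %cst : f32
    } : tensor<1x?x64xf32> to tensor<1x256x64xf32>
    %cast = tensor.cast %new : tensor<1x256x64xf32> to tensor<1x?x64xf32>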

Taking the extreme opposite view, I would ask: why is a tensor/memref.CastOp ever considered a canonicalization?
The answer would be the same: more static information => lower entropy in the IR.

In general, looking at these issues more deeply after 2y, I like how the reification APIs and interfaces have evolved.
But I find there is a big gap between their existence and their use.
I think we are missing a lot of opportunities for rewrites that discover static information in the IR and turn ? into constants.

This is all "modulo compiler cost"; I did not go as far as making ValueBoundsOpInterface part of canonicalization, but I would claim we should have an "expensive canonicalization" or something similar, rather than having to discover a set of transforms that need to be applied together for the purpose of finding more static information.

To further supplement my point, I'll claim that a lot of the static behavior that we enjoy today comes from makeComposedFoldedAffineApply + DimOp/reifyStuff folding.
The first is applied pervasively; the second is not, and I'd argue we should make it much more pervasive.

Anyway, I am not trying to start an RFC in a PR or an ODM, but general agreement on whether "making shapes more static" is a canonicalization and whether "creating IR that can be reverted" is acceptable would be useful principles to settle.

@joker-eph @rengolin @ftynse for their take too.

@ftynse (Member) commented Jun 26, 2025

I would say @llvm/mlir-area-team if we are at this point.

@nicolasvasilache (Contributor, Author) commented:

Just listened to https://www.youtube.com/watch?v=929JVLlbtRU. I had lost context on the non-determinism aspect; that's a nasty one indeed!

@nicolasvasilache (Contributor, Author) commented:

Extracted out #145867 for the immediate functionality; this one can be put on the back burner.

@nicolasvasilache marked this pull request as draft on June 26, 2025 10:20
@ftynse (Member) commented Jun 26, 2025

> I had lost context on the non-determinism aspect .. that's a nasty one indeed!

Don't recall all the details, but normally we shouldn't have non-deterministic behavior in the compiler. Is this due to something like the map of dialects or ops being a hashmap or something like that?

@ftynse (Member) commented Jun 26, 2025

I generally think the canonical form in MLIR is moot, given that we can add arbitrary operations and run transformations in vastly different paradigms that may have different pre-requirements (a trivial example: LICM/CSE/GVN were harming affine transformations so much that Polly had to undo them, https://dl.acm.org/doi/abs/10.1145/3168815, even though these are generally considered "useful simplifications" for regular SSA; an upstream example: we push constants and some operations back into gpu kernels to undo CSE). We should rather move to a set of composable "normal forms" with well-defined, programmatically verifiable invariants, where passes can declare which of the "normal forms" they expect and produce.

> Anyway, not trying to start an RFC in a PR or an ODM but a discussion on "is making shapes more static" a canonicalization and "creating IR that can be reverted" would be useful principles to have a general agreement on.

I think this is precisely a discussion that should happen at an ODM. This is about design, and we need high-bandwidth discussion. ODM is not a seminar where people present finished work, or rather, it should be that. It is also worth surfacing this discussion on the forum; there was a related topic recently. I feel like I've written the text from above at least twice in the past month or so.

@nicolasvasilache (Contributor, Author) commented:

> I think this is precisely a discussion that should happen at an ODM. This is about design, and we need high-bandwidth discussion. ODM is not a seminar where people present finished work, or rather, it should be that. It is also worth surfacing this discussion to the forum, there was a related topic recently. I feel like I've written the text from above at least twice in the past month or so.

Yeah, happy to discuss this in an informal ODM; what I meant is that I am not trying to litigate this as an RFC or ODM in this PR :p

@MaheshRavishankar (Contributor) commented:

> @MaheshRavishankar @matthias-springer I think we may need a longer alignment session here but TL;DR my position atm is:

I think we have had multiple of these sessions. Happy to chat offline to give context, but TL;DR: MLIR canonicalizations have become too much of a kitchen sink and we should avoid that. What we need are fixed-point iterations that do a specific transformation of the program. For example, tensor.cast resolution is a well-defined fixed point (unfortunately it is today bundled into the canonicalizer and should be removed, IMO). tensor.dim resolution is a well-defined fixed point, done with populateResolveRankedShapedTypeResultDimsPatterns. We should be exposing fixed points (and potentially allow combinations of fixed points for things that work together). Bundling that into a canonicalization that allows no control over what runs is a huge flaw that, in my view, we should work ourselves out of.

In this particular case, though, I think what you are trying to do during dim resolution is cast folding. Can you say why these patterns (https://github.com/llvm/llvm-project/pull/145732/files#diff-025c342b2a76f2edaae5557ff3e81b0b9ef9f47107359e78d85654581318d154R3864) don't already do this? Instead of fixing that, this seems to be adding another pattern. I am more curious as to what the existing cast folding misses that made you add it during dim resolution.
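For context, a minimal sketch of what running that fixed point explicitly could look like (an illustration, not code from the PR; the header path and driver name are from memory and may differ across LLVM revisions, e.g. the driver was called applyPatternsAndFoldGreedily in older trees):

    #include "mlir/Dialect/MemRef/Transforms/Transforms.h"
    #include "mlir/IR/PatternMatch.h"
    #include "mlir/Transforms/GreedyPatternRewriteDriver.h"

    // Run tensor.dim / memref.dim resolution to a fixed point, instead of
    // bundling it into the canonicalizer.
    static mlir::LogicalResult resolveResultDims(mlir::Operation *root) {
      mlir::RewritePatternSet patterns(root->getContext());
      // Rewrites dim ops of producers implementing
      // ReifyRankedShapedTypeOpInterface in terms of their reified shapes.
      mlir::memref::populateResolveRankedShapedTypeResultDimsPatterns(patterns);
      return mlir::applyPatternsGreedily(root, std::move(patterns));
    }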

@nicolasvasilache force-pushed the users/nico/pad-reify-canonicalize branch from 0862760 to eb33654 on June 27, 2025 07:06
@nicolasvasilache changed the title from "[mlir][tensor] Make tensor::PadOp a ReifyRankedShapedTypeOpInterface and add a PadOp::FoldReifiedShape canonicalization" to "[mlir][tensor] Add a PadOp::FoldReifiedShape canonicalization" on Jun 27, 2025
@nicolasvasilache (Contributor, Author) commented:

> Bundling that in a canonicalization that allows no control of what is run is a huge flaw that in my view we should work ourselves out of.

Yet in #145927 the proposed path forward is to add a canonicalization pattern :)

@nicolasvasilache (Contributor, Author) commented Jun 27, 2025

> In this particular case though I think what you are trying to do during dim resolution is cast folding. Can you say why https://github.com/llvm/llvm-project/pull/145732/files#diff-025c342b2a76f2edaae5557ff3e81b0b9ef9f47107359e78d85654581318d154R3864 these patterns dont already do this. Instead of fixing that, this seems to be adding another pattern. I am more curious as to what the existing cast folding misses that you needed to add it during dim resolution.

Those patterns are based on PadOp::inferResultType; that is pretty old code I wrote back when PadOp was still in Linalg (here is where the history stops without deeper spelunking). inferResultType can only handle static low/high pads, as can be seen in its signature:

RankedTensorType PadOp::inferResultType(RankedTensorType sourceType,
                                        ArrayRef<int64_t> staticLow,
                                        ArrayRef<int64_t> staticHigh,
                                        ArrayRef<int64_t> resultShape) {

Fast forward a few years: we now have both ReifyRankedShapedTypeOpInterface and makeComposedFoldedAffineApply operating on OpFoldResult.

Combining the two is strictly more powerful and more general.

Additionally, #145927 further generalizes this to ReifyRankedShapedTypeOpInterface.

@ftynse (Member) commented Jun 27, 2025

> I think we have had multiple of these sessions. Happy to chat offline to give context, but TL;DR is MLIR canonicalizations have become too much of a kitchen sink and we should avoid. What we need are fixed point iterations that do a specific transformation of the program. For example, the tensor.cast resolution is a well defined fixed point (unfortunately this is today bundled into canonicalizer and should be removed IMO). tensor.dim resolution is a well defined fixed point which is done with populateResolveRankedShapedTypeResultDimsPatterns.

Yeah, what I'd like is to have the outcomes of these discussions documented and saved somewhere for the broader community. It may feel like more work, but it is actually less when scaled over time and people: not having to re-discover what others have already seen and partly resolved.

On the technical concept, I think we have strong alignment: specific properties of the IR are desired over a generic "canonicalization". This seems to be the dominant viewpoint, by the way; just nobody took the time to share it.
