
WIP - FEAT - Quantile Huber & Progressive Smoothing #312


Draft: wants to merge 7 commits into main

Conversation

floriankozikowski
Contributor

Context of the PR

This PR implements a smooth quantile regression estimator using a Huberized loss with progressive smoothing. The goal is to provide a faster alternative to scikit-learn's QuantileRegressor while maintaining similar accuracy.
(closes #276)
(It also aims to simplify the earlier approach taken in PR #306.)
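For intuition, one common way to Huberize the pinball loss is to replace its kink at zero with a quadratic region of half-width delta. The sketch below illustrates that idea only; it is not necessarily the exact formula implemented in this PR:

```python
def quantile_huber_sample(residual, quantile, delta):
    """Smoothed pinball loss for one residual: quadratic near 0, linear tails.

    As delta -> 0 this converges to the pinball loss
    quantile * max(residual, 0) + (1 - quantile) * max(-residual, 0).
    """
    if residual >= delta:
        return quantile * (residual - delta / 2)
    elif residual <= -delta:
        return (1 - quantile) * (-residual - delta / 2)
    elif residual >= 0:
        return quantile * residual ** 2 / (2 * delta)
    else:
        return (1 - quantile) * residual ** 2 / (2 * delta)
```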

Contributions of the PR

  • Added QuantileHuber loss in skglm/experimental/quantile_huber.py

  • Added SmoothQuantileRegressor class in skglm/experimental/smooth_quantile_regressor.py (see the usage sketch after this list):
      • Uses the FISTA solver with L1 regularization
      • Implements progressive smoothing from delta_init to delta_final
      • Includes intercept updates using gradient steps

  • Added an example in examples/plot_smooth_quantile.py
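A minimal usage sketch (the constructor arguments below are assumptions based on this description, not necessarily the final API):

```python
import numpy as np
from sklearn.datasets import make_regression
from skglm.experimental.smooth_quantile_regressor import SmoothQuantileRegressor

X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=0)

# hypothetical signature: quantile level, L1 strength, and the smoothing schedule
reg = SmoothQuantileRegressor(quantile=0.8, alpha=0.1, delta_init=1.0, delta_final=1e-3)
reg.fit(X, y)
y_pred = reg.predict(X)

# roughly a `quantile` fraction of targets should lie at or below the prediction
print(np.mean(y <= y_pred))  # should be close to 0.8 in-sample
```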

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

@floriankozikowski changed the title from "first try at simple quantile huber" to "WIP - FEAT - Quantile Huber & Progressive Smoothing" on May 23, 2025
return np.mean(residuals * (quantile - (residuals < 0)))


def create_data(n_samples=1000, n_features=10, noise=0.1):
Collaborator:

avoid this: this is literally just wrapping make_regression
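For instance, the example could call scikit-learn directly (same parameters as the wrapper above):

```python
from sklearn.datasets import make_regression

# generate the toy data directly instead of wrapping make_regression
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)
```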

plt.tight_layout()
plt.show()


Collaborator:

no need to wrap in if __name__ == "__main__" for example plots

res += self._loss_scalar(residual)
return res / n_samples

def _loss_scalar(self, residual):
Collaborator:

loss_sample may be a clearer name

grad_j += -X[i, j] * self._grad_scalar(residual)
return grad_j / n_samples

def _grad_scalar(self, residual):
Collaborator:

having gradient_scalar and _grad_scalar is a massive risk of confusion in the future; _grad_per_sample ?

return grad_j / n_samples

def _grad_scalar(self, residual):
    """Calculate gradient for a single residual."""
Collaborator:

a single sample


def fit(self, X, y):
    """Fit using progressive smoothing: delta_init --> delta_final."""
    X, y = check_X_y(X, y)
Collaborator:

no need to check: GeneralizedLinearEstimator will do it


for i, delta in enumerate(deltas):
    datafit = QuantileHuber(quantile=self.quantile, delta=delta)
    penalty = L1(alpha=self.alpha)
Collaborator:

those can be taken out of the for loop

Collaborator:

(initialize datafit, penalty, solver and est outside of the loop; then in the loop only update the delta parameter of GLE.datafit)
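A sketch of that restructuring (warm_start on the solver is an assumption; names follow the snippet above):

```python
# build datafit, penalty, solver and estimator once, before the loop
datafit = QuantileHuber(quantile=self.quantile, delta=deltas[0])
penalty = L1(alpha=self.alpha)
est = GeneralizedLinearEstimator(datafit=datafit, penalty=penalty, solver=solver)

for delta in deltas:
    est.datafit.delta = delta  # only the smoothing parameter changes per stage
    est.fit(X, y)  # with warm_start=True, each fit starts from the previous coefficients
```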

solver=solver
)

if i > 0:
Collaborator:

this way you won't need this (if est is fixed outside the loop and uses warm_start=True)


return self

def predict(self, X):
Collaborator:

you can store est as self.est and use self.est.predict to leverage existing code
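For example (the attribute name est_ is an assumption):

```python
def predict(self, X):
    # delegate to the fitted GeneralizedLinearEstimator stored during fit
    return self.est_.predict(X)
```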

@mathurinm (Collaborator):

OK, as discussed separately: you need to implement the math computations so that the datafit works with the FISTA and AndersonCD solvers; then it should be easy to support the intercept, as these solvers rely on update_intercept_step (which is just a coordinate descent step on the intercept, with a Lipschitz constant equal to that of a feature filled with 1s).
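A rough sketch of such an intercept step, under the assumption that the datafit exposes per-sample gradients and curvature bounds (names are illustrative, not the actual skglm API):

```python
import numpy as np

def intercept_step(y, Xw, intercept, grad_per_sample, hess_per_sample):
    """One coordinate-descent step on the intercept.

    The intercept acts like a feature of all ones, so its Lipschitz constant
    is the mean of the per-sample curvature bounds.
    """
    n_samples = y.shape[0]
    grad = np.sum(grad_per_sample(y, Xw + intercept)) / n_samples
    # for a quantile Huber like the sketch above, each per-sample curvature
    # is at most max(quantile, 1 - quantile) / delta
    lipschitz = np.sum(hess_per_sample(y, Xw + intercept)) / n_samples
    return intercept - grad / lipschitz
```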


Successfully merging this pull request may close these issues.

PDCD_WS solver seems unstable for Pinball Loss.