ENH Add FISTA solver #91
Conversation
Thanks for the hard work @PABannier!
Below are some minor remarks.
Besides, I have one concern: I don't think it's a good idea to add FISTA support to all the datafits. We can limit ourselves to one of them, just for testing purposes.
Indeed, AndersonCD and ProxNewton are much faster for separable problems. We had better keep FISTA for particular cases (e.g. SLOPE #92).
WDYT?

@Badr-MOUFAD totally agree. FISTA would be for a subset of penalties where PN or AndersonCD are not available.
skglm/datafits/single_task.py
```python
for j in range(n_features):
    Xj = X_data[X_indptr[j]:X_indptr[j+1]]
    self.lipschitz[j] = (Xj ** 2).sum() / (len(y) * 4)
    self.global_lipschitz += (Xj ** 2).sum() / (len(y) * 4)
```
That will yield a very crude bound, potentially with a loss of the order of n_features.
Use a few iterations of the power method instead to approximate the Lipschitz constant of the sparse matrix (there's also the Lanczos iteration, but it's more complicated; let's implement the easy one first).
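A minimal sketch of the suggestion, assuming a plain power iteration on `X.T @ X`; the `power_method` name and signature below are illustrative, not the actual skglm implementation:

```python
import numpy as np
from scipy.sparse import random as sparse_random


def power_method(X, max_iter=20, tol=1e-6, random_state=0):
    """Approximate the spectral norm ||X||_2 of a (sparse) matrix X."""
    rng = np.random.default_rng(random_state)
    v = rng.standard_normal(X.shape[1])
    v /= np.linalg.norm(v)
    spectral_norm_sq = 0.
    for _ in range(max_iter):
        # one power iteration on X.T @ X, whose largest eigenvalue is ||X||_2^2
        v_new = X.T @ (X @ v)
        spectral_norm_sq_new = np.linalg.norm(v_new)
        v = v_new / spectral_norm_sq_new
        if abs(spectral_norm_sq_new - spectral_norm_sq) <= tol * spectral_norm_sq_new:
            break
        spectral_norm_sq = spectral_norm_sq_new
    return np.sqrt(spectral_norm_sq)


# e.g. for the logistic datafit, the gradient is ||X||_2^2 / (4 * n_samples)-Lipschitz
X = sparse_random(100, 50, density=0.3, format='csc', random_state=0)
global_lipschitz = power_method(X) ** 2 / (4 * X.shape[0])
```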
Co-authored-by: Badr MOUFAD <[email protected]>
I am unable to find the root cause of the problem in the unit test. Here is a small script to reproduce:

```python
import numpy as np
from scipy.sparse import random
from skglm.estimators import LinearSVC

n_samples, n_features = 20, 30
X_sparse = random(n_samples, n_features, density=0.5, format='csc', random_state=0)
y = np.ones(n_samples)

LinearSVC(C=1., tol=1e-9).fit(X_sparse, y)
```

Output (it depends):

```
Segmentation fault (core dumped)
```

or

```
corrupted size vs. prev_size
Aborted (core dumped)
```

or

```
python3: malloc.c:3852: _int_malloc: Assertion `chunk_main_arena (fwd)' failed.
Aborted (core dumped)
```

@mathurinm, @PABannier, any thoughts?
A segfault is usually thrown by Numba when it can't access something it should (e.g. missing initialization of the datafit). Have you tried setting breakpoints at various places of the code to see which line is causing the issue?
Yes absolutely, I tried that. It breaks down in the initialization of the datafit, yet I can't figure out why.
Weird, that works for me on this branch. Can you try reinstalling?
I found the bug. Thanks @PABannier for your help!
Wow, good catch @Badr-MOUFAD! For a more robust design, replace n_samples by n_rows_X and pass X.shape[0] explicitly.
Thanks @PABannier and @Badr-MOUFAD
Closes #89

A few points to discuss:

- Currently the FISTA solver uses Gram updates (as per Gram-based CD/BCD/FISTA solvers for (group)Lasso when n_samples >> n_features #4). Question: do we want to keep it this way? Implement it without the Gram update? Or have two options? (See the first sketch below.)
- For non-coordinate-wise updates, we run into the issue of not having a `prox_vec` method in the `BasePenalty` class. If we want to support a larger class of penalties for FISTA (e.g. L1, WeightedL1, SLOPE, ...), we need a `prox_vec` method. (See the second sketch below.)
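To make the first point concrete, here is a rough illustration (not the PR's code) of the two gradient computations for the quadratic datafit `f(w) = ||y - X @ w||^2 / (2 * n_samples)`; with the Gram variant, `X.T @ X` is paid once and each iteration no longer touches `X`:

```python
import numpy as np


def grad_plain(X, y, w):
    # standard FISTA gradient: O(n_samples * n_features) per call
    return X.T @ (X @ w - y) / len(y)


def grad_gram(G, Xty, w, n_samples):
    # Gram variant: G = X.T @ X and Xty = X.T @ y are precomputed once,
    # each call then costs O(n_features ** 2), attractive when n_samples >> n_features
    return (G @ w - Xty) / n_samples
```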
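As for the second point, here is a sketch of what a `prox_vec` method could look like for the L1 penalty; the signature below is a guess, not the actual `BasePenalty` API. FISTA needs the proximal operator of the whole vector, which for L1 is elementwise soft-thresholding:

```python
import numpy as np


class L1:
    """Toy L1 penalty exposing a hypothetical vector-wise prox (sketch only)."""

    def __init__(self, alpha):
        self.alpha = alpha

    def prox_vec(self, x, stepsize):
        # soft-thresholding applied to the full vector at once,
        # as required by the FISTA update w = prox(z - stepsize * grad)
        return np.sign(x) * np.maximum(np.abs(x) - stepsize * self.alpha, 0.)
```

For penalties like SLOPE the prox is not separable across coordinates (it involves sorting the entries), which is exactly why a coordinate-wise prox is not enough and a vector-wise method is needed.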