Torch support part 1 #249
base: master
Conversation
…ation and readability
1. In the model initialization method:
   - Wrapped the linear layer in a Sequential model with a Sigmoid activation
   - Updated weight and bias assignment to use the first layer of the Sequential model
2. In the prediction method:
   - Compute probabilities for both classes (0 and 1)
   - Concatenate the probabilities to match sklearn's predict_proba format
   - Use the probability of the positive class (index 1) as the presentation score

These changes should resolve the `AttributeError: 'Linear' object has no attribute 'predict_proba'` by making the PyTorch model behave more like a scikit-learn classifier. Would you like me to explain any part of the changes in more detail?
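For illustration, a wrapper along these lines is one way to get that behavior. This is only a minimal sketch with an assumed class name and layout, not the PR's actual code:

```python
import torch
import torch.nn as nn

class SketchLogisticModel(nn.Module):
    """Hypothetical example: linear layer wrapped in Sequential with Sigmoid,
    plus an sklearn-style predict_proba returning both class columns."""

    def __init__(self, n_features):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(n_features, 1), nn.Sigmoid())

    def forward(self, x):
        return self.model(x)

    def predict_proba(self, x):
        with torch.no_grad():
            self.eval()
            p1 = self.forward(x)  # P(class == 1), shape (n, 1)
            # Concatenate [P(0), P(1)] to mimic sklearn's (n, 2) output
            return torch.cat([1 - p1, p1], dim=1).cpu().numpy()
```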
…patible with the existing code. Let's modify `torch_presentation_predictor.py` to add a `predict_proba` method to the Sequential model:

```python
def predict(self, *args, **kwargs):
    df = super().predict(*args, **kwargs)
    return df

def predict_proba(self, inputs):
    with torch.no_grad():
        self.eval()
        probs = self(inputs)
        return torch.cat([1 - probs, probs], dim=1).cpu().numpy()
```

This approach adds a `predict_proba` method to the Sequential model that returns probabilities for both classes, similar to scikit-learn's implementation. Modify the `torch_presentation_predictor.py` file to:

```python
class TorchPresentationPredictor(Class1PresentationPredictor):
    def predict(self, *args, **kwargs):
        df = super().predict(*args, **kwargs)
        return df

    def _get_model(self, model_name):
        model = super()._get_model(model_name)

        # Add predict_proba method to the model
        def predict_proba(inputs):
            with torch.no_grad():
                model.eval()
                probs = model(inputs)
                return torch.cat([1 - probs, probs], dim=1).cpu().numpy()

        model.predict_proba = predict_proba
        return model
```

The key changes are:

1. Override the `predict` method to call the parent class's method
2. Add a `predict_proba` method to the model in `_get_model`
3. The `predict_proba` method returns probabilities for both classes

This should resolve the `AttributeError: 'Sequential' object has no attribute 'predict_proba'` error by dynamically adding the method to the model. Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

If this doesn't work, could you share the full traceback and the contents of the `test_torch_implementations.py` file? That would help diagnose the specific issue more precisely.
…we need to modify the PyTorch presentation predictor to match the Keras implementation's prediction method. Here's a comprehensive fix for `torch_presentation_predictor.py`:

```python
class TorchPresentationPredictor(Class1PresentationPredictor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._torch_models = {}

    def get_model(self, name):
        if name not in self._torch_models:
            # Create a model without sigmoid in the final layer
            model = nn.Sequential(
                nn.Linear(len(self.model_inputs), 1)
            )

            row = self.weights_dataframe.loc[name]

            # Load weights
            state_dict = {
                '0.weight': torch.FloatTensor(row['weights'].reshape(1, -1)),
                '0.bias': torch.FloatTensor([row['intercept']])
            }
            model.load_state_dict(state_dict)

            # Move to appropriate device
            model = model.to(self.device)
            self._torch_models[name] = model
        return self._torch_models[name]

    def predict(self, df, model_name=None, *args, **kwargs):
        # Existing method implementation
        if model_name is None:
            model_name = self.default_model

        model = self.get_model(model_name)
        input_matrix = df[self.model_inputs]
        null_mask = None
        if not kwargs.get("throw", True):
            null_mask = input_matrix.isnull().any(axis=1)
            input_matrix = input_matrix.fillna(0.0)

        # Convert to PyTorch tensor
        inputs = torch.FloatTensor(input_matrix.values).to(self.device)

        # Get predictions
        with torch.no_grad():
            model.eval()
            logits = model(inputs)  # Get raw logits
            probs = torch.sigmoid(logits)  # Apply sigmoid manually

        df["presentation_score"] = probs.squeeze().cpu().numpy()
        if null_mask is not None:
            df.loc[null_mask, "presentation_score"] = np.nan

        return df
```

Key changes:

1. Removed `nn.Sigmoid()` from the model definition
2. Use `model(inputs)` to get raw logits
3. Apply `torch.sigmoid()` manually to convert logits to probabilities
4. Simplified model loading and weight assignment
5. Ensured compatibility with existing prediction method

This implementation should now match the Keras version's prediction method by:

- Using raw linear layer output
- Applying sigmoid manually
- Handling null values consistently

Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

The changes ensure that:

- The model architecture matches the Keras implementation
- Probability calculation is consistent
- Null value handling remains the same

Let me know if this resolves the issue or if you need further adjustments.
…The key issue is that we need to handle the predict_proba call correctly. Here's an updated implementation for `torch_presentation_predictor.py`:

```python
def get_model(self, name=None):
    """
    Load or instantiate a new logistic regression model in PyTorch.

    Parameters
    ----------
    name : string
        Model variant name ('with_flanks' or 'without_flanks')

    Returns
    -------
    torch.nn.Module
    """
    if name is None:
        return nn.Linear(len(self.model_inputs), 1)

    if name not in self._torch_models:
        model = nn.Linear(len(self.model_inputs), 1)
        row = self.weights_dataframe.loc[name]

        # Convert weights and bias to PyTorch tensors
        weights = torch.FloatTensor(row[self.model_inputs].values)
        bias = torch.FloatTensor([row.intercept])

        # Assign the weights
        with torch.no_grad():
            model.weight.copy_(weights.unsqueeze(0))
            model.bias.copy_(bias)

        model = model.to(self.device)
        self._torch_models[name] = model
    return self._torch_models[name]

def predict(self, *args, **kwargs):
    """
    Override predict to use PyTorch models for the final presentation
    score calculation
    """
    df = super().predict(*args, **kwargs)

    if "processing_score" in df.columns and "affinity_score" in df.columns:
        if len(df) > 0:
            model_name = 'with_flanks' if 'n_flank' in df.columns else "without_flanks"
            model = self.get_model(model_name)

            input_matrix = df[self.model_inputs]
            null_mask = None
            if not kwargs.get("throw", True):
                null_mask = input_matrix.isnull().any(axis=1)
                input_matrix = input_matrix.fillna(0.0)

            # Convert to PyTorch tensor
            inputs = torch.FloatTensor(input_matrix.values).to(self.device)

            # Get predictions
            with torch.no_grad():
                model.eval()
                logits = model(inputs)
                probs = torch.sigmoid(logits)

            df["presentation_score"] = probs.squeeze().cpu().numpy()
            if null_mask is not None:
                df.loc[null_mask, "presentation_score"] = numpy.nan

            df["presentation_percentile"] = self.percentile_ranks(
                df["presentation_score"], throw=kwargs.get("throw", True))
        else:
            df["presentation_score"] = []
            df["presentation_percentile"] = []

        del df["affinity_score"]
    return df

def predict_proba(self, X):
    """
    Predict probabilities using the PyTorch model.

    Parameters
    ----------
    X : pandas.DataFrame
        Input features

    Returns
    -------
    numpy.ndarray
        Predicted probabilities
    """
    model_name = 'with_flanks' if 'n_flank' in X.columns else "without_flanks"
    model = self.get_model(model_name)

    # Ensure only model inputs are used
    X = X[self.model_inputs]

    # Convert to PyTorch tensor
    inputs = torch.FloatTensor(X.values).to(self.device)

    # Get predictions
    with torch.no_grad():
        model.eval()
        logits = model(inputs)
        probs = torch.sigmoid(logits)

    return probs.squeeze().cpu().numpy()
```

The key changes are:

1. Added a `predict_proba` method that directly returns probabilities
2. Simplified the model loading and prediction logic
3. Ensured that only model inputs are used in predictions
4. Used sigmoid activation to convert logits to probabilities

This implementation should resolve the `predict_proba` issue and make the PyTorch implementation consistent with the Keras version. Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

Would you like me to explain any part of the implementation?
…ba` or `decision_function` method like scikit-learn models do. For PyTorch models, you'll need to modify the prediction logic to use the model's forward pass and apply a sigmoid activation. Here's a potential fix in `mhcflurry/torch_presentation_predictor.py`:

```python
import torch
import numpy as np

class TorchPresentationPredictor(Class1PresentationPredictor):
    def predict(self, *args, **kwargs):
        # Ensure the model is in evaluation mode
        self.model.eval()

        # Call the parent class's predict method
        df = super().predict(*args, **kwargs)
        return df

    def _predict_scores(self, input_matrix):
        # Convert input to PyTorch tensor
        X = torch.tensor(input_matrix.values, dtype=torch.float32)

        # Disable gradient computation
        with torch.no_grad():
            # Get model output (logits)
            logits = self.model(X)

            # Apply sigmoid to get probabilities
            scores = torch.sigmoid(logits).numpy()

        return scores
```

This modification does a few key things:

1. Sets the model to evaluation mode
2. Converts input to a PyTorch tensor
3. Disables gradient computation
4. Applies sigmoid activation to get probabilities
5. Converts back to numpy array for further processing

The key change is using `torch.sigmoid()` to convert the model's raw output to a probability between 0 and 1, which is equivalent to the `predict_proba` method in scikit-learn models. If this doesn't fully resolve the issue, could you share the specific implementation of the PyTorch model's forward method?
refactor: Conditionally compute presentation score for sklearn models
…test is expecting a column named "presentation_score", but it's not being created. Let's modify the `predict` method in the `TorchPresentationPredictor` class to ensure this column is added. Here's an updated implementation:

```python
def predict(self, *args, **kwargs):
    """
    Override predict to use PyTorch models for the final presentation
    score calculation
    """
    df = super().predict(*args, **kwargs)

    if "processing_score" in df.columns and "affinity_score" in df.columns:
        if len(df) > 0:
            model_name = 'with_flanks' if 'n_flank' in df.columns else "without_flanks"
            model = self.get_model(model_name)

            input_matrix = df[self.model_inputs]
            null_mask = None
            if not kwargs.get("throw", True):
                null_mask = input_matrix.isnull().any(axis=1)
                input_matrix = input_matrix.fillna(0.0)

            # Convert to PyTorch tensor
            inputs = torch.FloatTensor(input_matrix.values).to(self.device)

            # Get predictions
            with torch.no_grad():
                model.eval()
                logits = model(inputs)
                presentation_scores = torch.sigmoid(logits).squeeze().cpu().numpy()

            # Ensure the column is added even if it's empty
            df["presentation_score"] = presentation_scores
            if null_mask is not None:
                df.loc[null_mask, "presentation_score"] = numpy.nan

            df["presentation_percentile"] = self.percentile_ranks(
                df["presentation_score"], throw=kwargs.get("throw", True))
        else:
            # Ensure the column is added even for empty dataframes
            df["presentation_score"] = []
            df["presentation_percentile"] = []

        del df["affinity_score"]
    return df
```

Key changes:

1. Explicitly create the "presentation_score" column
2. Ensure the column is added even for empty dataframes
3. Simplified the prediction logic

Please replace the existing `predict` method in the `TorchPresentationPredictor` class with this implementation and run the test again. Would you like me to help you apply this change?
…d device property
@timodonnell There are now special torch-only tests in the GitHub Actions CI that run after torch is installed. My plan for this PR is to complete the stub tests that are trivially passing now and then see if we should merge, before moving on to make the rest of the code work with torch. The stub tests left
In particular, the `TorchClass1AffinityPredictor` in `torch_implementations.py` is missing many of the methods and internal logic that make the Keras `Class1AffinityPredictor` in `class1_affinity_predictor.py` fully featured. In the words of o1:

- The Torch version doesn't do ensemble predictions across multiple models for each allele (it only grabs the first model for that allele). By contrast, the Keras version can handle multiple allele-specific and pan-allele models and combine their predictions (usually by geometric mean); see the sketch after this comment.

After all that, I'll ask for a final review.
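For reference, a minimal sketch of what geometric-mean ensembling over several models could look like, assuming each model exposes a `predict` method returning positive affinities; `models` and `encoded_peptides` are illustrative names, not the actual mhcflurry API:

```python
import numpy as np

def ensemble_geometric_mean(models, encoded_peptides):
    """Combine per-model affinity predictions by geometric mean (illustrative only)."""
    # Stack each model's predictions into shape (n_models, n_peptides)
    predictions = np.stack(
        [model.predict(encoded_peptides) for model in models], axis=0)
    # Geometric mean = exp(mean(log(x))); predictions assumed positive (e.g. nM affinities)
    return np.exp(np.log(predictions).mean(axis=0))
```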
OK, I think I've made enough progress to warrant a review. You can get a sense of what's going on from the test files. My goal with this PR was to take the example from the readme
mhcflurry-predict --alleles HLA-A0201 HLA-A0301 --peptides SIINFEKL SIINFEKD SIINFEKQ --out predictions.csv
and to be able to run it with torch and get the same output. The test for this is in `test_predict_command.py`.
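As a rough illustration of the kind of equivalence check involved, one could compare CSVs produced by the two backends along these lines (the file names, column handling, and tolerance here are assumptions, not taken from the actual test):

```python
import numpy as np
import pandas as pd

# Hypothetical comparison of predictions produced by the Keras and torch
# backends for the same alleles and peptides (file names are assumed).
keras_df = pd.read_csv("predictions.keras.csv")
torch_df = pd.read_csv("predictions.torch.csv")

numeric_cols = keras_df.select_dtypes(include=[np.number]).columns
for col in numeric_cols:
    np.testing.assert_allclose(
        keras_df[col].values, torch_df[col].values, rtol=1e-4,
        err_msg=f"Column {col} differs between Keras and torch backends")
```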
I think you would need to have the weights.csv already downloaded for it to run, so it will probably fail if run during CI... So just pull the branch yourself and double-check. We can edit the tests to make them run via CI.
Why 200 commits? Because I did this almost entirely with aider... It was an experiment, and I have learned a lot about how incredibly naive AI code generation is if you just let it tell its own stories without constant questioning, re-questioning, and demands for more tests/debugging/logging/analysis.