Torch support part 1 #249

sergeyf · 2025-01-12T02:50:51Z

OK, I think I've made enough progress to warrant a review. You can get a sense of what's going on from the test files. My goal with this PR was to take the example from the README:

mhcflurry-predict --alleles HLA-A0201 HLA-A0301 --peptides SIINFEKL SIINFEKD SIINFEKQ --out predictions.csv

and to be able to run it with torch and get the same output. The test for this is in test_predict_command.py.

I think you need weights.csv already downloaded for the test to run, so it will probably fail in CI. For now, just pull the branch yourself and double-check locally. We can edit the tests later so they run in CI.
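One way to keep CI green until the weights are available there is to skip the test when the file is missing rather than let it fail. The sketch below assumes a hypothetical `MHCFLURRY_TEST_WEIGHTS` environment variable and a hypothetical `require_weights` helper; neither exists in this PR, and the real path depends on where the weights were downloaded.

```python
import os
import unittest

# Hypothetical location: an assumed environment variable with a local fallback.
WEIGHTS_PATH = os.environ.get("MHCFLURRY_TEST_WEIGHTS", "weights.csv")


def require_weights(path=WEIGHTS_PATH):
    """Raise SkipTest when the weights file is missing, so the run reports a skip."""
    if not os.path.exists(path):
        raise unittest.SkipTest("weights file not found: %s" % path)
    return path


def test_predict_command_with_torch():
    # Skips here when the weights have not been downloaded.
    weights = require_weights()
    # The actual Torch vs. Keras comparison of the predict command would go here.
    assert os.path.exists(weights)
```

`unittest.SkipTest` is recognized by both nose and pytest, so the guard does not tie the tests to one runner.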

Why 200 commits? Because I did this almost entirely with aider. It was an experiment, and I learned a lot about how incredibly naive AI code generation is if you just let it tell its own stories without constant questioning, re-questioning, and demands for more tests, debugging, logging, and analysis.

1. In the model initialization method:
   - Wrapped the linear layer in a Sequential model with a Sigmoid activation
   - Updated weight and bias assignment to use the first layer of the Sequential model
2. In the prediction method:
   - Compute probabilities for both classes (0 and 1)
   - Concatenate the probabilities to match sklearn's predict_proba format
   - Use the probability of the positive class (index 1) as the presentation score

These changes should resolve the `AttributeError: 'Linear' object has no attribute 'predict_proba'` by making the PyTorch model behave more like a scikit-learn classifier.

Would you like me to explain any part of the changes in more detail?
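To make the two-column format concrete, here is a minimal pure-Python sketch of what a scikit-learn-style `predict_proba` returns for a logistic model. The function names, weights, and inputs are illustrative assumptions and are not taken from the PR's code.

```python
import math


def sigmoid(z):
    """Logistic function mapping a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(rows, weights, intercept):
    """Return [P(class 0), P(class 1)] for each input row."""
    result = []
    for row in rows:
        logit = intercept + sum(w * x for w, x in zip(weights, row))
        p = sigmoid(logit)
        result.append([1.0 - p, p])
    return result


# Illustrative inputs only; real weights would come from weights.csv.
probs = predict_proba([[0.0, 0.0], [1.0, 2.0]], weights=[1.0, 0.5], intercept=0.0)

# The presentation score is the positive-class column (index 1).
presentation_scores = [row[1] for row in probs]
```

Each row sums to 1, and column 1 is the value the commit message describes using as the presentation score.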

…patible with the existing code. Let's modify the `torch_presentation_predictor.py` to add a `predict_proba` method to the Sequential model:

```python
def predict(self, *args, **kwargs):
    df = super().predict(*args, **kwargs)
    return df

def predict_proba(self, inputs):
    with torch.no_grad():
        self.eval()
        probs = self(inputs)
        return torch.cat([1 - probs, probs], dim=1).cpu().numpy()
```

This approach adds a `predict_proba` method to the Sequential model that returns probabilities for both classes, similar to scikit-learn's implementation.

Modify the `torch_presentation_predictor.py` file to:

```python
class TorchPresentationPredictor(Class1PresentationPredictor):
    def predict(self, *args, **kwargs):
        df = super().predict(*args, **kwargs)
        return df

    def _get_model(self, model_name):
        model = super()._get_model(model_name)

        # Add predict_proba method to the model
        def predict_proba(inputs):
            with torch.no_grad():
                model.eval()
                probs = model(inputs)
                return torch.cat([1 - probs, probs], dim=1).cpu().numpy()

        model.predict_proba = predict_proba
        return model
```

The key changes are:
1. Override the `predict` method to call the parent class's method
2. Add a `predict_proba` method to the model in `_get_model`
3. The `predict_proba` method returns probabilities for both classes

This should resolve the `AttributeError: 'Sequential' object has no attribute 'predict_proba'` error by dynamically adding the method to the model.

Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

If this doesn't work, could you share the full traceback and the contents of the `test_torch_implementations.py` file? That would help diagnose the specific issue more precisely.

…we need to modify the PyTorch presentation predictor to match the Keras implementation's prediction method. Here's a comprehensive fix for the `torch_presentation_predictor.py`:

```python
class TorchPresentationPredictor(Class1PresentationPredictor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._torch_models = {}

    def get_model(self, name):
        if name not in self._torch_models:
            # Create a model without sigmoid in the final layer
            model = nn.Sequential(
                nn.Linear(len(self.model_inputs), 1)
            )
            row = self.weights_dataframe.loc[name]

            # Load weights
            state_dict = {
                '0.weight': torch.FloatTensor(row['weights'].reshape(1, -1)),
                '0.bias': torch.FloatTensor([row['intercept']])
            }
            model.load_state_dict(state_dict)

            # Move to appropriate device
            model = model.to(self.device)
            self._torch_models[name] = model
        return self._torch_models[name]

    def predict(self, df, model_name=None, *args, **kwargs):
        # Existing method implementation
        if model_name is None:
            model_name = self.default_model

        model = self.get_model(model_name)
        input_matrix = df[self.model_inputs]
        null_mask = None
        if not kwargs.get("throw", True):
            null_mask = input_matrix.isnull().any(axis=1)
            input_matrix = input_matrix.fillna(0.0)

        # Convert to PyTorch tensor
        inputs = torch.FloatTensor(input_matrix.values).to(self.device)

        # Get predictions
        with torch.no_grad():
            model.eval()
            logits = model(inputs)  # Get raw logits
            probs = torch.sigmoid(logits)  # Apply sigmoid manually

        df["presentation_score"] = probs.squeeze().cpu().numpy()
        if null_mask is not None:
            df.loc[null_mask, "presentation_score"] = np.nan
        return df
```

Key changes:
1. Removed `nn.Sigmoid()` from the model definition
2. Use `model(inputs)` to get raw logits
3. Apply `torch.sigmoid()` manually to convert logits to probabilities
4. Simplified model loading and weight assignment
5. Ensured compatibility with existing prediction method

This implementation should now match the Keras version's prediction method by:
- Using raw linear layer output
- Applying sigmoid manually
- Handling null values consistently

Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

The changes ensure that:
- The model architecture matches the Keras implementation
- Probability calculation is consistent
- Null value handling remains the same

Let me know if this resolves the issue or if you need further adjustments.

…The key issue is that we need to handle the predict_proba call correctly. Here's an updated implementation for the `torch_presentation_predictor.py`:

```python
def get_model(self, name=None):
    """
    Load or instantiate a new logistic regression model in PyTorch.

    Parameters
    ----------
    name : string
        Model variant name ('with_flanks' or 'without_flanks')

    Returns
    -------
    torch.nn.Module
    """
    if name is None:
        return nn.Linear(len(self.model_inputs), 1)

    if name not in self._torch_models:
        model = nn.Linear(len(self.model_inputs), 1)
        row = self.weights_dataframe.loc[name]

        # Convert weights and bias to PyTorch tensors
        weights = torch.FloatTensor(row[self.model_inputs].values)
        bias = torch.FloatTensor([row.intercept])

        # Assign the weights
        with torch.no_grad():
            model.weight.copy_(weights.unsqueeze(0))
            model.bias.copy_(bias)

        model = model.to(self.device)
        self._torch_models[name] = model
    return self._torch_models[name]

def predict(self, *args, **kwargs):
    """
    Override predict to use PyTorch models for the final presentation
    score calculation
    """
    df = super().predict(*args, **kwargs)

    if "processing_score" in df.columns and "affinity_score" in df.columns:
        if len(df) > 0:
            model_name = 'with_flanks' if 'n_flank' in df.columns else "without_flanks"
            model = self.get_model(model_name)

            input_matrix = df[self.model_inputs]
            null_mask = None
            if not kwargs.get("throw", True):
                null_mask = input_matrix.isnull().any(axis=1)
                input_matrix = input_matrix.fillna(0.0)

            # Convert to PyTorch tensor
            inputs = torch.FloatTensor(input_matrix.values).to(self.device)

            # Get predictions
            with torch.no_grad():
                model.eval()
                logits = model(inputs)
                probs = torch.sigmoid(logits)

            df["presentation_score"] = probs.squeeze().cpu().numpy()
            if null_mask is not None:
                df.loc[null_mask, "presentation_score"] = numpy.nan
            df["presentation_percentile"] = self.percentile_ranks(
                df["presentation_score"], throw=kwargs.get("throw", True))
        else:
            df["presentation_score"] = []
            df["presentation_percentile"] = []
        del df["affinity_score"]
    return df

def predict_proba(self, X):
    """
    Predict probabilities using the PyTorch model.

    Parameters
    ----------
    X : pandas.DataFrame
        Input features

    Returns
    -------
    numpy.ndarray
        Predicted probabilities
    """
    model_name = 'with_flanks' if 'n_flank' in X.columns else "without_flanks"
    model = self.get_model(model_name)

    # Ensure only model inputs are used
    X = X[self.model_inputs]

    # Convert to PyTorch tensor
    inputs = torch.FloatTensor(X.values).to(self.device)

    # Get predictions
    with torch.no_grad():
        model.eval()
        logits = model(inputs)
        probs = torch.sigmoid(logits)

    return probs.squeeze().cpu().numpy()
```

The key changes are:
1. Added a `predict_proba` method that directly returns probabilities
2. Simplified the model loading and prediction logic
3. Ensured that only model inputs are used in predictions
4. Used sigmoid activation to convert logits to probabilities

This implementation should resolve the `predict_proba` issue and make the PyTorch implementation consistent with the Keras version.

Try running the test again:

```bash
nosetests test\test_torch_implementations.py:test_presentation_predictor_matches_keras -v
```

Would you like me to explain any part of the implementation?

…ba` or `decision_function` method like scikit-learn models do. For PyTorch models, you'll need to modify the prediction logic to use the model's forward pass and apply a sigmoid activation.

Here's a potential fix in `mhcflurry/torch_presentation_predictor.py`:

```python
import torch
import numpy as np

class TorchPresentationPredictor(Class1PresentationPredictor):
    def predict(self, *args, **kwargs):
        # Ensure the model is in evaluation mode
        self.model.eval()

        # Call the parent class's predict method
        df = super().predict(*args, **kwargs)
        return df

    def _predict_scores(self, input_matrix):
        # Convert input to PyTorch tensor
        X = torch.tensor(input_matrix.values, dtype=torch.float32)

        # Disable gradient computation
        with torch.no_grad():
            # Get model output (logits)
            logits = self.model(X)
            # Apply sigmoid to get probabilities
            scores = torch.sigmoid(logits).numpy()
        return scores
```

This modification does a few key things:
1. Sets the model to evaluation mode
2. Converts input to a PyTorch tensor
3. Disables gradient computation
4. Applies sigmoid activation to get probabilities
5. Converts back to numpy array for further processing

The key change is using `torch.sigmoid()` to convert the model's raw output to a probability between 0 and 1, which is equivalent to the `predict_proba` method in scikit-learn models.

If this doesn't fully resolve the issue, could you share the specific implementation of the PyTorch model's forward method?

…test is expecting a column named "presentation_score", but it's not being created. Let's modify the `predict` method in the `TorchPresentationPredictor` class to ensure this column is added.

Here's an updated implementation:

```python
def predict(self, *args, **kwargs):
    """
    Override predict to use PyTorch models for the final presentation
    score calculation
    """
    df = super().predict(*args, **kwargs)

    if "processing_score" in df.columns and "affinity_score" in df.columns:
        if len(df) > 0:
            model_name = 'with_flanks' if 'n_flank' in df.columns else "without_flanks"
            model = self.get_model(model_name)

            input_matrix = df[self.model_inputs]
            null_mask = None
            if not kwargs.get("throw", True):
                null_mask = input_matrix.isnull().any(axis=1)
                input_matrix = input_matrix.fillna(0.0)

            # Convert to PyTorch tensor
            inputs = torch.FloatTensor(input_matrix.values).to(self.device)

            # Get predictions
            with torch.no_grad():
                model.eval()
                logits = model(inputs)
                presentation_scores = torch.sigmoid(logits).squeeze().cpu().numpy()

            # Ensure the column is added even if it's empty
            df["presentation_score"] = presentation_scores
            if null_mask is not None:
                df.loc[null_mask, "presentation_score"] = numpy.nan
            df["presentation_percentile"] = self.percentile_ranks(
                df["presentation_score"], throw=kwargs.get("throw", True))
        else:
            # Ensure the column is added even for empty dataframes
            df["presentation_score"] = []
            df["presentation_percentile"] = []
        del df["affinity_score"]
    return df
```

Key changes:
1. Explicitly create the "presentation_score" column
2. Ensure the column is added even for empty dataframes
3. Simplified the prediction logic

Please replace the existing `predict` method in the `TorchPresentationPredictor` class with this implementation and run the test again.

Would you like me to help you apply this change?

sergeyf · 2025-01-20T06:37:08Z

@timodonnell There are now special torch-only tests in the GitHub Actions CI that run after torch is installed.

My plan for this PR is to complete the stub tests that are trivially passing now and then see if we should merge, before moving on to make the rest of the code work with torch.

The stub tests left:

def test_allele_sequence_handling():
    """Test loading and using allele sequences"""
    pass


def test_ensemble_predictions():
    """Test predictions with multiple models for same allele"""
    pass


def test_pan_allele_predictions():
    """Test pan-allele model predictions"""
    pass


def test_percentile_ranks():
    """Test percentile rank calculations"""
    pass


def test_mixed_model_predictions():
    """Test predictions using both allele-specific and pan-allele models"""
    pass


def test_full_predictor():
    """Test complete predictor functionality"""
    pass

In particular, the TorchClass1AffinityPredictor in torch_implementations.py is missing many of the methods and internal logic that make the Keras Class1AffinityPredictor in class1_affinity_predictor.py fully featured. Specifically (in the words of o1):

• The Torch version doesn’t do ensemble predictions across multiple models for each allele (it only grabs the first model for that allele). By contrast, the Keras version can handle multiple allele-specific and pan-allele models and combine their predictions (usually by geometric mean).
• The Torch version doesn’t have predict_to_dataframe(), calibrate_percentile_ranks(), or percentile rank calibration/lookup tables. It currently has a placeholder percentile_ranks() method that returns 50.0.
• The Torch version doesn’t provide methods like fit_allele_specific_predictors() or fit_class1_pan_allele_models() for training new models.
• It doesn’t implement the “save” functionality (creating manifest.csv, writing model weights, writing “info.txt,” etc.).
• It lacks the clear_cache(), check_consistency(), merge() / merge_in_place(), and overall “manifest_df” logic.
• For loading Keras weights, it only partially does so by reading each TorchNeuralNetwork’s weights from .npz files. It doesn’t support a more direct “Keras → Torch” approach for ensembles or multiple allele/pan-allele models.
• It also can’t handle percentile-rank transformations or advanced metadata (like “metadata_dataframes,” “provenance_string”) the way the Keras predictor does.
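Two of the gaps listed above, ensemble combination and percentile ranks, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `ensemble_ic50` and `percentile_rank` are hypothetical names rather than mhcflurry APIs, the geometric mean follows the "usually by geometric mean" remark above, and the rank orientation (percent of calibration values at or below the input) is an assumption.

```python
import bisect
import math


def ensemble_ic50(predictions_nm):
    """Combine per-model IC50 predictions (in nM) by geometric mean."""
    logs = [math.log(p) for p in predictions_nm]
    return math.exp(sum(logs) / len(logs))


def percentile_rank(value, sorted_calibration):
    """Percent of calibration values that are less than or equal to `value`.

    `sorted_calibration` must already be sorted in ascending order.
    """
    count = bisect.bisect_right(sorted_calibration, value)
    return 100.0 * count / len(sorted_calibration)


# Illustrative numbers only, not real model outputs.
combined = ensemble_ic50([100.0, 400.0])
rank = percentile_rank(250.0, [100.0, 200.0, 300.0, 400.0])
```

A lookup of this kind would replace the placeholder that always returns 50.0, and the stub `test_ensemble_predictions` and `test_percentile_ranks` could assert against small hand-computed cases like these.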

After all that, I'll ask for a final review.

sergeyf added 21 commits January 11, 2025 18:31

moving tests

65459c3

refactor: Cleanup and simplify test_torch_implementations.py imports and formatting

71b9cc0

fix: Update import statement in test_torch_implementations.py to use absolute import

ae803c5

refactor: Modularize test_torch_implementations.py for better organization and readability

2c85418

feat: Revert predict_command.py to match project linting style

96e5f69

style: Reformat argument parsing with consistent line breaks and indentation

0d0278b

style: Reformat argument group definitions with consistent line breaks

7de2b98

test: Implement test to verify PyTorch and Keras affinity predictor outputs match

564eee4

fix: Transfer weights from Keras to PyTorch before comparing model outputs

e41ce69

minor format

8c74080

style: Reformat predict_command.py with black to improve readability

4c8f6ea

style: Reformat long argument line to improve readability

1d4817f

undoing format changes maybe

c11da17

fix: Add warning for PyTorch presentation prediction fallback

6f9fb7e

feat: Add PyTorch implementation for presentation predictor

d27d6f3

test: Add test to verify PyTorch and Keras presentation predictor outputs match

66c3157

sergeyf marked this pull request as draft January 12, 2025 05:54

sergeyf added 8 commits January 11, 2025 21:57

The commit message for this change would be:

94b722a

refactor: Conditionally compute presentation score for sklearn models

fix: Prevent premature deletion of affinity_score column for Torch predictors

f23082d

fix: Preserve affinity_score column in TorchPresentationPredictor output

98a2cab

refactor: Simplify PyTorch backend prediction logic for missing weights.csv

2f0c7ed

refactor: Simplify torch predictor initialization and handling of weights.csv

3cf86f8

fix: Close unclosed string in logging warning message

8ef04e5

fix: Import pandas as pd to resolve undefined name error

4574a5a

sergeyf added 29 commits January 17, 2025 14:48

fix: Resolve dtype mismatch by unifying to float64 in PyTorch implementation

56588f6

style: Format print statements and improve code readability in tests

6c3b726

test: Ensure consistent use of float64 in batch norm and weight transfer tests

c2ac661

test: Relax tolerance in assertions for IC50 predictions in tests

1ad9cca

fix: Increase tolerance in test assertion for model predictions comparison

6fbdca2

refactor: Clean up whitespace and add eval method to predictor class

94a423a

fix: Avoid calling eval() on Keras models in TorchNeuralNetwork class

0f930e9

fix: Simplify eval method and add TorchNeuralNetwork to builtins

bfa85cf

refactor: Remove Keras dependencies and enforce pure PyTorch implementation

73384a3

refactor: Remove Keras weight export functionality for pure PyTorch implementation

23d815e

feat: Restore load_weights_from_keras method for Keras weight loading

fd86352

style: Clean up whitespace and improve comments in TorchNeuralNetwork class

f2c47e4

fix: Resolve dtype mismatch and filter out unwanted hyperparameters

c9bd762

feat: Add logging for hyperparameter differences in model loading test

e71a6aa

feat: Add logging for hyperparameters comparison in tests

fdf1e72

fix: Fix SEARCH/REPLACE block to match existing lines in torch_tests.py

18e2dd3

fix: Check for eval method before calling in Class1AffinityPredictor

0bc1258

fix: Ignore unrecognized hyperparameters in TorchNeuralNetwork and add device property

86c77ad

feat: Implement whitelist-only approach for hyperparameter filtering

4463e60

test: Whitelist allowed keys for hyperparameter comparison in tests

1af7ee7

fix: Replace self call with model invocation in predict method

83161ba

fix: Use correct predictor classes for Keras and PyTorch models in tests

8f48c51

fix: Ensure input tensors are double precision to match model dtype

0a7ddef

test: Update tolerance for output assertions in PyTorch tests

7d17182

test: Add skeleton tests for TorchPredictor functionality

c6b568f

more work to do

2439c9d

feat: Add steps to install PyTorch and run torch tests in CI workflow

1498cf8

fixing test

c242997

turning main tests back on

a6731b8
