Conditions argument for loadData in SSIT doesn't filter data #21

Closed
alexpopinga opened this issue Mar 8, 2025 · 8 comments

@alexpopinga
Collaborator

No description provided.

@alexpopinga alexpopinga added the bug Something isn't working label Mar 8, 2025
@Munsky
Contributor

Munsky commented Mar 8, 2025

Hi @alexpopinga,

Could you please provide an example where it failed, so I can better understand what went wrong?

@alexpopinga
Collaborator Author

Yes @Munsky - apologies for not doing so when submitting the issue (was at the crag!)

I added a file called a0_Fit_GR.m (the issue occurs in the loadData for loop starting at line 86), which is just a pared-down version of a0_Fit_GR_and_DUSP1_models.m (where the issue occurs in the same for loop, starting at line 135, although loadData is currently commented out there because of the issue). Both are found in Workspace/EricModel.

ModelGRfit{i} = ModelGR.loadData("EricData/Gated_dataframe_Ron_020224_NormalizedGR_bins.csv",...
{'nucGR','normgrnuc';'cytGR','normgrcyt'},...
{'Dex_Conc',GRfitCases{i,2}});

^ It runs successfully, but as the "disp(ModelGRfit{i}.dataSet.DATA(1:5,:))" line I added to a0_Fit_GR.m shows, it does not actually filter by Dex concentration. I also tried hard-coding the concentration, e.g., "{'Dex_Conc','10'}", which follows the example in the SSIT.m comments:

% Example:
% conditions = {'Rep_num','1'} : Only the data in the
% 'Rep_num' column that is exactly equal to '1' will be
% kept in the data set.

But this does not work, either.

@alexpopinga
Collaborator Author

alexpopinga commented Mar 9, 2025

@Munsky I found the problem and a solution. I just want to check that I don't break everything :-)

As I understand it, the issue lies with the types. In the "% set up conditions" block starting on line 1489 of SSIT.m (in this branch, SequentialExperimentDesign):

K>> class(obj.dataSet.DATA(:,3))

ans =

'cell'

But my "conditions" value is a particular Dex concentration (for each model in SSITMultiModel, I want to select only the data produced under that experimental condition), and its type is different:

K>> class(conditions{i,2})

ans =

'char'

This means the loop never finds a match and returns everything. I can force type changes and then filter the DATA in obj.dataSet.DATA, something like:

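% Note: columnData is not defined in this snippet; it is assumed to hold
% the conditions column of obj.dataSet.DATA, already converted to string.
filterValue = string(conditions{i,2}); % Convert 'char' to 'string'

% Apply filter: logical index of rows whose condition matches
filteredRows = columnData == filterValue;

% Keep only the matching rows
filteredData = obj.dataSet.DATA(filteredRows, :);
obj.dataSet.DATA = filteredData;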

This solution successfully filters the data for specified Dex concentrations in the loop with condition {'Dex_Conc',GRfitCases{i,1}} - I just don't want to break things downstream in SSIT.m. Now investigating.
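The type-conversion idea above can be tried in isolation. This is only a minimal sketch on a toy cell array, not SSIT's own code: the layout of DATA, the values in it, and the variable names keepRows and condValue are assumptions made for illustration.

```matlab
% Toy stand-in for obj.dataSet.DATA (assumed layout: time, count, Dex_Conc).
DATA = {0,  12, '1';
        0,  15, '10';
        30, 20, '10';
        30, 22, '100'};
condValue = '10';                    % char, as in {'Dex_Conc','10'}

% Convert both sides to string so that == compares element by element.
columnData  = string(DATA(:, 3));    % 4x1 string array
filterValue = string(condValue);     % string scalar
keepRows    = columnData == filterValue;

% Keep only the rows whose condition matches.
filteredData = DATA(keepRows, :);
disp(size(filteredData, 1))          % prints 2
```

Converting both operands to string avoids comparing a cell column against a char vector directly, which is the mismatch described above.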

@alexpopinga alexpopinga self-assigned this Mar 9, 2025
@Munsky
Contributor

Munsky commented Mar 17, 2025

Hi @alexpopinga ,

Your code ran fine for me, or at least there were no errors before the fitting at line 135.
Is this still an issue?

-B

@alexpopinga
Collaborator Author

Hi @Munsky ,

It runs for me, too, but does it actually filter by Dex concentration for you? That is, does it load only the data associated with a Dex concentration of 1 nM for one of the SSITMultiModels, then only the data associated with 10 nM for another, then only the data associated with 100 nM for the last?

@Munsky
Contributor

Munsky commented Mar 18, 2025

Hi @alexpopinga .

Could you please put together an example data file with a few rows and several columns so that we can test this more easily?

I am thinking of something with maybe three rows at dex equals one, five rows at another dex value, etc., and various time points.

Then you can just come up with a toy model, and test just the data loading part with a known answer.

This would make for a good test and you wouldn't need to run any fits or anything before determining if it worked.

This would also be a good test for us for creating the future version that allows for more flexible conditioning.
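A rough sketch of what such a known-answer check could look like, built in memory rather than from a file. The column names, the made-up counts, and the second concentration (10) are assumptions for illustration only; this does not call loadData or any other SSIT routine.

```matlab
% Hypothetical toy data: 3 rows at Dex_Conc = 1, 5 rows at Dex_Conc = 10.
time     = [0; 30; 60; 0; 30; 60; 90; 120];
Dex_Conc = [1; 1; 1; 10; 10; 10; 10; 10];
nucGR    = [2; 5; 7; 3; 8; 12; 14; 15];   % made-up counts
toy = table(time, Dex_Conc, nucGR);

% Known answer: filtering on Dex_Conc == 10 must keep exactly 5 rows.
kept = toy(toy.Dex_Conc == 10, :);
assert(height(kept) == 5)
assert(all(kept.Dex_Conc == 10))
```

Because the expected row counts are fixed by construction, a filter that silently returns everything would fail the first assertion immediately, with no model fitting required.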

@Munsky
Contributor

Munsky commented Mar 19, 2025

Hi @alexpopinga .

I just added a test case ("dataLoading") for this in the poissonTest.m file -- please take a look at this to see what these can look like.

Also, this should show how easy it is to do logic with tables in Matlab. I think rewriting the data loading routines from scratch would be very easy and lead to more functionality. But I can't do it one-handed.
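As a small illustration of table logic of this kind (a sketch with made-up values and assumed column names, not the "dataLoading" test in poissonTest.m), compound and set-based conditions reduce to logical indexing:

```matlab
% Hypothetical toy table; values and column names are illustrative only.
time     = [0; 30; 60; 0; 30; 60];
Dex_Conc = [1; 1; 1; 10; 10; 100];
T = table(time, Dex_Conc);

% Compound condition: one concentration, restricted to later time points.
rowsAnd = T.Dex_Conc == 10 & T.time > 0;

% Set membership: keep any of several concentrations.
rowsAny = ismember(T.Dex_Conc, [1 100]);

subAnd = T(rowsAnd, :);   % 1 row  (time 30, Dex_Conc 10)
subAny = T(rowsAny, :);   % 4 rows (three at Dex_Conc 1, one at 100)
```

Numeric columns compare directly with == here, so no char-to-string conversion is needed as long as the column is stored as numbers rather than text.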
