Can I generate a sequence with a 'medium' score? #27

Open
Lix1993 opened this issue Dec 5, 2019 · 3 comments

Comments

@Lix1993
Contributor

Lix1993 commented Dec 5, 2019

By default, dnachisel gives a 'best' sequence, with a score close to 0.
I can get a 'worst' sequence by flipping the if condition to new_score < score.

If I want a 'medium' sequence, I can set a target score
and change the if condition to abs(new_score - target) < abs(score - target).

Since scores vary between objectives, I have to find the worst score first to determine the target score.
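
For readers who want to see the target-based acceptance rule in isolation, here is a minimal plain-Python sketch. It is not DnaChisel's `optimize()` code: the toy objective `gc_score`, the function `search_towards`, and all parameters are assumptions made up for illustration. Only the acceptance test `abs(new_score - target) < abs(score - target)` comes from the text above.

```python
import random

def gc_score(seq, gc_goal=0.5):
    """Toy objective (assumption): 0 is best, more negative is worse."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return -abs(gc - gc_goal)

def search_towards(seq, target, score_fn=gc_score, n_steps=2000, seed=0):
    """Random point mutations, kept only if the score moves closer to target."""
    rng = random.Random(seed)
    seq = list(seq)
    score = score_fn(seq)
    for _ in range(n_steps):
        i = rng.randrange(len(seq))
        old = seq[i]
        seq[i] = rng.choice("ATGC")
        new_score = score_fn(seq)
        if abs(new_score - target) < abs(score - target):
            score = new_score  # accept: closer to the target score
        else:
            seq[i] = old  # reject: undo the mutation
    return "".join(seq), score

# Start from the worst case for this toy objective (score -0.5)
# and aim for a score halfway between worst and best.
medium_seq, medium_score = search_towards("A" * 100, target=-0.25)
```

Note that the target still has to be chosen by hand, which is exactly the difficulty raised here: it presupposes knowing the worst score.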

Is there a way to generate all 3 sequences with a single call to optimize()?

@Zulko
Member

Zulko commented Dec 5, 2019

That's a complex problem, and I don't think it can be done efficiently by changing how optimize works.

A lot of the optimization efficiency comes from the way the specifications work, not from the optimize method. The specification tells the solver which regions should be mutated, and you don't mutate the same sequence regions depending on whether you want to optimize or de-optimize the sequence. So the best way to "de-optimize" a sequence is to define a new specification whose evaluate method does the opposite (let's call it the anti-specification), i.e. its score is the "opposite" of the original spec's score, and its suggested locations to optimize are the complement of the locations suggested by the original spec. But I agree it is a lot of work.
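
To make the anti-specification idea concrete, here is a toy sketch in plain Python. It deliberately does not use DnaChisel's real `Specification` classes: `AvoidBaseToy`, `AntiSpecToy`, and the `(score, locations)` return value are hypothetical stand-ins, assumed only for illustration.

```python
from dataclasses import dataclass

@dataclass
class AvoidBaseToy:
    """Toy spec (assumption): penalize every occurrence of `base`; 0 is best."""
    base: str = "G"

    def evaluate(self, seq):
        hits = [i for i, b in enumerate(seq) if b == self.base]
        # Score, plus the positions a solver should mutate to improve it.
        return -len(hits), hits

@dataclass
class AntiSpecToy:
    """Wraps a toy spec: negated score, complementary positions."""
    spec: AvoidBaseToy

    def evaluate(self, seq):
        score, locations = self.spec.evaluate(seq)
        flagged = set(locations)
        # To de-optimize, mutate the positions the original spec is happy with.
        complement = [i for i in range(len(seq)) if i not in flagged]
        return -score, complement

spec = AvoidBaseToy("G")
anti = AntiSpecToy(spec)
print(spec.evaluate("ATGGC"))  # (-2, [2, 3])
print(anti.evaluate("ATGGC"))  # (2, [0, 1, 4])
```

A real anti-specification would have to do the equivalent work inside the library's own evaluation objects, which is where most of the effort goes.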

The other question is how you determine a specification's "worst score". That can be complicated, as the worst score can depend on your constraints. The way I would approach it is by defining an anti-specification and running the optimization once with only the anti-specification. The final score is the worst score, and a medium score is anything in-between.
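
As a rough illustration of that procedure (optimize once with the objective, once with its opposite, then pick a value in between), here is a self-contained toy in plain Python. The objective `at_score`, the `hill_climb` helper, and the choice of the midpoint as the "medium" target are all assumptions, and no constraints are modelled, so this is a sketch of the idea rather than of DnaChisel's solver.

```python
import random

def at_score(seq):
    """Toy objective (assumption): 0 is best; each G or C costs one point."""
    return -sum(base in "GC" for base in seq)

def hill_climb(seq, score_fn, n_steps=3000, seed=0):
    """Keep a random point mutation only when it does not lower score_fn."""
    rng = random.Random(seed)
    seq = list(seq)
    score = score_fn(seq)
    for _ in range(n_steps):
        i = rng.randrange(len(seq))
        old, seq[i] = seq[i], rng.choice("ATGC")
        new_score = score_fn(seq)
        if new_score >= score:
            score = new_score
        else:
            seq[i] = old
    return "".join(seq), score

start = "ATGC" * 10
_, best = hill_climb(start, at_score)
# Anti-objective: the negated score, so climbing it de-optimizes at_score.
worst_seq, _ = hill_climb(start, lambda s: -at_score(s))
worst = at_score(worst_seq)
medium_target = (best + worst) / 2  # any value between worst and best would do
```

With real constraints in place, the worst score found this way can differ from the theoretical minimum of the objective, which is why running the anti-objective under the same constraints matters.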

The last point is how you reach a given "medium" score. Here it is even more complicated, because the regions to mutate depend on whether you are currently above or below the score. The closest I have done to that is EnforceChanges(). Used alone, it maximizes the changes in the affected area. But you can set up amount_percent=40 to

If you don't want to dive into all this yet, there may be a quicker solution (but it is untested). You can try using AvoidChanges(max_edits_percent=20), which will restrict sequence changes to less than 20% of the sequence. You can use it as a constraint or as an objective. Intuitively, this should help you find "medium-optimized" sequences.
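
Since this suggestion is untested, it may be worth checking afterwards how much of the sequence actually changed. The helper below is plain Python and not part of DnaChisel; its name and the example sequences are hypothetical.

```python
def edits_percent(original, optimized):
    """Percentage of positions that differ between two equal-length sequences."""
    if len(original) != len(optimized):
        raise ValueError("sequences must have the same length")
    n_edits = sum(a != b for a, b in zip(original, optimized))
    return 100.0 * n_edits / len(original)

# Two of ten positions differ, so this pair sits exactly at a 20% edit budget.
print(edits_percent("ATGCATGCAT", "ATGGATGCAA"))  # 20.0
```

Comparing this percentage with the objective scores of the original and optimized sequences gives a quick sense of whether a given edit budget lands near the intended "medium" score.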

@Lix1993
Contributor Author

Lix1993 commented Dec 6, 2019

Thanks

Lix1993 closed this as completed Dec 6, 2019
@Zulko
Member

Zulko commented Dec 6, 2019

Let's keep this open as an unresolved issue for now, so that other people can find it.
