Improved naming and documentation for flag field accessors and mutators. #868

Closed
tfenne wants to merge 3 commits into master from tf_samrecord_better_flag_methods

Member

@tfenne tfenne commented Apr 29, 2017

Description

The accessor and mutator methods for flag fields in SAMRecord have had really long, ugly names forever; names that don't even follow Java standards very well. I feel like I/we should have done this a lot sooner! I've deprecated all the existing flag field methods and re-routed them to new methods with much cleaner names. In addition, for some flag fields I've added inverted accessors (e.g. isUnmapped() and isMapped() together replace getReadUnmappedFlag()).
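
For readers who have not opened the diff, the pattern described here can be sketched roughly as follows. This is an illustration only, not the code in the PR: `FlagRecord` and its `flags` field are hypothetical stand-ins for SAMRecord, and only the unmapped flag (bit 0x4) is modelled. The method names are the ones mentioned in the description.

```java
/** Hypothetical stand-in for SAMRecord; only the unmapped flag is modelled. */
class FlagRecord {
    private static final int READ_UNMAPPED = 0x4; // flag bit for "read unmapped"
    private int flags;

    /** Returns true if the read is unmapped. */
    public boolean isUnmapped() { return (flags & READ_UNMAPPED) != 0; }

    /** Inverted accessor: returns true if the read is mapped. */
    public boolean isMapped() { return !isUnmapped(); }

    /** Sets whether the read is unmapped. */
    public void setUnmapped(final boolean unmapped) {
        flags = unmapped ? (flags | READ_UNMAPPED) : (flags & ~READ_UNMAPPED);
    }

    /** Old name, kept so existing callers still compile; forwards to the new accessor. */
    @Deprecated public boolean getReadUnmappedFlag() { return isUnmapped(); }

    /** Old name, kept so existing callers still compile; forwards to the new mutator. */
    @Deprecated public void setReadUnmappedFlag(final boolean flag) { setUnmapped(flag); }
}
```

Because the deprecated methods only delegate, old and new callers observe the same state; the compiler merely warns at call sites that still use the old names.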

I suspect this might be a little controversial, but since I think we're likely stuck with the existing implementation for SAMRecord it would be nice to start tidying up by improving things and deprecating things we don't like.

While I don't think we want to have a long conversation about the exact names of the new methods, if there are any that really don't sit well with anyone, I'm happy to change them. I struggled to come up with good names for the replacements for getReadFailsVendorQualityCheck() which ended up with failsQc() and passesQc() since all other phrasings sounded weird to me (e.g. isFailingQc(), isQcFailure(), etc.).

Lastly, if/when we are happy with this, I will take on the task of finding/replacing calls to all the newly deprecated methods within HTSJDK, so that upon merge there will be no direct use of the deprecated methods within HTSJDK.

Checklist

  • Code compiles correctly
  • New tests covering changes and new functionality
  • All tests passing
  • Extended the README / documentation, if necessary
  • Is not backward compatible (breaks binary or source compatibility)

@tfenne
Member Author

tfenne commented Apr 29, 2017

I feel like this is a pretty significant step, so would like fairly broad review if possible. @nh13, @yfarjoun, @jacarey, @lbergelson, @droazen: want to share your 2c each?

@tfenne
Member Author

tfenne commented Apr 29, 2017

Also, FYI to any reviewers, my apologies for making the review slightly more awkward than need be. The original code had a block of all the getters followed by a block with all the setters. I've re-arranged to bring the accessor/mutator pair together for each flag, but that makes the diff a little weird looking.

@tfenne tfenne force-pushed the tf_samrecord_better_flag_methods branch from 33f0be3 to 0a2298a on April 29, 2017 18:59
@codecov-io

codecov-io commented Apr 29, 2017

Codecov Report

Merging #868 into master will increase coverage by 0.012%.
The diff coverage is 89.231%.

@@               Coverage Diff               @@
##              master      #868       +/-   ##
===============================================
+ Coverage     64.964%   64.975%   +0.012%     
- Complexity      7221      7251       +30     
===============================================
  Files            528       528               
  Lines          31867     31966       +99     
  Branches        5442      5457       +15     
===============================================
+ Hits           20702     20770       +68     
- Misses          9018      9041       +23     
- Partials        2147      2155        +8

| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| src/main/java/htsjdk/samtools/SAMRecord.java | 65.854% <89.231%> (-0.078%) | 294 <56> (+9) | |
| ...samtools/util/AsyncBlockCompressedInputStream.java | 72% <0%> (-4%) | 12% <0%> (-1%) | |
| ...k/samtools/reference/IndexedFastaSequenceFile.java | 71.528% <0%> (-0.472%) | 43% <0%> (+14%) | |
| .../htsjdk/samtools/reference/FastaSequenceIndex.java | 65.306% <0%> (+1.764%) | 19% <0%> (+7%) | ⬆️ |
| ...sjdk/samtools/util/Md5CalculatingOutputStream.java | 79.487% <0%> (+7.692%) | 9% <0%> (+1%) | ⬆️ |

@lindenb
Contributor

lindenb commented Apr 29, 2017

OK for the changes, but please, please keep the JavaBean nomenclature! :-)

When injecting a SAMRecord into a JavaScript, servlet, Velocity context, etc., we can de facto use the standard APIs:

if(record.failingQC) { ... }

If my voice matters, I would be happy with: getReadFailsVendorQualityCheck -> isFailingQC() but not with getReadFailsVendorQualityCheck -> failsQC() :-)
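
The point about bean nomenclature can be checked with the JDK's own introspector. The sketch below is an illustration, not htsjdk code: `Rec` is a hypothetical class exposing the same flag under both candidate names, and `propertyNames` is a helper written for this example. It uses `java.beans.Introspector`, which is what many template and expression engines rely on, directly or indirectly, to resolve an expression like `record.failingQC`.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

class BeanNamingDemo {
    /** Hypothetical record exposing the same flag under both candidate names. */
    public static class Rec {
        private final boolean failing = true;
        public boolean isFailingQC() { return failing; } // JavaBean-style boolean getter
        public boolean failsQC()     { return failing; } // not a JavaBean getter
    }

    /** Lists the property names the standard introspector discovers on a type. */
    static Set<String> propertyNames(final Class<?> type) {
        try {
            // Stop at Object so the implicit "class" property is not reported.
            final BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            final Set<String> names = new TreeSet<>();
            for (final PropertyDescriptor pd : info.getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(propertyNames(Rec.class));
    }
}
```

Only `isFailingQC()` is recognised as a getter, giving a property named `failingQC`; `failsQC()` is an ordinary method and is invisible to bean-based tooling.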

Contributor

@yfarjoun yfarjoun left a comment

a few comments. not final review.

return (mFlags & SAMFlag.READ_REVERSE_STRAND.flag) != 0;
}
/** Sets whether both reads in the read pair are aligned as expected. */
public void setProperlyPaired(final boolean paired) { setFlag(SAMFlag.PROPER_PAIR, paired); }
Contributor

I know that you simply moved the code, but I was wondering if this function should also call requireReadPaired().

Member Author

I think that ship has sailed, in general - it would be a breaking change to add those kinds of checks into the setter, and I don't want to be the one who does that ;) More specifically, I think things are the way they are so that records can be cleaned up without having to worry about ordering. E.g. if you want to make a record "unpaired", you don't have to worry about calling setProperlyPaired(false) before setPaired(false) etc.
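
To make the ordering argument concrete, here is a small sketch, assuming a hypothetical `PairFlagsDemo` class rather than the real SAMRecord. The getter for the proper-pair bit is checked (it throws on unpaired reads, with the message used in the diff), while the setter is deliberately unchecked. Bits 0x1 and 0x2 are the paired and proper-pair flag bits.

```java
class PairFlagsDemo {
    private static final int READ_PAIRED = 0x1; // "read paired" bit
    private static final int PROPER_PAIR = 0x2; // "properly paired" bit
    private int flags;

    private void set(final int bit, final boolean value) {
        flags = value ? (flags | bit) : (flags & ~bit);
    }

    public boolean isPaired() { return (flags & READ_PAIRED) != 0; }

    public void setPaired(final boolean paired) { set(READ_PAIRED, paired); }

    /** Checked getter: illegal to call on unpaired reads. */
    public boolean isProperlyPaired() {
        if (!isPaired()) throw new IllegalStateException("Inappropriate call if not paired read");
        return (flags & PROPER_PAIR) != 0;
    }

    /** Unchecked setter: may be called whether or not the paired bit is set. */
    public void setProperlyPaired(final boolean paired) { set(PROPER_PAIR, paired); }

    public int getFlags() { return flags; }
}
```

With this shape, clearing a record works in either order: `setPaired(false)` followed by `setProperlyPaired(false)` does not throw, whereas a checked setter would force callers to clear the dependent bits first.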

private boolean getMateNegativeStrandFlagUnchecked() {
return (mFlags & SAMFlag.MATE_REVERSE_STRAND.flag) != 0;
}
/** Returns true if the read represented by this record in unmapped. */
Contributor

s/in/is/

/**
* It is preferable to use the get*Flag() methods that handle the flag word symbolically.
*/
/** It is preferable to use the get*Flag() methods that handle the flag word symbolically. */
Contributor

...but you deprecated all of them :-) ...please reword the comment.

requireReadPaired();
return getProperPairFlagUnchecked();
}
/** Sets the "read paired in sequencing" flag to the given value. */
Contributor

here and below, there are two styles of comments:

A. set/get the XXX flag on the read
B. returns true if XXX. False otherwise.

Personally, I prefer the latter, but in any case, I think we should be more consistent....happy to help with wording if you like.

Member Author

I've taken a pass through and tried to be more consistent. Am happy to take suggestions for anywhere you think could still be improved.

@Deprecated public void setSupplementaryAlignmentFlag(final boolean flag) { setSupplementaryAlignment(flag); }

/** Returns true if the read fails vendor quality checks. */
public boolean failsQc() { return getFlag(SAMFlag.READ_FAILS_VENDOR_QUALITY_CHECK); }
Contributor

Why not isPassingFilters()/isNotPassingFilters()? The description of the flag in the SAM spec is "not passing filters, such as platform/vendor quality controls", so QC is an example, not the rule.

Member Author

That works for me. I'll change it.

Member Author

@tfenne tfenne left a comment

Thanks for the reviews @lindenb and @yfarjoun. I've tried to address your comments.

Contributor

@yfarjoun yfarjoun left a comment

missed a few spots. :-)

@Deprecated public void setFirstOfPairFlag(final boolean flag) { setFirstOfPair(flag); }


/** Returns true if the second of pair flag is set, false otherwise. Illegal to call on unpaired reads. */
Contributor

"...if the read is second in a pair,...."

/** Returns true if the second of pair flag is set, false otherwise. Illegal to call on unpaired reads. */
public boolean isSecondOfPair() { requireReadPaired(); return getFlag(SAMFlag.SECOND_OF_PAIR); }

/** Sets the second of pair flag. */
Contributor

"Sets whether the read is second in a pair"

public void setReadFailsVendorQualityCheckFlag(final boolean flag) {
setFlag(flag, SAMFlag.READ_FAILS_VENDOR_QUALITY_CHECK.flag);
}
/** Returns true if the first of pair flag is set, false otherwise. Illegal to call on unpaired reads. */
Contributor

"...if the read is the first in a pair,...."

/** Returns true if the first of pair flag is set, false otherwise. Illegal to call on unpaired reads. */
public boolean isFirstOfPair() { requireReadPaired(); return getFlag(SAMFlag.FIRST_OF_PAIR); }

/** Sets the first of pair flag. */
Contributor

"Sets whether the read is first in a pair."

@lbergelson
Member

@tfenne It might make sense to make these changes in the context of introducing a new read interface. (We happen to have a new read interface that we think is pretty good https://github.com/broadinstitute/gatk/blob/master/src/main/java/org/broadinstitute/hellbender/utils/read/GATKRead.java)


/**
* strand of the mate (false for forward; true for reverse strand).
* True if the read is the second read in a read pair, false otherwise. Illegal to call on unpaired reads.
* Returns the value of flag bit 0x80.
*/
Member

If we're already making changes we might want to update these to match the language in the SAM spec, i.e. firstSegment, lastSegment.

Contributor

Hmmm. While first/last is indeed the language of the SAM spec, the vast majority of actual usage is paired-end reads. If we are to go this route (of pretending to have more than 2 reads), we would also need to avoid the use of the term "mate" and start using "next" everywhere... I don't think that the confusion that would cause is worth the adherence to the language in the spec. What we can do is specify in the javadoc precisely what flag is being accessed here (and in other places...).

@droazen
Contributor

droazen commented May 1, 2017

@tfenne Completely agree with @lbergelson -- it would be extremely useful to finally have a Read interface in htsjdk! If we could propose these naming adjustments in the context of a discussion around a new Read interface, without changing SAMRecord proper (yet), it would be less immediately disruptive to downstream code, and we'd likely end up with something less constrained by the existing structure of the SAMRecord class.

@tfenne
Member Author

tfenne commented May 1, 2017

@lbergelson & @droazen thanks for the feedback. I'm trying to understand the goal of introducing a Read interface above SAMRecord in htsjdk. I'm not against it, but don't yet understand what it would buy us. Specifically, I think the interface would have to be constrained somewhat by not requiring breaking changes in SAMRecord - which may or may not make it palatable to you vs. your approach in GATK. Also, unless there are multiple SAM-like things that satisfy the interface, it seems like overhead to have an interface and a single implementation. Would the goal be to have other implementations such as the google genomics one in GATK make their way down in htsjdk?

I did take a look at the linked GATKRead interface, and note that most of the names given to flag-like methods either match or are pretty close to where I ended up 😄 . But there are also constraints on that interface that would be hard to impose retro-actively on SAMRecord - e.g. the deep copying of any mutable structure returned. Still, I could imagine an interface in htsjdk that doesn't impose those constraints, that is sub-typed in the GATK to impose them - it would still require either extending or wrappering SAMRecord there though. Lastly, it's a small thing, but the setIs* are somewhat unconventional - for booleans the JavaBean standard is to have isX and setX, not setIsX. Are you open to changing things like that for an interface in htsjdk?

I don't think the changes proposed are immediately disruptive - the old methods are deprecated, but not removed. That said, changing them multiple times to align with a future interface would be more disruptive than doing it all at once.

My goal is to clean up some of the cruft in SAMRecord and make interacting with it nicer, without the need to wrapper it since I'm somewhat afraid of the performance overhead of putting another level of indirection between application code and reads. I'd also like, over time, to do things like get rid of indexingBin and compute it on the fly, and investigate improving the storage, performance and access to extended attributes. If an interface is helpful, I'm open to that. But I'd rather that not lead this down a path that takes us months to make these kinds of changes.

@droazen
Contributor

droazen commented May 1, 2017

@tfenne Deprecating these methods in SAMRecord directly would be immediately disruptive to us -- it would break all of our builds, which treat use of deprecated code as an error (a policy which has proved necessary to prevent usages of deprecated code from accumulating and persisting over time).

We've had a need for a Read (and Variant!) interface in htsjdk for a long time -- having these interfaces in the GATK, and then having to adapt or convert the underlying htsjdk types has been painful for us. It would be great if we could take this opportunity to finally design a Read interface in htsjdk that we are all happy with. The design process need not take months!

I'm sure I don't need to convince you of the advantages and value of coding against interfaces rather than concrete types in a general sense. The specific use case that prompted us to create the GATKRead interface was the need to deal with GA4GH data -- the records returned by GA4GH queries were sufficiently different from SAMRecord that we needed an interface with clear, well-defined semantics not tied to any particular file format to write code against.
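
As a rough illustration of the argument for coding against an interface, the sketch below assumes a hypothetical `ReadView` interface with two unrelated implementations. None of these types exist in htsjdk or GATK, and the "service-style" record is only a stand-in for a representation that does not carry a SAM flag word.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical format-agnostic view of a read; not an htsjdk or GATK type. */
interface ReadView {
    String getName();
    boolean isUnmapped();
    default boolean isMapped() { return !isUnmapped(); }
}

/** Hypothetical SAM-style record: the unmapped state is bit 0x4 of a flag word. */
class SamStyleRead implements ReadView {
    private final String name;
    private final int flags;
    SamStyleRead(final String name, final int flags) { this.name = name; this.flags = flags; }
    @Override public String getName() { return name; }
    @Override public boolean isUnmapped() { return (flags & 0x4) != 0; }
}

/** Hypothetical record from another source: unmapped means "no reference name". */
class ServiceStyleRead implements ReadView {
    private final String name;
    private final String referenceName; // null when the read has no alignment
    ServiceStyleRead(final String name, final String referenceName) {
        this.name = name;
        this.referenceName = referenceName;
    }
    @Override public String getName() { return name; }
    @Override public boolean isUnmapped() { return referenceName == null; }
}

class ReadViewDemo {
    /** Written once against the interface; works for either representation. */
    static long countMapped(final List<? extends ReadView> reads) {
        return reads.stream().filter(ReadView::isMapped).count();
    }

    public static void main(final String[] args) {
        final List<ReadView> reads = Arrays.<ReadView>asList(
                new SamStyleRead("a", 0x4), new ServiceStyleRead("b", "chr1"));
        System.out.println(countMapped(reads));
    }
}
```

The trade-off raised elsewhere in this thread still applies: an interface with a single implementation is overhead, so the sketch only pays off once a second representation actually exists.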

@yfarjoun
Contributor

yfarjoun commented May 2, 2017

@droazen I'm not sure that idiosyncrasies of the GATK4 build can be used as an argument to increase the scope of a PR, or significantly slow its acceptance. We should be able to accept/reject a PR on its merits and make the build systems/CI we use in projects that depend on htsjdk flexible enough to handle the occasional deprecation. Heaven knows we cause this pain to others....

Regarding the specific ideas being floated around:

  1. Is the GATKRead interface close enough to being the Read interface we would like to have?
  2. Are the changes proposed here compatible with the eventual interface?
  3. I think that enabling various sources of reads to be used interchangeably in htsjdk and libraries that use it would be a positive move and so I support the creation of an interface that would support that. However, that might indeed prove to be a longer effort than what is needed to get this PR through.

Can we agree on the names/javadoc of the methods in this PR and open a separate issue dedicated to the discussion of the read interface?

@magicDGS
Member

magicDGS commented May 2, 2017

I think that the Read interface is something nice to add, but more in the scope of a new API (#520). It will also include the Variant interface, and I'm open to contribute to that one.

What about doing something similar to GATK and creating a new HTSJDK repository for version 3, including interfaces and solving known problems with the API? New issues could be back-ported in the meantime, and the new SemVer system could be applied from version 3 onwards.

Member

@magicDGS magicDGS left a comment

I like this change, but I would rather have an interface, to be able to use some methods in HTSJDK with custom implementations...

public boolean isSecondaryOrSupplementary() { return isSecondaryAlignment() || isSupplementaryAlignment(); }

/** Sets a specific flag bit to the provided value. */
private void setFlag(final SAMFlag flag, final boolean value) {
Member

To have more control and not rely on the new syntax, could this and the getter be public?
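
What a public, enum-keyed getter and setter pair might look like is sketched below. `FlagWord` and its nested `Flag` enum are hypothetical; they mirror the shape of the private `setFlag(SAMFlag, boolean)` in the diff, using the bit values quoted in the javadoc of this PR.

```java
class FlagWord {
    /** Subset of the flag bits, with the values quoted in the javadoc in this PR. */
    enum Flag {
        READ_PAIRED(0x1), READ_UNMAPPED(0x4), MATE_UNMAPPED(0x8), READ_REVERSE_STRAND(0x10);

        final int mask;
        Flag(final int mask) { this.mask = mask; }
    }

    private int flags;

    /** Returns true if the given flag bit is set. */
    public boolean getFlag(final Flag flag) { return (flags & flag.mask) != 0; }

    /** Sets a specific flag bit to the provided value. */
    public void setFlag(final Flag flag, final boolean value) {
        if (value) flags |= flag.mask; else flags &= ~flag.mask;
    }

    public int getFlags() { return flags; }
}
```

One design consideration visible in the diff itself: setUnmapped() also calls setIndexingBin(null), so a public generic setter would let callers flip the unmapped bit without that side effect unless the generic method special-cases it.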

@droazen
Contributor

droazen commented May 2, 2017

@yfarjoun If we do this rename now, and then introduce a Read interface later, we'll have to make two passes through our code base to migrate method calls instead of one. In the past, as a courtesy to downstream projects like IGV, Picard, and GATK, we've made an effort to try to minimize such unnecessary pain. And to be clear, this change likely affects thousands or tens of thousands of lines of code in GATK alone, so this is on a somewhat different level from typical htsjdk breakage :) Also, consider this: if we did this rename now, and then wanted different names in the final Read interface, for whatever reason, we'd then have to do a second round of renaming/deprecation in SAMRecord itself when we make it implement that interface, which would be gross.

It also seems more productive and ultimately useful to work together to design a generalized Read interface agnostic to file format than to just fiddle with method names in SAMRecord. GATKRead could act as a starting point for that discussion, but we are very open to changes!

@yfarjoun
Contributor

yfarjoun commented May 2, 2017

Well, I guess the part I disagree with is the "have to". You could also change GATK's build system (temporarily) to allow deprecated calls.

@droazen
Contributor

droazen commented May 2, 2017

@yfarjoun Not really a viable option for us, given the number of our dependencies and the frequency with which APIs change out from under us (particularly on the Spark side of things).

But there's a bigger problem here: if we rename a bunch of methods in SAMRecord, and then separately design a Read interface, we might have to do a second round of renaming in SAMRecord to make it agree with the names we settle on for the interface. This is why it makes little sense to me to de-couple the tasks of improving naming in SAMRecord from introducing a Read interface.

Member

@nh13 nh13 left a comment

@tfenne looks really great, though I think I would suggest some extra methods like setMapped(final boolean mapped) and setPositiveStrand(final boolean positive), as they seem a bit more natural, for example a read goes from unmapped to mapped so setMapped.

Also, reading through the discussion, while it is interesting to think about making an interface on top of SAMRecord for more general representations of sequencing reads, I think it should be motivated by having a second implementation in htsjdk versus elsewhere. Furthermore, since SAMRecord is meant to represent the specifics and peculiarities of the SAM spec, I am not sure it makes sense to think of other formats unless they can be converted back into SAM (e.g. CRAM, BAM). A common interface that both SAMRecord and a second implementation would implement seems like a separate discussion (worth having too).

I am also not convinced by the arguments that deprecations will break other folks' builds, or that it is too much work to switch over to the new methods before they are removed. They could be left in place for quite some time, allowing folks to make the switch.

throw new IllegalStateException("Inappropriate call if not paired read");
}
}
/** Sets whether or not the read is part of a read pair. Sets flag bit 0x1. */
Member

Sets the, or "Sets the 0x1 flag bit"

}
}
/** Sets whether or not the read is part of a read pair. Sets flag bit 0x1. */
public void setPaired(boolean paired) { setFlag(SAMFlag.READ_PAIRED, paired); }
Member

final

/** Sets the read's unmapped flag. Sets flag bit 0x4. */
public void setUnmapped(final boolean unmapped) {
setFlag(SAMFlag.READ_UNMAPPED, unmapped);
setIndexingBin(null);
}
Member

It would be nice to have a setMapped method too.

return (mFlags & SAMFlag.READ_FAILS_VENDOR_QUALITY_CHECK.flag) != 0;
}
/** Sets whether the read's mate is unmapped. Sets flag bit 0x8. */
public void setMateUnmapped(final boolean unmapped) { setFlag(SAMFlag.MATE_UNMAPPED, unmapped); }
Member

Ditto about a setMateMapped method.

public final boolean isPositiveStrand() { return !isNegativeStrand(); }

/** Sets whether the read is mapped to the negative strand of the genome. Sets flag bit 0x10. */
public void setNegativeStrand(final boolean negative) { setFlag(SAMFlag.READ_REVERSE_STRAND, negative); }
Member

I don't know why, but I want a setPositiveStrand too.

Member

I think it's for symmetry.

@magicDGS
Member

magicDGS commented May 3, 2017

I repeat my suggestion about creating a non-backwards-compatible API and asking the community for a clean-up effort of the library. Doing that, this PR could get in and the discussion about what is a good Read interface could be moved to the new htsjdk3 repository. I'm in for contribution!

@droazen
Contributor

droazen commented May 3, 2017

@nh13 Well, it's not really practical to push down alternate read implementations into htsjdk, given how strictly we control project dependencies here (eg., I assume you would not want to see a new Google Genomics dependency added to htsjdk, right?). But the fact is that alternate read representations do exist, have been standardized, and the lack of a Read interface in htsjdk is a big problem for projects that have to deal with these. We end up having to write all sorts of adapter classes to interface with htsjdk code.

@yfarjoun
Contributor

yfarjoun commented May 3, 2017

@tfenne I think that this is totally a good direction and I too would love to have better API for sam-record. I also think that a Read API is a good way forward and would hate to change the names twice. I think we are all on the same page. The only question is timing. As we are working towards a presentation in BioIT we are too short-staffed to deal with the breakage that would result (due to the way gatk4 is currently set-up) but would love to be part of designing the Read API. There are important decisions to be made (for example regarding using first / mate / second versus first / next / last ) and we do not have the bandwidth right now to give it the attention it deserves.

Can we agree to put this on hold for about a month and then hold a day-long meeting where we will work on the new API including changing names? We're happy to host.

I hope I was able to convey our reluctance to making these changes right now and at the same time, our full commitment to getting to them in the near future, and not letting them languish like some htsjdk PRs...

@tfenne
Member Author

tfenne commented May 3, 2017

@yfarjoun: I appreciate your candour. As you know @nh13 and I do most of our development in scala these days, and we discussed two options. The first was to do what this PR is trying to do - i.e. evolve HTSJDK APIs, in this case SAMRecord, in a better direction. We feel like this is both a better option for the community at large, and also more effort for us. The other option is to encapsulate our use of HTSJDK classes and provide ourselves with a better API in scala. The end result of that option is actually a better result for us, but doesn't help any other HTSJDK users.

Waiting a month to have the discussion about starting a longer process of specifying a super-interface for SAMRecord is likely not compatible with our plans. The best answer I can give is that you should ask again in a month, but likely we'll have moved on by then. I'm not even fully bought into the idea that there should be a Read interface in HTSJDK unless either a) there is more than one implementation or b) it's a more radical departure and is not based on paving the cow path that is the current SAMRecord API give or take some re-naming. I.e. if we're going to introduce an interface we should be open to broader change as @magicDGS suggests.

I also echo your own earlier comment that it should not be incumbent on htsjdk to deal with the consequences of choices made in downstream projects. If htsjdk were to be completely beholden to the GATK development team's choices that would be a real problem. Treating use of deprecated code as compile breaks is a choice, but should not mean that we now have to treat deprecation as a breaking change in HTSJDK. If that were true there would be literally no way to evolve the library! In addition, nobody is forcing GATK to use the latest HTSJDK snapshot - if that's disruptive why not simply pin the dependency at 2.9.1 until you're ready? Also, how is it that GATK is not insulated from changes in SAMRecord via its use of GATKRead and SAMRecordToGATKReadAdapter?

@droazen
Contributor

droazen commented May 3, 2017

@tfenne Surely changes to a core class like SAMRecord require some coordination among stakeholders and maintainers of the project -- and if one stakeholder would be inconvenienced and requests that a change be delayed by a few weeks, in the spirit of collegiality I'd hope that request would be honored unless the change is particularly urgent. "Making method names nicer" does not seem to me to rise to that level of urgency. I feel like there have been many occasions when we've been asked to hold back much more substantial changes for weeks or months out of deference to the Picard project.

@yfarjoun
Contributor

yfarjoun commented May 4, 2017

If you (@tfenne) and @nh13 are open to the possibility of changing some names (again) in several weeks, there seems to be a third option that enables you to start moving forward with an experimental API without disrupting the GATK project, whose objection to the suggested change is temporal, not categorical (as in, in a month or so we will gladly accept the disruption caused by the deprecation; there is no objection to the mere idea of deprecating functions).

The option I'm thinking of is the following:

Add the functions that you are currently suggesting annotated with (a new) @Experimental annotation and do not @Deprecate the current versions. This will enable you to move forward in fgBio and would also convey clearly that these functions/names are not final and that whoever uses them risks needing to change some code in the future.
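
A marker annotation of the kind suggested here could be sketched as below. Everything in the block is hypothetical: the `Experimental` annotation is defined locally for illustration, and `ExperimentalDemo` stands in for SAMRecord. The old accessor is left un-deprecated, and the new one is tagged so that tools or reviewers can spot it.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker: the annotated API may still be renamed or removed. */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
@interface Experimental {}

/** Hypothetical stand-in for SAMRecord. */
class ExperimentalDemo {
    private boolean unmapped;

    /** Existing accessor: left as-is, not deprecated. */
    public boolean getReadUnmappedFlag() { return unmapped; }

    /** New accessor: usable now, but the name is not final. */
    @Experimental public boolean isUnmapped() { return unmapped; }

    /** Reports whether a public no-arg method of this class carries the marker. */
    static boolean isExperimental(final String methodName) {
        try {
            return ExperimentalDemo.class.getMethod(methodName).isAnnotationPresent(Experimental.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Unlike @Deprecated, a custom annotation like this produces no compiler warning on its own, which is exactly why it would not trip a downstream build that treats deprecation warnings as errors.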

@tfenne
Member Author

tfenne commented May 6, 2017

@yfarjoun Thanks for the suggestion. I honestly didn't think this PR was going to generate as much discussion as it did. My goal here was to spruce up the API to SAMRecord here, and then create custom sub-classes in fgbio for use with a SAMRecordFactory.

The names I picked here are already something of a compromise w.r.t. idiomatic scala. Since the PR generated the response it did, I spent a bit more time on the scala side and have found a neat way of totally encapsulating the API to SAMRecord without having to wrapper it(!). For anyone interested (it's all in scala), the prototype of that work is here.

Having done that, I'm now much less dependent on the particulars of the SAMRecord API and am happy to wait. I can either leave the PR open, or close the PR and leave the branch up here, so it's available as input to the future discussion. Thoughts on which is preferable?

@yfarjoun
Contributor

yfarjoun commented May 7, 2017 via email

@yfarjoun
Contributor

yfarjoun commented Jul 4, 2018

Can this PR be closed?

@magicDGS
Member

magicDGS commented Jul 6, 2018

I guess so. This would be addressed in the htsjdk-next-beta Read interface.

@tfenne tfenne closed this Jul 6, 2018

8 participants