Question about luke-luke-4.10.4-field-reconstruction #175
Description
Hi, I am opening an issue as I can't find another way to ask Dmitry a question - sorry if this is the wrong approach.
Before I realised that you had made this patch, I wrote my own code to do the same thing, which I am still experimenting with. I then found your patch and compared the two, specifically the part that reconstructs the data when there is no position information.
I note that (IIUC) your new code takes the first term and adds it to the GrowableStringArray without checking if it is for the current document. Is this correct? In the alternate case where position info is available, there is a loop which discards any terms which are not for the current document:
    int num = dpe.advance(docNum);
    if (num != docNum) { // either greater than or NO_MORE_DOCS
        continue; // no data for this term in this doc
    }
Shouldn't both cases do the same thing?
In my code (as there is no DocsAndPositionsEnum) I use the DocsEnum instead to cross-reference each term against the current document:
```java
DocsEnum docsEnum = termsEnum.docs(null, null);
if (docsEnum != null) {
    int termDoc;
    while ((termDoc = docsEnum.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
        if (termDoc != docNum) {
            continue; // this term is not for this document
        }
        GrowableStringArray gsa = (GrowableStringArray) res.getReconstructedFields().get(fld);
        if (gsa == null) {
            gsa = new GrowableStringArray();
            res.getReconstructedFields().put(fld, gsa);
        }
        String term = termsEnum.term().utf8ToString();
        System.out.print("(Cross-referenced term " + term + " via doc " + termDoc + ") ");
        gsa.append(0, "|", term);
    }
}
```
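To make the check I am asking about concrete, here is a minimal, Lucene-free sketch of the `advance(docNum)` test used in the positions branch. The `PostingsCheck` class, its nested `Postings` class and `termOccursInDoc` are hypothetical stand-ins I made up for illustration; they are not part of Luke or Lucene. The only thing borrowed from Lucene is the contract I am assuming for `advance`: it returns the first doc id at or beyond the target, or `NO_MORE_DOCS` if there is none.

```java
final class PostingsCheck {
    // Same sentinel value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for one term's postings: doc ids in ascending order. */
    static final class Postings {
        private final int[] docs;
        private int idx = 0;

        Postings(int... docs) {
            this.docs = docs;
        }

        /** Mimics advance(target): first doc id >= target, else NO_MORE_DOCS. */
        int advance(int target) {
            while (idx < docs.length && docs[idx] < target) {
                idx++;
            }
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }
    }

    /** True only if the term's postings contain exactly docNum. */
    static boolean termOccursInDoc(Postings postings, int docNum) {
        // A result greater than docNum (or NO_MORE_DOCS) means
        // there is no data for this term in this doc.
        return postings.advance(docNum) == docNum;
    }

    public static void main(String[] args) {
        // The term occurs in docs 1, 4 and 9 in both cases.
        System.out.println(termOccursInDoc(new Postings(1, 4, 9), 4));
        System.out.println(termOccursInDoc(new Postings(1, 4, 9), 5));
    }
}
```

If the same comparison were applied in the branch without positions, a term would only be appended when it really occurs in the current document, and a single `advance` call would replace walking every posting with `nextDoc()`.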
Do you see what I mean? Or have I missed something?
Thanks
- Chris