Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing an annotation gets added as a new annotation #5

Open
davidfor opened this issue Dec 29, 2013 · 10 comments
Open

Changing an annotation gets added as a new annotation #5

davidfor opened this issue Dec 29, 2013 · 10 comments

Comments

@davidfor
Copy link
Owner

While testing for the Kobo devices, fetched the annotations, then I changed one on the device and did another fetch. The changed annotation was added as a new annotation in the column. Doing a third fetch, removed the original version and left only the second version.

After the first fetch, I had:

  <div class="annotation" genre=""
  hash="fe72eed43f4dedfcd9c003e1eafb0afd" location_sort="000200"
  reader="KoboTouch" style="margin:0 0 0.5em 0">
    <table cellpadding="0" width="100%"
    style="background-color:LightGray;color:black;font-size:80%;font-weight:bold;margin:0;"
    color="Gray">
      <tbody>
        <tr>
          <td class="location" style="text-align:left">Extra
          Chapters</td>
          <td class="timestamp" uts="1387924396.0"
          style="text-align:right">25/12/2013 09:33:16</td>
        </tr>
      </tbody>
    </table>
    <p class="highlight" style="margin:0;text-indent:0.5em;">Extra
    Chapters</p>
    <p class="note"
    style="font-size:90%;font-style:italic;margin:0;">0</p>
  </div>

After adding text to the annotation and fetching again, I have:

  <div class="annotation" genre=""
  hash="fe72eed43f4dedfcd9c003e1eafb0afd" location_sort="000200"
  reader="KoboTouch" style="margin:0 0 0.5em 0">
    <table cellpadding="0" width="100%"
    style="background-color:LightGray;color:black;font-size:80%;font-weight:bold;margin:0;"
    color="Gray">
      <tbody>
        <tr>
          <td class="location" style="text-align:left">Extra
          Chapters</td>
          <td class="timestamp" uts="1387924396.0"
          style="text-align:right">25/12/2013 09:33:16</td>
        </tr>
      </tbody>
    </table>
    <p class="highlight" style="margin:0;text-indent:0.5em;">Extra
    Chapters</p>
    <p class="highlight" style="margin:0;text-indent:0.5em;"></p>
    <p class="note"
    style="font-size:90%;font-style:italic;margin:0;">0</p>
    <p class="note"
    style="font-size:90%;font-style:italic;margin:0;"></p>
  </div>
  <div class="annotation" genre=""
  hash="8f8b5c8c247f7470afbd0ecc58ac74a1" location_sort="000200"
  reader="KoboTouch" style="margin:0 0 0.5em 0">
    <table cellpadding="0" width="100%"
    style="background-color:LightGray;color:black;font-size:80%;font-weight:bold;margin:0;"
    color="Gray">
      <tbody>
        <tr>
          <td class="location" style="text-align:left">Extra
          Chapters</td>
          <td class="timestamp" uts="1388271595.0"
          style="text-align:right">29/12/2013 09:59:55</td>
        </tr>
      </tbody>
    </table>
    <p class="highlight" style="margin:0;text-indent:0.5em;">Extra
    Chapters</p>
    <p class="note"
    style="font-size:90%;font-style:italic;margin:0;">0 - update
    1</p>
  </div>

On the third fetch, after disconnecting the device and restarting calibre, I get:

  <div class="annotation" genre=""
  hash="8f8b5c8c247f7470afbd0ecc58ac74a1" location_sort="000200"
  reader="KoboTouch" style="margin:0 0 0.5em 0">
    <table cellpadding="0" width="100%"
    style="background-color:LightGray;color:black;font-size:80%;font-weight:bold;margin:0;"
    color="Gray">
      <tbody>
        <tr>
          <td class="location" style="text-align:left">Extra
          Chapters</td>
          <td class="timestamp" uts="1388271595.0"
          style="text-align:right">29/12/2013 09:59:55</td>
        </tr>
      </tbody>
    </table>
    <p class="highlight" style="margin:0;text-indent:0.5em;">Extra
    Chapters</p>
    <p class="note"
    style="font-size:90%;font-style:italic;margin:0;">0 - update
    1</p>
  </div>

From looking at the code, the issue is the hash.

The method "merge_annotations", gets the old version of the annotations and the new version. It then gets a list of the hashes from both versions. Then it works out what hashes are in the new annotations but not the old. The annotations for these hashes are then added to old annotations.

The problem with this is the way the hashes are calculated. They are produced from the selected text and the annotation text. Changing one of these on the device will produce a new hash value. Hence they won't match and the changed annotation will appear as a new annotation.

I would suggest using the annotation_id. That shouldn't change when the annotation is modified (it definitely won't on the Kobo devices). Also, the hash can then be used to determine if the annotations have actually changed. If they haven't, the update of the metadata for the book can be skipped. This will mean the modification timestamp of the book will only change when there are actual changes to the metadata.

@davidfor
Copy link
Owner Author

Forgot to put the version details:

calibre: 1.17.0
Language: English
Annotations: 1.3.0
Windows 7, 32-bit

@GRiker
Copy link
Collaborator

GRiker commented Dec 30, 2013

@davidfor, I am looking at this. I'm currently traveling with infrequent internet access.

@GRiker
Copy link
Collaborator

GRiker commented Dec 31, 2013

I have a provisional version of 1.3.1 with a new algorithm for merging annotations based on timestamps, and a fix for issue #6. If you give me an email address, I can email it to you as I am unable to upload to Dropbox currently. I will be back online normally after the 4th.

@GRiker
Copy link
Collaborator

GRiker commented Dec 31, 2013

I was able to push the source for 1.3.1, so alternatively you can pull the latest source and build the plugin yourself.

@davidfor
Copy link
Owner Author

davidfor commented Jan 1, 2014

Got the code, but, it doesn't work. To be honest, I think it is worse.

With this, when I modified the annotation, there did not seem to be a change from before. But, when I fetched again without making changes, the original version of the annotation was not removed.

The first problem is the timestamp. The timestamp in "AnnotationStruct" is called "last_modification". I am populating that with when the annotation was last changed. So, it changes as the annotation is modified. The device does have a creation timestamp for the annotation and I could use that, but, I feel it would be better to display the modification timestamp.

For testing purposes, I did change to use the creation timestamp. This did update the annotation correctly. But, when I deleted an annotation on the device and did a fetch, it did not remove the deleted annotation.

If the timestamp is not supposed to change, then I can change my fetch code. But, I think displaying when the annotation was last modified is better than when it was created. Also, as I said before, I think putting the annotation_id in the annotation XML and using that for matching will work better. Then the timestamp or hash can be used to determine if the annotation has actually changed.

@GRiker
Copy link
Collaborator

GRiker commented Jan 1, 2014

The timestamp is assumed to always be the initial creation date. If an annotation is subsequently modified, the timestamp should stay the creation date. The logic is fairly simple - if the timestamp already is already stored, then the stored version is kept. This allows for the user to edit the annotation in calibre. If the timestamp exists only on the device, or only in the database, it is kept.

I would prefer that you try using the creation timestamp persistently before making more any more changes to the code (or reverting to the previous code). The logic will get much more complex if the timestamps aren't predictable, especially if the user edits the annotation in the calibre db.

@GRiker
Copy link
Collaborator

GRiker commented Jan 1, 2014

I've pushed up a new version of the source that restores the hash comparison as the default technique when merging annotations, with an option to use the timestamps. If you do nothing in your class, it will use the hashes as the merge_index. If you add a property to your class MERGE_INDEX = "timestamp" it will use the timestamp technique. See the Marvin class for an example of declaring MERGE_INDEX in your class.

It's likely that there's still a bug in the hash technique, as I didn't change that code, I just restored it. My recommendation is still that you try using the persistent creation timestamp, as I think that it will be more reliable and predictable.

@davidfor
Copy link
Owner Author

davidfor commented Jan 2, 2014

Have I missed an intention of how the annotations are to work? The second last comment seems to imply that you do not expect annotations to be removed from the calibre library unless someone edits them out. Is that? Is that correct or have I missed it.

For each book, what I think you are saying is:

  • If the annotation on the device is new, add it to the library
  • If the annotation has been deleted on the device, keep it in the library
  • If the annotation has been modified on the device, but not in the library, update the library.

But, if the annotation has been modified in the library but not on the device, what happens? Update the library, ignore the change or something else? Is this where the hash is used? If the save hash doesn't matches the incoming hash, then it hasn't been updated on the device and you keep whatever the current text is in the library. If they don't match, then you use the text from the device.

At the moment, I will be using the hash only method. Both the people I have had test the KoboTouch fetch expected that deleted annotations would be removed from the library. They also expected changed one would be updated. This isn't happening, but doing another fetch cleans it up. The timestamp method means that deleted annotations remain. They can be manually edited out, but that might not what people want to do. Plus, I am finding editing the annotations in the library a bit fragile.

@GRiker
Copy link
Collaborator

GRiker commented Jan 2, 2014

Here's some background on how merging was designed to work, and why.

I originally implemented the plugin for Marvin, an iOS reader application. It is quite common for Marvin users to have it installed on both an iPad and an iPhone. Therefore, the first objective was to allow incoming annotations to be merged from different devices for the same book. If the intent was to only fetch existing device annotations, there would be no need for a merge_annotations() function.

A second objective was to allow the user to edit a stored annotation in calibre. Suppose you created a note attached to a highlight with a typo. You spot the typo after fetching annotations, so you make the correction in the custom column in calibre. The user would expect that edit to be persistent, unless the user updates the annotation later on the device, in which case the device annotation wins.

A third objective was to support the "re-reading a book" scenario: You read a book, make a couple of annotations, fetch them with the plugin, then remove the book from your device. A year later, you decide to read the book again. You make a couple of new annotations. You'd want all of the annotations to be preserved.

It would be possible to add a switch to the config dialog specifying whether to merge or overwrite annotations, but this strikes me as unnecessarily complicated, and could easily lead to an unintended loss of older annotations.

@davidfor
Copy link
Owner Author

Firstly, my apologies for not responding sooner. I read your response and decided to think about it and how it affected what I had done for a couple of days. That has extended a bit more than planned :)

With that explanation, it makes sense what you are doing in the plugin. My personal use of annotations has been temporary markers for errors or things I want to look up when I am near a computer. Generally, I delete the annotations as I fix the problem in the book. Fetching them is handy as I can copy the marked text for searching,

As to an option to overwrite the existing annotations, I think that should be on the import dialog. It still has risk of unintended loss, but it lets the user choose at the time.

My code is finished and tested. I've done a lot of checks and have had a few other Kobo users test good results. How do you want to manage getting the code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants