Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid Generating Extra Gaps #102

Open
wants to merge 25 commits into
base: develop
Choose a base branch
from
Open

Avoid Generating Extra Gaps #102

wants to merge 25 commits into from

Conversation

bitsofbits
Copy link
Collaborator

This PR attempts to reduce the number of extra gaps being generated by tagging messages during the thinning phase
with is_possible_gap_end. Only tagged messages that have this set to true can generate a gap event.

Caution

This has not been tested. Do not merge without testing!

Changes

  • message_schema.py: add is_possible_gap_end column. Remove unused destination column.
  • smart_thin_records.py: Convert VesselLocationRecords to VisitLocationRecords on input. These have the added field is_possible_gap_end, which is initially set to False. When emitting points due to a gap, set is_possible_gap_end to True for the gap end.
  • port_visits_pipeline.py: load messages as VisitLocationRecord rather the VesselLocationRecord to accommodate is_possible_gap_end field.
  • create_in_out_events.py: Only generate gap events if the end point has is_possible_gap_end set.

bitsofbits and others added 22 commits August 24, 2022 07:18
* update regions

* remove bad eten peru anchorages

* remove pipe-tools dependencies

* reduce changes relative to master

* switch to using Dockerfile-scheduler in Dockerfile compose

* fix requirements and other docker infrastructure to run name_anchorages using Docker on Dataflow.

* update numpy due to security vulnerability

Co-authored-by: Nate Miller <[email protected]>
…ery and adjust the format string exp according to the amount of args
…lass in py8, shard port_state by last_timestamp
…am[gcp] to authenticate and remove numpy reference that comes already in the DF SDK 2.40.0
…sting overrides and those Hannah commited to Master
Several changes to update the read/write transfors and descriptions and labels
@@ -68,7 +81,7 @@ def thin(self, grouped_records, anchorage_map):
and rcd.timestamp - last_rcd.timestamp >= self.min_gap
):
active.add(last_rcd)
active.add(rcd)
active.add(rcd._replace(is_possible_gap_end=True))

if self.transition_map[(last_state, state)]:
Copy link

@Chr96er Chr96er Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought is that at this point the record is still being dropped due to thinning (if both points - before and after gap - are IN_PORT) , no?

edit: the result would be that we'd get rid of gaps altogether - in theory

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Chr96er , if there is a gap seen during the thinning it should keep the start and end of the gap in the thinned points, and (this is the key part) the end of the gap gets tagged with is_possible_gap_end. Then in creare_in_out_events, points with that tag, but no other points can emit gaps. So we should still get gaps when there's a legitimate gap.

It might make sense to only emit this pair of points when the current point is in port (since that's the only time we'd generate a gap), but I'm not sure it makes a difference one way or the other.

… which means not using it for last state. Also fix bug where we forgot to inherit from NamedTuple
…rect seg_id. Also clean up section that wasn't causing an issue, but was confusing and would likely lead to some bugs later
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants