How to specify parent child streams? #40

Open
silverbullet1 opened this issue Dec 17, 2023 · 3 comments

@silverbullet1

Hi!

I have two streams, X and Y.
Stream Y depends on X, i.e. Y needs to iterate over all the IDs returned by X's stream. How do I specify this in the config? Is this supported?

Thanks 😄

@jlloyd-widen
Contributor

I don't believe this is currently supported. Good request though. Feel free to submit a PR if you would like it or need it within a reasonable timeframe.

@silverbullet1
Author

silverbullet1 commented Dec 19, 2023

Sure @jlloyd-widen, I would love to work on this feature request. I was trying to understand the current logic, and I see that we call streams.append(DynamicStream(...)) for every stream discovered from the config. I was thinking of adding the following config to meltano.yml:

    streams:
      - name: x
        path: /x
        primary_keys:
          - id
      - name: y
        path: /y
        depends_on:
          - x
        primary_keys:
          - id

Then, before appending a dynamic stream, we would somehow have to append a child stream class. I am not sure how to specify parent_stream_type the way the official docs do, because they use concrete classes there, whereas we have a single DynamicStream class shared by all streams. How would we refer to a particular X among X1, X2, X3, etc.?

Thanks for the help 😄
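
One way the proposed depends_on key could be resolved is to index the configured streams by name, check that every named parent exists, and order the streams so that parents come first. The sketch below is plain Python over the parsed config. depends_on is the key proposed in this comment, not an existing tap option, and resolve_parents and build_order are hypothetical helper names.

```python
from typing import Dict, List

# Parsed form of the proposed meltano.yml "streams" setting.
# "depends_on" is the key proposed in this thread, not an existing option.
STREAMS_CONFIG: List[dict] = [
    {"name": "x", "path": "/x", "primary_keys": ["id"]},
    {"name": "y", "path": "/y", "depends_on": ["x"], "primary_keys": ["id"]},
]


def resolve_parents(streams: List[dict]) -> Dict[str, List[str]]:
    """Map each stream name to its parents, rejecting unknown parent names."""
    known = {s["name"] for s in streams}
    parents: Dict[str, List[str]] = {}
    for s in streams:
        deps = s.get("depends_on", [])
        missing = [d for d in deps if d not in known]
        if missing:
            raise ValueError(
                f"stream {s['name']!r} depends on unknown stream(s): {missing}"
            )
        parents[s["name"]] = list(deps)
    return parents


def build_order(streams: List[dict]) -> List[str]:
    """Return stream names with every parent listed before its children."""
    parents = resolve_parents(streams)
    ordered: List[str] = []
    visiting: set = set()

    def visit(name: str) -> None:
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"circular depends_on involving {name!r}")
        visiting.add(name)
        for parent in parents[name]:
            visit(parent)
        visiting.discard(name)
        ordered.append(name)

    for s in streams:
        visit(s["name"])
    return ordered
```

With an ordering like this, each parent stream could be built before the child that needs to reference it, regardless of the order in which the streams appear in the config.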

@jlloyd-widen
Contributor

Well, I'm not too sure off the top of my head since I've never instantiated child streams in this way before. All I know is that the only real difference between a parent stream and child stream from a code perspective is that the parent stream has a particular method defined within it that looks something like this:

    def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
        """Return a context dictionary for child streams."""
        return {
            "X_id": record["id"],
        }

A child stream, in turn, defines the parent_stream_type class attribute and uses the context passed down from the parent, for example in its path:

    parent_stream_type = XStream
    path = "/Xs/{X_id}/X-elements"
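
To make the flow concrete: for each parent record, the dictionary returned by get_child_context is what fills the {X_id} placeholder in the child's path. The following is a plain-Python illustration of that substitution only. It does not use the Singer SDK, and the parent records are made up.

```python
from typing import List, Optional

CHILD_PATH = "/Xs/{X_id}/X-elements"


def get_child_context(record: dict, context: Optional[dict]) -> dict:
    """Same shape as the parent method above, written as a free function."""
    return {"X_id": record["id"]}


def child_paths(parent_records: List[dict]) -> List[str]:
    """Build one child request path per parent record."""
    return [
        CHILD_PATH.format(**get_child_context(record, None))
        for record in parent_records
    ]


# Hypothetical parent records, for illustration only.
print(child_paths([{"id": 1}, {"id": 2}]))
# prints ['/Xs/1/X-elements', '/Xs/2/X-elements']
```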

My guess is that you can use our DynamicStream class but leave the respective methods and attributes null for streams where they don't apply.
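
One possible way to act on that guess, and to address the question of how to refer to a particular X, is to generate one subclass of DynamicStream per configured stream with Python's built-in type(), so that a child can point parent_stream_type at its parent's generated class. The sketch below uses a stand-in DynamicStream rather than the tap's real class. make_stream_classes, the "<name>_id" context key, and the depends_on config key are all hypothetical, and whether the SDK accepts classes generated this way would still need to be verified.

```python
from typing import Dict, List, Optional, Type


class DynamicStream:
    """Stand-in for the tap's DynamicStream; not the real implementation."""

    name: str = ""
    path: str = ""
    parent_stream_type: Optional[type] = None  # stays None for top-level streams

    def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
        # Hypothetical convention: expose the record id as "<name>_id".
        return {f"{self.name}_id": record["id"]}


def make_stream_classes(streams: List[dict]) -> Dict[str, Type[DynamicStream]]:
    """Create one DynamicStream subclass per configured stream.

    Assumes parents appear before their children in `streams`.
    """
    classes: Dict[str, Type[DynamicStream]] = {}
    for cfg in streams:
        deps = cfg.get("depends_on", [])
        attrs = {
            "name": cfg["name"],
            "path": cfg["path"],
            # Only a single parent is handled in this sketch.
            "parent_stream_type": classes[deps[0]] if deps else None,
        }
        class_name = f"{cfg['name'].title()}Stream"
        classes[cfg["name"]] = type(class_name, (DynamicStream,), attrs)
    return classes


classes = make_stream_classes([
    {"name": "x", "path": "/x"},
    {"name": "y", "path": "/x/{x_id}/y", "depends_on": ["x"]},
])
```

Here classes["y"].parent_stream_type is the class generated for x, while classes["x"].parent_stream_type stays None, which matches the idea of leaving the attribute unset where it does not apply.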
