feat: add conversation id to HRI message #480

rachwalk · 2025-03-25T15:38:41Z

Purpose

It is useful to have a conversation id assigned when a message is passed between different agents.

Proposed Changes

Adds a conversation id to HRIMessage.
Makes the ASR agent "follow" a single conversation, so delayed messages from other conversations don't affect it at runtime.
Makes the TTS and conversational agents handle the conversation id.
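
To illustrate the "follow a single conversation" behaviour, here is a minimal sketch. It is not the code from this PR: `Message` and `ConversationFollower` are hypothetical names, and `Message` only stands in for HRIMessage with the fields needed for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical stand-in for HRIMessage; only the fields needed here."""

    text: str
    conversation_id: Optional[str] = None


class ConversationFollower:
    """Accepts messages only from the conversation currently being followed."""

    def __init__(self) -> None:
        self.active_id: Optional[str] = None

    def start(self, conversation_id: str) -> None:
        # Switching to a new conversation makes every older id stale.
        self.active_id = conversation_id

    def accept(self, message: Message) -> bool:
        # Delayed messages from a previous conversation are rejected.
        return (
            self.active_id is not None
            and message.conversation_id == self.active_id
        )
```

With this shape, a late message carrying an old id is simply ignored once `start` has been called with a newer id.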

Issues

N/A

Testing

tests pass

maciejmajek · 2025-03-27T09:32:45Z

src/rai_core/rai/communication/hri_connector.py

    def from_langchain(
        cls,
        message: LangchainBaseMessage | RAIMultimodalMessage,
+        conversation_id: Optional[str] = None,


Both RAIMultimodalMessage and LangchainBaseMessage have an id field. How about using that instead of a new argument?

From the Langchain documentation: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.base.BaseMessage.html#langchain_core.messages.base.BaseMessage.id - the Langchain id field is unique per message. As discussed, conversation_id should not be assumed to be unique per message, so the Langchain id cannot be used. RAIMultimodalMessage uses the same id as the aforementioned one.

Also note that the documentation says "This should ideally be provided by the provider/model which created the message." It is therefore not guaranteed to be provided by every provider/model, and relying on it adds unnecessary complexity, especially given that logically the message id is not the conversation id: e.g. multiple HRI messages are published on /to_human with the same id during the response to a single prompt.

In that case, I think we should consider adding chunk_id and message_id to the HRI message instead. This way it would be semantically complete for both streaming and standard use case. What do you think?

adding chunk_id and message_id to the HRI message instead

What is the difference between these two? And how would their pair be used instead of conversation id?

This way it would be semantically complete for both streaming and standard use case.

I do not understand what you mean by "semantically complete".

What is the difference between these two? And how would their pair be used instead of conversation id?

message_id refers to the unique identifier for a specific message (previously referred to as conversation_id, but renamed for clarity, as "conversation" was unnecessarily specific).
chunk_id identifies a specific part of a message.

In streaming use cases, chunk_id is necessary to track individual parts of a message as they are received.
In standard (non-streaming) use cases, chunk_id can be set to None, indicating that the message was received in full.

I do not understand what you mean by "semantically complete".

This refers to whether the message (or chunk) contains all the necessary information to be considered complete in meaning—i.e., whether it represents a full, coherent message or only a partial one.

As discussed outside of Git, message_id naturally corresponds to the primary key of the message, meaning it must be unique. Other possible names for message_id include conversation_id, communication_id, etc.
The rest of the comment remains valid.

Also, a bool is_stream has been proposed outside of Git.
I'd rather use two ids instead.

To summarise for clarity:

there will not be message_id, as there is apparently no need for this information to be included

there will be communication_id, which identifies a single instance of communication (request/response), whether streaming or not

there will be chunk_id which will be set to None in non-streaming cases, and otherwise will identify specific chunks of the message

Then I have a question: it seems to me that there is no use case for chunk_id being anything other than a boolean unless additional data is provided, i.e. either a (creation) timestamp is added to the message or chunk_ids are sequential. Either option would allow the stream to be reconstructed when messages arrive out of order, as may be the case with some communication protocols.

Otherwise, if chunk_id is not to be sequential and a timestamp is not to be provided, I see no rationale for using an id instead of a boolean to identify streaming. If there is one, it would be helpful if you could provide it.
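
For illustration, this is roughly what sequential chunk ids would make possible on the receiving side. It is only a sketch under the assumption that ids start at 0 and increase by one; `ChunkReorderBuffer` is a hypothetical name, not something in the codebase.

```python
from typing import Dict, List


class ChunkReorderBuffer:
    """Releases chunks in sequence order even when they arrive out of order."""

    def __init__(self) -> None:
        self._pending: Dict[int, str] = {}
        self._next = 0  # assumption: sequence ids start at 0

    def push(self, chunk_id: int, text: str) -> List[str]:
        # Store the chunk, then release every chunk that is now contiguous.
        self._pending[chunk_id] = text
        ready: List[str] = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready
```

A boolean flag alone could not support this, since it says nothing about where a chunk belongs in the stream.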

To re-summarize:

communication_id stays, defined as above

seq_no will be added, providing a sequential id for chunks (starting at 0). This is the same for streaming and non-streaming messages (a single message is by definition the 0th message in the sequence)

is_done, a boolean field signifying whether the communication is finished, will be added
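
A minimal sketch of how the three fields from this summary could fit together, for both the single-message and the streaming case. `Chunk`, `single` and `stream` are hypothetical names, and generating communication_id with uuid4 is an assumption made for the example, not something decided in this thread.

```python
import uuid
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical container for the three fields from the summary."""

    text: str
    communication_id: str
    seq_no: int = 0
    is_done: bool = True


def single(text: str) -> Chunk:
    # A non-streaming message is the 0th and final element of its sequence.
    return Chunk(text, communication_id=str(uuid.uuid4()))


def stream(parts: List[str]) -> List[Chunk]:
    # All chunks of one response share a communication_id (uuid4 is an
    # assumption); only the last chunk is marked as done.
    communication_id = str(uuid.uuid4())
    last = len(parts) - 1
    return [
        Chunk(part, communication_id, seq_no=i, is_done=(i == last))
        for i, part in enumerate(parts)
    ]
```

Under this shape a receiver needs no separate streaming flag: a message with seq_no 0 and is_done set is a complete, non-streamed message.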

rachwalk requested a review from maciejmajek March 25, 2025 15:38

maciejmajek reviewed Mar 27, 2025

rachwalk force-pushed the feat/add-id-hrimessage branch from a7e2ce2 to ac729d9 Compare March 31, 2025 14:08

rachwalk added 11 commits April 1, 2025 12:39

add conversation_id optional to HRI message

e3882fa

add adding conversation id to asr agent

3f3017e

test: update tests to use conversation id

7393d1b

feat: update conversational agent to transmit conversation id

9478e4d

feat: update asr agent to use conversation id

0fe4684

feat: update react agent to use conversation id

b607861

fix: race condition when a previously discarded speech id would show again before new speech is generated

8c7a445

feat: react agent now uses langchain id instead

6856825

feat: add seq_no and seq_end fields to hri message

5195e91

refactor: rename conversation_id to communication_id

b780140

feat: add testing for conversation_id, seq_no, and seq_end

fb12a21

rachwalk force-pushed the feat/add-id-hrimessage branch from 9f88c47 to fb12a21 Compare April 1, 2025 10:40

rachwalk added 3 commits April 1, 2025 12:58

feat: update langchain llm callback to utilize seq_no and seq_end

7974066

fix: fix broken tests

61832ff

fix: broken test class

3eb45c4

maciejmajek self-requested a review April 1, 2025 13:11

maciejmajek approved these changes Apr 1, 2025

maciejmajek merged commit 2a2c613 into development Apr 1, 2025
5 checks passed

maciejmajek deleted the feat/add-id-hrimessage branch April 1, 2025 13:12
