Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.Net: [Feature Branch] Vector Store Logging #10865

Open
wants to merge 12 commits into
base: feature-vector-data-telemetry
Choose a base branch
from

Conversation

dmytrostruk
Copy link
Member

@dmytrostruk dmytrostruk commented Mar 9, 2025

Motivation and Context

Related: #10596

This PR enhances the vector store functionality by adding builder pattern implementations and logging decorators for key vector data interfaces, along with supporting utilities and DI integration.

Interfaces handled:

  • IKeywordHybridSearch<TRecord>
  • IVectorizableTextSearch<TRecord>
  • IVectorizedSearch<TRecord>
  • IVectorStore
  • IVectorStoreRecordCollection<TKey, TRecord>

Class types added:

  • Builders: implement a pipeline pattern with Use and Build methods.
  • Logging Extensions: provide reusable logging methods for tasks and enumerables, handling success, failure and cancellation.
  • Logging Decorators: wrap the interfaces to add logging by using LoggingExtensions reusable logging methods.
  • Builder Extensions: Add AsBuilder methods to convert service instances into builders.
  • Logging Builder Extensions: Add UseLogging methods to integrate logging into builder pipelines.
  • Service Collection Extensions: Add Add{T} and AddKeyed{T} methods for DI registration with configurable lifetimes.

Other changes:

  • Split Verify class into Verify and KernelVerify in order to be able to use Verify methods in Microsoft.Extensions.VectorData package without a reference to Semantic Kernel specific logic, like ValidPluginName, ValidFunctionName etc.
  • Added unit tests for new classes and methods.
  • Added usage example in Concepts/Memory folder.

Usage example

Logging with manual registration:

var vectorStore = new InMemoryVectorStore()
    .AsBuilder()
    .UseLogging(this.LoggerFactory)
    .Build();

Logging with DI:

serviceCollection.AddInMemoryVectorStore();

serviceCollection
    .AddVectorStore(s => s.GetRequiredService<InMemoryVectorStore>())
    .UseLogging(this.LoggerFactory);

Contribution Checklist

@dmytrostruk dmytrostruk self-assigned this Mar 9, 2025
@dmytrostruk dmytrostruk requested a review from a team as a code owner March 9, 2025 23:01
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel kernel.core labels Mar 10, 2025
@github-actions github-actions bot changed the title [Feature Branch] .Net: Vector Store Logging .Net: [Feature Branch] .Net: Vector Store Logging Mar 10, 2025
@dmytrostruk dmytrostruk changed the title .Net: [Feature Branch] .Net: Vector Store Logging .Net: [Feature Branch] Vector Store Logging Mar 10, 2025
Copy link
Member

@markwallace-microsoft markwallace-microsoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some documentation updates (I didn't comment everywhere) and the question about adding the same instance twice.


// Register InMemoryVectorStore with enabled logging.
serviceCollection
.AddVectorStore(s => s.GetRequiredService<InMemoryVectorStore>())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate. I wonder if we should just change AddInMemoryVectorStore to return the builder, or add an overload like AddInMemoryVectorStoreBuilder(), which returns the builder instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but connector developers will need to make sure that they have similar extension method available on their side. I think that method like AddVectorStore is also useful for cases when there are no extension methods available for specific connector/abstraction implementation. There is an overload for this method to simply accept IVectorStore instance.

For connectors that exist on our side at the moment, we can add an additional extension method to operate with builder, but I would probably do that in a separate PR to minimize the number of changes in this one. Please let me know what you think, thanks!

namespace Memory;

/// <summary>
/// A simple example showing how to ingest data into a vector store and then use vector search to find related records to a given string
Copy link
Contributor

@westey-m westey-m Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do a sample around what this looks like for HybridSearch and VectorizableTextSearch.
Also, I think we may have some issues with cases where someone just created a VectorStoreRecordCollection instance and they want to add logging to it, and then they want to cast it to a VectorizableTextSearch and start searching and they expectation would be that logging would be there for it too...

It's almost like we might need a single decorator class that implements all the interfaces and contains all logging, although that has drawbacks too, in that someone can cast to an interface that the decorator has but the underlying implementation doesn't support.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do a sample around what this looks like for HybridSearch and VectorizableTextSearch.

Yes, I'm definitely going to add that. It would be nice if we add an example for HybridSearch first, so then I can re-use that example to show how to enable logging, if possible :)

I think we may have some issues with cases where someone just created a VectorStoreRecordCollection instance and they want to add logging to it, and then they want to cast it to a VectorizableTextSearch and start searching and they expectation would be that logging would be there for it too

Do we think that casting would be a common approach here? Let's imagine we have an ASP.NET Web API application, and we have a service, that performs vector search using text. I would expect that this service will accept IVectorizableTextSearch<MyRecord> in constructor and it's a matter of configuration which instance of that interface is going to be passed in the service and whether it supports logging or not. Same for other interfaces from VectorData.

In case of VectorStoreRecordCollection which implements multiple interfaces, I think that user registers specific vector db collection once (like RedisVectorStoreRecordCollection), and then use it to register IVectorStoreRecordCollection, IVectorizableTextSearch and wrap each registration to support logging. I think it's really flexible approach that user can control where exactly logging or any other decoration is needed, but of course we can provide even more extension methods for easier usage (for example, the extension method that will register IVectorStore, IVectorStoreRecordCollection and IVectorizableTextSearch with already enabled logging and so on).

Please let me know what you think about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation kernel.core kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants