HHH-19708 prototype support for read/write replicas #10754
Hello Gavin, I didn’t check the whole PR, but this reminded me of a (IMO) common mistake touted as a best practice in the Hibernate/Spring ecosystem. Often, the replica is updated asynchronously. In this case, binding the same SessionFactory to the replica and the main database can cause inconsistencies related to the second-level cache. You can take a look at https://stackoverflow.com/a/69183807/3761154 for more context.
Mmmm, yeah, that's a good point. You would sorta have to disable second-level cache puts from a read-only session. Not obvious to me that you can do much more than that, though.
I think this is not enough. https://github.com/reda-alaoui/ro-rw-routing/tree/af4502dffd266b8457f0f5f49670c461d7a9a399 demonstrates that the cache can contain entities existing on the main (therefore populated from main) but not yet on the replica. A query using the replica might rely on a mix of replica data and cached data not existing yet on the replica.
E.g. Entity A depends on Entity B. Entity A is marked cacheable, not B. RW creates A + B. RO transaction loads A from the cache, then tries to read B directly from the lagging RO database. B is not found by RO transaction because B is not yet in |
I think a reasonable assumption is that the read-only replica satisfies the referential integrity constraints. That is, that foreign keys point to rows that exist in the replica. If that's not the case, I would say that an error is expected.
If one is really concerned about the situation you describe, then set the read-only session to |
In my example, the referential integrity is satisfied by main and replica:
Disabling the second-level cache on RO side is a solution. But in our case, we wanted to keep it for performance reasons. All of this to say that my solution was to have one SessionFactory per database, one for
I think it would be nice to document those gotchas somewhere.
I don't see how that could work. An item in the second-level cache for the replica is not going to be evicted when data is updated in the replica (unless you have some other infrastructure listening for replication events from the database and evicting the cache in response).
You are correct. I forgot this part, it's been a while. I think we made the separation and disabled the long term caching on the replica side. I suppose we couldn't use
I guess I would just say that replication and second-level caching are things that simply don't work very well together unless you know what you're doing and are very careful. But if you do know what you're doing, and if you are careful, I think you could get some mileage out of combining the two.
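The stale-read scenario discussed in this thread (A is cacheable, B is not, and the replica lags behind main) can be reproduced without Hibernate at all. The following is a minimal sketch in plain Java, with maps standing in for main, the replica, and a second-level cache shared by both sides. All class, field, and method names are hypothetical and only illustrate the timing problem; they are not part of this PR or of Hibernate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ReplicaLagSketch {

    // Hypothetical stand-ins: each map plays the role of table B, keyed by id.
    static final Map<Long, String> mainB = new HashMap<>();
    static final Map<Long, String> replicaB = new HashMap<>(); // lags behind main

    // Shared second-level cache for the cacheable entity A (A.id -> referenced B.id).
    static final Map<Long, Long> sharedCacheA = new HashMap<>();

    // Read-write unit of work: writes B to main and puts A into the shared cache.
    static void createOnMain(long aId, long bId) {
        mainB.put(bId, "B#" + bId);
        sharedCacheA.put(aId, bId); // A is cacheable, B is not
    }

    // Read-only unit of work: A is served from the shared cache, B is read from the replica.
    static Optional<String> loadBViaReplica(long aId) {
        Long bId = sharedCacheA.get(aId);
        if (bId == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(replicaB.get(bId));
    }

    // Asynchronous replication finally catches up.
    static void replicate() {
        replicaB.putAll(mainB);
    }

    public static void main(String[] args) {
        createOnMain(1L, 10L);
        // A is visible through the cache, but B has not reached the replica yet.
        System.out.println(loadBViaReplica(1L).isPresent()); // prints "false"
        replicate();
        System.out.println(loadBViaReplica(1L).isPresent()); // prints "true"
    }
}
```

The first lookup fails even though both main and the replica are each internally consistent, which is the point made above: the inconsistency comes from mixing cached state from main with rows read from the replica.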
hibernate-core/src/main/java/org/hibernate/internal/AbstractSharedSessionContract.java
Will this work help in any way to handle it all at the platform level (Quarkus/WildFly/Spring/...)? Because if the platform level will leverage a completely different mechanism, I'm afraid we're creating work (initial, and maintenance) for ourselves for little reason... It's not like using Hibernate outside of any platform is a recommended use case.
In principle, yes.
I completely disagree with this statement. "Standalone operation" is a required feature of JPA and is widely used.
1. allow a session to be created in a read-only mode 2. pass that mode through to the MultiTenantConnectionProvider
…replica This is better than throwing, because you might be using: - JDBC driver-level support for replicas, together with - true multi-tenancy
…ptions motivation for this in Javadoc
in case the MultiTenantConnectionProvider needs to access e.g. the TenantSchemaMapper
There are two scenarios contemplated here:
Connection.setReadOnly(true) as a hint (this, as I understand it, is the case for MySQL)
DataSource (this, I believe, is the case for Postgres and Oracle)
[Note that I'm new to all this and I might be misunderstanding something.]
In the second case we can't really do a whole lot better than just asking the user to write custom logic to select an appropriate source of connections, similar to what we ask them to do with database-based multi-tenancy, and so let's just piggy-back this off the MultiTenantConnectionProvider.
Of course, I would prefer all this to be handled at the platform level (i.e. Quarkus/WildFly) but I now realize that that's going to require a lot of coordination, and doesn't help people using Hibernate in standalone mode. So let's do something ourselves.
MultiTenantConnectionProvider, allowing it to select a read-only replica
Connection.setReadOnly(true) if appropriate
This is a sort-of "simplest possible" approach.
This new kind of read-only mode differs from setDefaultReadOnly() in that it's immutable. It does imply the previous sort of readonliness, but it also implies:
Connection.setReadOnly(true), and that
MultiTenantConnectionProvider is available, a connection to a read-only replica will be obtained.
One could argue that readonliness is an aspect of the transaction, not of the session, but given the complicated relationship between sessions and connections, I think this is probably more robust.
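As a rough illustration of the idea described above (a session-level read-only flag that is passed through to whatever hands out connections, which may then pick a replica and also call Connection.setReadOnly(true)), here is a plain-Java sketch. It deliberately does not use Hibernate's actual MultiTenantConnectionProvider signature or the API added by this PR: the ConnectionSource interface, RoutingConnectionSource, the readOnly parameter, and the proxy-based fake connections are all hypothetical, chosen so that the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

public class ReadOnlyRoutingSketch {

    /** Hypothetical: hands out connections to one physical database. */
    interface ConnectionSource {
        Connection getConnection() throws SQLException;
    }

    /** Hypothetical router: picks a source based on the session's read-only mode. */
    static final class RoutingConnectionSource {
        private final ConnectionSource primary;
        private final ConnectionSource replica; // null when no replica is configured

        RoutingConnectionSource(ConnectionSource primary, ConnectionSource replica) {
            this.primary = primary;
            this.replica = replica;
        }

        Connection getConnection(boolean readOnly) throws SQLException {
            // Scenario with a separate source for the replica: route explicitly.
            ConnectionSource source = (readOnly && replica != null) ? replica : primary;
            Connection connection = source.getConnection();
            if (readOnly) {
                // Scenario where the JDBC driver routes by itself: pass the hint along.
                connection.setReadOnly(true);
            }
            return connection;
        }
    }

    /** Fake JDBC connection that only tracks its name and read-only flag. */
    static Connection fakeConnection(String name) {
        boolean[] readOnly = { false };
        return (Connection) Proxy.newProxyInstance(
                ReadOnlyRoutingSketch.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "setReadOnly": readOnly[0] = (Boolean) args[0]; return null;
                        case "isReadOnly": return readOnly[0];
                        case "toString": return name;
                        default: throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    /** Obtains a connection and reports which source served it and in which mode. */
    static String describe(RoutingConnectionSource router, boolean readOnly) {
        try {
            Connection connection = router.getConnection(readOnly);
            return connection + " readOnly=" + connection.isReadOnly();
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RoutingConnectionSource router = new RoutingConnectionSource(
                () -> fakeConnection("primary"),
                () -> fakeConnection("replica"));
        System.out.println(describe(router, false)); // prints "primary readOnly=false"
        System.out.println(describe(router, true));  // prints "replica readOnly=true"
    }
}
```

When no replica is configured, the sketch falls back to the primary source and still sets the read-only hint, which loosely mirrors the choice in the commit notes of not throwing when a replica is unavailable.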
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.
https://hibernate.atlassian.net/browse/HHH-19708