Add CTE support to select in QueryBuilder #6621
base: 4.3.x
@nio-dtp, thanks for the PR. As this is a new feature, please retarget against
Please add a very basic unit test that would demonstrate the expected SQL and an integration test that would run the resulting query on the platforms that support CTE.
I'm curious to see how this will work with query parameters (both positional and named).
2936e39 to 686bf02
79cf2fb to 1e545fc
@morozov Thanks for the review. I've added some tests and a I'm not sure whether the QueryBuilder should check for platform support or whether it is enough to mention in the documentation that CTEs are not supported on the deprecated MySQL 5.7.
src/Query/QueryBuilder.php
Outdated
private function hasCTEs(): bool
{
    return 0 < count($this->with);
- return 0 < count($this->with);
+ return $this->with !== [];
I agree that the Yoda-style condition looks odd but the !== [] looks like a PHP-ism.
Generally, in order to compare two arrays, one first needs to compare their lengths and, if they are equal, iterate both arrays until a non-equal element is found.
In this case, checking if count($array) === 0 looks like a more meaningful alternative because we don't really need the full-fledged/generic comparison algorithm.
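For comparison, the two candidate checks can be put side by side. This is a standalone sketch in plain PHP, not code from the PR; the function names are made up for illustration and `$with` merely stands in for the builder's property. Both variants return the same result for any array, so the choice discussed above is about readability rather than behaviour.

```php
<?php

declare(strict_types=1);

// Variant from the suggestion: compare against the empty array literal.
function hasCtesByComparison(array $with): bool
{
    return $with !== [];
}

// Variant from the reply: look at the element count only.
function hasCtesByCount(array $with): bool
{
    return count($with) > 0;
}
```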
We try to avoid those If we did that, what would be the worst thing that could happen? MySQL 5.7 users will get a syntax error if and only if they try to run a query with a CTE. I could live with that. If we really want a nicer error message for MySQL 5.7 users, we could move the Either way: Please remove the
Thanks @nio-dtp for providing this PR. ❤️ I love to see that others have the same needs and are sticking to the DBAL way of implementing this. Let me add my five cents.
First, I would remove the platform CTE support methods, as @derrabus already mentioned and asked for. If we want to settle support questions up front, we would end up with multiple support methods, because even though a lot of vendors support CTEs in general, they have dirty little differences. I am more in favour of letting the database report what it does not understand and leaving it to the application to deal with that. DBAL should only provide a generic way to create and use WITH (CTE) syntax.
The second point I want to shed light on is the current implementation of the Regarding the
WITH
baseCTE AS (SELECT id, title FROM tableName),
secondCTE AS (SELECT id, title FROM baseCTE where id < 1000)
SELECT * FROM secondCTE
If
public function with(string $name, string|\Stringable|self $part): self
{
    $this->with = [];
    $this->with[] = [ /* adding first part */ ];

    return $this;
}
public function addWith(string $name, string|\Stringable|self $part, string ...$dependsOn): self
{
    $this->with[] = [
        /* adding additional part */
    ];

    return $this;
}
The current implementation would already allow to add a
WITH RECURSIVE
recursiveCTE AS (
-- initial query
SELECT id, parentId, title FROM table_a WHERE parentId = 0
UNION
-- recursive query
SELECT b.id, b.parentId, b.title
FROM table_b AS b
INNER JOIN recursiveCTE ON b.parentId = recursiveCTE.id
WHERE b.parentId > 0
)
SELECT * FROM recursiveCTE
With recursive CTEs the whole thing gets a little more involved, as different vendors require or allow laziness on different levels. For example:
WITH RECURSIVE
recursiveCTE(virtual_id, virtual_title, virtual_parentid) AS (
-- initial query
SELECT
id AS virtual_id,
title AS virtual_title,
parentId AS virtual_parentid
FROM table_a WHERE parentId = 0
UNION
-- recursive query
SELECT
b.id AS virtual_id,
b.title AS virtual_title,
b.parentId AS virtual_parentid
FROM table_b AS b
INNER JOIN recursiveCTE ON b.parentId = recursiveCTE.virtual_id
WHERE b.parentId > 0
)
SELECT * FROM recursiveCTE
That is not fully possible with the current implementation. So we could consider
In any case, the
From an internal handling perspective, I suggest introducing an internal class for CTE parts, similar to the internal Join part class, carrying the required options (name, fields, depends, isRecursive, etc.). That would make a dedicated flag unnecessary: the array can be iterated later and the RECURSIVE keyword added as soon as at least one part has it set. Alternatively, an additional container around the parts could be implemented that tracks internally when a recursive part gets added. I would not make that class public API. With such a
To be honest, I have not yet investigated which DBAL-supported platforms allow RECURSIVE CTEs on top of normal CTEs and which do not; this is still outstanding. I started a custom implementation a couple of months ago [1], but delayed it due to time constraints and implemented it as a custom solution within the TYPO3 decorated QueryBuilder (prefixed) as a test balloon [2], as we needed it for the release after implementing a first (quite advanced) usage [3].
To summarize, this PR is a good start but in my eyes not finished yet (no blame intended), because in general it is the same thing I started with before adding the additional stuff. To be honest, my working state also had the support methods, but I had already considered dropping them again before making a pull request out of it.
I propose that we first clarify what Doctrine DBAL wants to support out of all these things and what not, and then where to continue: either @nio-dtp is open to adopting it and continuing, or I polish and finish my work (which is already some steps further) and adapt it to the decisions made (dependency sorting or not, internal class usage or not, ...).
Suggested method(s):
/**
* @param non-empty-string[] $fields
*/
public function with(
string $identifier,
string|\Stringable|QueryBuilder $part,
array $fields = [],
WithType $type = WithType::SIMPLE, // WithType::RECURSIVE for recursive part
): self {}
/**
* @param non-empty-string[] $dependsOn
* @param non-empty-string[] $fields
*/
public function addWith(
string $identifier,
string|\Stringable|QueryBuilder $part,
array $dependsOn = [],
array $fields = [],
WithType $type = WithType::SIMPLE, // WithType::RECURSIVE for recursive part
): self {}
Developers implementing a recursive CTE need to use a dedicated QueryBuilder instance to build a union query, using the union support to define it. If that is not done, the database should report it as an issue; Doctrine should not try to scan for it or throw a custom exception. Recursive CTEs were the reason why I started and contributed the union support for the QueryBuilder, as preparation for providing a CTE implementation. In my POC/WIP I extended the I just would not merge this one too quickly before considering the aforementioned points, at least for an overall strategy, and either doing it directly or with follow-ups.
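The internal part class suggested above could look roughly like the following. This is only a sketch under assumptions: `CtePartSketch` and `renderWithClause()` are hypothetical names, not Doctrine DBAL classes or functions, and platform differences (for example whether a vendor accepts the RECURSIVE keyword at all) are ignored. It shows the RECURSIVE keyword being derived from the parts instead of a separate flag, and the optional column list.

```php
<?php

declare(strict_types=1);

/** Hypothetical value object for one CTE part; not part of Doctrine DBAL. */
final class CtePartSketch
{
    /** @param list<string> $columns optional column list of the CTE */
    public function __construct(
        public readonly string $name,
        public readonly string $sql,
        public readonly array $columns = [],
        public readonly bool $recursive = false,
    ) {
    }
}

/**
 * Renders the WITH clause for the given parts (hypothetical helper).
 *
 * @param list<CtePartSketch> $parts
 */
function renderWithClause(array $parts): string
{
    if ($parts === []) {
        return '';
    }

    $recursive = false;
    $rendered  = [];

    foreach ($parts as $part) {
        // One recursive part is enough to require the keyword.
        $recursive  = $recursive || $part->recursive;
        $columns    = $part->columns === [] ? '' : '(' . implode(', ', $part->columns) . ')';
        $rendered[] = $part->name . $columns . ' AS (' . $part->sql . ')';
    }

    return 'WITH ' . ($recursive ? 'RECURSIVE ' : '') . implode(', ', $rendered);
}

$with = renderWithClause([
    new CtePartSketch('baseCTE', 'SELECT id, title FROM tableName'),
    new CtePartSketch('secondCTE', 'SELECT id, title FROM baseCTE WHERE id < 1000'),
]);
$sql = $with . ' SELECT * FROM secondCTE';
```

Parts are rendered in the order they were added, which matches the idea of leaving dependency ordering to the caller.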
src/Query/QueryBuilder.php
Outdated
@@ -342,7 +350,7 @@ public function getSQL(): string
QueryType::INSERT => $this->getSQLForInsert(),
QueryType::DELETE => $this->getSQLForDelete(),
QueryType::UPDATE => $this->getSQLForUpdate(),
- QueryType::SELECT => $this->getSQLForSelect(),
+ QueryType::SELECT => $this->getSQLForCTEs() . $this->getSQLForSelect(),
What if the CTE query builders have parameters? It looks like the parameter needs to be declared in the CTE builder but bound to the top-level one. This is non-obvious and potentially unusable. If the CTE builder contains bound parameters, they will be silently ignored.
As an end user, I'd expect that the CTE builder defines both the query and parameters, and they are taken into account by the top-level one. This way, the builders are naturally composable.
Correct, parameters have to be defined on the top QueryBuilder. In the case of a CTE, that is (in most cases) the one building the WITH chain. If the query is used as a sub-query, they need to be defined on the main query that uses the CTE as a sub-query.
That is already the case when using multiple QueryBuilder instances to create queries with sub-selects, and also when using the union support with sub QueryBuilder instances.
It would be the same requirement here, and I would expect it to work the very same way as all the other parts. Developers have to take care to define parameters on the correct instance in all these cases.
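The behaviour under discussion can be illustrated with a deliberately tiny stand-in. `ToyBuilder` below is a hypothetical class written for this sketch, not Doctrine's QueryBuilder, and its placeholder naming is invented. It only models the one property that matters here: the outer builder embeds the inner builder's SQL text, so a parameter reaches the final query only if it was registered on the outer builder.

```php
<?php

declare(strict_types=1);

/** Toy stand-in used only for illustration; it is NOT Doctrine's QueryBuilder. */
final class ToyBuilder
{
    private string $sql = '';

    /** @var array<string, mixed> */
    private array $parameters = [];

    /** Registers a value on THIS builder and returns its placeholder. */
    public function createNamedParameter(mixed $value): string
    {
        $name                    = 'p' . (count($this->parameters) + 1);
        $this->parameters[$name] = $value;

        return ':' . $name;
    }

    public function setSql(string $sql): self
    {
        $this->sql = $sql;

        return $this;
    }

    public function getSql(): string
    {
        return $this->sql;
    }

    /** @return array<string, mixed> */
    public function getParameters(): array
    {
        return $this->parameters;
    }
}

$outer = new ToyBuilder();
$inner = new ToyBuilder();

// The placeholder is created on the outer builder but used in the inner SQL.
$inner->setSql('SELECT id FROM for_update WHERE id = ' . $outer->createNamedParameter(1));

// Only the inner SQL text is embedded; the inner builder's own parameters are never consulted.
$outer->setSql('WITH cte_a AS (' . $inner->getSql() . ') SELECT * FROM cte_a');
```

In this model, a value registered via `$inner->createNamedParameter()` would leave a placeholder in the SQL with no bound value on `$outer`, which is the composability concern raised in the review.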
Thank you to all the reviewers for their detailed feedback. I'll rework my PR and make a more complete proposal.
I'd improve the following aspects:
As for the "reset" method – what is the use case for it? It would basically allow to disregard the logic of a pre-built query. If it wasn't necessary, why declare it in the first place?
I talked with @derrabus on Sunday about this and we came up with the following, using 4 methods:
/**
* @param string[] $fields
*/
public function with(
string $name,
string|\Stringable|self $part,
array $fields = [],
): self {}
/**
* @param string[] $fields
*/
public function addWith(
string $name,
string|\Stringable|self $part,
array $fields = [],
): self {}
/**
* @param string[] $fields
*/
public function withRecursive(
string $name,
string|\Stringable|self $initialOrUnionPart,
string|\Stringable|self|null $recursivePart,
array $fields = [],
): self {}
/**
* @param string[] $fields
*/
public function addWithRecursive(
string $name,
string|\Stringable|self $initialOrUnionPart,
string|\Stringable|self|null $recursivePart,
array $fields = [],
): self {}
If a UNION query is passed as the first, required part for the recursive variants, it is simply used. If both parts are passed, a union query should be created internally (using the union QueryBuilder API without allowing duplicates). This should be explained in the method PHPDoc blocks; by providing a concrete single part, the developer has full control over the recursive union block. to follow semantics and logic similar to with or withRecursive will reset the internal array and create a first element, whereas the addWith* methods add an additional entry to the internal array (DTO object). The
We think that a dedicated
It should be possible to define the fields for the CTE part, but this should be optional:
WITH
customCte(virtual_id, virtual_field)
AS (SELECT id AS virtual_id, somefield AS virtual_field FROM sometable)
SELECT * FROM customCte
Regarding my point about depends, especially for recursive CTEs: we discussed it and decided not to provide an API for it in this low-level implementation. Developers or frameworks (for example an ORM or similar) should keep track on their own and add the parts in the required and correct order. There will be no custom sorting or cycle detection; such problems will be reported by the database when the query is executed.
We had no strong opinion on the internal implementation and the building of the query, so the current switch form may be okay. Personally, I think it would make more sense to move that into the DefaultSelectQueryBuilder and pass the with array (of DTOs) within the with/addWith and withRecursive/addWithRecursive naming would follow the semantics Doctrine already has for the other parts, except that the *Recursive variant makes it clearer. Sure, the PHPDoc block needs to make that clear. In the end, it kind of follows the design approach of the QueryBuilder.
It is also worth noting that the top (outermost) QueryBuilder instance (in most cases the instance on which the with/addWith/withRecursive/addWithRecursive methods are used) needs to be used for creating named placeholders (parameters). That matches the requirement and flow already needed when using the QueryBuilder to create sub-queries or for the union support, and can therefore be expected. (Remark to #6621 (comment))
Taking this, we could make this one a first implementation for the
My question is whether @nio-dtp wants to update and work on this and has the time for it in the near future. Otherwise I would rebase and finish my work in my fork and provide an additional PR in the next two weeks (it was started after an original pitch and discussion with @derrabus but has not been opened as a (draft) PR yet due to time constraints). I'm totally fine with not doing anything and testing/verifying/reviewing this later on, but I'm also fine with finishing mine (because of access) and mentioning @nio-dtp as co-author then.
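The reset-versus-append semantics described for with and addWith can be sketched as follows. `WithPartsSketch` is a hypothetical holder written for this illustration, not the PR's implementation; it accepts plain SQL strings only and keeps parts in insertion order, with no sorting or cycle detection, as decided above.

```php
<?php

declare(strict_types=1);

/** Hypothetical holder illustrating reset-vs-append semantics; not DBAL code. */
final class WithPartsSketch
{
    /** @var list<array{name: string, sql: string, columns: list<string>}> */
    private array $parts = [];

    /** @param list<string> $columns */
    public function with(string $name, string $sql, array $columns = []): self
    {
        // with() starts over: earlier parts are discarded.
        $this->parts = [];

        return $this->addWith($name, $sql, $columns);
    }

    /** @param list<string> $columns */
    public function addWith(string $name, string $sql, array $columns = []): self
    {
        // addWith() appends and keeps the caller's order.
        $this->parts[] = ['name' => $name, 'sql' => $sql, 'columns' => $columns];

        return $this;
    }

    /** @return list<string> */
    public function names(): array
    {
        return array_column($this->parts, 'name');
    }
}

$sketch = (new WithPartsSketch())
    ->with('baseCTE', 'SELECT id, title FROM tableName')
    ->addWith('secondCTE', 'SELECT id, title FROM baseCTE WHERE id < 1000');
```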
If it is ok for all, I'll make a proposal by the end of the week without the recursive part.
A small remark on this: I would remove the
Thanks @nio-dtp, and that is pretty fine. I will add the recursive part afterwards in a follow-up PR.
8c8b185 to 06d6612
src/Query/QueryBuilder.php
Outdated
public function addWith(string $name, string|QueryBuilder $part, array $fields = []): self
{
    if (count($this->withParts) === 0) {
        throw new QueryException('No initial WITH part set, use with() to set one first.');
Why is this an exception? Will anything break if this exception is not thrown?
You're right, it will break nothing. We can accept it and add a with part.
I think we should have only one method with. I don't see the case when we need to reset the with parts.
src/Query/QueryBuilder.php
Outdated
@@ -1266,7 +1323,14 @@ private function getSQLForSelect(): string
throw new QueryException('No SELECT expressions given. Please use select() or addSelect().');
}
- return $this->connection->getDatabasePlatform()
+ $selectSQL = '';
Let's use an array of query parts and implode them at the end. Otherwise, with every concatenation, the previously built string will be copied to the new one.
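A minimal sketch of the parts-array approach requested here, assuming the WITH clause and the SELECT statement have already been rendered as strings. The function name and parameters are hypothetical and not taken from the PR.

```php
<?php

declare(strict_types=1);

/**
 * Joins the already rendered pieces of a query (hypothetical helper).
 *
 * @param string $withSql   rendered WITH clause, or '' when there are no CTEs
 * @param string $selectSql rendered SELECT statement
 */
function assembleSelect(string $withSql, string $selectSql): string
{
    $parts = [];

    if ($withSql !== '') {
        $parts[] = $withSql;
    }

    $parts[] = $selectSql;

    // A single implode() at the end replaces repeated string concatenation.
    return implode(' ', $parts);
}
```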
src/Query/QueryBuilder.php
Outdated
- return $this->connection->getDatabasePlatform()
+ $selectSQL = '';
+ if (count($this->withParts) > 0) {
+     $selectSQL .= $this->connection->getDatabasePlatform()
Could the result of $this->connection->getDatabasePlatform() be assigned to a variable and then reused?
src/Query/WithQuery.php
Outdated
namespace Doctrine\DBAL\Query;
final class WithQuery
What is the purpose of this class? How is accepting a WithQuery different from accepting an array of With?
For now there is no difference, I will change the parameter type of the buildSQL method
src/Query/QueryBuilder.php
Outdated
*
* @return $this
*/
public function with(string $name, string|QueryBuilder $part, array $fields = []): self
Please document the meaning of $name and the usage of $fields in PHPDoc.
I think it should be "columns", not "fields". We're not dealing with abstract data structures with fields. We're dealing with relational tables and columns.
src/SQL/Builder/WithSQLBuilder.php
Outdated
use Doctrine\DBAL\Query\WithQuery;
interface WithSQLBuilder
Does this need to be an interface? Could we make it a class?
Right, for now we do not need another implementation of this interface.
}
/** @param string[] $fields */
private static function fields(array $fields): string
Why does this need to be a separate method and why static?
The purpose was to make the array_map more readable. It is static because it does not need to be aware of the context of the class, and the callback of the array_map is static too.
I can refactor this if this is not acceptable.
$expectedRows = $this->prepareExpectedRows([['id' => 1]]);
$qb = $this->connection->createQueryBuilder();
$cteQueryBuilder1 = $this->connection->createQueryBuilder();
Why $cteQueryBuilder1 if there's no $cteQueryBuilder2?
$cteQueryBuilder1 = $this->connection->createQueryBuilder();
$cteQueryBuilder1->select('id')
    ->from('for_update')
    ->where($qb->expr()->eq('id', $qb->createNamedParameter(1, ParameterType::INTEGER)));
It looks like the user still needs to bind parameters defined in the CTE builder to the top-level one. I don't think this behavior is acceptable.
I will produce another test with binding parameters at the top level.
SQL Server does not support ORDER BY in a CTE, nor using columns to fetch into the declaration.
WITH cte_a(virtual_id) AS (SELECT id AS virtual_id FROM table_a ORDER BY id ASC) SELECT * FROM cte_a
Should we handle the error, or assume that the developer should know that this is not supported?
42cd7b4 to 3a04999
Fixes #5018.
Summary
This pull request introduces support for Common Table Expressions (CTEs) across various database platforms and updates the QueryBuilder to utilize this feature.