Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Three PRs merged to debug the sigabort #1707

Draft
wants to merge 27 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
5cb6a0e
Backup in the middle.
joka921 Jan 7, 2025
e356ee1
Add some parsing and add some thoughts.
joka921 Jan 7, 2025
fc20174
Also implement NOT EXISTS
joka921 Jan 7, 2025
dde296b
Fix a small warning, to feed this to the tool.
joka921 Jan 7, 2025
0d1c788
Some cleanups and fixes.
joka921 Jan 8, 2025
7ff49c9
Fix compilation.
joka921 Jan 8, 2025
7ec8947
Fix the many many segfaults.
joka921 Jan 8, 2025
c03f3e5
Fix another bug.
joka921 Jan 8, 2025
2da52ab
Fix another bug.
joka921 Jan 8, 2025
cbbc771
Fix another bug.
joka921 Jan 8, 2025
91e5802
blub.
joka921 Jan 8, 2025
c3a9a7d
Added some more tests.
joka921 Jan 8, 2025
0adbfa6
Add some tests at least for the parser and query planner.
joka921 Jan 8, 2025
babd294
Some more tests.
joka921 Jan 9, 2025
6766af3
Added some comments.
joka921 Jan 9, 2025
f2524a8
Merge branch 'master' into exists
joka921 Jan 9, 2025
3a574ea
This is commented and very clean.
joka921 Jan 9, 2025
256e38a
In the middle of patching these things.
joka921 Jan 9, 2025
f29efc6
Also account for the filters when counting the subgraphs.
joka921 Jan 9, 2025
740d186
Added some comments.
joka921 Jan 9, 2025
2050af8
A small fix.
joka921 Jan 9, 2025
5294357
Made a pass over `ExistsJoin.h` and `ExistsJoin.cpp`
Jan 10, 2025
982cff7
Clean up, this should work with a reasonable threshold for the query-…
joka921 Jan 10, 2025
eced22b
A first working version of this values thing.There is a lot to do:
joka921 Jan 10, 2025
70319f7
Try out if that fixes the error...
joka921 Jan 10, 2025
da59709
Merge branch 'add-values-to-union' of github.com:joka921/qlever into …
joka921 Jan 10, 2025
c1a6203
Merge branch 'query-planner-better-filters' of github.com:joka921/qle…
joka921 Jan 10, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions src/engine/Bind.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,22 @@
#include "Bind.h"

#include "engine/CallFixedSize.h"
#include "engine/ExistsJoin.h"
#include "engine/QueryExecutionTree.h"
#include "engine/sparqlExpressions/SparqlExpression.h"
#include "engine/sparqlExpressions/SparqlExpressionGenerators.h"
#include "util/ChunkedForLoop.h"
#include "util/Exception.h"

// _____________________________________________________________________________
Bind::Bind(QueryExecutionContext* qec,
std::shared_ptr<QueryExecutionTree> subtree, parsedQuery::Bind b)
: Operation(qec), _subtree(std::move(subtree)), _bind(std::move(b)) {
_subtree = ExistsJoin::addExistsJoinsToSubtree(
_bind._expression, std::move(_subtree), getExecutionContext(),
cancellationHandle_);
}

// BIND adds exactly one new column
size_t Bind::getResultWidth() const { return _subtree->getResultWidth() + 1; }

Expand Down
6 changes: 3 additions & 3 deletions src/engine/Bind.h
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,14 @@
#include "engine/sparqlExpressions/SparqlExpressionPimpl.h"
#include "parser/ParsedQuery.h"

/// BIND operation, currently only supports a very limited subset of expressions
// BIND operation.
class Bind : public Operation {
public:
static constexpr size_t CHUNK_SIZE = 10'000;

// ____________________________________________________________________________
Bind(QueryExecutionContext* qec, std::shared_ptr<QueryExecutionTree> subtree,
parsedQuery::Bind b)
: Operation(qec), _subtree(std::move(subtree)), _bind(std::move(b)) {}
parsedQuery::Bind b);

private:
std::shared_ptr<QueryExecutionTree> _subtree;
Expand Down
2 changes: 1 addition & 1 deletion src/engine/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -14,5 +14,5 @@ add_library(engine
CartesianProductJoin.cpp TextIndexScanForWord.cpp TextIndexScanForEntity.cpp
TextLimit.cpp LazyGroupBy.cpp GroupByHashMapOptimization.cpp SpatialJoin.cpp
CountConnectedSubgraphs.cpp SpatialJoinAlgorithms.cpp PathSearch.cpp ExecuteUpdate.cpp
Describe.cpp)
Describe.cpp ExistsJoin.cpp)
qlever_target_link_libraries(engine util index parser sparqlExpressions http SortPerformanceEstimator Boost::iostreams s2)
160 changes: 148 additions & 12 deletions src/engine/CheckUsePatternTrick.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,22 +82,157 @@
});
}

using ValuesClause = std::optional<parsedQuery::Values>;
// TODO<joka921> How many possible return values do we need here.
bool addValuesClauseToPattern(parsedQuery::GraphPatternOperation& operation,
const ValuesClause& clause);

// __________________________________________________________________________
void addValuesClause(ParsedQuery::GraphPattern& graphPattern,
const ValuesClause& values, bool recurse) {
// TODO<joka921> Do we want to do this, or do we only want this if the values
// clause hasn't been handled downstream.
/*
bool containedInFilter = ql::ranges::any_of(
graphPattern._filters, [&values](const SparqlFilter& filter) {
return ql::ranges::any_of(
values._inlineValues._variables, [&filter](const Variable& var) {
return filter.expression_.isVariableContained(var);
});
});
*/
[[maybe_unused]] const bool containedInFilter = false;
auto check = [&](parsedQuery::GraphPatternOperation& op) {
return addValuesClauseToPattern(op, values);
};
// TODO<joka921> We have to figure out the correct positioning of the values
// clause, s.t. we don't get cartesian products because of optimization
// barriers like bind/Optional/Minus etc.
std::optional<size_t> insertPosition;
if (values.has_value()) {
for (const auto& [i, pattern] :
::ranges::views::enumerate(graphPattern._graphPatterns)) {
if (check(pattern)) {
insertPosition = i;
}
}
}

if (!recurse) {
return;
}
if (insertPosition.has_value()) {
graphPattern._graphPatterns.insert(
graphPattern._graphPatterns.begin() + insertPosition.value(),
values.value());
}

Check warning on line 128 in src/engine/CheckUsePatternTrick.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/CheckUsePatternTrick.cpp#L125-L128

Added lines #L125 - L128 were not covered by tests

std::vector<ValuesClause> foundClauses;
for (const auto& pattern : graphPattern._graphPatterns) {
if (auto* foundValues = std::get_if<parsedQuery::Values>(&pattern)) {
foundClauses.push_back(*foundValues);
}
}
for (const auto& foundValue : foundClauses) {
addValuesClause(graphPattern, foundValue, false);
}

if (foundClauses.empty()) {
for (auto& pattern : graphPattern._graphPatterns) {
addValuesClauseToPattern(pattern, std::nullopt);
}
}
}

// __________________________________________________________________________
bool addValuesClauseToPattern(parsedQuery::GraphPatternOperation& operation,
const ValuesClause& result) {
auto check = [&](parsedQuery::GraphPattern& pattern) {
addValuesClause(pattern, result);
return false;
};
const std::vector<Variable> emptyVars{};
const auto& variables =
result.has_value() ? result.value()._inlineValues._variables : emptyVars;
auto anyVar = [&](auto f) { return ql::ranges::any_of(variables, f); };
return operation.visit([&](auto&& arg) -> bool {
using T = std::decay_t<decltype(arg)>;
if constexpr (std::is_same_v<T, p::Optional> ||
std::is_same_v<T, p::GroupGraphPattern> ||
std::is_same_v<T, p::Minus>) {
return check(arg._child);
} else if constexpr (std::is_same_v<T, p::Union>) {
check(arg._child1);
check(arg._child2);
return false;
} else if constexpr (std::is_same_v<T, p::Subquery>) {
// Subqueries always are SELECT clauses.
const auto& selectClause = arg.get().selectClause();

if (anyVar([&selectClause](const auto& var) {
return ad_utility::contains(selectClause.getSelectedVariables(),
var);
})) {
return check(arg.get()._rootGraphPattern);

Check warning on line 176 in src/engine/CheckUsePatternTrick.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/CheckUsePatternTrick.cpp#L173-L176

Added lines #L173 - L176 were not covered by tests
} else {
// Also recurse into the subquery, but not with the given `VALUES`
// clause.
addValuesClause(arg.get()._rootGraphPattern, std::nullopt);
return false;
}
} else if constexpr (std::is_same_v<T, p::Bind>) {
return ql::ranges::any_of(variables, [&](const auto& variable) {
return ad_utility::contains(arg.containedVariables(), variable);
});

Check warning on line 186 in src/engine/CheckUsePatternTrick.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/CheckUsePatternTrick.cpp#L185-L186

Added lines #L185 - L186 were not covered by tests
} else if constexpr (std::is_same_v<T, p::BasicGraphPattern>) {
return ad_utility::contains_if(
arg._triples, [&](const SparqlTriple& triple) {
return anyVar([&](const auto& variable) {
return (triple.s_ == variable ||
// Complex property paths are not allowed to contain
// variables in SPARQL, so this check is sufficient.
// TODO<joka921> Still make the interface of the
// `PropertyPath` class typesafe.
triple.p_.asString() == variable.name() ||
triple.o_ == variable);
});

Check warning on line 198 in src/engine/CheckUsePatternTrick.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/CheckUsePatternTrick.cpp#L198

Added line #L198 was not covered by tests
});
} else if constexpr (std::is_same_v<T, p::Values>) {
return anyVar([&](const auto& variable) {
return ad_utility::contains(arg._inlineValues._variables, variable);
});
} else if constexpr (std::is_same_v<T, p::Service>) {
return anyVar([&](const auto& variable) {
return ad_utility::contains(arg.visibleVariables_, variable);
});

Check warning on line 207 in src/engine/CheckUsePatternTrick.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/CheckUsePatternTrick.cpp#L206-L207

Added lines #L206 - L207 were not covered by tests
} else {
static_assert(
std::is_same_v<T, p::TransPath> || std::is_same_v<T, p::PathQuery> ||
std::is_same_v<T, p::Describe> || std::is_same_v<T, p::SpatialQuery>);
// TODO<joka921> This is just an optimization, so we can always just omit
// it, but it would be nice to also apply this optimization for those
// types of queries.
return false;
}
});
}

// Internal helper function.
// Modify the `triples` s.t. the patterns for `subAndPred.subject_` will appear
// in a column with the variable `subAndPred.predicate_` when evaluating and
// joining all the triples. This can be either done by retrieving one of the
// additional columns where the patterns are stored in the PSO and POS
// permutation or, if no triple suitable for adding this column exists, by
// adding a triple `?subject ql:has-pattern ?predicate`.
// Modify the `triples` s.t. the patterns for `subAndPred.subject_` will
// appear in a column with the variable `subAndPred.predicate_` when
// evaluating and joining all the triples. This can be either done by
// retrieving one of the additional columns where the patterns are stored in
// the PSO and POS permutation or, if no triple suitable for adding this
// column exists, by adding a triple `?subject ql:has-pattern ?predicate`.
static void rewriteTriplesForPatternTrick(const PatternTrickTuple& subAndPred,
std::vector<SparqlTriple>& triples) {
// The following lambda tries to find a triple in the `triples` that has the
// subject variable of the pattern trick in its `triplePosition` (which is
// either the subject or the object) and a fixed predicate (no variable). If
// such a triple is found, it is modified s.t. it also scans the
// `additionalScanColumn` which has to be the index of the column where the
// patterns of the `triplePosition` are stored in the POS and PSO permutation.
// Return true iff such a triple was found and replaced.
// patterns of the `triplePosition` are stored in the POS and PSO
// permutation. Return true iff such a triple was found and replaced.
auto findAndRewriteMatchingTriple = [&subAndPred, &triples](
auto triplePosition,
size_t additionalScanColumn) {
Expand Down Expand Up @@ -133,8 +268,9 @@
// Check if any of the triples in the `graphPattern` has the form `?s
// ql:has-predicate ?p` or `?s ?p ?o` and that the other conditions for the
// pattern trick are fulfilled (nameley that the variables `?p` and if present
// `?o` don't appear elsewhere in the `parsedQuery`. If such a triple is found,
// the query is modified such that it behaves as if the triple was replace by
// `?o` don't appear elsewhere in the `parsedQuery`. If such a triple is
// found, the query is modified such that it behaves as if the triple was
// replace by
// `?s ql:has-pattern ?p`. See the documentation of
// `rewriteTriplesForPatternTrick` above.
static std::optional<PatternTrickTuple> findPatternTrickTuple(
Expand Down Expand Up @@ -183,8 +319,8 @@
}

// We currently accept the pattern trick triple anywhere in the query.
// TODO<joka921> This loop can be made much easier using ranges and view once
// they are supported by clang.
// TODO<joka921> This loop can be made much easier using ranges and view
// once they are supported by clang.
for (auto& pattern : parsedQuery->children()) {
auto* curPattern = std::get_if<p::BasicGraphPattern>(&pattern);
if (!curPattern) {
Expand Down
6 changes: 6 additions & 0 deletions src/engine/CheckUsePatternTrick.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,4 +57,10 @@ bool isVariableContainedInGraphPatternOperation(
const parsedQuery::GraphPatternOperation& operation,
const SparqlTriple* tripleToIgnore);

// __________________________________________________________________________
void addValuesClause(
ParsedQuery::GraphPattern& graphPattern,
const std::optional<parsedQuery::Values>& values = std::nullopt,
bool recurse = true);

} // namespace checkUsePatternTrick
Loading
Loading