forked from ClickHouse/ClickHouse
[CH-178] Support local cache for remote HDFS files #203
Draft: lgbo-ustc wants to merge 656 commits into Kyligence:clickhouse_backend from bigo-sg:local_cache
Backport ClickHouse#32270 to 21.9: Fix possible Pipeline stuck in case of StrictResize processor.
fix setcap in docker (cherry picked from commit 42787cf)
Backport ClickHouse#31656 to 21.9: Fix use quota bug
Backport ClickHouse#32117 to 21.9: Dictionaries custom query condition fix
Backport ClickHouse#32359 to 21.9: Fix usage of non-materialized skip indexes
…ame fixedstring
Backport ClickHouse#31859 to 21.9: keeper session timeout doesn't work
Backport ClickHouse#27822 to 21.9: Fix data race in ProtobufSchemas
Backport ClickHouse#32201 to 21.9: Try fix 'Directory tmp_merge_<part_name>' already exists
Backport ClickHouse#32755 to 21.9: fix crash fuzzbits with multiply same fixedstring
* support sort op * fixed null order * fixed null ordering
* add JNIEXPORT and JNICALL * add a concurrentMap implementation * add reserve no exception * Revert "add JNIEXPORT and JNICALL" This reverts commit 24f3f71. * add reserve no exception * change reserve function
* support count(*) support count(*)/count(1) * fixed code style * update variables' names
…ouse#181) Support non-HA mode for ClickHouse reading from HDFS. Close ClickHouse#180 .
…ncat/instr/char_length/replace/abs/chr/ceil/floor/exp/power (ClickHouse#172) * add functions concat/char_length/instr * drop functions related with clickhouse/clickhouse repo * add function abs/chr/ceil/floor/exp/power/pmod * adject function order * swap args of function replace
…se#163) * support calculate backing length of different types * remove comment * rename symbols * apply BackingDataLengthCalculator * support decimal from ch column to spark row * fix decimal issue in ch column to spark row * refactor SparkRowInfo * fix building error * wip * implement demo * dev map * finish map and tuple * fix building error * finish writer dev * fix code style * ready to improve spark row to ch column * wip * finish array/map/tuple reader * fix building error * add some uts * finish debug * commit again * finish plan convert * add benchmark * improve performance * try to optimize spark row to ch column * continue * optimize SparkRowInfo::SparkRowInfo * wrap functions * improve performance * improve from 360ms to 240 ms * finish optimizeing performance * add benchmark for BM_SparkRowTOCHColumn_Lineitem * refactor spark row reader * finish tests * revert cmake * fix code style * fix code style * fix memory leak * fix build error * fix building error in debug mode * add test data file * add build type, convert ch type to substrait type * refactor jni interface: native column type * fixbug of decimal * replace decimal.parquet * add data array.parquet * add test data map.parquet * add test data file * finish debug * wip * fix logging * fix address problem * fix core dump * fix code style * throw exception when complex types in substrait plan is in nullable * make ch complex type nullable * support nullable complex types * add tests for parquet nullable * add uts for all types * debug gtest_parquet_read * fix issue: Kyligence#166 * remove stdout log * fix bug of binary null * remove logs * remove useless files
* support more math functions * rename some functions * add debug logging * revert log level * support function greatest and least * support cast binary * support quarter
* add prewhere support * ignore delta directory * fix prewhere parse error when has in funciton * fix is_not_null result type error
* [CH-190] enable tests in GlutenDataFrameAggregateSuite * [CH-190] fix review comments
* close Kyligence#197 * fix gtest build error
lgbo-ustc force-pushed the local_cache branch from 0c8cf11 to 2aa8049 on November 18, 2022 09:28
Can one of the admins verify this patch?
lgbo-ustc force-pushed the local_cache branch 3 times, most recently from fa89bd0 to 920d901 on November 18, 2022 10:20
lgbo-ustc force-pushed the local_cache branch from cc00a81 to 87af499 on November 22, 2022 03:54
lgbo-ustc force-pushed the local_cache branch from 87af499 to 70abaff on November 22, 2022 04:00
liuneng1994 force-pushed the clickhouse_backend branch from 6528ff0 to 52be833 on April 25, 2023 06:43
lwz9103 force-pushed the clickhouse_backend branch 2 times, most recently from dc60d55 to 8066113 on May 26, 2023 03:43
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add a local-disk cache for remote HDFS files, so that later queries can read the cached data directly instead of fetching it again.
The following configurations need to be set:
spark.gluten.sql.columnar.backend.ch.runtime_conf.runtime_settings.use_local_cache_for_remote_storage
enables the cache; default is false
spark.gluten.sql.columnar.backend.ch.runtime_conf.local_cache_for_remote_fs.root_dir
root directory of the cache on the local disk; default is "local_cache_root"
spark.gluten.sql.columnar.backend.ch.runtime_conf.local_cache_for_remote_fs.limit_size
size limit of the cache; default is 10G
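
As a rough illustration of how these three settings might be passed to a Spark job, the sketch below builds them into `--conf key=value` arguments for `spark-submit`. The helper names `local_cache_conf` and `to_spark_submit_args` are hypothetical and not part of this PR. The values are the defaults quoted above, except that the cache is switched on. The exact value format accepted for `limit_size` (a size suffix such as `10G` versus a byte count) is not confirmed by the description, so treat it as an assumption.

```python
import shlex

# Common prefix shared by the three keys listed in the PR description.
PREFIX = "spark.gluten.sql.columnar.backend.ch.runtime_conf."


def local_cache_conf(enabled=True, root_dir="local_cache_root", limit_size="10G"):
    """Return the three local-cache settings as a dict of conf key -> string value.

    Hypothetical helper: the defaults mirror the PR description, and the
    "10G" format for limit_size is an assumption (it may need to be bytes).
    """
    return {
        PREFIX + "runtime_settings.use_local_cache_for_remote_storage":
            "true" if enabled else "false",
        PREFIX + "local_cache_for_remote_fs.root_dir": root_dir,
        PREFIX + "local_cache_for_remote_fs.limit_size": limit_size,
    }


def to_spark_submit_args(conf):
    """Render a conf dict as a flat list of `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments shell-quoted, ready to paste after `spark-submit`.
    print(" ".join(shlex.quote(a) for a in to_spark_submit_args(local_cache_conf())))
```

Keeping the settings in one dict makes it easy to reuse them elsewhere, for example by looping over the items and calling `SparkSession.builder.config(key, value)` in a PySpark application.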