Fix the wrong parsing of JVM parameters in the JDBC URL #11

Open · wants to merge 2 commits into base: master
@@ -26,6 +26,7 @@
 import java.util.*;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
+import org.apache.commons.lang3.StringUtils;
 import org.apache.hive.service.rpc.thrift.TStatus;
 import org.apache.hive.service.rpc.thrift.TStatusCode;
 import org.slf4j.Logger;
@@ -193,12 +194,20 @@ public static JdbcConnectionParams extractURLComponents(String uri, Properties i
       }
     }

+    Pattern confPattern = Pattern.compile("([^;]*)([^;]*);?");
+
     // parse hive conf settings
     String confStr = jdbcURI.getQuery();
     if (confStr != null) {
-      Matcher confMatcher = pattern.matcher(confStr);
+      Matcher confMatcher = confPattern.matcher(confStr);
       while (confMatcher.find()) {
-        connParams.getHiveConfs().put(confMatcher.group(1), confMatcher.group(2));
+        String connParam = confMatcher.group(1);
+        if (StringUtils.isNotBlank(connParam) && connParam.contains("=")) {
+          int symbolIndex = connParam.indexOf('=');
+          connParams
+              .getHiveConfs()
+              .put(connParam.substring(0, symbolIndex), connParam.substring(symbolIndex + 1));
+        }
       }
     }

@@ -477,4 +486,4 @@ public static String getCanonicalHostName(String hostName) {
   public static boolean isKyuubiOperationHint(String hint) {
     return KYUUBI_OPERATION_HINT_PATTERN.matcher(hint).matches();
   }
-}
+}


This code parses the Hive JDBC connection string and fills the extracted pieces into a JdbcConnectionParams object. Some notes:

  • Line 2: imports some packages.
  • Lines 9-10: pulls in the Apache Commons Lang3 library so the StringUtils utility class is available for blank-string checks.
  • Line 12: imports the Thrift-related classes.
  • Line 13: imports the logging classes.
  • Lines 16-18: this method parses the JDBC URL (Uniform Resource Locator) and returns a JdbcConnectionParams instance.
  • Lines 20-24: uses a regular expression to split the URI (Uniform Resource Identifier) into its parts and adds them to a HashMap, building a connection-parameters object that holds all the properties.
  • Lines 26-32: declares and initializes the confPattern regular expression used to parse the Hive conf settings.
  • Lines 34-38: extracts the Hive conf settings from the JDBC URI string.
  • Lines 40-46: replaces the previous pattern regex with the new confPattern regex, and puts the parsed parameters into the connection-parameters map.
  • Lines 48-57: processes the connection-parameter string and adds its key-value pairs to the Hive conf map of the connection-parameters object.
  • Lines 59-75: defines some helper functions and regular expressions.

Based on this code, my review comments:

  • On lines 9-10, StringUtils.isNotBlank and StringUtils.contains are both very practical methods; they can be used to check whether a string is blank or contains a particular character.
  • The initialization of confPattern on line 26 looks off: the second capture group ([^;]*) is redundant, since the first greedy group already consumes the whole segment and group(2) always matches the empty string; ([^;]*);? alone would suffice.
  • Consider renaming variables to be more readable and maintainable.
  • The incoming jdbcURI, info, and logger parameters could trigger a NullPointerException; consider null checks before using them.
  • Other NullPointerExceptions are possible elsewhere in the program; check each HashMap individually and make sure it contains the required key-value mappings.
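
To make the fix concrete, here is a minimal, self-contained sketch of the patched parsing loop. parseHiveConfs is a hypothetical helper written only for illustration; the real logic lives inline in Utils.extractURLComponents:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConfParseSketch {

  // One ";"-delimited segment of the URI query string per match.
  private static final Pattern CONF_PATTERN = Pattern.compile("([^;]*);?");

  // Hypothetical helper mirroring the patched loop: split each segment on
  // the FIRST '=' only, so values such as JVM flags may themselves contain '='.
  static Map<String, String> parseHiveConfs(String confStr) {
    Map<String, String> confs = new LinkedHashMap<>();
    Matcher m = CONF_PATTERN.matcher(confStr);
    while (m.find()) {
      String segment = m.group(1);
      int eq = segment.indexOf('=');
      if (!segment.trim().isEmpty() && eq >= 0) {
        confs.put(segment.substring(0, eq), segment.substring(eq + 1));
      }
    }
    return confs;
  }

  public static void main(String[] args) {
    // The value of k3 contains '=' and would confuse a naive key=value regex.
    System.out.println(parseHiveConfs("k2=v2;k3=-Xmx2g -XX:HeapDumpPath=/heap.hprof"));
    // prints: {k2=v2, k3=-Xmx2g -XX:HeapDumpPath=/heap.hprof}
  }
}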

@@ -21,8 +21,13 @@
 import static org.apache.kyuubi.jdbc.hive.Utils.extractURLComponents;
 import static org.junit.Assert.assertEquals;

+import com.google.common.collect.ImmutableMap;
+import java.io.UnsupportedEncodingException;
+import java.net.URLEncoder;
+import java.nio.charset.StandardCharsets;
 import java.util.Arrays;
 import java.util.Collection;
+import java.util.Map;
 import java.util.Properties;
 import org.junit.Test;
 import org.junit.runner.RunWith;
@@ -35,23 +40,76 @@ public class UtilsTest {
   private String expectedPort;
   private String expectedCatalog;
   private String expectedDb;
+  private Map<String, String> expectedHiveConf;
   private String uri;

   @Parameterized.Parameters
-  public static Collection<String[]> data() {
+  public static Collection<Object[]> data() throws UnsupportedEncodingException {
     return Arrays.asList(
-        new String[][] {
-          {"localhost", "10009", null, "db", "jdbc:hive2:///db;k1=v1?k2=v2#k3=v3"},
-          {"localhost", "10009", null, "default", "jdbc:hive2:///"},
-          {"localhost", "10009", null, "default", "jdbc:kyuubi://"},
-          {"localhost", "10009", null, "default", "jdbc:hive2://"},
-          {"hostname", "10018", null, "db", "jdbc:hive2://hostname:10018/db;k1=v1?k2=v2#k3=v3"},
+        new Object[][] {
+          {
+            "localhost",
+            "10009",
+            null,
+            "db",
+            new ImmutableMap.Builder<String, String>().put("k2", "v2").build(),
+            "jdbc:hive2:///db;k1=v1?k2=v2#k3=v3"
+          },
+          {
+            "localhost",
+            "10009",
+            null,
+            "default",
+            new ImmutableMap.Builder<String, String>().build(),
+            "jdbc:hive2:///"
+          },
+          {
+            "localhost",
+            "10009",
+            null,
+            "default",
+            new ImmutableMap.Builder<String, String>().build(),
+            "jdbc:kyuubi://"
+          },
+          {
+            "localhost",
+            "10009",
+            null,
+            "default",
+            new ImmutableMap.Builder<String, String>().build(),
+            "jdbc:hive2://"
+          },
+          {
+            "hostname",
+            "10018",
+            null,
+            "db",
+            new ImmutableMap.Builder<String, String>().put("k2", "v2").build(),
+            "jdbc:hive2://hostname:10018/db;k1=v1?k2=v2#k3=v3"
+          },
+          {
+            "hostname",
+            "10018",
+            "catalog",
+            "db",
+            new ImmutableMap.Builder<String, String>().put("k2", "v2").build(),
+            "jdbc:hive2://hostname:10018/catalog/db;k1=v1?k2=v2#k3=v3"
+          },
+          {
+            "hostname",
+            "10018",
+            "catalog",
+            "db",
+            new ImmutableMap.Builder<String, String>()
+                .put("k2", "v2")
+                .put("k3", "-Xmx2g -XX:+PrintGCDetails -XX:HeapDumpPath=/heap.hprof")
+                .build(),
+            "jdbc:hive2://hostname:10018/catalog/db;k1=v1?"
+                + URLEncoder.encode(
+                        "k2=v2;k3=-Xmx2g -XX:+PrintGCDetails -XX:HeapDumpPath=/heap.hprof",
+                        StandardCharsets.UTF_8.toString())
+                    .replaceAll("\\+", "%20")
+                + "#k4=v4"
+          }
         });
   }
@@ -61,11 +119,13 @@ public UtilsTest(
       String expectedHost,
       String expectedPort,
       String expectedCatalog,
       String expectedDb,
+      Map<String, String> expectedHiveConf,
       String uri) {
     this.expectedHost = expectedHost;
     this.expectedPort = expectedPort;
     this.expectedCatalog = expectedCatalog;
     this.expectedDb = expectedDb;
+    this.expectedHiveConf = expectedHiveConf;
     this.uri = uri;
   }

@@ -76,5 +136,6 @@ public void testExtractURLComponents() throws JdbcUriParseException {
     assertEquals(Integer.parseInt(expectedPort), jdbcConnectionParams1.getPort());
     assertEquals(expectedCatalog, jdbcConnectionParams1.getCatalogName());
     assertEquals(expectedDb, jdbcConnectionParams1.getDbName());
+    assertEquals(expectedHiveConf, jdbcConnectionParams1.getHiveConfs());
   }
-}
+}
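
As a usage sketch, this is roughly how a client could percent-encode JVM options into the query part of the URL before handing it to the driver, mirroring what the last test case does. buildUrl is an illustrative helper, not part of the driver API:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlBuildSketch {

  // Illustrative helper: percent-encode the conf segment so JVM options
  // survive URI query parsing. URLEncoder emits '+' for spaces, but the URI
  // query decoder expects %20, hence the replaceAll.
  static String buildUrl(String base, String confs) throws UnsupportedEncodingException {
    String encoded =
        URLEncoder.encode(confs, StandardCharsets.UTF_8.toString()).replaceAll("\\+", "%20");
    return base + "?" + encoded;
  }

  public static void main(String[] args) throws UnsupportedEncodingException {
    System.out.println(
        buildUrl(
            "jdbc:hive2://hostname:10018/catalog/db;k1=v1",
            "k2=v2;k3=-Xmx2g -XX:+PrintGCDetails -XX:HeapDumpPath=/heap.hprof"));
  }
}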
@@ -97,6 +97,12 @@ SCOPE_TABLE: 'SCOPE_TABLE';
 SOURCE_DATA_TYPE: 'SOURCE_DATA_TYPE';
 IS_AUTOINCREMENT: 'IS_AUTOINCREMENT';
 IS_GENERATEDCOLUMN: 'IS_GENERATEDCOLUMN';
+VARCHAR: 'VARCHAR';
+SMALLINT: 'SMALLINT';
+CAST: 'CAST';
+AS: 'AS';
+KEY_SEQ: 'KEY_SEQ';
+PK_NAME: 'PK_NAME';

 fragment SEARCH_STRING_ESCAPE: '\'' '\\' '\'';


This patch adds a handful of lexer tokens plus references a character-escape fragment.

The added tokens:

  • VARCHAR: string type
  • SMALLINT: short integer type
  • CAST: explicit type conversion
  • AS: alias assignment
  • KEY_SEQ: primary-key sequence number
  • PK_NAME: primary-key name

These tokens correspond to attributes of the database metadata model and may be used during database operations. The grammar also defines a character-escape fragment, SEARCH_STRING_ESCAPE, for escaping single quotes and backslashes.

Overall, this patch shows no obvious risk, and I have no improvement suggestions.

@@ -47,6 +47,13 @@ statement
     SOURCE_DATA_TYPE COMMA IS_AUTOINCREMENT COMMA IS_GENERATEDCOLUMN FROM SYSTEM_JDBC_COLUMNS
     (WHERE tableCatalogFilter? AND? tableSchemaFilter? AND? tableNameFilter? AND? colNameFilter?)?
     ORDER BY TABLE_CAT COMMA TABLE_SCHEM COMMA TABLE_NAME COMMA ORDINAL_POSITION #getColumns
+    | SELECT CAST LEFT_PAREN NULL AS VARCHAR RIGHT_PAREN TABLE_CAT COMMA
+      CAST LEFT_PAREN NULL AS VARCHAR RIGHT_PAREN TABLE_SCHEM COMMA
+      CAST LEFT_PAREN NULL AS VARCHAR RIGHT_PAREN TABLE_NAME COMMA
+      CAST LEFT_PAREN NULL AS VARCHAR RIGHT_PAREN COLUMN_NAME COMMA
+      CAST LEFT_PAREN NULL AS SMALLINT RIGHT_PAREN KEY_SEQ COMMA
+      CAST LEFT_PAREN NULL AS VARCHAR RIGHT_PAREN PK_NAME
+      WHERE FALSE #getPrimaryKeys
     | .*? #passThrough
     ;


This patch touches the SQL grammar. The main points:

  1. The "getColumns" rule carries filter conditions (tableCatalogFilter, tableSchemaFilter, tableNameFilter, and colNameFilter).
  2. A new "getPrimaryKeys" alternative is added; as written it never returns any rows, because its WHERE clause is always FALSE.
  3. A wildcard alternative passes everything else through.

If the query statements themselves are logically sound, this patch should not introduce much risk. That said, we cannot confirm whether the new "getPrimaryKeys" rule hides a latent bug or a missing piece of the implementation; it needs further review or completion.
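
For context, the getPrimaryKeys statement above appears to be the query that the Trino JDBC driver emits for DatabaseMetaData.getPrimaryKeys. A hedged client-side sketch, assuming the Kyuubi Trino frontend listens on localhost:10999 (hypothetical address) and the Trino JDBC driver is on the classpath; catalog, user, and table name are made up for illustration:

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class PrimaryKeysSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical Kyuubi Trino frontend endpoint.
    try (Connection conn =
        DriverManager.getConnection("jdbc:trino://localhost:10999/catalog", "test", null)) {
      DatabaseMetaData meta = conn.getMetaData();
      // The driver rewrites this call into the CAST(NULL AS ...) ... WHERE FALSE
      // statement that the new #getPrimaryKeys grammar alternative matches.
      try (ResultSet rs = meta.getPrimaryKeys(null, null, "some_table")) {
        System.out.println("has rows: " + rs.next()); // expected: false
      }
    }
  }
}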

@@ -25,7 +25,7 @@ import org.apache.kyuubi.operation.OperationHandle
 import org.apache.kyuubi.service.BackendService
 import org.apache.kyuubi.sql.parser.trino.KyuubiTrinoFeParser
 import org.apache.kyuubi.sql.plan.PassThroughNode
-import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}
+import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetPrimaryKeys, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}

 class KyuubiTrinoOperationTranslator(backendService: BackendService) {
   lazy val parser = new KyuubiTrinoFeParser()
@@ -68,6 +68,11 @@ class KyuubiTrinoOperationTranslator(backendService: BackendService) {
           schemaPattern,
           tableNamePattern,
           colNamePattern)
+      case GetPrimaryKeys() =>
+        val operationHandle = backendService.getPrimaryKeys(sessionHandle, null, null, null)
+        // The trino implementation always returns empty.
+        operationHandle.setHasResultSet(false)
+        operationHandle
       case PassThroughNode() =>
         backendService.executeStatement(sessionHandle, statement, configs, runAsync, queryTimeout)
     }


After the import change, this class extends the set of plan nodes from org.apache.kyuubi.sql.plan.trino — GetCatalogs, GetColumns, GetSchemas, GetTables, GetTableTypes, GetTypeInfo, and now GetPrimaryKeys — through which it translates Trino SQL into Kyuubi operations.

In translatePlan, each SQL type builds its corresponding operationHandle. Notably, for the new GetPrimaryKeys operation the Trino implementation always returns an empty result, so for consistency hasResultSet is set to false here as well.

On risks and improvements: without more context I cannot judge whether risks exist or offer suggestions.

@@ -25,7 +25,7 @@ import org.apache.kyuubi.sql.KyuubiTrinoFeBaseParser._
 import org.apache.kyuubi.sql.KyuubiTrinoFeBaseParserBaseVisitor
 import org.apache.kyuubi.sql.parser.KyuubiParser.unescapeSQLString
 import org.apache.kyuubi.sql.plan.{KyuubiTreeNode, PassThroughNode}
-import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}
+import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetPrimaryKeys, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}

 class KyuubiTrinoFeAstBuilder extends KyuubiTrinoFeBaseParserBaseVisitor[AnyRef] {

@@ -92,6 +92,10 @@ class KyuubiTrinoFeAstBuilder extends KyuubiTrinoFeBaseParserBaseVisitor[AnyRef]
     GetColumns(catalog, schemaPattern, tableNamePattern, colNamePattern)
   }

+  override def visitGetPrimaryKeys(ctx: GetPrimaryKeysContext): KyuubiTreeNode = {
+    GetPrimaryKeys()
+  }
+
   override def visitNullCatalog(ctx: NullCatalogContext): AnyRef = {
     null
   }


This change carries no obvious risk; my limited familiarity with the surrounding code keeps me from offering complete improvement suggestions.

In this segment the developer adds primary-key retrieval to the Trino query-engine frontend (via the GetPrimaryKeys node); the remaining changes are an import update and a visitor-method override.
@@ -55,3 +55,7 @@ case class GetColumns(
     colNamePattern: String) extends KyuubiTreeNode {
   override def name(): String = "Get Columns"
 }
+
+case class GetPrimaryKeys() extends KyuubiTreeNode {
+  override def name(): String = "Get Primary Keys"
+}


This patch adds a case class named GetPrimaryKeys, which extends KyuubiTreeNode and implements the name() method to return "Get Primary Keys".

From this code alone there is no obvious error or defect. A full assessment of risks and improvement suggestions would require a deeper look at the language, framework, third-party libraries, and overall system design involved.

@@ -20,7 +20,7 @@ package org.apache.kyuubi.parser.trino
 import org.apache.kyuubi.KyuubiFunSuite
 import org.apache.kyuubi.sql.parser.trino.KyuubiTrinoFeParser
 import org.apache.kyuubi.sql.plan.{KyuubiTreeNode, PassThroughNode}
-import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}
+import org.apache.kyuubi.sql.plan.trino.{GetCatalogs, GetColumns, GetPrimaryKeys, GetSchemas, GetTables, GetTableTypes, GetTypeInfo}

 class KyuubiTrinoFeParserSuite extends KyuubiFunSuite {
   val parser = new KyuubiTrinoFeParser()
@@ -354,4 +354,19 @@ class KyuubiTrinoFeParserSuite extends KyuubiFunSuite {
tableName = "%aa",
colName = "%bb")
}

test("Support GetPrimaryKeys for Trino Fe") {
val kyuubiTreeNode = parse(
"""
| SELECT CAST(NULL AS varchar) TABLE_CAT,
| CAST(NULL AS varchar) TABLE_SCHEM,
| CAST(NULL AS varchar) TABLE_NAME,
| CAST(NULL AS varchar) COLUMN_NAME,
| CAST(NULL AS smallint) KEY_SEQ,
| CAST(NULL AS varchar) PK_NAME
| WHERE false
|""".stripMargin)

assert(kyuubiTreeNode.isInstanceOf[GetPrimaryKeys])
}
}