After retrieving JSON from ES Kafka Mongodb Restful…
JSON is a convenient way to carry rich structured data in a common text format. Many modern technologies use JSON as their data transmission format, such as Elasticsearch, RESTful services, and Kafka. MongoDB, which is more concerned with performance, uses a binary form of JSON (BSON).
Structured data usually arrives in bulk and often needs further computation.
However, JSON class libraries are not very convenient for calculations. JSONPath is fine for parsing JSON, but it has little computing power. Simple filtering and aggregation are possible, but slightly more complex operations such as grouping and summarization are out of reach. In practice, you have to hard-code them yourself.
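To make the hard-coding point concrete, here is a minimal Java sketch of grouping orders by year and summing their amounts by hand. It assumes the JSON has already been parsed into a list of maps by some JSON library; the class name `GroupByHand`, the field names, and the sample values are hypothetical and chosen only for illustration.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class GroupByHand {
    // Hypothetical stand-in for records already parsed from JSON text.
    static List<Map<String, Object>> sampleOrders() {
        return List.of(
            Map.of("Client", "Acme", "OrderDate", "2021-03-01", "Amount", 1200.0),
            Map.of("Client", "Bolt", "OrderDate", "2021-07-15", "Amount", 800.0),
            Map.of("Client", "Acme", "OrderDate", "2022-01-20", "Amount", 2500.0));
    }

    // Group by the year of OrderDate and sum Amount within each group.
    static Map<Integer, Double> sumByYear(List<Map<String, Object>> orders) {
        Map<Integer, Double> totals = new TreeMap<>();
        for (Map<String, Object> order : orders) {
            int year = LocalDate.parse((String) order.get("OrderDate")).getYear();
            double amount = ((Number) order.get("Amount")).doubleValue();
            totals.merge(year, amount, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(sumByYear(sampleOrders()));
    }
}
```

Even this small case needs explicit casts, date parsing, and an accumulator map, and every new grouping key or aggregate means more code of the same kind.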
What about writing the data into a database and calculating there? That is too heavy. JSON often holds multiple layers of structured data, so loading it into a relational database means creating several associated tables, which makes the loading cost much higher than the calculation itself.
esProc SPL will help you.
esProc SPL is an open-source computing engine written purely in Java. The source code is at https://github.com/SPLWare/esProc.
esProc SPL wraps a JSON library and can parse JSON text into a computable SPL table sequence (SPL's in-memory structured data object) in just one line:
| | A |
|---|---|
| 1 | =file("d:\xml\emp_orders.json").read() |
| 2 | =json(A1) |
An SPL table sequence can be multi-layered, meaning that a field value can itself be another table sequence, which matches the structure of JSON naturally.
Once the data is converted into an SPL table sequence, calculation is where esProc is strong. Filtering, grouping, and joining are never a problem, and most calculation goals can be achieved in one line:
Filter: T.select(Amount>1000 && Amount<=3000 && like(Client,"*s*"))
Sort: T.sort(Client,-Amount)
Distinct: T.id(Client)
Group: T.groups(year(OrderDate);sum(Amount))
Join: join(T1:O,SellerId; T2:E,EId)
TopN: T.top(-3;Amount)
TopN in group: T.groups(Client;top(3,Amount))
There is much more of this than we can cover here. Interested readers can refer to the materials on the esProc SPL official website.
esProc SPL has wrapped many common access interfaces for JSON data sources.
RESTful: plain-text JSON, and the result can be turned back into JSON text after calculation
| | A |
|---|---|
| 1 | =httpfile("http://127.0.0.1:6868/restful/emp_orders").read() |
| 2 | =json(A1) |
| 3 | =A2.conj(Orders).select(Amount>1000 && Amount<=2000 && like@c(Client,"*business*")) |
| 4 | =json(A3) |
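For readers who think in Java, the conj-then-select step in A3 roughly corresponds to a `flatMap` over the nested orders followed by a `filter`. The sketch below is only an analogy, not how esProc works internally. It assumes each employee record carries an `Orders` list and that `like@c` does case-insensitive wildcard matching; the class name, helper methods, and sample data are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

final class FlattenAndFilter {
    // Hypothetical helpers that build records shaped like parsed JSON.
    static Map<String, Object> order(String client, double amount) {
        return Map.of("Client", client, "Amount", amount);
    }

    static Map<String, Object> employee(String name, List<Map<String, Object>> orders) {
        return Map.of("Name", name, "Orders", orders);
    }

    static List<Map<String, Object>> sampleEmployees() {
        return List.of(
            employee("Ann", List.of(order("Big Business Ltd", 1500.0), order("Small Shop", 1800.0))),
            employee("Bob", List.of(order("business one", 2500.0), order("BUSINESS two", 1100.0))));
    }

    // Flatten every employee's Orders into one list, then keep the matching orders.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> select(List<Map<String, Object>> employees) {
        return employees.stream()
            .flatMap(e -> ((List<Map<String, Object>>) e.get("Orders")).stream())
            .filter(o -> {
                double amount = ((Number) o.get("Amount")).doubleValue();
                String client = ((String) o.get("Client")).toLowerCase(Locale.ROOT);
                return amount > 1000 && amount <= 2000 && client.contains("business");
            })
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        select(sampleEmployees()).forEach(System.out::println);
    }
}
```

With the sample data, only the orders for "Big Business Ltd" and "BUSINESS two" pass all three conditions.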
Elasticsearch: JSON constants can be written directly in SPL code and take part in transmission and calculation
| | A |
|---|---|
| 1 | >apikey="Authorization:ApiKey a2x6aEF……KZ29rT2hoQQ==" |
| 2 | '{ "counter" : 1, "tags" : ["red"] ,"beginTime":"2022-01-03" ,"endTime":"2022-02-15" } |
| 3 | =es_rest("https://localhost:9200/index1/_doc/1", "PUT",A2;"Content-Type: application/x-ndjson",apikey) |
| 4 | =json(A3.Content) |
MongoDB: binary JSON works as well
| | A |
|---|---|
| 1 | =mongo_open("mongodb://127.0.0.1:27017/mymongo") |
| 2 | =mongo_shell(A1,"{'find':'orders',filter:{OrderID: {$gte: 50}},batchSize:100}") |
| 3 | =A2.cursor.firstBatch.select(Amount>1000 && Amount<=2000 && like@c(Client,"*business*")) |
| 4 | =mongo_close(A1) |
Kafka: SPL also wraps the interfaces for writing back to these data sources, closing the I/O loop
| | A |
|---|---|
| 1 | =kafka_open("/kafka/my.properties", "topic1") |
| 2 | =kafka_poll(A1) |
| 3 | =A2.derive(json(value):v).new(key, v.fruit, v.weight) |
| 4 | =kafka_send(A1, "A100", json(A3)) |
| 5 | =kafka_close(A1) |
For MongoDB, Kafka, and other data sources that may return large amounts of data, esProc SPL also provides cursor objects and methods that read batch by batch and process the data while reading. We won't give detailed examples here. Interested readers can find more information on the official website.
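The idea of reading batch by batch and processing while reading can be sketched in plain Java. This is a generic illustration of the pattern, not the esProc cursor API. The class `BatchSum`, its method names, and the batch size are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.IntStream;

final class BatchSum {
    // Pull up to batchSize items from the source; an empty list means the source is exhausted.
    static <T> List<T> nextBatch(Iterator<T> source, int batchSize) {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }

    // Sum all amounts while holding at most batchSize items in memory at a time.
    static double sumInBatches(Iterator<Double> amounts, int batchSize) {
        double total = 0;
        List<Double> batch;
        while (!(batch = nextBatch(amounts, batchSize)).isEmpty()) {
            for (double amount : batch) {
                total += amount;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a large data source that should not be loaded all at once.
        Iterator<Double> source = IntStream.rangeClosed(1, 10).mapToObj(i -> (double) i).iterator();
        System.out.println(sumInBatches(source, 3));
    }
}
```

The running total is the only state kept between batches, so memory use stays bounded by the batch size regardless of how much data the source returns.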
JSON data usually does not exist in isolation. It also has to exchange data with other data sources and take part in mixed calculations. esProc SPL was not invented only to deal with JSON; it is a professional computing engine that supports a wide range of data sources.
SPL can read all of these data sources as table sequences or cursors, which makes mixed computation and data exchange easy.
Then, how can the code written in esProc SPL be integrated into the application?
It is simple: esProc provides a standard JDBC driver, so Java programs can call SPL code the same way they execute SQL against a database.
Class.forName("com.esproc.jdbc.InternalDriver");
Connection conn =DriverManager.getConnection("jdbc:esproc:local://");
Statement statement = conn.createStatement();
ResultSet result = statement.executeQuery("=json(file(\"Orders.json\").read()).select(Amount>1000 && like(Client,\"*s*\"))");
More complex SPL scripts can be saved as files and called like stored procedures:
Class.forName("com.esproc.jdbc.InternalDriver");
Connection conn =DriverManager.getConnection("jdbc:esproc:local://");
CallableStatement statement = conn.prepareCall("call queryOrders()");
statement.execute();
Because it is written purely in Java, esProc SPL integrates seamlessly into Java applications, much like code written by the application programmers themselves, and benefits from mature Java frameworks in the same way. SPL has well-established flow control statements, such as for loops and if branches, and also supports subroutine calls. SPL alone can implement very complex business logic and form a complete business unit, with no need for upper-level Java code to cooperate. The main program simply calls the SPL script.
SPL scripts can be stored as files outside the main application. They can be modified independently and the changes take effect immediately, whereas modified Java code has to be recompiled and the entire application shut down and restarted. This allows hot swapping of business logic, which is especially suitable for frequently changing businesses, the same area where JSON is widely used.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code