docs: Add blog to website and introductory content #647

Draft
wants to merge 1 commit into
base: main
18 changes: 10 additions & 8 deletions site/docs/_config
Original file line number Diff line number Diff line change
@@ -1,14 +1,16 @@
arrange:
nav:
- index.md
- spec
- types
- expressions
- relations
- Introduction: about.md
- News & Articles: blog
- tutorial
- Format:
- spec
- types
- expressions
- relations
- serialization
- extensions
- community
- governance.md
- about.md
- tools
- tutorial
- faq.md
- faq.md
4 changes: 2 additions & 2 deletions site/docs/about.md
@@ -7,11 +7,11 @@ title: About Substrait

## Project Vision

The Substrait project aims to create a well-defined, cross-language [specification](spec/specification) for data compute operations. The specification declares a set of common operations, defines their semantics, and describes their behavior unambiguously. The project also defines extension points and serialized representations of the specification.
The Substrait project aims to create a well-defined, cross-language [specification](/spec/specification) for data compute operations. The specification declares a set of common operations, defines their semantics, and describes their behavior unambiguously. The project also defines extension points and serialized representations of the specification.

In many ways, the goal of this project is similar to that of the Apache Arrow project. Arrow is focused on a standardized memory representation of columnar data. Substrait is focused on what should be done to data.


See the [introductory tutorial](/tutorial/sql_to_substrait/) for a hands-on introduction to Substrait.

## Why not use SQL?

231 changes: 231 additions & 0 deletions site/docs/blog/2024-01-pytextformat.md
@@ -0,0 +1,231 @@
---
title: Substrait Python 0.13 supports textual formats
description: Substrait-Python 0.13 adds support for loading plans from the text and JSON representations
date: 2024-02-20
---

# Substrait Python and plan formats

Until now, the Substrait-Python library could only represent a Substrait plan in memory
and emit it to, or load it from, the protobuf binary representation.

Version 0.13 adds support for loading plans from more human-readable formats:
the Text Format and the JSON Format.
The Text Format makes it easier to load plans written by hand and is an effective way
to debug plans. The JSON Format acts as a bridge between human and machine:
it can be manipulated easily in all major programming languages and
shipped over text-based protocols like HTTP, while remaining fairly readable for a human.

## Using the Text Format

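``` py
import tempfile
from substrait.planloader import planloader

# Write the plan text to a temporary file, then load it by file name.
with tempfile.NamedTemporaryFile(mode="w+t") as tf:
    tf.write("""

""")
    tf.flush()  # make sure the text is on disk before the loader opens the file
    testplan = planloader.load_substrait_plan(tf.name)
```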
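
Because the loader shown above takes a file name rather than a string, plan text held in memory has to pass through a file first. The helper below is only a sketch of that pattern: `load_plan_from_text`, its `loader` parameter, and the `plan.txt` file name are hypothetical names introduced for illustration and are not part of Substrait-Python. Any callable that accepts a path can be passed as `loader`.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_plan_from_text(plan_text: str, loader: Callable[[str], T]) -> T:
    """Write plan_text to a temporary file and pass the file's path to loader.

    `loader` is an assumption: any function that takes a file name and
    returns a loaded plan.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        # The file name is arbitrary and chosen only for this sketch.
        path = Path(tmpdir) / "plan.txt"
        # write_text closes the file, so the content is flushed before loading.
        path.write_text(plan_text, encoding="utf-8")
        return loader(str(path))
```

Writing into a temporary directory and closing the file before loading avoids reading a half-written file, and it also works on platforms where an open temporary file cannot be opened a second time.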

## Using the JSON Format

``` py
# SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5
plan = {
"extensions":[
{
"extensionFunction":{
"functionAnchor":1,
"name":"lte"
}
},
{
"extensionFunction":{
"functionAnchor":2,
"name":"is_not_null"
}
},
{
"extensionFunction":{
"functionAnchor":3,
"name":"and"
}
},
{
"extensionFunction":{
"functionAnchor":4,
"name":"count"
}
}
],
"relations":[
{
"root":{
"input":{
"project":{
"input":{
"aggregate":{
"input":{
"read":{
"baseSchema":{
"names":[
"exercise",
"difficulty_level"
],
"struct":{
"types":[
{
"varchar":{
"length":13,
"nullability":"NULLABILITY_NULLABLE"
}
},
{
"i32":{
"nullability":"NULLABILITY_NULLABLE"
}
}
],
"nullability":"NULLABILITY_REQUIRED"
}
},
"filter":{
"scalarFunction":{
"functionReference":3,
"outputType":{
"bool":{
"nullability":"NULLABILITY_NULLABLE"
}
},
"arguments":[
{
"value":{
"scalarFunction":{
"functionReference":1,
"outputType":{
"i32":{
"nullability":"NULLABILITY_NULLABLE"
}
},
"arguments":[
{
"value":{
"selection":{
"directReference":{
"structField":{
"field":1
}
},
"rootReference":{

}
}
}
},
{
"value":{
"literal":{
"i32":5
}
}
}
]
}
}
},
{
"value":{
"scalarFunction":{
"functionReference":2,
"outputType":{
"i32":{
"nullability":"NULLABILITY_NULLABLE"
}
},
"arguments":[
{
"value":{
"selection":{
"directReference":{
"structField":{
"field":1
}
},
"rootReference":{

}
}
}
}
]
}
}
}
]
}
},
"projection":{
"select":{
"structItems":[
{

}
]
},
"maintainSingularStruct":True
},
"namedTable":{
"names":[
"crossfit"
]
}
}
},
"groupings":[
{

}
],
"measures":[
{
"measure":{
"functionReference":4,
"outputType":{
"i64":{
"nullability":"NULLABILITY_NULLABLE"
}
}
}
}
]
}
},
"expressions":[
{
"selection":{
"directReference":{
"structField":{

}
},
"rootReference":{

}
}
}
]
}
},
"names":[
"exercise"
]
}
}
],
"version":{
"minorNumber":24,
}
}
```
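
Since a JSON plan is ordinary nested data, it can be inspected with nothing more than the standard library. The sketch below uses a trimmed-down stand-in for the plan above (the `plan_json` string and both helper functions are illustrative assumptions, not Substrait-Python APIs) to list the declared extension functions and the tables the plan reads from.

```python
import json

# A shortened, hypothetical stand-in for the full plan shown above.
plan_json = """
{
  "extensions": [
    {"extensionFunction": {"functionAnchor": 1, "name": "lte"}},
    {"extensionFunction": {"functionAnchor": 4, "name": "count"}}
  ],
  "relations": [
    {"root": {"input": {"read": {"namedTable": {"names": ["crossfit"]}}},
              "names": ["exercise"]}}
  ]
}
"""


def function_names(plan: dict) -> dict:
    """Map each functionAnchor to the function name declared for it."""
    return {
        ext["extensionFunction"]["functionAnchor"]: ext["extensionFunction"]["name"]
        for ext in plan.get("extensions", [])
        if "extensionFunction" in ext
    }


def named_tables(node) -> list:
    """Recursively collect the names found under every "namedTable" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "namedTable":
                found.extend(value.get("names", []))
            else:
                found.extend(named_tables(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(named_tables(item))
    return found


plan = json.loads(plan_json)
print(function_names(plan))
print(named_tables(plan))
```

The same two helpers work unchanged on the full plan, because they only rely on the key names (`extensions`, `extensionFunction`, `namedTable`) that appear in it.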
6 changes: 6 additions & 0 deletions site/docs/blog/index.md
@@ -0,0 +1,6 @@
---
exclude_from_blog: true
---
# News & Articles

{{ blog_content }}
1 change: 1 addition & 0 deletions site/docs/tools/_config
@@ -1,4 +1,5 @@
arrange:
- producer_tools.md
- libraries.md
- substrait_validator.md
- third_party_tools.md
27 changes: 27 additions & 0 deletions site/docs/tools/libraries.md
@@ -0,0 +1,27 @@
# Substrait Libraries

## Python

[Substrait-Python](https://github.com/substrait-io/substrait-python) is a Python library to build and manipulate Substrait plans.

## Java

[Substrait-Java](https://github.com/substrait-io/substrait-java) is a Java library to build and manipulate Substrait plans.
It also includes the Isthmus tool, which can convert SQL to Substrait.

## C++

[Substrait-Cpp](https://github.com/substrait-io/substrait-cpp) is a C++ library to build and manipulate Substrait plans.
It is the reference implementation and includes parsers for all official representation formats (text, protobuf, JSON).

## JavaScript

[Substrait-Js](https://github.com/substrait-io/substrait-js) is a JavaScript library to build and manipulate Substrait plans.

## Rust

[Substrait-rs](https://github.com/substrait-io/substrait-rs) is a Rust library to build and manipulate Substrait plans.

## Go

[Substrait-go](https://github.com/substrait-io/substrait-go) is a Go library to build and manipulate Substrait plans.
4 changes: 4 additions & 0 deletions site/mkdocs.yml
@@ -39,6 +39,10 @@ plugins:
- table-reader
- markdownextradata
- search
- blogging:
paging: off
dirs:
- blog
- awesome-pages:
filename: _config
- minify:
1 change: 1 addition & 0 deletions site/requirements.txt
@@ -8,6 +8,7 @@ mkdocs-gen-files>=0.4.0,<1
mkdocs-markdownextradata-plugin>=0.2.5,<1
mkdocs-protobuf>=0.1.0,<1
mkdocs-table-reader-plugin>=2,<3
mkdocs-blogging-plugin>=2,<3
pygments>=2.14,<3
oyaml>=1.0,<2
mdutils>=1.4.0,<2