
[Discussion/Suggestion] Enable the writing of kernel/shader code directly in Python. #394

Open · LouChiSoft opened this issue Sep 3, 2024 · 2 comments


LouChiSoft commented Sep 3, 2024

Hi, first off I should say that I don't know enough about Python to know whether this is actually possible, but I would like to suggest a potential feature. The ability to write kernels/shaders directly in Python and have them compile down to the compute shader string you would normally write would, I think, be a decent improvement.

Maybe a Kernel class that the user can inherit from would provide a more structured approach to declaring things like inputs, by making them the arguments of a process function or member values of the class itself.

Example based on the Getting Started kernel:

from .utils import compile_source # using util function from python/test/utils

def kompute(shader):
    # Definition left out to save space
    ...

if __name__ == "__main__":

    # Define a raw string shader (or use the Kompute tools to compile to SPIRV / C++ header
    # files). This shader shows some of the main components including constants, buffers, etc
    shader = """
        #version 450

        layout (local_size_x = 1) in;

        // The input tensor's bind index is relative to its index in the parameters passed
        layout(set = 0, binding = 0) buffer buf_in_a { float in_a[]; };
        layout(set = 0, binding = 1) buffer buf_in_b { float in_b[]; };
        layout(set = 0, binding = 2) buffer buf_out_a { uint out_a[]; };
        layout(set = 0, binding = 3) buffer buf_out_b { uint out_b[]; };

        // Kompute supports push constants updated on dispatch
        layout(push_constant) uniform PushConstants {
            float val;
        } push_const;

        // Kompute also supports spec constants on initialization
        layout(constant_id = 0) const float const_one = 0;

        void main() {
            uint index = gl_GlobalInvocationID.x;
            out_a[index] += uint( in_a[index] * in_b[index] );
            out_b[index] += uint( const_one * push_const.val );
        }
    """

    kompute(shader)

Would become:

import numpy as np

import kp

from .utils import compile_source

class GettingStartedKernel(KomputeKernel):

    def process(self, in_a, in_b, out_a, out_b, push_const, const_one):
        index: int = get_global_index().x
        out_a[index] += in_a[index] * in_b[index]
        out_b[index] += const_one * push_const.val

if __name__ == "__main__":
    mgr = kp.Manager()

    tensor_in_a = mgr.tensor([2, 2, 2])
    tensor_in_b = mgr.tensor([1, 2, 3])

    tensor_out_a = mgr.tensor_t(np.array([0, 0, 0], dtype=np.uint32))
    tensor_out_b = mgr.tensor_t(np.array([0, 0, 0], dtype=np.uint32))

    push_constants = PushConstants(2)
    spec_constants = 2

    my_kernel = GettingStartedKernel()
    mgr.execute(my_kernel, [3, 1, 1], tensor_in_a, tensor_in_b, tensor_out_a, tensor_out_b, push_constants, spec_constants)

This is by no means meant to be a "correct" solution, just something to express the idea I am trying to describe. It's obviously not a trivial feature to implement, and there are certain things that would need to be addressed first, but I think that working with something that is more than just a string would make writing kernels more productive.
Ideally it would also take away all the hassle of having to manually check and ensure things like binding indices and set indices; a rough sketch of how that could work is below.
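
For example, the binding bookkeeping could be derived from the order of the process() parameters. Here is a minimal, runnable sketch of that idea (KomputeKernel, bindings() and MultiplyKernel are placeholder names for illustration, not existing Kompute API):

import inspect

class KomputeKernel:
    """Hypothetical base class: subclasses implement process(...)."""

    def bindings(self):
        # self.process is a bound method, so "self" is already stripped from
        # its signature; every remaining parameter gets the next binding
        # index in declaration order, all in descriptor set 0.
        params = inspect.signature(self.process).parameters
        return {name: {"set": 0, "binding": i} for i, name in enumerate(params)}

class MultiplyKernel(KomputeKernel):
    def process(self, in_a, in_b, out_a, out_b):
        pass  # body would be translated to GLSL/SPIR-V by the framework

print(MultiplyKernel().bindings())
# {'in_a': {'set': 0, 'binding': 0}, 'in_b': {'set': 0, 'binding': 1},
#  'out_a': {'set': 0, 'binding': 2}, 'out_b': {'set': 0, 'binding': 3}}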

Would love to hear some feedback on the idea/if it's even possible in Python.


axsaucedo (Member) commented Sep 4, 2024

We actually had this in a previous version of Kompute using pyshader. Here's an example from the tests:

# 4. Define the multiplication shader code to run on the GPU
@ps.python2shader
def compute_shader_multiply(index=("input", "GlobalInvocationId", ps.ivec3),
                            data1=("buffer", 0, ps.Array(ps.f32)),
                            data2=("buffer", 1, ps.Array(ps.f32)),
                            data3=("buffer", 2, ps.Array(ps.f32))):
    i = index.x
    data3[i] = data1[i] * data2[i]

# 5. Run shader code against our previously defined tensors
mgr.eval_algo_data_def(
    [tensor_in_a, tensor_in_b, tensor_out],
    compute_shader_multiply.to_spirv())

# 6. Sync tensor data from GPU back to local
mgr.eval_tensor_sync_local_def([tensor_out])

Unfortunately the library is no longer maintained, and there hasn't been anything else out there that provides a similar interface. If an initiative develops this further, it would be great to adopt it once again.

LouChiSoft (Author) commented

Thanks for the link. Shame pyshader is no longer actively maintained. In a perfect world I would be able to write entire pipelines once in Python and AOT-compile them with something like PyPy into an executable with both CPU and GPU pipelines.
