codingwithsurya/README.md

Hey y'all - my name is Surya! I study computer science @ Georgia Tech. My interests lie at the intersection of machine learning and systems, covering both training and inference of foundation models.

Previously, I worked on large-scale distributed systems for feature embedding pipelines that powered ads-related ML models (ranking, retrieval, etc.) at Pinterest. I am currently pursuing research in efficient inference optimizations for mixture-of-experts models and in improving reasoning in LLMs.

If you find any of my projects interesting, feel free to reach out @ [email protected]!

Pinned

  1. PaliGemma-Inference-Pipeline (Public)

    Replication and efficient inference for the PaliGemma model, a state-of-the-art vision-language model

    Python 2

  2. jax-autodiff (Public)

    A JAX-inspired autodiff compiler with optimized function transformations (jit, vmap, grad) and operation fusion.

    Python

  3. diffusion.cu (Public)

    Implementation of a high-performance diffusion transformer from scratch in CUDA/C++

    Jupyter Notebook 3

  4. distributed-transformer (Public)

    The Transformer from the "Attention Is All You Need" paper, implemented from scratch in PyTorch with support for distributed training using both FSDP and DDP.

    Python 3

  5. pytorch-labs/tritonbench (Public)

    Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.

    Python 130 20