minja.hpp - A minimalistic C++ Jinja templating engine for LLM chat templates

This is not an official Google product

Minja is a minimalistic reimplementation of the Jinja templating engine to integrate in/with C++ LLM projects (such as llama.cpp or gemma.cpp).

It is not general purpose: it includes just what’s needed for actual chat templates (very limited set of filters, tests and language features). Users with different needs should look at third-party alternatives such as Jinja2Cpp, Jinja2CppLight, or inja (none of which we endorse).

Warning

TL;DR: use of Minja is at your own risk, and the risks are plenty! See Security & Privacy section below.

Design goals:

Support each and every major LLM found on HuggingFace
- See MODEL_IDS in tests/CMakeLists.txt for the list of models currently supported
Easy to integrate to/with projects such as llama.cpp or gemma.cpp:
- Header-only
- C++17
- Only depend on nlohmann::json (no Boost)
- Keep codebase small (currently 2.5k LoC) and easy to understand
Decent performance compared to Python.

Non-goals:

Address glaring Prompt injection risks in current Jinja chat templating practices. See Security & Privacy below
Additional features from Jinja that aren't used by the template(s) of any major LLM (no feature creep!)
- Please don't submit PRs with such features, they will unfortunately be rejected.
Full Jinja compliance (neither syntax-wise, nor filters / tests / globals)

Usage:

This library is header-only: just copy the header(s) you need, make sure to use a compiler that handles C++11 and you're done. Oh, and get nlohmann::json's json.hpp in your include path.

See API in minja/minja.hpp and minja/chat-template.h (experimental).

For raw Jinja templating (see examples/raw.cpp):

#include <minja.hpp>
#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    auto tmpl = minja::Parser::parse("Hello, {{ location }}!", /* options= */ {});
    auto context = minja::Context::make(minja::Value(json {
        {"location", "World"},
    }));
    auto result = tmpl->render(context);
    std::cout << result << std::endl;
}

To apply a template to a JSON array of messages and tools in the HuggingFace standard (see examples/chat-template.cpp):

#include <chat-template.hpp>
#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    minja::chat_template tmpl(
        "{% for message in messages %}"
        "{{ '<|' + message['role'] + '|>\\n' + message['content'] + '<|end|>' + '\\n' }}"
        "{% endfor %}",
        /* bos_token= */ "<|start|>",
        /* eos_token= */ "<|end|>"
    );
    std::cout << tmpl.apply(
        json::parse(R"([
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hi there"}
        ])"),
        json::parse(R"([
            {"type": "function", "function": {"name": "google_search", "arguments": {"query": "2+2"}}}
        ])"),
        /* add_generation_prompt= */ true,
        /* extra_context= */ {}) << std::endl;
}

(Note that some template quirks are worked around by minja/chat-template.hpp so that all templates can be used the same way)

Supported features

Models have increasingly complex templates (see some examples), so a fair bit of Jinja's language constructs is required to execute their templates properly.

Minja supports the following subset of the Jinja2/3 template syntax:

Full expression syntax
Statements {{% … %}}, variable sections {{ … }}, and comments {# … #} with pre/post space elision {%- … -%} / {{- … -}} / {#- … -#}
if / elif / else / endif
for (recursive) (if) / else / endfor w/ loop.* (including loop.cycle) and destructuring
set w/ namespaces & destructuring
macro / endmacro
filter / endfilter
Extensible filters collection: count, dictsort, equalto, e / escape, items, join, joiner, namespace, raise_exception, range, reject, tojson, trim

Main limitations (non-exhaustive list):

Not supporting most filters. Only the ones actually used in templates of major (or trendy) models are/will be implemented.
No difference between none and undefined
Single namespace with all filters / tests / functions / macros / variables
No tuples (templates seem to rely on lists only)
No if expressions w/o else (but if statements are fine)
No {% raw %}, {% block … %}, {% include … %}, `{% extends … %},

Roadmap / TODOs

Fix known issues w/ CRLF on Windows
Integrate to llama.cpp: ggerganov/llama.cpp#11016 + ggerganov/llama.cpp#9639
Improve fuzzing coverage:
- use thirdparty jinja grammar to guide exploration of inputs (or implement prettification of internal ASTs and use them to generate arbitrary values)
- fuzz each filter / test
Measure / track test coverage
Setup performance tests
Simplify two-pass parsing
- Pass tokens to IfNode and such
Macro nested set scope = global?
Get listed in https://jbmoelker.github.io/jinja-compat-tests/, https://en.cppreference.com/w/cpp/links/libs

Developer corner

Design overview

minja::Parser does two-phased parsing:
- its tokenize() method creates coarse template "tokens" (plain text section, or expression blocks or opening / closing blocks). Tokens may have nested expressions ASTs, parsed with parseExpression()
- its parseTemplate() method iterates on tokens to build the final TemplateNode AST.
minja::Value represents a Python-like value
- It relies on nlohmann/json for primitive values, but does its own JSON dump to be exactly compatible w/ the Jinja / Python implementation of dict string representation
minja::chat_template wraps a template and provides an interface similar to HuggingFace's chat template formatting. It also normalizes the message history to accommodate different expectations from some templates (e.g. message.tool_calls.function.arguments is typically expected to be a JSON string representation of the tool call arguments, but some templates expect the arguments object instead)
Testing involves a myriad of simple syntax tests and full e2e chat template rendering tests. For each model in MODEL_IDS (see tests/CMakeLists.txt), we fetch the chat_template field of the repo's tokenizer_config.json, use the official jinja2 Python library to render them on each of the (relevant) test contexts (in tests/contexts) into a golden file, and run a C++ test that renders w/ Minja and checks we get exactly the same output.

Adding new Templates / Building

Install Prerequisites:
- cmake
- GCC / clang
- python 3.8+ (for tests)
- flake8
- editorconfig-checker
Optional: test additional templates:
- Add their HuggingFace model identifier to MODEL_IDS in tests/CMakeLists.txt (e.g. meta-llama/Llama-3.2-3B-Instruct)
- For gated models you have access to, first authenticate w/ HuggingFace:
```
pip install huggingface_hub
huggingface-cli login
```

Build & run tests (shorthand: ./scripts/run_tests.sh):

rm -fR build && \
    cmake -B build && \
    cmake --build build -j && \
    ctest --test-dir build -j --output-on-failure

Fuzzing tests

Note: fuzztest doesn't work natively on Windows or MacOS.

Show instructions to run it inside a Docker container

Beware of Docker Desktop's licensing: you might want to check out alternatives such as colima (we'll still use the docker client in the example below).

docker run --rm -it -v $PWD:/src:rw $( echo "
    FROM python:3.12-slim-bookworm
    COPY requirements.txt /tmp
    RUN apt update && \
        apt install -y cmake clang ccache git python3 python-is-python3 python3-pip && \
        apt-get clean && \
        rm -rf /var/lib/apt/lists/*
    RUN pip install setuptools pip --upgrade --force-reinstall
    RUN pip install -r /tmp/requirements.txt
    CMD /usr/bin/bash
    WORKDIR /src
" | docker build . -f - -q )

Build in fuzzing mode & run all fuzzing tests (optionally, set a higher TIMEOUT as env var):
```
./scripts/run_fuzzing_mode.sh
```

If your model's template doesn't run fine, please consider the following before opening a bug:
- Is the template using any unsupported filter / test / method / global function, and which one(s)?
- Is the template publicly available? Non-gated models are more likely to become supported.
- Which version of GCC / clang did you compile the tests with? On which OS version?
- If you intend to contribute a fix:
  - Please read CONTRIBUTING first. You'd have to sign a CLA, which your employer may need to accept.
  - Please test as many gated models as possible (use cmake -B build -DMINJA_TEST_GATED_MODELS=1 ... and edit MODEL_LIST appropriately)
For bonus points, check the style of your edits with:
```
flake8
editorconfig-checker
```

Security & Privacy

Data protection

This library doesn't store any data by itself, it doesn't access files or the web, it only transforms a template (string) and context (JSON w/ fields "messages", "tools"...) into a formatted string.

You should still be careful about untrusted third-party chat templates, as these could try and trigger bugs in Minja to exfiltrate user chat data (we only have limited fuzzing tests in place).

Risks are even higher with any user-defined functions.

Do NOT produce HTML or JavaScript with this!

HTML processing with this library is UNSAFE: no escaping of is performed (and the safe filter is a passthrough), leaving users vulnerable to XSS. Minja is not intended to produce HTML.

Beware of Prompt injection risks!

Prompt injection is NOT protected against by this library.

There are many types of prompt injection, some quite exotic (cf. data exfiltration exploits leveraging markdown image previews).

For the simpler cases, it is perfectly possible for a user to craft a message that will look like a system prompt, like an assistant response or like the results of tool calls. While some models might be fine-tuned to ignore system calls not at the very start of the prompt or out of order messages / tool call results, it is expected that most models will be very confused & successfully manipulated by such prompt injections.

Note that injection of tool calls should typically not result in their execution as LLM inference engines should not try to parse the template output (just generated tokens), but this is something to watch out for when auditing such inference engines.

As there isn't any standard mechanism to escape special tokens to prevent those attacks, it is advised users of this library take their own message sanitization measures before applying chat templates. We do not recommend any specific such measure as each model reacts differently (some even understand l33tcode as instructions).

Name		Name	Last commit message	Last commit date
Latest commit History 166 Commits
.github/workflows		.github/workflows
cmake		cmake
examples		examples
include/minja		include/minja
scripts		scripts
tests		tests
.editorconfig		.editorconfig
.flake8		.flake8
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

minja.hpp - A minimalistic C++ Jinja templating engine for LLM chat templates

Design goals:

Non-goals:

Usage:

Supported features

Roadmap / TODOs

Developer corner

Design overview

Adding new Templates / Building

Security & Privacy

Data protection

Do NOT produce HTML or JavaScript with this!

Beware of Prompt injection risks!

About

Releases

Packages

Contributors 2

Languages

License

google/minja

Folders and files

Latest commit

History

Repository files navigation

minja.hpp - A minimalistic C++ Jinja templating engine for LLM chat templates

Design goals:

Non-goals:

Usage:

Supported features

Roadmap / TODOs

Developer corner

Design overview

Adding new Templates / Building

Security & Privacy

Data protection

Do NOT produce HTML or JavaScript with this!

Beware of Prompt injection risks!

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages