Trustworthy-ML-Lab

Popular repositories

  1. Label-free-CBM

    [ICLR 23] A new framework for transforming any neural network into an interpretable concept bottleneck model (CBM) without needing labeled concept data

    Jupyter Notebook, 102 stars, 21 forks

  2. CLIP-dissect

    [ICLR 23 spotlight] An automatic and efficient tool for describing the functionality of individual neurons in deep neural networks (DNNs)

    Jupyter Notebook, 50 stars, 15 forks

  3. VLG-CBM

    [NeurIPS 24] A new training and evaluation framework for learning interpretable deep vision models and benchmarking different interpretable concept bottleneck models (CBMs)

    Jupyter Notebook, 17 stars, 2 forks

  4. CB-LLMs

    [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliability, transparency, and trustworthiness.

    Python, 13 stars, 3 forks

  5. Linear-Explanations

    [ICML 24] A novel automated neuron explanation framework that can accurately describe polysemantic concepts in deep neural networks

    Jupyter Notebook, 12 stars

  6. ThinkEdit

    An effective weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study uncovering how reasoning length is encoded in the model’s representation space.

    Python, 12 stars, 2 forks

Repositories

Showing 10 of 22 repositories
