Stars
⚡ Dynamically generated stats for your github readmes
Benchmarking physical understanding in generative video models
Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurIPS'24]
Code samples for YouTube APIs, including the YouTube Data API, YouTube Analytics API, and YouTube Live Streaming API. The repo contains language-specific directories that contain the samples.
Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025).
Data and code from "Estimating the perceived dimension of psychophysical stimuli using triplet accuracy and hypothesis testing"
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
Code & data for "Towards flexible perception with visual memory"
LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively.
PyTorch code and models for the DINOv2 self-supervised learning method.
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
A framework for evaluating models on their alignment to brain and behavioral measurements (100+ benchmarks)
Run CLIP inference on the ImageNet dataset, use these inferences as labels to train other models, and then evaluate the trained models on the ImageNet validation dataset using the original labels or CLIP…
Official code release for the paper "Trapped in texture bias? A large scale comparison of deep instance segmentation", accepted at ECCV 2022
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
Official implementation of the paper Performance-optimized DNNs are evolving into worse models of inferotemporal visual cortex
This is an unofficial implementation of the diffusion-style noise frontend in "Intriguing properties of generative classifiers" by Priyank Jaini, Kevin Clark, Robert Geirhos to improve the shape-bi…
Google Research
ViT models pretrained with up to ~5k hours of human-like video data
A curated list of representational alignment papers, projects, and presentations
Code for "Don't trust your eyes: on the (un)reliability of feature visualizations" (ICML 2024)
Code for Continuously Changing Corruptions (CCC) benchmark + evaluation
Code and data for the paper "In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation"
Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning" (NeurIPS 2022 Outstanding Paper Award)
A playbook for systematically maximizing the performance of deep learning models.
Spurious Features Everywhere - Large-Scale Detection of Harmful Spurious Features in ImageNet