Open Source
Tools, Datasets & Benchmarks
We release the artifacts behind our research (codebases, datasets, and benchmarks) so the community can reproduce, build on, and stress-test what we publish. Everything below is freely available on GitHub, Hugging Face, or a project website.
Large Language Models
SummLlama Hugging Face
Faithful Summarization Model
Ext2Gen Hugging Face
Robust Generation Model for RAG
Benchmark Datasets
UniSumEval GitHub
Benchmark Data for Text Summarization
ToFuEval GitHub
Hallucination Benchmark Data
ANIMAL-10N Web
Real-world Data with Noisy Labels
Algorithms
Retrieval for RAG
Transformers
ViDT GitHub
A Fully Transformer-based Object Detector
Fast Autoregressive Decoding GitHub
Fast & Robust Early Exit
DisCal GitHub
Calibrated Distillation
MEDUSA GitHub
RGB-D Transformer-based Object Detector
Robust Deep Learning
Awesome Noisy Labels GitHub
Curated List of Noisy Label Research
SELFIE GitHub
Robust Training against Noisy Labels
Prune4Rel GitHub
Robust Data Pruning
MQNet GitHub
Open-set Active Learning
FedRN GitHub
Federated Learning with Noisy Labels