Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2405.09673

Foundation AI Papers (II)

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30 • 46
Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30 • 73
ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 59
KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published Apr 30 • 108

Lora variations

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182
Flora: Low-Rank Adapters Are Secretly Gradient Compressors

Paper • 2402.03293 • Published Feb 5 • 4
PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

Paper • 2401.11316 • Published Jan 20 • 1
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45

RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15 • 66
InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 28
LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 86

To read... eventually

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 123
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 49
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6 • 12
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 64

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13 • 48
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 103
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

Paper • 2403.20041 • Published Mar 29 • 34
Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2 • 43

papaer selecting

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 43
Linear Transformers are Versatile In-Context Learners

Paper • 2402.14180 • Published Feb 21 • 6
Training-Free Long-Context Scaling of Large Language Models

Paper • 2402.17463 • Published Feb 27 • 19
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 590

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26 • 26
LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 86
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

Paper • 2405.17428 • Published May 27 • 16

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 36
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

Data Efficient Approaches

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22 • 24
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Paper • 2403.03194 • Published Mar 5 • 12
Orca-Math: Unlocking the potential of SLMs in Grade School Math

Paper • 2402.14830 • Published Feb 16 • 24

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 7 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1 • 21
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 78
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 140
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 25

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs