Collections
Collections including paper arxiv:2403.16971
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 123
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 49
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
  Paper • 2402.03766 • Published • 12
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 64

- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 39
- microsoft/phi-1_5
  Text Generation • Updated • 93.2k • 1.31k
- Language models scale reliably with over-training and on downstream tasks
  Paper • 2403.08540 • Published • 14
- Akashpb13/Swahili_xlsr
  Automatic Speech Recognition • Updated • 354 • 8

- SaulLM-7B: A pioneering Large Language Model for Law
  Paper • 2403.03883 • Published • 74
- Character-LLM: A Trainable Agent for Role-Playing
  Paper • 2310.10158 • Published • 1
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 64
- RakutenAI-7B: Extending Large Language Models for Japanese
  Paper • 2403.15484 • Published • 12

- Measuring the Effects of Data Parallelism on Neural Network Training
  Paper • 1811.03600 • Published • 2
- Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
  Paper • 1804.04235 • Published • 2
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
  Paper • 1905.11946 • Published • 3
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 61