🧪 Model merging takes multiple LLMs and combines them into one. It's cost-effective (no GPU needed) and can even be done on consumer hardware.
🏆 Merging can produce SOTA models, as evaluated on the Hugging Face Open LLM Leaderboard. My neurotic-crown-clown model just ranked 3rd (avg. 76.38) out of thousands of 7B models on 🤗. Whoop!
📚 It required absolutely no skill on my part, just time to read and experiment. As part of my learning journey, I worked through a great blog post on merging by Maxime Labonne. I really ❤️ the knowledge-sharing mindset of the open-source LLM community; this space is really special and a great place to be.
⚙️ My strategy? Simple: build on the shoulders of giants with strong base models for the merge:
- NeuralMonarch by Maxime Labonne
- AlphaMonarch by Maxime Labonne
- Jaskier-7b-dpo-v5.6 from bards.ai
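For anyone curious what a merge like this looks like in practice, the usual route is a mergekit YAML config. The sketch below is illustrative only (the actual config isn't shared in this post): the `dare_ties` method, the density/weight values, and the Hugging Face repo IDs are all my assumptions.

```yaml
# Hypothetical mergekit config — merge method, parameters, and repo IDs are illustrative guesses
models:
  - model: mlabonne/NeuralMonarch-7B
    # no parameters: also used as the base model below
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      density: 0.53
      weight: 0.4
  - model: bardsai/jaskier-7b-dpo-v5.6
    parameters:
      density: 0.53
      weight: 0.3
merge_method: dare_ties
base_model: mlabonne/NeuralMonarch-7B
parameters:
  int8_mask: true
dtype: bfloat16
```

You'd then run the merge with something like `mergekit-yaml config.yaml ./merged-model --copy-tokenizer`, which runs happily on CPU.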
❓ As an aside, is merging merges becoming a way to game eval leaderboards?
- Neurotic-crown-clown
- Neurotic-crown-clown GGUF quantised
- Neurotic-crown-clown AWQ quantised
- Neurotic-crown-clown EXL2 quantised
- HF Open LLM Leaderboard