Model Merging Magic - Creating SOTA Models Without Advanced Skills or GPUs
🧪 Model merging takes multiple LLMs and combines them into one. It's cost-effective (no GPU needed) and can be done on consumer hardware.
Merging can produce SOTA models, as evaluated on the Hugging Face Open LLM Leaderboard. My neurotic-crown-clown model just ranked 3rd (avg. 76.38) out of thousands of 7B models on 🤗. Whoop!
It required absolutely no skill on my part, just time to read and try. As part of my learning journey, I worked through a great blog on merging from Maxime Labonne. I really ❤️ the knowledge-sharing mindset of the open source LLM community. This space is really special and a great place to be.
My strategy? Simple. Stand on the shoulders of giants, with strong base models for the merge:
- NeuralMonarch by Maxime Labonne
- AlphaMonarch by Maxime Labonne
- Jaskier-7b-dpo-v5.6 from bards.ai
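
To make "combining models into one" a bit more concrete, here is a toy sketch of the simplest kind of merge: a weighted average of matching parameters. This is only an illustration under loud assumptions. It is not the recipe used for neurotic-crown-clown (the merge method isn't stated here), the `linear_merge` helper and the tiny "checkpoints" are hypothetical, and real merges are typically run with dedicated tooling on full model weights, often with more sophisticated methods than plain averaging.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of floats.
# Assumption: all models share the same architecture and parameter names.
Weights = Dict[str, List[float]]


def linear_merge(models: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted average of matching parameters (hypothetical helper)."""
    if len(models) != len(coeffs):
        raise ValueError("need one coefficient per model")
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    norm = [c / total for c in coeffs]  # normalise so the weights sum to 1

    merged: Weights = {}
    for name in models[0]:
        tensors = [m[name] for m in models]
        if len({len(t) for t in tensors}) != 1:
            raise ValueError(f"shape mismatch for {name}")
        # Average each parameter position across all models.
        merged[name] = [
            sum(w * t[i] for w, t in zip(norm, tensors))
            for i in range(len(tensors[0]))
        ]
    return merged


# Made-up toy "checkpoints", not real model weights.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
model_c = {"layer.weight": [5.0, 6.0], "layer.bias": [2.0]}

merged = linear_merge([model_a, model_b, model_c], [1.0, 1.0, 1.0])
print(merged)
```

The whole thing is arithmetic on weights that already exist, with no training loop and no gradients, which is why merging can run on a CPU.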
As an aside, is merging merges becoming a way to game eval leaderboards?
Neurotic-crown-clown
Neurotic-crown-clown GGUF quantised
Neurotic-crown-clown AWQ quantised
Neurotic-crown-clown EXL2 quantised
HF Open LLM Leaderboard
