• justynasty@lemmy.kya.moe

    I was concerned that a large dataset with low sentence similarity might take longer to train on. I'm not sure whether my hunch is true that novels take less time to train on than a Q&A dataset with detailed answers: generic roleplay vs. encyclopedic knowledge.
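
    A rough way to check the "low sentence similarity" part would be to embed a sample of sentences and look at the average pairwise cosine similarity. A minimal sketch, assuming sentence-transformers is installed and using the all-MiniLM-L6-v2 checkpoint (both my picks, not anything from the datasets above):

    ```python
    # Sketch: estimate how self-similar a dataset is by averaging pairwise
    # cosine similarity over a small sample of sentences.
    import numpy as np
    from sentence_transformers import SentenceTransformer, util

    # Hypothetical sample; in practice, draw a few thousand sentences from the dataset.
    sentences = [
        "The knight rode north through the storm.",
        "She had never trusted the innkeeper's smile.",
        "Q: What is gradient checkpointing? A: A memory-saving training technique.",
        "Q: How does LoRA cut trainable parameters? A: By learning low-rank updates.",
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(sentences, convert_to_tensor=True)

    # Pairwise cosine similarity matrix; average the off-diagonal entries.
    sim = util.cos_sim(embeddings, embeddings).cpu().numpy()
    n = len(sentences)
    mean_offdiag = (sim.sum() - np.trace(sim)) / (n * (n - 1))
    print(f"mean pairwise similarity over {n} sentences: {mean_offdiag:.3f}")
    ```

    A lower mean would support the "low similarity" description, though whether that actually translates into longer training time is exactly the part I'm unsure about.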

    Reading these datasets, I think these GPT-3/4 conversations go into too much detail, and current (1-40B) language models can't really be trained to that level of detail. Those conversations would only be useful to humans. But I might be wrong about the training side, since I don't have experience with 100B+ models or with how they scale down.