It's trained using GPT-NeoX, right? So does it follow the same model architecture, or have you guys implemented a custom one? I don't get what GPT-NeoX has been used for here. Was it used only to train at large scale on custom datasets, or am I missing something?