Parameter Counts of Pretrained Language Models, and Chinese Pretrained Language Models

Parameter counts for BERT, BART, GPT-2, XLNet, and other models are listed at:

https://huggingface.co/transformers/v2.2.0/pretrained_models.html

GPT-2 has 1.5 billion parameters (gpt2-xl: 48 layers, 1600 hidden size, 25 attention heads, 1558M parameters).

GPT-3 has around 175 billion trainable parameters and 96 attention layers.
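
These headline numbers can be sanity-checked with a back-of-the-envelope estimate from the model configurations alone. The minimal Python sketch below (the helper name `estimate_params` is my own) assumes a standard GPT-style decoder with a 4x MLP expansion and ignores biases and layer norms; the layer/hidden/vocab sizes are taken from the gpt2-xl config and the GPT-3 paper.

```python
# Rough parameter-count estimate for a GPT-style decoder-only transformer.
# Each layer has ~12 * d_model^2 weights: 4*d^2 for the attention Q/K/V and
# output projections, plus 8*d^2 for the two MLP matrices (4x expansion).
def estimate_params(n_layer, d_model, vocab_size, n_ctx):
    per_layer = 12 * d_model ** 2      # attention + MLP weight matrices
    embeddings = vocab_size * d_model  # token embeddings (tied with output)
    positions = n_ctx * d_model        # learned positional embeddings
    return n_layer * per_layer + embeddings + positions

# gpt2-xl: 48 layers, 1600 hidden -> ~1557M, matching the reported 1558M
print(f"gpt2-xl: {estimate_params(48, 1600, 50257, 1024) / 1e6:.0f}M")

# GPT-3: 96 layers, 12288 hidden -> ~175B
print(f"GPT-3:   {estimate_params(96, 12288, 50257, 2048) / 1e9:.0f}B")
```

The small residual gap (e.g. 1557M estimated vs. 1558M reported for gpt2-xl) is accounted for by the biases and layer-norm parameters the estimate omits.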

For Chinese pretrained language models, see:

https://github.com/lonePatient/awesome-pretrained-chinese-nlp-models
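
To get an exact count for any of these models rather than an estimate, one option is to load the checkpoint with the Hugging Face transformers library and sum the parameter tensors. A minimal sketch, assuming transformers and PyTorch are installed and using bert-base-chinese purely as an illustrative model ID:

```python
from transformers import AutoModel

# Load a Chinese pretrained model from the Hugging Face hub and count
# its parameters exactly by summing the sizes of all weight tensors.
model = AutoModel.from_pretrained("bert-base-chinese")
total = sum(p.numel() for p in model.parameters())
print(f"bert-base-chinese: {total / 1e6:.0f}M parameters")  # ~102M
```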


References:

GPT-3: https://www.springboard.com/blog/ai-machine-learning/machine-learning-gpt-3-open-ai/
