RoBERTa: A Robustly Optimized BERT Pretraining Approach
Pointer Networks
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Setting Up SSR on Ubuntu 18.04
Multi-Sample Dropout for Accelerated Training and Better Generalization
Label Smoothing
Weight Initialization in Deep Learning
33 Must-Know Tips for Training Neural Networks
Contextual String Embeddings for Sequence Labeling
lonePatient
A blog dedicated to documenting technology
Announcements
I record and share learning notes and open-source content here. If you have any questions, you can leave a message on the message board or through my WeChat official account. Thank you!