These days, large-scale language models such as GPT-3 have been growing ever larger, with parameter counts ranging from hundreds of billions to trillions, a trend motivated by model scaling laws. Scaling laws have been observed in models from other fields as well, such as image and audio processing. However, the information and technology required to train such large ML models remain in short supply.
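For context, one well-known formulation of these scaling laws, taken from Kaplan et al. (2020) rather than from this session, says that with ample data and compute, test loss falls off as a power law in the non-embedding parameter count N:

    L(N) ≈ (N_c / N)^{α_N},  with fitted constants α_N ≈ 0.076 and N_c ≈ 8.8 × 10^13

In other words, loss keeps improving predictably as parameters grow, which is why model sizes keep climbing.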
In this session, I’ll explain in detail the techniques that can be used to train large-scale ML models. I’d also like to share the experience we gained from training an AI text-filtering model as a large ML model and providing it as a service, and finally how we improved the model’s performance.