Talk on natural language processing in today’s world

This Monday, Diverso Lab's Ph.D. student Antonio J. Dominguez gave a presentation on the latest advances in the field of natural language processing, connecting them with software engineering topics.

The talk addressed the role of scaling deep learning models to achieve unprecedented results across natural language processing benchmarks. The exposition of these scaling principles began with two critical ingredients, model parallelism and the attention mechanism, introduced in 2017 in the famous paper "Attention Is All You Need."
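
For readers who want a concrete picture, here is a minimal sketch (our illustration, not code from the talk) of the scaled dot-product attention that the paper introduces, written with NumPy:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, as in "Attention Is All You Need".

    Q, K: (seq_len, d_k) query/key matrices; V: (seq_len, d_v) values.
    """
    d_k = Q.shape[-1]
    # Score every query against every key; scale to stabilize gradients.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the keys turns scores into attention weights per query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors.
    return weights @ V

# Toy self-attention over 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)
```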

Antonio then presented the scaling laws, which describe how the number of model parameters, the amount of training data (measured in tokens), and the compute budget jointly determine optimal training.
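
As an illustration of what these laws imply in practice, a widely cited rule of thumb from DeepMind's Chinchilla work is that, for a fixed compute budget, the training-token count should scale roughly in proportion to the parameter count, at around 20 tokens per parameter. A back-of-the-envelope helper (ours, not a slide from the talk) might look like this:

```python
def chinchilla_optimal_tokens(n_parameters: float,
                              tokens_per_parameter: float = 20.0) -> float:
    """Approximate compute-optimal training tokens (Hoffmann et al., 2022).

    The ~20 tokens/parameter ratio is a rough empirical fit, not a law of
    nature; actual optima depend on data quality and architecture.
    """
    return n_parameters * tokens_per_parameter

# Example: a 70B-parameter model is compute-optimal at roughly 1.4T tokens.
print(f"{chinchilla_optimal_tokens(70e9):.2e}")  # 1.40e+12
```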

The talk then turned to code generation, showcasing DeepMind's AlphaCode, one of the most influential models in this scenario. Antonio's latest research addresses mono- and cross-lingual code generation using transformer-based language models. The unification of software engineering and generative language models points to an exciting future for program synthesis. However, the vast intrinsic complexity of these models calls for both classical and novel variability analysis if we want to leverage their full potential.

As a final remark, Antonio presented and discussed the inner properties and behavior of OpenAI's latest disruptive model, ChatGPT, and its novel use of Reinforcement Learning from Human Feedback (RLHF) for fine-tuning a language model.
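
For the curious, the core RLHF idea can be condensed into a single objective: maximize the score a learned reward model assigns to the model's outputs, minus a KL penalty that keeps the fine-tuned policy close to the pre-trained reference model. The toy sketch below (our simplification; real systems optimize this with PPO over full token sequences) illustrates that trade-off for a single next-token distribution:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = np.exp(logits - logits.max())
    return z / z.sum()

def rlhf_objective(policy_logits, reference_logits, reward, kl_coef=0.1):
    """Simplified single-step RLHF objective (no PPO clipping shown).

    The policy is pushed toward outputs the reward model scores highly,
    while a KL penalty discourages drifting from the reference model.
    """
    p = softmax(policy_logits)
    q = softmax(reference_logits)
    kl = float(np.sum(p * np.log(p / q)))  # KL(policy || reference)
    return reward - kl_coef * kl

# Toy next-token distributions over a 5-token vocabulary.
reference = np.array([1.0, 0.5, 0.2, 0.1, 0.0])
policy = reference + np.array([0.3, -0.1, 0.0, 0.0, -0.2])
print(rlhf_objective(policy, reference, reward=1.0))
```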