![[seq2seq-neuralmt.png]]

The seq2seq model reads an input sentence “ABC” and produces “WXYZ” as the output sentence ([Sutskever2014](https://arxiv.org/abs/1409.3215)).

The history of machine translation (MT) spans several decades. The idea was first proposed in the 1940s, when researchers began to explore the possibility of using computers to translate between languages automatically. In the 1950s and 1960s, researchers built rule-based MT systems that relied on hand-coded linguistic rules. Starting in the late 1980s and through the 1990s, the field shifted to statistical models, which used large datasets of parallel texts to learn the probabilities of different translations and thereby produce more accurate output.

In the early 2010s, deep learning techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) gained popularity across artificial intelligence, including computer vision, speech recognition, and natural language processing. In 2014, Ilya Sutskever, Oriol Vinyals, and Quoc Le at Google published a groundbreaking paper, "Sequence to Sequence Learning with Neural Networks". It introduced an encoder-decoder architecture built from RNNs: an encoder network reads the source sentence and compresses it into a fixed-size vector, and a decoder network generates the translation from that vector one token at a time (a minimal sketch appears at the end of this section). This allowed translation to be learned end to end from parallel data, rather than assembled from hand-coded rules or pipelines of separately trained statistical components.

The development of the Google Neural Machine Translation (GNMT) system was led by a team of researchers at Google, including Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc Le, and others, with Quoc Le widely considered one of the key figures behind it. Le is a Vietnamese-born computer scientist and artificial intelligence researcher who works as a research scientist at Google Brain. He completed his PhD at Stanford University, where he worked on deep learning, and went on to several high-profile projects at Google, including the seq2seq work and the GNMT system.

In the years that followed, researchers made many improvements to the architecture and training of NMT systems. Chief among them was the attention mechanism ([Bahdanau2014](https://arxiv.org/abs/1409.0473)), which improves translation accuracy by letting the decoder look back at every encoder state rather than a single fixed-size vector. Others incorporated additional linguistic features such as part-of-speech tags and syntax trees.

Overall, the development of NMT represents a major milestone in the history of machine translation, offering significant gains in accuracy and efficiency over earlier approaches. The field continues to evolve, with ongoing research focused on making NMT systems more robust and flexible and on extending them to new languages and domains.
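For concreteness, here is a minimal sketch of the encoder-decoder pattern described above, written in PyTorch. The vocabulary sizes, layer dimensions, special-token ids, and the greedy decoding loop are illustrative assumptions, not the setup of Sutskever et al. (who used deep multilayer LSTMs, beam search, and a reversed source sentence). The point is only the shape of the computation: encode the source once into a fixed-size state, then decode the target token by token.

```python
# Minimal encoder-decoder (seq2seq) sketch; all sizes and special-token
# ids below are toy assumptions chosen for illustration.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HIDDEN = 1000, 1000, 64, 128
BOS, EOS = 1, 2  # hypothetical begin/end-of-sentence token ids

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HIDDEN, batch_first=True)

    def forward(self, src):                    # src: (batch, src_len)
        _, state = self.rnn(self.embed(src))   # keep only the final (h, c)
        return state                           # fixed-size sentence summary

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, TGT_VOCAB)

    def forward(self, tgt, state):             # tgt: (batch, tgt_len)
        output, state = self.rnn(self.embed(tgt), state)
        return self.out(output), state         # logits over target vocab

def greedy_translate(encoder, decoder, src, max_len=20):
    """Encode the source once, then emit target tokens one at a time,
    feeding each prediction back in as the next decoder input."""
    state = encoder(src)
    token = torch.full((src.size(0), 1), BOS, dtype=torch.long)
    result = []
    for _ in range(max_len):
        logits, state = decoder(token, state)
        token = logits.argmax(dim=-1)          # most likely next token
        result.append(token)
        if (token == EOS).all():
            break
    return torch.cat(result, dim=1)

src = torch.randint(3, SRC_VOCAB, (1, 5))      # a random 5-token "sentence"
print(greedy_translate(Encoder(), Decoder(), src))
```

Since the models here are untrained, the output is random; in practice the encoder and decoder are trained jointly with a cross-entropy loss on pairs of source and target sentences, and the fixed-size bottleneck between them is exactly what the later attention mechanism relaxes.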