Machine Translation
Project Description
Deep learning has been shown to produce good results in machine translation, as demonstrated by, for example, Google Translate. When translation is confined to documents in a specific domain (or fixed context), we believe the results could be better still and good enough for practical use, especially with the support of HSMC’s School of Translation, which will provide the project with relevant expertise and advice.
This project involves the translation of financial documents (the specific domain) between English and Chinese, in particular the English-to-Chinese translation of initial public offering (IPO) prospectuses and similar offering documents (hereafter called “IPO Documents”) submitted to the Hong Kong Stock Exchange (HKEX), which requires these documents to be provided in both English and Chinese. IPO Documents are generally drafted in English by professionals (lawyers and accountants) in Hong Kong, so the financial printers that print them often take on the translation as well, and translation accounts for a large share of their revenues and costs. An MOU has been signed with Alpha Financial Press Limited to carry out this project.
Reports
- “Work Report 16 Jan-9 Feb 2017”
- “Word2Vec Wiki Experiments”
- “Word2Vec: Word Preprocessing and Full-Wiki Experiment on Linux”
- “Running Word2Vec with Chinese/English Wikipedia Dump”
- “Downloading IPO Document with Python Script”
- “Running Sequential to Sequential Learning with Keras”
- “Processing Raw Text”
- “Progress Report: Neural Machine Translation for IPO Documents (16 Jan-10 Apr, 2017)”
- “Training Seq2Seq with Movie Subtitles”
- “Developing Seq2Seq Model with TensorFlow”
- “Visit to Alpha & Recent Work”
- “Developing Seq2Seq Model with Tensorflow (2)”
- “Neural Machine Translation Project Report”
- “Some Tips for Neural Machine Translation”
- “Neural Machine Translation Results for IPO Documents”
- “NMT refinery”
- “Processing Data with Named Entity Recognizer”
- “Paper note: On Using Very Large Vocabulary for Neural Machine Translation”
- “Update of Recent Work: Demo for Alpha”