
Detection of Hong Kong English in Students’ Writing and Good Practice Analysis

Project Description

The project aims to develop an automatic detection tool that locates Hong Kong English (English influenced by Chinese and Cantonese) in students’ writing. It is widely recognised that people’s writing in a second language is influenced by their native language. Identifying these specific patterns has been used to study language transfer effects in second-language acquisition, and is useful for developing pedagogical material and teaching methods and for generating learner feedback.



Dr. Anora Wong, Dr. Amy Kong, Dr. Holly Chung (Department of English)
Dr. David Ling, Mr. Ronald Ching, Ms. Jessica Au, Mr. Jeffrey Lau (Deep Learning Research and Application Centre)


Building an Error-Annotated Corpus

We are building a human-annotated corpus of writing by Hong Kong senior students: a digital collection of their essays with classified error corrections marked by Hong Kong teachers. What sets this corpus apart is that semantic errors arising from the influence of spoken Cantonese and Chinese will also be highlighted, and all errors will be corrected by professional teachers.

The digital corpus will provide realistic statistics on the English errors of Hong Kong senior students, such as error distribution and demographic segmentation. Such statistics are highly useful to English language specialists, researchers, and teachers, who may adjust their teaching or research focus according to the concrete figures, making research-informed teaching and learning more effective for teachers and students alike. Our team will also develop a set of notes highlighting common semantic errors made by senior students in Hong Kong, which will serve as teaching materials for the second-language writing classroom in secondary and even post-secondary institutions.
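As a rough illustration of the kind of statistics such a corpus could yield, the sketch below computes an error distribution and a per-segment breakdown from a handful of annotation records. The record layout, the error category names, and the school-year labels are all hypothetical and are not the project's actual annotation scheme.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records: (essay_id, school_year, error_category).
# Category names and year labels are illustrative assumptions only.
annotations = [
    ("e1", "S5", "verb_tense"),
    ("e1", "S5", "cantonese_transfer"),
    ("e2", "S6", "article"),
    ("e2", "S6", "cantonese_transfer"),
    ("e3", "S5", "verb_tense"),
]


def error_distribution(records):
    """Return each error category's share of all annotated errors."""
    counts = Counter(category for _, _, category in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def errors_by_segment(records):
    """Count error categories separately for each demographic segment."""
    segments = defaultdict(Counter)
    for _, segment, category in records:
        segments[segment][category] += 1
    return {segment: dict(counter) for segment, counter in segments.items()}


distribution = error_distribution(annotations)
by_segment = errors_by_segment(annotations)
```

Here `distribution` maps each category to its proportion of all errors, and `by_segment` groups raw counts by school year, which is one simple way to read "demographic segmentation".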


Sentence Correction by Deep Learning

Traditional rule-based grammar checkers, such as the one in Microsoft Word, can check only simple grammatical errors, while Grammarly can additionally detect more complicated syntactic errors. However, both fail to detect semantic flaws in students’ writing. Deep-learning-based checkers can identify certain semantic errors and also improve sentence style, as our preliminary results indicate.



We held a project-sharing session with school teachers at the eClass learning day 2018.