System Combination and Mixture of Experts for Grammatical Error Correction
Abstract:
The number of people who speak English as a second language (ESL) or as a foreign language (EFL) increases every year, and the need for grammar correction tools grows with it. Many grammatical error correction (GEC) systems with different characteristics have been proposed, and these systems have complementary strengths. Studies have shown that combining multiple GEC systems can yield a more accurate GEC system. However, little research has been done on GEC system combination methods.
This thesis investigates how best to combine multiple GEC systems. We believe that existing GEC systems can generate more accurate corrections if their strengths are combined appropriately, and that combining existing systems is a more efficient route to higher accuracy than building a larger and more complex model. In addition, whenever a new and stronger GEC system is published, system combination methods can combine it with other GEC systems to increase accuracy further.
We first propose a novel formulation of the GEC system combination task as a simple machine learning task, namely binary classification. We train a logistic regression model that uses only edit type features to combine multiple GEC systems. We find that it generates better corrections than previous system combination methods and conventional ensembles. We also show that our method has more expressive power than previous GEC system combination methods.
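To make the binary classification view concrete, the following is a minimal sketch and not the thesis implementation. It assumes two hypothetical base systems (`sys_a`, `sys_b`), three ERRANT-style edit type labels, one indicator feature per (system, edit type) pair, and a made-up toy training set in which an edit is labeled 1 if it appears in the gold reference. The logistic regression is trained with plain stochastic gradient descent to keep the example dependency-free.

```python
import math

SYSTEMS = ["sys_a", "sys_b"]                    # hypothetical base systems
EDIT_TYPES = ["R:VERB:SVA", "M:DET", "R:PREP"]  # ERRANT-style edit type labels

def featurize(edit_type, proposers):
    """One indicator per (system, edit type) pair: 1.0 if that system proposed the edit."""
    return [1.0 if (s in proposers and t == edit_type) else 0.0
            for s in SYSTEMS for t in EDIT_TYPES]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that an edit with feature vector x is a correct edit."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(examples, epochs=3000, lr=0.5):
    """Logistic regression fitted by stochastic gradient descent on the log loss."""
    weights = [0.0] * (len(SYSTEMS) * len(EDIT_TYPES))
    bias = 0.0
    for _ in range(epochs):
        for x, y in examples:
            grad = predict(weights, bias, x) - y   # d(log loss) / d(logit)
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

# Made-up toy data: (edit type, systems that proposed it, 1 if in the gold reference).
TOY_EDITS = [
    ("R:VERB:SVA", {"sys_a"}, 1),
    ("R:VERB:SVA", {"sys_b"}, 0),
    ("M:DET", {"sys_b"}, 1),
    ("M:DET", {"sys_a"}, 0),
    ("R:PREP", {"sys_a", "sys_b"}, 1),
    ("R:PREP", {"sys_a"}, 0),
    ("R:PREP", {"sys_b"}, 0),
]
examples = [(featurize(t, s), y) for t, s, y in TOY_EDITS]
weights, bias = train(examples)

def keep_edit(edit_type, proposers, threshold=0.5):
    """Keep an edit in the combined output if its predicted probability is high enough."""
    return predict(weights, bias, featurize(edit_type, proposers)) > threshold
```

In this toy setting the classifier can learn, for example, to trust one system on agreement errors and to accept preposition edits only when both systems agree, which a single global vote threshold cannot express.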
We then investigate a GEC system combination method that also considers the textual features of the hypotheses from the base systems. Specifically, we ask whether a GEC quality estimation model can be used for GEC system combination and how effective existing GEC quality estimation models are in that setting. We find that existing GEC quality estimation models fail to differentiate good corrections from bad ones, which renders them ineffective for GEC system combination. We therefore build a stronger GEC quality estimation model. The model outperforms previous methods on quality estimation, re-ranking, and system combination tasks. We also investigate various biases that can be added to improve its performance further. By integrating our model with a voting bias and with edit scores from our previous system combination method, we obtain higher F0.5 scores on standard GEC benchmarks than prior work.
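For reference, the F0.5 score mentioned above is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. Here P and R denote the precision and recall of the proposed edits against the reference edits:

```latex
F_{\beta} = \frac{(1 + \beta^{2})\, P\, R}{\beta^{2} P + R},
\qquad
F_{0.5} = \frac{1.25\, P\, R}{0.25\, P + R}
```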
Lastly, we explore an approach that achieves the benefits of system combination more efficiently. System combination incurs a high computational cost because inference must be run on every base system before the combination method itself. It would be more efficient to have a single model with multiple sub-networks that specialize in correcting different error types. To this end, we propose a novel GEC model with a mixture-of-experts architecture and a routing function that selects experts based on the error type of each word in the text. The model matches the performance of a strong GEC model while using one third as many effective parameters. Furthermore, it generates interpretable corrections by identifying error types during inference.
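The routing idea can be illustrated with a small sketch. This is not the proposed model: it assumes one expert per hypothetical error type (`VERB`, `DET`, `PREP`), a linear router with softmax gating, top-1 expert selection per token, and hand-written toy weights in place of learned ones.

```python
import math

ERROR_TYPES = ["VERB", "DET", "PREP"]  # hypothetical error types, one expert each

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

class ErrorTypeMoE:
    """Toy mixture-of-experts layer whose router scores error types per token."""

    def __init__(self, router_weights, expert_weights):
        self.router_weights = router_weights  # shape [num_experts][dim]
        self.expert_weights = expert_weights  # shape [num_experts][dim][dim]

    def forward(self, hidden_states):
        outputs, routed_types = [], []
        for h in hidden_states:
            gate = softmax(matvec(self.router_weights, h))
            k = max(range(len(gate)), key=gate.__getitem__)  # top-1 expert
            expert_out = matvec(self.expert_weights[k], h)   # only expert k is run
            outputs.append([gate[k] * o for o in expert_out])
            routed_types.append(ERROR_TYPES[k])
        return outputs, routed_types

def scaled_identity(scale, dim=3):
    return [[scale if i == j else 0.0 for j in range(dim)] for i in range(dim)]

# Toy weights: the router is an identity map and expert i multiplies its input by (i + 1).
layer = ErrorTypeMoE(
    router_weights=scaled_identity(1.0),
    expert_weights=[scaled_identity(float(i + 1)) for i in range(3)],
)
tokens = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]  # made-up hidden states for two words
outputs, routed_types = layer.forward(tokens)
```

Because only the selected expert is evaluated for each token, the number of parameters active per token is smaller than the total parameter count, and the returned `routed_types` expose which error type the router associated with each word.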