Automatic Grammatical Error Correction System
English may be the third most commonly spoken language in the world, but many people continue to struggle with its intricacies, from spelling to grammar. In an increasingly digital world where emails and instant messages are becoming the de facto communication channels, grammatical mistakes can mar a person’s first impression or undermine a brand’s credibility. Today, word processing software offers spelling and grammar checks, but these are generally basic functions, often limited to the word level. There is a need for more robust processes that can correct grammatical errors at the sentence level to support those whose written English may be lacking. Such a tool would be valuable to companies producing word processors and related software, as well as to those involved in English language education for native and non-native speakers. It could also bolster the online language learning market, which is expected to grow at a CAGR of 13% from 2019 to reach US$10.5 billion by 2025.
This invention is an artificial intelligence system for automatic grammatical error detection and correction. Unlike other existing algorithms, it can correct complete sentences that contain multiple, interacting errors.
The invention comprises a decoder that performs a beam search over the possible hypotheses (i.e., corrected versions of the input sentence) to find the best one. The search starts from the original erroneous sentence. At each step, a set of proposers generates new hypotheses by making incremental changes to the current hypothesis. These hypotheses are then scored on grammatical correctness by a set of ‘experts’, which include discriminative classifiers for specific error types such as article and preposition errors. Each hypothesis receives a final score, computed as a linear combination of its expert scores weighted according to the decoder model, and the hypothesis with the highest score is judged to be the best possible correction.
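The decoding loop described above can be illustrated with a minimal Python sketch. It is not the invention's actual implementation: the proposers, the rule-based experts, the weights, and every function name here are hypothetical stand-ins chosen for illustration, whereas the real system uses trained discriminative classifiers and a learned decoder model.

```python
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, ...]          # a sentence as a tuple of tokens
Proposer = Callable[[Hypothesis], List[Hypothesis]]
Expert = Callable[[Hypothesis], float]

PREPOSITIONS = ("in", "on", "at")     # toy inventory, assumed for the demo


def propose_articles(hyp: Hypothesis) -> List[Hypothesis]:
    """Hypothetical proposer: swap 'a' and 'an', one position at a time."""
    out = []
    for i, tok in enumerate(hyp):
        if tok in ("a", "an"):
            other = "an" if tok == "a" else "a"
            out.append(hyp[:i] + (other,) + hyp[i + 1:])
    return out


def propose_prepositions(hyp: Hypothesis) -> List[Hypothesis]:
    """Hypothetical proposer: replace one preposition with another."""
    out = []
    for i, tok in enumerate(hyp):
        if tok in PREPOSITIONS:
            out.extend(hyp[:i] + (p,) + hyp[i + 1:]
                       for p in PREPOSITIONS if p != tok)
    return out


def article_expert(hyp: Hypothesis) -> float:
    """Toy rule standing in for a trained article classifier."""
    score = 0.0
    for tok, nxt in zip(hyp, hyp[1:]):
        if tok in ("a", "an"):
            wanted = "an" if nxt[0].lower() in "aeiou" else "a"
            score += 1.0 if tok == wanted else -1.0
    return score


def preposition_expert(hyp: Hypothesis) -> float:
    """Toy rule standing in for a trained preposition classifier."""
    score = 0.0
    for tok, nxt in zip(hyp, hyp[1:]):
        if tok in PREPOSITIONS and nxt.lower() == "monday":
            score += 1.0 if tok == "on" else -1.0
    return score


def total_score(hyp: Hypothesis, experts: List[Expert],
                weights: List[float]) -> float:
    """Final score: linear combination of the expert scores."""
    return sum(w * expert(hyp) for expert, w in zip(experts, weights))


def beam_search(sentence: str, proposers: List[Proposer],
                experts: List[Expert], weights: List[float],
                beam_width: int = 3, max_steps: int = 5) -> str:
    start = tuple(sentence.split())   # search starts from the input sentence
    beam, seen = [start], {start}
    best, best_score = start, total_score(start, experts, weights)
    for _ in range(max_steps):
        # Each proposer makes incremental changes to every beam hypothesis.
        candidates = []
        for hyp in beam:
            for propose in proposers:
                for new in propose(hyp):
                    if new not in seen:
                        seen.add(new)
                        candidates.append(new)
        if not candidates:
            break
        # Keep only the highest-scoring hypotheses for the next step.
        candidates.sort(key=lambda h: total_score(h, experts, weights),
                        reverse=True)
        beam = candidates[:beam_width]
        top_score = total_score(beam[0], experts, weights)
        if top_score > best_score:
            best, best_score = beam[0], top_score
    return " ".join(best)


corrected = beam_search("I ate a apple in Monday",
                        [propose_articles, propose_prepositions],
                        [article_expert, preposition_expert],
                        weights=[1.0, 1.0])
print(corrected)
```

In this toy run the input contains two errors, and each proposer fixes only one per step, so the fully corrected sentence is reached only after two search steps. Keeping several hypotheses in the beam, rather than committing to a single edit, is what allows corrections to be combined across steps.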