Recently, I shared an article with a colleague of mine. The article had been published in a peer-reviewed journal, and its contents were original and interesting. My colleague, however, being a meticulous reader of scientific texts, immediately spotted a few simple grammar errors. It would have been very easy to blame the authors and editors for not correcting such errors before publication, but the incident raised another question:
Why don’t we have high-quality open source grammar checking software that is already integrated into major text editors such as VIM and Emacs?
Any user of a recent version of MS Word is well aware of on-the-fly grammar checking, at least for English. But many academics use LaTeX to typeset their articles and rely either on well-known text editors such as VIM and Emacs, or on specialized software that makes LaTeX easier to handle. Telling these people “go and check your article using MS Word, or copy and paste your article text into an online grammar checking service” therefore does not make a lot of sense. Those methods are inconvenient, and thus not very usable for the hundreds of thousands of scientists writing articles every day. But what would be the ideal way? The answer is simple in theory. We have high-quality open source spell checkers, at least for English, and they have already been integrated into major text editors. Scientists who write in LaTeX therefore have no excuse for spelling errors: it is simply a matter of activating the spell checker. If only they had similar software for grammar checking, it would be straightforward and convenient to eliminate the easiest grammar errors, at least for English.
A quick search on the Internet revealed the following for grammar checking on GNU/Linux:
Even though the Java-based LanguageTool software seems more comprehensive, and the VIM integration looks much more promising than the one for Emacs (via link grammar), it still does not seem to be on par with the grammar checker of MS Word. Having said that, there are academics who criticize MS Word’s grammar checking capabilities: Sandeep Krishnamurthy has dedicated a detailed web page to this at http://faculty.washington.edu/sandeep/check/.
My conclusion can be summarized as follows: although grammar checking is improving, the current situation is far from ideal. To publish higher-quality articles free of grammar errors, and given the huge population of academics whose mother tongue is not English, we need smooth integration of grammar checkers into major text editors as well as into specialized LaTeX software.
Moreover, it is important to have a public web page that publishes results comparing the accuracy and performance of various grammar checkers, including the commercial ones. This would raise awareness of language quality and help competing software systems show where they stand.
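To make the idea of comparing accuracy a little more concrete, here is a minimal sketch of how such a comparison might be scored. Everything in it is an assumption made for illustration: the `Flag` type, the `score` function, the hand-annotated "gold" errors, and the two hypothetical checkers are invented here and do not correspond to any real tool or benchmark. It simply computes precision (how many flagged errors are real), recall (how many real errors were flagged), and their harmonic mean (F1).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    """A flagged error: sentence index plus character span (hypothetical format)."""
    sentence_id: int
    start: int
    end: int


def score(gold: set[Flag], predicted: set[Flag]) -> dict[str, float]:
    """Compare a checker's flags against human-annotated errors (exact span match)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up annotations: three real errors in a small test corpus.
gold = {Flag(0, 4, 8), Flag(1, 0, 3), Flag(2, 10, 14)}

# Made-up output of two imaginary checkers.
checker_a = {Flag(0, 4, 8), Flag(1, 0, 3), Flag(3, 2, 5)}  # one false alarm
checker_b = {Flag(0, 4, 8)}                                # cautious, misses two

results = {
    "checker_a": score(gold, checker_a),
    "checker_b": score(gold, checker_b),
}

for name, r in results.items():
    print(f"{name}: precision={r['precision']:.2f} "
          f"recall={r['recall']:.2f} f1={r['f1']:.2f}")
```

A real comparison would need a shared, carefully annotated corpus and a more forgiving matching rule than exact spans, but even a simple table of such numbers would let readers see where each checker stands.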
The final question remains, though: Do software developers and Natural Language Processing (NLP) experts have the required incentives to enhance automated quality assurance of written text, not only for English but also for many other languages? The answer does not seem to be a definitive yes.
Probably no amount of automation will be able to deal with all the intricacies of natural languages, and a few stupid mistakes in a computer science article will neither cost the authors any money nor kill anyone. Nevertheless, the challenge remains, and it is a valid one.