Can Computers Provide Assessment of Descriptive Essays? The Answer Is...
While online multiple-choice quizzes are now common, one big challenge remains: how to mark descriptive, essay-style answers fairly and accurately without a human examiner. A recent study from Tanzania brought us closer to solving this problem. At the heart of this innovation is a system built using Natural Language Processing (NLP), a branch of artificial intelligence that helps computers understand human language. Developed at the Open University of Tanzania (OUT), the system can automatically evaluate written answers using cosine similarity and a clever technique to represent words as numbers.
This might sound like science fiction, but it's real and working.
Why This Matters
Imagine you're a university professor with hundreds of students. Marking exams takes weeks. Now imagine a computer doing it in minutes, matching human scores in most cases with roughly 73% accuracy. It would save time and resources, and students would get faster feedback, boosting transparency and learning. This tool could revolutionise higher education in countries like Tanzania, where online education is growing rapidly and lecturer workloads are heavy.
How It Works
The system reads a student's answer and compares it to a model answer written by a lecturer. But instead of trying to "understand" the answer the way a human would, it transforms both into vectors: mathematical lists of numbers representing the words. Using TF-IDF (term frequency-inverse document frequency), a weighting scheme popular in search engines, it weighs each word's importance. Then it measures how close the two answers are, like checking the angle between two arrows.
If they point in the same direction, the computer assumes the student understood the topic, even if the wording is different. This method is known as cosine similarity, and it works well because it doesn't get tripped up by differences in sentence length or grammar.
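To make this concrete, here is a minimal sketch of the idea in Python using scikit-learn. The model answer, student answer, and pass threshold are invented for illustration; the study's actual preprocessing and scoring rules are not reproduced here.

    # Minimal TF-IDF + cosine-similarity sketch (scikit-learn).
    # The answers and the 0.5 threshold are hypothetical examples,
    # not values from the study.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    model_answer = "NLP helps computers understand and process human language."
    student_answer = "Natural language processing lets machines process people's language."

    # Build one shared vocabulary, then turn each answer into a TF-IDF vector.
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform([model_answer, student_answer])

    # Cosine similarity: 1.0 means the "arrows" point the same way, 0.0 means unrelated.
    score = cosine_similarity(vectors[0], vectors[1])[0, 0]
    print(f"Similarity: {score:.2f}")
    print("Likely understood the topic" if score > 0.5 else "Needs review")

Because cosine similarity compares direction rather than magnitude, a long-winded answer and a terse one that use the same important words score alike, which is why differences in sentence length matter so little here.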
Real-World Results
Over 100,000 exam answers were processed using this system and compared to grades from Moodle, a popular online learning platform. The results were promising: 75% precision, 71% recall, and 73.5% overall accuracy. More impressively, students themselves trusted the results; 66% said the scores were accurate for short answers, while over 50% were satisfied with the grading of complete essays.
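For readers unfamiliar with those three metrics, the sketch below shows the standard definitions behind the reported figures. The underlying counts of matching and mismatching marks are not published in this post, so the inputs are placeholders.

    # Standard definitions of the three reported metrics. The tp/fp/fn/tn
    # counts are placeholders, not figures from the study.
    def precision(tp, fp):
        # Of the answers the system marked as correct, how many truly were?
        return tp / (tp + fp)

    def recall(tp, fn):
        # Of the truly correct answers, how many did the system mark correct?
        return tp / (tp + fn)

    def accuracy(tp, tn, fp, fn):
        # Overall share of answers where the system agreed with the human marker.
        return (tp + tn) / (tp + tn + fp + fn)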
The system didn't just work in theory; it was deployed during real exams with postgraduate and undergraduate ICT students. Even when students found it a bit tricky to type long answers online, their confidence in the marking process remained high.
The Road Ahead
Of course, it's not perfect. Computers still struggle with deep meaning and context, especially when students use different phrases with similar meanings. Words like "build" and "construct" might confuse a basic algorithm, but the Tanzanian system begins to bridge this gap by generating synonyms in advance.
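The post does not say which synonym source the system uses; one common approach, shown here purely as an assumption, is to expand key terms with WordNet via NLTK.

    # Hypothetical synonym expansion using WordNet (NLTK). Whether the study
    # used WordNet specifically is an assumption made for illustration.
    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.corpus import wordnet

    synonyms = {
        lemma.name().replace("_", " ")
        for synset in wordnet.synsets("build")
        for lemma in synset.lemmas()
    }
    print(sorted(synonyms))  # includes "construct", so the two words can match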
Researchers suggest future versions will understand semantics even better, using techniques such as lemmatization (reducing words to their base form) and deeper synonym libraries.
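As a quick illustration of lemmatization, again using NLTK as an assumed toolkit rather than necessarily the study's:

    # Lemmatization: reduce inflected forms to a shared base form so that
    # "built", "building", and "builds" can all match "build".
    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    for word in ["built", "building", "builds"]:
        print(word, "->", lemmatizer.lemmatize(word, pos="v"))  # all print "build"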
What This Means for the World
As more universities worldwide shift to online learning, especially in developing countries, systems like this could make quality education more accessible. Teachers could focus on mentoring and content development rather than spending hours marking.
It's not about replacing educators but empowering them with smarter tools. When the marking is fair, fast, and trusted, everyone wins.
Additional Reading
Automatic marking of descriptive questions of online examinations using NLP