Almost a quarter of the country’s schools are testing
‘thinking’ technology designed to assess everything from an essay’s style and
structure to its logic, and to remove human error from marking
One in every four schools in China is quietly
testing a powerful machine that uses artificial intelligence to mark pupils’
work, according to scientists involved in the government programme. The
technology is designed to understand the general logic and meaning of the text
and make a reasonable, human-like judgment about the essay’s overall quality. It
then grades the work, adding recommended improvements in areas such as writing
style, structure and theme.
The technology, which is being used in around
60,000 schools, is supposed to “think” more deeply and do more than a standard
spellchecker. For instance, if a paragraph starts trailing off topic, the
computer would mark it down. It could help to reduce the amount of time
teachers spend on grading essays and help them avoid inconsistencies caused by
human errors such as lapses in attention or unconscious bias.
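The report does not say how the system spots a paragraph that drifts off topic. A common, much simpler stand-in for the same idea is to compare each paragraph with the essay prompt and flag those with very low lexical overlap; the sketch below does this with TF-IDF vectors and cosine similarity from scikit-learn. The threshold, function name and example texts are illustrative assumptions, not details from the project.

```python
# A minimal sketch of one way a grader might flag a paragraph that drifts
# off topic. This is not the project's actual method: it simply compares
# each paragraph to the prompt using TF-IDF vectors and cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def flag_off_topic_paragraphs(prompt, paragraphs, threshold=0.1):
    """Return indices of paragraphs whose similarity to the prompt is low."""
    vectorizer = TfidfVectorizer()
    # Fit on the prompt plus all paragraphs so they share one vocabulary.
    matrix = vectorizer.fit_transform([prompt] + paragraphs)
    prompt_vec, para_vecs = matrix[0], matrix[1:]
    similarities = cosine_similarity(para_vecs, prompt_vec).ravel()
    return [i for i, score in enumerate(similarities) if score < threshold]

# Example: the second paragraph wanders away from the prompt and is flagged.
essay = [
    "Recycling reduces waste and protects the environment in cities.",
    "My favourite football team won the championship last year.",
]
print(flag_off_topic_paragraphs("Why recycling matters for the environment", essay))
```

A production grader would rely on far richer semantic models than word overlap, but the flag-and-threshold structure of such a check is the same.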
It could also help more students, especially
those in remote areas with limited access to resources, improve their writing
skills more quickly. The machine is similar to the e-rater, an automated system
used by the Educational Testing Service in the US to grade prospective
postgraduate students’ essays.
But unlike the e-rater, it can read both
Chinese and English. Artificial intelligence is developing rapidly in China
with strong support from the government, and the technology is used in many
areas of everyday life. But the extensive tests of the essay grading machine –
built by some of the leading language processing teams involved in the
government and military’s internet surveillance programme – were carried out
with unusual security measures in place.
In most of the schools taking part in the
programme, parents were not informed, access to the system terminals was
limited to authorised staff, test results were strictly classified, and in some
classes even the pupils were unaware that their work had been read and scored
by a machine. Wang Jing, director of the academic affairs office at the High School
Affiliated to Renmin University, one of the country’s most prestigious schools,
said: “We are treating [the test] with extreme caution.
“What happens on campus stays on campus. The
test results will not be revealed to the public,” he added, in line with the
school’s agreement with the project organisers. Most schools interviewed by
the South China Morning Post – including the Baita
Middle School in Nanchang, Sichuan province; the Fifth High School in Fuyang,
Anhui and the 58th High School in Qingdao, Shandong – gave a similar
assessment.
The schools said the AI grading machine was
far from perfect, with teachers citing many examples where a brilliant piece of
writing was given low marks. The software is currently being used only to mark
internal tests, and none of the schools has plans to use the technology to grade
essays in exams that would affect pupils’ official academic records. “It’s still
in its infancy,” Wang said. But the developers say the machine is already 10
years old and they are increasingly confident about its potential.
A scientist involved in the project at the
school of computer science and engineering at Beihang University in Beijing
compared it to AlphaGo, the Google-developed AI Go player that has
defeated human world champions over the past couple of years. The essay grading
machine, embedded in a cluster of fast computers in Beijing, is improving its
ability to understand human language by using deep learning algorithms to
plough through essays written by Chinese students and “compare notes” with
human teachers’ grading and comments.
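The article gives no technical detail beyond “deep learning” and comparing notes with teachers’ marks, which describes an ordinary supervised set-up: essays as the input, teacher-assigned scores as the training signal. Purely as an illustration of that set-up, and not the Beihang team’s actual model, here is a minimal scoring pipeline using TF-IDF features and ridge regression from scikit-learn; the essays and scores are invented.

```python
# Illustrative only: a toy supervised essay scorer that learns from
# teacher-assigned marks, standing in for the far larger deep-learning
# system described in the article.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training data: essays paired with teachers' scores.
essays = [
    "Recycling protects the environment because it reduces landfill waste...",
    "I like football. Football is fun. The end.",
    "A well structured argument weighs costs against benefits and concludes...",
]
teacher_scores = [82.0, 45.0, 90.0]

model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
model.fit(essays, teacher_scores)   # "compare notes" with human grading
predicted = model.predict(["Sorting rubbish at home cuts waste and helps the environment."])
print(round(float(predicted[0]), 1))  # the machine's estimate of a score
```

A real system would replace the linear model with deep networks trained on millions of graded essays, but the learning signal, agreement with human markers, is the same.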
It is also able to collect and build its own
“knowledge base” with little or no human intervention. “It has evolved
continuously and become so complex, we no longer know for sure what it was
thinking and how it made a judgment,” said the researcher, who requested not to
be named due to the sensitivity of the project.
According to a government document seen by the
South China Morning Post, the tests involved 60,000 schools and more than
120 million people. The AI and human graders
gave the same score 92 per cent of the time, but the document did not specify
the content and scale of the tests. The researcher confirmed the figures but
declined to reveal more details.
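The 92 per cent figure is an exact-agreement rate between machine and human scores. For clarity, this is how such a rate is computed from paired scores; the numbers below are made up for illustration, not taken from the government document.

```python
# Toy example of computing an exact-agreement rate between machine and
# human scores for the same essays (figures invented for illustration).
machine_scores = [85, 72, 90, 64, 78]
human_scores = [85, 70, 90, 64, 78]

matches = sum(m == h for m, h in zip(machine_scores, human_scores))
agreement = matches / len(machine_scores)
print(f"Exact agreement: {agreement:.0%}")  # 80% for this toy sample
```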
“In the future it may be used to relieve the
teacher’s burden but it will never replace teachers. The machine has no soul,”
he added. The essay grading machine project was led by Professor Zhou Jianshe,
director of the research centre for language intelligence in China at Capital
Normal University. Zhou and other senior members of the project have received
government and military awards for their contributions to natural language
processing and mining information from big data.
Zhou could not be reached for comment. The
machine can be accessed through various online portals, but these are only open to
registered users. One English portal, pigai.org, requires a user to register
either as a teacher or student and provide information, such as school name and
class number, to verify their identity. Users gave a similarly mixed response
to the machines. While some said they were useful and more accurate than
similar essay grading systems overseas, others described them as rubbish.
Some users have argued the software cannot
distinguish between academic essays and other forms of writing. One user on
Zhihu, the largest question-and-answer website in China, posted a screenshot
showing how the machine had assessed an April 2015 Washington Post comment piece “Why is Obama sticking it to stay-at-home moms?” as if
it were answering an essay question.
The piece got a score of 71.5 out of 100 and
the machine said that while the vocabulary used was “rich and appropriate” it
was “slightly short for academic language”. It concluded: “The flow can be
improved on smoothness; and please improve the focus of the article; the
paragraphs and sentences should be related to the topic.” Zhu Xiaoyan, head of
the state key laboratory of intelligent technology and systems at Tsinghua
University, said AI technology for human language had made significant progress
in recent years.
She said some machines had written articles
that went viral on social media, attracting more than 10 million views, but did
not provide further details. Zhu had not heard of the essay-grading programme,
however, and said she would not use a machine to grade her students’ papers.
“It’s a human job,” she said. Yu Yafeng, a professor at the
institute of educational theories at Beijing Normal University, said computers
could help grade candidates in subjects such as mathematics and physics because
the answers were objective.
But essays, she said, could contain cultural, emotional or
personal elements that a machine would not be able to gauge. “There is no law
forbidding AI from grading student essays, but this practice should raise
ethical questions,” she said. An eight-year-old primary school pupil in
Chaoyang district, Beijing, said he did not mind an AI machine checking his
essays, pointing out that his teachers already used readily available
technology to check the answers to basic maths questions. “Our teachers are
already using mobile phones to grade our maths homework,” he said. “Take a
photo and the score is out. What’s the difference to an essay?”