PITTSBURGH (Nov. 6, 2019) — Technology that translates
spoken languages in real time is a boon to travelers and the hearing impaired
alike. But what about a language that isn’t spoken?
That problem inspired a team of students from the University
of Pittsburgh’s Swanson School of Engineering to create
a program that translates American Sign Language (ASL) to voice using
machine learning. The project recently won the Bronze Award at the InnovateFPGA
2019 Global Contest Regional Final.
The sign language reader uses a camera and AI to identify
the hand gestures used in ASL and translate them into sentences, which would
benefit the hundreds of thousands of people in the U.S. who rely on ASL to
communicate. The program could run on a smartphone, for example.
“The idea for an ASL translator was formed when our team was
researching what kind of embedded AI applications can improve the experience of
communication among different groups of people,” says Haihui Zhu, a computer
engineering student and member of the team. Zhu notes that the
Americans with Disabilities Act requires places like hospitals and other public
services to provide human ASL interpreters. “Now imagine software that
translates ASL into English, deployed on a smartphone and executed in real
time on an FPGA hardware accelerator. We believe that such a solution can
improve the service of public facilities.”
In addition to being a useful tool for the hard of hearing, the program was
designed with scalability as a key feature.
“I think the biggest challenge in this project was to design
a fast and scalable machine-learning pipeline. On the input side, it is the
video stream from the camera. On the output side, it is the English text,”
explains Zhu. “To solve this problem, our strategy was to divide it into
multiple stages: hand detection, hand keypoint detection, keypoint-to-alphabet,
and finally, construct lexicons from the alphabet stream. To add a new sign to
the ‘vocabulary,’ we just need to encode the hand motion of that sign.”
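To make the staged design concrete, here is a minimal Python sketch of such a pipeline. It is not the team's code: the article does not describe their models, so every stage body, name, and template below is a hypothetical stand-in that only illustrates the flow of data from frames to words, including how adding a sign reduces to encoding one new template.

```python
import numpy as np

# Hypothetical letter "templates": each letter is encoded as a few normalized
# hand-keypoint positions. In Zhu's description, adding a new sign to the
# vocabulary amounts to encoding its hand motion, i.e. adding an entry here.
LETTER_TEMPLATES = {
    "A": np.array([[0.5, 0.9], [0.5, 0.6], [0.5, 0.4]]),
    "B": np.array([[0.5, 0.9], [0.5, 0.5], [0.5, 0.1]]),
}

def detect_hand(frame):
    """Stage 1: locate and crop the hand in a video frame (stand-in: identity)."""
    return frame

def detect_keypoints(hand_crop):
    """Stage 2: estimate hand keypoints from the crop (stand-in: fixed points)."""
    return np.array([[0.5, 0.9], [0.5, 0.6], [0.5, 0.4]])

def keypoints_to_letter(keypoints):
    """Stage 3: classify keypoints as the nearest letter template."""
    return min(LETTER_TEMPLATES,
               key=lambda letter: np.linalg.norm(LETTER_TEMPLATES[letter] - keypoints))

def letters_to_words(letters, word_breaks):
    """Stage 4: assemble the letter stream into words; a break flag ends a word."""
    words, current = [], []
    for letter, is_break in zip(letters, word_breaks):
        current.append(letter)
        if is_break:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

# Run the stages end to end over a toy "video" of three frames.
frames = [object(), object(), object()]  # placeholder frames
letters = [keypoints_to_letter(detect_keypoints(detect_hand(f))) for f in frames]
print(letters_to_words(letters, word_breaks=[False, False, True]))  # ['AAA']
```

In a real system the first two stages would be learned models (a hand detector and a keypoint network) whose inference the FPGA would accelerate; only the dataflow between the stages is shown here.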
The team included Zhu, Christopher Pasquinelli, and Roman
Hamilton, all undergraduates in computer engineering at Pitt. Though the
competition is over, Zhu says they plan to continue their work.
“There are several challenges that have not been solved yet:
one, our solution only looks at hand motion, but to truly understand sign
language, we must look at facial expressions and hand motion
simultaneously; and two, we want to improve the performance of the machine
learning model. There are lots of exciting research and development tasks that
we can continue to work on.”
Students interested in machine learning, speech processing, or the project
itself are welcome to join the team.
The InnovateFPGA 2019 Global Contest invited students, professors, makers, and
industry to showcase their ideas of how field-programmable gate arrays (FPGAs)
can be used to develop cutting-edge smart devices.
The team’s Bronze Award in the Regional Finals includes a
certificate, a cash award of $800, and the Max 10 Plus FPGA main board.
Maggie Pavlick, 11/6/2019
Contact: Maggie Pavlick