The Future of Speech-Based Human-Computer Interaction
Reporter: Ethan Coomber, Research Assistant III
2021 LPBI Summer Internship in Data Science and Podcast Library Development
This article reports on research conducted at the Tokyo Institute of Technology and published on 9 June 2021.
As technology continues to advance, the human-computer relationship develops alongside it. As researchers and developers find new ways to improve a computer’s ability to recognize the distinct pitches that compose a human voice, technology pushes back the boundaries of what was previously thought possible. This constant improvement has also revealed new challenges in voice-based interaction with technology.
When humans interact with one another, we do not convey our message with our voices alone. Many of the complexities of our emotional states and personality cannot be captured simply through the sound coming out of our mouths. Aspects of our communication such as rhythm, tone, and pitch are essential to our understanding of one another. This presents a challenge for artificial intelligence, as current technology is not able to pick up on these cues.

In the modern day, our interactions with voice-based devices and services continue to increase. In this light, researchers at Tokyo Institute of Technology and RIKEN, Japan, have performed a meta-synthesis to understand how we perceive and interact with the voice (and the body) of various machines. Their findings have generated insights into human preferences, and can be used by engineers and designers to develop future vocal technologies.
– Kate Seaborn
While it will always be difficult for technology to perfectly replicate human interaction, the inclusion of filler terms such as “I mean…”, “um”, and “like…” has been shown to improve people’s interaction with, and comfort when communicating with, technology. Humans prefer communicating with agents that match their personality and overall communication style. The illusion of making the artificial intelligence appear human has a dramatic effect on the overall comfort of the person interacting with it. Communication also improves when the artificial intelligence comes across as happy or empathetic and speaks with a higher-pitched voice.
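As a rough illustration of how a designer might apply the finding about filler terms, the sketch below occasionally prepends a conversational filler to a voice agent’s response before it is sent to text-to-speech. This is a hypothetical example, not part of the study; the filler list, probability, and function names are assumptions for illustration only.

```python
# Hypothetical sketch (not from the study): adding conversational fillers to a
# voice agent's response so it sounds less rigid. The FILLERS list and the
# add_fillers() helper are illustrative assumptions, not a published method.
import random

FILLERS = ["um,", "I mean,", "like,"]

def add_fillers(response: str, probability: float = 0.3) -> str:
    """Prepend a filler term to the response with the given probability."""
    if random.random() < probability:
        return f"{random.choice(FILLERS)} {response}"
    return response

if __name__ == "__main__":
    random.seed(7)  # fixed seed so the example is reproducible
    print(add_fillers("the nearest station is two blocks away"))
```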
Using machine learning, computers are able to recognize patterns within human speech rather than requiring explicit programming for each specific pattern. This allows the technology to adapt to human tendencies as it continues to encounter them. Over time, humans develop nuances in the way they speak and communicate, which frequently results in a tendency to shorten certain words. One of the more common examples is the expression “I don’t know”, which is often reduced to “dunno”. Using machine learning, computers can learn to recognize this pattern and infer the speaker’s intention.
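To make the idea concrete, here is a minimal sketch of normalizing colloquial reductions such as “dunno” in a transcript before an intent step runs. It is an assumption-laden toy: the REDUCTIONS mapping and the recognize_intent() lookup stand in for the learned models a real speech system would use.

```python
# Minimal illustrative sketch (not from the study): expanding common spoken
# reductions so downstream components see canonical forms. The mapping and the
# toy intent matcher are hypothetical placeholders for learned models.

REDUCTIONS = {
    "dunno": "do not know",
    "gonna": "going to",
    "wanna": "want to",
    "gotta": "got to",
}

def normalize_transcript(text: str) -> str:
    """Expand reductions token by token into their canonical forms."""
    tokens = text.lower().split()
    return " ".join(REDUCTIONS.get(tok, tok) for tok in tokens)

def recognize_intent(text: str) -> str:
    """Toy lookup standing in for a trained intent classifier."""
    return "uncertainty" if "do not know" in text else "other"

if __name__ == "__main__":
    utterance = "i dunno what you mean"
    cleaned = normalize_transcript(utterance)
    print(cleaned)                    # "i do not know what you mean"
    print(recognize_intent(cleaned))  # "uncertainty"
```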
With advances in technology and the growing presence of voice assistants in our lives, we are expanding our interactions to include computer interfaces and environments. While many advances are still needed to reach the desired level of communication, developers have identified the steps necessary to achieve more natural human-computer interaction.
Sources:
Tokyo Institute of Technology. “The role of computer voice in the future of speech-based human-computer interaction.” ScienceDaily. ScienceDaily, 9 June 2021.
Rev. “Speech Recognition Trends to Watch in 2021 and Beyond: Responsible AI.” Rev, 2 June 2021, http://www.rev.com/blog/artificial-intelligence-machine-learning-speech-recognition.
“The Role of Computer Voice in the Future of Speech-Based Human-Computer Interaction.” EurekAlert!, 1 June 2021, http://www.eurekalert.org/pub_releases/2021-06/tiot-tro060121.php.
Other related articles published in this Open Access Online Scientific Journal include the following:
Deep Medicine: How Artificial Intelligence Can Make Health Care Human Again
Reporter: Aviva Lev-Ari, PhD, RN
https://pharmaceuticalintelligence.com/2020/11/11/deep-medicine-how-artificial-intelligence-can-make-health-care-human-again/
Supporting the elderly: A caring robot with ‘emotions’ and memory
Reporter: Aviva Lev-Ari, PhD, RN
https://pharmaceuticalintelligence.com/2015/02/10/supporting-the-elderly-a-caring-robot-with-emotions-and-memory/
Developing Deep Learning Models (DL) for Classifying Emotions through Brainwaves
Reporter: Abhisar Anand, Research Assistant I
https://pharmaceuticalintelligence.com/2021/06/22/developing-deep-learning-models-dl-for-classifying-emotions-through-brainwaves/
Evolution of the Human Cell Genome Biology Field of Gene Expression, Gene Regulation, Gene Regulatory Networks and Application of Machine Learning Algorithms in Large-Scale Biological Data Analysis
Reporter: Aviva Lev-Ari, PhD, RN
https://pharmaceuticalintelligence.com/2019/12/08/evolution-of-the-human-cell-genome-biology-field-of-gene-expression-gene-regulation-gene-regulatory-networks-and-application-of-machine-learning-algorithms-in-large-scale-biological-data-analysis/
The Human Genome Project
Reporter: Larry H Bernstein, MD, FCAP, Curator
https://pharmaceuticalintelligence.com/2015/09/09/the-human-genome-project/