Computational Linguistics: Giving Computers a Voice
Jarvis, you there?
At your service, sir.
Ever since the first computer was invented, people have wanted the ability to communicate with a machine as they would with another person. Jarvis, Iron Man’s main virtual assistant, is a great example of a computer that’s able to communicate effectively with humans. While we might not have virtual assistants as capable as Jarvis yet, the field of Computational Linguistics is bringing the world closer and closer to making programs like Jarvis a reality.
What Is Computational Linguistics?
Computational Linguistics is closely related to, and often used interchangeably with, the terms “speech and language processing”, “human language technology”, “natural language processing”, and “speech recognition and synthesis”. It uses properties of words and sentences to create computer algorithms and programs that can decipher text or spoken words and extract meaning, which can then be used to complete other actions. Essentially, Computational Linguistics is the science behind making computers understand language.
One of the earliest examples of Computational Linguistics in practice is ELIZA, a program created in 1966. ELIZA was an early natural language processing system that could carry out limited conversations with users (see image below). It used simple pattern matching to process inputs and come up with a reply. However, ELIZA had a limited set of responses and could only sustain certain kinds of conversations effectively.
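To make the idea of simple pattern matching concrete, here is a minimal sketch in Python. The rules, the reply templates, and the respond function are made up for illustration and are not ELIZA's actual script; the sketch only shows the general approach of matching an input against a pattern and reusing part of it in the reply.

```python
import re

# Hypothetical rules for illustration only (not ELIZA's real script).
# Each rule pairs a pattern with a reply template that reuses the captured text.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Used when no rule matches, which is why such a program runs out of things to say.
DEFAULT_REPLY = "Please go on."


def respond(user_input: str) -> str:
    """Return the reply for the first rule whose pattern matches the input."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(respond("I need a vacation."))
    print(respond("Nice weather today"))
```

Anything that falls outside the handful of rules gets the same fallback reply, which mirrors the limitation described above: the program only works well for the conversations its patterns anticipate.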
In the more than 50 years since ELIZA, natural language processing has improved drastically. In 2020, virtual assistants like the Google Assistant, Apple’s Siri, Microsoft’s Cortana, Samsung’s Bixby, and Amazon’s Alexa can be found in smartphones, computers, speakers, fridges, TVs, and more. These virtual assistants can understand commands and maintain communication in multiple languages. People can ask Siri to call their mom or ask Alexa to order Clorox Wipes. But at the moment, these programs are mostly limited to responding to commands and running through set procedures. This makes them great for online booking systems and smart home control. However, they are not yet at the level needed to replace a human in conversation. That’s the challenge that researchers in the field of Computational Linguistics seek to overcome.
How Does It Work?
Computational Linguistics sits at the intersection of Linguistics and Computer Science, and as such, uses principles from both fields together. The main challenge is figuring out how to decipher text or sound.
Text is easily captured and read using basic programming. Sound, on the other hand, has to be captured and converted into a form of data that can be processed using knowledge about phonetics and phonology (a future post will explain this process in greater detail). From there, the process of analyzing is essentially the same for text and sound.
Properties of words and sentences are used to create patterns that computers can match input against. For example, morphology is used to differentiate between a singular “pencil” and multiple “pencils”, and compositional semantics is used to define what “Northern Water Tribe” means in comparison to “Southern Water Tribe”. Similarly, lexical disambiguation helps determine the meaning of a word based on its context (“I completed the project” vs. “I project that our income will increase”).
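Two of these ideas can be sketched with a few lines of Python. Everything here is a toy assumption for illustration: the singular_form and project_sense functions, the rule that a regular plural ends in "s", and the short lists of cue words are all invented for this example, and real systems rely on far richer rules and statistics.

```python
def singular_form(noun: str) -> str:
    """Strip a regular plural "-s" ending (toy rule; ignores irregular nouns)."""
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def is_plural(noun: str) -> bool:
    """Treat a noun as plural if the toy rule above changes it."""
    return singular_form(noun) != noun


# Hypothetical cue words: the word right before "project" hints at its role.
NOUN_CUES = {"the", "a", "this", "my", "our"}
VERB_CUES = {"i", "we", "they", "you"}


def project_sense(sentence: str) -> str:
    """Guess whether "project" is a noun or a verb from the preceding word."""
    words = [w.strip('.,!?"').lower() for w in sentence.split()]
    if "project" not in words:
        return "absent"
    index = words.index("project")
    previous = words[index - 1] if index > 0 else ""
    if previous in NOUN_CUES:
        return "noun"
    if previous in VERB_CUES:
        return "verb"
    return "unknown"


if __name__ == "__main__":
    print(is_plural("pencil"), is_plural("pencils"))
    print(project_sense("I completed the project"))
    print(project_sense("I project that our income will increase"))
```

The first function is a tiny piece of morphology, and the second is a very rough form of lexical disambiguation that looks only one word back for context.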
Using the rules and patterns that words and sentences follow, mathematicians can create state machines, probabilistic models, and vector-space models. Programmers can then pair these models with search algorithms such as depth-first search and A* search to compare inputs against the patterns and work out the meaning behind text or sounds (a future post will explain this process in greater detail).
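As one illustration of a vector-space model, the sketch below turns each phrase into a vector of word counts and compares vectors with cosine similarity. The function names and the list of known commands are assumptions made up for this example, and counting words is only the simplest way to build such vectors.

```python
import math
from collections import Counter


def to_vector(text: str) -> Counter:
    """Represent a phrase as a bag-of-words vector of word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query: str, patterns: list[str]) -> str:
    """Return the known pattern whose vector is closest to the query's."""
    query_vector = to_vector(query)
    return max(patterns, key=lambda p: cosine_similarity(query_vector, to_vector(p)))


if __name__ == "__main__":
    # Hypothetical commands an assistant might know about.
    known_commands = ["call my mom", "order cleaning wipes", "turn on the lights"]
    print(best_match("please call mom", known_commands))
```

Here the input is compared with every known pattern and the closest one wins. With many more patterns, a search algorithm would be needed to avoid checking each one in turn.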
Computational Linguists help advance this process and work with experts in many fields to continue the research and development of language processing systems.
Where Are We Now?
Along with making communication with computers more feasible, Computational Linguistics has many other applications. Data analysts are using natural language processing and Computational Linguistics to extract and research data from thousands of documents, social media posts, emails, and texts. Healthcare companies are using Computational Linguistics to analyze CT, MR, and X-ray reports and identify lumps. Historians are using AI and Computational Linguistics to recreate historical smells.
As the field of Computational Linguistics develops, its applications will continue to broaden and make a greater impact on the world we live in.
References:
Henderson, Harry. “Linguistics and Computing.” Encyclopedia of Computer Science and Technology, Third Edition, Facts On File, 2017. Science Online, online.infobase.com/Auth/Index?aid=17972&itemid=WE40&articleId=285507. Accessed 17 Nov. 2020.
Jurafsky, Dan, and James H. Martin. Speech and Language Processing: an Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Dorling Kindersley Pvt, Ltd., 2014.