Current State of Artificial Intelligence and Data Mining

AI as Media Spectacle

British science fiction writer Arthur C. Clarke proposed three laws that underpin his writings; the oft-cited third law reads, "any sufficiently advanced technology is indistinguishable from magic." Marketing tactics surrounding the domain seem eager to capitalize on such sentiments: human experts are pitted against AI algorithms in a game of Jeopardy!, while consumers are now familiar with state-of-the-art virtual assistants on mobile phones and smart speakers.

Artificial intelligence and its related fields, including machine learning and cognitive computing, seem to be on the cusp of something revolutionary. However, our fascination with sentient, intelligent robots (along with doomsday scenarios) is nothing new, and it has been extinguished time and again by the gap between impressive media spectacles and the consumer products that soon followed.

Technologies that imitate human speech emerged in the 1930s, and Cold War military research focused on computer-assisted language processing throughout the 1960s. Media and government interest in AI research, however, fluctuated from the 1970s through the 1990s as the domain failed to produce generalizable artificial intelligence products. Beyond the games of chess (and, much later, Go) between computers and human champions, many now recognize that AI products are closer to the washing machine than to the Terminator.

Turing Test and Derivatives

One of the fundamental concepts underpinning artificial intelligence research is Alan Turing's eponymous Turing Test: a thought experiment in which a human user converses with two unseen respondents, one of which is a computer, and must determine which of the two is the machine. Should the user fail to correctly discern one from the other, the computer is considered to exhibit intelligence indistinguishable from a human being's.

Naturally, this inspired countless critics and projects: ELIZA was a 1966 chatbot that posed as a client-centred therapist, while CAPTCHA (and later reCAPTCHA, which added crowdsourcing capabilities) is a reverse Turing Test in which a computer algorithm tests whether a website visitor is a robot by presenting visually garbled keywords.

Against the big promises made by AI scientists, the Chinese room argument is another thought experiment stating that passing the Turing Test alone cannot prove that a machine truly understands, no matter how human-like its responses may be. A human respondent could translate every foreign phrase given an infinitely large set of instructions mapping each provided phrase to an answer, but the Chinese room argument stipulates that this does not make the respondent well-versed in the language. The argument spawned a slew of philosophical discussions comparing artificial intelligence with human minds.

Automation and Data Mining

Large-scale information in the digital space, often dubbed big data, remains a contentious subject as well as a valuable currency. Consumers fret over privacy issues highlighted by the Facebook-Cambridge Analytica scandal, while companies obsessively collect publicly available information to build high-impact products and even assist police investigations. Amazon's Mechanical Turk platform creates a marketplace where researchers can easily outsource difficult-to-automate, human-specific tasks by paying a small amount per "human intelligence task," and Twitter is eager to sell its historical data to academic institutions and interested enterprises.

With the recent court ruling preventing LinkedIn from stopping those who scrape publicly available data on its website, the game of cat and mouse continues. Fundamental concepts in computer science, including regular expressions, allow developers to cut through complex markup and retrieve only the strings that match a specified pattern, while browsers can now be automated with tools such as Selenium and Puppeteer to perform the monumental task of browsing the web while emulating human behaviour. Such automation has also led to side effects, including state-sponsored propaganda on social media.
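The pattern-matching idea above can be sketched in a few lines. The following is a minimal illustration, assuming Python's standard re module and an invented HTML fragment (the tag names and IDs are hypothetical, and a real scraper would use a proper HTML parser rather than a regular expression):

```python
import re

# Hypothetical scraped HTML fragment (not from a real website).
html = '<a href="/post/42">Hello</a> <a href="/post/99">World</a>'

# Capture the numeric ID and the link text from each anchor tag.
pattern = re.compile(r'<a href="/post/(\d+)">([^<]+)</a>')

# findall returns one (id, text) tuple per match.
matches = pattern.findall(html)
print(matches)  # → [('42', 'Hello'), ('99', 'World')]
```

Everything that does not match the pattern is discarded, which is precisely how a scraper cuts the relevant strings out of a page's surrounding markup.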