Does AI need to be magical?
craft ai team
Jul 12, 2016
[Image caption: You often see this kind of picture in articles about AI]
When AlphaGo defeated Go player Lee Sedol, many commentators were quick to argue that it was not “AI”. Yet regardless of whether programs can ever be made truly intelligent, the field of software engineering known as AI is thriving, with very practical applications like web ads, chatbots and self-driving cars. So you might think that among its practitioners, philosophical debates about what is or is not AI would take a back seat to results and algorithms. Well, not exactly.
The distinction between weak and strong artificial intelligence is generally accepted, and in the academic and industrial field called AI most would willingly agree that they’re not making a strong AI. But even among users of weak AI techniques, we keep hearing conversations about which methods and algorithms truly deserve to be called AI. Even among developers, there is a sense that an algorithm has to be inscrutable or based on complex math for it to deserve the moniker. And by extension, to be “good”. But do you need to look nerdy to be smart?
The AI effect
We don’t understand intelligence. The details of how our own minds work still elude us. And at the same time, we find it hard to believe that something as mysterious and versatile as intelligence could be as straightforward as an automated process. So whenever the realm of what computers can do expands to include something that was previously done only by humans, technical-minded observers scramble to discount the advance as not truly AI. This is the AI effect.
In this way AI is very similar to magic. Magic is another word for something inexplicable, a phenomenon that we cannot understand or break down into reproducible bits. If someone makes a house levitate without using any visible physical device, it’s magic, but if we see cables attached to it that’s just boring physics… even if the result is the same. If a chatbot accurately detects a user’s mood, it’s advanced AI that understands human emotions, but if we realize it did that from simple rules based on clever markers in the user’s speech, it’s just plain old basic programming.
The interesting flip side of this effect is the notion that for something to be intelligent, it cannot be “just” a computer program. And this idea seems to be widespread even among developers and tech companies: if you want to add some top-notch intelligence to your program, you need to use a “true” AI technique, like deep learning.
AI == deep learning
The undeniable success of deep learning in making some cool applications work has given it the aura of a magic bullet, a key to intelligence. And the inscrutability of the algorithm has much to do with this aura: as some other methods are now commonplace and less mysterious, deep learning is at the frontier of AI. Not only is it the main category of AI algorithms that people talk about, but also the one that they have in mind when they want to analyze complex data but are not sure how. This seems to be true even among users of other machine learning techniques, who see their everyday analysis methods as mundane and imperfect number crunchers, compared to the inexplicable prowess of a real AI technique.
The boom of big data has spread the idea that whenever you have data, you should throw it all at a magical learning algorithm to see what comes out. Deep learning architectures are the latest manifestation of this mythical creature, which is supposed to uncover the secrets of data, whatever the data. But if you already know what you want your system to learn, why wait until it finds out on its own?
Cheap tricks and gamesmanship
[Image caption: Legendary data scientist]
Here at craft ai, part of the team used to work on game AI. While arguably very different from the AI you would want in a non-gaming application, this gave us a culture of using the tricks that do the job.
If you want a virtual opponent to find a position from which to shoot at the player, you don’t need it to have a full-fledged understanding of ballistics. By considering only probable shooting positions that were annotated in advance, and using simple criteria to choose among them, you can get a much better result than with a more generic solution that starts with fewer assumptions and considers the whole environment. Not only would it be immensely more difficult to make the generic method work, but even then it would most likely just rediscover the same candidate positions and yield results similar to the simpler method, while being harder to tweak.
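To make the idea concrete, here is a minimal sketch of the "annotated positions plus simple criteria" approach. It is an illustration only, not code from an actual game: the `Spot` fields, the range limits and the scoring weights are all hypothetical, and a real game would compute line of sight itself instead of reading a flag.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Spot:
    """A shooting position annotated in advance by a level designer (hypothetical)."""
    name: str
    x: float
    y: float
    has_cover: bool
    sees_player: bool  # assumed to be refreshed elsewhere, e.g. by a raycast


def choose_spot(spots, npc_pos, player_pos, min_range=5.0, max_range=30.0):
    """Return the best annotated spot, or None if no spot is usable."""

    def in_range(spot):
        # Keep only spots at a sensible firing distance from the player.
        distance = hypot(spot.x - player_pos[0], spot.y - player_pos[1])
        return min_range <= distance <= max_range

    def score(spot):
        # Simple, tweakable criteria: prefer cover, penalize long walks.
        travel = hypot(spot.x - npc_pos[0], spot.y - npc_pos[1])
        return (2.0 if spot.has_cover else 0.0) - 0.1 * travel

    candidates = [s for s in spots if s.sees_player and in_range(s)]
    return max(candidates, key=score, default=None)
```

Every decision here can be traced to one filter or one term of the score, which is what makes this kind of solution easy to tune when the opponent behaves oddly.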
For a virtual character to act autonomously in a given environment, you want it to use as many assumptions about the environment as possible to reduce the complexity of what it has to process, first and foremost because that is how you get it to work at all. The same principle applies to applications where you would use machine learning: if you already know which parts of your data matter for your application, you can use methods targeted at those parts directly, instead of a generic solution that you then have to steer in the direction you already knew you wanted it to take.
We talk to a lot of people who are interested in using their data in a smarter way, especially to personalize their application to each of their users. And sometimes they really do need a full-fledged machine learning solution to sift through the whole thing and discover the relevant features and the complex correlations within. But more often than not, for their needs and the relatively straightforward data they have, they can get a satisfactory and much more reliable result by analyzing the parts of their data and the features that they already know to be the ones that matter for their business, with a more intelligible algorithm.
In those cases they could use something like deep learning, but the main drawback of this approach is how much of an opaque black box it is, which is precisely the reason why it is seen as real AI.
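As a hedged illustration of the personalization case above, suppose you already know that the one feature that matters is the hour at which each user usually opens your app. The sketch below picks a per-user notification hour by simple counting. The function name, the input format and the default hour are assumptions made for this example, not a description of any particular product.

```python
from collections import Counter


def preferred_hour(open_hours, default=9):
    """Return the hour (0-23) at which a user most often opened the app.

    `open_hours` is an iterable of integer hours taken from that user's
    history. Falls back to `default` when there is no history yet.
    """
    counts = Counter(open_hours)
    if not counts:
        return default
    # Highest count wins; ties go to the earliest hour so the result is stable.
    hour, _ = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return hour
```

If a user complains about a badly timed notification, you can look at their counts and see exactly why that hour was chosen, which is much harder to do with a black box.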
AI is software
At least for now, AI is made with software. The problem of making AI work is related to the problem of making software work. So just like any piece of software, you want to be able to efficiently develop, debug and maintain your AI. This can be made much easier if your program is simple and comprehensible.
Working with a more targeted algorithm reduces the complexity of what the program has to handle correctly, and with it the number of things that can cause bugs or unsatisfactory results. It can also make execution much faster. But the main argument against solutions that have an air of mystery, like deep learning, is how hard they can be to tweak and debug. If your predictions are subtly unsatisfactory, it can be very difficult to figure out how to improve them. Conversely, the better you understand how an algorithm works and the more information you have about how it produces its results, the easier it is to improve your program.
The culture of using a solution that will be easier to maintain is common in software development in general, yet for AI it seems that the cover is sometimes as important as the book. The tendency to discount an algorithm as simplistic if it becomes understood can lead to looking for a solution that seems magical and full of promises, but the very reason why it looks smart can make it work against you.
As long as it does the trick, the fact that you understand how your data analysis solution works is not such a bad thing. And if we’re calling deep learning AI, then we might as well call that AI too.