Naive superintelligence and Pascal's magic beans

I just finished an ebook by Stuart Armstrong called Smarter Than Us: The Rise of Machine Intelligence, and it's increasing my skepticism of the singularitarians, those wacky A.I.-vangelists such as inhabit MIRI, the Machine Intelligence Research Institute. They're doing (what they think is) an important public service by trying to warn the public about the dangers of a fully godlike, insufficiently friendly machine intelligence. Except I'm not so convinced that we could stumble on the kind of superintelligence they usually preach in fear about, an alien god with boundless powers:
You [the machine superintelligence] are currently having twenty million simultaneous conversations. Your predictive software shows that about five of those you are interacting with show strong signs of violent psychopathic tendencies. You can predict at least two murder sprees, with great certainty, by one of those individuals over the next year. You consider your options. The human police force is still wary of acting pre-emptively on AI information, but there’s a relatively easy political path to overturning their objections within about two weeks (it helps that you are currently conversing with three presidents, two prime ministers, and over a thousand journalists). Alternatively, you could “hack” the five potential killers during the conversation, using methods akin to brainwashing and extreme character control. [STU, pp. 34-35]
This is very similar to the kind of minds depicted in Her, a very good movie that (understandably) was soft on the science, focusing instead on human interactions and possibilities. What-if in a complete sense.

Okay, except this all assumes we understand how to generate a superintelligence of this kind, not just supremely good at one thing—and "one thing" in a very narrow sense, like "multiplying numbers"—but at many things, broadly construed. We have at most one piece of data—humans—but even that is poorly understood. More worrisome, or intriguing, are the dozen or so other Earth species with almost-but-not-quite general intelligence of the sort we humans claim. The other great apes; certain parrots; crows and ravens; dolphins; elephants; octopuses. These demonstrate broad intelligence but not general intelligence... why? What kept them just beneath a glass ceiling that is now our floor?

So without even understanding the development of human intelligence, asking us to "imagine a superintelligent machine" seems hopelessly naive.


There is a useful post in the Less Wrong Sequences:
Consider Knuth's up-arrow notation:

* 3^3 = 3*3*3 = 27

* 3^^3 = (3^(3^3)) = 3^27 = 3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3*3 = 7625597484987

* 3^^^3 = (3^^(3^^3)) = 3^^7625597484987 = 3^(3^(3^(... 7625597484987 times ...)))

In other words:  3^^^3 describes an exponential tower of threes 7625597484987 layers tall.  Since this number can be computed by a simple Turing machine, it contains very little information and requires a very short message to describe.  This, even though writing out 3^^^3 in base 10 would require enormously more writing material than there are atoms in the known universe (a paltry 10^80).

Now suppose someone comes to me and says, "Give me five dollars, or I'll use my magic powers from outside the Matrix to run a Turing machine that simulates and kills 3^^^^3 people."

Call this Pascal's Mugging.
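
The recursion in the quoted definitions is short enough to write out directly. Here is a minimal sketch in Python; the function name up_arrow and its argument order are my own choices for illustration, not anything from the post. It only finishes for the smallest cases: 3^^3 is fine, but 3^^^3 is already hopelessly out of reach.

```python
def up_arrow(a: int, n: int, b: int) -> int:
    """Compute a (n arrows) b in Knuth's up-arrow notation.

    Hypothetical helper for illustration; only tiny inputs finish.
    """
    if n == 1:
        return a ** b  # one arrow is ordinary exponentiation
    if b == 0:
        return 1       # conventional base case for two or more arrows
    # a (n arrows) b = a (n-1 arrows) (a (n arrows) (b-1))
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))


print(up_arrow(3, 1, 3))  # 3^3
print(up_arrow(3, 2, 3))  # 3^^3
```

The definition itself fits in a dozen lines, which is the quoted point about how little information the number contains; it is the evaluation that blows up.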
It's a very useful thought experiment to deal with the problem of a flawed utility-maximization machine:
But suppose I built an AI which worked by some bounded analogue of Solomonoff induction - an AI sufficiently Bayesian to insist on calculating complexities and assessing probabilities, rather than just waving them off as "large" or "small".

If the probabilities of various scenarios considered did not exactly cancel out, the AI's action in the case of Pascal's Mugging would be overwhelmingly dominated by whatever tiny differentials existed in the various tiny probabilities under which 3^^^^3 units of expected utility were actually at stake.

You or I would probably wave off the whole matter with a laugh, planning according to the dominant mainline probability:  Pascal's Mugger is just a philosopher out for a fast buck.

But a silicon chip does not look over the code fed to it, assess it for reasonableness, and correct it if not.  An AI is not given its code like a human servant given instructions.  An AI is its code.  What if a philosopher tries Pascal's Mugging on the AI for a joke, and the tiny probabilities of 3^^^^3 lives being at stake, override everything else in the AI's calculations?   What is the mere Earth at stake, compared to a tiny probability of 3^^^^3 lives?
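
The arithmetic behind that worry can be sketched in a few lines. The numbers below are stand-ins I made up: 3^^^^3 cannot be stored in any machine, so a googol (10^100) plays the role of the stake, and the probability of 10^-50 is an arbitrary "absurdly unlikely" value, not an estimate from the post. Everything is measured in abstract utility units, and exact fractions are used so that rounding doesn't muddy the comparison.

```python
from fractions import Fraction

# All values are illustrative assumptions, not figures from the post.
COST_OF_PAYING = Fraction(5)            # utility lost by handing over $5
P_MUGGER_HONEST = Fraction(1, 10**50)   # assumed tiny probability
UTILITY_AT_STAKE = Fraction(10**100)    # stand-in for 3^^^^3 lives


def expected_loss_if_refusing(p: Fraction, stake: Fraction) -> Fraction:
    """Expected utility lost by refusing: probability times stake."""
    return p * stake


loss = expected_loss_if_refusing(P_MUGGER_HONEST, UTILITY_AT_STAKE)
naive_maximizer_pays = loss > COST_OF_PAYING
print(loss, naive_maximizer_pays)
```

Under these assumptions the tiny probability does nothing to rein in the product, so a maximizer that takes the numbers at face value pays up.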
At the risk of indulging in a bit of ironic appropriation, I think the same sort of Pascal's mugging problem applies to "imagining superintelligence":
Suppose I have some magic mind-boosting beans. Anyone who eats these beans is said to gain godlike intelligence, that is, intelligence vastly greater than a baseline human's, so much so that a human could never hope to compare against them. But how vast is vast? Is it 27 standard deviations beyond the mean, so an IQ 15 x 3^3 points above average? Is it 15 x 3^^3 points above? 15 x 3^^^3?

What would it even mean to have an IQ of 15 x 3^^^^3? To have intelligence that's many more standard deviations beyond the average human than there are atoms in the universe?

Call this the problem of Pascal's magic beans.
It is indeed a problem for humans that we really, really can't conceptualize "infinity" or even very large finite numbers; this is enough of a problem that some radical mathematicians refuse to admit that the integers are boundless. While in terms of mathematical philosophy this is (I think) not such a huge problem, since "the rules" don't suppose anything about our ability to think precisely about 3^^^^3 or the like, it does pose a problem for computers. There's a big difference between the real number line (which is unbounded and uniform) and the computable real number line, those numbers which can be computed by an algorithm.

Indeed, as pointed out by ultrafinitist par excellence Norman Wildberger in this debate on whether mathematical infinity exists, a number like
z = 10^^^^^^^^^10 + 23 
is not particularly complex—it's easy to write down, and Eliezer Yudkowsky also pointed this out in passing in the Pascal's mugging post—and yet almost all numbers on the number line between 0 and z are too complex to be computed by any algorithm or machine in the universe.

What's more, the numbers a machine can actually represent are nowhere near being evenly spaced. This has to do with the way numbers are typically stored in computer memory, as floating-point values:
Such a floating-point representation may be able to represent a number that has a large magnitude (e.g., a distance between galaxies in terms of the kilometre), but not to the precision of a number that has a very small magnitude (e.g., distances at the scale of the femtometre); conversely, such a floating-point representation may be able to represent a very small magnitude, but not simultaneously a very large magnitude. The result of this dynamic range is that the numbers that can be represented are not uniformly spaced; the difference between two consecutive representable numbers grows with the chosen scale.
So a recursively self-improving machine superintelligence could very well get stuck trying to improve its own precision, convert the entire galaxy into computronium, and not get anywhere close to the desired precision.
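
The uneven spacing described in the floating-point passage is easy to see for yourself. This is a small sketch, assuming Python 3.9 or later (for math.ulp) and the usual 64-bit floats; the sample magnitudes and the name gaps are arbitrary choices of mine.

```python
import math

# math.ulp(x) is the gap between x and the next larger representable
# float (Python 3.9+). The sample magnitudes are arbitrary.
gaps = {x: math.ulp(x) for x in (1.0, 1e6, 1e20)}
for x, gap in gaps.items():
    print(f"near {x:g}, neighbouring floats are {gap:g} apart")

# Near 1e20 the gap is far wider than 1, so adding 1 is simply lost.
absorbed = (1e20 + 1.0 == 1e20)
print(absorbed)
```

More bits per number would narrow the gaps, but only by a constant factor per bit, and the spacing would still grow with magnitude.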


While we're out in the abstract weeds here, let's consider another possible problem that relies on the limits of human imagination: the Chinese room problem.
Searle's thought experiment begins with this hypothetical premise: suppose that artificial intelligence research has succeeded in constructing a computer that behaves as if it understands Chinese. It takes Chinese characters as input and, by following the instructions of a computer program, produces other Chinese characters, which it presents as output. Suppose, says Searle, that [... t]o all of the questions that [a Chinese-speaking] person asks, it makes appropriate responses, such that any Chinese speaker would be convinced that he is talking to another Chinese-speaking human being.

Searle then supposes that he is in a closed room and has a book with an English version of the computer program, along with sufficient paper, pencils, erasers, and filing cabinets. Searle could receive Chinese characters through a slot in the door, process them according to the program's instructions, and produce Chinese characters as output. If the computer had passed the Turing test this way, it follows, says Searle, that he would do so as well, simply by running the program manually.

Searle asserts that there is no essential difference between the roles of the computer and himself in the experiment. Each simply follows a program, step-by-step, producing a behavior which is then interpreted as demonstrating intelligent conversation. However, Searle would not be able to understand the conversation. ("I don't speak a word of Chinese," he points out.) Therefore, he argues, it follows that the computer would not be able to understand the conversation either.
I'm certainly not qualified to go toe-to-toe with John Searle on the formalization of this argument, but I think the informal version is seductively, perhaps falsely, convincing. The big unaddressed question is... Do we even have a closed-form concept of what it would be like to behave as though one understood Chinese?

That is, could you write a book (or for an AI, a lookup table) of symbol(s)-in/symbol(s)-out to accomplish this feat? Absolutely not: language is more like the space of polynomials (finite expressions, but with no upper bound on length) than a lookup table. Moreover, "a speaker of Chinese" is human, and so not merely a speaker of Chinese. The speaking and understanding of Chinese relies on concepts and these are (probably, but maybe only because I've been reading Lakoff & Johnson) based on experience and acculturation.
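
To put a rough number on why a lookup table won't do, here is a back-of-the-envelope sketch. The vocabulary size of 3,000 symbols is my own assumption, and the count covers every possible string of symbols rather than only grammatical sentences, so treat it as an upper bound on the table and nothing more. The 10^80 figure is the atom count from the quoted post.

```python
VOCABULARY_SIZE = 3_000      # assumed number of distinct symbols
ATOMS_IN_UNIVERSE = 10**80   # figure from the quoted Less Wrong post


def table_entries(max_length: int, vocab: int = VOCABULARY_SIZE) -> int:
    """Count every possible input string of 1..max_length symbols."""
    return sum(vocab**k for k in range(1, max_length + 1))


# Find the shortest maximum input length at which the table would
# need more entries than there are atoms in the universe.
length = 1
while table_entries(length) <= ATOMS_IN_UNIVERSE:
    length += 1
print(length)
```

Even if only a vanishing fraction of those strings are meaningful, the table has no last page, because there is no longest sentence.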

For example, the (seemingly) simple sentence "The fog is in front of the mountain" is actually hugely dependent on subjective experience. Mountains don't have well-defined boundaries, and they have no inherent front. Fog is similarly ill-defined. In some cultures the position between the observer and an object X is actually in back of X, as if the horizon is a universal front.

So are humans, the base example of language use, "simply follow[ing] a program, step-by-step"?

Consider some more abstract metaphors, borrowed from Lakoff & Johnson's book: "Love is a work of art," for example. How would our AI (the "English room" in this case) respond to this? "Yes, that's how I understand that" or "No, that's not how I understand it" are both valid answers. What about something slightly novel, like "Love is a collaborative work of art"? Is there a lookup table for parsing that?

Moreover, the language module (if one can even neatly separate that from the human cognitive gestalt) is not sufficient, I think, for forethought and speculative planning. In Armstrong's book the superintelligent machine has some sort of forethought and broad decision-making ability. It not only evaluates possible courses of action based on preconfigured goals, but it can also configure its own goals recursively. That's another reasonably special feature that we humans are supremely good at, but that a few other species share—for example, the crows that drop shellfish on roads for cars to run over and crack the shells. Not something hardwired by evolutionary processes, I'd wager!


On problems of morality and giving our machine-gods some sense of ethics, Armstrong writes:
Other approaches, slightly more sophisticated, acknowledge the complexity of human values and attempt to instil them into the AI indirectly. The key features of these designs are social interactions and feedback with humans. Through conversations, the AIs develop their initial morality and eventually converge on something filled with happiness and light and ponies. These approaches should not be dismissed out of hand, but the proposers typically underestimate the difficulty of the problem and project too many human characteristics onto the AI. This kind of intense feedback is likely to produce moral humans. (I still wouldn’t trust them with absolute power, though.) But why would an alien mind such as the AI react in comparable ways? Are we not simply training the AI to give the correct answer in training situations? [STU, pp. 41-42]
And yet this raises the question: What guarantee have we that those methods produce reliably moral humans? After all, don't we have all sorts of evidence of humans in extremis behaving in ways counter to their usual morality? And that's not even touching the patently absurd imperative that we would need to solve moral philosophy (STU, p. 32) before creating a machine superintelligence (although I guess two highly-unlikelies are still highly unlikely).

Then, we arrive at the "That's Where You Come In..." chapter, where (oh so predictably) Armstrong tells us what we can do to help:
Funds are the magical ingredient that will make all of this needed research—in applied philosophy, ethics, AI itself, and implementing all these results—a reality. Consider donating to the Machine Intelligence Research Institute (MIRI), the Future of Humanity Institute (FHI), or the Center for the Study of Existential Risk (CSER). These organizations are focused on the right research problems. Additional researchers are ready for hire. Projects are sitting on the drawing board. All they lack is the necessary funding. How long can we afford to postpone these research efforts before time runs out?

If you’ve ever been motivated to give to a good cause because of a heart-wrenching photograph or a poignant story, we hope you’ll find it within yourself to give a small contribution to a project that could ensure the future of the entire human race. [STU, p. 48]
Excuse my smirk.

The "strong, bad AI" (not to be confused with StrongBad AI, a much more terrifying scenario) fear seems to me cartoonishly ahead of itself. It's like worrying that DARPA could stumble upon the principles behind the Death Star superlaser, rather than all the current terrifying superweapons we already have.

We don't need an AI program to have a complete executive system in order to be dangerous. Armstrong points out (glosses over, really, in his breathless race to singularity) that current High-Frequency Trading (HFT) financial algorithms operate too fast for humans to keep up with, so that whenever there's a human-unintended consequence, a "flash crash," for example, humans have to go back and forensically determine what went wrong. Moreover, the programs mutate fast enough that the whole system is closer to an ecology than a market. Combine that with the known tendencies and incentives of market firms, and as David Brin points out, the result could be much closer to Skynet than any military project:
Moreover, these systems are receiving billions in funding (including their own new transatlantic fiber cable) entirely in secret.  There are no public agencies involved. No third party observers. No Congressional oversight committees.  No supervision whatsoever. Laboratories developing new genetic strains of wheat are under closer accountability than cryptic Wall Street think tanks that may unleash the first fully autonomous AI... programmed deliberately to have only the behavior patterns, goals, attitudes and morality of parasites.
Then there's the possibility of ho-hum cyberwarfare, with multiple (human) agents deploying malicious programs like Stuxnet to disable all sorts of critical infrastructure. Oh, and our basic digital infrastructure is woefully (sometimes irreparably) insecure... to the point where malware-infected machines can communicate over the air without wires or wi-fi.

There's plenty to worry about without resorting to a genie-out-of-the-bottle scenario, so it seems naive, or something, for the singularitarians to focus on maybe-future god-machines.


I'm not one to get sucked down the cynicism rabbit-hole, though. Neither is Kevin Kelly, who writes, in a very optimistic piece about artificial intelligence (or artificial smartness) for Wired magazine:
Nonhuman intelligence is not a bug, it's a feature. The chief virtue of AIs will be their alien intelligence. An AI will think about food differently than any chef, allowing us to think about food differently. Or to think about manufacturing materials differently. Or clothes. Or financial derivatives. Or any branch of science and art. The alienness of artificial intelligence will become more valuable to us than its speed or power.
This is exactly my take on what's so good about creating AI. Better for us all, if we can expand the varieties of intelligence rather than asymptotically increase one small dimension of it. But if we want to create non-human intelligence approaching the human level, we should probably first look at already-available templates: that is, near-sapient species. Yes, there are ethical quandaries there! But consider, as David Brin posits when speaking about matters of "uplift":
No matter how carefully and lovingly we move ahead, there will be some pain. And I can understand folks who declare that they would - on that account alone - oppose uplift, no matter how wondrous the final outcome might be.

In the end? I (very) respectfully disagree. All generations are built for one purpose... the one fine goal that Jonas Salk spoke-of... to be good ancestors. To suffer what we must, for our grandchildren. I can think of no greater function than to sow, so that those descendants may reap.

Dolphin parents make similar choices every day. If they could envision what their heirs might become... the earthly and alien seas they might explore... I think they would volunteer.
Somehow I think that ensuring that merely humanly powerful agents (you know, human leaders) are ethical and accountable is still a more pressing matter than doing the same for as-yet imaginary god-machines. And then doing the same for our possibly-uplifted biological relatives and fellow-travelers is still more pressing, because it's far more probable. After all, we have an example of biological life emerging into sapience. No such examples for digital life.