Daniel Kokotajlo and I agreed on the following bet: I paid Daniel $1000 today. Daniel will pay me $1100, inflation-adjusted, if there is no AGI in 2030.
Ramana Kumar will serve as the arbiter. In the event of unforeseen circumstances we will renegotiate in good faith.
As a guideline for 'what counts as AGI' I suggested the following, to which Daniel agreed:
"the Arbiter agrees with the statement "there is convincing evidence that there is an operational Artificial General Intelligence" on 6/7/2030"
Defining an artificial general intelligence is a little hard and has a strong 'know it when you see it' vibe, which is why I'd like to leave it up to Ramana's discretion.
We hold these properties to be self-evident requirements for a true Artificial General Intelligence:
1. It should be able to equal or outperform any human in virtually all relevant domains, at least in theory.
-> There might be, e.g., physical tasks that it is artificially constrained from completing because it lacks actuators, but it should be able to do these 'in theory'. Again, I leave it up to the arbiter to make the right judgement call here.
2. It should be able to asymptotically equal or outperform human performance on a task given equal fixed data, compute, and prior knowledge.
3. It should be able to autonomously formalize vaguely stated directives into tasks and solve these (insofar as a human could).
4. It should be able to solve difficult unsolved maths problems for which there are no similar cases in its dataset.
(Again, 'difficult' is a know-it-when-you-see-it judgement.)
5. It should be immune to, or at least outperform humans against, an adversarial opponent (e.g. it shouldn't fail Gary Marcus-style questioning).
6. It should equal or outperform humans on causal & counterfactual reasoning.
7. This list is not a complete enumeration but a moving goalpost (importantly, one set by Ramana, not by me!).
-> As we understand more about intelligence, we peel off capability layers that turn out not to be essential to / downstream of 'true' intelligence.
Importantly, I expect near-future ML systems to start to outperform humans on virtually all (data-rich) clearly defined tasks (almost) purely through scale, but I feel that an AGI should be able to solve data-poor, vaguely defined tasks, be robust to adversarial actions, correctly perform counterfactual & causal reasoning, and be able to autonomously 'formalize questions'.
There is a general phenomenon in mathematics [and outside maths as well!] where in a certain context/theory we have two equivalent definitions of a concept that become inequivalent when we move to a more general context/theory. In our case we are moving from the concept of a probability distribution to the concept of an imprecise distribution (i.e. a convex set of probability distributions, which in particular could be just a single probability distribution). In this case the concepts of 'independence' and 'invariance under a group action' will splinter into inequivalent concepts.
Example (splintering of independence). In classical probability theory there are three equivalent ways to state that two random variables $X$ and $Y$ are independent:
1. the joint distribution factorizes: $P(X=x, Y=y) = P(X=x)\,P(Y=y)$ for all $x, y$;
2. $P(X=x \mid Y=y) = P(X=x)$ for all $x, y$;
3. $P(Y=y \mid X=x) = P(Y=y)$ for all $x, y$.
In imprecise probability these notions split into three inequivalent notions. The first becomes 'strong independence' or 'aleatoric independence'. The second and third are called 'irrelevance', i.e. knowing $Y$ does not tell us anything about $X$ [or, for 3, knowing $X$ does not tell us anything about $Y$].
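To make the split concrete, here is a sketch of how the three imprecise-probability notions are commonly formalized; the credal-set notation ($\mathcal{K}$ for the joint credal set, $\mathcal{K}_X$, $\mathcal{K}_Y$ for its marginals, $\mathcal{K}_{X \mid Y=y}$ for its element-wise conditionals) is my own gloss, not something fixed above:

```latex
% Sketch only, notation mine: K is a convex set (credal set) of joint
% distributions for (X, Y); K_X, K_Y its marginals, K_{X|Y=y} its
% element-wise conditional credal set.

% 1. Strong / aleatoric independence: the extreme points of K factorize,
\mathrm{ext}(\mathcal{K}) \subseteq \{\, p_X \otimes p_Y \;:\; p_X \in \mathcal{K}_X,\ p_Y \in \mathcal{K}_Y \,\}

% 2. Epistemic irrelevance of Y to X: learning Y = y does not change the credal set for X,
\mathcal{K}_{X \mid Y = y} = \mathcal{K}_X \quad \text{for all } y

% 3. Epistemic irrelevance of X to Y:
\mathcal{K}_{Y \mid X = x} = \mathcal{K}_Y \quad \text{for all } x
```

When $\mathcal{K}$ contains a single distribution all three collapse back into the classical notion; in general they do not.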
Example (splintering of invariance). There are often debates in the foundations of probability, especially in subjective Bayesian accounts, about the 'right' prior. An ultra-Jaynesian point of view would argue that we are compelled to adopt a prior invariant under some symmetry if we do not possess subjective knowledge that breaks that symmetry ['epistemic invariance'], while a more frequentist or physicalist point of view would retort that we would need evidence that the system in question is in fact invariant under said symmetry ['aleatoric invariance']. In imprecise probability the notion of invariance under a symmetry splits into a weak 'epistemic' invariance and a strong 'aleatoric' invariance. Roughly speaking, the latter means that each individual distribution in the convex set is invariant under the group action, while the former just means that the convex set as a whole is closed under the action.
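In symbols (again a sketch; the group $G$, its action on distributions, and the credal set $\mathcal{C}$ are my notation):

```latex
% Sketch only, notation mine: G a group acting on probability distributions,
% C a credal set (convex set of distributions).

% Strong / aleatoric invariance: every member of C is itself invariant,
g \cdot p = p \quad \text{for all } p \in \mathcal{C},\ g \in G

% Weak / epistemic invariance: C is merely mapped onto itself by the action,
g \cdot \mathcal{C} = \mathcal{C} \quad \text{for all } g \in G
```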
Failure of convergence to the social optimum in high-frequency trading with technological speed-up
Possible market failures in high-frequency trading have of course been a hot topic recently, with various widely publicized Flash Crashes. There have been loud calls to rein in high-frequency trading, and several regulatory bodies are moving towards heavier regulation. But it is not immediately clear whether or not high-frequency trading firms are a net cost to society. For instance, it is sometimes argued that high-frequency trading firms are simply very fast market makers. One would want a precise analytical argument for a market failure.
There are two features that make this kind of market failure work: the first is a first-mover advantage in arbitrage, the second is the ability of high-frequency trading firms to invest in capital, technology, or labor that increases their effective trading speed.
The argument runs as follows.
Suppose we have a market without any fast traders. There are many arbitrage opportunities open to very fast traders. This inaccurate pricing inflicts a dead-weight loss D on total production P. The net production N equals P - D. Now a group of fast traders enters the market. At first they provide arbitrage, which gives more accurate pricing, and net production rises to N = P.
Fast traders gain control of a part S of total production. However, there is a first-mover advantage in arbitrage, so any firm will want to invest in technology, labor, and capital that speed up its ability to engage in arbitrage. This is a completely unbounded process, meaning that trading firms are incentivized to trade faster and faster, beyond what is beneficial to real production. There is a race-to-the-bottom phenomenon. In the end a part A of S is invested in 'completely useless' technology, capital, and labor. The new net production is N = P - A and the market does not achieve a locally maximal Pareto-efficient outcome.
As an example, suppose the real economy R consults market prices every minute. Trading firms invest in technology, labor, and capital and eventually achieve perfect arbitrage within one minute of any real market movement or consult (so this includes any new market information, consults by real firms, etc.). At this point the real economy R clearly benefits from more accurate pricing. But any one trading firm is incentivized to be faster than the competition. Suppose that by investing in tech and capital, trading firms can achieve perfect arbitrage within 10 microseconds of any real market movement. This clearly does not help the real economy R achieve any higher production at all, since it does not consult the market more than once every minute, but there is a large attached cost.
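To illustrate the shape of the argument numerically, here is a toy sketch. All functional forms and numbers (the step-shaped benefit of accurate pricing, the 1/latency cost of speed, the parameter values) are invented for illustration only, not derived from any model of real markets:

```python
# Toy numerical sketch of the argument above; every quantity is assumed.

def net_production(arbitrage_latency_s,
                   gross_production=100.0,    # P: output under accurate pricing (assumed)
                   deadweight_loss=10.0,      # D: loss from stale prices, no fast traders (assumed)
                   consult_interval_s=60.0,   # real economy R reads prices once a minute (assumed)
                   speed_cost_coeff=1e-4):    # resources A burned to reach a given latency (assumed)
    """Net production N for a given arbitrage latency.

    The real economy only benefits from prices that are accurate by the time it
    actually consults them; speed beyond that is assumed to add nothing, while
    the cost of achieving a given latency is assumed to grow like 1/latency.
    """
    remaining_loss = 0.0 if arbitrage_latency_s <= consult_interval_s else deadweight_loss
    speed_cost = speed_cost_coeff / arbitrage_latency_s
    return gross_production - remaining_loss - speed_cost

if __name__ == "__main__":
    # From "no fast traders" (very stale prices) down to a microsecond arms race.
    for latency in [600.0, 60.0, 1.0, 1e-3, 1e-5]:
        print(f"arbitrage latency {latency:>10.6g} s -> "
              f"net production {net_production(latency):8.3f}")
```

Run as written, net production is about 90 when prices are staler than the real economy's one-minute consultation (N = P - D), rises to about 100 once arbitrage keeps pace with that consultation, and falls back towards 90 once the arms race pushes latency to microseconds: the extra speed buys no pricing accuracy that R can use, but its cost A is still subtracted, giving N = P - A.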
Measuring the information-theoretic optimizing power of evolutionary-like processes
Intelligent-Design advocates often argue that the extraordinary complexity that we see in the natural world cannot be explained simply by a 'random process' such as natural selection, and hence that there must be a designer. Counterarguments include:
To my eyes these are good arguments, but they are certainly not conclusive. Compare:
The take-away message is that we should be careful when we say that evolution explains biological complexity. This may well be true - but can we prove it?
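Here is one way to start putting a number on it, as a sketch only: measure the 'optimization power' of a toy evolutionary process as the negative log of the probability that a blind uniform draw does at least as well as the evolved result. The fitness function (count of 1-bits), the mutation-plus-truncation-selection loop, and all parameter values below are invented for illustration and are not claimed to resemble biological evolution:

```python
# Toy sketch: bits of optimization of an evolutionary-like process, measured as
# -log2 P(random genome is at least as fit as the evolved one).
import math
import random

GENOME_LEN = 60

def fitness(genome):
    # Toy fitness: just count the 1-bits (the "OneMax" landscape).
    return sum(genome)

def evolve(generations=200, pop_size=50, mutation_rate=0.02, seed=0):
    """Crude mutation + truncation-selection loop; not meant to be biologically realistic."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = [[bit ^ (rng.random() < mutation_rate) for bit in parent]
                    for parent in survivors]
        pop = survivors + children
    return max(pop, key=fitness)

def optimization_power_bits(achieved_fitness):
    """-log2 of the probability that a uniformly random genome is at least as fit.

    For this toy fitness the tail probability is an exact binomial tail,
    so no sampling is needed.
    """
    tail = sum(math.comb(GENOME_LEN, k) for k in range(achieved_fitness, GENOME_LEN + 1))
    return -math.log2(tail / 2 ** GENOME_LEN)

if __name__ == "__main__":
    best = evolve()
    f = fitness(best)
    print(f"evolved fitness {f}/{GENOME_LEN}, "
          f"optimization power ~ {optimization_power_bits(f):.1f} bits")
```

The point of the exercise is only that "how much did selection compress the outcome distribution" is a quantity one can in principle compute, which is the kind of claim the question above would need.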