User Comment Replies — AI Alignment Forum

The Problem With The Current State of AGI Definitions

It's probably worth noting you seem to be empirically wrong: I'm pretty confident I'd be able to do >half of human jobs, with maybe ~3 weeks of training, if I were able to understand all human languages (obviously not in parallel!). Many others here would be able to do the same.

The criterion is not as hard as it seems, because there are many jobs, such as cashier, administrative worker, or assembly line worker, that are not that hard to learn.

1 David Manheim 3y

Depends on how you define the measure over jobs. If you mean "the jobs of half of all people," probably true. If you mean "half of the distinct jobs as they are classified by NAICS or similar," I think I disagree.

The alignment problem in different capability regimes

Jan Kulveit 4y* 20

Similarly to johnswentworth: My current impression is that the core alignment problems are the same and manifest at all levels. Often the sub-human version just looks like a toy version of the scaled-up problem, and the main difference is that in the sub-human version you can often solve the problem for practical purposes by plugging in a human at some strategic spot. (While I don't think there are deep differences in the alignment problem space, I do think there are differences in the "alignment solutions" space, where you can use non-scalable solutions, or in risk space, w...

2 Buck Shlegeris 4y

This is what I was referring to with "The superintelligence can answer any operationalizable question about human values", but as you say, it's not clear how to elicit the right operationalization.

Risks from Learned Optimization: Introduction

Jan Kulveit 6y 40

I don't see why a portion of a system turning into an agent would be "very unlikely". From a different perspective, if the system lives in something like an evolutionary landscape, there can be various basins of attraction which lead to sub-agent emergence, not just mesa-optimisation.


All of Jan Kulveit's Comments + Replies