AI ALIGNMENT FORUM

Personal Blog

Attempting to refine "maximization" with 3 new -izers

by agilecaveman
11th Aug 2015
1 min read

This is a linkpost for https://www.overleaf.com/read/pxkqtdwhwgkc
Comment by IAFF-User-111 (10y ago)

I skimmed it.

It would be helpful to define "stopping point" and "stopping distance".


Regarding local optima:

Deep neural nets were historically thought to suffer from local optima. Recently, this viewpoint has been challenged; see, e.g., "The Loss Surfaces of Multilayer Networks" (http://arxiv.org/abs/1412.0233) and the references therein.

Although the issue remains unclear, I currently suspect that local optima are not a practical obstacle for an (omniscient) hill-climber in the real world.
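For readers unfamiliar with the concern, here is a minimal sketch of what "stuck in a local optimum" means for a greedy hill-climber. The objective `two_peaks`, the step size, and the starting points are all hypothetical choices for illustration; a one-dimensional toy like this says nothing either way about high-dimensional loss surfaces, which is where the question above actually lives.

```python
import math


def hill_climb(f, x, step=0.01, max_iters=10_000):
    """Greedy hill-climbing: move to a neighbor only if it strictly improves f."""
    for _ in range(max_iters):
        best = max((x - step, x + step), key=f)
        if f(best) <= f(x):
            return x  # no neighbor improves: a local optimum at this step size
        x = best
    return x


def two_peaks(x):
    # Hypothetical objective: a lower peak near x = -1 and a higher peak near x = 2.
    return math.exp(-(x + 1) ** 2) + 2 * math.exp(-(x - 2) ** 2)


left = hill_climb(two_peaks, -2.0)   # starts in the basin of the lower peak
right = hill_climb(two_peaks, 1.0)   # starts in the basin of the higher peak
print(left, two_peaks(left))
print(right, two_peaks(right))
```

Started at -2.0, the climber stops near the lower peak and never finds the higher one; started at 1.0, it reaches the higher peak. Whether this kind of trap matters in practice is exactly the open issue.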


Overall, I wasn't convinced by the statement about whether these agents tile. I think you should give more detailed arguments for why you do or don't expect them to tile, and explain the setup a bit more, too: are you imagining agents that take a single action, based on their current policy, to adopt a new policy that is then not subject to further modification? If not, how can you ensure that agents do not modify their policy in such a way that policy_new encourages further modifications, which can then compound?
