Topography in Machine Learning
Published on 7 Jul 2007 at 2:38 am.
Filed under Intelligent Systems, Philosophy, Research, Solutions.
I recently chaired the Science Session of the Data Mining in Aeronautics, Science and Exploration Systems (DMASES 2007) Conference at the Computer History Museum in Mountain View, CA. I gave a short 20-minute talk on Problem Solving and referred to a book by David Perkins that I really enjoy, The Eureka Effect: The Art and Logic of Breakthrough Thinking.
Perkins gives names to various topographical traps that impair our efforts toward effective problem solving. By giving something a name, we have power over it. This enables us to think more clearly about these problems in learning that must be overcome.
The Wilderness Trap reflects the fact that the solution space is immense. There are so many possible solutions that it is virtually impossible to explore them all.
The Clueless Plateau is a flat region of the solution space that contains virtually no information as to where to go to find the more probable solutions (the peaks).
The Canyon Trap is artificially imposed by the problem solver as they restrict themselves to a subspace of the solution space that does not contain the solution to the problem.
The Oasis Trap is a local solution (local maximum in probability) that seems so promising that you don’t want to leave it. However, it is nowhere near the true solution to the problem.
Last, the Solution is a high peak in the space (there’s gold in them thar hills!). The base of the solution peak can be wide, which makes the peak easy to find, or extremely narrow, which makes the problem difficult.
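To make the topography concrete, here is a minimal Python sketch. The one-dimensional landscape, the positions and widths of its peaks, and the function names are all assumptions chosen purely for illustration; none of them come from Perkins's book or from the talk. A greedy climber that only ever steps uphill stalls on the flat plateau and settles on the oasis unless it happens to start on the narrow base of the solution peak.

```python
def bump(x: float, center: float, half_width: float, height: float) -> float:
    """Triangular peak that is exactly zero outside its base."""
    return height * max(0.0, 1.0 - abs(x - center) / half_width)


def landscape(x: float) -> float:
    """Hypothetical 1D solution space on [0, 10] (illustrative values only)."""
    oasis = bump(x, center=3.0, half_width=2.0, height=0.5)     # broad local maximum
    solution = bump(x, center=8.0, half_width=0.5, height=1.0)  # narrow global maximum
    return oasis + solution  # everywhere else is a flat, clueless plateau


def hill_climb(start: float, step: float = 0.01, max_iters: int = 10_000) -> float:
    """Greedy local search: move to the better neighbor only if it is strictly higher."""
    x = start
    for _ in range(max_iters):
        best = max((x - step, x + step), key=landscape)
        if landscape(best) <= landscape(x):
            break  # no uphill neighbor: a peak, or a plateau with no clues
        x = best
    return x


if __name__ == "__main__":
    for start in (2.0, 6.0, 7.8):
        end = hill_climb(start)
        print(f"start={start:.1f} -> end={end:.2f}, height={landscape(end):.2f}")
```

Starting at 2.0, the climber should stop near the oasis at x = 3. Starting at 6.0, it should not move at all, because the plateau offers no gradient. Only a start such as 7.8, already on the base of the solution peak, should reach x = 8.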
In his book, Perkins relates problem solving to puzzles, which provide a set of clear and familiar examples in which to explore the various traps that inhibit effective problem solving. In my talk, I referred to the Nine Dot Puzzle, which, as Perkins explains, exhibits features of each of the four traps above.
I also explored communication as inference, where the job of the listener is to infer and model the thoughts of the speaker. This perspective of communication as a pairing of encoding and inference leads nicely to seeing jokes as problems that exhibit one or more of these classic topographical difficulties. Many jokes, and much of humor, rely on trapping the audience.
A favorite of mine is raised by the question “When geese fly in a ‘V’, why is it that one leg of the ‘V’ is often longer than the other?”
Think about it before going on… maybe those of you with some aeronautics or biology experience can think of the answer. The solution will be revealed at the end.
This joke/riddle is effective because it leads listeners into a Canyon Trap, where they confine themselves to a subspace of the solution space. The subspace searched is the subspace of profound answers, such as aerodynamics, efficiency, and biological considerations. However, the solution lies in the space of mundane answers, as you can see below.
Solving puzzles, jokes and riddles is analogous to solving problems in physics, chemistry, and machine learning. The traps are the same, and this fact can give us unique insights into designing machine learning algorithms. Specifically, the best algorithms will rely on both educated guesses and exploration… that is, heuristics and sampling. Breakthroughs in machine learning are analogous to breakthroughs in creative thinking and problem solving. These are the AHAs of Martin Gardner.
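One simple way to combine a heuristic with sampling is random-restart hill climbing, sketched below in Python. This is only an illustration of the idea, not a specific algorithm from the talk: the toy landscape (a broad, low oasis at x = 3 and a narrow, high solution peak at x = 8), the interval [0, 10], and all function names are assumptions made up for the example.

```python
import random


def score(x: float) -> float:
    """Hypothetical landscape: broad low 'oasis' at x=3, narrow high peak at x=8."""
    oasis = 0.5 * max(0.0, 1.0 - abs(x - 3.0) / 2.0)
    peak = 1.0 * max(0.0, 1.0 - abs(x - 8.0) / 0.5)
    return oasis + peak


def climb(x: float, step: float = 0.01) -> float:
    """Heuristic: keep stepping to the better neighbor while it strictly improves."""
    while True:
        best = max((x - step, x + step), key=score)
        if score(best) <= score(x):
            return x  # stuck on a peak or on the flat plateau
        x = best


def sample_and_climb(n_starts: int, seed: int = 0) -> float:
    """Sampling: draw random starts on [0, 10], climb from each, keep the best end point."""
    rng = random.Random(seed)
    ends = [climb(rng.uniform(0.0, 10.0)) for _ in range(n_starts)]
    return max(ends, key=score)


if __name__ == "__main__":
    best = sample_and_climb(n_starts=200)
    print(f"best x = {best:.2f}, score = {score(best):.2f}")
```

The climber alone is at the mercy of its starting point, and sampling alone would need many draws to land exactly on a narrow peak. Together, any sample that falls on the base of the solution peak is carried to the top, so the width of that base sets how many samples are needed.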
In a future post, I will describe how these AHAs are analogous to phase transitions in statistical mechanics.
Well, you made it to the end, and here is the answer to the riddle: “There are more geese in it!”
Happy problem solving!
Kevin Knuth
Albany, NY
