It actually doesn't take that many neurons to do edge detection. In fact, edge detection can be done by running an image through several "matrix convolutions" as explained by Catenary. [Although, personally I don't like to call them matrices since they aren't being used as such, but that's another issue. I just call them tables.]
Basically, the following three tables detect horizontal, vertical, and diagonal edges respectively.
Horizontal:
-1 -1 -1
 0  0  0
 1  1  1

Vertical:
-1  0  1
-1  0  1
-1  0  1

Diagonal:
-5  0  0
 0  0  0
 0  0  5
Note that the reason -5 and +5 are used in the last table is to exaggerate the effects.
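(To make the later sketches concrete, here are the same three tables written out as Python lists of rows. The variable names are my own, not from Catenary's post; they get reused in the sketches further down.)

# the three 3x3 tables from above, one list per printed row
horizontal = [[-1, -1, -1],
              [ 0,  0,  0],
              [ 1,  1,  1]]

vertical = [[-1, 0, 1],
            [-1, 0, 1],
            [-1, 0, 1]]

diagonal = [[-5, 0, 0],
            [ 0, 0, 0],
            [ 0, 0, 5]]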
So, what do you do with these tables in order to detect edges? That is, what do they mean?
Well, let's take a grayscale image for simplicity's sake. A grayscale image has only a single scalar measurement, called brightness, for each pixel. So you can imagine the picture as being a two-dimensional array of brightness measurements. You can use any scale you like, but let's use 0.0 for black and 1.0 for white, with other values in between.
So, we take a 3x3 table and apply it to a 100x100 picture by doing the following.
1. We call the picture we want to process the source. We reference each pixel in source by saying source[x, y] where x and y are coordinates starting from the top-left of the picture.
2. We create a blank picture called result that has the same size as source. Once again, the pixels in result are referenced as result[x, y].
3. We call our table "table" and reference its scalar values as table[x, y].
4. Metaphorically, we place our table on top of source to build up the image in result. This is done by going pixel by pixel through source. Using zero-based indices, it looks like this:
for y = 1 to 98
  for x = 1 to 98
  begin
    result[x - 1, y - 1] += source[x, y] * table[0, 0];
    result[x + 0, y - 1] += source[x, y] * table[1, 0];
    result[x + 1, y - 1] += source[x, y] * table[2, 0];
    result[x - 1, y + 0] += source[x, y] * table[0, 1];
    result[x + 0, y + 0] += source[x, y] * table[1, 1];
    result[x + 1, y + 0] += source[x, y] * table[2, 1];
    result[x - 1, y + 1] += source[x, y] * table[0, 2];
    result[x + 0, y + 1] += source[x, y] * table[1, 2];
    result[x + 1, y + 1] += source[x, y] * table[2, 2];
  end
The reason for x and y going from 1 to 98 is that you can't process the outermost pixels since we're using a 3x3 table. If we were using a 5x5 table, we couldn't process the outermost 2 pixels around the edges of source and result.
Now, of course, we could generalize the above loop to handle any size table, but I wanted to keep things as clear as possible.
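For what it's worth, here is roughly how that generalization might look in Python. This is only a sketch under my own conventions, not something from the original post: images and tables are stored as lists of rows, so the source[x, y] above becomes source[y][x] here, and all the names are mine.

def convolve(source, table):
    # source: 2D list of brightness values, 0.0 = black, 1.0 = white
    # table:  square table with an odd side length (3x3, 5x5, ...)
    height, width = len(source), len(source[0])
    size = len(table)
    half = size // 2                  # 1 for a 3x3 table, 2 for a 5x5 table, ...

    # result starts out as all zeros, the same size as source
    result = [[0.0] * width for _ in range(height)]

    # skip the outermost 'half' pixels, just as the 3x3 loop above skips one
    for y in range(half, height - half):
        for x in range(half, width - half):
            # spread this source pixel into the surrounding result pixels,
            # weighted by the table, exactly as the unrolled loop above does
            for j in range(size):          # row of the table
                for i in range(size):      # column of the table
                    result[y + j - half][x + i - half] += source[y][x] * table[j][i]
    return result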
The point is that the result image starts out with nothing but zero values for all of its pixels. As you go through the above loop, each pixel in result is either brightened or darkened based on the corresponding pixel in source as well as the pixels that surround it.
So, if we use the first table ((-1, -1, -1), (0, 0, 0), (1, 1, 1)), then following the loop above, each bright pixel in source darkens the three result pixels above it and brightens the three result pixels below it. Put the other way around, a result pixel is brightened by bright source pixels above it and darkened by bright source pixels below it, so in a region of uniform brightness the contributions cancel and only horizontal transitions in brightness leave a mark. The corresponding pixel in source itself has no effect on the pixel in result, since anything times zero is zero and adding zero to a value does not change it.
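As a tiny sanity check, here is the first table applied (via the convolve sketch above) to a made-up 8x8 test image that is black on top and white on the bottom, i.e. one ideal horizontal edge:

# an ideal horizontal edge: four black rows on top, four white rows below
image = [[0.0] * 8 for _ in range(4)] + [[1.0] * 8 for _ in range(4)]

edges = convolve(image, horizontal)
for row in edges:
    print(" ".join("%5.1f" % v for v in row))

# Rows far from the black/white boundary print 0.0 because the bright-above
# and bright-below contributions cancel; the two rows straddling the boundary
# print clearly negative values -- the detected edge.  (The bottom rows also
# come out non-zero, an artifact of never processing the outermost source row.)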
Now, we can perform all three edge detections (horizontal, vertical, and diagonal) simply by adding their tables together. The result we get is
-7 -1 0
-1 0 1
0 1 7
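Adding the tables first works because the whole operation is linear in the table: convolving once with the summed table gives the same picture as convolving three times and adding the three results. A quick check using the sketches above (again, all names are mine):

combined = [[-7, -1, 0],
            [-1,  0, 1],
            [ 0,  1, 7]]

separate = [convolve(image, t) for t in (horizontal, vertical, diagonal)]
together = convolve(image, combined)

# every pixel of the three separate results, added up, should equal the
# corresponding pixel of the single combined result
print(all(abs(sum(r[y][x] for r in separate) - together[y][x]) < 1e-9
          for y in range(len(image)) for x in range(len(image[0]))))   # True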
Now, let's run that through Photoshop (Filter --> Other --> Custom).
In the screen shot below, the top picture is the before image (source) taken from Catenary. The second picture is the after image (result), and the third picture shows the table I used. Notice that I got pretty much the same results as Catenary. You can tweak the table to get exactly the same results.
Since information is lost when performing these convolutions, you cannot perfectly undo them, but if you invert the table you can partially undo the edge detection and regain some detail. For example,
Getting back to neurons... You can code the above algorithm in a neural net and it turns out that you need only a couple of neurons for each entry in your table. So a 5x5 table would take more neurons than a 3x3 table. But it doesn't take a massive brain to perform edge detection using this technique.
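One way to picture that claim is as a toy sketch rather than a biologically faithful model (and the names below are mine): a single artificial "neuron" whose nine synaptic weights are the nine table entries computes one result pixel as a weighted sum of the 3x3 patch under it, so the whole edge detector is just one such unit per output pixel.

def edge_neuron(patch, weights):
    # patch:   a 3x3 neighbourhood of source brightness values (list of rows)
    # weights: a 3x3 table doubling as the neuron's synaptic weights
    # the "neuron" simply takes a weighted sum of its nine inputs
    # (placing the table directly over the patch flips the sign relative to
    #  the spreading loop above, but the edge still stands out either way)
    return sum(patch[j][i] * weights[j][i] for j in range(3) for i in range(3))

# a patch sitting on a horizontal dark-to-bright edge fires strongly,
# while a patch of uniform brightness produces no output at all
print(edge_neuron([[0.0]*3, [0.0]*3, [1.0]*3], horizontal))   # 3.0
print(edge_neuron([[0.5]*3, [0.5]*3, [0.5]*3], horizontal))   # 0.0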
In other words, there's no way you can go back to the puzzling state of mind you would have been in, the first time you saw it!
A scientific justification for hindsight bias.
It actually doesn't take that many neurons to do edge detection.
But it doesn't take a massive brain to perform edge detection using this technique.
LOL, you should know -- clearly that is not my point.
My point is that edge detection is one of the spatial redundancy reduction functions the brain can perform. I am not saying it takes the entire processing power of the brain to do it; regions in the visual and temporal lobe cortex will be involved in the recognition process. This is probably more rudimentary in, say, frogs than in humans.
My intent is to show that evolutionary variation in the gene pool could result from redundancy reduction and pattern recognition on a massive scale. It is a hypothesis, but it looks like a pretty darn good one at that.
Yeah, I went off on a tangent. But I hope it was an interesting one. I just find it interesting that you can do so much with so little computation.
Richard Dawkins writes:
This is quite easy to understand. Our brain is excellent at detecting redundancies and "filling in". Say you see a uniform white picture. The human brain doesn't really need all the pixel information (which is highly redundant) so it can "fill in" the picture.
For the image below, close your right eye. With your left eye, look at the red circle. Slowly move your head closer to the image. At a certain distance, the blue line will not look broken!!
This is because your brain is "filling in" the missing information.
Dawkins again:
OTOH, the brain is ultra-brilliant at detecting sharp edges. See the image below: what do you identify?
Given the brain's limited attentional resources and shortage of neural space for competing representations, every stage in the processing of visual information offers an opportunity to generate a signal that says, "Look, here is a clue to something potentially object-like!" Partial solutions or conjectures (a partial "aha" feeling) to perceptual problems are fed back from every level in the hierarchy to every earlier module to impose a small bias in processing, allowing the final percept to emerge from such progressive "bootstrapping" (the final "aha" of recognition). You should eventually have identified a Dalmatian sniffing the ground.
The fascinating fact about this image is that once you have done the perceptual grouping and identified the Dalmatian, it can never be undone, because the neurons in the temporal lobe region of your brain are permanently altered after the initial brief exposure. In other words, there's no way you can go back to the puzzling state of mind you would have been in, the first time you saw it!
Dawkins continues:
What Dawkins is suggesting is that redundancy reduction and pattern recognition can happen at the evolutionary level and therefore lead to better-adapted (in terms of overall survival) species over time. Of course, we'll have to collect massive amounts of gene-pool data and analyze it to prove this.
But it demonstrates how a brilliant scientist thinks.
Compare the beauty of this explanation to blind belief, which assumes away most things and explains nothing.
Full link: http://edge.org/response-detail/2825/what-is-your-favorite-deep-elegant-or-beautiful-explanation