Well, what he actually talks about is a ‘phase transition’ in computer science. Two things make it possible: 1) massive amounts of data and 2) processing speed.
One of the nicest examples he gives: a learning algorithm X is the best with 1 million training examples while another algorithm Y ranks only third; but when the same algorithms are run on a data set of 1 billion examples, Y becomes the best one.
Another good example: scene completion. The algorithm did not produce meaningful results with 10,000 images; researchers kept trying with 100,000 images, again no good results, then with 1,000,000 images, still nothing, but with 10,000,000 images it worked very well! So some kind of phase transition, a quantum leap if you like, is going on here. The situation is similar to Google Image Search, where they were trying to find canonical images, e.g. the image that best represents ‘Mona Lisa’ rather than some variation of it. By taking pairs of images, comparing their features, computing a distance, arranging the data as a graph, and running a PageRank-like algorithm on that graph, they were able to find the images that best represent a given set of keywords.
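The canonical-image idea can be sketched roughly like this: treat each image as a node, weight edges by visual similarity, and let a PageRank-style random walk pick the node that everything else points to most strongly. This is only my reconstruction of the idea, not Google’s actual system, and the similarity scores below are invented for illustration; in the real setup they would come from pairwise feature comparison:

```python
# Rough sketch: pick a "canonical" image with a PageRank-style random
# walk over a similarity graph. All similarity scores are made up.

# Hypothetical similarities between five images of the same subject
# (higher = more alike); image 0 is meant to be the canonical one.
similarities = {
    (0, 1): 0.9, (0, 2): 0.8, (0, 3): 0.7, (0, 4): 0.2,
    (1, 2): 0.6, (1, 3): 0.5, (1, 4): 0.1,
    (2, 3): 0.4, (2, 4): 0.1,
    (3, 4): 0.1,
}
n = 5

# Symmetric weight matrix: weights[i][j] = similarity of images i and j.
weights = [[0.0] * n for _ in range(n)]
for (i, j), s in similarities.items():
    weights[i][j] = weights[j][i] = s

# Column sums, used to normalize each column into a transition
# probability distribution for the random walk.
col_sums = [sum(weights[k][j] for k in range(n)) for j in range(n)]

damping = 0.85
scores = [1.0 / n] * n  # start the walk from a uniform distribution
for _ in range(50):  # power iteration until (approximate) convergence
    scores = [
        (1 - damping) / n
        + damping * sum(weights[i][j] * scores[j] / col_sums[j]
                        for j in range(n))
        for i in range(n)
    ]

# The highest-scoring node is the candidate canonical image.
canonical = max(range(n), key=lambda i: scores[i])
print("canonical image:", canonical)
```

Because image 0 is the most similar to all the others, the walk concentrates probability mass on it, and it comes out as the canonical representative.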
It is always fun and revealing to listen to Norvig. If you are interested in cutting-edge research in machine learning, pattern recognition, and machine translation, I enthusiastically recommend this video, especially the parts where Norvig shows single-page Python source code for word segmentation and typo-checking programs (the first is about 97% correct, running on a laptop with a data set of about 1.7 billion words; the second is about 75% correct, again running on a not-very-high-end laptop). He also covers the MapReduce programming paradigm, addresses some wrong claims about the model, and shows how it helps with parallel programming over very large amounts of data.
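The word-segmentation demo is, in spirit, a one-page maximum-likelihood search: choose the split of an unspaced string that maximizes the product of unigram word probabilities. Here is a minimal sketch of that idea (my reconstruction, not Norvig’s actual code), with a tiny made-up count table standing in for the ~1.7-billion-word corpus:

```python
import math
from functools import lru_cache

# Tiny, invented word-count table; the version in the talk is backed
# by counts from roughly 1.7 billion words of web text.
COUNTS = {
    "the": 500, "a": 400, "i": 300, "is": 250, "this": 120,
    "test": 60, "quick": 40, "brown": 30, "fox": 20,
}
TOTAL = sum(COUNTS.values())

def prob(word):
    # Unseen words get a small probability that shrinks with length,
    # so long gibberish "words" lose to splits into known words.
    if word in COUNTS:
        return COUNTS[word] / TOTAL
    return 1.0 / (TOTAL * 10 ** len(word))

def score(words):
    # Unigram model: probability of a split is the product of the
    # probabilities of its words.
    return math.prod(prob(w) for w in words)

@lru_cache(maxsize=None)
def segment(text):
    """Most probable segmentation of `text` into a list of words."""
    if not text:
        return []
    # Try every first-word split and recurse on the remainder;
    # memoization keeps the search polynomial instead of exponential.
    splits = [[text[:i]] + segment(text[i:])
              for i in range(1, len(text) + 1)]
    return max(splits, key=score)

print(segment("thequickbrownfox"))
```

With a realistically large count table this scales to the kind of accuracy Norvig reports; the memoized recursion is what keeps the single-page program fast.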