If you are building yet another YouTube-like service, you will eventually face the problem of generating thumbnails for video clips. The simplest solution is to grab a random or fixed frame and use it as the thumbnail. Unfortunately, this does not always produce the best results. You can try to implement more sophisticated algorithms, like the one described here (Quote: "The shot and key frame are selected based on measures of motion and spatial activity and the likeliness to include people. The latter is determined by skin-color detection and face detection."). I did not have the time or patience to implement such complex algorithms, so I came up with one of my own. It is really simple, can be implemented in a couple of hundred lines of code, and works pretty well.
The main idea is very simple: we analyze the first few seconds of a clip and build a histogram of the color distribution for each frame. Then we average these histograms to get an averaged color distribution histogram. Finally, we find the frame whose histogram is closest to that average (I am using RMSE to estimate "closeness"). Looking only at frames close to the beginning of the video makes the selection faster (fewer frames to examine) and makes the thumbnail less likely to include spoilers. The selected picture is similar in color distribution to the overall theme of the video, which makes it more likely to show a typical frame.
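To make the steps concrete, here is a minimal Python sketch of the idea. It is not the original source code, only an illustration under a few assumptions of mine: frames are already decoded into lists of (r, g, b) pixels, each channel is split into 8 buckets, and the names histogram, rmse, and pick_thumbnail are made up for this example.

```python
import math

BINS = 8  # buckets per color channel (an assumed value, not from the post)


def histogram(frame):
    """Normalized per-channel color histogram of one frame.

    `frame` is a non-empty list of (r, g, b) pixels with channel values 0-255.
    The result has 3 * BINS entries: red buckets, then green, then blue.
    """
    hist = [0.0] * (3 * BINS)
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * BINS + value * BINS // 256] += 1
    return [count / len(frame) for count in hist]


def rmse(a, b):
    """Root mean square error between two equally sized histograms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pick_thumbnail(frames):
    """Index of the frame whose histogram is closest to the average one."""
    hists = [histogram(frame) for frame in frames]
    # Average the histograms bucket by bucket.
    average = [sum(bucket) / len(hists) for bucket in zip(*hists)]
    errors = [rmse(hist, average) for hist in hists]
    # On ties, the earliest frame wins.
    return errors.index(min(errors))


# Tiny synthetic clip: three dark frames, one mixed frame, one bright frame.
dark = [(10, 10, 10)] * 4
mixed = [(10, 10, 10)] * 2 + [(240, 240, 240)] * 2
bright = [(240, 240, 240)] * 4
print(pick_thumbnail([dark, dark, dark, mixed, bright]))
```

In a real service you would feed it only the frames decoded from the first few seconds of the clip, probably scaled down first, since the histogram does not need full resolution.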
I ran it on a few hundred video clips, and it shows pretty good results. Of course, these results are not representative, since I've selected the most interesting ones, but in general I think it is very usable. You can grab the source code and try it yourself.
Hmm - interesting. This probably finds an average frame but not necessarily the key frame. Try this test: pan across the sky for 3 seconds, then focus on a bird in flight for 2 seconds, then pan across blank sky again for 3 seconds. Would your algorithm find the frame with the bird?
Since in this hypothetical video you see the sky for six seconds and the bird for only two, arguably a frame with the sky characterizes this video better. I could not know whether the bird was the subject of your movie. It might just as well be about nice cloud formations, with the bird getting into the frame by mistake.
But I agree, this algorithm is not perfect. Still, in the absence of a more sophisticated solution, it is better than selecting a random frame.