Image/video modification and presentation
Ever since the image retargeting/seam carving paper was published in 2007, the research world seems to have been on fire with methods of retargeting images with ever-better results, later extended to retargeting video. If you haven’t seen what can be done with this technique, I encourage you to search YouTube for “image retargeting” or “liquid resizing.”
The key insight from retargeting is how important the gradient operator is in image manipulation, especially when combining multiple images. By using a least-squares solver to minimize gradient differences across the seams of an image, you can get shockingly good results.
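To make the idea concrete, here is a minimal 1D sketch of gradient-domain stitching (my own toy example, not from any of the papers): two signals with very different absolute levels are joined by asking a least-squares solver to match their gradients rather than their values, so the step at the seam is spread smoothly across the result. The function name and the zero-gradient seam constraint are illustrative choices.

```python
import numpy as np

def stitch_gradient_domain(left, right):
    """Join two 1D signals end to end, matching gradients in a least-squares sense."""
    n = len(left) + len(right)
    # Target gradients come from each source; the gradient across the seam is
    # set to 0 so the solver smooths over the jump in absolute values.
    g = np.concatenate([np.diff(left), [0.0], np.diff(right)])
    # Forward-difference operator D, so that D @ x approximates the gradient of x.
    D = np.zeros((n - 1, n))
    for i in range(n - 1):
        D[i, i], D[i, i + 1] = -1.0, 1.0
    # Anchor the first sample so the least-squares problem has a unique solution.
    A = np.vstack([D, np.eye(1, n)])
    b = np.concatenate([g, [left[0]]])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

left = np.array([10.0, 11.0, 12.0, 13.0])
right = np.array([40.0, 41.0, 42.0, 43.0])   # same slope, very different levels
print(stitch_gradient_domain(left, right))   # a smooth ramp with no visible seam
```

The solver ignores the 30-unit jump in absolute brightness and produces a smooth ramp, which is exactly the behavior that makes gradient-domain compositing look seamless.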
I don’t have a reference for this, but I have been told that human vision is more sensitive to gradient differences than to absolute pixel values; that is, we can compare things that are directly adjacent much better than we can compare them separately. In a discussion I had with a presenter after a session on the first day of SIGGRAPH, he brought up a corollary: we are more sensitive to temporal changes than to side-by-side comparisons, meaning we can see changes as they happen much better than we can by looking back and forth and comparing pixel values manually. (This has great implications for photo editing software, which often presents “Before” and “After” shots side-by-side, or even on different monitors.)
One of the most interesting things I saw in the video sessions today was a method of generating a dense temporal “film strip” that lets you scrub through video easily and intuitively. Imagine a film strip showing the most salient frames from a video, except that instead of sitting in separate rectangles, the frames are blended together using gradient optimization; if you want a more fine-grained search, you can zoom in on the “film strip.”
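As a toy illustration of the composition step (definitely not the authors’ implementation), the sketch below builds a strip from grayscale numpy frames: the selection is a plain uniform sample rather than the saliency-based choice described in the talk, and each seam is blended with a simple linear cross-fade standing in for the gradient optimization.

```python
import numpy as np

def film_strip(frames, overlap=16):
    """Concatenate frames horizontally, cross-fading each pair over `overlap` columns."""
    strip = frames[0].astype(float)
    ramp = np.linspace(0.0, 1.0, overlap)          # blend weights across the seam
    for frame in frames[1:]:
        frame = frame.astype(float)
        seam = strip[:, -overlap:] * (1 - ramp) + frame[:, :overlap] * ramp
        strip = np.hstack([strip[:, :-overlap], seam, frame[:, overlap:]])
    return strip

# Usage: take every k-th frame from a decoded clip, then compose the strip.
frames = [np.random.rand(90, 160) for _ in range(8)]   # stand-in for video frames
strip = film_strip(frames[::2])
print(strip.shape)   # (90, 592): four 160-wide frames sharing 16-column seams
```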
Unfortunately, the method requires a significant amount of offline processing; when I asked the presenter whether it could be applied to streaming video, he thought it would work best if a simple uniform sampling of frames (rather than a method of finding the “most important” frames in a given time range) were used, and the “film strip” were filled in from the right as more video was downloaded. I’m still not convinced the performance can be made acceptable, but the user experience for searching through video was especially compelling.
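For what it’s worth, the streaming variant he described seems straightforward to sketch: keep every stride-th frame as it arrives and grow the strip from the right. The class below is entirely hypothetical (the name and parameters are my own) and reuses the film_strip helper from the sketch above.

```python
class StreamingFilmStrip:
    """Grow a film strip from the right as frames stream in (hypothetical sketch)."""
    def __init__(self, stride=30, overlap=16):
        self.stride, self.overlap = stride, overlap   # keep every stride-th frame
        self.seen = 0
        self.strip = None

    def push(self, frame):
        """Feed one decoded frame; returns the current strip."""
        if self.seen % self.stride == 0:
            if self.strip is None:
                self.strip = frame.astype(float)
            else:
                # Blend the new frame onto the right edge of the existing strip.
                self.strip = film_strip([self.strip, frame], self.overlap)
        self.seen += 1
        return self.strip
```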