The field of image and video recognition is progressing quickly, with new technology now allowing content-based search and classification of digital media. In the past, the accuracy of video recognition doubled every three years, but recent advances employing deep convolutional neural networks have nearly doubled accuracy in a single year.
At this stage, coarse-grained classifications of concepts or nouns, such as cars, sunsets and
sunrises, mountains, and indoor scenes, can be recognized. It’s also possible to recognize fine-grained classifications such as German shepherds, Afghan hounds, terriers, or spaniels. It
has even become possible to recognize events in videos, such as a person changing a car tire, a
person cleaning an appliance, or a couple at a wedding ceremony.