Having to train an image-classification model using very little data is a common situation, which you’ll likely encounter in practice if you ever do computer vision in a professional context. A “few” samples can mean anywhere from a few hundred to a few tens of thousands of images. As a practical example, we’ll focus on classifying images as dogs or cats, in a dataset containing 4,000 pictures of cats and dogs (2,000 cats, 2,000 dogs). We’ll use 2,000 pictures for training, 1,000 for validation, and 1,000 for testing.

In Chapter 5 of the Deep Learning with R book we review three techniques for tackling this problem. The first of these is training a small model from scratch on what little data you have (which achieves an accuracy of 82%). Subsequently we use feature extraction with a pretrained network (resulting in an accuracy of 90%) and fine-tuning a pretrained network (with a final accuracy of 97%). In this post we’ll cover only the second and third techniques.

## The relevance of deep learning for small-data problems

You’ll sometimes hear that deep learning only works when lots of data is available. This is valid in part: one fundamental characteristic of deep learning is that it can find interesting features in the training data on its own, without any need for manual feature engineering, and this can only be achieved when lots of training examples are available. This is especially true for problems where the input samples are very high-dimensional, like images.

But what constitutes lots of samples is relative – relative to the size and depth of the network you’re trying to train, for starters. It isn’t possible to train a convnet to solve a complex problem with just a few tens of samples, but a few hundred can potentially suffice if the model is small and well regularized and the task is simple. Because convnets learn local, translation-invariant features, they’re highly data efficient on perceptual problems. Training a convnet from scratch on a very small image dataset will still yield reasonable results despite a relative lack of data, without the need for any custom feature engineering. You’ll see this in action in this section.

What’s more, deep-learning models are by nature highly repurposable: you can take, say, an image-classification or speech-to-text model trained on a large-scale dataset and reuse it on a significantly different problem with only minor changes. Specifically, in the case of computer vision, many pretrained models (usually trained on the ImageNet dataset) are now publicly available for download and can be used to bootstrap powerful vision models out of very little data. That’s what you’ll do in the next section.
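As a preview of those two techniques, here is a minimal sketch of how a pretrained convolutional base can be loaded and reused with the R keras package. The choice of VGG16, the 150 × 150 input size, the layer names, and the learning rates are illustrative assumptions, not settings prescribed above:

```r
library(keras)

# Instantiate a convolutional base pretrained on ImageNet. VGG16 is one
# of several architectures shipped with keras; include_top = FALSE drops
# ImageNet's own classifier, and 150 x 150 is an assumed input size.
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(150, 150, 3)
)

# Technique 2, feature extraction: freeze the pretrained base and train
# only a small classifier stacked on top of it.
model <- keras_model_sequential() %>%
  conv_base %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

freeze_weights(conv_base)

model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 2e-5),
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)
# ... fit the classifier on the small dataset here ...

# Technique 3, fine-tuning: once the classifier has converged, unfreeze
# the top of the base (the cut point is a design choice) and continue
# training with a very low learning rate.
unfreeze_weights(conv_base, from = "block5_conv1")
model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)
```

The key design choice is how much of the base to unfreeze: more layers mean more capacity to adapt to the new task, but also more risk of overfitting when only 2,000 training images are available.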
Let’s start by getting your hands on the data.

The Dogs vs. Cats dataset that you’ll use isn’t packaged with Keras. It was made available by Kaggle as part of a computer-vision competition in late 2013, back when convnets weren’t mainstream. You can download the original dataset from Kaggle (you’ll need to create a Kaggle account if you don’t already have one – don’t worry, the process is painless).

The pictures are medium-resolution color JPEGs. Unsurprisingly, the dogs-versus-cats Kaggle competition in 2013 was won by entrants who used convnets. The best entries achieved up to 95% accuracy. Below, you’ll end up with 97% accuracy, even though you’ll train your models on less than 10% of the data that was available to the competitors.

This dataset contains 25,000 images of dogs and cats (12,500 from each class) and is 543 MB (compressed). After downloading and uncompressing it, you’ll create a new dataset containing three subsets: a training set with 1,000 samples of each class, a validation set with 500 samples of each class, and a test set with 500 samples of each class.
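That reorganization takes only a few lines of base R. A minimal sketch, assuming the uncompressed Kaggle images sit in `~/Downloads/kaggle_original_data` and the smaller dataset should live in `~/Downloads/cats_and_dogs_small` (both paths are hypothetical; adjust them to your machine):

```r
# Illustrative paths -- point these at your own download location.
original_dataset_dir <- "~/Downloads/kaggle_original_data"
base_dir <- "~/Downloads/cats_and_dogs_small"
dir.create(base_dir)

# Build train/validation/test directories, each with a cats/ and dogs/
# subdirectory, and copy the corresponding slice of images into each.
# The original Kaggle files are named like "cat.0.jpg" and "dog.0.jpg".
splits <- list(train = 1:1000, validation = 1001:1500, test = 1501:2000)
for (split in names(splits)) {
  for (animal in c("cats", "dogs")) {
    target_dir <- file.path(base_dir, split, animal)
    dir.create(target_dir, recursive = TRUE)
    prefix <- sub("s$", "", animal)   # "cats" -> "cat"
    fnames <- paste0(prefix, ".", splits[[split]], ".jpg")
    file.copy(file.path(original_dataset_dir, fnames), target_dir)
  }
}

# Sanity check: should report 1,000 images per class for training and
# 500 per class for validation and test.
length(list.files(file.path(base_dir, "train", "cats")))
```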
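Once those directories exist, Keras can read them directly during training. A short sketch of that step, again assuming 150 × 150 inputs and batches of 20:

```r
# Stream images straight from the directory structure created above,
# rescaling pixel values from [0, 255] to [0, 1].
train_datagen <- image_data_generator(rescale = 1/255)

train_generator <- flow_images_from_directory(
  file.path(base_dir, "train"),
  train_datagen,
  target_size = c(150, 150),  # assumed size, matching the model input
  batch_size = 20,
  class_mode = "binary"       # two classes: cats and dogs
)
```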