Changing input size of pre-trained models in Keras

Keras is a useful API for deep learning that also includes various pre-trained models that you can use for transfer learning.

UPDATE! Now works with tf.keras!

The Keras API

Keras is a high-level API (Application Programming Interface) for deep learning. That is, it does not itself implement deep learning functionality but is built on top of existing deep learning frameworks and provides improved functionality, faster implementation cycles, and added features. One of its most useful features is that it provides access to a large pool of existing deep learning models which are already pre-trained (a rather time-consuming and computationally demanding process). Hence, it facilitates transfer learning, which is the process of re-purposing an existing deep learning model for another task.

One of the limitations of the existing approach is that the deep learning models come with a preconfigured architecture and support only particular input image sizes. But what if we want to use them with a smaller or larger image size?

Thankfully, there is a workaround for this, which I discovered recently. I will go through the process in this post and will also share a Colab link to the code so that anyone can freely play with it.

Deep dive into code

The main function that performs the modification of the network to support the new image size is the following:

The function that creates a new model based on the JSON specification

The function first changes the input shape parameters of the network. At this point the change has not yet been propagated to the internals of the model. To propagate it, we convert the Keras model to its JSON specification and then read it back, converting it to a Keras model again. Through this conversion the model is recreated using the provided tensor shape as input, so all the internal shapes are recomputed to fit this particular image size.
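A minimal sketch of a function along these lines, assuming tf.keras and a functional model such as the ones in tf.keras.applications. Editing the parsed JSON directly, the two possible key names for the input shape, and dropping build_config are my assumptions about how different Keras versions store this information, not necessarily the exact code from the original listing.

```python
import json

import tensorflow as tf


def change_model(model, new_input_shape=(None, 40, 40, 3), custom_objects=None):
    """Rebuild `model` for `new_input_shape` via a JSON round trip (sketch)."""
    # Export the architecture as a JSON specification and parse it.
    spec = json.loads(model.to_json())

    for layer_spec in spec["config"]["layers"]:
        # Assumption: Keras 3 may store the shape each layer was built with;
        # drop it so every layer is rebuilt for the new input size.
        layer_spec.pop("build_config", None)
        if layer_spec["class_name"] == "InputLayer":
            # Assumption: the key is "batch_input_shape" in tf.keras 2.x
            # and "batch_shape" in Keras 3.
            for key in ("batch_input_shape", "batch_shape"):
                if key in layer_spec["config"]:
                    layer_spec["config"][key] = list(new_input_shape)

    # Read the specification back; the model is recreated for the new shape.
    new_model = tf.keras.models.model_from_json(
        json.dumps(spec), custom_objects=custom_objects
    )

    # Copy the weights layer by layer, matching layers by name. Layers whose
    # weight shapes depend on the input size cannot be transferred.
    for layer in new_model.layers:
        try:
            layer.set_weights(model.get_layer(name=layer.name).get_weights())
        except ValueError:
            print(f"Could not transfer weights for layer {layer.name}")
    return new_model
```

Matching layers by name works here because the JSON round trip preserves layer names.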

Let's look at a real example next. Consider the MobileNet V1 model, which makes use of depthwise separable convolutions and is considered an efficient deep neural network for image understanding tasks (classification/detection/segmentation). When loading the model with ImageNet weights you can only specify a few input image sizes; otherwise an error is raised. By examining the input size of a loaded MobileNet model we observe that the default input size is 224x224.
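This behaviour can be checked with a short sketch, assuming tf.keras.applications.MobileNet. I build the default model with weights=None here only to avoid a download (the architecture is the same as with ImageNet weights), and the variable names are my own.

```python
import tensorflow as tf

# weights=None builds the same architecture without downloading anything;
# the article itself loads the model with weights="imagenet".
default_model = tf.keras.applications.MobileNet(weights=None)
print(default_model.input_shape)  # (None, 224, 224, 3)

# Requesting ImageNet weights (with the classification head) for a size
# such as 130x130 is rejected before any weights are fetched.
try:
    tf.keras.applications.MobileNet(weights="imagenet", input_shape=(130, 130, 3))
    rejected = False
except ValueError as err:
    rejected = True
    print("Rejected:", err)
```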

Initial MobileNet Structure with input 224x224

Using the change_model function with an input size of 130x130 (which is not among the default MobileNet input sizes) on the initial MobileNet model effectively changes its input image size.

new_model = change_model(MobileNet, new_input_shape=(None, 130, 130, 3))

Adapted MobileNet Structure for input size 130x130

Notice that the input size has been roughly halved, and so have the feature maps produced by the subsequent internal layers. The model has been adapted to the new input image size.
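To see where the smaller feature maps come from, here is a small arithmetic sketch. It assumes the layout of the Keras MobileNet V1 implementation: five stride-2 stages, the first using "same" padding and the rest padding one pixel at the bottom/right before a 3x3 "valid" convolution. The function name is hypothetical.

```python
def mobilenet_feature_map_sizes(input_size, num_stride2_stages=5):
    """Spatial size after each stride-2 stage (assumed MobileNet V1 layout)."""
    sizes = []
    size = input_size
    for stage in range(num_stride2_stages):
        if stage == 0:
            # First convolution: 3x3, stride 2, "same" padding -> ceil(n / 2).
            size = (size + 1) // 2
        else:
            # Pad one pixel, then 3x3 "valid" convolution with stride 2:
            # floor((n + 1 - 3) / 2) + 1, which equals floor(n / 2).
            size = (size + 1 - 3) // 2 + 1
        sizes.append(size)
    return sizes


print(mobilenet_feature_map_sizes(224))  # [112, 56, 28, 14, 7]
print(mobilenet_feature_map_sizes(130))  # [65, 32, 16, 8, 4]
```

Under these assumptions the final feature map shrinks from 7x7 to 4x4, while the number of channels, and therefore the convolution weights, stay the same.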

Let's test it on an input image. For this we use an image from the cifar10 dataset, which comes with Keras and features classes similar to those in ImageNet. This makes it easier to reproduce the results since everything is built into Keras. We load a truck image as shown below (image number 1 from the default cifar10 dataset as included in Keras). Since the cifar10 images are of size 32x32, we upscale the image to 130x130 and proceed to classify it with the modified MobileNet.
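The upscaling step could look roughly like the sketch below. The helper name and the choice of bilinear interpolation are assumptions on my part; the preprocessing call is the standard one for MobileNet in tf.keras and scales pixels to the range [-1, 1].

```python
import numpy as np
import tensorflow as tf


def prepare_for_mobilenet(image, target_size=130):
    """Upscale one HxWx3 uint8 image and scale it to MobileNet's input range."""
    # Add a batch dimension and switch to float for interpolation.
    batch = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.float32)
    # Bilinear interpolation is an assumption; any resize method would do.
    batch = tf.image.resize(batch, (target_size, target_size), method="bilinear")
    # MobileNet preprocessing maps pixel values from [0, 255] to [-1, 1].
    return tf.keras.applications.mobilenet.preprocess_input(batch.numpy())


# Usage sketch (downloads cifar10, assuming the image comes from the training split):
# (x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
# batch = prepare_for_mobilenet(x_train[1])
# predictions = new_model.predict(batch)
```

The resulting array has shape (1, 130, 130, 3) and can be passed directly to the adapted model.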

(top) cifar10 original image of size 32x32 (middle) resized image (bottom) MobileNet Predictions

Notice that the network outputs relevant labels even though the image is blurry and almost half the size of the images the network was trained on. This demonstrates that the weights have been loaded correctly and that the network retains its discrimination capabilities, which are useful for transfer learning.

Remarks

Keras is a powerful tool, and the pre-trained models it provides are an excellent starting point for deep learning projects. Re-configuring the input size allows for greater flexibility in choosing the best model. However, there are some pitfalls that should be considered. First, the original models have been trained on a particular image size, and changing the input size can affect the original classification accuracy. Also, the weights may not carry over when network layers are tied to a specific input size and have a hard-coded number of parameters, as is the case with fully connected layers. In such a case the original weights will not be loaded for those layers, and the fully connected layers will have a different number of neurons depending on the resulting feature map dimensions.

One way to alleviate this, since the main purpose is transfer learning, is to substitute the fully connected layers at the end with global average pooling operations, which do not depend on the width and height of the feature maps. As is the case with almost everything in the deep learning domain, the best results come after experimentation and empirical evidence. Enjoy Coding!!
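As a rough illustration of the global average pooling idea, the sketch below attaches a pooling head to a MobileNet feature extractor. The function name, the number of classes, and weights=None (used only to avoid a download) are assumptions for the sake of the example.

```python
import tensorflow as tf


def build_transfer_model(input_size, num_classes=10, weights=None):
    """MobileNet feature extractor with a size-independent pooling head."""
    base = tf.keras.applications.MobileNet(
        include_top=False,  # drop the original classification head
        weights=weights,    # None avoids a download in this sketch
        input_shape=(input_size, input_size, 3),
    )
    # Global average pooling collapses HxWxC feature maps to C values, so the
    # Dense layer below has the same number of parameters for any input size.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(base.input, outputs)


small = build_transfer_model(130)
large = build_transfer_model(224)
print(small.count_params() == large.count_params())  # True
```

Because both models have identical parameter shapes, weights can be moved between them without any layer being left out.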

PhD in Computer Engineering, Self-Driving Car Engineering Nanodegree, Computer Vision, Visual Perception and Computing
