Code360 powered by Coding Ninjas X Naukri.com
Last Updated: Mar 27, 2024
Difficulty: Medium

Neural Style Transfer with TensorFlow

Introduction

Interest in digital art keeps growing, and many people now use software tools to make images more attractive and creative. AI offers a new way of styling, designing, and working with those images.

This blog discusses Neural Style Transfer with TensorFlow. Let's start with the definition of Neural Style Transfer.

Neural Style Transfer

Neural Style Transfer is a fun application of AI in the field of art. It takes two or more images and blends them into a new image: the content of one image is redrawn in the visual style of another.

Deep learning has already brought major advances in face recognition and object detection, using techniques such as one-shot learning. Creative image generation with Neural Style Transfer likewise has good scope in the future.

Let us now check the complete process step by step.

Components of Neural Style Transfer

Neural Style Transfer has three main components:

  • Primary Image: Also known as the content image. It is the base image on which the complete artwork relies and to which the modifications are applied. It is denoted by the letter (c)

  • Modification Picture: Also known as the style image. It supplies the variation that you add to your art and leads the art to a new design. It is denoted by the letter (s)

  • Generated Image: The final output image produced by the Neural Style Transfer algorithm. It is the combination of the primary image and the style picture. It is denoted by the letter (g)

Working of Neural Style Transfer

We will use the VGG-19 model for Neural Style Transfer with TensorFlow. As discussed earlier, there are three components: the content, style, and generated images. VGG-19 is a deep convolutional neural network (conv net), a type of model used to identify patterns in images.

The generated image is usually first treated as noise (the implementation in this article starts from a copy of the content image instead). The task is to update the generated image during optimization until it becomes a mixture of the content and style images. VGG-19 is used without its output and dense layers, which is what include_top=False does in the code. Only the convolutional feature maps are needed for the losses, and without the dense layers the network accepts images of any size.

Formulas of Neural Style Transfer

Neural Style Transfer relies on three loss formulas: the content loss, the style loss, and the total loss. Let us discuss each of them briefly.

Content Loss

The content loss is a squared-error loss. It measures the difference between the feature maps that a chosen content layer produces for the generated image and for the original (content) image.

Suppose p and x are the original image and the generated image, and P^l and F^l are their respective feature representations in layer l. The squared-error loss between the two feature representations is defined as:

$$L_{content}(p, x, l) = \frac{1}{2} \sum_{i,j} \left(F^{l}_{ij} - P^{l}_{ij}\right)^{2}$$
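To make the content loss concrete, here is a minimal NumPy sketch. The function name content_loss and the toy feature maps F and P are illustrative assumptions, not part of the tutorial's code. Note that the tutorial's coste_contenido function omits the factor of ½, which only rescales the loss.

```python
import numpy as np

def content_loss(generated_features, content_features):
    """Half the sum of squared differences between two feature maps."""
    diff = generated_features - content_features
    return 0.5 * np.sum(diff ** 2)

# Toy feature maps: 2 filters (rows) x 3 spatial positions (columns).
F = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 0.0]])  # features of the generated image
P = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 2.0]])  # features of the content image

loss = content_loss(F, P)
print(loss)  # 0.5 * (2**2 + (-2)**2) = 4.0
```

The loss is zero only when both feature maps are identical, which is why minimizing it pulls the generated image toward the content image.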

Style Loss

To compute the style cost, we first calculate the Gram matrix. The Gram matrix holds the inner products between the vectorized feature maps of a given layer:

$$G^{l}_{ij} = \sum_{k} F^{l}_{ik} F^{l}_{jk}$$

Here, $G^{l}_{ij}$ is the inner product between the vectorized feature maps i and j of layer l. The style loss compares the Gram matrices of the style image and the generated image.
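As a small worked example of the Gram matrix formula, the sketch below uses NumPy on a hand-made feature matrix. The name gram and the values in F are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def gram(features):
    """features has shape (num_filters, num_positions).

    Entry (i, j) of the result is the inner product of the
    vectorized feature maps i and j.
    """
    return features @ features.T

# Two vectorized feature maps with three spatial positions each.
F = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

G = gram(F)
print(G)
# [[5. 2.]
#  [2. 2.]]
```

The Gram matrix is symmetric and discards spatial positions, so it describes which features occur together rather than where they occur.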

Total Loss

The total loss of Neural Style Transfer is a linear combination of the content loss and the style loss.

It is calculated as follows:

$$L_{total}(p, a, x) = \alpha \, L_{content}(p, x) + \beta \, L_{style}(a, x)$$

In the above formula, p is the content image, a is the style image, and x is the generated image. α and β are the weight factors for the content loss and the style loss, respectively.
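The weighted sum can be sketched in a few lines of plain Python. The function name total_loss and the numeric values are hypothetical. They only show how the ratio of α to β shifts the balance between content and style.

```python
def total_loss(content_loss, style_loss, alpha, beta):
    """Weighted sum of the content and style losses."""
    return alpha * content_loss + beta * style_loss

# Hypothetical loss values for one generated image.
content_value = 4.0
style_value = 0.5

# A larger beta makes the style term count for more.
balanced = total_loss(content_value, style_value, alpha=1.0, beta=1.0)
style_heavy = total_loss(content_value, style_value, alpha=1.0, beta=10.0)
print(balanced, style_heavy)  # 4.5 9.0
```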

Implementation of Neural Style Transfer

Let us now look at the implementation of the Neural Style Transfer with TensorFlow. To do so, we need two images that will be our primary and modification images. 

Content Image:

[Image: content image]

Style Image:

[Image: style image]

Importing Libraries

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# import VGG 19 model and keras Model API
from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Model

 

Explanation:

Here we have imported all the important libraries that are needed for the Neural Style Transfer with TensorFlow.

Loading Images

content_path = tf.keras.utils.get_file('content.jpg',
                                       'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg')

style_path = tf.keras.utils.get_file('style.jpg',
                                     'https://storage.googleapis.com/download.tensorflow.org/example_images/Vassily_Kandinsky%2C_1913_-_Composition_7.jpg')

 

Explanation:

Here, we have downloaded two images. content_path and style_path hold the local file paths of the content image and the style image, respectively.

Creating Functions

In this step, we will create a few helper functions that compute the content cost, the style cost, and the Gram matrix used by the loss function.

coste_contenido (content cost):

def coste_contenido(base, combination):
    # Sum of squared differences between the content and generated feature maps
    return tf.reduce_sum(tf.square(combination - base))


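coste_estilo (style cost):

def coste_estilo(style, combination):
    # Compare the Gram matrices of the style and generated feature maps.
    # Relies on gram_matrix (defined below) and on the globals img_nrows
    # and img_ncols (set in the training section).
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_nrows * img_ncols
    return tf.reduce_sum(tf.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))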

 

Gram Matrix:

def gram_matrix(x):
    x = tf.transpose(x, (2, 0, 1))
    features = tf.reshape(x, (tf.shape(x)[0], -1))
    gram = tf.matmul(features, tf.transpose(features))
    return gram
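To check how the helpers above treat a (rows, cols, channels) feature map, here is a NumPy-only sketch that mirrors gram_matrix and coste_estilo. The names gram_matrix_np and style_cost_np, the random inputs, and the explicit img_nrows and img_ncols arguments are assumptions made so the example runs without TensorFlow.

```python
import numpy as np

def gram_matrix_np(x):
    """x has shape (rows, cols, channels); returns (channels, channels)."""
    channels_first = np.transpose(x, (2, 0, 1))
    features = channels_first.reshape(channels_first.shape[0], -1)
    return features @ features.T

def style_cost_np(style, combination, img_nrows, img_ncols, channels=3):
    """Normalized squared difference between two Gram matrices."""
    S = gram_matrix_np(style)
    C = gram_matrix_np(combination)
    size = img_nrows * img_ncols
    return np.sum((S - C) ** 2) / (4.0 * (channels ** 2) * (size ** 2))

rng = np.random.default_rng(0)
style_features = rng.normal(size=(4, 5, 8))     # 4x5 map with 8 filters
generated_features = rng.normal(size=(4, 5, 8))

G = gram_matrix_np(style_features)
same = style_cost_np(style_features, style_features, 4, 5)
different = style_cost_np(style_features, generated_features, 4, 5)
print(G.shape, same, different > 0)
```

The Gram matrix has one row and one column per filter, whatever the spatial size, and the cost is zero when both inputs have identical Gram matrices.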

Initializing VGG Model

Now our task is to download and initialize the VGG-19 model with weights pretrained on the ImageNet dataset.

# VGG model initializing
model = VGG19(
    include_top=False,
    weights='imagenet'
)
# set training to False
model.trainable = False
# Print details of different layers

model.summary()


Output:

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5
80134624/80134624 [==============================] - 25s 0us/step
Model: "vgg19"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, None, 3)]   0         
                                                                 
 block1_conv1 (Conv2D)       (None, None, None, 64)    1792      
                                                                 
 block1_conv2 (Conv2D)       (None, None, None, 64)    36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, None, None, 64)    0         
                                                                 
 block2_conv1 (Conv2D)       (None, None, None, 128)   73856     
                                                                 
 block2_conv2 (Conv2D)       (None, None, None, 128)   147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, None, None, 128)   0         
                                                                 
 block3_conv1 (Conv2D)       (None, None, None, 256)   295168    
                                                                 
 block3_conv2 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_conv3 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_conv4 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, None, None, 256)   0         
                                                                 
 block4_conv1 (Conv2D)       (None, None, None, 512)   1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_conv4 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
 block5_conv1 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv4 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
=================================================================
Total params: 20,024,384
Trainable params: 0
Non-trainable params: 20,024,384
_________________________________________________________________


Explanation:

Once you run the above code, you will see something like this, which means the pretrained weights have been downloaded successfully and the model is ready to use. All 20,024,384 parameters are non-trainable because we set model.trainable to False.

Loss Function Calculation

Now we create a feature extractor that returns the activations of every layer of the model for a given input. We will use these activations to compute both the content loss and the style loss.

from tensorflow.keras.models import Model

# Map each layer name to its output tensor
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
# Model that returns the activations of all layers as a dict
feature_extractor = Model(inputs=model.inputs, outputs=outputs_dict)

 

Now we define which layers are used for the style loss and which layer is used for the content loss, along with the weight of each loss.

capas_estilo = [
    "block1_conv1",
    "block2_conv1",
    "block3_conv1",
    "block4_conv1",
    "block5_conv1",
]

capas_contenido = "block5_conv2"

content_weight = 2.5e-8
style_weight = 1e-6

def loss_function(combination_image, base_image, style_reference_image):

    # Combine all three images into a single tensor (one batch).
    input_tensor = tf.concat(
        [base_image, style_reference_image, combination_image], axis=0
    )

    # Now, get the values in all the layers for the three images.
    features = feature_extractor(input_tensor)

    # Initialize the loss

    loss = tf.zeros(shape=())

    # Now just extract the content layers + content loss
    layer_features = features[capas_contenido]
    base_image_features = layer_features[0, :, :, :]
    combination_features = layer_features[2, :, :, :]

    loss = loss + content_weight * coste_contenido(
        base_image_features, combination_features
    )
    # At last, extract the style layers + style loss
    for layer_name in capas_estilo:
        layer_features = features[layer_name]
        style_reference_features = layer_features[1, :, :, :]
        combination_features = layer_features[2, :, :, :]
        sl = coste_estilo(style_reference_features, combination_features)
        loss += (style_weight / len(capas_estilo)) * sl

    return loss

 

Explanation:

In the above code,

  • First, we concatenate the content, style, and generated images into a single tensor

  • Next, we pass that tensor through the feature extractor to get the activations of every layer for all three images. Keeping all layers available makes it easy to change which layers are used at any point

  • Then, we initialize the loss to zero, so that the content and style terms can be added to it

  • We add the content loss, computed from the content layer features of the content image and the generated image

  • Finally, we add the style loss of each style layer, computed between the style image and the generated image, and return the total loss
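The batching step described above can be illustrated with NumPy alone. The tiny 4x4 arrays below are hypothetical stand-ins for the real images. They only show why index 0, 1, and 2 along the first axis select the content, style, and generated image.

```python
import numpy as np

# Hypothetical 4x4 RGB "images", each with a leading batch dimension of 1.
base_image = np.zeros((1, 4, 4, 3))
style_reference_image = np.ones((1, 4, 4, 3))
combination_image = np.full((1, 4, 4, 3), 2.0)

# Stack along the batch axis, in the same order used by loss_function.
batch = np.concatenate(
    [base_image, style_reference_image, combination_image], axis=0
)

# Index 0 -> content image, 1 -> style image, 2 -> generated image.
content_part = batch[0, :, :, :]
style_part = batch[1, :, :, :]
generated_part = batch[2, :, :, :]
print(batch.shape)  # (3, 4, 4, 3)
```

Sending all three images through the network in one batch means one forward pass instead of three.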

Derivatives Calculations

Now we need to compute the gradients of the loss with respect to the generated image. The gradient descent optimizer uses these gradients to update the image step by step toward a lower loss.

import tensorflow as tf
@tf.function
def compute_loss_and_grads(combination_image, base_image, style_reference_image):
    with tf.GradientTape() as tape:
        loss = loss_function(combination_image, base_image, style_reference_image)
    grads = tape.gradient(loss, combination_image)
    return loss, grads

 

Explanation:

After this step, the loss and gradient computations are in place. Two important tasks remain: preprocessing and deprocessing the images.
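As a side note on how the gradients are used, here is a toy NumPy sketch of gradient descent on a squared-error loss. The names target, image, and learning_rate, and the analytic gradient, are assumptions for illustration. The tutorial itself obtains the gradients automatically with tf.GradientTape.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.2, 0.8, 0.5])  # stands in for the content features
image = rng.normal(size=3)          # generated "image", starts as noise

def loss_and_grad(x):
    """Return 0.5 * sum((x - target)^2) and its gradient with respect to x."""
    diff = x - target
    return 0.5 * np.sum(diff ** 2), diff

learning_rate = 0.1
for step in range(200):
    loss, grad = loss_and_grad(image)
    # Move the image a small step against the gradient.
    image = image - learning_rate * grad

final_loss, _ = loss_and_grad(image)
```

Each update shrinks the gap to the target by a constant factor, so the loss approaches zero. Style transfer follows the same loop, but with the combined content and style loss.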

Preprocessing

Preprocessing converts the images into the format that the network requires. Five functions are involved in it, which are as follows.

  • load_img
  • img_to_array
  • expand_dims
  • preprocess_input
  • convert_to_tensor
     

Let us perform the preprocessing now.

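import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import vgg19

def preprocess_image(image_path):
    # img_nrows and img_ncols are globals set in the training section
    img = keras.preprocessing.image.load_img(
        image_path, target_size=(img_nrows, img_ncols)
    )
    img = keras.preprocessing.image.img_to_array(img)
    # Add a batch dimension: (rows, cols, 3) -> (1, rows, cols, 3)
    img = np.expand_dims(img, axis=0)
    # Convert RGB to BGR and subtract the ImageNet channel means
    img = vgg19.preprocess_input(img)
    return tf.convert_to_tensor(img)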


The preprocessing of the image is completed here. Now we will do the deprocessing of the images.

Deprocessing

Deprocessing is the exact reverse of preprocessing. To perform the deprocessing, we follow the given steps.

  • Reshape the tensor values into an image-shaped array

  • Undo the zero-centering by adding the ImageNet channel means back to each channel

  • Convert the image from BGR to RGB

  • Clip the values so that they lie between 0 and 255
     
def deprocess_image(x):
    # Convert the tensor values into an image-shaped array
    x = x.reshape((img_nrows, img_ncols, 3))

    # Undo the zero-centering by adding back the ImageNet channel means (BGR order)
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68

    # Convert from BGR to RGB
    x = x[:, :, ::-1]

    # Make sure the values lie between 0 and 255
    x = np.clip(x, 0, 255).astype("uint8")
    return x
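To see that deprocessing really undoes preprocessing, the NumPy sketch below reimplements both steps on a tiny image. The function names preprocess and deprocess are hypothetical. preprocess mimics what vgg19.preprocess_input does (RGB to BGR, then mean subtraction) without calling TensorFlow, and the rounding before the uint8 cast is an addition that guards against floating-point error.

```python
import numpy as np

# ImageNet channel means in BGR order, as used in deprocess_image.
BGR_MEANS = np.array([103.939, 116.779, 123.68])

def preprocess(rgb):
    """RGB uint8 image -> zero-centered BGR float image."""
    bgr = rgb[..., ::-1].astype("float64")
    return bgr - BGR_MEANS

def deprocess(x):
    """Zero-centered BGR float image -> RGB uint8 image."""
    bgr = x + BGR_MEANS
    rgb = bgr[..., ::-1]
    # Round before casting so values like 9.999999 do not truncate to 9.
    return np.clip(np.rint(rgb), 0, 255).astype("uint8")

# A hypothetical 1x2 RGB image.
rgb = np.array([[[255, 0, 10], [12, 200, 128]]], dtype="uint8")
restored = deprocess(preprocess(rgb))
print(np.array_equal(restored, rgb))  # True
```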

 

The task is completed now. We have performed the preprocessing and deprocessing. It's time to save the generated images.

Saving Generated Images

The code below defines a function that deprocesses the generated image and saves it to disk.

from datetime import datetime

def result_saver(iteration):
    # Build a file name from the iteration number and a timestamp
    now = datetime.now().strftime("%Y%m%d_%H%M%S")
    image_name = str(iteration) + '_' + now + "_image" + '.png'

    # Deprocess the current generated image (global combination_image) and save it
    img = deprocess_image(combination_image.numpy())
    keras.preprocessing.image.save_img(image_name, img)

Train the Model

Now we run the optimization loop that produces the stylized image. Note that the VGG-19 weights stay frozen. What gets updated at every step is the generated image itself.

from tensorflow import keras
from tensorflow.keras.optimizers import SGD

width, height = tf.keras.utils.load_img(content_path).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)

optimizer = SGD(
    keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=100.0, decay_steps=100, decay_rate=0.96
    )
)

base_image = preprocess_image(content_path)
style_reference_image = preprocess_image(style_path)
combination_image = tf.Variable(preprocess_image(content_path))

iterations = 4000

for i in range(1, iterations + 1):
    loss, grads = compute_loss_and_grads(
        combination_image, base_image, style_reference_image
    )
    optimizer.apply_gradients([(grads, combination_image)])
    if i % 10 == 0:
        print("Iteration %d: loss=%.2f" % (i, loss))
        result_saver(i)


Output:

[Image: output]

Explanation:

The loop runs for 4000 iterations, which can take a long time. Every 10 iterations it prints the loss and saves the current image as a PNG file in the current working directory, so you can follow the progress there.
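The optimizer above uses an exponentially decaying learning rate. The plain-Python sketch below shows how the rate shrinks over the run, assuming the default (non-staircase) behaviour of ExponentialDecay. The function name exponential_decay is hypothetical and only mirrors that formula.

```python
def exponential_decay(step, initial_learning_rate=100.0,
                      decay_steps=100, decay_rate=0.96):
    """Learning rate after `step` optimizer updates (continuous decay)."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)

# Learning rate at the start, after 100 steps, and at the last iteration.
for step in (0, 100, 4000):
    print(step, exponential_decay(step))
```

With these settings the rate drops by 4% every 100 steps, so the last iterations make much smaller changes to the image than the first ones.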

 

Final Result:

[Image: final result]

Frequently Asked Questions

Define Neural Style Transfer with TensorFlow.

Neural Style Transfer is a fun application of AI in the field of art. It takes two or more images and blends them into a new image: the content of one image is redrawn in the visual style of another.

What are the main components of the Neural Style Transfer with TensorFlow?

There are mainly three components present in the Neural Style Transfer, which are the Content image, the Style image, and the Generated image.

Define TensorFlow.

TensorFlow is an open-source library for numerical computation and a platform for large-scale machine learning.

Conclusion

This article discusses the topic of Neural Style Transfer with TensorFlow. In detail, we have seen the definition of Neural Style Transfer, its components, working, formulas, and implementation with explanation.

We hope this blog has helped you enhance your knowledge of Neural Style Transfer with TensorFlow. If you want to learn more, then check out our articles.

And many more on our platform CodeStudio.

Refer to our Guided Path to upskill yourself in DSA, Competitive Programming, JavaScript, System Design, and many more! If you want to test your competency in coding, you may check out the mock test series and participate in the contests hosted on CodeStudio!

But suppose you have just started your learning process and are looking for questions from tech giants like Amazon, Microsoft, Uber, etc. In that case, you must look at the problems, interview experiences, and interview bundles for placement preparations.

However, you may consider our paid courses to give your career an edge over others!

Happy Learning!
