Identification of flower species using deep learning and PyTorch

Mohamed Fawas
May 17, 2022

Recently, I have been improving my computer vision skills by working on different CV projects. While doing so, I found an interesting dataset on Kaggle consisting of images of different flowers. In this project I built a deep learning model with PyTorch that identifies the flower species from an input image. Training benefits a lot from a GPU, so I recommend using Google Colab for this project.

This is the plan of action I followed for this project.

  1. Pick a dataset
  2. Download the dataset
  3. Import the dataset using PyTorch
  4. Explore the dataset
  5. Prepare the dataset for training
  6. Move the dataset to the GPU
  7. Define neural networks
  8. Train the model
  9. Make predictions on sample images
  10. Iterate on it with different networks and hyperparameters

Anyone can reproduce this work with the help of this python notebook.

Picking the dataset

I took this dataset from Kaggle; it is available here. Before starting the project, go through some of the image samples in the dataset to get an idea of how you are going to approach the problem.

Download the dataset

Now let us start on the coding part using Google Colab. Open Google Colab and change the hardware accelerator to GPU; with a GPU, image-related projects train much faster. Then import the necessary libraries. Initially we need the ‘opendatasets’ library, which helps us download the dataset from Kaggle.

import opendatasets as od

dataset_url = 'https://www.kaggle.com/datasets/alxmamaev/flowers-recognition'
od.download(dataset_url)

After downloading the dataset, check the number of images available for each flower class. To do this, run the code below.

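import os

data_dir = './flowers-recognition/flowers'  # assumed download location; adjust if the dataset was extracted elsewhere
for cls in os.listdir(data_dir):
    print(cls, ':', len(os.listdir(data_dir + '/' + cls)))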

This gives the following output:

tulip : 984 
rose : 784
sunflower : 733
dandelion : 1052
daisy : 764

Import the dataset using PyTorch

Now we need to import the downloaded dataset using PyTorch. For this, we first import the ‘ImageFolder’ class from the ‘torchvision’ library and then load the dataset with it, as shown below.

from torchvision.datasets import ImageFolder

dataset = ImageFolder(data_dir)

Explore the dataset

Now display an image from the dataset with the help of matplotlib, using the following code.

import matplotlib.pyplot as plt
%matplotlib inline

img, label = dataset[0]
plt.imshow(img)

For me it gave the following output:

Image output from the above code

Now we need to transform the images into PyTorch tensors so that the model can work with them. Also, the images in our dataset have different dimensions, so we should make them uniform. Here every image is resized so that its shorter side is 64 pixels and then randomly cropped to 64x64. Both steps can be performed with the following code.

import torchvision.transforms as tt

dataset = ImageFolder(data_dir, tt.Compose([tt.Resize(64),
                                            tt.RandomCrop(64),
                                            tt.ToTensor()]))

Prepare the dataset for training

Now we need to split our dataset into two parts: a training dataset and a validation dataset. Let us use 10% of the entire dataset as the validation dataset. The sizes can be computed with the following code.

val_pct = 0.1  # validation percentage
val_size = int(val_pct * len(dataset))  # length of the validation set
train_size = len(dataset) - val_size
train_size, val_size

Now let us randomly split the entire dataset into training and validation data. This can be done with the following code.

from torch.utils.data import random_split

train_ds, valid_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(valid_ds)
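The split above is different on every run. If you want a repeatable split, one option is to pass a seeded generator to random_split. The sketch below shows this on a toy dataset of 100 items; the toy data and the seed value 42 are assumptions for illustration and are not used in the original project.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset of 100 (feature, label) pairs standing in for the flower images
toy_ds = TensorDataset(torch.arange(100).float().unsqueeze(1),
                       torch.zeros(100, dtype=torch.long))

toy_val_size = int(0.1 * len(toy_ds))
toy_train_size = len(toy_ds) - toy_val_size

# A seeded generator makes the random split reproducible
gen = torch.Generator().manual_seed(42)
toy_train, toy_valid = random_split(toy_ds, [toy_train_size, toy_val_size], generator=gen)
print(len(toy_train), len(toy_valid))
```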

We can’t train the model by feeding the whole image dataset to it at once: it may not fit in GPU memory, and each weight update would take a long time. Instead, we feed the images to the model in batches, with the help of the ‘DataLoader’ class. The following code sets this up.

# to work with data in batches we need to define the batch size
batch_size = 256

# Also we need dataloaders to load the data in batches
from torch.utils.data.dataloader import DataLoader

train_dl = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4, pin_memory=True)
valid_dl = DataLoader(valid_ds, batch_size, num_workers=4, pin_memory=True)

Now let us make a grid of the images in one batch and display it.

from torchvision.utils import make_grid

def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break

Move the dataset to the GPU

Now we need to move the data to the GPU. For this we need a few helper functions, which are defined in the following code:

import torch

# Below function will pick the GPU if it is available otherwise it is gonna use the CPU
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

# Below function will move data to a particular device.
def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

# Below function is used to wrap our data loader
class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl:
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)

Now we need to verify whether we have access to CUDA or not.

torch.cuda.is_available()

If the above code returns ‘True’, a GPU device is available to us.

Now we can define the device we are going to use.

device = get_default_device()
device

If the code returns ‘device(type='cuda')’, we have access to the GPU.

Now we can wrap our data loaders using the DeviceDataLoader class.

train_dl = DeviceDataLoader(train_dl, device)
valid_dl = DeviceDataLoader(valid_dl, device)

With the above code, whenever batches are requested, those batches of images will be moved to the GPU.

Now, before creating a model we are going to define a class named ‘ImageClassificationBase’.
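The body of this class is not shown here, so the following is only a minimal sketch of what such a base class might look like. It assumes cross-entropy as the loss and accuracy as the validation metric. The method names (training_step, validation_step, validation_epoch_end, epoch_end) and the accuracy helper are assumptions, not taken from the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def accuracy(outputs, labels):
    """Fraction of predictions that match the labels (assumed helper)."""
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))


class ImageClassificationBase(nn.Module):
    """Adds training/validation helpers on top of nn.Module."""

    def training_step(self, batch):
        images, labels = batch
        out = self(images)                   # forward pass
        return F.cross_entropy(out, labels)  # training loss for this batch

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        loss = F.cross_entropy(out, labels)
        return {'val_loss': loss.detach(), 'val_acc': accuracy(out, labels)}

    def validation_epoch_end(self, outputs):
        # Average the per-batch metrics over the whole validation set
        epoch_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        epoch_acc = torch.stack([x['val_acc'] for x in outputs]).mean()
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], train_loss: {:.4f}, val_loss: {:.4f}, val_acc: {:.4f}".format(
            epoch, result['train_loss'], result['val_loss'], result['val_acc']))
```

A concrete model then only has to inherit from this class and implement forward.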

Now we can define the fit and evaluate functions.
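The two functions are not listed here, so this is a hedged sketch that matches the way they are called later in the article: evaluate(model, valid_dl) and fit(epochs, lr, model, train_dl, valid_dl, opt_func). It assumes the model provides training_step, validation_step, validation_epoch_end and epoch_end methods, and that the default optimizer is SGD. Both are assumptions.

```python
import torch


@torch.no_grad()
def evaluate(model, val_loader):
    """Run the model over the validation loader and return averaged metrics."""
    model.eval()  # switch batch norm / dropout to inference behaviour
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)


def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    """Train for `epochs` epochs and return one result dict per epoch."""
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training phase
        model.train()
        train_losses = []
        for batch in train_loader:
            loss = model.training_step(batch)
            train_losses.append(loss.detach())
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        result['train_loss'] = torch.stack(train_losses).mean().item()
        model.epoch_end(epoch, result)
        history.append(result)
    return history
```

Returning a list of per-epoch dictionaries makes it easy to append the results of several fit calls to one history list.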

Define neural networks

We can either create a model from scratch or we can use a pretrained model. If you create a model from scratch then you have to write the entire network structure of the model.

ResNet9 architecture explanation: the input goes into a convolutional layer, then a batch norm layer, and then a ReLU layer. The batch norm layer normalizes the activations, which stabilizes training and also has a mild regularizing effect that helps against overfitting. The ReLU layer introduces non-linearity.

We keep stacking such blocks. Some of them are followed by max pooling, which reduces the size of the feature map.
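The network definition itself is not shown, so here is a sketch of a ResNet9-style network built from the blocks described above. The channel widths, the conv_block helper and the adaptive pooling in the classifier are assumptions. To keep the sketch runnable on its own it subclasses nn.Module directly; in this project it would inherit from the ‘ImageClassificationBase’ class instead.

```python
import torch
import torch.nn as nn


def conv_block(in_channels, out_channels, pool=False):
    """Conv -> BatchNorm -> ReLU, optionally followed by 2x2 max pooling."""
    layers = [
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    ]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)


class ResNet9(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.conv1 = conv_block(in_channels, 64)
        self.conv2 = conv_block(64, 128, pool=True)
        self.res1 = nn.Sequential(conv_block(128, 128), conv_block(128, 128))
        self.conv3 = conv_block(128, 256, pool=True)
        self.conv4 = conv_block(256, 512, pool=True)
        self.res2 = nn.Sequential(conv_block(512, 512), conv_block(512, 512))
        self.classifier = nn.Sequential(
            nn.AdaptiveMaxPool2d(1),  # collapse the remaining feature map to 1x1
            nn.Flatten(),
            nn.Linear(512, num_classes),
        )

    def forward(self, xb):
        out = self.conv1(xb)
        out = self.conv2(out)
        out = self.res1(out) + out  # residual (skip) connection
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.res2(out) + out  # second residual connection
        return self.classifier(out)
```

The adaptive pooling layer is used here so that the classifier works regardless of the exact input resolution; a fixed-size max pool would work equally well for 64x64 inputs if its kernel size matches the final feature map.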

Now we can define the model.

model = to_device(ResNet9(3, len(dataset.classes)), device)
model

In the above code, the number of input channels is 3 because we have colour images with 3 channels. The number of output classes is given by len(dataset.classes), which is 5. Then we move the model to the device.

Pass one batch of input tensors through the model.

Whenever you set something up, check that it works. For example, after setting up the dataloader, check that you can load images from it and display them.

If you have just created a model, check that you can take a batch of data and run it through the model. If you are working with a GPU, also check that both the model’s weights and the images are actually on the GPU.

Practise these checks; they make errors much easier to find.

Always derisk your project by checking and testing things.

For loading images in batches, use the following code:
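A sanity check along these lines could look like the sketch below. The check_one_batch helper, the toy model and the toy batch are assumptions so that the snippet runs on its own; in the project you would pass the real model and train_dl instead.

```python
import torch
import torch.nn as nn


def check_one_batch(model, dl):
    """Run a single batch through the model and report shapes and devices."""
    images, labels = next(iter(dl))
    preds = model(images)
    print('images.shape :', images.shape)
    print('images.device:', images.device)
    print('model device :', next(model.parameters()).device)
    print('preds.shape  :', preds.shape)
    return images, preds


# Stand-in model and data (assumptions), placed on the GPU when one is available
toy_device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 5)).to(toy_device)
toy_dl = [(torch.randn(4, 3, 64, 64).to(toy_device),
           torch.randint(0, 5, (4,)).to(toy_device))]

toy_images, toy_preds = check_one_batch(toy_model, toy_dl)
```

The predictions should have one row per image and one column per class, and the images and the model parameters should report the same device.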

Train the model

The first thing to do is to evaluate the model on the validation data loader, because you want a benchmark of where the model starts. With 5 classes, a randomly initialized model should be at roughly 20% accuracy; if training still leaves it at about 20%, then something is wrong.

history = [evaluate(model, valid_dl)]
history

The output tells you the accuracy of the randomly initialized model. We have also recorded this result in a list called ‘history’. Later, we keep appending training results to ‘history’.

Now let’s call the fit function to train the model, and repeat the call several times.

history += fit(5, 0.001, model, train_dl, valid_dl, torch.optim.Adam)

Repeating the above step 3–4 times should noticeably improve the performance of your model.

As soon as the validation accuracy flattens out in the graph, it is a good idea to reduce the learning rate by a factor of 10. At that point the model is already near the optimum values but keeps bouncing around them because the learning rate is too high.

Once the validation accuracy has reached a good level, you can try using SGD instead of the Adam optimizer.

history += fit(5, 0.001, model, train_dl, valid_dl)

Now we can plot the accuracy and loss data. By analyzing the plots, we can decide whether to continue training or stop at that step.
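The plotting code is not included here. A minimal sketch, assuming that ‘history’ is a list of dictionaries with the keys 'val_acc', 'val_loss' and (after training) 'train_loss', could look like this. The function names plot_accuracies and plot_losses are assumptions.

```python
import matplotlib.pyplot as plt


def plot_accuracies(history):
    """Plot validation accuracy per epoch."""
    accuracies = [x['val_acc'] for x in history]
    fig, ax = plt.subplots()
    ax.plot(accuracies, '-x')
    ax.set_xlabel('epoch')
    ax.set_ylabel('accuracy')
    ax.set_title('Accuracy vs. No. of epochs')
    return ax


def plot_losses(history):
    """Plot training and validation loss per epoch."""
    # The first entry (evaluation before training) has no train_loss, so use NaN there
    train_losses = [x.get('train_loss', float('nan')) for x in history]
    val_losses = [x['val_loss'] for x in history]
    fig, ax = plt.subplots()
    ax.plot(train_losses, '-bx')
    ax.plot(val_losses, '-rx')
    ax.set_xlabel('epoch')
    ax.set_ylabel('loss')
    ax.legend(['Training', 'Validation'])
    ax.set_title('Loss vs. No. of epochs')
    return ax
```

Calling plot_accuracies(history) and plot_losses(history) after each round of training shows whether the curves are still improving.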

By plotting the accuracy and loss, I got the below diagrams as output.

In the above plot, the point at which the validation loss starts going up and the gap between training loss and validation loss widens is the point where overfitting starts to happen. A training loss that is lower than the validation loss does not by itself mean overfitting. If you train for long enough, the training loss will almost always end up below the validation loss, because the model has seen the training data and that data is used to perform gradient descent and change the weights, whereas the validation data is never used for training. Overfitting happens when the training loss keeps decreasing while the validation loss starts to increase. As long as both are decreasing, overfitting is not happening.

Make predictions on sample images

We take an image, convert it into a batch, pass it through the model, and get an output from the model. From that output we obtain the predicted label.

The function below moves the image to the GPU and converts it into a batch of one image, because our model operates on batches.

The function below takes an image and its label, and shows the model’s prediction next to the target label.
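Neither function is listed here, so the following is a sketch under stated assumptions. The name predict_image is hypothetical. The model, class list and device are passed in explicitly so the snippet runs on its own, whereas the show_image_prediction used in this article takes only the image and label and presumably reads the rest from the notebook’s global variables.

```python
import torch
import matplotlib.pyplot as plt


def predict_image(img, model, classes, device):
    """Return the predicted class name for one image tensor of shape (C, H, W)."""
    xb = img.unsqueeze(0).to(device)  # turn the image into a batch of one
    model.eval()
    with torch.no_grad():
        yb = model(xb)
    _, preds = torch.max(yb, dim=1)   # index of the highest score
    return classes[preds[0].item()]


def show_image_prediction(img, label, model, classes, device):
    """Display the image and print the target and predicted class names."""
    plt.imshow(img.permute(1, 2, 0))  # matplotlib expects (H, W, C)
    pred = predict_image(img, model, classes, device)
    print('Target:', classes[label])
    print('Prediction:', pred)
    return pred
```

With the real objects this could be called as show_image_prediction(*valid_ds[120], model, dataset.classes, device).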

Now you can check different predictions of the model and compare them with the target output. Do this by rerunning the code below with a different index each time.

show_image_prediction(*valid_ds[120])

Save the model

You can save the model using the code mentioned below:

torch.save(model.state_dict(),'flowers-resnet9.pth')
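To use the saved weights later, you create a model with the same architecture and load the state dictionary into it. The sketch below shows the round trip on a small stand-in model; the toy model and the temporary file path are assumptions. For the flower classifier you would create a fresh ResNet9 and load 'flowers-resnet9.pth' instead.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model (assumption); replace with the trained flower model
toy_model = nn.Linear(4, 5)
toy_path = os.path.join(tempfile.gettempdir(), 'toy-model.pth')
torch.save(toy_model.state_dict(), toy_path)

# Later: build a model with the same architecture and load the weights
restored = nn.Linear(4, 5)
state = torch.load(toy_path, map_location='cpu')  # map_location lets a GPU checkpoint load on CPU
restored.load_state_dict(state)
restored.eval()  # switch to inference mode before making predictions
```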
