Skills

Deep Convolutional Neural Networks, Yolo V8, Deep Learning, Tensorflow, PyTorch, Image Processing, Automation, Languages Command, ANN, CNN, Neural Networks, Computer Vision

Project Detail

Objective:

Submit a research-style report, using the provided Jupyter Notebook, in which you build a neural network, carry out three small experiments, and present and discuss the results.

Task Summary:

The main task is to develop a deep convolutional neural network to perform multi-class classification. The dataset for this assignment is the CIFAR10 dataset (link), which is also available from the torchvision library in PyTorch (link). The dataset is made up of 32x32 colour images, split into a training set of 50,000 images and a test set of 10,000 images. The images have labels (one label per image) from 10 classes: 1) airplane; 2) automobile; 3) bird; 4) cat; 5) deer; 6) dog; 7) frog; 8) horse; 9) ship; 10) truck. The network should be trained to predict the labels using the training data, and its generalisation performance should be evaluated using the test data. You will be asked to perform three experiments in which you analyse the influence of different methods on the performance and functionality of your neural network.

In this assignment, you need to implement a deep convolutional neural network, which you will later build upon. The baseline version of your CNN must have an input, three convolutional hidden layers, then two fully connected layers, where the second fully connected layer is 10-dimensional and provides the output classifications of your network. Each convolutional layer block must include pooling and a non-linear activation function of your choice. You can use PyTorch (recommended) or any other deep learning framework with automatic differentiation (e.g. TensorFlow). The neural network should be trained to classify the images from the CIFAR10 dataset. You can use the modules available in packages like PyTorch to load the dataset (e.g. in PyTorch, you can use torchvision.datasets.CIFAR10) and to build the layers in your model. The assignment is split into three experiments:
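A minimal sketch of the baseline described above, assuming PyTorch. The channel counts, kernel sizes, and hidden layer width are illustrative assumptions, not requirements of the brief:

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

class BaselineCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Three convolutional blocks, each with pooling and a non-linearity.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16x16 -> 8x8
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 8x8 -> 4x4
        )
        # Two fully connected layers; the second is 10-dimensional (one unit per class).
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Loading CIFAR10 via torchvision, as suggested in the brief.
transform = transforms.ToTensor()
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
```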

Experiment 1:

Choose an appropriate split of the training data to include a subset used for validation. Investigate the effect that the learning rate has on your model's performance, i.e. its classification accuracy or error rate. Compare the performance of your model for 5 different learning rates. The learning rates you choose must be high enough to allow the performance on the validation data to saturate, and low enough to prevent your model from becoming unstable. For each learning rate, run the model at least 5 times, using different seeds to initialise any random number generators (e.g. for weight initialisation) so that each run of the model produces different results. You can then use the average of the model runs to plot the model's mean performance with respect to the training episodes for the training and validation data (these plots are also called learning curves). Using these results, design a learning rate scheduler that reduces the learning rate during training, again running the model at least 5 times to build up a picture of its average performance. Plot the learning curves for this new version of the model.
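A hedged sketch of the seeded multi-run protocol, building on the BaselineCNN sketch above. The learning rate values, the 45,000/5,000 validation split, the epoch count, and the train_one_epoch/evaluate helpers are all placeholder assumptions standing in for your own choices and training loop:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, random_split

learning_rates = [0.1, 0.03, 0.01, 0.003, 0.001]  # illustrative values, not prescribed
num_seeds, num_epochs = 5, 30                     # assumed run/episode counts

for lr in learning_rates:
    val_curves = []
    for seed in range(num_seeds):
        torch.manual_seed(seed)  # a different seed per run, so that weight
        np.random.seed(seed)     # initialisation (etc.) differs between runs
        # Hold out part of the 50,000 training images for validation.
        train_subset, val_subset = random_split(
            train_set, [45000, 5000],
            generator=torch.Generator().manual_seed(seed))
        model = BaselineCNN()
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        curve = []
        for epoch in range(num_epochs):
            # train_one_epoch and evaluate are placeholders for your own loops.
            train_one_epoch(model, optimizer,
                            DataLoader(train_subset, batch_size=128, shuffle=True))
            curve.append(evaluate(model, DataLoader(val_subset, batch_size=256)))
        val_curves.append(curve)
    mean_curve = np.mean(val_curves, axis=0)  # averaged validation learning curve
```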

Compare the learning rate scheduler model against the best performing model that didn't use a scheduler (out of the five you tested previously), both in terms of performance on the test data and in terms of the generalisation gap between training and validation data.
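One possible scheduler for the second part of this experiment, continuing the sketch above (model, train_subset, and num_epochs as before). StepLR is just an example; the starting rate, step size, and decay factor are assumptions you would tune from your learning curves:

```python
model = BaselineCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.03)  # assumed starting rate
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer,
                    DataLoader(train_subset, batch_size=128, shuffle=True))
    scheduler.step()  # halve the learning rate every 10 episodes
```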

Experiment 2:

Investigate the impact that regularisation has on the performance of your model. First, split the training data into two halves. One half will be used for training, the other half for validation. Then implement Dropout in the fully connected layers of your model. Investigate how the dropout rate affects average performance on the test data, and the generalisation gap between the training and validation data. Choose 5 Dropout rates, one of which must have a value of zero (i.e. no Dropout). Remember to run each version of the model 5 times to compute the mean performance. Second, swap around your two datasets, so that the training data is now used for validation, and the validation data is now used for training, and investigate how your model performs in a transfer learning task. To do this, freeze all of the weights in the convolutional layers (i.e. prevent them from being updated during optimisation), re-initialise the weights in the fully connected layers, and retrain the two fully connected layers. Apply only two models to the transfer learning task: one without Dropout, and the other using a non-zero Dropout rate that gave the best average performance. How does the average performance of these retrained models compare with the average performance of the models in which all layers were trained together? Refer to their average performance on the test data, as well as the generalisation gap between the training data and validation data. Provide learning curve plots to back up your analyses of these models.
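A sketch of the two Experiment 2 ingredients, assuming the BaselineCNN from the first sketch: Dropout added to the fully connected layers only, followed by the freezing and re-initialisation steps for the transfer learning task. The dropout placement and rate here are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DropoutCNN(BaselineCNN):
    """BaselineCNN with Dropout in the fully connected layers only."""
    def __init__(self, num_classes=10, p=0.5):  # p is one of your 5 dropout rates
        super().__init__(num_classes)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p),
            nn.Linear(128 * 4 * 4, 256), nn.ReLU(),
            nn.Dropout(p),
            nn.Linear(256, num_classes),
        )

# Transfer learning: freeze all weights in the convolutional layers ...
model = DropoutCNN(p=0.5)
for param in model.features.parameters():
    param.requires_grad = False

# ... re-initialise the fully connected layers, then retrain only those layers.
for layer in model.classifier:
    if isinstance(layer, nn.Linear):
        layer.reset_parameters()

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01)
```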

Experiment 3:

Investigate the quality of gradient flow through your network. First, for the model that does not use Dropout, record gradients for the optimised parameters (i.e. dL/dw) in each layer of your model. Record the gradients over the first 5 episodes of training, and build a separate record of gradients over the final 5 episodes of training. Produce plots to show how the mean and standard deviation of gradients change as a function of layer number, for both the beginning (first 5 episodes) and the end (final 5 episodes) of training. Second, using the best performing, non-zero Dropout rate from Experiment 2, perform the same calculations to compute the mean and standard deviation of the gradients for all layers, and over the first and final 5 episodes of training. Does Dropout affect gradient flow in your model? If so, how are the gradients affected? Third, without including Dropout, add batch normalisation to each hidden layer of your model. Again, compute and plot the mean and standard deviation of the gradients as you had done before. How is gradient flow affected by batch normalisation in your model? Fourth, using learning curves for training and validation data, show how batch normalisation affects the performance of your model, referring to its average performance on the test data as well as to the generalisation gap between the average training and validation learning curves.
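A minimal sketch for recording per-layer gradient statistics, assuming PyTorch. record_gradients is a hypothetical helper: called after loss.backward(), it collects the mean and standard deviation of dL/dw for each parameter tensor; aggregating these over the first and final 5 episodes is left to your training loop:

```python
import torch

def record_gradients(model):
    """Return {parameter name: (grad mean, grad std)} after loss.backward()."""
    stats = {}
    for name, param in model.named_parameters():
        if param.grad is not None:
            g = param.grad.detach()
            stats[name] = (g.mean().item(), g.std().item())
    return stats

# For the batch normalisation variant, one assumed placement is after each
# convolution and after the first fully connected layer, e.g.:
#   nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2)
#   ...
#   nn.Linear(128 * 4 * 4, 256), nn.BatchNorm1d(256), nn.ReLU()
```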

Further details will be shared if you agree on this project.

Location:
Compensation:
PKR 30,000/fixed
Apply Before:
Jun 12, 2024
Posted On:
May 13, 2024

Caffeine Studio

· 1-10 employees
