NPTEL Deep Learning – IIT Ropar Week 7 Assignment Answers 2025
1. Which of the following statements about L2 regularization is true?
- It adds a penalty term to the loss function that is proportional to the absolute value of the weights
- It results in sparse solutions for w
- It adds a penalty term to the loss function that is proportional to the square of the weights
- It is equivalent to adding Gaussian noise to the weights
Answer :-
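As a minimal sketch of the two penalty forms named in the options for question 1, the snippet below computes an L2 penalty (sum of squared weights) next to an L1 penalty (sum of absolute weights). The weight vector `w` and the strength `lam` are made-up values for illustration, not part of the assignment.

```python
import numpy as np


def l2_penalty(w, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * np.sum(w ** 2)


def l1_penalty(w, lam):
    """L1 penalty: lam times the sum of absolute weights."""
    return lam * np.sum(np.abs(w))


# Hypothetical weights and regularization strength (illustrative only).
w = np.array([0.5, -2.0, 0.0, 1.5])
lam = 0.1

l2 = l2_penalty(w, lam)
l1 = l1_penalty(w, lam)

# Gradient of the L2 penalty with respect to w: it shrinks every weight
# in proportion to its size, so small weights are rarely driven to exactly 0.
l2_grad = 2 * lam * w

print(l2, l1, l2_grad)
```

The gradient line shows why the squared penalty tends to shrink weights smoothly rather than zero them out.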
2.

Answer :-
3.

Answer :-
4. Suppose that we apply Dropout regularization to a feedforward neural network. Suppose further that the mini-batch gradient descent algorithm is used for updating the parameters of the network. Choose the correct statement(s) from the following statements.
- The dropout probability p can be different for each hidden layer
- Batch gradient descent cannot be used to update the parameters of the network
- Dropout with p=0.5 acts as an ensemble regularizer
- The weights of the neurons which were dropped during the forward propagation at the t-th iteration will not get updated during the (t+1)-th iteration
Answer :-
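The mechanics behind the statements in question 4 can be sketched with a small "inverted dropout" routine. This is only an illustration under stated assumptions: the function names, layer shapes, and the drop probabilities 0.5 and 0.2 are hypothetical, and the rescaling by 1/(1 - p) is one common convention rather than something the assignment specifies.

```python
import numpy as np

rng = np.random.default_rng(0)


def dropout_forward(h, p_drop, rng):
    """Zero each unit with probability p_drop and rescale the survivors."""
    mask = (rng.random(h.shape) >= p_drop).astype(h.dtype)
    return h * mask / (1.0 - p_drop), mask


def dropout_backward(grad_out, mask, p_drop):
    """Gradients flow only through units kept in this forward pass."""
    return grad_out * mask / (1.0 - p_drop)


# Two hypothetical hidden layers, each with its own drop probability.
h1 = np.ones((4, 8))
h2 = np.ones((4, 6))
out1, mask1 = dropout_forward(h1, 0.5, rng)
out2, mask2 = dropout_forward(h2, 0.2, rng)

# Dropped units get zero gradient in THIS iteration only; a fresh mask is
# sampled on the next forward pass, so they may be active again at t+1.
grad1 = dropout_backward(np.ones_like(out1), mask1, 0.5)
```

Because a new mask is drawn at every iteration, each mini-batch effectively trains a different thinned sub-network that shares weights with all the others.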
5. We have trained four different models on the same dataset using various hyperparameters. The training and validation errors for each model are provided below. Based on this information, which model is likely to perform best on the test dataset?

- Model 1
- Model 2
- Model 3
- Model 4
Answer :-
Common Data Q6-Q9
Consider a function L(w,b) = 0.4w^2 + 7b^2 + 1 and its contour plot given below:

6. What is the value of L(w∗,b∗), where w∗ and b∗ are the values that minimize the function?
Answer :-
7. What is the sum of the elements of ∇L(w∗,b∗) ?
Answer :-
8. What is the determinant of H_L(w∗,b∗), where H_L is the Hessian of the function?
Answer :-
9. Compute the eigenvalues and eigenvectors of the Hessian. According to the eigenvalues of the Hessian, to which parameter is the loss more sensitive?
- b
- w
Answer :-
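The quantities asked for in questions 6 to 9 can be checked numerically. The sketch below assumes the function is L(w,b) = 0.4w^2 + 7b^2 + 1 (reading the "w2" and "b2" in the statement as squares) and writes out its gradient and Hessian by hand. The variable names are illustrative.

```python
import numpy as np


def loss(w, b):
    """L(w, b) = 0.4 w^2 + 7 b^2 + 1."""
    return 0.4 * w ** 2 + 7.0 * b ** 2 + 1.0


def grad(w, b):
    """Gradient [dL/dw, dL/db] = [0.8 w, 14 b]."""
    return np.array([0.8 * w, 14.0 * b])


# The Hessian is constant because L is quadratic.
hessian = np.array([[0.8, 0.0],
                    [0.0, 14.0]])

# Setting the gradient to zero gives the minimizer w* = 0, b* = 0.
w_star, b_star = 0.0, 0.0

min_value = loss(w_star, b_star)
grad_sum = grad(w_star, b_star).sum()
det_h = np.linalg.det(hessian)

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(hessian)

print(min_value, grad_sum, det_h, eigvals)
```

The larger eigenvalue belongs to the eigenvector along the b axis, which corresponds to the direction in which the contours are packed most tightly.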
10. Consider the problem of recognizing a letter of the English alphabet (in upper case or lower case) in an image. There are 26 letters in the alphabet. A team therefore decided to use a CNN to solve this problem. Suppose that data augmentation is being used for regularization. Which of the following transformation(s) on all the training images is (are) appropriate to the problem?
- Rotating the images by ±10°
- Rotating the images by ±180°
- Translating the images by 1 pixel in every direction
- Cropping
Answer :-
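Two of the transformations listed in question 10 are easy to sketch with NumPy alone. The snippet uses a toy 4x4 array in place of a real letter image, and the helper `translate` is a hypothetical name. Small-angle rotation and cropping are omitted because they need interpolation or a resizing policy, which would usually come from an image library.

```python
import numpy as np


def translate(img, dy, dx):
    """Shift a 2-D image by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = img.shape
    out = np.zeros_like(img)
    # Part of the source image that remains visible after the shift.
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out


# Toy stand-in for a grayscale letter image.
img = np.arange(16).reshape(4, 4)

# One-pixel shifts up, down, left and right: the label stays the same.
shifts = [translate(img, dy, dx) for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]]

# A 180-degree rotation is an exact array operation, but for letters it can
# change the class (for example, a rotated 'p' resembles 'd').
flipped = np.rot90(img, 2)
```

The design point is that an augmentation is only useful when it leaves the label unchanged, so each candidate transformation should be checked against the classes in the task.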