(Continued) Week2- Neural Networks Basics #100DaysOfMLCode Challenge

*This post is part of my #100DaysOfMLCode challenge: learning about machine learning for 100 days in a row. The images come from the specialization course I am taking on Coursera.

*Please excuse my writing skills and the loosely organized information. This post is about what I understood from the videos I watched.

Coursera’s Deep Learning Specialization course homepage: https://www.coursera.org/learn/neural-networks-deep-learning#

Python and Vectorization


  • Vectorization is basically the art of getting rid of explicit for-loops.
  • Deep learning code often runs on relatively large datasets, so it needs to perform well. Vectorization plays a key role here.
  • Vectorized code runs faster than non-vectorized code.
  • The speedup comes from SIMD (Single Instruction, Multiple Data) instructions, which NumPy's built-in functions can use to process many values in parallel.
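
To make the speed claim above concrete, here is a minimal sketch that times a dot product written with an explicit for-loop against np.dot. The array size and the variable names are my own assumptions, and the exact timings will differ from machine to machine.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 200_000                      # assumed size, large enough to see a gap
a = rng.random(n)
b = rng.random(n)

# Non-vectorized: explicit Python for-loop over every element.
start = time.perf_counter()
c_loop = 0.0
for i in range(n):
    c_loop += a[i] * b[i]
loop_ms = 1000 * (time.perf_counter() - start)

# Vectorized: a single NumPy call.
start = time.perf_counter()
c_vec = np.dot(a, b)
vec_ms = 1000 * (time.perf_counter() - start)

print(f"for-loop:   {loop_ms:.2f} ms")
print(f"vectorized: {vec_ms:.2f} ms")
```

Both versions compute the same number; only the time taken differs.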

More Vectorization Examples

  • By avoiding explicit for-loops, your code will run significantly faster.
  • The rule of thumb to keep in mind: “whenever you are programming a neural network or logistic regression, avoid explicit for-loops whenever possible.”
  • Suppose you want to compute a vector ‘u’ as the product of a matrix A and another vector ‘v’. By the definition of matrix multiplication, u_i is the sum over j of A_ij * v_j.
  • The non-vectorized version needs two for-loops, one over i and one over j.
  • The vectorized version, u = np.dot(A, v), eliminates both for-loops.
  • Suppose you need to apply the exponential operation to every element of a vector or matrix.
  • The non-vectorized implementation loops over the elements one at a time, whereas the vectorized implementation eliminates the explicit for-loop.
  • NumPy functions such as np.exp apply the same operation to all the elements of a vector without an explicit for-loop.
  • Applying this to the gradient descent algorithm for logistic regression, we eliminated the second for-loop by using a vector in its place.
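
The two examples above (matrix-vector product and element-wise exponential) can be sketched as follows. The helper names matvec_loop and exp_loop and the small test matrix are made up for illustration; only np.dot and np.exp are real NumPy functions.

```python
import math
import numpy as np

def matvec_loop(A, v):
    """Non-vectorized u = A v, using two explicit for-loops."""
    n_rows, n_cols = A.shape
    u = np.zeros(n_rows)
    for i in range(n_rows):
        for j in range(n_cols):
            u[i] += A[i, j] * v[j]
    return u

def exp_loop(v):
    """Non-vectorized element-wise exponential."""
    out = np.zeros(len(v))
    for i in range(len(v)):
        out[i] = math.exp(v[i])
    return out

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
v = np.array([1.0, -1.0])

u_vec = np.dot(A, v)   # vectorized: no explicit loop
e_vec = np.exp(v)      # vectorized: applied to every element at once

print(matvec_loop(A, v), u_vec)
print(exp_loop(v), e_vec)
```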

Vectorizing Logistic Regression

  • Vectorizing logistic regression means implementing it without a single for-loop.
  • Predictions can be calculated in a single step for the whole training set rather than for each training example individually.
  • With one line of code, Z = np.dot(w.T, X) + b, we can calculate capital Z, which holds the z value of every training example.
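
Here is a small sketch of that single-step forward pass. The numbers in X, w and b are invented for illustration; the shapes follow the column-wise convention, with X of shape (n_x, m) and w of shape (n_x, 1).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 3 features, 4 training examples stored as columns (made-up values).
X = np.array([[0.5, -1.0,  2.0, 0.0],
              [1.0,  0.0, -0.5, 0.3],
              [-2.0, 1.5,  0.1, 0.7]])
w = np.array([[0.1], [-0.2], [0.3]])
b = 0.05

Z = np.dot(w.T, X) + b   # shape (1, m); b is broadcast to every column
A = sigmoid(Z)           # predictions for all m examples at once

print(Z.shape, A.shape)
```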

Vectorizing Logistic Regression’s Gradient Output

  • We have done forward propagation and backward propagation, computing the predictions and the derivatives on all the training examples, without using a for-loop.
  • The gradient descent update is then w := w - alpha * dw and b := b - alpha * db.
  • We still need an explicit for-loop when we want to run gradient descent for multiple iterations.
  • A single iteration of gradient descent, however, can be done without any for-loop.
  • Broadcasting is a technique through which certain parts of the code can be made more efficient and faster.
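
A minimal sketch of one vectorized gradient descent iteration, wrapped in the one for-loop that is still needed for multiple iterations. The function name gradient_step, the toy data and the learning rate are my own assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(w, b, X, Y, alpha):
    """One iteration of gradient descent with no loop over examples."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)   # forward propagation, shape (1, m)
    dZ = A - Y                        # backward propagation
    dw = np.dot(X, dZ.T) / m          # shape (n_x, 1)
    db = np.sum(dZ) / m
    return w - alpha * dw, b - alpha * db

# Toy 1-feature dataset (made up): small x means class 0, large x means class 1.
X = np.array([[0.0, 1.0, 2.0, 3.0]])
Y = np.array([[0, 0, 1, 1]])

w = np.zeros((1, 1))
b = 0.0
for _ in range(1000):                 # the only explicit loop: over iterations
    w, b = gradient_step(w, b, X, Y, alpha=0.5)

predictions = (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(int)
print(predictions)
```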

Broadcasting in Python

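  • The running example: can you calculate the percentage of calories from carbs, protein, and fat without the use of an explicit for-loop?
  • axis=0 means summing the elements vertically, i.e. down each column.
  • Dividing the whole matrix by the row of column sums is an example of broadcasting in Python.
  • The basic process of broadcasting: the smaller array is copied along the missing dimension until its shape matches the larger one, then the operation is applied element by element.
  • General principle of broadcasting: when an (m, n) matrix is combined with a (1, n) or an (m, 1) array, NumPy treats the smaller array as if it were copied to shape (m, n) before applying the element-wise operation.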
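
The calorie question above can be sketched like this. The numbers in the matrix are illustrative values I typed in (rows are carbs, protein and fat; columns are four foods), so treat them as an assumption and not as nutrition data.

```python
import numpy as np

# Rows: calories from carbs, protein, fat. Columns: four foods (illustrative values).
A = np.array([[56.0,   0.0,  4.4, 68.0],
              [ 1.2, 104.0, 52.0,  8.0],
              [ 1.8, 135.0, 99.0,  0.9]])

cal = A.sum(axis=0)                        # sum vertically: total calories per food, shape (4,)
percentage = 100 * A / cal.reshape(1, 4)   # (3, 4) divided by (1, 4): broadcasting

print(np.round(percentage, 1))
```

The reshape to (1, 4) is not strictly needed here, but it makes the intended shape explicit.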

A Note on python/numpy vectors

  • Whenever you create a vector, don’t use “rank 1” arrays, i.e. arrays whose shape is (n,).
  • If you are not sure about the dimensions, you can use an assert statement to check them.
  • Use either a column vector of shape (n, 1) or a row vector of shape (1, n).
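
A short sketch of why rank 1 arrays are confusing and how the assert and reshape advice above looks in practice. The variable names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)

a = rng.standard_normal(5)         # rank 1 array, shape (5,)
print(a.shape, a.T.shape)          # transposing a rank 1 array changes nothing
print(np.dot(a, a.T).shape)        # the result is a scalar, not a 5x5 matrix

col = rng.standard_normal((5, 1))  # explicit column vector
assert col.shape == (5, 1)         # cheap check that documents the intended shape
outer = np.dot(col, col.T)         # now this really is an outer product
print(outer.shape)

fixed = a.reshape(5, 1)            # turning a rank 1 array into a column vector
```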

Logistic Regression Cost Function

  • A single equation covers both forms of the logistic regression output, the case y = 1 and the case y = 0.

Python Basics with Numpy

In this programming exercise we learn to:

  1. Use NumPy.
  2. Implement some core deep learning functions such as softmax, sigmoid and dsigmoid.
  3. Handle data by normalizing inputs and reshaping images.
  4. Recognize the importance of vectorization.
  5. Understand how Python broadcasting works.

What I learnt from it:

  1. How to build basic functions with NumPy.
  2. The data structures NumPy uses to represent shapes (vectors, matrices, etc.) are called NumPy arrays.
  3. Implemented the logistic sigmoid function using NumPy.
  4. Implemented the sigmoid gradient using NumPy.
  5. Two common NumPy functions used in deep learning are np.shape and np.reshape.
  6. Learnt how to reshape NumPy arrays.
  7. Implemented broadcasting and the softmax function.
  8. How vectorization plays an important role in computations on vectors and matrices.
  • Vectorization plays an important role in deep learning. It provides computational efficiency and clarity.
  • Implemented the basicsigmoid, sigmoid, sigmoid_derivative, image2vector, normalize rows, softmax, L1 and L2 functions.
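
Here is my own minimal sketch of a few of the functions listed above. It is not the assignment's reference code; the function names, signatures and the row-wise softmax convention are assumptions on my part.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, works on scalars and arrays."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Gradient of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def normalize_rows(x):
    """Divide every row of a 2-D array by its L2 norm (uses broadcasting)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / norms

def softmax(x):
    """Row-wise softmax of a 2-D array."""
    shifted = x - np.max(x, axis=1, keepdims=True)   # for numerical stability
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

print(sigmoid(np.array([-1.0, 0.0, 1.0])))
print(normalize_rows(np.array([[3.0, 4.0], [1.0, 0.0]])))
print(softmax(np.array([[1.0, 2.0, 3.0]])))
```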

Logistic Regression with a Neural Network Mindset

  1. In this programming assignment we work with logistic regression in a way that builds intuition relevant to neural networks.
  2. The assignment shows how to build the general architecture of a learning algorithm.
  3. That includes initializing parameters, calculating the cost function and its gradient, and using an optimization algorithm.
  • numpy is the fundamental package for scientific computing with Python.
  • h5py is a common package for interacting with a dataset stored in an H5 file.
  • matplotlib is a well-known library for plotting graphs in Python.
  • PIL and scipy are used here to test your model with your own picture at the end.
  • Many software bugs in deep learning come from matrix/vector dimensions that don’t fit. If you keep your matrix/vector dimensions straight, you will go a long way toward eliminating many bugs.

Common steps for pre-processing a new dataset are:

  1. Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, …)
  2. Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)
  3. “Standardize” the data
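
The three pre-processing steps above can be sketched as below. The random array stands in for a real image dataset, and dividing by 255 is one simple way to “standardize” pixel data; both are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a real dataset: 5 RGB images of 64 x 64 pixels (random values).
train_x_orig = rng.integers(0, 256, size=(5, 64, 64, 3), dtype=np.uint8)

# Step 1: figure out the dimensions.
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]

# Step 2: reshape so that each example becomes one column of length num_px * num_px * 3.
train_x_flat = train_x_orig.reshape(m_train, -1).T

# Step 3: "standardize" by scaling the pixel values into [0, 1].
train_x = train_x_flat / 255.0

print(train_x.shape)
```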

The main steps for building a Neural Network are:

  • Define the model structure (such as the number of input features).
  • Initialize the model’s parameters.
  • Loop: calculate the current loss (forward propagation), calculate the current gradient (backward propagation), update the parameters (gradient descent).

I implemented the following in this exercise:

  1. Initializing the values of w and b.
  2. Optimizing the loss iteratively to learn the parameters (w, b).
  3. Computing the cost and its gradient.
  4. Updating the parameters using gradient descent.
  5. Using the learned (w, b) to predict the labels for a given set of examples.
  • Different learning rates give different costs.
  • In deep learning it is recommended to choose the learning rate that best minimizes the cost function. If the model overfits, use other techniques to reduce the overfitting.
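
Putting the pieces listed above together, a compact sketch of the whole model could look like this. The function names, the toy dataset and the hyperparameters are my own choices for illustration, not the assignment's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def initialize_with_zeros(dim):
    return np.zeros((dim, 1)), 0.0

def propagate(w, b, X, Y):
    """Forward pass (cost) and backward pass (gradients)."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return dw, db, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    """Gradient descent: repeat propagate + update."""
    costs = []
    for _ in range(num_iterations):
        dw, db, cost = propagate(w, b, X, Y)
        w = w - learning_rate * dw
        b = b - learning_rate * db
        costs.append(cost)
    return w, b, costs

def predict(w, b, X):
    return (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(int)

# Toy dataset (made up): 2 features, 4 examples stored as columns.
X = np.array([[0.0, 0.2, 0.9, 1.0],
              [0.1, 0.0, 1.0, 0.8]])
Y = np.array([[0, 0, 1, 1]])

w, b = initialize_with_zeros(X.shape[0])
w, b, costs = optimize(w, b, X, Y, num_iterations=2000, learning_rate=0.5)
print("first cost:", costs[0], "last cost:", costs[-1])
print("predictions:", predict(w, b, X))
```

Rerunning optimize with a different learning_rate gives a different final cost, which is the effect noted in the last two bullets.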



Week2- Neural Networks Basics #100DaysOfMLCode Challenge

Neural Networks Basics

This is Week 2 of the Neural Networks and Deep Learning course.

Logistic Regression as a Neural Network

This week we go over the basics of neural network programming.

  • When implementing a neural network, the implementation techniques also play an important role.
  • This week we learn how a neural network can process the entire training set of m examples without an explicit ‘for’ loop over the training set.
  • We also learn why the computation of a neural network is organized into forward propagation and backward propagation.

Binary Classification

  • Logistic regression is an algorithm for Binary Classification.
  • In a binary classification problem the input can be, for example, an image, and we need to decide whether the image is a cat or not. We use ‘y’ to denote the output.
  • The computer stores the image as 3 matrices (the red, green and blue color channels). Their entries are the pixel intensity values.
  • If you have a 64 by 64 image, then you have three 64 by 64 matrices for the red, green and blue color channels.
  • We can unroll the pixel intensity values from the matrices into a feature vector x. Since there are three matrices, x has 3*64*64 = 12288 elements.
  • In binary classification our goal is to train a classifier that takes an image, in the form of a feature vector x, as input and outputs the label y.
  • In the notation used for logistic regression, m is the number of training examples. The matrix X contains the m training inputs x, arranged column-wise.
  • Likewise, Y contains the output values, arranged column-wise.
  • In Python, the NumPy library lets us find the dimensions of a matrix through its ‘shape’ attribute. Usage: X.shape (no parentheses, because it is an attribute, not a method).
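
The notation above can be sketched in a few lines of NumPy. The images are random arrays and the labels are invented; only the shapes matter here.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4                                                  # number of training examples

# Stand-ins for m RGB images of 64 x 64 pixels.
images = [rng.integers(0, 256, size=(64, 64, 3)) for _ in range(m)]

# Unroll each image into a column vector x of length 64 * 64 * 3.
columns = [img.reshape(-1, 1) for img in images]

X = np.hstack(columns)          # inputs stacked column-wise, shape (n_x, m)
Y = np.array([[1, 0, 0, 1]])    # made-up labels stacked column-wise, shape (1, m)

n_x = X.shape[0]                # shape is an attribute, so no parentheses
print(X.shape, Y.shape, n_x)
```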

Logistic Regression

  • Logistic regression is a learning algorithm for binary classification problems, i.e. supervised learning problems where the output labels are 0 or 1.
  • Here w and b are the parameters.
  • A plain linear function, y(cap) = w^T x + b, would not be appropriate for binary classification because its values can be negative or larger than 1, while y(cap) should be a probability between 0 and 1.
  • So we apply the sigmoid function to get the output: y(cap) = sigmoid(w^T x + b).
  • If the value of z is a large positive number, sigmoid(z) is nearly equal to 1. If z is a large negative number, sigmoid(z) is nearly zero.
  • The graph depicts sigmoid(z) as a function of z.
  • Our goal is to learn parameters w, b such that y(cap) becomes a good estimate of the probability that y = 1.
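
A small sketch of the difference between the raw linear output and the sigmoid output described above. The parameter values and inputs are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up parameters and three made-up examples (columns of X).
w = np.array([[2.0], [-1.0]])
b = 0.5
X = np.array([[3.0, -4.0, 0.0],
              [1.0,  2.0, 0.5]])

z = np.dot(w.T, X) + b     # linear part: can be negative or larger than 1
y_hat = sigmoid(z)         # squashed into the interval (0, 1)

print("z     =", z)
print("y_hat =", y_hat)
```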

Logistic Regression Cost Function

  • To train the parameters w, b we need a cost function.
  • A superscript (i) denotes the i-th example in the training set.
  • The loss function tells us how well our algorithm (in this case logistic regression) is doing.
  • One candidate loss is the squared error between the actual and the predicted value. We usually don’t use it in logistic regression, because it makes the optimization problem non-convex: there can be several local optima, so gradient descent may fail to find the global optimum.
  • The loss function used for logistic regression is instead L(y(cap), y) = -( y * log(y(cap)) + (1 - y) * log(1 - y(cap)) ).
  • With this loss, if y=1 we want y(cap) to be as large as possible, and if y=0 we want y(cap) to be as small as possible.
  • The loss function tells how well our algorithm is doing on a single example from the training set, whereas the cost function is defined for the entire training set.
  • The cost function J(w, b) is the average of the loss over all m training examples.
  • We look for the values of the parameters (w, b) that make the overall cost function as small as possible.
  • Logistic regression can be viewed as a very small neural network.
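
As a rough sketch, the loss for one example and the cost for the whole training set can be written like this. The function names and the sample predictions are my own.

```python
import numpy as np

def loss(y_hat, y):
    """Logistic regression loss for a single example (or element-wise on arrays)."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def cost(Y_hat, Y):
    """Cost J: the average loss over all m training examples."""
    return np.mean(loss(Y_hat, Y))

# When y = 1, a prediction close to 1 is cheap and a prediction close to 0 is expensive.
print(loss(0.9, 1), loss(0.1, 1))

Y = np.array([[1, 0, 1, 0]])
Y_hat = np.array([[0.8, 0.3, 0.6, 0.1]])   # made-up predictions
print(cost(Y_hat, Y))
```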

Gradient Descent

  • We need to find the values of w, b for which the cost function J(w, b) is at its minimum.
  • w can be high-dimensional, but for plotting we treat it as a single real number.
  • This function J is convex, so it has a single minimum that we can find.
  • That convexity is the main reason why we use this particular cost function.
  • To find good values of w, b, we first set them to some initial values.
  • Almost any initialization method works. Because the function is convex, gradient descent ends up at (or very near) the same minimum whatever values we start from.
  • Gradient descent starts from the initial values of the parameters and repeatedly takes a step in the direction of steepest descent, gradually approaching the global optimum.
  • Each step repeats the update w := w - alpha * dJ(w)/dw, i.e. it subtracts the derivative of the cost function scaled by alpha.
  • Alpha (α) is the learning rate. It controls how big a step we take in each iteration of gradient descent.
  • In code, the variable ‘dw’ denotes this derivative.
  • If the derivative is negative, w moves in the positive direction, and if the derivative is positive, w moves in the negative direction. Either way w moves toward the minimum.
  • In calculus a different symbol (the partial derivative symbol ∂) is used for the derivative of a function of two or more variables. J depends on both w and b, so here we use partial derivatives.
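
To see the update rule in action, here is a tiny sketch on a one-dimensional convex function. J(w) = (w - 3)^2 is a stand-in I picked because its minimum is obviously at w = 3; it is not the logistic regression cost.

```python
def J(w):
    return (w - 3.0) ** 2

def dJ_dw(w):
    return 2.0 * (w - 3.0)

def gradient_descent(w, alpha, num_steps):
    history = [w]
    for _ in range(num_steps):
        dw = dJ_dw(w)          # 'dw' holds the derivative, as in the notes above
        w = w - alpha * dw     # the gradient descent update
        history.append(w)
    return w, history

# Starting left of the minimum: the derivative is negative, so w increases.
w_left, hist_left = gradient_descent(w=0.0, alpha=0.1, num_steps=100)

# Starting right of the minimum: the derivative is positive, so w decreases.
w_right, hist_right = gradient_descent(w=6.0, alpha=0.1, num_steps=100)

print(w_left, w_right)
```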

Computation Graph

  • A computation graph is used to find the derivatives of complex functions.
  • The computations of a neural network are organized into a forward propagation step and a backward propagation step.
  • In the forward propagation step we compute the output of the neural network.
  • In the backward propagation step we compute the gradients, or derivatives.
  • The computation graph explains why the computation is organized in this manner.

Derivatives with a Computation Graph

  • (Figure: a computation graph, used to work out the derivatives step by step.)
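
Since the figure is missing here, this is a small sketch of the idea with an example function of my own choosing, J = 3 * (a + b * c). The forward pass goes left to right through the graph, and the derivatives are computed right to left with the chain rule.

```python
# Forward pass: compute the output one node at a time.
a, b, c = 5.0, 3.0, 2.0
u = b * c          # first node
v = a + u          # second node
J = 3.0 * v        # output node

# Backward pass: go from the output back to the inputs.
dJ_dv = 3.0              # J = 3v
dJ_da = dJ_dv * 1.0      # v = a + u
dJ_du = dJ_dv * 1.0      # v = a + u
dJ_db = dJ_du * c        # u = b * c
dJ_dc = dJ_du * b        # u = b * c

print(J, dJ_da, dJ_db, dJ_dc)
```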

Logistic Regression Gradient Descent

  • This section shows how to compute the derivatives needed for gradient descent in logistic regression.
  • The derivatives are first worked out for a single training example.
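
For a single training example with two features, the derivatives can be sketched as below. The function name and the sample numbers are my own; the formulas follow from applying the chain rule to the logistic regression loss.

```python
import numpy as np

def single_example_grads(w1, w2, b, x1, x2, y):
    """Forward and backward pass for one example with two features."""
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + np.exp(-z))     # prediction y(cap)
    dz = a - y                       # dL/dz
    dw1 = x1 * dz                    # dL/dw1
    dw2 = x2 * dz                    # dL/dw2
    db = dz                          # dL/db
    return dw1, dw2, db

def single_example_loss(w1, w2, b, x1, x2, y):
    a = 1.0 / (1.0 + np.exp(-(w1 * x1 + w2 * x2 + b)))
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

dw1, dw2, db = single_example_grads(0.1, -0.2, 0.3, x1=1.0, x2=2.0, y=1)
print(dw1, dw2, db)
```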

Gradient Descent on m Examples

  • Techniques like vectorization let us avoid explicit for-loops when running gradient descent over all m examples.
  • Explicit for-loops make the code inefficient when the dataset is very large.
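
For comparison, this is roughly what the non-vectorized version over m examples looks like, with one loop over the examples and an inner loop over the features. The function name and the toy data are assumptions of this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradients_with_loops(w, b, X, Y):
    """Cost and gradients over m examples using explicit for-loops."""
    n_x, m = X.shape
    J, db = 0.0, 0.0
    dw = np.zeros(n_x)
    for i in range(m):                 # first loop: over the training examples
        z = b
        for j in range(n_x):
            z += w[j] * X[j, i]
        a = sigmoid(z)
        J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))
        dz = a - Y[i]
        for j in range(n_x):           # second loop: over the features
            dw[j] += X[j, i] * dz
        db += dz
    return J / m, dw / m, db / m

# Made-up data: 2 features, 3 examples stored as columns.
X = np.array([[1.0,  2.0, -1.0],
              [0.5, -1.5,  2.0]])
Y = np.array([1, 0, 1])
w = np.array([0.1, -0.2])
b = 0.05

J, dw, db = gradients_with_loops(w, b, X, Y)
print(J, dw, db)
```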


What is a Neural Network?

Hello Friends,

Story Behind this post:

I always wanted to write posts about Computer Science, and I feel happy when I write about this stuff. Last month I started exploring the Data Science field, and I am now focused on getting a job in it. What attracted me to Data Science is Deep Learning.

Most Deep Learning models are based on Artificial Neural Networks, a concept that came from the neural networks in the human brain. The main reason for starting this website is to explore the fields related to the Human Brain, Space and Computer Science.

Deep Learning deals with two of my favourite topics (the Human Brain and Computer Science), so this is the main reason why I started to explore it.


As part of exploring Deep Learning, I have taken a Specialization course on Coursera.

Course Homepage : https://www.coursera.org/specializations/deep-learning

This Specialization course is taught by Andrew Ng. Andrew Ng is a great person with the greatest teaching skills.


I am going to teach the concepts that I have learnt in this course. Images are taken from the course videos.

Today's topic is “What is a Neural Network?”

Let us consider a housing price prediction problem. Let’s say we have a dataset of 6 houses, where each record contains the size of the house and its price. After plotting these 6 data points on a graph, we need a function that fits them and can predict the price of the next house from its size. We can fit a straight line through these points.


Using linear regression we can fit a straight line to these values. This fit is an example of the simplest possible neural network.
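
Here is a minimal sketch of fitting that straight line with NumPy. The six sizes and prices are numbers I made up for illustration, not data from the course.

```python
import numpy as np

# Six made-up houses: size and price in arbitrary units.
sizes = np.array([50.0, 80.0, 100.0, 120.0, 150.0, 200.0])
prices = np.array([110.0, 160.0, 210.0, 240.0, 310.0, 400.0])

# Least-squares fit of a degree-1 polynomial: price = slope * size + intercept.
slope, intercept = np.polyfit(sizes, prices, deg=1)

def predict_price(size):
    return slope * size + intercept

print(slope, intercept)
print(predict_price(130.0))    # prediction for a new house
```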

Simplest Neural Network.


In the image there is one input named “size”. It goes into a neuron (the little circle in the image), where the actual function resides, and the neuron gives the price as output. The neuron implements the function we drew above: it computes the output from the input using a linear function.


In the neural network literature we often come across a function that starts at zero and then continues as a straight line with a certain slope.

This type of function is called the “ReLU” function, which stands for Rectified Linear Unit.
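
A quick sketch of ReLU and of a single neuron that uses it to turn a house size into a price. The weight w and bias b are made-up numbers, and price_neuron is a name I invented for this example.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0, z)

def price_neuron(size, w=2.0, b=-100.0):
    """One neuron: a linear function of the size, passed through ReLU."""
    return relu(w * size + b)

sizes = np.array([20.0, 50.0, 100.0])
print(relu(np.array([-2.0, 0.0, 3.0])))
print(price_neuron(sizes))     # the price never goes below zero
```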

A larger neural network is built by stacking small neurons like the one in the image above.


The image shows an example of a larger neural network. It takes more than one input (more than one feature).

**Here I had a doubt: can we use one activation function for some neurons and a different activation function for other neurons? (I googled this and found out that yes, we can.)


As you can see in the image, we just need to give the inputs, and the network predicts the price of the house based on the training examples we have given it.


As you can see in the image, each of the hidden units (neurons) is connected to every input feature, so each hidden neuron has the opportunity to weigh several aspects when predicting the output.
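
To illustrate “every hidden unit is connected to every input feature”, here is a sketch of a forward pass through a tiny network with 4 inputs, 3 hidden neurons and 1 output. The input values, the layer sizes and the random weights are all assumptions; a real network would learn the weights from training examples.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

rng = np.random.default_rng(0)

# One house described by 4 made-up, already scaled features (a column vector).
x = np.array([[0.8], [0.3], [0.5], [0.9]])

# Every row of W1 holds one hidden neuron's weights for ALL 4 inputs.
W1 = rng.standard_normal((3, 4))
b1 = np.zeros((3, 1))
W2 = rng.standard_normal((1, 3))
b2 = np.zeros((1, 1))

hidden = relu(np.dot(W1, x) + b1)   # 3 hidden activations
price = np.dot(W2, hidden) + b2     # single output

print(hidden.ravel(), price.ravel())
```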

->Given enough examples of (X,Y), a neural network will do a remarkable job of figuring out functions that accurately map (X) to (Y).

Everything explained so far about neural networks is an example of Supervised Learning, which means the system takes a certain input (X) and gives back the result (Y) as output.




