Day 2 of the #100DaysOfMLCode challenge

*This post is part of my #100DaysOfMLCode challenge, in which I learn about Machine Learning continuously for 100 days. The images are taken from the Deep Learning Specialization course I am taking on Coursera.

*Please excuse my writing; the information is not well organized. This post is about what I understood from the videos I watched.

Coursera’s Deep Learning Specialization course homepage:

Check my previous blog post: What is a Neural Network?

Supervised Learning with Neural Networks


Neural networks play an important role in online advertising, where they take a user's information and predict whether the user will click on an ad or not.

Supervised learning is learning a function that maps a given input X to an output Y.
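As a minimal sketch of this idea (not from the course), here is a toy supervised learning loop in NumPy: given example pairs (X, Y), we learn a function f(x) = w * x by gradient descent on the mean squared error. The data and learning rate are made up for illustration.

```python
import numpy as np

# Hypothetical toy dataset: inputs X and targets Y, where Y = 2 * X
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 4.0, 6.0, 8.0])

# Learn a mapping f(x) = w * x that fits the (X, Y) pairs
w = 0.0
lr = 0.01
for _ in range(1000):
    pred = w * X
    grad = 2 * np.mean((pred - Y) * X)  # gradient of mean squared error w.r.t. w
    w -= lr * grad

print(round(w, 2))  # learned weight is close to 2.0
```

After training, the learned weight recovers the underlying rule (w ≈ 2), which is the essence of supervised learning: the function is inferred from labeled examples rather than programmed by hand.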


As you can see in the image, these are applications of supervised learning, with their inputs and outputs listed respectively.


For image applications, we often use Convolutional Neural Networks.
For sequence data, we often use Recurrent Neural Networks.
For autonomous driving and similar tasks, we often use custom/hybrid neural networks.


There are basically three types of neural networks:
1. Standard Neural Network
2. Convolutional Neural Network
3. Recurrent Neural Network


Example of structured and unstructured data.

Structured data contains clearly defined feature values.

Why is Deep Learning taking off?


Andrew Ng talks about why deep learning is taking off.
He explains this with a graph that has the amount of data on the X-axis and performance on the Y-axis.

As you can see in the image, if we use traditional learning algorithms, performance improves for a while but eventually plateaus.


This graph depicts that "scale" drives deep learning progress: a larger neural network can perform well when given a large amount of data.


As you can see in the image, the scale of data and the scale of computation played a huge role in the early days of deep learning. But in recent years, algorithmic innovation has also played a huge role.


Andrew Ng discusses a huge breakthrough that came from changing the activation function from the sigmoid function to the ReLU function.


At the areas pointed out by the arrows, the value of the gradient is nearly zero, and this in turn makes learning slow.

For the ReLU (Rectified Linear Unit) activation function, the value of the gradient is one for all positive inputs.
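This difference can be sketched numerically. The code below (my own illustration, assuming NumPy) compares the derivative of the sigmoid, which saturates toward zero for large positive or negative inputs, with the derivative of ReLU, which is exactly one for any positive input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # saturates: near zero for large |z|

def relu_grad(z):
    return (z > 0).astype(float)  # exactly 1 for all positive z

z = np.array([-10.0, -1.0, 0.5, 10.0])
print(sigmoid_grad(z).round(4))  # [0.     0.1966 0.235  0.    ]
print(relu_grad(z))              # [0. 0. 1. 1.]
```

Because the sigmoid gradient vanishes at the extremes, gradient-descent updates there are tiny and learning stalls; ReLU keeps the gradient at one across the whole positive range, so learning proceeds much faster.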


Computation plays a great role, as you can see in the image. If computation is fast, then we are able to experiment faster.


