How Do Computers Interpret Images?

We all love to look at beautiful images, but have you ever wondered how a computer sees an image? In this tutorial, we will explain how images are stored in a computer. I'll be using an example from the MNIST dataset to understand how computers interpret images.

Let's get started.

A computer interprets any grayscale image as an array: a grid of values in which each cell is called a pixel, and each pixel holds a numerical value.

[Image: a grayscale digit shown as a grid of pixel values (source: Udacity)]
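As a rough sketch (the pixel values below are invented for illustration, not taken from MNIST), a tiny grayscale image can be represented as a 2D NumPy array:

```python
import numpy as np

# A tiny 5x5 grayscale "image" as a 2D array.
# Each entry is one pixel; 0 is black, 255 is white (values made up here).
image = np.array([
    [  0,   0,  64, 128,   0],
    [  0, 128, 255, 128,   0],
    [ 64, 255, 255, 255,  64],
    [  0, 128, 255, 128,   0],
    [  0,   0,  64,   0,   0],
], dtype=np.uint8)

print(image.shape)   # (5, 5) -> a grid of pixel values
print(image[2, 2])   # 255   -> the value of one pixel
```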

To use this grayscale image in our Machine Learning model, we need to normalize the pixel values.

Pre-Processing the Data

Normalization

Normalization is a method used in Machine Learning to bring the features in a dataset onto the same scale. When you normalize a feature, all of its values fall in the range 0 to 1, which helps your algorithm train better. We normalize because Neural Networks rely on gradient calculations, and gradients behave more predictably when the inputs are on a consistent scale. Normalization is typically done by subtracting the mean (the average of all pixel values) from each pixel and then dividing the result by the standard deviation of the pixel values. Sometimes you'll see an approximation here, where a mean and standard deviation of 0.5 are used to center the pixel values.

Here, our pixel values range from 0 to 255. To scale them into the range 0 to 1, we divide each pixel value by 255 to get the new pixel values.
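Here is a minimal sketch of both approaches with NumPy (the pixels array below is a random placeholder standing in for a single MNIST image):

```python
import numpy as np

# Placeholder 28x28 grayscale image with values 0-255
# (in practice this would come from the MNIST dataset).
pixels = np.random.randint(0, 256, size=(28, 28)).astype(np.float32)

# Simple scaling: divide by 255 so every value falls in [0, 1].
scaled = pixels / 255.0

# Mean/std normalization: subtract the mean, divide by the standard deviation.
standardized = (pixels - pixels.mean()) / pixels.std()

# Common approximation: after scaling to [0, 1], use 0.5 as both mean and std,
# which centers the values around 0 in roughly [-1, 1].
centered = (scaled - 0.5) / 0.5

print(scaled.min(), scaled.max())      # 0.0 ... 1.0
print(centered.min(), centered.max())  # about -1.0 ... 1.0
```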

Flattening

Flattening means converting the image array into a vector form. We cannot feed the 2D array values directly to our model; the array must first be flattened into a vector.

Ex: a 28 × 28 image gives 28 × 28 = 784 pixel values
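A small sketch of flattening a 28 × 28 array into a 784-value vector with NumPy (again using random placeholder data):

```python
import numpy as np

# Placeholder 28x28 image (random values standing in for an MNIST digit).
image = np.random.rand(28, 28)

# Flatten the 2D grid into a 1D vector of 784 values.
flat = image.reshape(-1)   # or image.flatten()

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)
```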

This is how computers interpret images, and by applying pre-processing steps such as normalization and flattening, we can feed those images into our machine learning models.

Thank you!

Sandeep Yadav
