Machine learning is a method in which a system learns from data to make predictions without being explicitly programmed for each task.
Supervised learning uses labeled data, meaning the training dataset contains input-output pairs where the output (or label) is known. For example, if each customer in your dataset is labeled as “high spender” or “bargain hunter,” then you have labeled data. The labels are the output values that the model aims to predict.
Unsupervised learning uses unlabeled data, meaning the training dataset contains no labels or predefined outputs. For example, in your customer purchase data you know the details of each transaction (amount spent, items bought, and so on), but you don’t know which customers are “high spenders” or “bargain hunters.” No predefined categories or outcomes (output values) come with the data.
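To make the distinction concrete, here is a minimal sketch in Python; the customer records and field names are hypothetical, invented purely for illustration:

```python
# Labeled data: each record carries a known output category ("label").
labeled = [
    {"amount": 320.0, "items": 12, "label": "high spender"},
    {"amount": 15.5, "items": 3, "label": "bargain hunter"},
]

# Unlabeled data: the same input features, but no predefined category.
unlabeled = [
    {"amount": 320.0, "items": 12},
    {"amount": 15.5, "items": 3},
]
```

A supervised model would learn to predict the `label` field from the other features; an unsupervised model would have to discover groupings in the unlabeled records on its own.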
For machine learning, it's important to remember three key concepts, often abbreviated as HCG: Hypothesis, Cost Function, and Gradient Descent Algorithm. Let’s explore these:
To predict the output value Ŷ, we need a hypothesis. Assuming, for simplicity, that the relationship between input and output can be approximated by a linear model, we can express the hypothesis as:
H(x) = Wx + b, where W is the weight (the slope of the line) and b is the bias (the intercept).
Consider the dataset X = [1, 2, 3] with corresponding labels Y = [1, 2, 3]. After setting an initial hypothesis, we need to calculate the cost (or loss). The cost function measures the error between the predicted values and the actual labels; a common choice is the mean squared error, cost(W, b) = (1/m) Σ (H(xᵢ) − yᵢ)², where m is the number of training examples.
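A minimal sketch of this cost calculation for the toy dataset above, using the mean squared error and plain Python (no libraries):

```python
def hypothesis(x, W, b):
    # Linear hypothesis H(x) = Wx + b
    return W * x + b

def cost(X, Y, W, b):
    # Mean squared error between predictions and actual labels
    m = len(X)
    return sum((hypothesis(x, W, b) - y) ** 2 for x, y in zip(X, Y)) / m

X = [1, 2, 3]
Y = [1, 2, 3]

print(cost(X, Y, 0.0, 0.0))  # a poor initial guess: high cost
print(cost(X, Y, 1.0, 0.0))  # the perfect fit (W = 1, b = 0): cost is 0.0
```

For this dataset, the line Y = X fits exactly, so the cost drops to zero at W = 1, b = 0.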
The resulting cost function, plotted against the weight, forms a convex, bowl-shaped curve whose minimum corresponds to the optimal hypothesis. Our goal is to find the weight and bias values at the global minimum, where the slope of the graph is zero.
To find the point where the slope is zero, we use differentiation. We iteratively update our weight values until we reach the global minimum. The update rule for the weights is given by:
W = W - α * (dJ/dW), where α is the learning rate (the size of each update step) and dJ/dW is the derivative of the cost function J with respect to the weight W.
The gradient descent algorithm is an iterative optimization algorithm used to minimize the cost function in machine learning. The basic steps are:

1. Initialize the parameters (weight W and bias b), for example to zero or small random values.
2. Compute the cost and its gradient with respect to each parameter.
3. Update each parameter in the direction opposite its gradient, scaled by the learning rate α.
4. Repeat steps 2-3 until the cost converges (the gradient is close to zero).
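The algorithm can be sketched for the toy dataset X = [1, 2, 3], Y = [1, 2, 3] with a mean squared error cost; the learning rate and epoch count here are illustrative choices, not values prescribed by the text:

```python
def gradient_descent(X, Y, alpha=0.1, epochs=1000):
    W, b = 0.0, 0.0  # initialize the parameters
    m = len(X)
    for _ in range(epochs):
        # gradients of the mean squared error cost w.r.t. W and b
        grad_W = (2 / m) * sum((W * x + b - y) * x for x, y in zip(X, Y))
        grad_b = (2 / m) * sum((W * x + b - y) for x, y in zip(X, Y))
        # move against the gradient, scaled by the learning rate
        W -= alpha * grad_W
        b -= alpha * grad_b
    return W, b

W, b = gradient_descent([1, 2, 3], [1, 2, 3])
print(W, b)  # W approaches 1, b approaches 0
```

Each pass through the loop is one update step; over many iterations the parameters slide down the bowl-shaped cost surface toward the global minimum at W = 1, b = 0.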
In summary, gradient descent helps us find the optimal parameters that minimize the cost function, leading to better predictions by our machine learning model.