Python TensorFlow: Implementing Gradient Descent for Linear Regression
Python TensorFlow Building and Training a Simple Model: Exercise-10 with Solution
Write a Python program that implements a gradient descent optimizer using TensorFlow for a simple linear regression model.
Sample Solution:
Python Code:
import tensorflow as tf
import numpy as np
# Generate some random data for a simple linear regression problem
np.random.seed(0)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)
# Define the neural network model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)
])
# Define the mean squared error (MSE) loss function
loss_function = tf.keras.losses.MeanSquaredError()
# Define the gradient descent optimizer with a specified learning rate
learning_rate = 0.01
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
# Training loop
num_epochs = 100
for epoch in range(num_epochs):
    with tf.GradientTape() as tape:
        # Forward pass
        y_pred = model(X)
        loss = loss_function(y, y_pred)
    # Compute gradients
    gradients = tape.gradient(loss, model.trainable_variables)
    # Update model weights using gradients and optimizer
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    # Print the loss for monitoring
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {loss.numpy()}")
# Get the final model parameters (weights and bias)
final_weights, final_bias = model.layers[0].get_weights()
print("Final Weights:", final_weights)
print("Final Bias:", final_bias)
Output:
Epoch 1/100, Loss: 1.362046480178833
Epoch 2/100, Loss: 1.2959612607955933
Epoch 3/100, Loss: 1.23311185836792
Epoch 4/100, Loss: 1.1733397245407104
Epoch 5/100, Loss: 1.1164944171905518
Epoch 6/100, Loss: 1.0624324083328247
Epoch 7/100, Loss: 1.0110173225402832
...
Epoch 94/100, Loss: 0.024721495807170868
Epoch 95/100, Loss: 0.024096103385090828
Epoch 96/100, Loss: 0.023501060903072357
Epoch 97/100, Loss: 0.02293490804731846
Epoch 98/100, Loss: 0.022396206855773926
Epoch 99/100, Loss: 0.02188362553715706
Epoch 100/100, Loss: 0.02139587700366974
Explanation:
Import TensorFlow and NumPy libraries:
import tensorflow as tf
import numpy as np
-----------------------------------------------------
Generate Random Data:
np.random.seed(0)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)
This code generates random input data X and target data y for a simple linear regression problem: y = 2*X + 1 plus a small amount of Gaussian noise. The dataset contains 100 samples.
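As an optional sanity check (an addition, not part of the original exercise), the same regression can be solved in closed form with NumPy, which shows the slope and intercept that gradient descent should converge toward:

import numpy as np

# Regenerate the same synthetic data
np.random.seed(0)
X = np.random.rand(100, 1)
y = 2 * X + 1 + 0.1 * np.random.randn(100, 1)

# Add a column of ones so lstsq also fits the intercept
X_design = np.hstack([X, np.ones_like(X)])
coeffs, _, _, _ = np.linalg.lstsq(X_design, y, rcond=None)
print("Closed-form slope:", coeffs[0, 0])
print("Closed-form intercept:", coeffs[1, 0])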
------------------------------------------------------
Define the Neural Network Model:
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)
])
Here, a simple neural network model is defined using TensorFlow's Keras API. The Input layer specifies the input shape, and the single Dense (fully connected) layer with one unit is the output layer; it computes exactly the linear function w*x + b that we want to fit.
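For intuition, here is a minimal sketch (an illustration, not part of the exercise) of what that Dense(1) layer manages for you: one trainable weight and one trainable bias applied as a matrix multiply plus an offset.

import tensorflow as tf

# Hand-rolled equivalent of tf.keras.layers.Dense(1) on a single feature
w = tf.Variable(tf.random.normal([1, 1]), name="weight")
b = tf.Variable(tf.zeros([1]), name="bias")

def linear_model(x):
    # Same computation a Dense(1) layer performs: x @ w + b
    return tf.matmul(x, w) + b

# Example: prediction for a single input value 0.5
print(linear_model(tf.constant([[0.5]])))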
---------------------------------------------------------
Define Loss Function and Optimizer:
loss_function = tf.keras.losses.MeanSquaredError()
learning_rate = 0.01
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
- The mean squared error (MSE) loss function measures the model's prediction error.
- The stochastic gradient descent (SGD) optimizer with the specified learning rate (learning_rate) is used for model training; a sketch of its underlying update rule follows this list.
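For plain SGD with no momentum, optimizer.apply_gradients is equivalent to subtracting learning_rate * gradient from each trainable variable. The minimal sketch below (an illustration using a single toy variable, not part of the exercise) shows that update rule by hand:

import tensorflow as tf

learning_rate = 0.01
w = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Toy loss with its minimum at w = 1
    loss = (w - 1.0) ** 2

grad = tape.gradient(loss, w)

# Vanilla gradient descent step: w <- w - learning_rate * grad
# (this is what SGD's apply_gradients does when momentum is 0)
w.assign_sub(learning_rate * grad)
print(w.numpy())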
------------------------------------------------------------
Training Loop:
num_epochs = 100
for epoch in range(num_epochs):
    with tf.GradientTape() as tape:
        # Forward pass
        y_pred = model(X)
        loss = loss_function(y, y_pred)
    # Compute gradients
    gradients = tape.gradient(loss, model.trainable_variables)
    # Update model weights using gradients and optimizer
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    # Print the loss for monitoring
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {loss.numpy()}")
- The training loop runs for a specified number of epochs (num_epochs). In each epoch we:
- Open a tf.GradientTape() context to record operations for automatic differentiation.
- Forward pass: compute predictions (y_pred) with the model.
- Calculate the loss by comparing the predictions (y_pred) to the ground truth (y) using the MSE loss function.
- Compute gradients of the loss with respect to the model's trainable variables.
- Update the model's weights by applying the computed gradients with the SGD optimizer (an equivalent compile/fit formulation is sketched after this list).
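The manual GradientTape loop uses the whole dataset at every step (full-batch gradient descent). Roughly the same training can be written with Keras' built-in compile/fit workflow; the sketch below is an alternative formulation (an assumption for illustration, reusing the model, X and y defined above), with batch_size=len(X) so each epoch is a single full-batch gradient step like the manual loop.

# Alternative: the built-in Keras training loop with the same loss and optimizer
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss=tf.keras.losses.MeanSquaredError(),
)
# batch_size=len(X) makes each epoch one full-batch gradient step,
# matching the behaviour of the manual GradientTape loop
model.fit(X, y, epochs=100, batch_size=len(X), verbose=1)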
---------------------------------------------------------------
Get the Final Model Parameters:
final_weights, final_bias = model.layers[0].get_weights()
print("Final Weights:", final_weights)
print("Final Bias:", final_bias)
After training, we retrieve and print the final model parameters (the learned weight and bias). They should be close to the true slope 2 and intercept 1 used to generate the data.
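Once trained, the model can also be queried on new inputs. The short sketch below (an illustrative addition, reusing the trained model from above) predicts a few x values; the outputs should be roughly 2*x + 1.

import numpy as np

# Predict on a few new inputs with the trained model
X_new = np.array([[0.0], [0.5], [1.0]])
y_new = model.predict(X_new)
print(y_new)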
---------------------------------------------------------------
This code demonstrates how to implement gradient descent optimization for a simple linear regression model in TensorFlow. The optimizer adjusts the model's parameters to minimize the mean squared error loss during training.
Previous: Custom loss function in TensorFlow for positive and negative examples.
Next: Training Neural Networks with Adam Optimizer.