Low Rank Approximation Implementation 2/4

Tech
Singular Value Decomposition (SVD) - Model Compression: ResNet-50
Author

Leila Mozaffari

Published

October 24, 2024

Model Compression

Singular Value Decomposition (SVD) - ResNet-50

Singular Value Decomposition (SVD) factors a weight matrix into singular vectors and singular values. Keeping only the largest singular values gives a low-rank approximation of the matrix, which we use here to compress the model’s convolutional layers.

Goal: Reduce the model’s size and speed up computation without significantly sacrificing its accuracy.
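As a small illustration of the idea, the sketch below truncates the SVD of a random matrix, which stands in for a flattened convolutional weight. The matrix size, the seed, and the rank of 2 are arbitrary choices for the example, not values from this post. It also checks a standard property of truncated SVD: the Frobenius error of the approximation equals the root-sum-of-squares of the discarded singular values.

```python
import torch

torch.manual_seed(0)

# A small stand-in for a flattened conv weight: 6 rows (output channels), 8 columns.
W = torch.randn(6, 8)

# Thin SVD: W = U @ diag(S) @ Vh, with singular values in descending order.
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

# Keep only the top `rank` singular values/vectors.
rank = 2
W_k = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

# Frobenius error of the truncation vs. the discarded singular values.
error = torch.linalg.norm(W - W_k)
expected_error = torch.sqrt((S[rank:] ** 2).sum())

# Share of the squared singular values ("energy") held by the kept components.
retained_energy = ((S[:rank] ** 2).sum() / (S ** 2).sum()).item()

print(f"reconstruction error: {error.item():.4f}")
print(f"expected from discarded singular values: {expected_error.item():.4f}")
print(f"retained energy: {retained_energy:.2%}")
```

The retained-energy figure is one possible way to judge whether a chosen rank keeps enough of a layer’s information.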

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader
import numpy as np
import os

Download and Load the Dataset

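We use Imagenette2-320, a 10-class subset of ImageNet in which the shorter side of each image is resized to 320 pixels. The classes are easily distinguishable objects: tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, and parachute.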

from torchvision.datasets.utils import download_and_extract_archive

# Download Imagenette2-320
url = "https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-320.tgz"
root = "./data"
download_and_extract_archive(url, download_root=root)

# Set dataset path
dataset_path = os.path.join(root, "imagenette2-320")
Using downloaded and verified file: ./data\imagenette2-320.tgz
Extracting ./data\imagenette2-320.tgz to ./data

Prepare the Dataset

# Data transformation: Resize, Convert to Tensor, Normalize
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize images to 224x224
    transforms.ToTensor(),          # Convert images to PyTorch tensors
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),  # Normalize as per ImageNet pre-training
])

# Load dataset
train_dataset = datasets.ImageFolder(root=os.path.join(dataset_path, 'train'), transform=transform)
valid_dataset = datasets.ImageFolder(root=os.path.join(dataset_path, 'val'), transform=transform)

# Create DataLoaders
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=32, shuffle=False)

Load the Pre-trained ResNet-50 Model

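# Load pre-trained ResNet-50 model (the weights enum replaces the deprecated pretrained=True)
# Note: the classifier head keeps its 1000 ImageNet outputs; the fine-tuning step
# later trains it on the Imagenette labels 0-9 assigned by ImageFolder
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)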

# Move model to device (CPU or GPU if available)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

Apply SVD for Model Compression

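This step approximates the weights of the model’s convolutional layers (the layers that detect image features such as edges and shapes) with low-rank matrices. The function below writes the approximation back into the original weight tensor, so the layer’s shape and parameter count stay the same; a smaller model on disk requires storing the low-rank factors separately.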

def svd_compress_conv_layer(conv_layer, rank):
    # Get the weight tensor of the convolutional layer
    weight = conv_layer.weight.data
    out_channels, in_channels, h, w = weight.shape

    # Reshape the weight tensor to a 2D matrix of shape (out_channels, in_channels * h * w)
    weight_reshaped = weight.reshape(out_channels, -1)

    # Apply SVD to the weight matrix (torch.linalg.svd replaces the deprecated
    # torch.svd and returns V already transposed, as Vh)
    U, S, Vh = torch.linalg.svd(weight_reshaped, full_matrices=False)

    # Keep only the top `rank` singular values/vectors (never more than exist)
    rank = min(rank, S.numel())
    U_reduced = U[:, :rank]
    S_reduced = S[:rank]
    Vh_reduced = Vh[:rank, :]

    # Construct the low-rank approximation of the weight matrix
    compressed_weight = torch.mm(U_reduced, torch.diag(S_reduced))
    compressed_weight = torch.mm(compressed_weight, Vh_reduced)

    # Reshape back to the original convolutional weight shape
    compressed_weight = compressed_weight.view(out_channels, in_channels, h, w)

    # Replace the original weights; the tensor keeps its original shape,
    # so the layer's parameter count does not change
    conv_layer.weight.data = compressed_weight

    return conv_layer
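The function above stores the truncated weights at their original shape. One common way to turn a low-rank approximation into an actual reduction in parameters is to keep the two SVD factors as two consecutive convolutions. The sketch below shows that idea under stated assumptions: `factorize_conv_layer` is a hypothetical helper, not part of this post’s pipeline, and it assumes `groups=1` and zero padding. The toy layer sizes are arbitrary.

```python
import torch
import torch.nn as nn


def factorize_conv_layer(conv_layer, rank):
    """Hypothetical helper: replace one Conv2d (groups=1) by two smaller Conv2d layers."""
    weight = conv_layer.weight.data
    out_channels, in_channels, h, w = weight.shape
    U, S, Vh = torch.linalg.svd(weight.reshape(out_channels, -1), full_matrices=False)
    rank = min(rank, S.numel())

    # First layer: `rank` spatial filters that reuse the original kernel size,
    # stride, padding and dilation.
    first = nn.Conv2d(in_channels, rank, kernel_size=(h, w), stride=conv_layer.stride,
                      padding=conv_layer.padding, dilation=conv_layer.dilation, bias=False)
    first.weight.data = Vh[:rank, :].reshape(rank, in_channels, h, w).contiguous()

    # Second layer: 1x1 convolution that maps the `rank` channels back to out_channels.
    second = nn.Conv2d(rank, out_channels, kernel_size=1, bias=conv_layer.bias is not None)
    second.weight.data = (U[:, :rank] * S[:rank]).reshape(out_channels, rank, 1, 1).contiguous()
    if conv_layer.bias is not None:
        second.bias.data = conv_layer.bias.data.clone()
    return nn.Sequential(first, second)


torch.manual_seed(0)
conv = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
factored = factorize_conv_layer(conv, rank=4)

# Reference: the same convolution with weights truncated to rank 4 but stored densely.
U, S, Vh = torch.linalg.svd(conv.weight.data.reshape(32, -1), full_matrices=False)
dense_low_rank = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
dense_low_rank.weight.data = ((U[:, :4] * S[:4]) @ Vh[:4, :]).reshape(32, 16, 3, 3)

x = torch.randn(1, 16, 8, 8)
with torch.no_grad():
    max_diff = (factored(x) - dense_low_rank(x)).abs().max().item()

dense_params = sum(p.numel() for p in conv.parameters())
factored_params = sum(p.numel() for p in factored.parameters())
print(f"max output difference: {max_diff:.2e}")
print(f"parameters: dense={dense_params}, factored={factored_params}")
```

The two-layer version computes the same output as the densely stored rank-4 layer, up to floating-point error, while holding far fewer weights. Swapping such a pair into ResNet-50 would require replacing each module on its parent block, which this sketch does not attempt.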

Apply SVD to Each Convolutional Layer in ResNet-50

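We apply SVD to every convolutional layer in the model, keeping only the 20 largest singular components in each. A fixed rank of 20 is aggressive for the widest layers: a 3x3 convolution with 512 input and 512 output channels has a 512 x 4608 weight matrix, so it keeps 20 of its 512 singular values. Because the approximated weights are stored at their original shape, this step lowers each layer’s rank rather than its parameter count.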

def compress_resnet50(model, rank):
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            print(f"Compressing layer: {name}")
            # The layer is modified in place, so it does not need to be re-assigned
            # (setattr would not resolve dotted names such as "layer1.0.conv1")
            svd_compress_conv_layer(module, rank=rank)

# Compress the model with a reduced rank (e.g., keep top 20 components)
compress_resnet50(model, rank=20)
Compressing layer: conv1
Compressing layer: layer1.0.conv1
Compressing layer: layer1.0.conv2
Compressing layer: layer1.0.conv3
Compressing layer: layer1.0.downsample.0
Compressing layer: layer1.1.conv1
Compressing layer: layer1.1.conv2
Compressing layer: layer1.1.conv3
Compressing layer: layer1.2.conv1
Compressing layer: layer1.2.conv2
Compressing layer: layer1.2.conv3
Compressing layer: layer2.0.conv1
Compressing layer: layer2.0.conv2
Compressing layer: layer2.0.conv3
Compressing layer: layer2.0.downsample.0
Compressing layer: layer2.1.conv1
Compressing layer: layer2.1.conv2
Compressing layer: layer2.1.conv3
Compressing layer: layer2.2.conv1
Compressing layer: layer2.2.conv2
Compressing layer: layer2.2.conv3
Compressing layer: layer2.3.conv1
Compressing layer: layer2.3.conv2
Compressing layer: layer2.3.conv3
Compressing layer: layer3.0.conv1
Compressing layer: layer3.0.conv2
Compressing layer: layer3.0.conv3
Compressing layer: layer3.0.downsample.0
Compressing layer: layer3.1.conv1
Compressing layer: layer3.1.conv2
Compressing layer: layer3.1.conv3
Compressing layer: layer3.2.conv1
Compressing layer: layer3.2.conv2
Compressing layer: layer3.2.conv3
Compressing layer: layer3.3.conv1
Compressing layer: layer3.3.conv2
Compressing layer: layer3.3.conv3
Compressing layer: layer3.4.conv1
Compressing layer: layer3.4.conv2
Compressing layer: layer3.4.conv3
Compressing layer: layer3.5.conv1
Compressing layer: layer3.5.conv2
Compressing layer: layer3.5.conv3
Compressing layer: layer4.0.conv1
Compressing layer: layer4.0.conv2
Compressing layer: layer4.0.conv3
Compressing layer: layer4.0.downsample.0
Compressing layer: layer4.1.conv1
Compressing layer: layer4.1.conv2
Compressing layer: layer4.1.conv3
Compressing layer: layer4.2.conv1
Compressing layer: layer4.2.conv2
Compressing layer: layer4.2.conv3
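To see what a rank of 20 could mean for storage, the arithmetic below compares the dense weight count of a convolution with the count needed if its two low-rank factors were stored separately instead of being multiplied back together. `conv_param_counts` is a hypothetical helper written for this illustration. The three shapes are the standard ResNet-50 shapes for those layers, and biases are ignored.

```python
def conv_param_counts(out_channels, in_channels, kernel_h, kernel_w, rank):
    """Return (dense, factored) weight counts for one conv layer (bias ignored)."""
    cols = in_channels * kernel_h * kernel_w
    rank = min(rank, out_channels, cols)
    dense = out_channels * cols              # full (out_channels x cols) matrix
    factored = rank * (out_channels + cols)  # (out_channels x rank) + (rank x cols)
    return dense, factored


# Shapes (out_channels, in_channels, kernel_h, kernel_w) of three ResNet-50 convolutions.
layers = {
    "conv1": (64, 3, 7, 7),
    "layer1.0.conv1": (64, 64, 1, 1),
    "layer4.2.conv2": (512, 512, 3, 3),
}

results = {}
for name, shape in layers.items():
    dense, factored = conv_param_counts(*shape, rank=20)
    results[name] = (dense, factored)
    print(f"{name}: dense={dense}, factored={factored}, ratio={dense / factored:.2f}")
```

The potential saving grows with layer size: the small early layers shrink by roughly a factor of two, while the large 3x3 layers in the last stage shrink by more than a factor of twenty.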

Fine-Tune the Compressed Model

After compression, the model needs a short round of fine-tuning. Truncating every layer to rank 20 changes the weights substantially, which can reduce accuracy; a few epochs of training let the weights adjust and recover some of that accuracy. Because the optimizer updates the full weight tensors, the fine-tuned weights are no longer constrained to rank 20.

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 3
model.train()

for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimize
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}")

print("Training complete.")
Epoch [1/3], Loss: 1.4015
Epoch [2/3], Loss: 0.8525
Epoch [3/3], Loss: 0.6684
Training complete.

Evaluate the Compressed Model

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in valid_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Validation Accuracy after Compression: {100 * correct / total:.2f}%')
Validation Accuracy after Compression: 72.20%
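To judge whether accuracy was significantly sacrificed, the same measurement is needed at more than one point, for example right after compression and again after fine-tuning. The sketch below wraps the evaluation loop above in a reusable function. `evaluate` is a hypothetical name, and the toy data exists only to show that the function runs; it is not a result from this post.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device):
    """Return the top-1 accuracy (in percent) of `model` on `loader`."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            predicted = model(inputs).argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100.0 * correct / total


# Toy check: one-hot inputs passed through an identity model are always
# classified correctly, so the function should report full accuracy.
labels = torch.tensor([0, 1, 2, 1, 0, 2])
inputs = torch.nn.functional.one_hot(labels, num_classes=3).float()
toy_loader = DataLoader(TensorDataset(inputs, labels), batch_size=4)
toy_accuracy = evaluate(nn.Identity(), toy_loader, torch.device("cpu"))
print(f"Toy accuracy: {toy_accuracy:.2f}%")
```

With the objects defined earlier, the call would be `evaluate(model, valid_loader, device)`. A baseline taken before any fine-tuning would need care: the pre-trained head predicts 1000 ImageNet class indices, while `ImageFolder` labels the Imagenette classes 0 to 9, so the two label spaces would first have to be mapped onto each other.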