w3resource

Python URL Shortener Project: Comparing Hashing and Base62 Encoding Methods


URL Shortener:

Input values:
User provides a long URL to be shortened.

Output value:
Shortened URL generated and associated with the provided long URL.

Example:

Input values:
Input the long URL to be shortened: https://www.w3resource.com/python-exercises/python-conditional-statements-and-loop-exercises.php
Output value:
Short URL: https://short.url/sw12r 

Here are two different solutions for a "URL Shortener" project in Python. The goal is to create a program that takes a long URL provided by the user and returns a shortened version.

Solution 1: Using Hashing

Code:

# Solution 1: URL Shortener Using Hashing

import hashlib  # Import hashlib to generate a hash for the long URL

# Dictionary to store the mapping between long URLs and short URLs
url_mapping = {}

def shorten_url(long_url):
    """Generate a short URL by hashing the long URL."""
    # Hash the URL with SHA-256 and keep the first 6 hex characters (short, but not guaranteed unique)
    short_url_hash = hashlib.sha256(long_url.encode()).hexdigest()[:6]
    short_url = f"https://short.url/{short_url_hash}"
    
    # Store the mapping in the dictionary
    url_mapping[short_url] = long_url
    
    return short_url

def main():
    """Main function to interact with the user."""
    # Get user input for the long URL
    long_url = input("Input the long URL to be shortened: ")
    
    # Generate a shortened URL
    short_url = shorten_url(long_url)
    
    # Display the shortened URL
    print(f"Short URL: {short_url}")

# Run the URL Shortener only when the file is executed as a script
if __name__ == "__main__":
    main()

Output:

Input the long URL to be shortened: https://www.w3resource.com/python-exercises/python-conditional-statements-and-loop-exercises.php
Short URL: https://short.url/2d933e

Explanation:

  • hashlib library: Used to generate a hash for the long URL.
  • Hash-based approach:
    • shorten_url(long_url) function generates a short URL by hashing the long URL using SHA-256 and takes the first 6 characters of the hash.
    • The mapping between the long URL and the short URL is stored in a dictionary url_mapping.
  • main() function: Handles user input and outputs the generated short URL.
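  • Advantages and limitations: Quick and simple, and the same long URL always yields the same short URL. However, 6 hexadecimal characters allow only 16^6 = 16,777,216 distinct values, so two different URLs can produce the same short URL. When that happens, the code above silently overwrites the earlier entry in url_mapping.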
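
One possible way to handle the collision risk noted above is to lengthen the hash prefix whenever the shorter one is already taken by a different URL. The following is only a sketch under that assumption, not part of the original solution; the min_length parameter and the expand_url() helper are hypothetical names introduced for illustration.

```python
import hashlib

# Maps short URL -> long URL (same role as url_mapping in Solution 1)
url_mapping = {}

def shorten_url(long_url, min_length=6):
    """Return a short URL, lengthening the hash prefix if it is already taken."""
    digest = hashlib.sha256(long_url.encode()).hexdigest()
    # Try prefixes of increasing length until one is free or already ours
    for length in range(min_length, len(digest) + 1):
        short_url = f"https://short.url/{digest[:length]}"
        existing = url_mapping.get(short_url)
        if existing is None or existing == long_url:
            url_mapping[short_url] = long_url
            return short_url
    # Only reachable if two different URLs share a full SHA-256 digest
    raise RuntimeError("could not find a free hash prefix")

def expand_url(short_url):
    """Hypothetical helper: return the stored long URL, or None if unknown."""
    return url_mapping.get(short_url)
```

With this variant an existing mapping is never overwritten, and shortening the same long URL again still returns the same short URL.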

Solution 2: Using Base62 Encoding

Code:

# Solution 2: URL Shortener Using Base62 Encoding

import string  # Import string module for characters used in Base62
import random  # Import random module to generate random IDs

# Dictionary to store the mapping between long URLs and short URLs
url_mapping = {}
# Characters used for Base62 encoding (0-9, a-z, A-Z)
base62_chars = string.digits + string.ascii_letters

def generate_short_id(length=6):
    """Generate a random short ID of specified length using Base62 encoding."""
    return ''.join(random.choice(base62_chars) for _ in range(length))

def shorten_url(long_url):
    """Generate a short URL using Base62 encoding."""
    # Generate a random short ID for the URL
    short_id = generate_short_id()
    short_url = f"https://short.url/{short_id}"
    
    # Ensure that the short URL is unique
    while short_url in url_mapping:
        short_id = generate_short_id()
        short_url = f"https://short.url/{short_id}"
    
    # Store the mapping in the dictionary
    url_mapping[short_url] = long_url
    
    return short_url

def main():
    """Main function to interact with the user."""
    # Get user input for the long URL
    long_url = input("Input the long URL to be shortened: ")
    
    # Generate a shortened URL
    short_url = shorten_url(long_url)
    
    # Display the shortened URL
    print(f"Short URL: {short_url}")

# Run the URL Shortener only when the file is executed as a script
if __name__ == "__main__":
    main()

Output:

Input the long URL to be shortened: https://www.w3resource.com/python-exercises/list-advanced/index.php
Short URL: https://short.url/fBeoWj

Explanation:

  • Base62 Encoding:
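    • generate_short_id(length) generates a random ID of the specified length (default 6 characters) drawn from the 62-character Base62 alphabet: digits, lowercase letters, and uppercase letters. Strictly speaking nothing is encoded here; the ID is random and independent of the long URL.
    • shorten_url(long_url) generates a short URL with the random ID and checks for uniqueness by regenerating the ID until the short URL is not already in url_mapping.
  • main() function: Handles user input and outputs the generated short URL.
  • Advantages: Six Base62 characters give 62^6 = 56,800,235,584 possible IDs, far more than the 16^6 = 16,777,216 values of a 6-character hex prefix, and the uniqueness check regenerates the ID on a clash instead of overwriting an existing mapping. The length of the short URL can be adjusted by changing the length parameter in generate_short_id(). Unlike the hash-based version, shortening the same long URL twice produces two different short URLs.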
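
For comparison, Base62 encoding in the strict sense converts a number into a string over the same 62-character alphabet. The sketch below assumes each new URL is given the next value of a simple in-memory counter, which is a different technique from the random IDs used in Solution 2; base62_encode(), base62_decode(), and the counter are hypothetical names introduced for illustration.

```python
import itertools
import string

# Same alphabet as Solution 2: 0-9, a-z, A-Z
BASE62_CHARS = string.digits + string.ascii_letters

def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62_CHARS[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(BASE62_CHARS[remainder])
    # Remainders come out least significant first, so reverse them
    return "".join(reversed(digits))

def base62_decode(text):
    """Decode a Base62 string back into the integer it represents."""
    number = 0
    for char in text:
        number = number * 62 + BASE62_CHARS.index(char)
    return number

# Assumption: a sequential in-memory counter supplies the numeric IDs
_id_counter = itertools.count(1)
url_mapping = {}

def shorten_url(long_url):
    """Give the URL the next counter value, encoded in Base62."""
    short_id = base62_encode(next(_id_counter))
    short_url = f"https://short.url/{short_id}"
    url_mapping[short_url] = long_url
    return short_url
```

Because every counter value is used only once, this approach needs no uniqueness check, but the resulting short IDs are sequential and therefore easy to guess.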

Summary of Differences:

  • Solution 1: Using Hashing
    • Quick and simple to implement.
    • Potential risk of hash collisions (two different URLs generating the same hash).
  • Solution 2: Using Base62 Encoding
    • Generates a random short ID from the Base62 alphabet and retries until the ID is unused.
    • Reduces collision risk, more flexible in customizing the length of the short URL.
    • Suitable for generating short URLs where uniqueness is important.

Both solutions are effective, but Solution 2 provides better control over the length and format of the short URL while minimizing the risk of collisions.
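
To put rough numbers on the collision risk, the standard birthday approximation can be applied to both ID spaces. This is only an estimate: it assumes the 6-character hash prefix behaves like a uniformly random value, and for Solution 2 it gives the chance that at least one retry is needed rather than the chance of a lost mapping. The function name collision_probability() is hypothetical.

```python
import math

def collision_probability(num_urls, id_space):
    """Approximate the chance that at least two of num_urls IDs coincide."""
    # Birthday approximation: p = 1 - exp(-n(n-1) / (2N))
    exponent = num_urls * (num_urls - 1) / (2 * id_space)
    return -math.expm1(-exponent)

HEX_SPACE = 16 ** 6      # 6 hexadecimal characters (Solution 1)
BASE62_SPACE = 62 ** 6   # 6 Base62 characters (Solution 2)

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} URLs: "
          f"hex prefix {collision_probability(n, HEX_SPACE):.4f}, "
          f"Base62 ID {collision_probability(n, BASE62_SPACE):.6f}")
```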


