w3resource

Python URL Shortener Project: Comparing Hashing and Base62 Encoding Methods


URL Shortener:

Input values:
User provides a long URL to be shortened.

Output value:
Shortened URL generated and associated with the provided long URL.

Example:

Input values:
Input the long URL to be shortened: https://www.w3resource.com/python-exercises/python-conditional-statements-and-loop-exercises.php
Output value:
Short URL: https://short.url/sw12r 

Here are two different solutions for a "URL Shortener" project in Python. The goal is to create a program that takes a long URL provided by the user and returns a shortened version.

Solution 1: Using Hashing

Code:

# Solution 1: URL Shortener Using Hashing

import hashlib  # Import hashlib to generate a hash for the long URL

# Dictionary to store the mapping between long URLs and short URLs
url_mapping = {}

def shorten_url(long_url):
    """Generate a short URL by hashing the long URL."""
    # Hash the URL with SHA-256 and keep the first 6 hex characters (short, but not guaranteed unique)
    short_url_hash = hashlib.sha256(long_url.encode()).hexdigest()[:6]
    short_url = f"https://short.url/{short_url_hash}"
    
    # Store the mapping in the dictionary
    url_mapping[short_url] = long_url
    
    return short_url

def main():
    """Main function to interact with the user."""
    # Get user input for the long URL
    long_url = input("Input the long URL to be shortened: ")
    
    # Generate a shortened URL
    short_url = shorten_url(long_url)
    
    # Display the shortened URL
    print(f"Short URL: {short_url}")

# Run the URL Shortener only when this file is executed as a script
if __name__ == "__main__":
    main()

Output:

Input the long URL to be shortened: https://www.w3resource.com/python-exercises/python-conditional-statements-and-loop-exercises.php
Short URL: https://short.url/2d933e

Explanation:

  • hashlib library: Used to generate a hash for the long URL.
  • Hash-based approach:
    • shorten_url(long_url) function generates a short URL by hashing the long URL using SHA-256 and takes the first 6 characters of the hash.
    • The mapping between the long URL and the short URL is stored in a dictionary url_mapping.
  • main() function: Handles user input and outputs the generated short URL.
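  • Advantages: Quick and simple, and deterministic: the same long URL always produces the same short URL.
  • Drawback: Six hexadecimal characters allow only 16^6 = 16,777,216 distinct values, so two different URLs can produce the same short URL. When that happens, this code silently overwrites the earlier entry in url_mapping.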
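One way to keep the hash-based approach while avoiding silent overwrites is to check the dictionary before storing and to rehash with a small suffix when the slot is taken by a different URL. The following is a minimal sketch under that assumption, not part of the original solution; the function name shorten_url_checked, the "#n" suffix scheme, and the max_attempts parameter are hypothetical.

```python
import hashlib

# In-memory store, mirroring url_mapping in Solution 1
url_mapping = {}

def shorten_url_checked(long_url, length=6, max_attempts=10):
    """Hash the URL; if the ID is taken by a different URL, rehash with a suffix."""
    for attempt in range(max_attempts):
        # Attempt 0 hashes the URL alone; later attempts append "#1", "#2", ...
        candidate = long_url if attempt == 0 else f"{long_url}#{attempt}"
        short_id = hashlib.sha256(candidate.encode()).hexdigest()[:length]
        short_url = f"https://short.url/{short_id}"

        existing = url_mapping.get(short_url)
        if existing is None:
            # Free slot: store the mapping and return
            url_mapping[short_url] = long_url
            return short_url
        if existing == long_url:
            # This URL was shortened before: reuse its short URL
            return short_url
        # Otherwise the slot belongs to a different URL: try the next suffix

    raise RuntimeError("No free short ID found; consider a longer ID length")
```

Shortening the same long URL twice still returns the same short URL, while a clash with a different URL moves on to the next candidate instead of replacing the stored entry.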

Solution 2: Using Base62 Encoding

Code:

# Solution 2: URL Shortener Using Base62 Encoding

import string  # Import string module for characters used in Base62
import random  # Import random module to generate random IDs

# Dictionary to store the mapping between long URLs and short URLs
url_mapping = {}
# Characters used for Base62 encoding (0-9, a-z, A-Z)
base62_chars = string.digits + string.ascii_letters

def generate_short_id(length=6):
    """Generate a random short ID of the given length from the Base62 alphabet."""
    return ''.join(random.choice(base62_chars) for _ in range(length))

def shorten_url(long_url):
    """Generate a short URL with a random Base62 ID."""
    # Generate a random short ID for the URL
    short_id = generate_short_id()
    short_url = f"https://short.url/{short_id}"
    
    # Ensure that the short URL is unique
    while short_url in url_mapping:
        short_id = generate_short_id()
        short_url = f"https://short.url/{short_id}"
    
    # Store the mapping in the dictionary
    url_mapping[short_url] = long_url
    
    return short_url

def main():
    """Main function to interact with the user."""
    # Get user input for the long URL
    long_url = input("Input the long URL to be shortened: ")
    
    # Generate a shortened URL
    short_url = shorten_url(long_url)
    
    # Display the shortened URL
    print(f"Short URL: {short_url}")

# Run the URL Shortener only when this file is executed as a script
if __name__ == "__main__":
    main()

Output:

Input the long URL to be shortened: https://www.w3resource.com/python-exercises/list-advanced/index.php
Short URL: https://short.url/fBeoWj

Explanation:

  • Base62 alphabet:
    • generate_short_id(length) builds a random ID of the specified length (default 6 characters) from digits, lowercase letters, and uppercase letters (62 possible characters). The ID is chosen at random; it is not an encoding of the long URL itself.
    • shorten_url(long_url) builds a short URL from the random ID and regenerates the ID until the short URL is not already a key in url_mapping.
  • main() function: Handles user input and outputs the generated short URL.
  • Advantages: Six Base62 characters give 62^6 (about 56.8 billion) possible IDs, compared with 16^6 (about 16.8 million) for six hexadecimal characters, and the uniqueness check means an existing mapping is never overwritten. The length of the short URL can be adjusted through the length parameter of generate_short_id().
  • Drawback: Because the ID is random, shortening the same long URL twice produces two different short URLs.
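Solution 2 picks random characters from the Base62 alphabet. For comparison, a Base62 encoding in the stricter sense converts a number into Base62 digits. The sketch below assumes each URL is given a sequential integer ID, which is then encoded; the names base62_encode, base62_decode, shorten_url_sequential, and the counter are hypothetical and not part of the original solution.

```python
import itertools
import string

# Same alphabet as base62_chars in Solution 2: 0-9, a-z, A-Z
BASE62 = string.digits + string.ascii_letters

def base62_encode(number):
    """Convert a non-negative integer to its Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    # Digits were produced least significant first, so reverse them
    return ''.join(reversed(digits))

def base62_decode(text):
    """Convert a Base62 string back to the integer it encodes."""
    number = 0
    for char in text:
        number = number * 62 + BASE62.index(char)
    return number

url_mapping = {}
_next_id = itertools.count(1)  # assumed sequential ID source, starting at 1

def shorten_url_sequential(long_url):
    """Assign the next integer ID to the URL and encode it in Base62."""
    short_url = f"https://short.url/{base62_encode(next(_next_id))}"
    url_mapping[short_url] = long_url
    return short_url
```

Because every ID comes from a counter, no two URLs can receive the same short URL and no retry loop is needed. The trade-off is that sequential IDs are easy to guess, which may or may not matter for a given use.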

Summary of Differences:

  • Solution 1: Using Hashing
    • Quick and simple to implement.
    • Potential risk of hash collisions (two different URLs generating the same hash).
  • Solution 2: Using Base62 Encoding
    • Generates a random short ID from the Base62 alphabet and regenerates it if it is already in use.
    • Offers a much larger ID space for the same length, and the length of the short URL is easy to customize.
    • Suitable for generating short URLs where uniqueness is important.

Both solutions are effective, but Solution 2 provides better control over the length and format of the short URL while minimizing the risk of collisions.
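To put rough numbers on the collision risk, the birthday approximation estimates the chance that at least two of n IDs coincide in a space of N possible IDs. The sketch below assumes the IDs are uniformly distributed, which is only an approximation for truncated SHA-256 digests; the function name collision_probability is hypothetical.

```python
import math

def collision_probability(num_urls, id_space):
    """Approximate chance that at least two of num_urls uniform IDs coincide."""
    # Birthday approximation: p is about 1 - exp(-n(n-1) / (2N))
    exponent = num_urls * (num_urls - 1) / (2 * id_space)
    return -math.expm1(-exponent)

hex_space = 16 ** 6     # six hexadecimal characters (Solution 1)
base62_space = 62 ** 6  # six Base62 characters (Solution 2)

for n in (1_000, 10_000, 100_000):
    p_hex = collision_probability(n, hex_space)
    p_b62 = collision_probability(n, base62_space)
    print(f"{n:>7} URLs: hex {p_hex:.4f}, Base62 {p_b62:.6f}")
```

For Solution 1 this is the chance that some mapping gets overwritten. For Solution 2 it is only the chance that the retry loop runs at least once, since a clash there leads to a new ID rather than a lost mapping.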


