Efficiently Import CSV Data into SQLite
Converting CSV to SQLite: Efficient Data Import
Converting data from a CSV file to an SQLite database is a common task for data manipulation and storage. The SQLite command-line interface provides the built-in .import command, and languages such as Python include libraries that can perform the conversion programmatically.
This guide covers three ways to import CSV data into SQLite: the command-line tool, Python's built-in sqlite3 and csv modules, and the Pandas library.
Syntax (Command-line Import)
To import a CSV file directly into SQLite:
sqlite3 database_name.db
.mode csv
.import file_name.csv table_name
Examples
1. Using SQLite Command-line Tool
Code:
# Open the SQLite CLI and specify the database
sqlite3 my_database.db
# Set the mode to CSV for import
.mode csv
# Import the CSV file into the specified table
.import my_data.csv my_table
Explanation:
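- The .mode csv command tells the CLI to parse the input file as comma-separated values.
- .import reads my_data.csv and inserts its rows into my_table.
If my_table does not exist, SQLite creates it and uses the first CSV row as the column names. If the table already exists, every row, including the header, is imported as data; in SQLite 3.32.0 and later, .import --skip 1 my_data.csv my_table skips the header row.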
2. Import CSV with Python
Using Python's sqlite3 and csv modules, you can automate the import process.
Code:
# Import required modules
import sqlite3 # For SQLite database operations
import csv # For handling CSV files
# Establish a connection to the SQLite database (creates the database if it doesn't exist)
conn = sqlite3.connect('my_database.db')
# Create a cursor object to execute SQL commands
cursor = conn.cursor()
# Create a table (if it doesn't already exist)
cursor.execute('''
    CREATE TABLE IF NOT EXISTS my_table (
        id INTEGER PRIMARY KEY,
        name TEXT,
        age INTEGER,
        city TEXT
    )
''')
# Open the CSV file (newline='' lets the csv module handle line endings itself)
with open('my_data.csv', 'r', newline='') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)
    # Skip the header row
    next(csv_reader)
    # Insert each row into the table
    for row in csv_reader:
        cursor.execute('''
            INSERT INTO my_table (id, name, age, city) VALUES (?, ?, ?, ?)
        ''', row)
# Commit changes and close the connection
conn.commit()
conn.close()
Explanation:
1. Establishes a connection to my_database.db.
2. Creates the table my_table if it does not already exist.
3. Opens my_data.csv and reads its content.
4. Skips the header and inserts each row into the database.
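The loop above relies on the CSV columns appearing in the same order as the columns in the INSERT statement. As a minimal sketch of an alternative, the example below matches values by header name with csv.DictReader and named placeholders, and sends all rows through executemany. The function name import_by_header, the in-memory database, and the sample rows are illustrative assumptions, not part of the original example.

```python
import csv
import io
import sqlite3

COLUMNS = ("id", "name", "age", "city")  # columns of my_table from the example


def import_by_header(conn, csv_file, table="my_table"):
    """Insert rows by header name, so CSV column order does not matter."""
    reader = csv.DictReader(csv_file)  # first row supplies the dict keys
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    # Named placeholders (:id, :name, ...) are filled from each row dict.
    sql = (
        f"INSERT INTO {table} (id, name, age, city) "
        "VALUES (:id, :name, :age, :city)"
    )
    with conn:  # commit on success, roll back if any row fails
        conn.executemany(sql, reader)


# Demo with an in-memory database and an in-memory "file" (made-up sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
)
sample = io.StringIO("city,age,name,id\nParis,30,Alice,1\nOslo,25,Bob,2\n")
import_by_header(conn, sample)
rows = conn.execute("SELECT id, name, age, city FROM my_table ORDER BY id").fetchall()
print(rows)  # [(1, 'Alice', 30, 'Paris'), (2, 'Bob', 25, 'Oslo')]
```

To use it with a real file, pass an object opened with open('my_data.csv', newline='') instead of the StringIO sample.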
3. Import with Pandas (Python)
If you prefer Python’s Pandas library:
Code:
# Import Pandas and SQLite modules
import pandas as pd
import sqlite3
# Read the CSV file into a Pandas DataFrame
df = pd.read_csv('my_data.csv')
# Establish a database connection
conn = sqlite3.connect('my_database.db')
# Write the DataFrame to an SQLite table
df.to_sql('my_table', conn, if_exists='replace', index=False)
# Close the connection
conn.close()
Explanation:
1. Reads the CSV file into a Pandas DataFrame.
2. Uses to_sql() to write the DataFrame into the SQLite table my_table.
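3. The if_exists='replace' parameter drops and recreates my_table if it already exists, so any existing rows are lost. Use if_exists='append' to add rows to an existing table instead.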
4. Error Handling and Best Practices
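1. Header Matching: Ensure the CSV headers match the table column names and order, because .import and positional INSERT statements assign values by position, not by name.
2. Data Validation: Check for data type mismatches before inserting. With its default flexible typing, SQLite stores a non-numeric value in an INTEGER column as text rather than rejecting it, so bad data can pass unnoticed.
3. Transaction Management: Wrap the import in a transaction so that it either completes fully or leaves the table unchanged. This matters most for large datasets, where a single transaction is also much faster than committing row by row.
4. Command-line Debugging: Use .mode list or .mode column to set the output format, then run a query such as SELECT * FROM my_table LIMIT 5; to verify the imported data.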
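Data validation and transaction management can be combined in a short sketch like the one below. It assumes the four-column my_table layout used earlier; the helper names parse_row and import_atomically and the sample rows are hypothetical. Each row is converted to typed values before insertion, and the whole import runs in one transaction, so a single bad row leaves the table unchanged.

```python
import csv
import io
import sqlite3


def parse_row(row, line_no):
    """Convert one CSV row to typed values, or raise ValueError naming the line."""
    if len(row) != 4:
        raise ValueError(f"line {line_no}: expected 4 fields, got {len(row)}")
    try:
        return int(row[0]), row[1].strip(), int(row[2]), row[3].strip()
    except ValueError as exc:
        raise ValueError(f"line {line_no}: {exc}") from exc


def import_atomically(conn, csv_file):
    """Import all rows in one transaction; any bad row undoes the whole import."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if there is one
    try:
        with conn:  # commits at the end, or rolls back if an exception escapes
            for line_no, row in enumerate(reader, start=2):
                conn.execute(
                    "INSERT INTO my_table (id, name, age, city) VALUES (?, ?, ?, ?)",
                    parse_row(row, line_no),
                )
    except (ValueError, sqlite3.Error) as exc:
        return f"import aborted: {exc}"
    return "ok"


# Demo with made-up data: the second file has a non-numeric age on line 3.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
)
good = io.StringIO("id,name,age,city\n1,Alice,30,Paris\n2,Bob,25,Oslo\n")
bad = io.StringIO("id,name,age,city\n3,Cara,41,Rome\n4,Dan,abc,Lima\n")
good_status = import_atomically(conn, good)
bad_status = import_atomically(conn, bad)
count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(good_status, "|", bad_status, "|", count)
```

After both calls the table still holds only the two rows from the first file, because the valid row for Cara is rolled back together with the invalid one.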
Advanced Features
- Skipping Specific Columns: Use a custom script to modify or clean CSV data before importing.
- Batch Inserts: Import data in batches for better performance, especially for large files.
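Both advanced techniques can be sketched together in Python. The example below is one possible approach under stated assumptions: the CSV has an extra column named notes that should not be stored (a hypothetical column), and rows are inserted in batches with executemany, one transaction per batch. The helper names batched_rows and import_in_batches and the generated sample data are illustrative only.

```python
import csv
import io
import itertools
import sqlite3

WANTED = ("id", "name", "age", "city")  # columns to keep, in table order


def batched_rows(reader, keep, batch_size):
    """Yield lists of up to batch_size tuples holding only the kept columns."""
    trimmed = (tuple(row[i] for i in keep) for row in reader)
    while True:
        batch = list(itertools.islice(trimmed, batch_size))
        if not batch:
            return
        yield batch


def import_in_batches(conn, csv_file, batch_size=1000):
    """Insert the wanted columns in batches and return the number of rows inserted."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Look up each wanted column by name; any other column is skipped.
    keep = [header.index(name) for name in WANTED]
    total = 0
    for batch in batched_rows(reader, keep, batch_size):
        with conn:  # one transaction per batch
            conn.executemany(
                "INSERT INTO my_table (id, name, age, city) VALUES (?, ?, ?, ?)",
                batch,
            )
        total += len(batch)
    return total


# Demo with generated sample data that includes an unwanted "notes" column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
)
lines = ["id,name,notes,age,city"] + [
    f"{i},user{i},ignored,{20 + i},Town{i}" for i in range(1, 6)
]
sample = io.StringIO("\n".join(lines) + "\n")
inserted = import_in_batches(conn, sample, batch_size=2)
first = conn.execute("SELECT id, name, age, city FROM my_table ORDER BY id").fetchone()
print(inserted, first)
```

A batch size of 2 is used here only to exercise the batching logic on five rows; for large files a value in the thousands is a more typical starting point, and the best value depends on row size and available memory.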