MySQL Partitioning
What is Partitioning?
Partitioning (a database design technique) improves performance, manageability, simplifies maintenance and reduce the cost of storing large amounts of data. Partitioning can be achieved without splitting tables by physically putting tables on individual disk drives. Partitioning allows tables, indexes, and index-organized tables to be subdivided into smaller pieces, therefore queries that access only a fraction of the data can run faster because there is fewer data to scan. There are two major forms of partitioning :
- Horizontal Partitioning : Horizontal partitioning divides table rows into multiple partitions (based on a logic). All columns defined to a table are found in each partition, so no actual table attributes are missing. All the partition can be addressed individually or collectively. For example, a table that contains whole year sale transaction being partitioned horizontally into twelve distinct partitions, where each partition contains one month's data.
- Vertical Partitioning : Vertical partitioning divides a table into multiple tables that contain fewer columns. Like horizontal partitioning, in vertical partitioning a query scan fewer data which increases query performance. For example, a table that contains a number of very wide text or BLOB columns that aren't addressed often being broken into two tables that have the most referenced columns in one table and the text or BLOB data in another.
MySQL partitioning
Version: MySQL 5.6
MySQL supports basic table partitioning but does not support vertical partitioning ( MySQL 5.6). This section describes in detail how to implement partitioning as part of your database.
By checking the output of the SHOW PLUGINS statement you will be sure whether your MySQL server supports the partition or not. See the following output :
Sample Output:
mysql> SHOW PLUGINS; +----------------------------+----------+--------------------+---------+---------+ | Name | Status | Type | Library | License | +----------------------------+----------+--------------------+---------+---------+ | binlog | ACTIVE | STORAGE ENGINE | NULL | GPL | | MySQL_native_password | ACTIVE | AUTHENTICATION | NULL | GPL | | MySQL_old_password | ACTIVE | AUTHENTICATION | NULL | GPL | | sha256_password | ACTIVE | AUTHENTICATION | NULL | GPL | | CSV | ACTIVE | STORAGE ENGINE | NULL | GPL | | MEMORY | ACTIVE | STORAGE ENGINE | NULL | GPL | | MyISAM | ACTIVE | STORAGE ENGINE | NULL | GPL | | MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | GPL | | ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | GPL | | BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | GPL | | FEDERATED | DISABLED | STORAGE ENGINE | NULL | GPL | | InnoDB | ACTIVE | STORAGE ENGINE | NULL | GPL | | INNODB_TRX | ACTIVE | INFORMATION SCHEMA | NULL | GPL | | INNODB_LOCKS | ACTIVE | INFORMATION SCHEMA | NULL | GPL | | INNODB_CMPMEM | ACTIVE | INFORMATION SCHEMA | NULL | GPL | | ---------------------------| ------ | ------------------ | ---- | --- | | ---------------------------| ------ | ------------------ | ---- | --- | | ---------------------------| ------ | ------------------ | ---- | --- | | INNODB_SYS_TABLESPACES | ACTIVE | INFORMATION SCHEMA | NULL | GPL | | INNODB_SYS_DATAFILES | ACTIVE | INFORMATION SCHEMA | NULL | GPL | | PERFORMANCE_SCHEMA | ACTIVE | STORAGE ENGINE | NULL | GPL | | partition | ACTIVE | STORAGE ENGINE | NULL | GPL | +----------------------------+----------+--------------------+---------+---------+ 42 rows in set (0.21 sec)
MySQL 5.6 Community binaries include partitioning support.
Enable and disable partitioning support :
- To enable partitioning (if you are compiling MySQL 5.6 from source), the build must be configured with the -DWITH_PARTITION_STORAGE_ENGINE option.
- To disable partitioning support, you can start the MySQL Server with the --skip-partition option, in which case the value of have_partitioning is DISABLED.
How to partition a table?
In MySQL you can partition a table using CREATE TABLE or ALTER TABLE command. See the following CREATE TABLE syntax :
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name (create_definition,...) [table_options] [partition_options]
partition_options:
PARTITION BY { [LINEAR] HASH(expr) | [LINEAR] KEY(column_list) | RANGE(expr) | LIST(expr) } [PARTITIONS num] [SUBPARTITION BY { [LINEAR] HASH(expr) | [LINEAR] KEY(column_list) } [SUBPARTITIONS num] ] [(partition_definition[, partition_definition] ...)
partition_definition:
PARTITION partition_name [VALUES {LESS THAN {(expr) | MAXVALUE} | IN (value_list)}] [[STORAGE] ENGINE [=] engine_name] [COMMENT [=] 'comment_text' ] [DATA DIRECTORY [=] 'data_dir'] [INDEX DIRECTORY [=] 'index_dir'] [MAX_ROWS [=] max_number_of_rows] [MIN_ROWS [=] min_number_of_rows] [TABLESPACE [=] tablespace_name] [NODEGROUP [=] node_group_id] [(subpartition_definition [, subpartition_definition] ...)]
subpartition_definition:
SUBPARTITION logical_name [[STORAGE] ENGINE [=] engine_name] [COMMENT [=] 'comment_text' ] [DATA DIRECTORY [=] 'data_dir'] [INDEX DIRECTORY [=] 'index_dir'] [MAX_ROWS [=] max_number_of_rows] [MIN_ROWS [=] min_number_of_rows] [TABLESPACE [=] tablespace_name] [NODEGROUP [=] node_group_id]
ALTER TABLE: Partition operations
ALTER TABLE statement can be used for adding, dropping, merging, and splitting partitions, and for performing partitioning maintenance. Here we have defined a nonpartitioned table:
CREATE TABLE sale_mast (
bill_no INT,
bill_date DATETIME
);
This table can be partitioned by HASH (or in another type), using the bill_no column as the partitioning key, into 6 (or other) partitions using ALTER TABLE statement :
-- Alter the table "t1" to implement partitioning using HASH partitioning method
ALTER TABLE t1 PARTITION BY HASH(id)
-- Define the number of partitions to be created as 6
PARTITIONS 6;
Explanation:
- This SQL statement alters the table named "t1" to implement partitioning using the HASH partitioning method.
- ALTER TABLE is a SQL statement used to modify the structure or properties of a table.
- t1 is the name of the table being altered.
- PARTITION BY HASH(id) specifies the partitioning method to be used. In this case, the HASH partitioning method is used, and it is applied on the column "id".
- HASH partitioning divides rows into partitions based on a hash function applied to the specified column. This method evenly distributes rows across partitions based on the hash value of the column.
- PARTITIONS 6 indicates the number of partitions to be created. In this case, the table will be partitioned into 6 partitions.
- Partitioning can improve query performance by distributing data across multiple physical storage devices, enhancing data retrieval and management efficiency. It is often used for large tables to optimize query processing and data maintenance operations.
Partition naming :
Names of partitions follow the rules of other MySQL identifiers such as databases, tables, constraint, stored procedure etc. Partition names are not case-sensitive.
Advantages of partitioning
- During the scan operation, MySQL optimizer accesses those partitions that will satisfy a particular query. For example, a whole year sale records table may be broken up into 4 partitions (i.e. sale data from of Apr-Jun (partition p0), Jul-Sep (partition p1) , Oct-Dec (partition p2), Jam-Mar (partition p0)) . If a query is issued that contains sale data between Jul-Sep quarter, then it scans the partition p1 only instead of total table records and the query will complete much sooner.
- Partitioning allows you to have more control over how data is managed inside the database. For example, you can drop specific partitions in a partitioned table where data loses its usefulness. The process of adding new data, in some cases, be greatly facilitated by adding one or more new partitions for storing that data using ALTER TABLE command.
- In partitioning, it is possible to store more data in one table than can be held on a single disk or file system partition.
- MySQL 5.6 supports explicit partition selection for queries. For example, SELECT * FROM table1 PARTITION (p0,p1) WHERE col1< 10 selects only those rows in partitions p0 and p1 that match the WHERE condition, this can greatly speed up queries
- Partition selection also supports the data modification statements DELETE, INSERT, REPLACE, UPDATE, and LOAD DATA, LOAD XML.
Types of MySQL partitioning
Following types of partitioning are available in MySQL 5.6 :
- RANGE Partitioning
- LIST Partitioning
- COLUMNS Partitioning
- HASH Partitioning
- KEY Partitioning
- Subpartitioning
MySQL RANGE Partitioning
In MySQL, RANGE partitioning mode allows us to specify various ranges for which data is assigned. Ranges should be contiguous but not overlapping, and are defined using the VALUES LESS THAN operator. In the following example, sale_mast table contains four columns bill_no, bill_date, cust_code and amount. This table can be partitioned by range in various of ways, depending on your requirement. Here we have used the bill_date column and decide to partition the table 4 ways by adding a PARTITION BY RANGE clause. In these partitions the range of the sale date (sale_date) are as of follow :
- partition p0 ( sale between 01-01-2013 to 31-03-2013)
- partition p1 ( sale between 01-04-2013 to 30-06-2013)
- partition p2 ( sale between 01-07-2013 to 30-09-2013)
- partition p3 ( sale between 01-10-2013 to 30-12-2013)
Let create the table :
-- Create a new table named "sale_mast"
CREATE TABLE sale_mast (
-- Define a column named "bill_no" of data type INT that cannot contain NULL values
bill_no INT NOT NULL,
-- Define a column named "bill_date" of data type TIMESTAMP that cannot contain NULL values
bill_date TIMESTAMP NOT NULL,
-- Define a column named "cust_code" of data type VARCHAR(15) that cannot contain NULL values
cust_code VARCHAR(15) NOT NULL,
-- Define a column named "amount" of data type DECIMAL(8,2) that cannot contain NULL values
amount DECIMAL(8,2) NOT NULL
)
-- Partition the table by range based on the UNIX timestamp of the "bill_date" column
PARTITION BY RANGE (UNIX_TIMESTAMP(bill_date)) (
-- Define partition p0 for values less than the UNIX timestamp of '2013-04-01'
PARTITION p0 VALUES LESS THAN (UNIX_TIMESTAMP('2013-04-01')),
-- Define partition p1 for values less than the UNIX timestamp of '2013-07-01'
PARTITION p1 VALUES LESS THAN (UNIX_TIMESTAMP('2013-07-01')),
-- Define partition p2 for values less than the UNIX timestamp of '2013-10-01'
PARTITION p2 VALUES LESS THAN (UNIX_TIMESTAMP('2013-10-01')),
-- Define partition p3 for values less than the UNIX timestamp of '2014-01-01'
PARTITION p3 VALUES LESS THAN (UNIX_TIMESTAMP('2014-01-01'))
);
Explanation:
- This SQL statement creates a new table named "sale_mast" with four columns and defines range-based partitioning based on the UNIX timestamp of the "bill_date" column.
- CREATE TABLE is a SQL statement used to create a new table.
- sale_mast is the name of the new table being created.
- The table has four columns:
- bill_no: An integer column storing bill numbers, defined as NOT NULL.
- bill_date: A timestamp column storing bill dates, defined as NOT NULL.
- cust_code: A variable character column storing customer codes, defined as NOT NULL.
- amount: A decimal column storing amounts, defined as NOT NULL.
- PARTITION BY RANGE (UNIX_TIMESTAMP(bill_date)): Specifies that the table will be partitioned by range based on the UNIX timestamp of the "bill_date" column.
- The partitions are defined as follows:
- PARTITION p0 VALUES LESS THAN (UNIX_TIMESTAMP('2013-04-01')): Partition "p0" contains rows with bill dates before April 1, 2013.
- PARTITION p1 VALUES LESS THAN (UNIX_TIMESTAMP('2013-07-01')): Partition "p1" contains rows with bill dates before July 1, 2013.
- PARTITION p2 VALUES LESS THAN (UNIX_TIMESTAMP('2013-10-01')): Partition "p2" contains rows with bill dates before October 1, 2013.
- PARTITION p3 VALUES LESS THAN (UNIX_TIMESTAMP('2014-01-01')): Partition "p3" contains rows with bill dates before January 1, 2014.
- Range-based partitioning allows data to be distributed across partitions based on specified ranges of values. In this case, the partitions are defined based on different time periods determined by the "bill_date" column.
Now insert some records in sale_mast table :
mysql> INSERT INTO sale_mast VALUES (1, '2013-01-02', 'C001', 125.56), (2, '2013-01-25', 'C003', 456.50), (3, '2013-02-15', 'C012', 365.00), (4, '2013-03-26', 'C345', 785.00), (5, '2013-04-19', 'C234', 656.00), (6, '2013-05-31', 'C743', 854.00), (7, '2013-06-11', 'C234', 542.00), (8, '2013-07-24', 'C003', 300.00), (8, '2013-08-02', 'C456', 475.20); Query OK, 9 rows affected (0.07 sec) Records: 9 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM sale_mast; +---------+---------------------+-----------+--------+ | bill_no | bill_date | cust_code | amount | +---------+---------------------+-----------+--------+ | 1 | 2013-01-02 00:00:00 | C001 | 125.56 | | 2 | 2013-01-25 00:00:00 | C003 | 456.50 | | 3 | 2013-02-15 00:00:00 | C012 | 365.00 | | 4 | 2013-03-26 00:00:00 | C345 | 785.00 | | 5 | 2013-04-19 00:00:00 | C234 | 656.00 | | 6 | 2013-05-31 00:00:00 | C743 | 854.00 | | 7 | 2013-06-11 00:00:00 | C234 | 542.00 | | 8 | 2013-07-24 00:00:00 | C003 | 300.00 | | 9 | 2013-08-02 00:00:00 | C456 | 475.20 | +---------+---------------------+-----------+--------+ 9 rows in set (0.00 sec)
Here is the partition status of sale_mast table:
mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='sale_mast'; +----------------+------------+ | PARTITION_NAME | TABLE_ROWS | +----------------+------------+ | p0 | 4 | | p1 | 3 | | p2 | 2 | | p3 | 0 | +----------------+------------+ 4 rows in set (0.02 sec)
In the above way you can partition the table based on sale amount (amount. In these partitions the range of the sale amount (amount) are as of follow :
- partition p0 ( sale amount < 100 )
- partition p1 ( sale amount < 500 )
- partition p2 ( sale amount <1000 )
- partition p3 ( sale amount<1500 )
Let create the table :
-- Create a new table named "sale_mast1"
CREATE TABLE sale_mast1 (
-- Define a column named "bill_no" of data type INT that cannot contain NULL values
bill_no INT NOT NULL,
-- Define a column named "bill_date" of data type TIMESTAMP that cannot contain NULL values
bill_date TIMESTAMP NOT NULL,
-- Define a column named "cust_codE" of data type VARCHAR(15) that cannot contain NULL values
cust_codE VARCHAR(15) NOT NULL,
-- Define a column named "amount" of data type INT that cannot contain NULL values
amount INT NOT NULL
)
-- Partition the table by range based on the values of the "amount" column
PARTITION BY RANGE (amount) (
-- Define partition p0 for values less than 100
PARTITION p0 VALUES LESS THAN (100),
-- Define partition p1 for values less than 500
PARTITION p1 VALUES LESS THAN (500),
-- Define partition p2 for values less than 1000
PARTITION p2 VALUES LESS THAN (1000),
-- Define partition p3 for values less than 1500
PARTITION p3 VALUES LESS THAN (1500)
);
Explanation:
- This SQL statement creates a new table named "sale_mast1" with four columns and defines range-based partitioning based on the values of the "amount" column.
- CREATE TABLE is a SQL statement used to create a new table.
- sale_mast1 is the name of the new table being created.
- The table has four columns:
- bill_no: An integer column storing bill numbers, defined as NOT NULL.
- bill_date: A timestamp column storing bill dates, defined as NOT NULL.
- cust_codE: A variable character column storing customer codes, defined as NOT NULL.
- amount: An integer column storing amounts, defined as NOT NULL.
- PARTITION BY RANGE (amount): Specifies that the table will be partitioned by range based on the values of the "amount" column.
- The partitions are defined as follows:
- PARTITION p0 VALUES LESS THAN (100): Partition "p0" contains rows with "amount" values less than 100.
- PARTITION p1 VALUES LESS THAN (500): Partition "p1" contains rows with "amount" values less than 500.
- PARTITION p2 VALUES LESS THAN (1000): Partition "p2" contains rows with "amount" values less than 1000.
- PARTITION p3 VALUES LESS THAN (1500): Partition "p3" contains rows with "amount" values less than 1500.
- Range-based partitioning allows data to be distributed across partitions based on specified ranges of values. In this case, the partitions are defined based on different ranges of values in the "amount" column.
Drop a MySQL partition
If you feel some data are useless in a partitioned table you can drop one or more partition(s). To delete all rows from partition p0 of sale_mast, you can use the following statement :
-- Truncate the data in partition "p0" of the table "sale_mast"
ALTER TABLE sale_mast TRUNCATE PARTITION p0;
Explanation:
- This SQL statement truncates the data contained within partition "p0" of the table "sale_mast".
- ALTER TABLE is a SQL statement used to modify the structure or properties of a table.
- sale_mast is the name of the table being altered.
- TRUNCATE PARTITION p0 instructs MySQL to remove all rows from partition "p0" while keeping the partition itself intact.
- Truncating a partition is a fast operation compared to deleting rows individually, as it does not log individual row deletions and frees up space in a more efficient manner.
mysql> SELECT * FROM sale_mast; +---------+---------------------+-----------+--------+ | bill_no | bill_date | cust_code | amount | +---------+---------------------+-----------+--------+ | 5 | 2013-04-19 00:00:00 | C234 | 656.00 | | 6 | 2013-05-31 00:00:00 | C743 | 854.00 | | 7 | 2013-06-11 00:00:00 | C234 | 542.00 | | 8 | 2013-07-24 00:00:00 | C003 | 300.00 | | 9 | 2013-08-02 00:00:00 | C456 | 475.20 | +---------+---------------------+-----------+--------+ 5 rows in set (0.01 sec)
Here is the partition status of sale_mast after dropping the partition p0 :
MySQL> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='sale_mast'; +----------------+------------+ | PARTITION_NAME | TABLE_ROWS | +----------------+------------+ | p0 | 0 | | p1 | 3 | | p2 | 2 | | p3 | 0 | +----------------+------------+ 4 rows in set (0.05 sec)
MySQL LIST Partitioning
List partition allows us to segment data based on a pre-defined set of values (e.g. 1, 2, 3). This is done by using PARTITION BY LIST(expr) where expr is a column value and then defining each partition by means of a VALUES IN (value_list), where value_list is a comma-separated list of integers. In MySQL 5.6, it is possible to match against only a list of integers (and possibly NULL) when partitioning by LIST. In the following example, sale_mast2 table contains four columns bill_no, bill_date, agent_code, and amount. Suppose there are 11 agents represent three cities A, B, C these can be arranged in three partitions with LIST Partitioning as follows :
City | Agent ID |
---|---|
A | 1, 2, 3 |
B | 4, 5, 6 |
C | 7, 8, 9, 10, 11 |
Let create the table :
-- Create a new table named "sale_mast2"
CREATE TABLE sale_mast2 (
-- Define a column named "bill_no" of data type INT that cannot contain NULL values
bill_no INT NOT NULL,
-- Define a column named "bill_date" of data type TIMESTAMP that cannot contain NULL values
bill_date TIMESTAMP NOT NULL,
-- Define a column named "agent_codE" of data type INT that cannot contain NULL values
agent_codE INT NOT NULL,
-- Define a column named "amount" of data type INT that cannot contain NULL values
amount INT NOT NULL
)
-- Partition the table by list based on the values of the "agent_code" column
PARTITION BY LIST(agent_code) (
-- Define partition pA for specific values of agent codes (1, 2, 3)
PARTITION pA VALUES IN (1,2,3),
-- Define partition pB for specific values of agent codes (4, 5, 6)
PARTITION pB VALUES IN (4,5,6),
-- Define partition pC for specific values of agent codes (7, 8, 9, 10, 11)
PARTITION pC VALUES IN (7,8,9,10,11)
);
Explanation:
- This SQL statement creates a new table named "sale_mast2" with four columns and defines list-based partitioning based on the values of the "agent_code" column.
- CREATE TABLE is a SQL statement used to create a new table.
- sale_mast2 is the name of the new table being created.
- The table has four columns:
- bill_no: An integer column storing bill numbers, defined as NOT NULL.
- bill_date: A timestamp column storing bill dates, defined as NOT NULL.
- agent_codE: An integer column storing agent codes, defined as NOT NULL.
- amount: An integer column storing amounts, defined as NOT NULL.
- PARTITION BY LIST(agent_code): Specifies that the table will be partitioned by list based on the values of the "agent_code" column.
- The partitions are defined as follows:
- PARTITION pA VALUES IN (1,2,3): Partition "pA" contains rows with agent codes 1, 2, or 3.
- PARTITION pB VALUES IN (4,5,6): Partition "pB" contains rows with agent codes 4, 5, or 6.
- PARTITION pC VALUES IN (7,8,9,10,11): Partition "pC" contains rows with agent codes 7, 8, 9, 10, or 11.
- List-based partitioning allows data to be distributed across partitions based on specified lists of discrete values. In this case, the partitions are defined based on specific values of the "agent_code" column.
MySQL COLUMNS Partitioning
In COLUMNS partitioning it is possible to use multiple columns in partitioning keys. There are two types of COLUMNS partitioning :
In addition, both RANGE COLUMNS partitioning and LIST COLUMNS partitioning support the use of non-integer columns for defining value ranges or list members. The permitted data types are shown in the following list:
Both RANGE COLUMNS partitioning and LIST COLUMNS partitioning support following data types for defining value ranges or list members.
- All integer types: TINYINT, SMALLINT, MEDIUMINT, INT (INTEGER), and BIGINT.
- DATE and DATETIME.
RANGE COLUMNS partitioning
RANGE COLUMNS partitioning is similar to range partitioning with some significant difference. RANGE COLUMNS accepts a list of one or more columns as partition keys. You can define the ranges using various columns of types (mentioned above) other than integer types.
Here is the basic syntax for creating a table partitioned by RANGE COLUMNS :
CREATE TABLE table_name PARTITIONED BY RANGE COLUMNS(column_list) ( PARTITION partition_name VALUES LESS THAN (value_list)[, PARTITION partition_name VALUES LESS THAN (value_list)][, ...] ) column_list: column_name[, column_name][, ...] value_list: value[, value][, ...]
- column_list is a list of one or more columns.
- value_list is a list of values and must be supplied for each partition definition.
- column list and in the value list defining each partition must occur in the same order
- The order of the column names in the partitioning column list and the value lists do not have to be the same as the order of the table column definitions in CREATE TABLE statement.
Here is an example :
-- Create a new table named "table3"
CREATE TABLE table3 (
-- Define a column named "col1" of data type INT
col1 INT,
-- Define a column named "col2" of data type INT
col2 INT,
-- Define a column named "col3" of data type CHAR(5)
col3 CHAR(5),
-- Define a column named "col4" of data type INT
col4 INT
)
-- Partition the table by range columns (col1, col2, col3)
PARTITION BY RANGE COLUMNS(col1, col2, col3)
(
-- Define partition p0 for values less than (50, 100, 'aaaaa')
PARTITION p0 VALUES LESS THAN (50, 100, 'aaaaa'),
-- Define partition p1 for values less than (100, 200, 'bbbbb')
PARTITION p1 VALUES LESS THAN (100, 200, 'bbbbb'),
-- Define partition p2 for values less than (150, 300, 'ccccc')
PARTITION p2 VALUES LESS THAN (150, 300, 'ccccc'),
-- Define partition p3 for values less than (MAXVALUE, MAXVALUE, MAXVALUE)
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE, MAXVALUE)
);
Explanation:
- This SQL statement creates a new table named "table3" with four columns and defines range-based partitioning based on the values of columns "col1", "col2", and "col3".
- CREATE TABLE is a SQL statement used to create a new table.
- table3 is the name of the new table being created.
- The table has four columns:
- col1: An integer column.
- col2: An integer column.
- col3: A character column with a fixed length of 5 characters.
- col4: An integer column.
- PARTITION BY RANGE COLUMNS(col1, col2, col3): Specifies that the table will be partitioned by range based on the values of columns "col1", "col2", and "col3".
- The partitions are defined as follows:
- PARTITION p0 VALUES LESS THAN (50, 100, 'aaaaa'): Partition "p0" contains rows with values of columns "col1" less than 50, values of column "col2" less than 100, and values of column "col3" less than 'aaaaa'.
- PARTITION p1 VALUES LESS THAN (100, 200, 'bbbbb'): Partition "p1" contains rows with values of columns "col1" less than 100, values of column "col2" less than 200, and values of column "col3" less than 'bbbbb'.
- PARTITION p2 VALUES LESS THAN (150, 300, 'ccccc'): Partition "p2" contains rows with values of columns "col1" less than 150, values of column "col2" less than 300, and values of column "col3" less than 'ccccc'.
- PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE, MAXVALUE): Partition "p3" contains rows with values of columns "col1", "col2", and "col3" less than the maximum possible values for each column.
- Range-based partitioning allows data to be distributed across partitions based on specified ranges of values of multiple columns. In this case, the partitions are defined based on combinations of values of columns "col1", "col2", and "col3".
LIST COLUMNS partitioning
LIST COLUMNS accepts a list of one or more columns as partition keys.You can use various columns of data of types other than integer types as partitioning columns. You can use string types, DATE, and DATETIME columns
In a company there are agents in 3 cities, for sales and marketing purposes. We have organized the agents in 3 cities as shown in the following table :
City | Agent ID |
---|---|
A | A1, A2, A3 |
B | B1, B2, B3 |
C | C1, C2, C3, C4, C5 |
Let create a table with LIST COLUMNS partitioning based on the above information :
-- Create a new table named "salemast"
CREATE TABLE salemast (
-- Define a column named "agent_id" of data type VARCHAR(15)
agent_id VARCHAR(15),
-- Define a column named "agent_name" of data type VARCHAR(50)
agent_name VARCHAR(50),
-- Define a column named "agent_address" of data type VARCHAR(100)
agent_address VARCHAR(100),
-- Define a column named "city_code" of data type VARCHAR(10)
city_code VARCHAR(10)
)
-- Partition the table by list columns (agent_id)
PARTITION BY LIST COLUMNS(agent_id) (
-- Define partition pcity_a for specific values of agent_id ('A1', 'A2', 'A3')
PARTITION pcity_a VALUES IN('A1', 'A2', 'A3'),
-- Define partition pcity_b for specific values of agent_id ('B1', 'B2', 'B3')
PARTITION pcity_b VALUES IN('B1', 'B2', 'B3'),
-- Define partition pcity_c for specific values of agent_id ('C1', 'C2', 'C3', 'C4', 'C5')
PARTITION pcity_c VALUES IN ('C1', 'C2', 'C3', 'C4', 'C5')
);
Explanation:
- This SQL statement creates a new table named "salemast" with four columns and defines list-based partitioning based on the values of the "agent_id" column.
- CREATE TABLE is a SQL statement used to create a new table.
- salemast is the name of the new table being created.
- The table has four columns:
- agent_id: A variable character column storing agent IDs.
- agent_name: A variable character column storing agent names.
- agent_address: A variable character column storing agent addresses.
- city_code: A variable character column storing city codes.
- PARTITION BY LIST COLUMNS(agent_id): Specifies that the table will be partitioned by list based on the values of the "agent_id" column.
- The partitions are defined as follows:
- PARTITION pcity_a VALUES IN('A1', 'A2', 'A3'): Partition "pcity_a" contains rows with agent IDs 'A1', 'A2', or 'A3'.
- PARTITION pcity_b VALUES IN('B1', 'B2', 'B3'): Partition "pcity_b" contains rows with agent IDs 'B1', 'B2', or 'B3'.
- PARTITION pcity_c VALUES IN ('C1', 'C2', 'C3', 'C4', 'C5'): Partition "pcity_c" contains rows with agent IDs 'C1', 'C2', 'C3', 'C4', or 'C5'.
- List-based partitioning allows data to be distributed across partitions based on specified lists of discrete values of a column. In this case, the partitions are defined based on specific values of the "agent_id" column.
You can use DATE and DATETIME columns in LIST COLUMNS partitioning, see the following example :
-- Create a new table named "sale_master"
CREATE TABLE sale_master (
-- Define a column named "bill_no" of data type INT which cannot be NULL
bill_no INT NOT NULL,
-- Define a column named "bill_date" of data type DATE
bill_date DATE,
-- Define a column named "cust_code" of data type VARCHAR(15) which cannot be NULL
cust_code VARCHAR(15) NOT NULL,
-- Define a column named "amount" of data type DECIMAL(8,2) which cannot be NULL
amount DECIMAL(8,2) NOT NULL
)
-- Partition the table by range columns (bill_date)
PARTITION BY RANGE COLUMNS (bill_date)(
-- Define partition p_qtr1 for values less than '2013-04-01'
PARTITION p_qtr1 VALUES LESS THAN ('2013-04-01'),
-- Define partition p_qtr2 for values less than '2013-07-01'
PARTITION p_qtr2 VALUES LESS THAN ('2013-07-01'),
-- Define partition p_qtr3 for values less than '2013-10-01'
PARTITION p_qtr3 VALUES LESS THAN ('2013-10-01'),
-- Define partition p_qtr4 for values less than '2014-01-01'
PARTITION p_qtr4 VALUES LESS THAN ('2014-01-01')
);
Explanation:
- This SQL statement creates a new table named "sale_master" with four columns and defines range-based partitioning based on the values of the "bill_date" column.
- CREATE TABLE is a SQL statement used to create a new table.
- sale_master is the name of the new table being created.
- The table has four columns:
- bill_no: An integer column storing bill numbers, which cannot be NULL.
- bill_date: A date column storing bill dates.
- cust_code: A variable character column storing customer codes, which cannot be NULL.
- amount: A decimal column storing amounts, which cannot be NULL.
- PARTITION BY RANGE COLUMNS (bill_date): Specifies that the table will be partitioned by range based on the values of the "bill_date" column.
- The partitions are defined as follows:
- PARTITION p_qtr1 VALUES LESS THAN ('2013-04-01'): Partition "p_qtr1" contains rows with bill dates before '2013-04-01'.
- PARTITION p_qtr2 VALUES LESS THAN ('2013-07-01'): Partition "p_qtr2" contains rows with bill dates before '2013-07-01'.
- PARTITION p_qtr3 VALUES LESS THAN ('2013-10-01'): Partition "p_qtr3" contains rows with bill dates before '2013-10-01'.
- PARTITION p_qtr4 VALUES LESS THAN ('2014-01-01'): Partition "p_qtr4" contains rows with bill dates before '2014-01-01'.
- Range-based partitioning allows data to be distributed across partitions based on specified ranges of values of a column. In this case, the partitions are defined based on ranges of values of the "bill_date" column.
MySQL HASH Partitioning
MySQL HASH partition is used to distribute data among a predefined number of partitions on a column value or expression based on a column value. This is done by using PARTITION BY HASH(expr) clause, adding in CREATE TABLE STATEMENT. In PARTITIONS num clause, num is a positive integer represents the number of partitions of the table. The following statement creates a table that uses hashing on the studetn_id column and is divided into 4 partitions :
-- Create a new table named "student"
CREATE TABLE student (
-- Define a column named "student_id" of data type INT which cannot be NULL
student_id INT NOT NULL,
-- Define a column named "class" of data type VARCHAR(8)
class VARCHAR(8),
-- Define a column named "name" of data type VARCHAR(40)
name VARCHAR(40),
-- Define a column named "date_of_admission" of data type DATE which cannot be NULL, with a default value of '2000-01-01'
date_of_admission DATE NOT NULL DEFAULT '2000-01-01'
)
-- Partition the table by hash partitioning on the column "student_id" with 4 partitions
PARTITION BY HASH(student_id)
PARTITIONS 4;
Explanation:
- This SQL statement creates a new table named "student" with four columns and defines hash partitioning based on the values of the "student_id" column.
- CREATE TABLE is a SQL statement used to create a new table.
- student is the name of the new table being created.
- The table has four columns:
- student_id: An integer column storing student IDs, which cannot be NULL.
- class: A variable character column storing class information.
- name: A variable character column storing student names.
- date_of_admission: A date column storing dates of admission, which cannot be NULL. It has a default value of '2000-01-01'.
- PARTITION BY HASH(student_id): Specifies that the table will be partitioned by hash based on the values of the "student_id" column.
- PARTITIONS 4: Specifies that the table will have 4 partitions for hash partitioning.
- Hash partitioning distributes data evenly across partitions based on a hash function applied to a specified column. In this case, the partitions are created based on the hash of the values in the "student_id" column.
It is also possible to make a partition based on the year in which a student was admitted. See the following statement :
-- Create a new table named "student"
CREATE TABLE student (
-- Define a column named "student_id" of data type INT which cannot be NULL
student_id INT NOT NULL,
-- Define a column named "class" of data type VARCHAR(8)
class VARCHAR(8),
-- Define a column named "class" of data type VARCHAR(8) (duplicate column name, will cause an error)
class VARCHAR(8),
-- Define a column named "name" of data type VARCHAR(40)
name VARCHAR(40),
-- Define a column named "date_of_admission" of data type DATE which cannot be NULL, with a default value of '2000-01-01'
date_of_admission DATE NOT NULL DEFAULT '2000-01-01'
)
-- Partition the table by hash partitioning based on the year of "date_of_admission" column, with 4 partitions
PARTITION BY HASH(YEAR(date_of_admission))
PARTITIONS 4;
Explanation:
- This SQL statement attempts to create a new table named "student" with five columns, but it contains an error due to the duplicate column name "class".
- The "class" column is defined twice, which is not allowed in SQL. The second definition of "class" column will result in an error.
- The "class" column is mistakenly defined twice in the table definition, which is not valid SQL syntax.
- The rest of the SQL statement attempts to partition the table by hash partitioning based on the year extracted from the "date_of_admission" column, with 4 partitions.
MySQL KEY Partitioning
MySQL KEY partition is a special form of HASH partition, where the hashing function for key partitioning is supplied by the MySQL server. The server employs its own internal hashing function which is based on the same algorithm as PASSWORD(). This is done by using PARTITION BY KEY, adding in CREATE TABLE STATEMENT. In KEY partitioning KEY takes only a list of zero or more column names. Any columns used as the partitioning key must comprise part or all of the table's primary key if the table has one. If there is a primary key in a table, it is used as partitioning key when no column is specified as the partitioning key. Here is an example :
-- Create a new table named "table1"
CREATE TABLE table1 (
-- Define a column named "id" of data type INT which cannot be NULL and is set as the primary key
id INT NOT NULL PRIMARY KEY,
-- Define a column named "fname" of data type VARCHAR(25)
fname VARCHAR(25),
-- Define a column named "lname" of data type VARCHAR(25)
lname VARCHAR(25)
)
-- Partition the table by key partitioning without specifying a key column, with 2 partitions
PARTITION BY KEY()
PARTITIONS 2;
Explanation:
- This SQL statement creates a new table named "table1" with three columns.
- CREATE TABLE is a SQL statement used to create a new table.
- table1 is the name of the new table being created.
- The table has three columns:
- id: An integer column storing IDs, which cannot be NULL and is set as the primary key.
- fname: A variable character column storing first names.
- lname: A variable character column storing last names.
- PARTITION BY KEY(): Specifies that the table will be partitioned by key partitioning without specifying a key column.
- PARTITIONS 2: Specifies that the table will have 2 partitions for key partitioning.
- Key partitioning distributes data across partitions based on a hash function applied to the values of a specified key column. In this case, the key column is not specified, so the behavior of the key partitioning is based on an unspecified key column.
If there is no primary key but there is a unique key in a table, then the unique key is used for the partitioning key :
-- Create a new table named "table2"
CREATE TABLE table2 (
-- Define a column named "id" of data type INT which cannot be NULL
id INT NOT NULL,
-- Define a column named "fname" of data type VARCHAR(25)
fname VARCHAR(25),
-- Define a column named "lname" of data type VARCHAR(25)
lname VARCHAR(25),
-- Define a unique key constraint on the "id" column
UNIQUE KEY (id)
)
-- Partition the table by key partitioning without specifying a key column, with 2 partitions
PARTITION BY KEY()
PARTITIONS 2;
Explanation:
- This SQL statement creates a new table named "table2" with four columns.
- CREATE TABLE is a SQL statement used to create a new table.
- table2 is the name of the new table being created.
- The table has three columns:
- id: An integer column storing IDs, which cannot be NULL.
- fname: A variable character column storing first names.
- lname: A variable character column storing last names.
- A unique key constraint is defined on the "id" column, ensuring that each value in the "id" column is unique across the table.
- PARTITION BY KEY(): Specifies that the table will be partitioned by key partitioning without specifying a key column.
- PARTITIONS 2: Specifies that the table will have 2 partitions for key partitioning.
- Key partitioning distributes data across partitions based on a hash function applied to the values of a specified key column. In this case, the key column is not specified, so the behavior of the key partitioning is based on an unspecified key column.
MySQL Subpartitioning
Subpartitioning is a method to divide each partition further in a partitioned table. See the following CREATE TABLE statement :
-- Create a new table named "table10"
CREATE TABLE table10 (
-- Define a column named "BILL_NO" of data type INT
BILL_NO INT,
-- Define a column named "sale_date" of data type DATE
sale_date DATE,
-- Define a column named "cust_code" of data type VARCHAR(15)
cust_code VARCHAR(15),
-- Define a column named "AMOUNT" of data type DECIMAL(8,2)
AMOUNT DECIMAL(8,2)
)
-- Partition the table by range partitioning based on the year of "sale_date" column
PARTITION BY RANGE(YEAR(sale_date))
-- Subpartition the partitions by hash partitioning based on the number of days since the epoch for "sale_date"
SUBPARTITION BY HASH(TO_DAYS(sale_date))
-- Define 4 subpartitions for each range partition
SUBPARTITIONS 4 (
-- Define partition "p0" for values less than the year 1990
PARTITION p0 VALUES LESS THAN (1990),
-- Define partition "p1" for values less than the year 2000
PARTITION p1 VALUES LESS THAN (2000),
-- Define partition "p2" for values less than the year 2010
PARTITION p2 VALUES LESS THAN (2010),
-- Define partition "p3" for values less than the maximum value
PARTITION p3 VALUES LESS THAN MAXVALUE
);
Explanation:
- This SQL statement creates a new table named "table10" with four columns.
- CREATE TABLE is a SQL statement used to create a new table.
- table10 is the name of the new table being created.
- The table has four columns:
- BILL_NO: An integer column storing bill numbers.
- sale_date: A date column storing sale dates.
- cust_code: A variable character column storing customer codes.
- AMOUNT: A decimal column storing sale amounts.
- PARTITION BY RANGE(YEAR(sale_date)): Specifies that the table will be partitioned by range partitioning based on the year of the "sale_date" column.
- SUBPARTITION BY HASH(TO_DAYS(sale_date)): Specifies that each partition will be further subpartitioned by hash partitioning based on the number of days since the epoch for the "sale_date" column.
- SUBPARTITIONS 4: Specifies that each partition will have 4 subpartitions.
- The table is divided into four partitions:
- p0: Contains rows with sale dates before the year 1990.
- p1: Contains rows with sale dates before the year 2000.
- p2: Contains rows with sale dates before the year 2010.
- p3: Contains rows with sale dates after the year 2010.
In the above statement -
- The table has 4 RANGE partitions.
- Each of these partitions—p0, p1, p2 and p3—is further divided into 4 subpartitions.
- Therefore the entire table is divided into 4 * 4 = 16 partitions.
Here is the partition status of table10 :
mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='stable'; +----------------+------------+ | PARTITION_NAME | TABLE_ROWS | +----------------+------------+ | p0 | 0 | | p0 | 0 | | p0 | 0 | | p0 | 0 | | p1 | 0 | | p1 | 0 | | p1 | 0 | | p1 | 0 | | p2 | 0 | | p2 | 0 | | p2 | 0 | | p2 | 0 | | p3 | 0 | | p3 | 0 | | p3 | 0 | | p3 | 0 | +----------------+------------+ 16 rows in set (0.16 sec)
Previous: MySQL ALTER TABLE
Next: MySQL INSERT
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics