|
1 | 1 | # UUID_VX component
|
2 | 2 |
|
3 |
| -A Universally Unique Identifier (UUID) is a 128-bit number used to identify information uniquely in computer systems. It is often represented as a 32-character hexadecimal string divided into five groups separated by hyphens. |
| 3 | +The `UUID_VX` component in Percona Server for MySQL provides functions to work with different versions of Universally Unique Identifiers (UUIDs). It allows for: |
4 | 4 |
|
5 |
| -| Benefit | Description | |
6 |
| -|------------------------|-------------| |
7 |
| -| Global Uniqueness | UUIDs ensure that each identifier is unique across different databases and systems without needing a central authority to manage the IDs. This prevents ID conflicts when merging data from multiple sources. | |
8 |
| -| Decentralized Generation | Since UUIDs can be generated independently by different systems, there is no need for coordination. This is particularly useful in distributed environments where systems might not have constant communication with each other. | |
9 |
| -| Scalability | UUIDs support scalability in distributed databases. New records can be added without worrying about generating duplicate IDs, even when data is inserted concurrently across multiple nodes. | |
10 |
| -| Improved Data Merging | When data from various sources is combined, UUIDs prevent conflicts, making the merging process simpler and more reliable. | |
11 |
| -| Security | UUIDs, especially those generated randomly (like UUIDv4), are hard to predict, adding a layer of security when used as identifiers. | |
| 5 | +* Managing any UUID version: You can handle various UUID versions, including UUIDv1, UUIDv4, and others. |
12 | 6 |
|
13 |
| -The following table describes the UUID versions: |
| 7 | +* Generating UUIDs for specific versions: This includes creating time-based UUIDs (versions 1, 6, and 7) and random UUIDs (version 4). |
14 | 8 |
|
15 |
| -| UUID Version | Description | |
16 |
| -|--------------|-------------| |
17 |
| -| Version 1 (Time-based) | - Generated using the current time and a node identifier (usually the MAC address). <br> - Ensures uniqueness over time and across nodes. | |
18 |
| -| Version 2 (DCE Security) | - Similar to version 1 but includes additional information such as POSIX UID/GID. <br> - Often used in environments requiring enhanced security. | |
19 |
| -| Version 3 (Name-based, MD5 hash) | - Generated from a namespace identifier and a name (string). <br> - Uses the MD5 hashing algorithm to ensure the UUID is derived from the namespace and name. | |
20 |
| -| Version 4 (Random) | - Generated using random numbers. <br> - Offers high uniqueness and is easy to generate without requiring specific inputs. | |
21 |
| -| Version 5 (Name-based, SHA-1 hash) | - Similar to version 3 but uses the SHA-1 hashing algorithm. <br> - Provides a stronger hash function than MD5. | |
22 |
| -| Version 6 (Time-ordered) | - A reordered version of UUIDv1 for better indexing and storage efficiency. <br> - Combines timestamp and random or unique data. | |
23 |
| -| Version 7 (Unix Epoch Time) | - Combines a high-precision timestamp with random data. <br> - Provides unique, time-ordered UUIDs that are ideal for database indexing. | |
24 |
| -| Version 8 (Custom) | - Reserved for user-defined purposes and experimental uses. <br> - Allows custom formats and structures according to specific requirements. | |
| 9 | +* Enhancing support for UUID-based operations: This adds flexibility in how UUIDs are generated and used within the database. |
25 | 10 |
|
26 |
| -UUID version 4 (UUIDv4) generates a unique identifier using random numbers. This randomness ensures a high level of uniqueness without needing a central authority to manage IDs. However, using UUIDv4 as a primary key in a distributed database is not recommended. The random nature of UUIDv4 leads to several issues: |
| 11 | +With `UUID_VX`, you can tailor UUID generation to your application's needs. For instance, time-based UUIDs sort chronologically, which can improve indexing and query performance in distributed systems. Random UUIDs, on the other hand, are hard to predict, which makes them useful for security-sensitive applications. |
27 | 12 |
|
28 |
| -| Issue | Description | |
29 |
| -|------------------|--------------------------------------------------------------------------------------------------------------| |
30 |
| -| Inefficient Indexing | UUIDv4 does not follow any order, causing inefficient indexing. Databases struggle to keep records organized, leading to slower query performance. | |
31 |
| -| Fragmentation | The random distribution of UUIDv4 can cause data fragmentation, making database storage less efficient. | |
32 |
| -| Storage Space | UUIDs are larger (128 bits) than traditional integer keys, consuming more storage space and memory. | |
| 13 | +This component empowers developers to optimize their database operations by choosing the most appropriate UUID version for their specific scenarios. |
33 | 14 |
|
| 15 | +## Universally unique identifier (UUID) overview |
34 | 16 |
|
35 |
| -For better performance and efficiency in a distributed database, consider using UUIDv7, which incorporates timestamps for some order levels. |
| 17 | +A universally unique identifier (UUID) is a 128-bit number used to uniquely identify information in computer systems. It is commonly represented as a 32-character hexadecimal string divided into five groups separated by hyphens. |
36 | 18 |
|
37 |
| -UUID version 7 (UUIDv7) creates time-ordered identifiers by encoding a Unix timestamp with millisecond precision in the first 48 bits. It uses 6 bits to specify the UUID version and variant, while the remaining 74 bits are random. This time-ordering results in nearly sequential values, which helps improve index performance and locality in distributed systems. |
| 19 | +## Benefits of using UUIDs |
| 20 | + |
| 21 | +UUIDs offer several advantages in distributed systems: |
| 22 | + |
| 23 | +* Global uniqueness: UUIDs ensure that each identifier is unique across different databases and systems without needing a central authority. This prevents ID conflicts when merging data from multiple sources. |
| 24 | + |
| 25 | +* Decentralized generation: UUIDs can be generated independently by different systems, removing the need for coordination. This is particularly useful in distributed environments. |
| 26 | + |
| 27 | +* Scalability: UUIDs support distributed databases by allowing new records to be added without generating duplicate IDs, even with concurrent insertions. |
| 28 | + |
| 29 | +* Improved data merging: UUIDs prevent conflicts when combining data from different sources, simplifying the merging process. |
| 30 | + |
| 31 | +* Security: Randomly generated UUIDs (like UUIDv4) are hard to predict, adding an extra security layer when used as identifiers. |
| 32 | + |
| 33 | +## UUID versions |
| 34 | + |
| 35 | +The table below describes the different UUID versions and their characteristics: |
| 36 | + |
| 37 | +| UUID version | Description | |
| 38 | +|-------------|-------------| |
| 39 | +| Version 1 (time-based) | - Generated using the current time and a node identifier (usually a MAC address). <br> - Ensures uniqueness over time and across nodes. | |
| 40 | +| Version 2 (DCE security) | - Similar to version 1 but includes POSIX UID/GID for enhanced security. | |
| 41 | +| Version 3 (name-based, MD5 hash) | - Generated from a namespace identifier and a name (string). <br> - Uses the MD5 hashing algorithm to derive the UUID. | |
| 42 | +| Version 4 (random) | - Completely random UUIDs. <br> - Offers uniqueness without requiring specific inputs. | |
| 43 | +| Version 5 (name-based, SHA-1 hash) | - Similar to version 3 but uses SHA-1 instead of MD5 for a stronger hash function. | |
| 44 | +| Version 6 (time-ordered) | - A reordered version of UUIDv1 for better indexing and storage efficiency. | |
| 45 | +| Version 7 (Unix epoch time) | - Encodes a high-precision Unix timestamp with random data. <br> - Provides unique, time-ordered UUIDs ideal for database indexing. | |
| 46 | +| Version 8 (custom) | - Reserved for user-defined and experimental purposes. <br> - Allows custom formats and structures. | |
| 47 | + |
| 48 | +## Challenges with UUIDv4 in databases |
| 49 | + |
| 50 | +While UUIDv4 provides strong uniqueness through randomness, using it as a primary key in distributed databases is generally discouraged due to the following issues: |
| 51 | + |
| 52 | +| Issue | Description | |
| 53 | +|-------|-------------| |
| 54 | +| Inefficient indexing | UUIDv4 lacks order, making it inefficient for indexing. Databases struggle to keep records organized, leading to slower query performance. | |
| 55 | +| Fragmentation | The random distribution of UUIDv4 causes data fragmentation, reducing storage efficiency. | |
| 56 | +| Storage overhead | UUIDs (128 bits) consume more storage space than traditional integer keys, increasing memory usage. | |
| 57 | + |
| 58 | +To improve performance in distributed databases, consider using UUIDv7. UUIDv7 encodes a Unix timestamp (millisecond precision) in the first 48 bits, uses 6 bits for the UUID version (4 bits) and variant (2 bits), and fills the remaining 74 bits with random data. Because the timestamp comes first, UUIDv7 values are nearly sequential, which improves indexing and query efficiency. |
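
The bit layout described above can be sketched outside the database. The following Python snippet is only an illustration of the UUIDv7 layout, not the component's implementation; `make_uuid_v7` is a hypothetical helper name, and the split of the random bits into 12 and 62 follows the general UUIDv7 layout rather than anything stated on this page.

```python
import os
import time
import uuid


def make_uuid_v7(unix_ms=None):
    """Hypothetical helper: assemble a version 7 UUID by hand.

    Layout (most significant bit first): 48-bit Unix timestamp in
    milliseconds, 4-bit version, 12 random bits, 2-bit variant,
    62 random bits.
    """
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = rand >> 68                # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the first 48 bits
    value |= 0x7 << 76                 # version field set to 7
    value |= rand_a << 64
    value |= 0b10 << 62                # variant bits
    value |= rand_b
    return uuid.UUID(int=value)


earlier = make_uuid_v7(1_700_000_000_000)
later = make_uuid_v7(1_700_000_000_001)
print(earlier, later)
```

Because the timestamp occupies the leading bits, a value generated one millisecond later compares greater, both as a number and as a string, which is what keeps index inserts close to append-only.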
| 59 | + |
| 60 | +### Advantages of UUIDv7 |
| 61 | + |
| 62 | +The following advantages make UUIDv7 a better choice for distributed databases: |
| 63 | + |
| 64 | +* Time-ordered values improve indexing performance and data locality. |
| 65 | + |
| 66 | +* Better scalability for high-insert workloads in distributed databases. |
| 67 | + |
| 68 | +* Maintains uniqueness while allowing efficient range queries. |
38 | 69 |
|
39 | 70 | ## Install the UUID_VX component
|
40 | 71 |
|
@@ -64,15 +95,15 @@ The following functions are compatible with all UUID versions:
|
64 | 95 |
|
65 | 96 | | Function name | Argument | Description |
|
66 | 97 | |----------------------|----------|---|
|
67 |
| -| `BIN_TO_UUID_VX()` | One string argument that must be a hexadecimal of exactly 32 characters (16 bytes) | The function returns a UUID with binary data from the argument. It returns an error for all other inputs. | |
68 |
| -| `IS_MAX_UUID_VX()` | One string argument that represents a UUID in standard or hexadecimal form. | The function returns true if the argument is a valid UUID and is a MAX UUID. It returns false for all other inputs. If the argument is NULL, it returns NULL. If the argument cannot be parsed as a UUID, the function throws an error. | |
| 98 | +| `BIN_TO_UUID_VX()`   | One string argument that must be a hexadecimal string of exactly 32 characters (16 bytes) | The function returns a UUID with binary data from the argument. It returns an error for all other inputs. | |
| 99 | +| `IS_MAX_UUID_VX()` | One string argument that represents a UUID in standard or hexadecimal form. | The function returns true if the argument is a valid UUID and is a MAX UUID. It returns false for all other inputs. If the argument is NULL, it returns NULL. The function throws an error if the argument cannot be parsed as a UUID. | |
69 | 100 | | `IS_NIL_UUID_VX()` | One string argument representing a UUID in standard or hexadecimal form. | The function returns true if the string is a NIL UUID. If the argument is NULL, it returns NULL. If the argument is not a valid UUID, it throws an error. |
|
70 |
| -| `IS_UUID_VX()` | One string argument that represents a UUID in either standard or hexadecimal form. | The function returns true if the argument is a valid UUID. If the argument is NULL, it returns NULL. For any other input, it returns false. | |
| 101 | +| `IS_UUID_VX()` | One string argument representing a UUID in either standard or hexadecimal form. | The function returns true if the argument is a valid UUID. If the argument is NULL, it returns NULL. For any other input, it returns false. | |
71 | 102 | | `MAX_UUID_VX()` | No argument | This function generates a MAX UUID, which has all 128 bits set to one (FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF). This function result is the opposite of the NIL UUID. |
|
72 | 103 | | `NIL_UUID_VX()` | No argument. | This function generates a NIL UUID, which has all 128 bits set to zero (00000000-0000-0000-0000-000000000000). |
|
73 |
| -| `UUID_VX_TO_BIN()` | One string argument, formatted as a UUID or in hexadecimal form | The function converts the string arugment to its binary representation. | |
74 |
| -| `UUID_VX_VARIANT()` | One string argument that represents a UUID in either standard or hexadecimal format. | The function returns the UUID version (1-8) or an error if the argument is not a valid UUID or returns NULL if the input is NULL. | |
75 |
| -| `UUID_VX_VERSION()` | One string representing a UUID in standard or hexadecimal form. | The function returns version of UUID(1-8). The function throws an error if the argument is not a valid UUID in formatted or hexadecimal form or returns a NULL if the argument is NULL. If the argument is a valid UUID string but has an unknown value (outside of the 1-8 range) the function returns `-1`. | |
| 104 | +| `UUID_VX_TO_BIN()` | One string argument, formatted as a UUID or in hexadecimal form | The function converts the string argument to its binary representation. | |
| 105 | +| `UUID_VX_VARIANT()` | One string argument representing a UUID in either standard or hexadecimal format. | The function returns the UUID variant. It throws an error if the argument is not a valid UUID and returns NULL if the input is NULL. | |
| 106 | +| `UUID_VX_VERSION()` | One string argument representing a UUID in standard or hexadecimal form. | The function returns the UUID version (1-8). It throws an error if the argument is not a valid UUID in formatted or hexadecimal form and returns NULL if the argument is NULL. If the argument is a valid UUID string with an unknown version (outside of the 1-8 range), the function returns `-1`. | |
76 | 107 |
|
77 | 108 |
|
78 | 109 | ### Examples of functions for all UUID versions
|
@@ -127,11 +158,11 @@ The following functions generate specific UUID versions:
|
127 | 158 | | UUID Version | Argument | Description |
|
128 | 159 | |--------------|-----------|---|
|
129 | 160 | | `UUID_V1()` | No argument | Generates a version 1 UUID based on a timestamp. If possible, use UUID_V7() instead. |
|
130 |
| -| `UUID_V3()` | One or two arguments: the first argument is a string that is hashed with MD5 and used in the UUID; the second argument is optional and specifies a namespace (integer values: DNS: 0, URL: 1, OID: 2, X.500: 3; default is 1 or URL). | Generates a version 3 UUID based on a name. Note: MD5 is outdated and not secure. Use with caution and avoid exposing sensitive data. | |
| 161 | +| `UUID_V3()` | One or two arguments: the first argument is a string that is hashed with MD5 and used in the UUID; the second argument is optional and specifies a namespace (integer values: DNS: 0, URL: 1, OID: 2, X.500: 3; default is 1 or URL). | Generates a version 3 UUID based on a name. Note: MD5 is outdated and not secure. Use with caution and avoid exposing sensitive data. | |
131 | 162 | | `UUID_V4()` | No argument | The function generates a version 4 UUID using random numbers and is similar to the built-in UUID() function. |
|
132 |
| -| `UUID_V5()` | One or two arguments: the first argument is a string that is hashed with SHA1 and used in the UUID; the second argument is optional and specifies a namespace (integer values: DNS: 0, URL: 1, OID: 2, X.500: 3; default is 1 or URL).| Generates a version 5 UUID based on a name. Note: SHA1 is better than MD5 but still not secure. Use with caution and avoid exposing sensitive data. | |
| 163 | +| `UUID_V5()` | One or two arguments: the first argument is a string that is hashed with SHA1 and used in the UUID; the second argument is optional and specifies a namespace (integer values: DNS: 0, URL: 1, OID: 2, X.500: 3; default is 1 or URL).| Generates a version 5 UUID based on a name. Note: SHA1 is better than MD5 but still not secure. Use with caution and avoid exposing sensitive data. | |
133 | 164 | | `UUID_V6()` | No argument | Generates a version 6 UUID based on a timestamp. If possible, use UUID_V7() instead. |
|
134 |
| -| `UUID_V7()` | Can have either no argument or a one integer argument: the argument is the number of milliseconds to adjust the timestamp forward or backward (negative values). | Generates a version 7 UUID based on a timestamp. If there is no argument, no timestamp shift occurs. Timestamp shift can hide the actual creation time of the record. | |
| 165 | +| `UUID_V7()` | Can have either no argument or a one integer argument: the argument is the number of milliseconds to adjust the timestamp forward or backward (negative values). | Generates a version 7 UUID based on a timestamp. If there is no argument, no timestamp shift occurs. Timestamp shift can hide the actual creation time of the record. | |
135 | 166 |
|
136 | 167 | The `UUID_V3()` function and `UUID_V5()` function do not validate the string argument, such as whether the URL is formatted correctly or the DNS name is correct. These functions generate a string hash and then add that hash to a UUID with the defined namespace. The user specifies the string.
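
The same behavior can be seen with Python's standard `uuid` module, shown here only as an analogy. Whether the component's integer namespace values map to the same namespace UUIDs that Python uses is an assumption, so the results are not expected to match the component's output.

```python
import uuid

# The name is hashed as-is; nothing checks that it is a real URL or DNS name.
v3 = uuid.uuid3(uuid.NAMESPACE_URL, "not a real url")
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

# The same namespace and name always give the same UUID.
v5_again = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

print(v3.version, v5.version, v5 == v5_again)
```

Since the result is deterministic, anyone who knows the namespace and can guess the name can reproduce the UUID, which is one reason to avoid hashing sensitive data this way.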
|
137 | 168 |
|
@@ -289,9 +320,9 @@ The following functions are used only with time-based UUIDs, specifically versio
|
289 | 320 |
|
290 | 321 | | Function name | Argument | Description |
|
291 | 322 | |-----------------------------|---|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
292 |
| -| UUID_VX_TO_TIMESTAMP() | One string argument | Returns a timestamp string like “2024-05-29 18:04:14.201”. If the argument is not parsable as UUID v.1,6,7, the function throws an error. The function always uses UTC time, regardless of system settings or time zone settings in MySQL. | |
293 |
| -| UUID_VX_TO_TIMESTAMP_TZ() | One string argument | Returns a timestamp string with the time zone like “Wed May 29 18:05:07 2024 GMT”. If the argument is not parsable as UUID v.1,6,7, the function throws an error. The function always uses UTC time (GMT time zone), regardless of system settings or time zone settings in MySQL. | |
294 |
| -| UUID_VX_TO_UNIXTIME() | One string argument | Returns a number of milliseconds since the Epoch. If the argument is not parsable as UUID v.1,6,7, the function throws an error. | |
| 323 | +| UUID_VX_TO_TIMESTAMP() | One string argument | Returns a timestamp string like “2024-05-29 18:04:14.201”. The function throws an error if the argument is not parsable as UUID v.1,6,7. The function always uses UTC, regardless of MySQL's system settings or time zone settings. | |
| 324 | +| UUID_VX_TO_TIMESTAMP_TZ() | One string argument | Returns a timestamp string with the time zone like “Wed May 29 18:05:07 2024 GMT”. The function throws an error if the argument is not parsable as UUID v.1,6,7. The function always uses UTC (GMT zone), regardless of MySQL's system settings or time zone settings. | |
| 325 | +| UUID_VX_TO_UNIXTIME() | One string argument | Returns the number of milliseconds since the Epoch. The function throws an error if the argument is not parsable as UUID v.1,6,7. | |
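
As a rough illustration of what these functions compute for a version 7 UUID, the sketch below reads the timestamp from the first 48 bits in Python. The helper names are hypothetical, the code handles version 7 only (versions 1 and 6 store their timestamps differently), and it is not the component's implementation.

```python
import uuid
from datetime import datetime, timezone


def uuid_v7_unix_ms(text):
    """Hypothetical helper: milliseconds since the Epoch from a v7 UUID."""
    u = uuid.UUID(text)  # accepts hyphenated or plain 32-character hex input
    if u.version != 7:
        raise ValueError("not a version 7 UUID")
    return u.int >> 80   # the first 48 bits hold the timestamp


def uuid_v7_timestamp(text):
    """Hypothetical helper: UTC timestamp string with millisecond precision."""
    ms = uuid_v7_unix_ms(text)
    dt = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return dt.strftime("%Y-%m-%d %H:%M:%S.") + f"{ms % 1000:03d}"


# Build a sample v7 UUID with a known timestamp and no random bits set.
sample = uuid.UUID(int=(1717005854201 << 80) | (0x7 << 76) | (0b10 << 62))
print(uuid_v7_unix_ms(str(sample)), uuid_v7_timestamp(str(sample)))
```

The conversion always uses UTC, mirroring the documented behavior of the timestamp functions.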
295 | 326 |
|
296 | 327 | ### Timestamp-based function examples
|
297 | 328 |
|
@@ -349,4 +380,4 @@ mysql> UNINSTALL COMPONENT 'file://component_uuid_vx_udf';
|
349 | 380 |
|
350 | 381 | ```{.text .no-copy}
|
351 | 382 | Query OK, 0 rows affected (0.03 sec)
|
352 |
| - ``` |
| 383 | + ``` |