Universally Unique Identifier (UUID) Analysis
Overview
UUIDs are 128-bit identifiers standardized in RFC 9562 (May 2024), which obsoletes the previous RFC 4122. The latest specification introduces three new versions (v6, v7, v8) while maintaining backward compatibility with existing versions.
URI Safety
✅ Fully URI-Safe
UUIDs are inherently safe for use in URIs without any encoding required.
Standard format:
550e8400-e29b-41d4-a716-446655440000
Characteristics:
- 36 characters: 32 hexadecimal digits + 4 hyphens
- Character set: a-f, 0-9, and - (all characters are in the RFC 3986 §2.3 unreserved set)
- Case-insensitive on input (RFC 9562 specifies lowercase for generated output)
Usage in URIs:
/api/users/550e8400-e29b-41d4-a716-446655440000
?id=550e8400-e29b-41d4-a716-446655440000
urn:uuid:550e8400-e29b-41d4-a716-446655440000
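As a quick check of the claim above, the following minimal sketch runs a canonical UUID through Go's standard `net/url` escaping functions and confirms that nothing changes. The helper name `isURISafe` is illustrative and not part of any library.

```go
package main

import (
	"fmt"
	"net/url"
)

// isURISafe reports whether s survives path and query escaping unchanged,
// meaning it needs no percent-encoding in either position.
func isURISafe(s string) bool {
	return url.PathEscape(s) == s && url.QueryEscape(s) == s
}

func main() {
	id := "550e8400-e29b-41d4-a716-446655440000"
	fmt.Println(isURISafe(id))        // canonical UUID: no escaping needed
	fmt.Println(isURISafe("a/b c+d")) // contrast: reserved characters get escaped
}
```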
Alternative encodings:
- Base64 URL-safe: 22 characters without padding (an optimization, not a requirement)
- Base62: similar length; purely alphanumeric, so it avoids the + and / of standard Base64
- These are for compactness, not safety
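The 22-character figure can be reproduced with the standard library alone. This is a sketch under the assumption that the input is a canonical 36-character string; `compactUUID` is a hypothetical helper name, and validation is deliberately minimal.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"strings"
)

// compactUUID re-encodes a canonical UUID string as unpadded URL-safe
// Base64. Sixteen bytes always encode to 22 characters.
func compactUUID(s string) (string, error) {
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return "", err
	}
	if len(raw) != 16 {
		return "", fmt.Errorf("expected 16 bytes, got %d", len(raw))
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

func main() {
	short, err := compactUUID("550e8400-e29b-41d4-a716-446655440000")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(short), short)
}
```

The compact form saves 14 characters per identifier but is case-sensitive and no longer recognizable as a UUID, which is why it is an optional optimization.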
Database Storage and Performance
Storage Size
Binary format:
- 16 bytes (128 bits) - canonical storage format
- Defined in RFC 9562
String format:
- 36 characters (CHAR(36))
- Actual storage: 36-40 bytes depending on database encoding
Storage comparison:
| Format | Size | Overhead |
|---|---|---|
| Binary (BINARY(16)) | 16 bytes | baseline |
| String (CHAR(36)) | 36 bytes | 2.25× |
| String (VARCHAR(36)) | 38-40 bytes | ~2.5× |
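To make the size difference concrete, the sketch below converts between the 36-character text form and the 16-byte binary form using only the standard library. The names `toBinary` and `toText` are illustrative; a production parser would also verify the hyphen positions.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// toBinary converts the 36-character text form into 16 raw bytes.
// It checks length and hex content only, not hyphen placement.
func toBinary(s string) ([16]byte, error) {
	var out [16]byte
	if len(s) != 36 {
		return out, fmt.Errorf("want 36 characters, got %d", len(s))
	}
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return out, err
	}
	if len(raw) != 16 {
		return out, fmt.Errorf("want 16 bytes, got %d", len(raw))
	}
	copy(out[:], raw)
	return out, nil
}

// toText formats 16 bytes as the canonical lowercase 8-4-4-4-12 string.
func toText(b [16]byte) string {
	h := hex.EncodeToString(b[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

func main() {
	text := "550e8400-e29b-41d4-a716-446655440000"
	bin, err := toBinary(text)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(text), len(bin)) // text length vs binary length in bytes
	fmt.Println(toText(bin) == text) // the conversion is lossless
}
```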
Database-Specific Implementations
PostgreSQL:
-- Use native UUID type (16 bytes internally)
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid()
);
-- PostgreSQL 18+ provides a built-in UUIDv7 generator
CREATE TABLE posts (
    id UUID PRIMARY KEY DEFAULT uuidv7()
);
Performance impact:
- Native UUID type: 16 bytes
- Text storage: Tables 54% larger, indexes 85% larger
MySQL:
-- Use BINARY(16) with conversion functions
-- (expression defaults require MySQL 8.0.13+; UUID() returns a version 1 value)
CREATE TABLE users (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID()))
);
-- Retrieve with conversion
SELECT BIN_TO_UUID(id) AS id FROM users;
SQL Server:
CREATE TABLE users (
    id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWSEQUENTIALID()
);
- Note: NEWSEQUENTIALID() generates sequential values, unlike NEWID(), which generates random ones
Index Performance
The UUID v4 Problem
Random insertion issues:
- Page splits: New UUIDs insert at arbitrary positions in B-tree
- Fragmentation: Index becomes scattered across non-contiguous pages
- Wasted space: Page splits leave gaps throughout index
- Cache inefficiency: Poor locality leads to more cache misses
- Write amplification: More disk I/O per insert
Measured impact:
- Constant page splits during INSERT operations
- Index bloat (more pages for same data)
- 2-5× slower than sequential IDs
- Degraded SELECT performance
The UUID v7 Solution
Sequential insertion benefits:
- Append-only writes: New entries go to end of index
- Minimal page splits: Only last page splits when full
- Low fragmentation: Index remains mostly contiguous
- Better caching: Sequential access patterns
- Reduced I/O: Fewer disk operations
Measured improvements:
- 2-5× faster insert performance vs v4
- 50% reduction in Write-Ahead Log (WAL) rate
- Page-split behavior comparable to auto-increment keys
- Better storage efficiency
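The append-only behavior can be illustrated without a database. The sketch below is a simplified simulation, not a benchmark: it counts how many keys, in arrival order, sort after every key seen so far, which is a rough stand-in for "lands on the right-most B-tree leaf". The key layouts only imitate v7 and v4 (version and variant bits are omitted), and all names are illustrative.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// appendCount returns how many keys, processed in arrival order, are greater
// than every earlier key, i.e. would be appended at the end of a sorted index.
func appendCount(keys [][16]byte) int {
	var highest [16]byte
	n := 0
	for i, k := range keys {
		if i == 0 || bytes.Compare(k[:], highest[:]) > 0 {
			highest = k
			n++
		}
	}
	return n
}

// makeKeys builds n random 16-byte keys. With ordered=true the first 6 bytes
// are overwritten with an increasing big-endian counter that plays the role
// of the v7 millisecond timestamp.
func makeKeys(n int, ordered bool, rng *rand.Rand) [][16]byte {
	keys := make([][16]byte, n)
	for i := range keys {
		rng.Read(keys[i][:])
		if ordered {
			var ts [8]byte
			binary.BigEndian.PutUint64(ts[:], uint64(i+1))
			copy(keys[i][:6], ts[2:]) // low 48 bits of the counter
		}
	}
	return keys
}

// simulate returns the append counts for time-ordered and random keys.
func simulate(n int) (ordered, random int) {
	rng := rand.New(rand.NewSource(1)) // fixed seed for repeatable runs
	return appendCount(makeKeys(n, true, rng)), appendCount(makeKeys(n, false, rng))
}

func main() {
	ordered, random := simulate(10000)
	fmt.Println("time-ordered keys appended at the end:", ordered)
	fmt.Println("random keys appended at the end:      ", random)
}
```

Every time-ordered key is an append, while only a handful of random keys are; all other random keys land somewhere in the middle of the index, which is what triggers page splits.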
Binary vs String Storage
Index size comparison (PostgreSQL):
| Storage Type | Table Size | Index Size |
|---|---|---|
| Binary (UUID) | 100% (baseline) | 100% (baseline) |
| String (TEXT) | 154% | 185% |
Why binary is faster:
- Smaller indexes (fewer pages)
- Better cache utilization
- Faster comparisons (fixed-width 16-byte values instead of 36-character strings)
- Reduced I/O (less data transfer)
Generation Approach
✅ Fully Decentralized
One of UUID’s core design goals is decentralized generation without coordination. Multiple systems can generate UUIDs independently with negligible collision risk.
UUID Version Comparison
UUID v1 - Time-based + MAC Address
Structure:
Timestamp (60 bits) + Clock Sequence (14 bits) + MAC Address (48 bits)
Generation:
- Timestamp: 100-nanosecond intervals since Oct 15, 1582
- Node ID: System’s MAC address
- Clock sequence: Random value to prevent duplicates
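To show how the 60-bit timestamp is laid out, the sketch below reassembles it from the three v1 time fields and converts it to a Go `time.Time`. The function name `v1Time` is illustrative, and the sample value is believed to be the version 1 test vector from RFC 9562, Appendix A. Timestamps before 1970 are not handled.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
	"time"
)

// gregorianToUnix is the number of 100 ns intervals between
// 1582-10-15 00:00:00 UTC and 1970-01-01 00:00:00 UTC.
const gregorianToUnix = 0x01B21DD213814000

// v1Time extracts the creation time embedded in a version 1 UUID string.
func v1Time(s string) (time.Time, error) {
	b, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(b) != 16 {
		return time.Time{}, fmt.Errorf("not a UUID: %q", s)
	}
	if b[6]>>4 != 1 {
		return time.Time{}, fmt.Errorf("not a version 1 UUID: %q", s)
	}
	// v1 stores the timestamp low bits first: time_low, time_mid, time_hi.
	low := uint64(binary.BigEndian.Uint32(b[0:4]))
	mid := uint64(binary.BigEndian.Uint16(b[4:6]))
	hi := uint64(binary.BigEndian.Uint16(b[6:8]) & 0x0FFF) // drop version nibble
	ticks := hi<<48 | mid<<32 | low

	unix100ns := int64(ticks - gregorianToUnix)
	return time.Unix(unix100ns/10_000_000, (unix100ns%10_000_000)*100).UTC(), nil
}

func main() {
	t, err := v1Time("c232ab00-9414-11ec-b3c8-9f6bdeced846")
	if err != nil {
		panic(err)
	}
	fmt.Println(t) // 2022-02-22 19:22:22 +0000 UTC
}
```

Because the low 32 bits of the timestamp occupy the first bytes, two v1 UUIDs compared byte by byte do not sort by creation time.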
Pros:
- Embeds a creation timestamp (values increase over time, though not in byte-sortable order)
- Very low collision risk
- Decentralized
Cons:
- ❌ Privacy concern: Leaks the MAC address (identifies the generating machine)
- ❌ Timestamp not in sortable byte order
- ❌ Modern systems avoid for security reasons
Use case: Legacy systems only (prefer v7)
UUID v4 - Random
Structure:
122 random bits + 6 version/variant bits
Generation:
- Entirely random (cryptographically secure RNG recommended)
- No coordination needed
- No sequential ordering
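The generation steps above fit in a few lines. This is a minimal sketch using only the standard library, with the illustrative name `newV4`; in practice a maintained library is the safer choice.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newV4 fills 16 bytes from the OS cryptographic RNG, then overwrites the
// 4 version bits and 2 variant bits, leaving 122 random bits.
func newV4() ([16]byte, error) {
	var u [16]byte
	if _, err := rand.Read(u[:]); err != nil {
		return u, err
	}
	u[6] = (u[6] & 0x0F) | 0x40 // version 4 in the high nibble of byte 6
	u[8] = (u[8] & 0x3F) | 0x80 // variant bits "10" at the top of byte 8
	return u, nil
}

func main() {
	u, err := newV4()
	if err != nil {
		panic(err)
	}
	h := hex.EncodeToString(u[:])
	fmt.Println(h[:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:])
}
```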
Pros:
- ✅ Maximum privacy (no identifying information)
- ✅ Simplest to generate
- ✅ Works offline
- ✅ Truly decentralized
Cons:
- ❌ Poor database performance: Random insertion causes fragmentation
- ❌ No time information
- ❌ Uniqueness depends entirely on RNG quality (collision probability is still astronomically low)
Collision probability:
- 122 bits of entropy
- Need ~2.7 × 10¹⁸ UUIDs for 50% collision chance
- In practice: negligible
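The 2.7 × 10¹⁸ figure follows from the birthday approximation n ≈ √(2 · 2^bits · ln(1/(1−p))). The sketch below evaluates it; `idsForCollision` is an illustrative name, and the formula is an approximation that assumes uniformly random values.

```go
package main

import (
	"fmt"
	"math"
)

// idsForCollision estimates how many uniformly random IDs with the given
// number of random bits are needed to reach collision probability p.
func idsForCollision(bits, p float64) float64 {
	// -Log1p(-p) computes ln(1/(1-p)) accurately for small p.
	return math.Sqrt(2 * math.Exp2(bits) * -math.Log1p(-p))
}

func main() {
	fmt.Printf("%.2e\n", idsForCollision(122, 0.5)) // 2.71e+18
}
```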
Use cases:
- Session IDs
- One-time tokens
- Non-database identifiers
- When pure randomness is desired
UUID v6 - Reordered Time-based
Structure:
Timestamp (60 bits, big-endian) + Clock Sequence + Node ID
Generation:
- Like v1 but timestamp bytes reordered for sorting
- Keeps the v1 node field, which may hold a MAC address (privacy concern); RFC 9562 advises a random node value instead
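The reordering can be shown by converting a v1 value to its v6 equivalent. This sketch uses illustrative names (`v1ToV6`, `convert`), performs only minimal validation, and uses sample values believed to match the v1 and v6 test vectors in RFC 9562, Appendix A.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
)

// v1ToV6 rewrites the first 8 bytes of a version 1 UUID so that the 60-bit
// timestamp is stored most-significant bits first, as version 6 requires.
func v1ToV6(v1 [16]byte) [16]byte {
	// Reassemble the v1 timestamp from its low, mid, and high fields.
	low := uint64(binary.BigEndian.Uint32(v1[0:4]))
	mid := uint64(binary.BigEndian.Uint16(v1[4:6]))
	hi := uint64(binary.BigEndian.Uint16(v1[6:8]) & 0x0FFF)
	ts := hi<<48 | mid<<32 | low

	v6 := v1 // bytes 8-15 (variant, clock sequence, node) carry over unchanged
	binary.BigEndian.PutUint32(v6[0:4], uint32(ts>>28))           // top 32 bits
	binary.BigEndian.PutUint16(v6[4:6], uint16(ts>>12))           // next 16 bits
	binary.BigEndian.PutUint16(v6[6:8], 0x6000|uint16(ts&0x0FFF)) // version 6 + low 12 bits
	return v6
}

// convert parses a canonical v1 string, converts it, and formats the result.
// Validation is minimal: it checks only length and hex content.
func convert(s string) (string, error) {
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != 16 {
		return "", fmt.Errorf("not a UUID: %q", s)
	}
	var v1 [16]byte
	copy(v1[:], raw)
	v6 := v1ToV6(v1)
	h := hex.EncodeToString(v6[:])
	return h[:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:], nil
}

func main() {
	out, err := convert("c232ab00-9414-11ec-b3c8-9f6bdeced846")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // 1ec9414c-232a-6b00-b3c8-9f6bdeced846
}
```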
Pros:
- Sortable (better than v1)
- Sequential insertion performance
Cons:
- ❌ Can still leak the MAC address if the node field is not randomized
- ❌ Superseded by v7: RFC 9562 recommends v7
Use case: Sortable drop-in replacement for existing v1 deployments; otherwise v7 is better
UUID v7 - Time-ordered + Random ⭐ RECOMMENDED
Structure:
Unix Timestamp (48 bits, millisecond) + Random (74 bits)
Generation:
- Top 48 bits: Unix epoch milliseconds
- Bottom 74 bits: Random data
- No MAC address
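- Time-ordered across milliseconds; ordering within the same millisecond is guaranteed only if the generator adds one of RFC 9562's optional monotonicity methods (for example, a counter)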
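The layout above can be sketched with the standard library. The names `newV7` and `v7Millis` are illustrative. This version fills all 74 non-timestamp bits randomly and adds no same-millisecond counter, so it does not guarantee ordering within a single millisecond.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// newV7 builds a version 7 UUID from a Unix timestamp in milliseconds.
func newV7(unixMillis int64) ([16]byte, error) {
	var u [16]byte
	if _, err := rand.Read(u[6:]); err != nil { // random fill for bytes 6-15
		return u, err
	}
	ms := uint64(unixMillis)
	for i := 0; i < 6; i++ { // 48-bit timestamp, most significant byte first
		u[i] = byte(ms >> (40 - 8*i))
	}
	u[6] = (u[6] & 0x0F) | 0x70 // version 7 in the high nibble of byte 6
	u[8] = (u[8] & 0x3F) | 0x80 // variant bits "10" at the top of byte 8
	return u, nil
}

// v7Millis recovers the embedded Unix millisecond timestamp.
func v7Millis(u [16]byte) int64 {
	var ms int64
	for _, b := range u[:6] {
		ms = ms<<8 | int64(b)
	}
	return ms
}

func main() {
	u, err := newV7(time.Now().UnixMilli())
	if err != nil {
		panic(err)
	}
	h := hex.EncodeToString(u[:])
	fmt.Println(h[:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:])
	fmt.Println(time.UnixMilli(v7Millis(u)).UTC()) // creation time is recoverable
}
```

Because the timestamp occupies the leading bytes in big-endian order, comparing two values byte by byte orders them by creation millisecond.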
Pros:
- ✅ Excellent database performance: Sequential inserts
- ✅ Privacy-preserving: No MAC address
- ✅ Sortable: Natural time ordering
- ✅ Decentralized: No coordination needed
- ✅ Random component: Prevents collisions from multiple nodes
Performance measured:
- 2-5× faster inserts than v4
- 50% reduction in WAL rate
- Minimal page splits
- Better cache locality
Cons:
- ⚠️ Exposes creation timestamp (usually acceptable)
- Slightly more complex than v4
Use cases:
- Database primary keys (optimal choice)
- Distributed systems
- Event IDs with time ordering
- Modern applications (default recommendation)
Decentralization Requirements
No central service required for any version:
// Example: independent generation on two nodes (values are illustrative)
// Node A
uuid1 := uuid.Must(uuid.NewV7()) // e.g., 0191e1a6-8b2c-7890-abcd-123456789abc
// Node B (same millisecond)
uuid2 := uuid.Must(uuid.NewV7()) // e.g., 0191e1a6-8b2c-7a3c-b3f1-987654321def
How v7 avoids collisions:
- Time component: Millisecond precision provides separation
- Random component: 74 bits make same-millisecond collisions extremely unlikely
- No coordination: Each node generates independently
Collision risk (UUID v7):
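- Within the same millisecond: up to 2⁷⁴ (about 1.9 × 10²²) distinct random values
- Birthday estimate: 1 million IDs in one millisecond collide with probability of about 2.6 × 10⁻¹¹; even 1 billion IDs in one millisecond gives only about 2.6 × 10⁻⁵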
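The same-millisecond risk can be estimated with the birthday approximation p ≈ 1 − exp(−n² / (2 · 2⁷⁴)). The sketch below assumes all 74 non-timestamp bits are uniformly random, which does not hold for generators that reserve some of those bits for a counter; `sameMsCollision` is an illustrative name.

```go
package main

import (
	"fmt"
	"math"
)

// sameMsCollision approximates the probability that at least two of n
// version 7 UUIDs generated within one millisecond are identical.
func sameMsCollision(n float64) float64 {
	// -Expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
	return -math.Expm1(-n * n / (2 * math.Exp2(74)))
}

func main() {
	for _, n := range []float64{1e3, 1e6, 1e9} {
		fmt.Printf("%.0e IDs in one millisecond: p = %.2e\n", n, sameMsCollision(n))
	}
}
```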
Version Selection Guide
┌─────────────────────────────────────────────────────┐
│ Which UUID Version? │
├─────────────────────────────────────────────────────┤
│ │
│ Database Primary Key? ──YES──> UUID v7 │
│ │ │
│ NO │
│ │ │
│ Need time ordering? ──YES──> UUID v7 │
│ │ │
│ NO │
│ │ │
│ Need pure randomness? ──YES──> UUID v4 │
│ │
│ ❌ Avoid: v1 (privacy), v6 (superseded) │
└─────────────────────────────────────────────────────┘
Go Library Support
✅ Official Google UUID Library
The most widely used Go library for UUIDs is github.com/google/uuid, which can generate UUID versions 1, 3, 4, 5, 6, and 7.
Installation:
go get github.com/google/uuid
Usage examples:
package main

import (
	"fmt"
	"log"

	"github.com/google/uuid"
)

func main() {
	// Generate UUID v4 (random)
	v4 := uuid.New()
	fmt.Println(v4.String()) // e.g., 550e8400-e29b-41d4-a716-446655440000
	// Generate UUID v7 (time-ordered, recommended for databases)
	v7 := uuid.Must(uuid.NewV7())
	fmt.Println(v7.String()) // e.g., 0191e1a6-8b2c-7890-abcd-123456789abc
	// Parse existing UUID
	parsed, err := uuid.Parse("550e8400-e29b-41d4-a716-446655440000")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(parsed.Version()) // reports the version field of the parsed value
}
Modern Recommendations (2024-2025)
For new projects:
Default choice: UUID v7
- Best performance
- Decentralized generation
- No MAC address exposure (only the creation time is visible)
- Sortable
Special cases: UUID v4
- Explicit randomness needed
- Non-database contexts
- Legacy compatibility
Avoid: v1, v6
- v1: Privacy issues (MAC address)
- v6: v7 is better in every way
Recent Developments
RFC 9562 (May 2024)
- Obsoletes RFC 4122
- Introduces v6, v7, v8
- Recommends v7 over v1 and v6 where possible
PostgreSQL 18 (2025)
- Native uuidv7() function
- Mitigates B-tree fragmentation for UUID primary keys
- Built-in time-ordered UUID generation
Industry Adoption
- Buildkite: “Goodbye to sequential integers, hello UUIDv7”
- Cloud providers adding native support
- Database vendors implementing optimizations
Summary
| Aspect | UUID v4 | UUID v7 |
|---|---|---|
| Storage | 16 bytes binary | 16 bytes binary |
| Generation | Fully random | Time + random |
| Decentralized | ✅ Yes | ✅ Yes |
| Coordination | ❌ No | ❌ No |
| URI safe | ✅ Yes | ✅ Yes |
| DB inserts | ⚠️ Slow (random) | ✅ Fast (sequential) |
| Fragmentation | ⚠️ High | ✅ Low |
| Page splits | ⚠️ Frequent | ✅ Minimal |
| Sortable | ❌ No | ✅ Yes (by time) |
| Privacy | ✅ Maximum | ✅ Good |
| Best for | Tokens, session IDs | Database keys |
Key Takeaways
- Always use binary storage in databases (16 bytes vs 36-40 bytes)
- UUID v7 is the modern default for database primary keys
- UUID v4 still useful for session tokens and random IDs
- No coordination required - all versions are fully decentralized
- URI-safe by design - use directly in URLs without encoding
- RFC standardized - wide vendor support and tooling available