You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
100 lines
2.4 KiB
100 lines
2.4 KiB
---
|
|
title: "Introduction to {ulid}"
|
|
output: rmarkdown::html_vignette
|
|
vignette: >
|
|
%\VignetteIndexEntry{Introduction to {ulid}}
|
|
%\VignetteEngine{knitr::rmarkdown}
|
|
%\VignetteEncoding{UTF-8}
|
|
---
|
|
|
|
```{r, include = FALSE}
|
|
knitr::opts_chunk$set(
|
|
collapse = TRUE,
|
|
comment = "#>"
|
|
)
|
|
```
|
|
|
|
## UUID : Universally Unique Lexicographically Sortable Identifiers
|
|
|
|
UUID can be suboptimal for many uses-cases because:
|
|
|
|
- It isn't the most character efficient way of encoding 128 bits of randomness
|
|
- UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
|
|
- UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
|
|
- UUID v4 provides no other information than randomness which can cause fragmentation in many data structures
|
|
|
|
Instead, herein is proposed ULID:
|
|
|
|
```javascript
|
|
ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
|
|
```
|
|
|
|
- 128-bit compatibility with UUID
|
|
- 1.21e+24 unique ULIDs per millisecond
|
|
- Lexicographically sortable!
|
|
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
|
|
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
|
|
- Case insensitive
|
|
- No special characters (URL safe)
|
|
- Monotonic sort order (correctly detects and handles the same millisecond)
|
|
|
|
```
|
|
01AN4Z07BY 79KA1307SR9X4MV3
|
|
|
|
|----------| |----------------|
|
|
Timestamp Randomness
|
|
48bits 80bits
|
|
```
|
|
|
|
### Components
|
|
|
|
**Timestamp**
|
|
- 48 bit integer
|
|
- UNIX-time in milliseconds
|
|
- Won't run out of space till the year 10889 AD.
|
|
|
|
**Randomness**
|
|
- 80 bits
|
|
- Cryptographically secure source of randomness, if possible
|
|
|
|
### Sorting
|
|
|
|
The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.
|
|
|
|
## What's Inside The Tin
|
|
|
|
The following functions are implemented:
|
|
|
|
- `ULIDgenerate` / `generate` / `ulid_generate`: Generate a time-based ULID
|
|
- `ts_generate`: Generate ULID from timestamps
|
|
- `unmarshal`: Unmarshal a ULID into a data frame with timestamp and random bitstring columns
|
|
|
|
```{r setup}
|
|
library(ulid)
|
|
```
|
|
|
|
### One
|
|
|
|
```{r}
|
|
ulid::ULIDgenerate()
|
|
```
|
|
|
|
### Many
|
|
|
|
```{r}
|
|
(u <- ulid::ULIDgenerate(20))
|
|
```
|
|
|
|
### Unmarshal
|
|
|
|
```{r}
|
|
unmarshal(u)
|
|
```
|
|
|
|
### Use defined timestamps
|
|
|
|
```{r}
|
|
(ut <- ts_generate(as.POSIXct("2017-11-01 15:00:00", origin="1970-01-01")))
|
|
|
|
unmarshal(ut)
|
|
```
|
|
|