Universally Unique Lexicographically Sortable Identifiers in R
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 

3.1 KiB

---
output: rmarkdown::github_document
---

[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/ulid.svg?branch=master)](https://travis-ci.org/hrbrmstr/ulid)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/hrbrmstr/ulid?branch=master&svg=true)](https://ci.appveyor.com/project/hrbrmstr/ulid)
[![Coverage Status](https://codecov.io/gh/hrbrmstr/ulid/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/ulid)
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/ulid)](https://cran.r-project.org/package=ulid)

# ulid

Universally Unique Lexicographically Sortable Identifier

## Description

(grifted from <https://github.com/ulid/spec>)

UUID can be suboptimal for many uses-cases because:

- It isn't the most character efficient way of encoding 128 bits of randomness
- UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
- UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
- UUID v4 provides no other information than randomness which can cause fragmentation in many data structures

Instead, herein is proposed ULID:

```javascript
ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
```

- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable!
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles the same millisecond)

```
01AN4Z07BY 79KA1307SR9X4MV3

|----------| |----------------|
Timestamp Randomness
48bits 80bits
```

### Components

**Timestamp**
- 48 bit integer
- UNIX-time in milliseconds
- Won't run out of space till the year 10889 AD.

**Randomness**
- 80 bits
- Cryptographically secure source of randomness, if possible

### Sorting

The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.

## What's Inside The Tin

The following functions are implemented:

- `ULIDgenerate` / `generate` / `ulid_generate`: Generate a time-based ULID
- `ts_generate`: Generate ULID from timestamps
- `unmarshal`: Unmarshal a ULID into a data frame with timestamp and random bitstring columns

## Installation

```{r nstall-ex, results='asis', echo = FALSE}
hrbrpkghelpr::install_block()
```

```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```

## Usage

```{r message=FALSE, warning=FALSE, error=FALSE}
library(ulid)

# current verison
packageVersion("ulid")

```

### One

```{r}
ulid::ULIDgenerate()
```

### Many

```{r}
(u <- ulid::ULIDgenerate(20))
```

### Unmarshal

```{r}
unmarshal(u)
```

### Use defined timestamps

```{r}
(ut <- ts_generate(as.POSIXct("2017-11-01 15:00:00", origin="1970-01-01")))

unmarshal(ut)
```

## Package Code Metrics

```{r}
cloc::cloc_pkg_md()
```