# Multibases
[![Build Status](https://travis-ci.com/SleeplessByte/ruby-multibase.svg?branch=master)][shield-link-travis]
[![Gem Version](https://badge.fury.io/rb/multibases.svg)][shield-link-gem]
[![MIT license](https://img.shields.io/badge/license-MIT-brightgreen.svg)][shield-link-license]
[![Maintainability](https://api.codeclimate.com/v1/badges/1253cc22b664d27d4052/maintainability)][shield-link-codeclimate]
[shield-link-travis]: https://travis-ci.com/SleeplessByte/ruby-multibase
[shield-link-gem]: https://badge.fury.io/rb/multibases
[shield-link-license]: http://opensource.org/licenses/MIT
[shield-link-codeclimate]: https://codeclimate.com/github/SleeplessByte/ruby-multibase/maintainability
> Multibase is a protocol for disambiguating the encoding of base-encoded
> (e.g., base32, base64, base58, etc.) binary appearing in text.
`Multibases` is the ruby implementation of [multiformats/multibase][spec].
This gem can be used _both_ to encode into and decode from multibase-packed
strings, and as a _general purpose_ library for `BaseX` encoding and decoding
_without_ adding the prefix.
> 🙌🏽 This is called `multibases` instead of the singular form, to stay
> consistent with the `multihashes` gem, which was _forced_ to take a different
> name because `multihash` was already taken, which is also the case for `multibase`
> and others. In the future, this might be renamed to `multiformats-base`, with
> a backwards-compatible interface.
## Installation
Add this line to your application's Gemfile:
```Ruby
gem 'multibases'
```
or alternatively if you would like to bring your own engines and not load any
of the built-in ones:
```Ruby
gem 'multibases', require: 'multibases/bare'
```
And then execute:
$ bundle
Or install it yourself as:
$ gem install multibases
## Usage
This is a low-level library, but high-level implementations are provided.
You can also bring your own encoder/decoder. The most important methods are:
- `Multibases.encode(encoding, data, engine?)`: encodes the given data with a
built-in engine for encoding, or engine if it's given. Returns an `Encoded`
PORO that has `pack`.
- `Multibases.unpack(packed)`: decodes a multibase packed string into an
`Encoded` PORO that has `decode`.
- `Multibases::Encoded.pack`: packs the code and encoded data into a single
multibase string
- `Multibases::Encoded.decode(engine?)`: decodes the PORO's data using a
built-in engine, or engine if it's given. Returns a decoded `ByteArray`.
```ruby
encoded = Multibases.encode('base2', 'mb')
# => #<struct Multibases::Encoded
# code="0", encoding="base2", length=16,
# data=[Multibases::EncodedByteArray "0110110101100010"]>
encoded.pack
# => [Multibases::EncodedByteArray "00110110101100010"]
encoded = Multibases.unpack('766542')
# => #<struct Multibases::Encoded
# code="7", encoding="base8", length=5,
# data=[Multibases::EncodedByteArray "66542"]>
encoded.decode
# => [Multibases::DecodedByteArray "mb"]
```
This means that the flow of calls is as follows:
```text
data ➡️ (encode) ➡️ encoded data
encoded data ➡️ (pack) ➡️ multibasestr
multibasestr ➡️ (unpack) ➡️ encoded data
encoded data ➡️ (decode) ➡️ data
```
Convenience methods are provided:
- `Multibases.pack(encoding, data, engine?)`: calls `encode` and then `pack`
- `Multibases.decode(packed, engine?)`: calls `unpack` and then `decode`
```ruby
Multibases.pack('base2', 'mb')
# => [Multibases::EncodedByteArray "00110110101100010"]
```
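To make the flow above concrete without relying on the gem, here is a minimal pure-Ruby sketch of packing and decoding for two encodings. The `SketchMultibase` module and its methods are hypothetical and are not part of `multibases`; they only illustrate that a packed string is a one-character code followed by the encoded data.

```ruby
# Hypothetical sketch, not the gem's implementation: a multibase string is
# a single-character code followed by the data encoded in that base.
module SketchMultibase
  # Only two codes from the multibase table, to keep the sketch short.
  CODES = { 'base2' => '0', 'base16' => 'f' }.freeze
  # pack/unpack directives that produce bit strings and hex strings.
  DIRECTIVES = { 'base2' => 'B*', 'base16' => 'H*' }.freeze

  module_function

  # encode + pack: prefix the encoded data with the code of the encoding.
  def pack(encoding, data)
    CODES.fetch(encoding) + data.unpack1(DIRECTIVES.fetch(encoding))
  end

  # unpack + decode: find the encoding by the first character, then decode
  # the remainder of the string.
  def decode(packed)
    encoding = CODES.key(packed[0]) ||
               raise(ArgumentError, "unknown code #{packed[0].inspect}")
    [packed[1..-1]].pack(DIRECTIVES.fetch(encoding))
  end
end

SketchMultibase.pack('base2', 'mb')
# => "00110110101100010"
SketchMultibase.decode('f68656c6c6f')
# => "hello"
```

Unlike the gem, this sketch returns plain strings rather than `ByteArray` wrappers.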
### ByteArrays and encoding
As you can see, the "final" methods output a `ByteArray`. These are simple
`DelegateClass` wrappers around an array of bytes, which means that the `hex`
encoding of `hello` is not stored as the string `"f68656c6c6f"`, but as the
bytes of that string:
```ruby
packed = Multibases.pack('base16', 'hello')
# => [Multibases::EncodedByteArray "f68656c6c6f"]
packed.to_a # .__getobj__.dup
# => [102, 54, 56, 54, 53, 54, 99, 54, 99, 54, 102]
```
They override `inspect`, which _forces_ the encoding to `UTF-8` for display
only. To get a string in the correct encoding, pass that encoding to `to_s`:
> **Note**: If you're using `pry` and have not changed the printer, you
> naturally won't see the output as described above, but instead see the inner
> Array of bytes, always.
```ruby
data = 'hello'.encode('UTF-16LE')
data.encoding
# => #<Encoding:UTF-16LE>
data.bytes
# => [104, 0, 101, 0, 108, 0, 108, 0, 111, 0]
packed = Multibases.pack('base16', data)
# => [Multibases::EncodedByteArray "f680065006c006c006f00"]
decoded = Multibases.decode(packed)
# => [Multibases::DecodedByteArray "h e l l o "]
decoded.to_s('UTF-16LE')
# => "hello"
```
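The reason an explicit encoding is needed can be shown in plain Ruby, without the gem: bytes carry no encoding of their own, so the reader has to supply one. The sketch below only uses core `String` and `Array` methods. That `to_s('UTF-16LE')` does something equivalent internally is an assumption, not something this README confirms.

```ruby
# Plain Ruby, no gem required: the bytes of a UTF-16LE string.
bytes = 'hello'.encode('UTF-16LE').bytes
# => [104, 0, 101, 0, 108, 0, 108, 0, 111, 0]

# Rebuild a binary string from the bytes. At this point Ruby only knows it
# as binary (ASCII-8BIT) data.
raw = bytes.pack('C*')

# Label the bytes with the encoding they were written in, then transcode.
text = raw.force_encoding('UTF-16LE').encode('UTF-8')
# => "hello"
```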
### Implementations
You can find the _current_ multibase table [here][git-multibase-table]. At this
moment, built-in engines are provided as follows:
| encoding | code | description | implementation |
|-------------------|------|-----------------------------------|----------------|
| identity | 0x00 | 8-bit binary | `bare` |
| base1 | 1 | unary (1111) | ❌ |
| base2 | 0 | binary (0101) | `base2` 💨 |
| base8 | 7 | octal | `base_x` |
| base10 | 9 | decimal | `base_x` |
| base16 | f | hexadecimal | `base16` 💨 |
| base16upper | F | hexadecimal | `base16` 💨 |
| base32hex | v | rfc4648 no padding - highest char | `base32` ✨ |
| base32hexupper | V | rfc4648 no padding - highest char | `base32` ✨ |
| base32hexpad | t | rfc4648 with padding | `base32` ✨ |
| base32hexpadupper | T | rfc4648 with padding | `base32` ✨ |
| base32 | b | rfc4648 no padding | `base32` ✨ |
| base32upper | B | rfc4648 no padding | `base32` ✨ |
| base32pad | c | rfc4648 with padding | `base32` ✨ |
| base32padupper | C | rfc4648 with padding | `base32` ✨ |
| base32z | h | z-base-32 (used by Tahoe-LAFS) | `base32` ✨ |
| base58flickr | Z | base58 flickr | `base_x` |
| base58btc | z | base58 bitcoin | `base_x` |
| base64 | m | rfc4648 no padding | `base64` 💨 |
| base64pad | M | rfc4648 with padding - MIME enc | `base64` 💨 |
| base64url | u | rfc4648 no padding | `base64` 💨 |
| base64urlpad | U | rfc4648 with padding | `base64` 💨 |
Those with a 💨 are marked because they are backed by a C implementation (using
`pack` and `unpack`) and are therefore supposed to be fast. Those with
a ✨ are marked because they have a custom implementation instead of the generic
`base_x` implementation, which should make them faster than `base_x`.
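As an illustration of the `pack` and `unpack` approach mentioned above, the following plain-Ruby sketch shows the core directives that produce base2, base16 and base64 output. Which directives the gem actually uses is an assumption here; the calls themselves are standard Ruby.

```ruby
# Plain Ruby, no gem required. String#unpack1 and Array#pack are implemented
# in C inside the interpreter.
bits   = 'mb'.unpack1('B*')      # base2:  "0110110101100010"
hex    = 'hello'.unpack1('H*')   # base16: "68656c6c6f"
base64 = ['hello'].pack('m0')    # base64 with padding, no line feeds: "aGVsbG8="

# The reverse direction swaps the two methods.
[bits].pack('B*')       # => "mb"
[hex].pack('H*')        # => "hello"
base64.unpack1('m0')    # => "hello"
```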
The version of the spec that this repository was last updated for is available
via `Multibases.multibase_version`:
```ruby
Multibases.multibase_version
# => "1.0.0"
```
### Bring your own engine
The methods of `multibases` allow you to bring your own engine, and you can save
additional memory by only loading `multibases/bare`.
```ruby
# Note: This is not how multibase was meant to work. It's supposed to only
# convert the input from one base to another, and denote what that base
# is, stored in the output. However, the system is _so_ flexible that this
# works perfectly for any reversible transformation!
class EngineKlazz
  def initialize(*_)
  end

  def encode(plain)
    plain = plain.bytes unless plain.is_a?(Array)
    Multibases::EncodedByteArray.new(plain.reverse)
  end

  def decode(encoded)
    encoded = encoded.bytes unless encoded.is_a?(Array)
    Multibases::DecodedByteArray.new(encoded.reverse)
  end
end
Multibases.implement 'reverse', 'r', EngineKlazz, 'alphabet'
# => Initializes EngineKlazz with 'alphabet'
Multibases.pack('reverse', 'md')
# => [Multibases::EncodedByteArray "rdm"]
Multibases.decode('rdm')
# => [Multibases::DecodedByteArray "md"]
# Alternatively, you can pass the instantiated engine to the appropriate
# function.
engine = EngineKlazz.new
# Mark the encoding as "existing" and attach a code
Multibases.implement 'reverse', 'r'
# Pack, using a custom engine
Multibases.pack('reverse', 'md', engine)
# => [Multibases::EncodedByteArray "rdm"]
Multibases.decode('rdm', engine)
# => [Multibases::DecodedByteArray "md"]
```
### Using the built-in encoders/decoders
You can use the built-in encoders and decoders.
```ruby
require 'multibases/base16'
Multibases::Base16.encode('foobar')
# => [Multibases::EncodedByteArray "666f6f626172"]
Multibases::Base16.decode('666f6f626172')
# => [Multibases::DecodedByteArray "foobar"]
```
These don't add the `multibase` prefix to the output and they use the canonical
`encode` and `decode` nomenclature.
The `base_x` / `BaseX` encoder does not have a module function. You _must_
instantiate it first. The result is an encoder that uses the base alphabet to
determine its base. Currently padding is ❌ not supported for `BaseX`, but
might be in a future update using a second argument or key.
```ruby
require 'multibases/base_x'
Base3 = Multibases::BaseX.new('012')
# => [Multibases::Base3 alphabet="012" strict]
Base3.encode('foobar')
# => [Multibases::EncodedByteArray "112202210012121110020020001100"]
```
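How an alphabet alone can determine the base is easiest to see in code. The class below is a hypothetical sketch of the usual big-integer approach (treat the input as one large number and repeatedly divide by the alphabet length). It is not the gem's `BaseX` code, it does no input validation, and its handling of leading zero bytes is an assumption.

```ruby
# Hypothetical sketch of alphabet-driven conversion; not the gem's BaseX.
class SketchBaseX
  def initialize(alphabet)
    @alphabet = alphabet.chars
    @base = @alphabet.length
  end

  def encode(plain)
    bytes = plain.bytes
    # Interpret the bytes as one big-endian integer.
    number = bytes.reduce(0) { |acc, byte| (acc << 8) | byte }
    digits = []
    while number.positive?
      number, remainder = number.divmod(@base)
      digits.unshift(@alphabet[remainder])
    end
    # Leading zero bytes vanish in the integer, so emit them explicitly.
    leading = bytes.take_while(&:zero?).length
    (@alphabet.first * leading) + digits.join
  end

  def decode(encoded)
    chars = encoded.chars
    # Rebuild the integer digit by digit, then split it back into bytes.
    number = chars.reduce(0) { |acc, char| acc * @base + @alphabet.index(char) }
    bytes = []
    while number.positive?
      number, byte = number.divmod(256)
      bytes.unshift(byte)
    end
    leading = chars.take_while { |char| char == @alphabet.first }.length
    (([0] * leading) + bytes).pack('C*')
  end
end

base8 = SketchBaseX.new('01234567')
base8.encode('mb')
# => "66542"
base8.decode('66542')
# => "mb"
```

The `"66542"` result matches the `base8` data shown in the `unpack` example under Usage.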
You can use the same technique to inject a custom alphabet. This can be used on
the built-in encoders, even the ones that are not `BaseX`:
```ruby
base = Multibases::Base2.new('.!')
# => [Multibases::Base2 alphabet=".!"]
base.encode('foo')
# => [Multibases::EncodedByteArray ".!!..!!..!!.!!!!.!!.!!!!"]
base.decode('.!!...!..!!....!.!!!..!.')
# => [Multibases::DecodedByteArray "bar"]
```
All the built-in encoder/decoders take strings, arrays or byte-arrays as input.
```ruby
expected = Multibases::Base16.encode("abc")
# => [Multibases::EncodedByteArray "616263"]
expected == Multibases::Base16.encode([97, 98, 99])
# => true
expected == Multibases::Base16.encode(Multibases::ByteArray.new("abc".bytes))
# => true
```
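Accepting all three input types comes down to normalising the input to an array of bytes before encoding. The helpers below are hypothetical and use only core Ruby; they are a sketch of the idea, not the gem's code.

```ruby
# Hypothetical helpers, not part of the gem.
def sketch_bytes(input)
  # Strings expose #bytes; arrays and array-like wrappers already hold bytes.
  input.respond_to?(:bytes) ? input.bytes : input.to_a
end

def sketch_hex(input)
  sketch_bytes(input).pack('C*').unpack1('H*')
end

sketch_hex('abc')
# => "616263"
sketch_hex([97, 98, 99])
# => "616263"
```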
## Related
- [`multiformats/multibase`][git-multibase]: the spec repository
- [`multiformats/ruby-multicodec`][git-ruby-multicodec]: the ruby implementation of [`multiformats/multicodec`][git-multicodec]
- [`multiformats/ruby-multihash`][git-ruby-multihash]: the ruby implementation of [`multiformats/multihash`][git-multihash]
## Development
After checking out the repo, run `bin/setup` to install dependencies. Then, run
`rake test` to run the tests. You can also run `bin/console` for an interactive
prompt that will allow you to experiment.
To install this gem onto your local machine, run `bundle exec rake install`.
To release a new version, update the version number in `version.rb`, and then
run `bundle exec rake release`, which will create a git tag for the version,
push git commits and tags, and push the `.gem` file to [rubygems.org][web-rubygems].
## Contributing
Bug reports and pull requests are welcome on GitHub at [SleeplessByte/ruby-multibase][git-self].
This project is intended to be a safe, welcoming space for collaboration, and
contributors are expected to adhere to the [Contributor Covenant][web-coc] code
of conduct.
## License
The gem is available as open source under the terms of the [MIT License][web-mit].
## Code of Conduct
Everyone interacting in the Multibases project’s codebases,
issue trackers, chat rooms and mailing lists is expected to follow the
[code of conduct][git-self-coc].
[spec]: https://github.com/multiformats/multibase
[git-self-coc]: https://github.com/SleeplessByte/ruby-multibase/blob/master/CODE_OF_CONDUCT.md
[git-self]: https://github.com/SleeplessByte/ruby-multibase
[git-ruby-multicodec]: https://github.com/SleeplessByte/ruby-multicodec
[git-multicodec]: https://github.com/multiformats/multicodec
[git-multibase]: https://github.com/multiformats/multibase
[git-multibase-table]: https://github.com/multiformats/multibase/blob/master/multibase.csv
[git-ruby-multihash]: https://github.com/multiformats/ruby-multihash
[git-multihash]: https://github.com/multiformats/multihash
[web-coc]: http://contributor-covenant.org
[web-mit]: https://opensource.org/licenses/MIT
[web-rubygems]: https://rubygems.org