Frequently Asked Questions

1. What's the current state of MetaFS?

MetaFS is currently in a closed "beta" state:

Beta (current state 2016/09):

Alpha (former state 2015/10):

2. How do I switch the MongoDB backend?

To migrate from MongoDB 2.6.1 or earlier to MongoDB 3.0.1 or later (in particular with the WiredTiger storage engine), or to TokuMX (MongoDB-compatible), use one of the two approaches below.

2.1. Backup/Restore Approach

Given alpha as the volume name and Alpha/ as the mount point:
% metafs alpha Alpha/
% cd Alpha/
% metabusy backup
% cd ..
% metafs -u alpha
At this point, either turn off the old MongoDB, start the new MongoDB and remount as before:
% metafs alpha Alpha/
or connect the MetaFS volume to the new MongoDB running on an alternative port (e.g. 27024):
% metafs --mongo.port=27024 alpha Alpha/
Next, make sure that metabusy will use the new mongorestore, then:
% cd Alpha/
% metabusy restore
Once you have verified that the restore worked as expected, you may remove the backup:
% metabusy purgebackup
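
The steps above could be wrapped in a small script. The following is a minimal sketch and not part of MetaFS: the function names backup_phase and restore_phase, the run helper and the DRY_RUN variable are hypothetical, while the commands they issue are the ones listed above. With DRY_RUN=1 (the default) the commands are only printed, so the sequence can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch of the backup/restore migration, split around the manual MongoDB switch.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "% $*"
    else
        "$@"
    fi
}

# backup_phase VOLUME MOUNTPOINT: mount, back up the metadata, unmount.
backup_phase() {
    run metafs "$1" "$2" &&
    run cd "$2" &&
    run metabusy backup &&
    run cd .. &&
    run metafs -u "$1"
}

# restore_phase VOLUME MOUNTPOINT [PORT]: remount (optionally against a
# MongoDB listening on PORT) and restore the metadata from the backup.
restore_phase() {
    if [ -n "$3" ]; then
        run metafs "--mongo.port=$3" "$1" "$2"
    else
        run metafs "$1" "$2"
    fi &&
    run cd "$2" &&
    run metabusy restore
}
```

A possible session: call backup_phase alpha Alpha/, switch MongoDB by hand, then call restore_phase alpha Alpha/ 27024. The metabusy purgebackup step is deliberately left manual, to be run only after the restore has been checked.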

2.2. Archiving Approach

Alternatively, you can back up your volume with marc (run from within the mounted volume), format the volume, unmount, remount and restore from the .marc archive. This triggers an update event on all items and re-extracts the metadata, which can be time-consuming depending on the number of items:
% marc cvz ~/alpha-backup.marc .
% metabusy format
% cd ..
% metafs -u alpha
At this point, switch MongoDB or run it on a new port, then:
% metafs alpha Alpha/
% cd Alpha
% marc xv ~/alpha-backup.marc 

3. What limits does MetaFS have?

Please read the following answers carefully, as it's a bit tricky to get an overview of what the different limits are.

MMAPv1 and WiredTiger are storage engines of MongoDB. MetaFS::IndexDB is a backend in development, aimed at lifting the limits of the other backends (not yet released, expected with 0.6.x).

3.1. Volume size

A volume contains:
  1. data of items
  2. metadata of items
  3. indexes of metadata
  4. full text search (FTS) index
Each of these can reside on a different physical disk; the FTS index can also reside on another machine when Elasticsearch (the default) is used as FTS backend.

MongoDB 2.4/2.6/3.0 (MMAPv1): The total size of the metadata of all items of a volume cannot exceed the virtual memory of the machine.

MongoDB 3.0 (WiredTiger): (unknown)

TokuMX 2.0: (unknown)

MetaFS::IndexDB: Maximum size of a single index (of a key) cannot exceed 256TiB.

3.2. Item / file content size

The maximum size of content per item depends on the underlying filesystem:

Ext4: 16TiB

XFS: 8EiB (8,388,608TiB)

ZFS: 16EiB (16,777,216TiB)

Btrfs: 16EiB (16,777,216TiB)

3.3. Item / file count

Ext4: Maximum of 1EiB (1,048,576TiB) filesystem size; further
  • MongoDB 2.4/2.6/3.0 (MMAPv1): the maximum size of the metadata of a volume is 64TiB (journaled MongoDB) or 128TiB (non-journaled MongoDB); given that the metadata of a single item averages about 20KiB (20 * 1024 bytes), ( 64 * 1024^4 ) / ( 20 * 1024 ) = 3.2 * 1024^3 = 3,435,973,836 items (~3.4 * 10^9) are possible, or double that with non-journaled MongoDB
  • MongoDB 3.0 (WiredTiger): unknown
  • TokuMX 2.0: same as MongoDB 2.4/2.6/3.0 (MMAPv1)
  • MetaFS::IndexDB: 4 * 10^9 items, based on the Ext4 limit
XFS: unknown
  • MetaFS::IndexDB: ~4 * 10^12 items (see calculation under ZFS)
ZFS: Maximum of 2^48 files (281,474,976,710,656 or ~280 * 10^12 files); further
  • MetaFS::IndexDB:
    • index limit: 2^48 bytes (281,474,976,710,656 or ~280 * 10^12 bytes); for the index of uid, at approx. 64 bytes per key, this results in 2^42 entries (4,398,046,511,104 or ~4 * 10^12 items)
Btrfs: Maximum of 2^64 files (18,446,744,073,709,551,616 or ~18 * 10^18 files); further
  • MetaFS::IndexDB: ~4 * 10^12 items (see calculation under ZFS)
If your use case involves a large number of individual items, Btrfs or ZFS is recommended.
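
The MMAPv1 and MetaFS::IndexDB estimates above can be reproduced with plain shell integer arithmetic. This is only a sketch for checking the numbers: the function names max_items and index_entries are hypothetical, and it assumes a shell with 64-bit integer arithmetic (which common sh implementations on 64-bit Linux provide).

```shell
#!/bin/sh
# Reproduces the item-count estimates with shell integer arithmetic
# (assumes the shell evaluates $(( )) with 64-bit integers).

# max_items METADATA_TIB AVG_ITEM_KIB
# Items that fit when all metadata must stay below METADATA_TIB (MMAPv1 case).
max_items() {
    echo $(( ($1 * 1024 * 1024 * 1024 * 1024) / ($2 * 1024) ))
}

# index_entries LOG2_INDEX_BYTES BYTES_PER_KEY
# Entries that fit into one index of 2^LOG2_INDEX_BYTES bytes (IndexDB case).
index_entries() {
    echo $(( (1 << $1) / $2 ))
}

echo "MMAPv1, journaled (64TiB, 20KiB/item):      $(max_items 64 20) items"
echo "MMAPv1, non-journaled (128TiB, 20KiB/item): $(max_items 128 20) items"
echo "IndexDB uid index (2^48 bytes, 64B/key):    $(index_entries 48 64) items"
```

The results are rounded down by the integer division; the average metadata size of 20KiB per item is the estimate used above, so real volumes may hold more or fewer items.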

3.4. Metadata

The following metadata limits apply, depending on the backend database in use:

3.4.1. Metadata total size

MongoDB 2.4/2.6/3.0 (MMAPv1): The maximum size of the metadata of a single item or file is 16MiB.

MongoDB 3.0 (WiredTiger): (unknown)

TokuMX 2.0: The maximum size of the metadata of a single item or file is 16MiB.

MetaFS::IndexDB: unlimited, although the metadata of a single item has to fit into physical memory. A fraction (e.g. 1/10) of the physical RAM is a reasonable maximum size, so that MetaFS itself doesn't use too many resources, e.g. 16GiB RAM -> ~1GiB max metadata, which is a lot; with a 500MB/s SSD it takes ~2 secs to load, so beware of overly large metadata for a single item.
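
The rule of thumb above can be turned into a quick estimate. This is a rough sketch with hypothetical helper names (max_metadata_mib, load_time_ms); it ignores filesystem and parsing overhead and only divides size by sequential read throughput.

```shell
#!/bin/sh
# max_metadata_mib RAM_GIB [DIVISOR]
# Suggested upper bound for the metadata of one item: RAM / DIVISOR (default 10).
max_metadata_mib() {
    echo $(( $1 * 1024 / ${2:-10} ))
}

# load_time_ms SIZE_MIB DISK_MB_PER_S
# Time in milliseconds to read SIZE_MIB from a disk delivering DISK_MB_PER_S.
load_time_ms() {
    echo $(( $1 * 1024 * 1024 * 1000 / ($2 * 1000 * 1000) ))
}

echo "16GiB RAM -> at most $(max_metadata_mib 16) MiB metadata per item"
echo "1GiB at 500MB/s -> $(load_time_ms 1024 500) ms to load"
```

An exact tenth of 16GiB is about 1.6GiB; the 1GiB quoted above is the more conservative round figure.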

3.4.2. Filename length

MongoDB 2.4/2.6/3.0 (MMAPv1): The maximum filename length is 16MiB minus the other metadata of the item / file.

MongoDB 3.0 (WiredTiger): (unknown)

TokuMX 2.0: The maximum filename length is 16MiB minus the other metadata of the item / file.

MetaFS::IndexDB: unlimited (see also "Metadata total size" above)

Needless to say, extremely long filenames (>10,000 characters) make little sense: the filename is mainly an identifier for humans to remember, whereas conceptually the main item identifier in MetaFS is the uid, which is globally unique.

3.4.3. Depth

MongoDB 2.4/2.6/3.0 (MMAPv1): The maximum depth of a metadata key is 100 levels (99 dots), e.g. a.b.c....z.a0.b0.c0..., and the key may be at most 123 characters long.

MongoDB 3.0 (WiredTiger): (unknown)

TokuMX 2.0: The maximum depth of a metadata key is 100 levels (99 dots), e.g. a.b.c....z.a0.b0.c0..., and the key may be at most 123 characters long.

MetaFS::IndexDB: unlimited levels (see also "Metadata total size" above), yet the underlying filesystem might impose limits[1] on the total length of the key:

  • Ext4: 255 characters
  • XFS: 255 characters
  • ZFS: 255 characters
  • Btrfs: 255 characters
  1. a later version of MetaFS::IndexDB might lift those limits to a higher number
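
A metadata key can be checked against these limits before it is used. The sketch below is illustrative only: check_mongo_key and check_indexdb_key are hypothetical helpers, not MetaFS commands, and the length is whatever the shell reports for ${#1} (bytes in some shells, characters in others).

```shell
#!/bin/sh
# check_mongo_key KEY
# Succeeds if KEY has at most 100 dot-separated levels and at most 123 characters.
check_mongo_key() {
    levels=$(( $(printf '%s' "$1" | tr -cd '.' | wc -c) + 1 ))
    [ "$levels" -le 100 ] && [ "${#1}" -le 123 ]
}

# check_indexdb_key KEY
# Succeeds if KEY is at most 255 characters long (filesystem limit listed above).
check_indexdb_key() {
    [ "${#1}" -le 255 ]
}

if check_mongo_key "a.b.c"; then
    echo "a.b.c is within the MongoDB/TokuMX key limits"
fi
```

With single-character levels, the 123-character limit is reached at 62 levels, so in practice the length limit tends to apply before the depth limit does.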

3.4.4. Amount of Indexed Metadata

All metadata is searchable, regardless of whether it is indexed:

  • indexed metadata: fast to lookup, fast sortable
  • unindexed metadata: slow to lookup, not sortable
MongoDB 2.4/2.6/3.0 (MMAPv1/WiredTiger) & TokuMX 2.0:
MetaFS aims to index all metadata, but at most 64 metadata keys can be indexed, as MongoDB supports only 64 indexes per collection. The system metadata keys are uid, name, parent, type, ctime, mtime, atime, utime, otime, mime, size, hash, tags, title and author; these 15 are already used, leaving 49 other keys available. Images use image.*, audio items audio.*, textual content text.*, etc.; about 35 custom keys remain free to use. This is currently a severe limitation and is about to be addressed (by patching MongoDB or switching to another NoSQL database). See also MongoDB Limits and Limits of MongoDB.

MetaFS::IndexDB: unlimited (see also "Metadata total size" above); by default the values of all keys are indexed.
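
The index budget for the MongoDB and TokuMX backends follows directly from the list of system keys. A minimal sketch (the names MAX_INDEXES, SYSTEM_KEYS and free_index_slots are hypothetical and used only for this illustration):

```shell
#!/bin/sh
# Index budget per collection, as stated above for MongoDB / TokuMX.
MAX_INDEXES=64
SYSTEM_KEYS="uid name parent type ctime mtime atime utime otime mime size hash tags title author"

# free_index_slots: indexes left once the system keys are accounted for.
free_index_slots() {
    set -- $SYSTEM_KEYS          # split the list into positional parameters
    echo $(( MAX_INDEXES - $# ))
}

echo "$(free_index_slots) index slots left for image.*, audio.*, text.* and custom keys"
```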

4. Memory Usage

Depending on the backend database, the memory usage and requirements vary:

MongoDB 2.4/2.6/3.0 (MMAPv1/WiredTiger): uses up (all) available memory in order to do fast lookups in memory; virtual as well as resident memory usage grows and cannot be limited, leaving it up to the operating system to let other programs use (virtual/resident) memory. For example:

MongoDB 2.6.1: 12 * 10^6 (12 million) Wikipedia .txt files: virtual: 65GB, resident: 5.2GB, shared: 5.1GB

TokuMX 2.0: unknown

MetaFS::IndexDB: 15-30MB at normal operation, all data resides on disk (indexes and metadata preferably on SSD, the data itself can reside on slower HDD).

5. What OSes are supported?

Linux supported [1]
MacOS X - [2]
Windows - [3]

  1. Ubuntu 14.04 or Debian preferably
  2. support might come later
  3. it's very unlikely it will be supported as it doesn't support FUSE (Filesystem in Userspace) yet

6. MetaFS on EC2 / KVM / VPS / LXC / Virtual Server

MetaFS in its current form uses Linux FUSE, so one has to make sure the FUSE kernel module is available within the container / virtual machine:

Qemu-KVM works
VirtualBox works
AWS EC2 works
DigitalOcean VPS works
Commercial VPS doesn't work[1]
LXC doesn't work[2]
Docker works[3]

  1. most commercial VPS providers do not support FUSE, see solution
  2. requires manual intervention to work, see example solution
  3. requires docker run --privileged=true ...
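
Whether FUSE is usable inside a given container or virtual machine can be checked before installing MetaFS. The following is a minimal sketch, assuming the usual Linux conventions that the FUSE device is /dev/fuse and that loaded filesystem types are listed in /proc/filesystems; the function name fuse_available is hypothetical, and passing this check does not guarantee that mounting is permitted.

```shell
#!/bin/sh
# fuse_available [DEVICE] [FILESYSTEMS_FILE]
# Succeeds if the FUSE device node exists and the kernel lists the "fuse"
# filesystem type. Both paths can be overridden, e.g. for testing.
fuse_available() {
    dev="${1:-/dev/fuse}"
    fs_list="${2:-/proc/filesystems}"
    [ -e "$dev" ] && grep -q '[[:space:]]fuse$' "$fs_list" 2>/dev/null
}

if fuse_available; then
    echo "FUSE looks available"
else
    echo "FUSE not available (try 'modprobe fuse' or check the container/VPS configuration)"
fi
```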

7. Updates

Significant updates of this document: Authors