summaryrefslogtreecommitdiffstats
path: root/system/bees/README
diff options
context:
space:
mode:
Diffstat (limited to 'system/bees/README')
-rw-r--r--system/bees/README33
1 files changed, 33 insertions, 0 deletions
diff --git a/system/bees/README b/system/bees/README
new file mode 100644
index 0000000000..88041ffa13
--- /dev/null
+++ b/system/bees/README
@@ -0,0 +1,33 @@
+bees (Best-Effort Extent-Same) is a block-oriented userspace
+deduplication agent designed for large btrfs filesystems. It is an
+offline dedupe combined with an incremental data scan capability to
+minimize time data spends on disk from write to dedupe.
+
+Strengths:
+ * Space-efficient hash table and matching algorithms - can use as
+ little as 1 GB hash table per 10 TB unique data (0.1GB/TB)
+ * Daemon incrementally dedupes new data using btrfs tree search
+ * Works with btrfs compression - dedupe any combination of compressed
+ and uncompressed files
+ * Works around btrfs filesystem structure to free more disk space
+ * Persistent hash table for rapid restart after shutdown
+ * Whole-filesystem dedupe - including snapshots
+ * Constant hash table size - no increased RAM usage if data set
+ becomes larger
+ * Works on live data - no scheduled downtime required
+ * Automatic self-throttling based on system load
+
+Weaknesses:
+ * Whole-filesystem dedupe - has no include/exclude filters, does not
+ accept file lists
+ * Requires root privilege (or CAP_SYS_ADMIN)
+ * First run may require temporary disk space for extent reorganization
+ * First run may increase metadata space usage if many snapshots exist
+ * Constant hash table size - no decreased RAM usage if data set
+ becomes smaller
+ * btrfs only
+
+After installing, edit /etc/rc.d/rc.bees.conf, /etc/logrotate.d/bees,
+and /etc/bees/*.conf, and ensure /etc/rc.d/rc.bees is started from
+/etc/rc.d/rc.local. To drastically reduce the amount of logging it is
+recommended to add "-v 6" to OPTIONS in /etc/bees/*.conf.