Suzf Blog

[Translation] Restoring Couchbase data with cbrestore

Jeffrey Sep 02, 2016 NoSQL

To restore bucket data that was backed up with the cbbackup command, use the cbrestore command to restore the data into a bucket in a new cluster.

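When restoring data, you have to choose a restore sequence that fits the type of restore you are performing. How you restore cluster data depends on how the cluster was backed up.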

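If the bucket data was backed up with cbbackup, you can restore it to a cluster with the same or a different configuration. This is because cbbackup stores the bucket data in a format that allows it to be restored to a new cluster.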

Note: If the data was backed up by copying files directly, you must restore it to the same cluster.

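cbrestore takes the information from data backed up with the cbbackup command and writes it into a cluster. The cluster configuration does not have to match the one in use when the data was backed up. This makes it possible to transfer data to a new cluster in a disaster-recovery scenario, or when upgrading or expanding an existing cluster.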

Because the data can be restored flexibly, a number of different scenarios can be carried out on data that has been backed up:

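Restore data to a cluster of a different size and configuration.
Transfer or restore data to a different bucket on a cluster with the same or a different configuration.
Restore selected data to a different bucket on a cluster with the same or a different configuration.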

Basic usage of the cbrestore command:

cbrestore [options] [source] [destination]

Where:

[options]
    Options specifying how the information should be restored into the cluster. Common options include:

        --bucket-source

        Specify the name of the bucket data to be read from the backup data that will be restored.

        --bucket-destination

        Specify the name of the bucket the data will be written to. If this option is not specified, the data will be written to a bucket with the same name as the source bucket.

        --add

        Use --add instead of --set to avoid overwriting the existing items in the destination.

[source]
    The backup directory specified to cbbackup where the backup data was stored.

[destination]
    The REST API URL of a node within the cluster where the information will be restored.

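The cbrestore command restores the data of only a single bucket at a time. If you backed up every bucket in the cluster, you must restore each bucket to the cluster individually. All destination buckets must already exist, because cbrestore does not create or configure buckets on the nodes you are restoring to.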

For example
Restore a single bucket into the cluster:

cbrestore \
    /backups/backup-2012-05-10 \
    http://Administrator:password@HOST:8091 \
    --bucket-source=XXX
    [####################] 100.0% (231726/231726 msgs)
    bucket: default, msgs transferred...
    :                total |       last |    per sec
    batch :                  232 |        232 |       33.1
    byte  :             10247683 |   10247683 |  1462020.7
    msg   :               231726 |     231726 |    33060.0
    done

Restore a bucket into a different bucket in the cluster:

cbrestore \
    /backups/backup-2012-05-10 \
    http://Administrator:password@HOST:8091 \
    --bucket-source=XXX \
    --bucket-destination=YYY
    [####################] 100.0% (231726/231726 msgs)
    bucket: default, msgs transferred...
    :                total |       last |    per sec
    batch :                  232 |        232 |       33.1
    byte  :             10247683 |   10247683 |  1462020.7
    msg   :               231726 |     231726 |    33060.0
    done

The msg row above reports the number of documents restored into the bucket in the cluster.
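
Because cbrestore handles one bucket per invocation, restoring several buckets means running it once per bucket. The following is a minimal Python sketch of that loop, not part of the Couchbase tooling: the function name build_cbrestore_commands and the bucket names orders and orders_restored are hypothetical, and only the options described above (--bucket-source, --bucket-destination, --add) are used. It only builds the argument lists and does not run anything.

```python
from typing import Dict, List, Optional


def build_cbrestore_commands(
    backup_dir: str,
    node_url: str,
    buckets: Dict[str, Optional[str]],
    add_only: bool = False,
) -> List[List[str]]:
    """Return one cbrestore argument list per bucket (hypothetical helper).

    `buckets` maps each source bucket name to a destination bucket name,
    or to None to restore into a bucket with the same name.
    """
    commands = []
    for source, destination in buckets.items():
        argv = ["cbrestore", backup_dir, node_url, f"--bucket-source={source}"]
        if destination is not None and destination != source:
            argv.append(f"--bucket-destination={destination}")
        if add_only:
            # --add avoids overwriting items that already exist in the destination.
            argv.append("--add")
        commands.append(argv)
    return commands


# Hypothetical bucket names, used only for illustration.
commands = build_cbrestore_commands(
    "/backups/backup-2012-05-10",
    "http://Administrator:password@HOST:8091",
    {"XXX": None, "orders": "orders_restored"},
)
```

Each list could then be passed to subprocess.run once the destination buckets exist, since cbrestore does not create them.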

Filtering keys during restore

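The cbrestore command can filter keys when restoring data from the files created during a backup into the database.
This is in addition to the filtering support available during the backup process.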

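The filter is specified as a regular expression passed as an argument to the cbrestore command.
For example, to restore the data for keys that start with object into a bucket: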

cbrestore /backups/backup-20120501 http://HOST:8091 \
    -u Administrator -p password \
    -b default \
    -k '^object.*'
    2013-02-18 10:39:09,476: w0 skipping msg with key: sales_7597_3783_6
    ...
    2013-02-18 10:39:09,476: w0 skipping msg with key: sales_5575_3699_6
    2013-02-18 10:39:09,476: w0 skipping msg with key: sales_7597_3840_6
    [                    ] 0.0% (0/231726 msgs)
    bucket: default, msgs transferred...
    :                total |       last |    per sec
    batch :                    1 |          1 |        0.1
    byte  :                    0 |          0 |        0.0
    msg   :                    0 |          0 |        0.0
    done

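Only the keys that match the specified prefix are copied into the default bucket. An informational message is printed for each key that is skipped.
The rest of the output shows the records transferred and the usual summary.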
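
To preview which keys a filter would keep before running a restore, the pattern can be tried against a sample of key names. This is a rough Python sketch that assumes the filter behaves like Python's re module searching each key; with an anchored pattern such as ^object.* that amounts to a prefix test. The helper name split_by_key_filter and the key object_1 are hypothetical; sales_7597_3783_6 comes from the output above.

```python
import re
from typing import Iterable, List, Tuple


def split_by_key_filter(pattern: str, keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split keys into (restored, skipped) according to a regular expression."""
    regex = re.compile(pattern)
    restored: List[str] = []
    skipped: List[str] = []
    for key in keys:
        # Assumption: a key is restored when the pattern is found in it.
        if regex.search(key):
            restored.append(key)
        else:
            skipped.append(key)
    return restored, skipped


restored, skipped = split_by_key_filter(
    r"^object.*", ["object_1", "sales_7597_3783_6"]
)
```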

Restoring data by copying files

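To restore data to a cluster with an identical configuration, first shut down the entire cluster. Once the data has been restored, start the entire cluster again. In this case you are replacing the entire cluster's data and configuration with the backed-up versions of the data files,
and then restarting the cluster with the saved versions of the cluster files.
Important: Make sure that all restored files are owned by the appropriate Couchbase user.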

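When restoring data to the same cluster, verify the following:
The backup and the restore must use the same version of Couchbase.
The cluster must contain the same number of nodes.
Each node must have the same IP address or hostname it was configured with when the backup was taken.
All config.dat configuration files and all database files must be restored to their original locations.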
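
The first three conditions above can be checked mechanically before any files are touched. This is a minimal Python sketch, assuming you have recorded the Couchbase version and node addresses at backup time yourself; ClusterSnapshot and restore_blockers are hypothetical names, not part of any Couchbase tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ClusterSnapshot:
    """Hypothetical record of a cluster: its version and node addresses."""
    version: str
    node_addresses: FrozenSet[str]


def restore_blockers(backup: ClusterSnapshot, target: ClusterSnapshot) -> List[str]:
    """Return the reasons a file-copy restore should not proceed (empty if none)."""
    problems = []
    if backup.version != target.version:
        problems.append(
            f"version mismatch: backup {backup.version}, cluster {target.version}"
        )
    if len(backup.node_addresses) != len(target.node_addresses):
        problems.append(
            f"node count mismatch: backup {len(backup.node_addresses)}, "
            f"cluster {len(target.node_addresses)}"
        )
    missing = backup.node_addresses - target.node_addresses
    if missing:
        problems.append("nodes missing from cluster: " + ", ".join(sorted(missing)))
    return problems
```

An empty list means the checks passed; the file locations in the last condition still have to be verified on each node.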

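The steps required to complete the restore are:
Stop the Couchbase service on all nodes.
On each node, restore the database files, stats.json, and the config.dat configuration file from your backup.
Restart the service on each node.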


Source: Restoring with cbrestore

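Redis monitoring tips

Jeffrey Mar 14, 2016 NoSQL

This article comes from a series by Simon Maynard, co-founder of Bugsnag. Drawing on several years of experience with Redis, the author gives a systematic summary of Redis monitoring methods. It is packed with practical material and well worth a read.

Original article: Redis Masterclass – Part 2, Monitoring

The most direct way to monitor Redis is the built-in info command. Running the single command below returns a status report for the Redis server.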

redis-cli info

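Memory usage

If Redis uses more memory than the physical memory available, the Redis process is likely to be killed by the OOM Killer. To guard against this, monitor used_memory and used_memory_peak from the info command, set a threshold on memory usage, and configure an alert for it. Alerting is only a means to an end, though: what matters is planning in advance what you will do when memory usage grows too large, whether that is clearing out cold data that is no longer needed or migrating Redis to a more powerful machine.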
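
A memory check of this kind can be sketched in a few lines of Python. The sketch assumes the raw text of redis-cli info is available as a string; the helper names parse_info and memory_alert, the 80% warning ratio, and the sample numbers are illustrative choices, not Redis defaults.

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse `redis-cli info` output (key:value lines) into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def memory_alert(info: Dict[str, str], limit_bytes: int, warn_ratio: float = 0.8) -> bool:
    """Return True when used_memory has reached warn_ratio of the chosen limit."""
    return int(info["used_memory"]) >= limit_bytes * warn_ratio


# Made-up sample output, for illustration only.
sample = "# Memory\r\nused_memory:1048576\r\nused_memory_peak:2097152\r\n"
info = parse_info(sample)
```

The limit would normally be derived from the physical memory of the host, leaving headroom for the operating system and for forks.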

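Persistence

If Redis crashes because of a problem with your machine or with Redis itself, the dumped rdb file may be your only lifeline, so monitoring the Redis dump file is also important. Monitor rdb_last_save_time to learn when data was last dumped, and monitor rdb_changes_since_last_save to learn how many changes you would lose if a failure happened right now.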
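
As a rough Python sketch of such a check: the function below takes the info fields as a mapping (for example parsed from redis-cli info output) and compares them with two thresholds. The name persistence_warnings and both thresholds are hypothetical and should be chosen to match how much data loss is acceptable.

```python
import time
from typing import List, Mapping, Optional


def persistence_warnings(
    info: Mapping[str, object],
    max_age_seconds: int,
    max_unsaved_changes: int,
    now: Optional[float] = None,
) -> List[str]:
    """Return warnings about a stale dump or too many unsaved changes."""
    now = time.time() if now is None else now
    warnings = []
    # rdb_last_save_time is a Unix timestamp of the last successful save.
    age = now - int(info["rdb_last_save_time"])
    if age > max_age_seconds:
        warnings.append(f"last rdb save was {int(age)}s ago")
    pending = int(info["rdb_changes_since_last_save"])
    if pending > max_unsaved_changes:
        warnings.append(f"{pending} changes not yet saved to disk")
    return warnings
```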

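Master-slave replication

If you have set up master-slave replication, you should monitor whether replication is healthy, mainly by watching master_link_status in the info output. A value of up means synchronization is working normally; a value of down means you should look at the other diagnostic fields in the output, such as these: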

role:slave
master_host:192.168.1.128
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:1356900595
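
The check described above can be sketched in Python as a function over the info fields. The function name replication_problem is hypothetical, and the fields are assumed to have already been parsed into a dict of strings.

```python
from typing import Mapping, Optional


def replication_problem(info: Mapping[str, str]) -> Optional[str]:
    """Return a description of a broken replication link, or None if healthy."""
    if info.get("role") != "slave":
        return None  # Only slaves report master_link_status.
    if info.get("master_link_status") == "up":
        return None
    host = info.get("master_host", "?")
    port = info.get("master_port", "?")
    down_for = info.get("master_link_down_since_seconds", "unknown")
    return f"replication link to {host}:{port} is down ({down_for}s)"


# The diagnostic fields shown above, as a dict.
fields = {
    "role": "slave",
    "master_host": "192.168.1.128",
    "master_port": "6379",
    "master_link_status": "down",
    "master_link_down_since_seconds": "1356900595",
}
message = replication_problem(fields)
```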

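Fork performance

When Redis persists data to disk it performs a fork, relying on the copy-on-write behavior of fork to get a memory snapshot as cheaply as possible. Although the memory itself is copy-on-write, the virtual memory table has to be allocated at the moment of the fork, so the fork stalls the main thread briefly (all reads and writes stop). The length of the stall depends on how much memory Redis is currently using. For a Redis instance in the GB range, a fork typically takes on the order of milliseconds. Monitor latest_fork_usec in the info output to see how long the most recent fork stalled the server.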
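
Since latest_fork_usec is reported in microseconds, a small conversion makes it easier to compare against a latency budget. This is a minimal Python sketch; the function names and the limit passed in are hypothetical and depend on what stall your application can tolerate.

```python
from typing import Mapping


def fork_stall_ms(info: Mapping[str, object]) -> float:
    """Convert latest_fork_usec (microseconds) to milliseconds."""
    return int(info["latest_fork_usec"]) / 1000.0


def fork_too_slow(info: Mapping[str, object], limit_ms: float) -> bool:
    """Return True when the most recent fork stalled longer than limit_ms."""
    return fork_stall_ms(info) > limit_ms
```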

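Configuration consistency

Redis supports changing configuration at runtime with CONFIG SET. This is convenient, but it introduces a problem: changes made this way are not written back to your configuration file. If Redis is restarted for any reason, the changes made with CONFIG SET are lost, so each time you change a setting with CONFIG SET you should update the configuration file to match. To guard against human error, monitor the configuration as well: use CONFIG GET to fetch the current runtime configuration, compare it with the values in redis.conf, and raise an alert if the two differ.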
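
The comparison can be sketched in Python as follows. The sketch assumes the runtime values have already been fetched with CONFIG GET into a dict of strings, and it handles only simple one-line "name value" directives; directives that repeat (such as save) or use size units (such as 1gb, which CONFIG GET reports in bytes) would need normalizing first. The names parse_redis_conf and config_drift are hypothetical.

```python
from typing import Dict, Mapping, Tuple


def parse_redis_conf(text: str) -> Dict[str, str]:
    """Parse simple `name value` lines from redis.conf, ignoring comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0].lower()] = parts[1].strip().strip('"')
    return settings


def config_drift(
    file_settings: Mapping[str, str], runtime_settings: Mapping[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (file_value, runtime_value)} for settings that differ."""
    drift: Dict[str, Tuple[str, str]] = {}
    for name, file_value in file_settings.items():
        runtime_value = runtime_settings.get(name)
        if runtime_value is not None and runtime_value != file_value:
            drift[name] = (file_value, runtime_value)
    return drift
```

A non-empty result is the condition on which the alert would fire.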

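Slow log

Redis provides the SLOWLOG command for retrieving recent slow-log entries. The slow log is kept directly in memory, so its overhead is small. In practice, we run the SLOWLOG command from a crontab job, write the entries to a file, and use Kibana to build real-time performance charts for monitoring.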

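It is worth noting that the time recorded in the Redis slow log covers only the time Redis itself spends executing a command; it does not include I/O time such as receiving data from the client or sending data back to it. The Redis slow log also differs from the slow logs of other databases in one respect. In other databases an occasional 100ms slow-log entry may be fairly normal, because most databases execute with multiple concurrent threads and the performance of one command on one thread may not represent overall performance. Redis, however, is single-threaded, so a slow-log entry may need immediate attention, and it is best to find out exactly what caused it.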
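
To get slow-log entries into a file that a log pipeline can read, one option is to write each entry as a line of JSON. The Python sketch below assumes each entry is a dict with id, start_time, duration (in microseconds) and command fields, which is modeled on what the redis-py client returns from slowlog_get(); treat that shape, and the helper name slowlog_to_json_lines, as assumptions.

```python
import json
from typing import Iterable, List, Mapping


def slowlog_to_json_lines(entries: Iterable[Mapping[str, object]]) -> List[str]:
    """Convert slow-log entries to one JSON document per line."""
    lines = []
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        record = {
            "id": entry["id"],
            "timestamp": entry["start_time"],
            # Durations are assumed to be in microseconds.
            "duration_ms": entry["duration"] / 1000.0,
            "command": command,
        }
        lines.append(json.dumps(record, sort_keys=True))
    return lines
```

A cron job could append these lines to a file on each run, skipping ids it has already written.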

Monitoring services

-Sentinel

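Sentinel is a tool that ships with Redis. It monitors Redis master-slave replication and performs automatic failover when the master goes down. During failover it can also be configured to run a user-defined script, in which we can implement alert notifications and similar functions.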

-Redis Live

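Redis Live is a more general Redis monitoring solution. It works by periodically running the MONITOR command on Redis to capture the commands currently being executed, and then uses statistical analysis to generate visual reports as web pages.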

-Redis Faina

Redis Faina is a Redis monitoring service developed by Instagram, the well-known photo-sharing app. It works on a similar principle to Redis Live: both rely on MONITOR.

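Data distribution

Understanding how data is distributed in Redis is difficult, for example when you want to know which type of key uses the most memory. The tools below can help you analyze a Redis data set.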

-Redis-sampler

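Redis-sampler is a tool written by the author of Redis. By sampling, it gives you an idea of the approximate types of data currently in Redis and how the data is distributed.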

-Redis-audit

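Redis-audit is a script that tells us how much memory each class of key uses. It reports how frequently a class of key is accessed, how many values have an expiry set, and how much memory a class of key uses, which makes it easy to track down keys that are rarely or never used.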

-Redis-rdb-tools

Redis-rdb-tools is similar in function to Redis-audit, except that it derives its statistics by analyzing the rdb file.

Source: nosqlfan

Redis installation and a brief look at its features

Jeffrey Mar 14, 2016 NoSQL

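Redis (Remote Dictionary Server) is a NoSQL database that performs well in terms of response time and hit rate.
Our project needed a centralized, horizontally scalable caching framework, so I did some research. Although redis and memcached differ in efficiency (for a detailed comparison see http://timyang.net/data/mcdb-tt-redis/), either would meet the project's current needs. redis is the more attractive choice, though: it supports list and set operations as well as pattern-based key lookup (glob-style patterns). Most of what the project caches is lists, and when list data is added or modified redis has a big advantage (memcached can only reload the whole list, whereas redis can add to or modify the list in place).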

What is Couchbase Server

Jeffrey Oct 19, 2015 NoSQL


Developer(s)	Couchbase, Inc.
Stable release	3.0.3 / March 28, 2015
Written in	C++, Erlang, C [1]
Operating system	Cross-platform
Type	distributed key-value / document database system
License	Apache License (Open Source edition), Proprietary (Free Community edition and Paid Enterprise edition)
Website	www.couchbase.com

Couchbase Server, originally known as Membase, is an open-source, distributed (shared-nothing architecture) NoSQL document-oriented database that is optimized for interactive applications. These applications must serve many concurrent users by creating, storing, retrieving, aggregating, manipulating and presenting data. In support of these kinds of application needs, Couchbase is designed to provide easy-to-scale key-value or document access with low latency and high sustained throughput. It is designed to be clustered from a single machine to very large-scale deployments spanning many machines.

Couchbase Server provides on-the-wire client protocol compatibility with memcached,[2] but is designed to add disk persistence, data replication, live cluster reconfiguration, rebalancing and multitenancy with data partitioning.

In the parlance of Eric Brewer’s CAP theorem, Couchbase is a CP type system meaning it provides consistency and partition tolerance. However Couchbase Server can be set up as an AP system with multiple clusters using XDCR (Cross Data Center Replication).

Product history

Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, to develop a key-value store with the simplicity, speed, and scalability of memcached that also provided the storage, persistence and querying capabilities of a database. The original membase source code was contributed by NorthScale and project co-sponsors Zynga and NHN to a new project on membase.org in June 2010.

On February 8, 2011, the Membase project founders and Membase, Inc. announced a merger with CouchOne (a company with many of the principal players behind CouchDB) with an associated project merger. The merged company was called Couchbase, Inc. In January 2012, Couchbase released Couchbase Server 1.8. In December 2012, Couchbase Server 2.0 was released, with new features including a new JSON document store, indexing and querying, incremental MapReduce and cross datacenter replication.[3]

High-level architecture

Every Couchbase node is architecturally identical, consisting of a data manager and a cluster manager component.

Cluster manager

The cluster manager supervises the configuration and behavior of all the servers in a Couchbase cluster. It configures and supervises internode behavior like managing replication streams and rebalancing operations. It also provides metric aggregation and consensus functions for the cluster, and a RESTful cluster management API. The cluster manager is built atop Erlang/OTP, a proven environment for building and operating fault-tolerant distributed systems.

Replication and failover

Multi-model replication support: Peer-to-peer replication support with underlying architecture supporting master-slave replication
Configurable replication count: Balance resource utilization with availability requirements
High-speed failover: Fast failover to replicated items based upon request
XDCR: Cross Data Centre Replication [4]

Data manager

The data manager is responsible for storing and retrieving documents in response to data operations from applications.

Asynchronously writes data to disk after acknowledging write to client. In version 1.7 and later, applications can ensure data is synced to more than one server, while disk writes are still asynchronous.
- Tunables to define item ages that affect when data is persisted.
Supports working set greater than a memory quota per "node" or "bucket"
- Tunables to affect how max memory and migration from main-memory to disk is handled.
Configurable “tap” interface: External systems can subscribe to filtered data streams supporting, for example, full text search indexing, data analytics or archiving.[5]

Data format

A document is the most basic unit of data manipulation in Couchbase Server. Documents are stored in JSON document format with no predefined schemas.

Object-managed cache

Couchbase Server includes a built-in multithreaded object-managed cache that implements memcached compatible APIs such as get, set, delete, append, prepend etc.

Storage engine design

Couchbase Server has a tail-append storage design that is immune to data corruption, OOM killers or sudden loss of power. ["immune" to sudden loss of power is a claim that would need to be substantiated or documented, since appending to a file could still result in loss of data if power was interrupted in the middle of a disk write operation] Data is written to the data file in an append-only manner, which enables Couchbase to do mostly sequential writes for updates and provide optimized access patterns for disk I/O.
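
The idea of append-only writes can be illustrated with a toy in-memory model in Python. This is only a sketch of the general technique, not Couchbase's storage engine: the class AppendOnlyStore is hypothetical, the "file" is a Python list, and nothing is persisted to disk.

```python
from typing import Dict, List, Optional, Tuple


class AppendOnlyStore:
    """Toy key-value store in which every update is appended to a log."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, str]] = []   # stands in for the data file
        self._index: Dict[str, int] = {}        # key -> position of latest record

    def set(self, key: str, value: str) -> None:
        # An update never rewrites an earlier record; it appends a new one
        # and repoints the index at it.
        self._log.append((key, value))
        self._index[key] = len(self._log) - 1

    def get(self, key: str) -> Optional[str]:
        position = self._index.get(key)
        return None if position is None else self._log[position][1]

    def log_length(self) -> int:
        return len(self._log)


store = AppendOnlyStore()
store.set("doc1", '{"v": 1}')
store.set("doc1", '{"v": 2}')  # appended; the first record stays in the log
```

Because old records are left in place, writes stay sequential, at the cost of the log growing until stale records are reclaimed by some separate process.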

Performance

Cisco published a benchmark that measures the latency and throughput of Couchbase Server [6] with a mixed workload. Another performance benchmark, done by Altoros, compares Couchbase Server with other NoSQL database solutions.[7]

Licensing and support

Couchbase Server is a packaged version of Couchbase's open source technology and is available in two variants: a Community Edition without recent bug fixes as Open Source (Apache 2.0 license[8]) distribution, and an Enterprise Edition for commercial use.[9]

Couchbase Server builds are available for Ubuntu, Debian, Red Hat, Windows and Mac OS X platforms.

Bibliography

Brown, MC (June 22, 2012). Getting Started with Couchbase Server (1st edition). O'Reilly Media. p. 88. ISBN 978-1449331061.
"Balancing Oracle and open source at Orbitz". GigaOM. 21 September 2012.

References

"The Unreasonable Effectiveness of C". Damien Katz. 2013-01-08. Retrieved 2013-06-04.
"NewProtocols - memcached - Klingon - Memcached - Google Project Hosting". Code.google.com. 2011-08-22. Retrieved 2013-06-04.
"Couchbase 2.0 released; implements JSON document store". ZDNet. 12 December 2012.
http://www.couchbase.com/wiki/display/couchbase/XDCR+Protocol
Want to know what your memcached servers are doing? Tap them.
"Cisco and Solarflare Achieve Dramatic Latency Reduction for Interactive Web Applications with Couchbase, a NoSQL Database" (PDF). Cisco Systems. Retrieved 2013-06-04.
"Benchmarking Couchbase". Couchbase. 2012-10-30. Retrieved 2015-03-04.
"Couchbase Open Source Project".

"Couchbase Server Editions". Couchbase.

Trans from: https://en.wikipedia.org/wiki/Couchbase_Server

How-to: install mongo and php support

Jeffrey Oct 13, 2015 NoSQL

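MongoDB can be built and installed from its open source code, but it is more common to install the binaries, which are currently available for Windows, Linux, OS X, and Solaris. Many Linux package management systems now include a MongoDB package, including CentOS and Fedora,[1] Debian and Ubuntu,[2] Gentoo,[3] and Arch Linux.[4] It can also be obtained from the official website.[5]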

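MongoDB uses memory-mapped files, which limits data size to 2GB on 32-bit systems (64-bit systems support much larger data sets).[6] The MongoDB server can only run on little-endian systems, although most drivers work on both little-endian and big-endian systems.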