[Translation] How to Send Rsyslog Directly to Elasticsearch

License: Attribution-NonCommercial-ShareAlike 4.0 International

This article is from Suzf Blog. Unless otherwise noted, all content is original to SUZF.NET.

When reposting, please credit: http://suzf.net/post/1375

Originally posted on the server as code: Rsyslog to Elasticsearch

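System administrators have long known that centralizing logs is important, both for troubleshooting and for security. In my view, logs should not only be centralized but also searchable. (Running grep over a centralized log file has its uses, but it is not the solution I am looking for.)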

Elasticsearch

... is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

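It provides a few basic functions: storing, indexing, and searching data. It is part of the typical ELK Stack (Elasticsearch, Logstash, and Kibana). As a system administrator, I think it is also worth noting that, when run as a cluster, Elasticsearch is a distributed system. Adding another distributed system should not be done lightly. As an example, I now run MySQL Galera, RabbitMQ, Nomad, and Elasticsearch as distributed systems in production.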

Rsyslog directly to Elasticsearch

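This part of the post covers how to use rsyslog to send logs directly to an Elasticsearch cluster. For now I do not want to use Logstash from the ELK Stack. I just want rsyslog to store log messages directly in Elasticsearch, with Kibana as the graphical interface for searching them. I may use Logstash later, but not yet. I have written a very simple Ansible role that configures rsyslog to send logs to Elasticsearch. I run HAProxy in front of the Elasticsearch cluster as a load balancer, so I send all logs to the HAProxy virtual IP. Apart from that, what follows is what I use, and it has taken in millions of log messages over a few weeks, at roughly 30 per second.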

omelasticsearch

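omelasticsearch does all the work. You only need a recent rsyslog version to get the module. Then add a configuration file that points to your Elasticsearch cluster, and you are done. It is quite simple.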

# rpm -qi rsyslog-elasticsearch
Name        : rsyslog-elasticsearch
Version     : 7.4.7
Release     : 16.el7
Architecture: x86_64
Install Date: Wed 19 Apr 2017 05:19:15 PM CST
Group       : System Environment/Daemons
Size        : 40800
License     : (GPLv3+ and ASL 2.0)
Signature   : RSA/SHA256, Mon 21 Nov 2016 04:36:22 AM CST, Key ID 24c6a8a7f4a80eb5
Source RPM  : rsyslog-7.4.7-16.el7.src.rpm
Build Date  : Sun 06 Nov 2016 06:52:43 AM CST
Build Host  : worker1.bsys.centos.org
Relocations : (not relocatable)
Packager    : CentOS BuildSystem <http://bugs.centos.org>
Vendor      : CentOS
URL         : http://www.rsyslog.com/
Summary     : ElasticSearch output module for rsyslog
Description :
This module prov

Here is the configuration file:

# cat /etc/rsyslog.d/elasticsearch.conf 
module(load="omelasticsearch") # Elasticsearch output module

# this is for index names to be like: rsyslog-YYYY.MM.DD
template(name="rsyslog-index"
  type="list") {
    constant(value="rsyslog-")
    property(name="timereported" dateFormat="rfc3339" position.from="1" position.to="4")
    constant(value=".")
    property(name="timereported" dateFormat="rfc3339" position.from="6" position.to="7")
    constant(value=".")
    property(name="timereported" dateFormat="rfc3339" position.from="9" position.to="10")
}


# template to generate JSON documents for Elasticsearch in Logstash format
template(name="plain-syslog"
  type="list") {
    constant(value="{")
    constant(value="\"@timestamp\":\"")         property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"host\":\"")            property(name="hostname")
    constant(value="\",\"severity-num\":")      property(name="syslogseverity")
    constant(value=",\"facility-num\":")        property(name="syslogfacility")
    constant(value=",\"severity\":\"")          property(name="syslogseverity-text")
    constant(value="\",\"facility\":\"")        property(name="syslogfacility-text")
    constant(value="\",\"syslogtag\":\"")       property(name="syslogtag" format="json")
    constant(value="\",\"message\":\"")         property(name="msg" format="json")
    constant(value="\"}")
  }

action(type="omelasticsearch"
  server="{{ elastic_search_ip }}"
  serverport="9200"
  template="plain-syslog"  # use the template defined earlier
  searchIndex="rsyslog-index"
  dynSearchIndex="on"
  searchType="events"
  bulkmode="on"                   # use the Bulk API
  queue.dequeuebatchsize="5000"   # ES bulk size
  queue.size="100000"   # capacity of the action queue
  queue.workerthreads="5"   # 5 workers for the action
  action.resumeretrycount="-1"  # retry indefinitely if ES is unreachable
  errorfile="/var/log/omelasticsearch.log"
)
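
To make the two templates more concrete, here is a rough Python sketch (not part of rsyslog) of what they produce for a single message, and of the newline-delimited payload shape that the Elasticsearch Bulk API expects when bulkmode is on. The function names and the sample message are made up for illustration, and the exact request that omelasticsearch sends may differ in detail.

```python
import json


def index_name(timereported: str) -> str:
    """Mirror the "rsyslog-index" template: slice an RFC 3339 timestamp by position."""
    # rsyslog positions are 1-based and inclusive; Python slices are 0-based
    # and end-exclusive, so positions 1-4 become [0:4], and so on.
    year = timereported[0:4]
    month = timereported[5:7]
    day = timereported[8:10]
    return f"rsyslog-{year}.{month}.{day}"


def plain_syslog_doc(record: dict) -> str:
    """Mirror the "plain-syslog" template: one JSON document per message."""
    return json.dumps({
        "@timestamp": record["timereported"],
        "host": record["hostname"],
        "severity-num": record["syslogseverity"],
        "facility-num": record["syslogfacility"],
        "severity": record["syslogseverity-text"],
        "facility": record["syslogfacility-text"],
        "syslogtag": record["syslogtag"],
        "message": record["msg"],
    })


def bulk_body(records: list) -> str:
    """Build a Bulk API style payload: an action line, then the document."""
    lines = []
    for record in records:
        action = {"index": {"_index": index_name(record["timereported"]),
                            "_type": "events"}}
        lines.append(json.dumps(action))
        lines.append(plain_syslog_doc(record))
    # The Bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample message, for illustration only.
sample = {
    "timereported": "2017-04-20T10:15:30.123456+08:00",
    "hostname": "web01",
    "syslogseverity": 6,
    "syslogfacility": 3,
    "syslogseverity-text": "info",
    "syslogfacility-text": "daemon",
    "syslogtag": "sshd[1234]:",
    "msg": " Accepted publickey for \"deploy\"",
}

print(index_name(sample["timereported"]))
print(bulk_body([sample]), end="")
```

The json.dumps call plays the role that format="json" plays in the template: it escapes quotes and control characters so that a message cannot break the document.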

Then restart rsyslog:

# systemctl restart rsyslog.service
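
After the restart, one way to check that messages are arriving is to ask Elasticsearch how many documents a given day's index holds. The Python sketch below assumes the daily rsyslog-YYYY.MM.DD index naming from the configuration and uses the Elasticsearch _count endpoint; the helper names are hypothetical, and the base URL you pass in depends on your own setup.

```python
import json
import urllib.request
from datetime import date


def count_url(base_url: str, day: date) -> str:
    """Build the _count URL for one day's rsyslog index."""
    return f"{base_url}/rsyslog-{day:%Y.%m.%d}/_count"


def count_docs(base_url: str, day: date) -> int:
    """Ask Elasticsearch how many documents that day's index holds."""
    with urllib.request.urlopen(count_url(base_url, day), timeout=10) as resp:
        return json.load(resp)["count"]
```

For example, count_docs("http://elasticsearch-vip:9200", date.today()) should return a number that grows as new log messages arrive. The index date comes from each message's timestamp, so around midnight or across time zones the newest messages may land in a neighbouring day's index.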

Curator

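Unless you have a lot of storage, you will probably want to delete old logs from the ES cluster. I use Curator for this. The omelasticsearch configuration above creates indices named "rsyslog-YYYY.MM.DD", and you can use that pattern in a Curator action file to delete indices older than a certain number of days. I have a daily scheduled job running on the Nomad cluster that deletes indices older than 30 days.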

# curl  -XGET 'elasticsearch-vip:9200/_cat/indices?v'
health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-apache-2017.04.19 VLdLC0VjQBymLDJDdBjtHw   5   1          7            0    199.7kb         99.8kb
green  open   logstash-apache-2017.04.20 gnUiAG9HSjudsqMvvT4hVA   5   1          7            0    199.6kb         99.8kb
green  open   .kibana                    5c5blDjUSOuXRaMYnx46RQ   1   1          3            1     34.5kb         17.2kb
green  open   rsyslog-2017.04.20         7F-uthdfR82Djpxyry9FFg   5   1      56317            0     24.7mb         12.3mb
green  open   logstash-log-2017.04.20    nIL4b3IxQyaPnWa2oQhbmg   5   1     420380            0    118.6mb         59.3mb
green  open   rsyslog-2017.04.19         zP3H1YzUQpaSBri9rg1Qhg   5   1      24006            0     10.9mb          5.5mb
green  open   logstash-log-2017.04.19    wWcjeyRtSe-uBzLzDtjl-Q   5   1     928212            0    224.5mb        112.3mb
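
The age-based selection can be illustrated with a small Python sketch. This is not Curator's implementation, only the same idea applied to index names like those in the listing above: keep names that match the rsyslog- prefix, parse the date from the name, and pick those older than the cutoff. The function name is hypothetical.

```python
from datetime import date, datetime, timedelta

PREFIX = "rsyslog-"


def indices_older_than(names, days, today):
    """Return the rsyslog-YYYY.MM.DD index names dated before today minus days."""
    cutoff = today - timedelta(days=days)
    expired = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # not one of our daily rsyslog indices
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired


names = ["rsyslog-2017.04.19", "rsyslog-2017.04.20", ".kibana",
         "logstash-log-2017.04.19"]
print(indices_older_than(names, 30, date(2017, 5, 20)))
```

With a 30-day window evaluated on 2017-05-20, only rsyslog-2017.04.19 is selected; the .kibana and logstash-* indices are never touched because they do not match the prefix.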

Performance

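I should also note that this configuration may not perform the way you expect. There are several things to consider: the number of indices, the number of shards and their size, and how these relate to the way the ES cluster evolves over time. I doubt this example would be useful on a large system as it stands; it would need considerable tuning. Please keep that in mind. :)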
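
As a back-of-the-envelope illustration of the shard count, the listing above shows each daily index with 5 primary shards and 1 replica. Assuming those values and the 30-day retention mentioned earlier stay constant (an assumption, since your cluster settings may differ), a short Python calculation gives the number of shards the rsyslog indices alone would occupy:

```python
def total_shards(retention_days: int, primaries: int, replicas: int) -> int:
    """Shards held by daily indices: one index per day, each primary plus its replicas."""
    return retention_days * primaries * (1 + replicas)


# 30 daily indices, 5 primaries each, 1 replica per primary.
print(total_shards(30, 5, 1))  # 300
```

Judging by the listing, each of these shards holds only a few megabytes at this log volume, which is the kind of mismatch between shard count and shard size that would need tuning on a larger system.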

Conclusion

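This is a very simple example of using Elasticsearch. Still, I think it is helpful, because it gets you up and running quickly with logs stored in Elasticsearch, and from there you can learn what to improve on the ES side. That is my plan. :)

Reference

[0] http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
[1] http://www.rsyslog.com/doc/v8-stable/configuration/modules/omelasticsearch.html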