Fork me on GitHub
Suzf  Blog

[Forward] Centralized logging for fun and profit

Originally posted on Centralized logging for fun and profit

Setting up a centralized log server using syslog isn't as hard as many may believe. Whether it's logs from Apache, nginx, email services, or even from your own Python applications having a central log server gives you many benefits:

Benefits to a centralized logs

  • Reduces disk space usage and disk I/O on core servers that should be busy doing something else. This is especially true if you want to log all queries to your database. Doing this on the same disk as your actual database creates a write for every read and an extra write for every write.
  • Removes logs from the server in the event of an intrusion or system failure. By having the logs elsewhere you at least have a chance of finding something useful about what happened.
  • All of your logs are in one place, duh! This makes things like grepping through say Apache error logs across multiple webservers easier than bouncing around between boxes. Any log processing and log rotation can also be centralized which may delay your sysadmin from finally snapping and killing everyone.

Syslog Review

In case you aren't terribly familiar with how syslog works, here's a quick primer. Syslog separates out various logs using two items. Facilities and Levels. Here are the standard facilities:

  • 0 kernel messages
  • 1 user-level messages
  • 2 mail system
  • 3 system daemons
  • 4 security/authorization messages
  • 5 messages generated internally by syslogd
  • 6 line printer subsystem
  • 7 network news subsystem
  • 8 UUCP subsystem
  • 9 clock daemon
  • 10 security/authorization messages
  • 11 FTP daemon
  • 12 NTP subsystem
  • 13 log audit
  • 14 log alert
  • 15 clock daemon
  • 16 local use 0 (local0)
  • 17 local use 1 (local1)
  • 18 local use 2 (local2)
  • 19 local use 3 (local3)
  • 20 local use 4 (local4)
  • 21 local use 5 (local5)
  • 22 local use 6 (local6)
  • 23 local use 7 (local7)

For each facility logs are sent using a particular level, the levels are:

  • 0 Emergency: system is unusable
  • 1 Alert: action must be taken immediately
  • 2 Critical: critical conditions
  • 3 Error: error conditions
  • 4 Warning: warning conditions
  • 5 Notice: normal but significant condition
  • 6 Informational: informational messages
  • 7 Debug: debug-level messages

So for any given log message you set these two options to give a hint as to where the logs should be directed. For example, if an email server receives a new message it would likely be sent as mail.info and a kernel panic would be sent using kern.emerg The receiving syslog server then can be configured to direct log messages of a certain facility and/or log level to various files. For example, a default Ubuntu system has some settings like this:

daemon.*        /var/log/daemon.log
kern.*          /var/log/kern.log
mail.*          /var/log/mail.log

But you can also do more granular separation for example you might want to log mail.err into a separate file from the main mail logs to make it easier to spot new errors with this:

mail.*        /var/log/mail.log
mail.err      /var/log/mail-errors.log

Setting up your central server

Configuring the master log server is pretty easy. On Ubuntu the default syslog server is rsyslog and that's what I'll be using as an example here. You'll need to edit /etc/rsyslog.conf and uncomment the UDP module. You could also use the TCP module, but that one binds to all of your interfaces so you will need to restrict access to it with iptables (or some other mechanism) in order to not allow hackers to fill up your disks remotely. So your configuration should now contain these uncommented lines, where 'x' is an internal protected IP address:

$ModLoad imudp
$UDPServerAddress x.x.x.x
$UDPServerRun 514

And then restart rsyslogd. See that wasn't so hard...

Setting up the remote log sending servers

Setting up your remote servers is even easier. If you want to send ALL of your logs to the central server it's just a matter of adding one line to the top of /etc/rsyslog.d/50-default.conf. That line is:

*.* @x.x.x.x:514

This will send all logs of any facility and any level to the server. Note that the local syslog will, as configured by default, still log locally. So if you don't want that be sure to remove all of the other configuration in this file. You can also get fancy here and keep some logs on the local server and only send some things remotely. For most of your custom apps and logs you'll want to be using the LOCAL[0-9] facilities. Let's say we're going to want to centrally log our Python logs and Apache error logs. We'll be using LOCAL0 and LOCAL1 for them respectively. That config would look like:

local0.* @x.x.x.x:514
local1.* @x.x.x.x:514

Keep in mind however that most systems have *.info, *.debug, etc. configurations setup so you might be duplicating your data. If you poke around this file you'll see lots of configurations ending in .none, this instructs rsyslog to not include those facilities in this particular file. So for example, you'd want to edit your /var/log/syslog to resemble this:

*.*;auth,authpriv,local0,local1.none        /var/log/syslog

Additional help and features

While most applications are easy to setup for use with syslog, here are some pointers for more info on the subject:

  • Apache support sending error logs to syslog via the ErrorLog syslog:local1 configuration option. However, it does not support sending access logs directly. To do that you'll need a small script and pipe your access logs through it.
  • For more information on setting up your own Python code to use syslog, check out the logging.handlers.SysLogHandler handler for the logging module.

We've only really scratched the surface of the features of rsyslog with this setup. You can configure it to do some fairly advanced separation of logs based on the sending host, application name, and other various aspects of the message itself. Refer to the rsyslog documentation for more information on that. Happy Logging!