Using jq with Bro logs

We switched to JSON output from Bro when we started feeding logs into an ELK cluster. While I can still grep the raw logs, it’s a bit ugly. The ever-brilliant Justin Azoff recommended jq to me, and I played around with it a bit today. The current release version of jq is 1.4, which doesn’t seem to have gmtime(). I checked out the git version and that worked, or you can get 1.5rc(whatever) from the releases page. By the way, those of you who know me know that I generally despise the “cat file | something” convention, but I’m using it here. Sorry, not sorry.

If you just want to get the timestamps and convert them to something human-readable (assuming you don’t think in epoch):

$ cat test | jq -c '.ts | gmtime'
[2015,7,2,15,3,56.80596899986267,0,213]
[2015,7,2,15,3,58.65949010848999,0,213]
[2015,7,2,15,5,30.574909925460815,0,213]
$

I used the -c flag so jq wouldn’t print each timestamp’s array one element per line. Still, boring – who wants just slightly-more-readable-than-epoch timestamps? Here are the timestamp, source and destination IPs, and the requested URI:

$ cat test | jq -c '[(.ts | gmtime), ."id.orig_h", ."id.resp_h", .uri]'
[[2015,7,2,15,3,56.80596899986267,0,213],"10.0.0.2","50.56.3.60","/sites/default/files/css/css_7KJjOARp2EdJe7HGme_KJe6Y7Rq6npDiv9Uq6onbQY0.css"]
[[2015,7,2,15,3,58.65949010848999,0,213],"10.0.0.2","50.56.3.60","/sites/all/themes/custom/tbs_v03/css/i/navbar/icon-facebook.png"]
[[2015,7,2,15,5,30.574909925460815,0,213],"10.0.0.2","160.153.16.10","/wordpress/wp-content/plugins/cookie-law-info/css/cli-style.css?ver=1.5.2"]
$

(I was checking the Beer Store hours for this holiday weekend.) The square brackets around the fields collect everything into a single array. The quotation marks around the field names tell jq to treat those keys as literal strings; otherwise it gets angry about the dots in them. The parentheses allow piping the timestamp to the gmtime function.
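
The same quoting also works in jq's bracket form, .["id.orig_h"], which is handy for building objects with friendlier key names. A small sketch, assuming the same HTTP log fields as above; the function name bro_flow_summary and the keys src and dst are hypothetical labels, not anything Bro emits:

```shell
# Hypothetical helper: emit one compact object per log line with renamed keys.
# Reads the named files, or stdin when no file is given.
bro_flow_summary() {
    jq -c '{src: .["id.orig_h"], dst: .["id.resp_h"], uri: .uri}' "$@"
}
```

Usage would be something like `bro_flow_summary test`, or `cat test | bro_flow_summary` if you insist.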

gmtime output is pretty ugly though, and its month field is zero-based (the 7 above is August). todateiso8601 is easier on the eyes:

$ cat test | jq -c '[(.ts | todateiso8601), ."id.orig_h", ."id.resp_h", .uri]'
["2015-08-02T15:03:56Z","10.0.0.2","50.56.3.60","/sites/default/files/css/css_7KJjOARp2EdJe7HGme_KJe6Y7Rq6npDiv9Uq6onbQY0.css"]
["2015-08-02T15:03:58Z","10.0.0.2","50.56.3.60","/sites/all/themes/custom/tbs_v03/css/i/navbar/icon-facebook.png"]
["2015-08-02T15:05:30Z","10.0.0.2","160.153.16.10","/wordpress/wp-content/plugins/cookie-law-info/css/cli-style.css?ver=1.5.2"]
$

Note that the timestamps are in UTC. Depending on your use case, you may want to omit the -c flag – compact output that includes any of the longer fields isn’t much easier to read than the raw JSON logs.
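
If compact JSON is still too noisy, tab-separated output is one option. A minimal sketch, assuming jq 1.5 or later (for @tsv) and the same HTTP log fields as above. The function name bro_http_tsv is made up for illustration, and floor is only there to drop the fractional seconds before formatting:

```shell
# Hypothetical helper: timestamp, source IP, destination IP and URI as TSV.
# -r prints raw strings instead of JSON-quoted ones.
bro_http_tsv() {
    jq -r '[(.ts | floor | todateiso8601), ."id.orig_h", ."id.resp_h", .uri] | @tsv' "$@"
}
```

Called as `bro_http_tsv test`, this should give one line per request that cut, sort and awk can digest directly.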

Using Bro to load balance

If your hardware doesn’t support HLB, or if you for whatever reason don’t want to use that load balancing, I’ve had good success with a Bro configuration that Seth Hall wrote for me. As background, my NIC is (currently) sending a full copy of the input stream to each of a dozen output streams. I’m going to be implementing HLB on my NIC, so I wanted to keep Seth’s hard work around somewhere that it might also do somebody else some good.

This configuration allows for six workers. If you want a different number, change total_lb_procs and the integer at the end of each restrict_filters statement accordingly, adding or removing worker blocks as needed.

event bro_init() &priority=-12
    {
    local total_lb_procs = 6;

    # x - N*(x/N) is x mod N in integer arithmetic, where x is the sum of the
    # low 16 bits of the source and destination addresses.
    if ( Cluster::node == "worker-1" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 0) );
    if ( Cluster::node == "worker-2" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 1) );
    if ( Cluster::node == "worker-3" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 2) );
    if ( Cluster::node == "worker-4" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 3) );
    if ( Cluster::node == "worker-5" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 4) );
    if ( Cluster::node == "worker-6" )
        restrict_filters = table(["lb_filter"] = fmt("(ip[14:2]+ip[18:2]) - (%d*((ip[14:2]+ip[18:2])/%d)) == %d", total_lb_procs, total_lb_procs, 5) );

    PacketFilter::install();
    }
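
To see which worker a given flow would land on, the filter's arithmetic can be replayed outside of Bro. This is only a sketch of the math, assuming IPv4 dotted-quad addresses; the helper names low16 and lb_bucket are hypothetical and not part of Seth's configuration:

```shell
# low16 A.B.C.D -> C*256 + D, the value BPF reads as ip[14:2] (source)
# or ip[18:2] (destination).
low16() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( $3 * 256 + $4 ))
}

# lb_bucket SRC DST NPROCS -> the integer the filter compares against.
lb_bucket() {
    sum=$(( $(low16 "$1") + $(low16 "$2") ))
    # Same expression as the filter: x - N*(x/N), i.e. x mod N.
    echo $(( sum - $3 * (sum / $3) ))
}
```

With six workers, `lb_bucket 10.0.0.2 50.56.3.60 6` prints 2, which is the value worker-3 matches on. Because addition commutes, swapping source and destination gives the same bucket, so both directions of a connection go to the same worker.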

Update 16 September – Seth tells me that this is a terrible way to balance in Bro – he had some problems with this at another high-volume institution. Well, it worked for me. :)

Automated rsync backups with ssh key restrictions

For the first time ever I wanted to write an rsync script to back up a couple of remote servers using an ssh key. Since the key can’t have a passphrase, I wanted to restrict the commands that can be run with it in case of compromise. I’m not going to explain the theory or most of the commands, since you (I) already know.

After some googling, I found this, which was pretty close, but I wanted it here (so I could find it again) and with fewer words. I ripped off the validatersync.sh script wholesale:

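#!/bin/sh

# Reject anything containing shell metacharacters, then allow only
# commands that start with "rsync --server".
case "$SSH_ORIGINAL_COMMAND" in
    *\&*)
        echo "Rejected"
        ;;
    *\(*)
        echo "Rejected"
        ;;
    *\{*)
        echo "Rejected"
        ;;
    *\;*)
        echo "Rejected"
        ;;
    *\<*)
        echo "Rejected"
        ;;
    *\`*)
        echo "Rejected"
        ;;
    *\|*)
        echo "Rejected"
        ;;
    rsync\ --server*)
        # Left unquoted on purpose so the command is split into words.
        $SSH_ORIGINAL_COMMAND
        ;;
    *)
        echo "Rejected"
        ;;
esac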

There are probably some holes in it, but it’s close enough for government work. Then, add to authorized_keys:

from="hostname",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command="/path/to/validatersync.sh" thebackupkey
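
Before trusting the key restriction, the wrapper's pattern matching can be exercised locally without executing anything. A rough sketch: check_cmd is a hypothetical dry-run helper that mirrors the patterns in validatersync.sh but only reports the decision, and the rsync server command in the usage note is illustrative (the exact flags rsync sends vary by version):

```shell
# Hypothetical dry-run version of the wrapper: prints the decision for the
# command string given as $1 instead of running it.
check_cmd() {
    case "$1" in
        *\&*|*\;*|*\|*) echo "Rejected" ;;
        *\(*|*\{*|*\<*) echo "Rejected" ;;
        *\`*) echo "Rejected" ;;
        rsync\ --server*) echo "Allowed" ;;
        *) echo "Rejected" ;;
    esac
}
```

For example, `check_cmd 'rsync --server --sender -logDtpre.iLsf . /home/asdf/'` prints Allowed, while `check_cmd 'rsync --server . /tmp; rm -rf /'` and `check_cmd '/bin/true'` both print Rejected.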

And a sample backup script:

#!/bin/bash
LOGF=/path/to/LogFile
echo "Starting rsync at $(date)" >> "${LOGF}"
/usr/bin/rsync -q -a --delete -e "ssh -i /the/.ssh/backup_key" userid@remote:/home/asdf/ asdf/
echo "Finished at $(date)" >> "${LOGF}"

Call that in cron and you (I) should be good to go.
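
Since cron runs unattended, it may help to record whether rsync actually succeeded and not only when it finished. One possible sketch, where run_logged is a hypothetical helper and not part of the script above:

```shell
# Hypothetical wrapper: run_logged LOGFILE COMMAND [ARGS...]
# Logs start time, finish time and the command's exit status, then
# returns that status to the caller.
run_logged() {
    logf=$1
    shift
    echo "Starting at $(date)" >> "$logf"
    "$@"
    status=$?
    echo "Finished at $(date) with exit status ${status}" >> "$logf"
    return "$status"
}
```

In the backup script this would be called as `run_logged /path/to/LogFile /usr/bin/rsync -q -a --delete -e "ssh -i /the/.ssh/backup_key" userid@remote:/home/asdf/ asdf/`, and a non-zero status in the log is the cue to go look.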

ETA: you might get “protocol mismatch” errors from rsync. TFM will tell you it’s because there’s output from your shell. TFM might be wrong. I’m still getting this error from one host I’m doing this with, but not the other. Since both are FreeBSD 8.4 machines, I’m somewhat mystified. Anyway, this might be enough to get started.