TriLUG

dnsmasq + unbound

Recently, our local Linux Users Group was talking about DNS servers.  Some folks in the group claimed that their ISP’s DNS servers were very slow.

In a group like this, there is usually a camp that are strong supporters of running BIND.  Somehow, I have never been able to wrap my head around BIND.  Instead, I have been using dnsmasq.  These two packages are very different.

BIND

BIND is a fully recursive DNS resolver.  When you look up a name like “www.cnn.com”, it goes to “com” to ask who “cnn” is, and then it goes to “cnn.com” to ask who “www.cnn.com” is.  BIND has a steep learning curve, and that has always discouraged me from really tinkering with it.  It also misses a very important point that my home network needs — local name resolution of DHCP-assigned addresses.

dnsmasq

Dnsmasq is more of a caching DNS server for a local network.  It has a built-in DHCP server, so devices on my home network get their addresses from dnsmasq.  When I make a DNS request, dnsmasq looks in its local DHCP table first.  For example, if I want to talk to another device in the same room, like a Roku or a printer, dnsmasq knows the addresses of the local devices and it responds immediately.  If the request is not a local name, it simply passes on the request to some other name server… maybe your ISP’s, or maybe a free server like OpenDNS or Google’s 8.8.8.8.  Dnsmasq caches all DNS requests, so if you make repeated requests to the same site, they are answered pretty quickly.

I really like dnsmasq.

It is super flexible, and you configure it through a single configuration file which is super easy to understand.  In fact, many home routers use dnsmasq under the hood.

unbound

But during the discussion in our LUG, someone mentioned unbound, another fully recursive DNS server that is super easy to set up.  So I had to try it out.  It did not disappoint.

My Setup

So how do these two tools work together?

Actually, it’s quite elegant.  Dnsmasq listens on port 53 of all addresses on my router.  It is the primary DNS server for all machines on my local network.  If the request is for a local device, then it fills the request immediately.  But if the request is for some site on the internet, then it passes the request off to unbound, which is also running on the router, but listening on a different address/port combination.

Here is how I configured dnsmasq.

/etc/dnsmasq.conf

# --- DNS ----------------------------

# Be a good netizen, keep local stuff local.
domain-needed
bogus-priv
filterwin2k

# Do not listen on "all" interfaces and just filter.
bind-interfaces
# Listen on port 53 on in-home network (eth1) and localhost (lo).
# Do not listen on internet interface (eth0).
interface=lo
interface=eth1

# Upstream servers are not listed in resolv.conf, they are listed here.
no-resolv
server=127.0.0.1#10053 # unbound

# Add this domain to all simple names in the hosts file.
# (Also sets the domain (15) option for DHCP).
expand-hosts
domain=home.alanporter.com

# Special treatments for some domains and hosts.
local=/local/ # dnsmasq handles these itself
server=/alanporter.com/69.93.127.10 # look up via ns1.linode.com
address=/doubleclick.net/127.0.0.1 # return this address immediately
address=/sentosa.us/108.161.130.139 # return this address immediately
cname=oldname.home.alanporter.com,newname.home.alanporter.com

# Logging
log-queries
log-facility=local1

# Caching
cache-size=1000

# --- DHCP ---------------------------

dhcp-range=FunkyNet,172.31.1.100,172.31.1.199,10m
dhcp-option=FunkyNet,1,255.255.255.0 # subnet mask - 1
dhcp-option=FunkyNet,3,172.31.1.1 # default router - 3
dhcp-option=FunkyNet,6,172.31.1.1 # DNS server - 6
dhcp-option=FunkyNet,15,home.alanporter.com # domain name - 15
dhcp-option=FunkyNet,28,172.31.1.255 # broadcast address - 28

dhcp-leasefile=/var/lib/dnsmasq.leases
read-ethers

# reserved names and addresses
dhcp-host=d8:5d:4c:93:32:41,chumby
dhcp-host=00:50:43:00:02:02,sheeva,172.31.1.3,10m

# --- PXE ----------------------------

dhcp-boot=pxelinux.0,bender,172.31.1.1

So dnsmasq listens on the local network for requests, answers what it can: local DHCP addresses, cached addresses and special overrides from the config file. And anything it can’t handle itself, it sends on upstream to unbound.

/etc/unbound/unbound.conf

server:
    # perform cryptographic DNSSEC validation using the root trust anchor.
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    # listen on local network, allow local network access
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    # NOT listening on IPv6
    # interface: ::1
    # access-control: ::1 allow
    port: 10053
    # logging
    chroot: ""
    logfile: "/var/log/unbound.log"
    log-time-ascii: yes
    log-queries: yes
    verbosity: 2

As you can see, unbound does not require much configuration.

Notice that I am NOT listening on the IPv6 interface. It turns out, there is no need. Dnsmasq listens on both, and it forwards A requests and AAAA requests to unbound over an IPv4 connection on the local “lo” adaptor.

How it stacks up

So how well does this setup work? Are there advantages or disadvantages to using dnsmasq and unbound together?

Disadvantages

I tested this setup using “namebench“, a Google “20 percent” project that measures DNS lookup times. It told me that Google’s public DNS (8.8.8.8) was 250% faster than my in-home DNS. Furthermore, it said I would be better off using my ISP’s DNS servers. I am guessing that this is because these larger DNS servers cache a much larger pool of addresses, bypassing full recursive lookups of most common names.

Advantages of dnsmasq + unbound

If my setup is slower than using a single upstream DNS, then why should I run mine this way? I have a few reasons.

  • First and foremost, I learn a lot about DNS this way.
  • But also worth considering, ISP nameservers are notoriously flaky. Just because the ISP beat my nameserver on a single test, that does not mean it will always do so. That’s like comparing the bus to driving your own car… it might be better sometimes, but really bad other times.
  • One compelling reason to run a recursive DNS server like unbound is that you know you’re getting the right answer. When you use an ISP’s DNS server, they may hijack some domains and give you an incorrect answer on purpose. For example, they may censor content, and return a bogus landing page address for addresses that are on their black list. OpenDNS touts this as a feature… it is more “family-friendly” than raw DNS.
  • If you’re the tinfoil hat type, you might not want to use a DNS service from someone like Google, who makes their money from knowing more about your browsing habits than you do. Or from your ISP, who is always trying to up-sell you with something.

Advantages of dnsmasq + any upstream DNS

  • Dnsmasq (whether I use an upstream DNS or unbound) gives me control over how stuff is looked up. For example, when I was working on a new web site, I could tell dnsmasq to use the hosting company’s DNS for that one domain, so I did not have to wait for caches to expire between me and the host.
  • Dnsmasq caches lookups. Actually, unbound does, too. I am still playing with both.
  • Dnsmasq make switching DNS providers really easy. Say your ISP’s nameservers are acting up… just change one line in dnsmasq.conf and start getting results from somewhere else.

South East Linux Fest

I enjoyed a “Geekin’ Weekend” at South East Linux Fest in Spartanburg SC.

Four of us from TriLUG (Kevin Otte, Jeff Shornick, Bill Farrow and myself) packed into the minivan and made a road trip down to South Carolina on Friday.  We got there in time to see some of the exhibits and a few of the Friday afternoon sessions.  There was some light mingling in the evening, and then we all wrapped it up to prepare for the big day ahead.

On Saturday, we had breakfast with Jon “Maddog” Hall before he gave the keynote on how using open source can help create jobs.  The day was filled with educational sessions.  I attended ones on SELinux, Remote Access and Policy, FreeNAS, Arduino hacking, and Open Source in the greater-than-software world.  We wrapped it up with a talk on the many ways that projects can FAIL (from a distro package maintainer’s view).  But the night was not over… we partied hard in the hotel lounge, rockin’ to the beats of nerdcore rapper “Dual Core”, and then trying to spend the complimentary drink tickets faster than sponsor Rackspace could purchase them.

Sunday was much slower paced, as many had left for home and many others were sleeping off the funk of the previous night’s party.  But if you knew where to be, there were door prizes to be scored.  I ended up with a book on Embedded Linux.

It was a memorable weekend, for sure.  We learned a lot of new tech tricks, and we enjoyed hanging out with the geeks.

Back to the Future

A few days ago, I learned a very important lesson about filesystems and snapshots. I learned that a complete copy is not always a Good Thing™.

I help manage a server for our local Linux Users Group. We have about 250 users on the system, and all of our system administration is done by volunteers.

A few months ago, I made a complete backup of our /home partition using the guidelines that have been told to me by Smart People™:

  • make a snapshot volume of /home (called home-snap)
  • make a new empty volume (called home-backup)
  • use ‘dd‘ to copy from home-snap to home-backup
  • remove the home-snap snapshot volume

All was fine, until a few months later, when we decided to reboot.

When the machine rebooted, it mounted the WRONG copy of /home. It looked in /etc/fstab to see what to mount, read the UUID, and started looking for that filesystem among the logical volumes.

Here’s a list of the available filesystems and their UUID’s.

root@pilot:~# blkid
/dev/mapper/vg01-home: UUID="1a578e6f-772b-4892-86e3-1181aadda119" TYPE="ext3" SEC_TYPE="ext2"
/dev/mapper/vg01-home-backup: UUID="1a578e6f-772b-4892-86e3-1181aadda119" TYPE="ext3" SEC_TYPE="ext2"
/dev/mapper/vg01-swap: TYPE="swap" UUID="303f2743-da69-466b-a200-40a1a369fa1c"
/dev/mapper/vg01-u804: UUID="b5689a93-b7ad-4011-a0f9-ffaf2d68bf6f" TYPE="ext3"
/dev/sdb: UUID="Uh0TI1-pxD4-M1Pm-5kP3-zU1a-IRgm-bD0JAq" TYPE="lvm2pv"
/dev/sda: UUID="9oZhBo-3DPP-1eay-kgGM-fd06-yuJB-c2eCo7" TYPE="lvm2pv"
/dev/sdc1: UUID="5c15308e-a81b-4fd9-b2c2-7ef3fe39ce0b" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdc2: TYPE="swap" UUID="08c55fa5-3379-4f6a-b798-4b8f3ead6790"
/dev/sdc3: UUID="5a544a7f-90ed-474c-b096-1b5929c83109" SEC_TYPE="ext2" TYPE="ext3"
root@pilot:~#

Notice anything goofy? Yes, the UUID for the home volume is the same as the UUID for the home-backup volume! Of course it is… I used ‘dd‘ to copy the entire volume!

So our machine booted up, looked for a filesystem whose UUID was ‘1a578e6f-772b-4892-86e3-1181aadda119’ and it mounted it on /home. Unfortunately, it found the home-backup volume before it found the real home volume, and so our 250 users took a step back in time for the evening.

All of the files in our home directories looked like they did back in May.

On the surface, this does not seem like such a Bad Thing™. But over the course of the next few hours, users started receiving email, and logging IRC chats, and doing all of the other things that users do. These new emails and log files were written to home-backup instead of home, and so now we were starting to mix old and new files.

This is a lot like the movie “Back to the Future”, when Marty’s mom tries to kiss him. Except the characters involved here are not as good-looking.

The fix was quick and painless. I simply generated a new UUID for the home-backup volume, and then rebooted. The magic command is simply:

 tune2fs -U random /dev/mapper/vg01-home-backup

But the cleanup would come later. If someone were interested in the emails or log files that were mistakenly written to the wrong volume (their “past life”), then they would need to look on that volume for “new” files. Pretty easy work.

find /mnt/home-backup/porter -mtime -7

This will show all files in my “backup” home directory that are less than a week old. Since the backup was made four months ago, I would expect all files in that directory to either be more than four months old, or just one day old. This command will show you the new files.

So I am revising the backup procedure as follows:

  • make a snapshot volume of /home (called home-snap)
  • make a new empty volume (called home-backup)
  • use ‘dd‘ to copy from home-snap to home-backup
  • remove the home-snap snapshot volume
  • change the UUID on home-backup ◄— new

In fact, now that we already have a base to work with, I might just use rsync to copy files instead of dd to copy the entire volume. This will leave the backup with its own UUID, and will avoid collisions like the one we saw.

Pwn3d

I just spent the entire weekend re-building a server for the Triangle Linux Users Group.

We first noticed that something was wrong when the machine stopped responding over the network. A couple of our admins took a trip to the data center and noticed that we had a firehose of data on port 6667 (an IRC port), originating from a process owned by the “apache” user.

So we’d been pwned. Now what?

We figured the best way to proceed would be a complete re-install of the operating system. I happened to be free the next day, so I was volunteered to lead in the clean-up duty.

So I drove out to the data center to camp out in the cold air conditioning for a while. I saved away the old infected partitions (we use LVM) and I allocated new space for the fresh install. After I had the OS installed and responding over the network, I went home to finish. I worked frantically over the weekend to restore many of the services that we enjoyed. My priorities were clearly restoring our 250 user accounts and then getting email working (securely). In the process, I gave myself a crash course in LDAP, since that is what we use for user authentication.

Within about 48 hours, we had everything restored except our web pages. After all, we knew the break-in had allowed someone to create a rogue process owned by apache. So we must have had some problem with one of our web-based applications. We did not know whether it was our Drupal-based web page, our web mail client, our wiki, a user application, or something else.

I dug through the log files on the infected partitions, and soon it became apparent that there was a cron job set to run every minute, owned by the ‘apache’ user. The script simply looked to see if its IRC program was running, and if any part of it was damaged or deleted, it would reinstall a new copy of itself somewhere else on the disk… somewhere no one would look, like /var/tmp/.s/something.

Finally, the apache error logs showed what the problem was. It seems that we were running an unpatched version of “RoundCube“, a web-based IMAP e-mail client with a nice AJAX interface. There is a vulnerability in this package that allows a visitor to upload a package to your web server and then run their programs on your server.

Fortunately, the process runs as the “apache” user, and not as “root”. Otherwise, the rogue software would have had permission to do a lot more damage than it actually did. As it stands, the bot simply chatted with a lot of other infected machines. Thankfully, it did not seem interested in the files on our machine.

I learned a lot from this experience. As one admin said, the forced cleanup was a “much-needed enema”, something we had avoided for a long time. As a shared system, system administration was something that was handled by a loose group, and was handed off to new members every year. This break-in was enough to attract our attention, but it was not destructive. And it inspired us to simplify our existing system. And it inspired me to set up nightly backups.

Nerds of a feather

My first exposure to computers was in 1981, when my neighbor “Howdy” (Howard) Petree showed me his family’s TRS-80 Color Computer. His dad gave me some sage advice: “do whatever you want to… you’re not going to break it”. I wrote a simple game called “Al-Zap”, which led the player through a series of scenarios, each followed by three choices: “(1) Eat it, (2) Shoot it, (3) Run away”. I kept the program on three hand-written pages on a note pad, and I manually re-entered it when I wanted to work on it some more.

My interest in computers continued, but I could not go bug Howdy every time I had the urge to tinker. That’s when my friend Greg Reid told me that the public library in downtown Winston-Salem had a lab with four Apple II computers. So my early years of computing were primarily spent hacking on the Apple II’s. Eventually, my dad bought one for our family.

The rest, as they say, is history.

This week, Jeff Mercer from the Triangle Linux User Group offered a working Apple II computer to whoever would come and take it off of his hands. I took Jeff’s offer, and I hooked the old computer up so I could show the girls what “old school” computing was like.

Audrey and I did a little bit of tinkering with Applesoft BASIC, and then I gave her an assignment: to print out a multiplication table. She worked on her FOR/NEXT loops, and soon she had a very nice looking 10×10 table of numbers.

I am very proud of her accomplishment, and even more proud that she took such an interest in her daddy’s past.

Go to Top