Overview

    This manual attempts to cover things that don't logically fit in the man
    pages. It covers the big-picture items and overall design It's a cross
    between a manual and a design document at the moment, although perhaps
    those functions can be split up at a later date.

    This manual is not intended to be an exhaustive reference. For a
    complete rundown of every configuration and commandline option and its
    precise technical meaning, see the man pages.

Portability Testing

    Portability data and testing is somewhat out of date at this time.

    Modern Linux/x86-64 is the primary development and deployment platform,
    and of course the x86 architecture there generally receives extensive
    testing as well.

    Compatibility with the open source *BSD distributions is important,
    and bug reports are welcome for any breakage there.  Unfortunately
    the author doesn't use these regularly, so portability mistakes may
    creep in that need reporting.  In the case of FreeBSD there is a
    working port available, and I believe people have successfully built
    and run the code on NetBSD and OpenBSD in the recent past as well.

    Through the official Debian build system for the gdnsd package there,
    gdnsd gets some testing on exotic CPU architectures, and generally
    shouldn't have issues with any of the well-supported Debian target
    architectures.

    I do some development on Macs, so portability to reasonably-recent
    releases of MacOS X is usually good, but optimization on this platform
    for production use isn't a priority.

    In the past I've successfully tested portability to other relatively
    exotic-yet-POSIXy targets like Linux/mips32r2 and OpenSolaris (with
    Sun's compiler), but those results are very outdated at this time.

    I take absolutely no interest in portability to Microsoft platforms,
    and would probably reject pull requests for it if they add significant
    noise and/or complexity to the codebase.  It's simply not worth it.

Overall Design

  Configuration

    The configuration file's basic syntax is handled by "vscf", which parses
    a simple and clean configuration syntax with arbitrary structural depth
    in the form of arrays and hashes. At one time this was a separate
    library, but it has been bundled back into gdnsd's distribution at this
    point. Details of the configuration options are in the man page
    gdnsd.config(5).

  Threading

    The gdnsd daemon uses pthreads to maximize performance and efficiency,
    but they don't contend with each other on locks at runtime (assuming
    gdnsd is compiled with userspace-rcu support), and no more
    than one thread writes to any shared memory location. Thread-local
    writable memory is malloc()'d within the writing thread and the address
    is private to the thread.

    There are 3 singleton threads in gdnsd that handle specific functional
    roles: main, zone data, and monitoring.

    The "main" thread is the original thread of execution in the daemon.
    While it handles a great deal of diverse things during the startup
    process, at runtime it does absolutely nothing but watch for external
    signals via sigwait() (such as TERM, INT, USR1) and act on them.  It
    is the only thread that runs with signals unblocked.

    The "zone data" thread handles the runtime-reloading of zone data
    from providers.  Depending on the provider and configuration, these
    could be automatic and/or triggered on SIGUSR1.  When not actively
    checking for new zone data and/or reloading it, this thread should
    mostly be idle.

    The "monitoring" thread handles all monitoring duties for the monitoring
    plugin system.  Currently it also handles inbound traffic to the
    built-in pseudo-HTTP service to serve stats data.

    Aside from these 3 singleton threads, a single thread is spawned
    for every configured DNS listening socket.  If the options
    'udp_threads' or 'tcp_threads' are set to values greater than one,
    multiple sockets are created at the same socket address via
    SO_REUSEPORT and a separate thread attaches to each.

    The TCP DNS threads use a libev event loop to handle traffic for
    all connections on the given socket.  The UDP DNS threads use a tight
    loop over the raw send and receive calls for the given socket.

    All of the code executed in the UDP threads at runtime is carefully
    crafted to avoid all syscalls (other than the necessary send/recv
    ones) and other expensive or potentially-blocking operations (e.g.
    locks and dynamic memory allocation).  These threads should never
    block on anything other than their send/recv calls, and should execute
    at most 2 syscalls per request (significantly less under heavy traffic
    loads if Linux sendmmsg() support is compiled in and detected at
    runtime).

    The TCP code shares the efficient core DNS parsing and response
    code of the UDP threads, but it does use dynamic memory allocation
    and a plethora of per-request syscalls (some via the eventloop library)
    at the TCP connection-handling layer.

  Statistics

    The DNS threads keep reasonably detailed statistical counters of all of
    their activity. The core dns request handling code that both the TCP and
    UDP threads use tracks counters for all response types. Mostly these
    counters are named for the corresponding DNS response codes (RCODEs):

    refused
        Request was refused by the server because the server is not
        authoritative for the queried name.

    nxdomain
        Request was for a non-existant domainname. In other words, a name
        the daemon is authoritative for, but which does not exist in the
        database.

    notimp
        Requested service not implemented by this daemon, such as zone
        transfer requests.

    badvers
        Request had an EDNS OPT RR with a version higher than zero, which
        this daemon does not support (at the time of this writing, such a
        version doesn't even exist).

    formerr
        Request was badly-formatted, but was sane enough that we did send
        a response with the rcode FORMERR.

    dropped
        Request was so horribly malformed that we didn't even bother to
        respond (too short to contain a valid header, unparseable question
        section, QR (Query Response) bit set in a supposed question, TC bit
        set, illegal domainname encoding, etc, etc).

    noerror
        Request did not have any of the above problems.

    v6  Request was from an IPv6 client. This one isn't RCODE based, and is
        orthogonal to all other counts above.

    edns
        Request contained an EDNS OPT-RR. Not RCODE-based, so again
        orthogonal to the RCODE-based totals above. Includes the ones that
        generated badvers RCODEs.

    edns_client_subnet
        Subset of the above which specified the edns_client_subnet option.

    The UDP thread(s) keep the following statistics at their own level of
    processing:

    udp_reqs
        Total count of UDP requests received and passed on to the core DNS
        request handling code (this is synthesized by summing all of the
        RCODE-based stat counters above for the UDP threads).

    udp_recvfail
        Count of UDP recvmsg() errors, where the OS indicated that something
        bad happened on receive. Obviously, we don't even get these
        requests, so they can't be processed and replied to.  We also count
        it as udp_recvfail (and do not process the request) if the recvmsg()
        call succeeds but the client used an illegal source port of zero.

    udp_sendfail
        Count of UDP "sendmsg()" errors, which almost definitely resulted in
        dropped responses from the client's point of view.

    udp_tc
        Non-EDNS (traditional 512-byte) UDP responses that were truncated
        with the TC bit set.

    udp_edns_big
        EDNS responses where the response was greater than 512 bytes (in
        other words, EDNS actually did something for you size-wise)

    udp_edns_tc
        EDNS responses where the response was truncated and the TC bit
        set, meaning that the client's specified edns buffer size was too
        small for the data requested in spite of EDNS.

    The TCP threads also count this stuff:

    tcp_reqs
        Total count of TCP requests (again, synthesized by summing the
        RCODE-based stats for only TCP threads).

    tcp_recvfail
        Count of abnormal failures in recv() on a DNS TCP socket, including
        ones where the sender indicated a payload larger than we're willing
        to accept.

    tcp_sendfail
        Count of abnormal failures in send() on a DNS TCP socket.

    These statistics are tracked in per-thread structures. The actual data
    slots are uintptr_t, which helps with rollover on 64-bit machines.

    The monitoring thread reports the statistics in two different ways.
    The first is via syslog every log_stats seconds (default 3600), as well
    as always at exit time. The other is via an embedded HTTP server which
    listens by default on port 3506. The HTTP server can give the data in
    html, json and csv formats.  All of the stats reporting code is in
    statio.c.

  Truncation Handling and other related things

    gdnsd's truncation handling follows the simplest valid set of truncation
    rules. That is: it drops whole RR sets (without setting the TC bit) in
    the case of being unable to fit all the desirable additional records
    into the Additional section, and in the case that Answer or Authority
    records don't fit, it returns an empty (other than perhaps an EDNS OPT
    RR) packet with the TC bit set. The space for the EDNS OPT RR is
    reserved from the start when applicable, so it will never be elided to
    make room for other records. Nameserver address RRs for delegation glue
    are considered part of the required set above (i.e. if they don't fit,
    the whole packet will be truncated w/ TC, even though they go in the
    Additional section).

    Also, by default, unnecessary lists of NS records in the Authority
    section are left out completely (they're really only necessary for
    delegation responses). This behavior can be reversed (to always send
    appropriate NS records even when not strictly necessary) via the
    include_optional_ns option.

Security

    Any public-facing network daemon has to consider security issues. While
    the potential will always exist for gdnsd to contain stupid buffer
    overflow bugs and the like, I believe the code to be reasonably secure
    by design.

    The compiler/linker flags set up by the autotools config in the source
    tree default to turning on all of the relevant security hardening flags
    I'm aware of for the GNU toolchain if they seem to be supported at build
    time.  This can be disabled via --without-hardening if you'd like to
    supply different/conflicting ones, or to aid in debugging/analysis.

    I regularly audit the code as best I can, both manually and with tools
    like valgrind, clang-analyzer, coverity, cppcheck, etc to look for
    stupid memory (or other) bugs. Another point in its favor is the fact
    that, being a purely authoritative server, gdnsd has no reason to
    believe anything anyone else on the network has to say about anything.
    This eliminates entire classes of attacks related to poisoning and the
    like.  gdnsd never sends DNS queries (even indirectly via
    gethostbyname()) to anyone else. It's a DNS server, not a DNS client.

    Perhaps more importantly, gdnsd doesn't trust itself to be root on your
    machine. Any time gdnsd is started as root, it will drop privileges to
    those of the user named "gdnsd" (configurable) before it begins
    receiving network traffic.  This cannot be disabled, and the user isn't
    allowed to have root's uid or gid either.

    If any security-related operation fails, the daemon will fail to start
    itself and abort with a log message indicating the problem.

    The code used to directly support "chroot()". This was removed during
    the development cycle leading to version 2.0. The rationales for removal
    included: "chroot()" wasn't the best tool to begin with (it doesn't
    limit as many things as we'd like). Direct "chroot()" support added
    complexities in many areas of the code and documentation. Systems
    (especially auth DNS servers) tend to be special-purpose rather than
    shared these days. Also, "chroot()" is largely supplanted by much better
    mechanisms on most target hosts which should be configured externally
    (e.g. FreeBSD jails, systemd service options, AppArmor, SELinux, LXC
    containers, actual VMs, etc). For that matter, a distribution packager
    or system admin can still do basic chroot through their init
    scripts/system, and probably set it up better with regard to mounting
    special directories to make library calls work right, etc.

Copyright and License

    Copyright (c) 2014 Brandon L Black <blblack@gmail.com>

    This file is part of gdnsd.

    gdnsd is free software: you can redistribute it and/or modify it under
    the terms of the GNU General Public License as published by the Free
    Software Foundation, either version 3 of the License, or (at your
    option) any later version.

    gdnsd is distributed in the hope that it will be useful, but WITHOUT ANY
    WARRANTY; without even the implied warranty of MERCHANTABILITY or
    FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
    more details.

    You should have received a copy of the GNU General Public License along
    with gdnsd. If not, see <http://www.gnu.org/licenses/>.
