BIND exiting and shutting down for no apparent reason - random RCODE REFUSED

I have been seeing random BIND shut down and restart on Virtualmin GPL install. The shut down is always preceded by some error but random domain or IP entry. these IP addresses and domains are not on my server. The code message is often “error (unexpected RCODE REFUSED) resolving to some domain/IP”

If anyone can point me in the correct direction for finding the cause. I am assuming there is a bad setting somewhere. I am running CentOS6 with Virtualmin GPL

A sample log of a shutdown

Sep 25 06:01:06 ip-68-178-130-21 named[20231]: error (unexpected RCODE REFUSED) resolving '49.142.218.222.in-addr.arpa/PTR/IN': 202.103.225.70#53 Sep 25 06:01:07 ip-68-178-130-21 named[20231]: error (unexpected RCODE REFUSED) resolving '49.142.218.222.in-addr.arpa/PTR/IN': 202.103.224.69#53 Sep 25 06:01:09 ip-68-178-130-21 named[20231]: error (unexpected RCODE REFUSED) resolving '49.142.218.222.in-addr.arpa/PTR/IN': 202.103.224.69#53 Sep 25 06:01:09 ip-68-178-130-21 named[20231]: error (unexpected RCODE REFUSED) resolving '49.142.218.222.in-addr.arpa/PTR/IN': 202.103.225.70#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: shutting down Sep 25 06:02:17 ip-68-178-130-21 named[20231]: stopping command channel on 127.0.0.1#953 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: stopping command channel on ::1#953 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: no longer listening on ::#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: no longer listening on 127.0.0.1#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: no longer listening on 68.178.130.21#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: no longer listening on 68.178.131.108#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: no longer listening on 50.62.34.152#53 Sep 25 06:02:17 ip-68-178-130-21 named[20231]: exiting Sep 25 06:07:46 ip-68-178-130-21 named[29722]: starting BIND 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6.3 -u named Sep 25 06:07:46 ip-68-178-130-21 named[29722]: built with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-libtool' '--localstatedir=/var' '--enable-threads' '--enable-ipv6' '--with-pic' '--disable-static' '--disable-openssl-version-check' '--with-dlz-ldap=yes' '--with-dlz-postgres=yes' '--with-dlz-mysql=yes' '--with-dlz-filesystem=yes' '--with-gssapi=yes' '--disable-isc-spnego' '--with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets' '--enable-fixed-rrset' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' 'CPPFLAGS= -DDIG_SIGCHASE' Sep 25 06:07:46 ip-68-178-130-21 named[29722]: ---------------------------------------------------- Sep 25 06:07:46 ip-68-178-130-21 named[29722]: BIND 9 is maintained by Internet Systems Consortium, Sep 25 06:07:46 ip-68-178-130-21 named[29722]: Inc. (ISC), a non-profit 501(c)(3) public-benefit Sep 25 06:07:46 ip-68-178-130-21 named[29722]: corporation. Support and training for BIND 9 are Sep 25 06:07:46 ip-68-178-130-21 named[29722]: available at https://www.isc.org/support Sep 25 06:07:46 ip-68-178-130-21 named[29722]: ---------------------------------------------------- Sep 25 06:07:46 ip-68-178-130-21 named[29722]: adjusted limit on open files from 1024 to 1048576 Sep 25 06:07:46 ip-68-178-130-21 named[29722]: found 16 CPUs, using 16 worker threads

Another BIND Shutdown was this one - these IP addresses and domains are not on my server:

Sep 25 05:46:25 ip-68-178-130-21 named[22235]: client 122.136.196.116#63897: query (cache) 'aa3247.com/A/IN' denied Sep 25 05:46:25 ip-68-178-130-21 named[22235]: client 122.136.196.116#3209: query (cache) 'aa3247.com/A/IN' denied Sep 25 05:46:25 ip-68-178-130-21 named[22235]: shutting down Sep 25 05:46:25 ip-68-178-130-21 named[22235]: stopping command channel on 127.0.0.1#953 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: stopping command channel on ::1#953 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: no longer listening on ::#53 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: no longer listening on 127.0.0.1#53 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: no longer listening on 68.178.130.21#53 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: no longer listening on 68.178.131.108#53 Sep 25 05:46:25 ip-68-178-130-21 named[22235]: no longer listening on 50.62.34.152#53 Sep 25 05:46:27 ip-68-178-130-21 named[22235]: exiting

Howdy,

Are you by chance using an OpenVZ-based VPS?

If so, can you paste in the contents of your /proc/user_beancounters file?

-Eric

This time BIND just went down twice in a row but with yet another error code:

Sep 25 09:32:38 ip-68-178-130-21 named[29722]: error (unexpected RCODE SERVFAIL) resolving '228.195.58.213.in-addr.arpa/PTR/IN': 2001:67c:e0::6#53 Sep 25 09:35:16 ip-68-178-130-21 php-cgi: PHP Warning: Module 'mcrypt' already loaded in Unknown on line 0 Sep 25 09:35:19 ip-68-178-130-21 named[29722]: client 122.136.196.116#29758: query (cache) 'aa3247.com/A/IN' denied Sep 25 09:35:33 ip-68-178-130-21 named[29722]: error (unexpected RCODE REFUSED) resolving 'ns.bta.net.cn/A/IN': 202.106.196.233#53 Sep 25 09:35:33 ip-68-178-130-21 named[29722]: error (unexpected RCODE REFUSED) resolving 'ns.bta.net.cn/AAAA/IN': 202.106.196.233#53 Sep 25 09:35:33 ip-68-178-130-21 named[29722]: error (unexpected RCODE REFUSED) resolving 'ns2.bta.net.cn/A/IN': 202.106.196.233#53 Sep 25 09:35:33 ip-68-178-130-21 named[29722]: error (unexpected RCODE REFUSED) resolving 'ns2.bta.net.cn/AAAA/IN': 202.106.196.233#53 Sep 25 09:35:56 ip-68-178-130-21 named[29722]: error (connection refused) resolving '130.36.99.91.psbl.surriel.com/A/IN': 82.94.250.75#53 Sep 25 09:37:41 ip-68-178-130-21 named[29722]: error (unexpected RCODE SERVFAIL) resolving '228.195.58.213.in-addr.arpa/PTR/IN': 193.0.9.6#53 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: shutting down Sep 25 09:38:21 ip-68-178-130-21 named[29722]: stopping command channel on 127.0.0.1#953 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: stopping command channel on ::1#953 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: no longer listening on ::#53 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: no longer listening on 127.0.0.1#53 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: no longer listening on 68.178.130.21#53 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: no longer listening on 68.178.131.108#53 Sep 25 09:38:21 ip-68-178-130-21 named[29722]: no longer listening on 50.62.34.152#53 Sep 25 09:38:22 ip-68-178-130-21 named[29722]: exiting

Yes it is.

lockedpages 0 0 9223372036854775807 9223372036854775807 0 privvmpages 799349 840633 1572864 1572864 0 shmpages 1206 1206 9223372036854775807 9223372036854775807 0 dummy 0 0 9223372036854775807 9223372036854775807 0 numproc 167 167 32567 32567 0 physpages 638774 678198 9223372036854775807 9223372036854775807 0 vmguarpages 0 0 786432 9223372036854775807 0 oomguarpages 638775 678199 9223372036854775807 9223372036854775807 0 numtcpsock 48 51 9223372036854775807 9223372036854775807 0 numflock 171 172 9223372036854775807 9223372036854775807 0 numpty 1 1 255 255 0 numsiginfo 6 7 1024 1024 0 tcpsndbuf 1465696 1520640 9223372036854775807 9223372036854775807 0 tcprcvbuf 786432 835584 9223372036854775807 9223372036854775807 0 othersockbuf 282656 565536 9223372036854775807 9223372036854775807 0 dgramrcvbuf 0 9344 9223372036854775807 9223372036854775807 0 numothersock 201 207 9223372036854775807 9223372036854775807 0 dcachesize 2355922 2400994 9223372036854775807 9223372036854775807 0 numfile 18951 18960 9223372036854775807 9223372036854775807 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 numiptent 14 14 9223372036854775807 9223372036854775807 0

I don’t think the error message and the exiting is directly related. In your first example, there’s over a minute between these events. Can you check how much time passed between each of these incidences?

I’d also think that BIND is hitting some kind of resource limit. You might check the syslog for further errors, maybe OOM messages. Short of a programming bug, it should not simply exit like that. You might check with “tcpdump udp port 53” what the load of your BIND is.

Looks like about 20 items per second - not sure what is normal.

This happens about 4 times a day, but at no particular time. I’ve looked thru the logs, but the only consistency I see is right before BIND closes, there is always an error message before such as:

error (unexpected RCODE REFUSED) error (unexpected RCODE SERVFAIL) error (unexpected RCODE REFUSED)

Sometimes there can be 30-50 error messages within a few minutes all on the same PID with different IPs

Looks like about 20 items per second - not sure what is normal.

This happens about 4 times a day, but at no particular time. I’ve looked thru the logs, but the only consistency I see is right before BIND closes, there is always an error message before such as:

error (unexpected RCODE REFUSED) error (unexpected RCODE SERVFAIL) error (unexpected RCODE REFUSED)

Sometimes there can be 30-50 error messages within a few minutes all on the same PID with different IPs

Looks like about 20 items per second - not sure what is normal.

This happens about 4 times a day, but at no particular time. I’ve looked thru the logs, but the only consistency I see is right before BIND closes, there is always an error message before such as:

error (unexpected RCODE REFUSED) error (unexpected RCODE SERVFAIL) error (unexpected RCODE REFUSED)

Sometimes there can be 30-50 error messages within a few minutes all on the same PID with different IPs

The resolver errors you see are reverse lookups, probably triggered by some service on your machine that wants to know the hostname for an IP address that is trying to connect (e.g. SSHD and Postfix can do that). You might want to check your syslog, auth-log or mail-log for an unusually high amount of connection attempts / login failures, which indicate dictionary attacks.

Still, BIND should not simply exit from something like this. It’s meant to process a considerable amount of requests per second.

Did you check the syslog or kernel log for additional errors like OOMs? Otherwise, I’m fairly certain that it’s a problem with the virtualization software in use for your server. What’s your used and free memory? You might want to use the tool “atop” to record and playback performance data, that should give you an idea if something has a memory spike or similar.

No OOMs and nothing in the kernel logs.

However I did find spamassasin set to lookup DNS test mode (not sure that would matter) and did fine an email account that is trying to login every 3 seconds with an authentication failure in saslauthd

The only thing I have seen pulling a lot of CPU is clamav.

It’s not a busy server - only 100 virtual servers.

It’s not uncommon that we see unusual issues and service that die unexpectedly on OpenVZ-based systems… but typically when that occurs, the “failcnt” field of the user_beancounters file shows some non-0 values in it.

On your system, they all appear to be “0”.

How much RAM is in your server, and what is the current usage?

You can determine that by running this command:

free -m

I can only repeat what I wrote last time, and what Eric said too now: We need to know what your RAM situation is. free and atop are recommended to provide that information, and to check historical memory data when BIND crashes.

The system has 3GB ram with 6GB burstable

]# free -m total used free shared buffers cached Mem: 3072 1708 1363 0 0 0 -/+ buffers/cache: 1708 1363 Swap: 0 0 0

This is really odd. Memory should be plenty, and no beancounter violation.

Two things I can think of right now: Intermittent memory problems. To find out about those, install “atop”. Or the BIND version you’re using (being an RC) might be buggy. Ubuntu officially uses 9.8.1, yours is 9.8.2-RC1.

You could try turning on BIND’s own logging, including debug logging. Or you might want to check if there’s an earlier, non-RC version of BIND available. Are you using third-party repositories for your CentOS? Those often contain versions of software that make trouble when used with Virtualmin.

The Centos iss a clean standard install and Virtualmin is the standard. It is a real problem as the server closes down about 6 times a day. Stopped twice again this morning. I searched the logs, and the same RCODE error then a BIND exit. I noticed that there is a bunch of messages all relating to the same PID. Don’t know if it means anything.

Close at 7.03AM

Sep 26 07:03:38 ip-68-178-130-21 named[4010]: error (unexpected RCODE REFUSED) resolving ‘16.136.119.140.in-addr.arpa/PTR/IN’: 2001:288:2001::8c77:36e#53
Sep 26 07:03:40 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 07:03:40 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 07:04:01 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 07:04:06 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 07:04:39 ip-68-178-130-21 named[4010]: error (unexpected RCODE SERVFAIL) resolving ‘65.37.114.42.in-addr.arpa/PTR/IN’: 210.245.31.10#53
Sep 26 07:04:49 ip-68-178-130-21 named[4010]: client 122.136.196.116#45453: query (cache) ‘aa3247.com/A/IN’ denied
Sep 26 07:05:46 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 07:06:41 ip-68-178-130-21 named[4010]: client 122.136.196.116#61009: query (cache) ‘aa3247.com/A/IN’ denied
Sep 26 07:07:17 ip-68-178-130-21 named[4010]: client 122.136.196.116#45732: query (cache) ‘aa3247.com/A/IN’ denied
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: shutting down
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: stopping command channel on 127.0.0.1#953
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: stopping command channel on ::1#953
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: no longer listening on ::#53
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: no longer listening on 127.0.0.1#53
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: no longer listening on 68.178.130.21#53
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: no longer listening on 68.178.131.108#53
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: no longer listening on 50.62.34.152#53
Sep 26 07:07:41 ip-68-178-130-21 named[4010]: exiting
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: starting BIND 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6.3 -u named
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: built with ‘–build=x86_64-redhat-linux-gnu’ ‘–host=x86_64-redhat-linux-gnu’ ‘–target=x86_64-redhat-linux-gnu’ ‘–program-prefix=’ ‘–prefix=/usr’ ‘–exec-prefix=/usr’ ‘–bindir=/usr/bin’ ‘–sbindir=/usr/sbin’ ‘–sysconfdir=/etc’ ‘–datadir=/usr/share’ ‘–includedir=/usr/include’ ‘–libdir=/usr/lib64’ ‘–libexecdir=/usr/libexec’ ‘–sharedstatedir=/var/lib’ ‘–mandir=/usr/share/man’ ‘–infodir=/usr/share/info’ ‘–with-libtool’ ‘–localstatedir=/var’ ‘–enable-threads’ ‘–enable-ipv6’ ‘–with-pic’ ‘–disable-static’ ‘–disable-openssl-version-check’ ‘–with-dlz-ldap=yes’ ‘–with-dlz-postgres=yes’ ‘–with-dlz-mysql=yes’ ‘–with-dlz-filesystem=yes’ ‘–with-gssapi=yes’ ‘–disable-isc-spnego’ ‘–with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets’ ‘–enable-fixed-rrset’ ‘build_alias=x86_64-redhat-linux-gnu’ ‘host_alias=x86_64-redhat-linux-gnu’ ‘target_alias=x86_64-redhat-linux-gnu’ ‘CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic’ ‘CPPFLAGS= -DDIG_SIGCHASE’
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: ----------------------------------------------------
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: BIND 9 is maintained by Internet Systems Consortium,
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: Inc. (ISC), a non-profit 501©(3) public-benefit
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: corporation. Support and training for BIND 9 are
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: available at https://www.isc.org/support
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: ----------------------------------------------------
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: adjusted limit on open files from 1024 to 1048576
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: found 16 CPUs, using 16 worker threads
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: using up to 4096 sockets
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: loading configuration from ‘/etc/named.conf’
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: reading built-in trusted keys from file ‘/etc/named.iscdlv.key’
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: using default UDP/IPv4 port range: [1024, 65535]
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: using default UDP/IPv6 port range: [1024, 65535]
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: listening on IPv6 interfaces, port 53
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: listening on IPv4 interface lo, 127.0.0.1#53
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: listening on IPv4 interface eth0, 68.178.130.21#53
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: listening on IPv4 interface eth0:1, 68.178.131.108#53
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: listening on IPv4 interface eth0:2, 50.62.34.152#53
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: generating session key for dynamic DNS
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: sizing zone task pool based on 95 zones
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: set up managed keys zone for view _default, file ‘/var/named/dynamic/managed-keys.bind’
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: Warning: ‘empty-zones-enable/disable-empty-zone’ not set: disabling RFC 1918 empty zones
Sep 26 07:12:50 ip-68-178-130-21 named[7851]: automatic empty zone: 127.IN-ADDR.ARPA

Close at 9:22AM

Sep 26 09:21:14 ip-68-178-130-21 named[26623]: error (unexpected RCODE SERVFAIL) resolving ‘226.128.136.183.in-addr.arpa/PTR/IN’: 202.96.103.36#53
Sep 26 09:21:28 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 09:21:52 ip-68-178-130-21 php-cgi: PHP Warning: Module ‘mcrypt’ already loaded in Unknown on line 0
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: received control channel command ‘stop’
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: shutting down: flushing changes
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: stopping command channel on 127.0.0.1#953
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: stopping command channel on ::1#953
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: no longer listening on ::#53
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: no longer listening on 127.0.0.1#53
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: no longer listening on 68.178.130.21#53
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: no longer listening on 68.178.131.108#53
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: no longer listening on 50.62.34.152#53
Sep 26 09:22:30 ip-68-178-130-21 named[26623]: exiting
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: starting BIND 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6.3 -u named
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: built with ‘–build=x86_64-redhat-linux-gnu’ ‘–host=x86_64-redhat-linux-gnu’ ‘–target=x86_64-redhat-linux-gnu’ ‘–program-prefix=’ ‘–prefix=/usr’ ‘–exec-prefix=/usr’ ‘–bindir=/usr/bin’ ‘–sbindir=/usr/sbin’ ‘–sysconfdir=/etc’ ‘–datadir=/usr/share’ ‘–includedir=/usr/include’ ‘–libdir=/usr/lib64’ ‘–libexecdir=/usr/libexec’ ‘–sharedstatedir=/var/lib’ ‘–mandir=/usr/share/man’ ‘–infodir=/usr/share/info’ ‘–with-libtool’ ‘–localstatedir=/var’ ‘–enable-threads’ ‘–enable-ipv6’ ‘–with-pic’ ‘–disable-static’ ‘–disable-openssl-version-check’ ‘–with-dlz-ldap=yes’ ‘–with-dlz-postgres=yes’ ‘–with-dlz-mysql=yes’ ‘–with-dlz-filesystem=yes’ ‘–with-gssapi=yes’ ‘–disable-isc-spnego’ ‘–with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets’ ‘–enable-fixed-rrset’ ‘build_alias=x86_64-redhat-linux-gnu’ ‘host_alias=x86_64-redhat-linux-gnu’ ‘target_alias=x86_64-redhat-linux-gnu’ ‘CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic’ ‘CPPFLAGS= -DDIG_SIGCHASE’
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: ----------------------------------------------------
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: BIND 9 is maintained by Internet Systems Consortium,
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: Inc. (ISC), a non-profit 501©(3) public-benefit
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: corporation. Support and training for BIND 9 are
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: available at https://www.isc.org/support
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: ----------------------------------------------------
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: adjusted limit on open files from 1024 to 1048576
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: found 16 CPUs, using 16 worker threads
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: using up to 4096 sockets
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: loading configuration from ‘/etc/named.conf’
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: reading built-in trusted keys from file ‘/etc/named.iscdlv.key’
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: using default UDP/IPv4 port range: [1024, 65535]
Sep 26 09:22:32 ip-68-178-130-21 named[3617]: using default UDP/IPv6 port range: [1024, 65535]

One setting question for BIND.

Do full recursive lookups for clients? It is set to Yes

Should it be NO?

So what about my suggestions, like atop, BIND debug log and downgrading to a stable non-RC version? I’m afraid I can’t help you if you don’t reply to my suggestions.

The BIND debug is already on, but it does not seem to be writing to the file. I will be willing to downgrade but the current stable version is 9.8.6 and I have installed 9.8.2-0.17.rc1.el6.3 which is not the latest. I do not have atop installed, but checked it out this morning and will install.

Thanks for the advice

I noticed that my ram is at capacity.

#free -m total used free shared buffers cached Mem: 3072 3072 0 0 0 0 -/+ buffers/cache: 3072 0 Swap: 0 0 0

You’d definitely want to limit recursive queries to your local server, and not allow it for any user on the internet. That’s done with the “allow-recursion” directive, or there’s also an edit box in Webmin’s BIND module for it. So you should set “Do fully recursive lookups” to “Default”, and in “Addresses and Topology”, set “Allow recursive queries from” to “Listed…” and enter in the box

localnets localhost

As for the RAM filling up, did that happen when your BIND crashed? You really need ATOP, to have historical data about processes. It could be a script going haywire and spawning many processes, or BIND itself using too much RAM. But you won’t find that out without ATOP.

When you have it installed, go to /etc/init.d/atop and modify the line “DARGS=”, the number at the end is the interval in seconds that atop uses to store resource data. You might want to lower the default of 10 minutes to e.g. one minute, for your debugging purposes.