Until yesterday my server was running perfectly. Today I got a call from a colleague saying the site is down. When I try to surf to it, the browser returns the error "Could not connect: Unknown MySQL server host ‘my-domain.co.uk’".
Looks like a DNS problem, I thought, if the server can’t resolve a domain that points to itself in order to connect to MySQL.
I checked my domain on Pingability, and got a succession of DNS errors, which imply that I have no DNS server running, even though VirtualMin assures me that BIND is running.
“Warning my-domain.co.uk does not have an IP Address (A) record.”
“Error None of this zone’s name servers responded on the request for ‘my-domain.co.uk’ records. Giving up.”
SOA record shows as Unknown
“Warning Did not find any IP Address (A) records for the name server ‘ns1.my-domain.co.uk’. Normally the parent name server will list them. These name server A records are also called ‘host records’ and are usually set by the domain name registrar.”
“Information No glue records found at parent name servers for my-domain.co.uk”
Until yesterday this site was running perfectly. Is this an issue at the domain registrar, not providing the glue records to point to my server? I am no expert on DNS - any help would be sincerely appreciated.
Here is a link to the intodns output. It finds NS records at the parent server, but it says there is no DNS server running at the IP addresses to which those records point. But according to VirtualMin, BIND is up and running. Any ideas?
First off – thanks for the DNS report, that does help in troubleshooting.
Doing a lookup at your nameservers – it does indeed appear that BIND is unavailable to the outside world.
Since it sounds like it’s running locally – you may want to verify that you don’t have a firewall blocking UDP port 53. It seems to hang, rather than reject immediately, which is often a sign of a firewall.
Sincere thanks for your time on this. Here is the output from dig @localhost:
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> @localhost
; (1 server found)
;; global options: printcmd
;; connection timed out; no servers could be reached
I’m not quite sure where exactly within /var/log I should be looking - I don’t see a file or folder for BIND. Can you advise?
Here is what is added to /var/log/messages when I restart BIND. I am guessing the “not listening on any interfaces” line might be the crux of the problem?
Dec 10 17:29:30 server55711 named[14570]: shutting down: flushing changes
Dec 10 17:29:30 server55711 named[14570]: stopping command channel on 127.0.0.1#953
Dec 10 17:29:30 server55711 named[14570]: stopping command channel on ::1#953
Dec 10 17:29:30 server55711 named[14570]: exiting
Dec 10 17:29:31 server55711 named[15611]: starting BIND 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 -u named
Dec 10 17:29:31 server55711 named[15611]: adjusted limit on open files from 1024 to 1048576
Dec 10 17:29:31 server55711 named[15611]: found 4 CPUs, using 4 worker threads
Dec 10 17:29:31 server55711 named[15611]: using up to 4096 sockets
Dec 10 17:29:31 server55711 named[15611]: loading configuration from '/etc/named.conf'
Dec 10 17:29:31 server55711 named[15611]: using default UDP/IPv4 port range: [1024, 65535]
Dec 10 17:29:31 server55711 named[15611]: using default UDP/IPv6 port range: [1024, 65535]
Dec 10 17:29:31 server55711 named[15611]: /etc/named.conf:9: undefined ACL '83.170.79.9,83.170.78.155,83.170.78.156'
Dec 10 17:29:31 server55711 named[15611]: not listening on any interfaces
Dec 10 17:29:31 server55711 named[15611]: command channel listening on 127.0.0.1#953
Dec 10 17:29:31 server55711 named[15611]: command channel listening on ::1#953
Dec 10 17:29:31 server55711 named[15611]: the working directory is not writable
Dec 10 17:29:31 server55711 named[15611]: zone waos-online.co.uk/IN: loaded serial 1289610790
Dec 10 17:29:31 server55711 named[15611]: zone registration.waos-online.co.uk/IN: loaded serial 1290852794
Dec 10 17:29:31 server55711 named[15611]: running
Dec 10 17:29:31 server55711 named[15611]: zone registration.waos-online.co.uk/IN: sending notifies (serial 1290852794)
Dec 10 17:29:31 server55711 named[15611]: zone waos-online.co.uk/IN: sending notifies (serial 1289610790)
I went into the BIND settings in Webmin and into Addresses and Topology. Under ports and addresses to listen on, I cleared my three IP addresses out, saved, restarted, added them again, saved, restarted… and straight away BIND came up. Checked on intoDNS - all green, no problems.
HOWEVER… as soon as I surf to my site, I get an internal server error on a particular page… and that crashes BIND. Which is obviously what caused the problem in the first place. Where would I find the log to tell me more about internal server errors? (I am using CentOS, as you surmised.)
The named.conf file is as follows:
options {
directory "/etc";
pid-file "/var/run/named/named.pid";
allow-recursion {
localnets;
127.0.0.1;
};
listen-on {
83.170.79.9,83.170.78.155,83.170.78.156;
};
};
zone "." {
type hint;
file "/etc/db.cache";
};
zone "waos-online.co.uk" {
type master;
file "/var/named/waos-online.co.uk.hosts";
allow-transfer {
127.0.0.1;
localnets;
};
};
zone "registration.waos-online.co.uk" {
type master;
file "/var/named/registration.waos-online.co.uk.hosts";
allow-transfer {
127.0.0.1;
localnets;
};
};
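For reference, the “undefined ACL” message in the log above points at line 9 of this file, which is the entry inside listen-on. BIND address match lists separate their elements with semicolons, not commas, so the comma-joined string is read as one name and looked up as an ACL that does not exist. A minimal sketch of a corrected block, assuming the same three addresses and the default port are what is wanted:

```
listen-on {
        // one address per element, each terminated by a semicolon
        83.170.79.9;
        83.170.78.155;
        83.170.78.156;
};
```

Running named-checkconf against /etc/named.conf before restarting should show whether the file now parses cleanly.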
The text of the internal server error echoed to the browser is:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, root@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
OK, it seems that the DNS problem is now solved, thanks to rectifying the problem on line 9 of named.conf.
However, the underlying server error turns out to be:
“(110)Connection timed out: mod_fcgid: ap_pass_brigade failed in handle_request function”
Having searched around extensively for information on this, it seems that I need to increase the FcgidMaxProcessesPerClass setting to something well above what is a fairly low VirtualMin default.
But I cannot for the life of me figure out how! Eric, in a previous forum posting, you say “Another option as well would be to go into Administration Options -> Edit Resource Limits, and to set ‘Max Number of Processes’ for any Virtual Servers you’d like to have limits for.”
I do not see “Edit resource limits” under Admin Options - is that because I am running GPL not Pro? If so, how do I change this setting?
It likely is a Pro vs GPL thing. It was only recently that Virtualmin GPL began enabling you to configure FCGID, and I suspect the resource limits didn’t make it in there yet.
You would need to manually edit the Apache config, and set those values in the VirtualHost block for your domain.
Alternatively, you could always move away from FCGID to CGI, which may alleviate some of the problems you’re seeing. To do that, you could go into Server Configuration -> Website Options, and set the PHP Execution Mode.
I tried switching to CGI, and no error is generated but the PHP code still takes a ludicrous amount of time to process - up to a minute just to parse some RSS feeds. Ironically, until last week, I had this site on an older, slower server which was not running Virtualmin, and it parsed out this code in less than a second.
I will edit the Apache config, but can you tell me where? I tried going through Services/Configure Website/Edit Directives and adding FcgidMaxProcessesPerClass 100, but when I tried to restart Apache it returned the error:
"Syntax error on line 1053 of /etc/httpd/conf/httpd.conf:
Invalid command ‘FcgidMaxProcessesPerClass’, perhaps misspelled or defined by a module not included in the server configuration"
I just need to know exactly where to put that line, assuming that is the correct line.
Hey Snapmin. I did, but the problem I had there was that I could not mod that file, even when logged in as root. I’m sure I could find a way around that. I was using Transmit, which may have been part of the problem there. Is that where I should put the FcgidMaxProcessesPerClass directive?