Here is a silly question ...

MarkThomas · May 28, 2007, 9:21am

What is that ICON directory that is stuck in the middle of each site we define? I cannot download it. I cannot browse it. It goes away if I install a package into the home directory and remove that package later (installer scripts). It’s just ‘there!’

Mark

Joe · May 28, 2007, 9:30am

Hey Mark,

It’s so that AWstats will work, without having to have an actual copy of it in every home. There are alternative methods for making it work (aliases, and such), but this one works in both Webmin (which has an AWstats viewer) and Apache.

If it bugs you, you can disable AWstats, and it’ll stop showing up in your new domains (at least I believe it will). Now that we handle Google Analytics tag insertion on most platforms (with the others coming soon), you may not even want local analytics. The local machine does have some extra data, but not much.

It gets deleted when you install a script without its own docroot because uninstalling a script deletes everything in the directory it was installed into (which makes it rather dangerous to install into public_html without a subdir). This is kind of bug-like, but there’s not a very good solution.

DanLong · May 28, 2007, 12:15pm

Just wanted to chime in on this a little.

You might want to consider the desires of the server user before knocking off the local stats.

Big Brother alert!!!

I’ve grown extremely suspicious of Google as of late with the disappearance of my listings, and many others’, from Google at the start of the year. That, and Google using the AdWords slap on many of us, causes me to want to forward as little information as possible into their database. We don’t even purposefully list with Google anymore as it becomes more and more apparent that they are trying to control content.

I’m not willing to give away info on traffic patterns and clients that I worked hard to get.

MarkThomas · June 4, 2007, 11:16pm

Hey Joe,

Been a bit since I was on.

Is there a fairly simple way for me to recreate that directory if I need to? I like AWstats. I also like to install things into my home directory.

MarkThomas · June 4, 2007, 11:17pm

I have been debating whether to block Google’s bot. They seem to nose into everything, and they show more activity on my site than I show for myself.

Joe · June 4, 2007, 11:23pm

Hey Mark,

Sure, it’s just a link.

cd /home/domain/public_html
ln -s /usr/share/awstats/wwwroot/icon icon

Where you may need to replace that path with the correct one for AWstats; it varies across platforms (that’s the path on my FC6 desktop machine).
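A slightly more defensive way to recreate the link is sketched below. It assumes the AWstats icon directory and the docroot are passed in as arguments, since both vary by system; link_awstats_icon is a made-up helper name, not something Virtualmin or AWstats provides. It refuses to run if the source directory is missing and leaves any existing "icon" entry alone.

```shell
#!/bin/sh
# Recreate the AWstats "icon" symlink inside a docroot.
# link_awstats_icon is a hypothetical helper, not part of Virtualmin or AWstats.
link_awstats_icon() {
    icon_src=$1   # e.g. /usr/share/awstats/wwwroot/icon (varies by platform)
    docroot=$2    # e.g. /home/domain/public_html

    # Bail out if the AWstats icon directory is not where we expect it.
    if [ ! -d "$icon_src" ]; then
        echo "icon directory not found: $icon_src" >&2
        return 1
    fi

    # Do not clobber an existing file, directory, or link named "icon".
    if [ -e "$docroot/icon" ] || [ -L "$docroot/icon" ]; then
        echo "already exists, leaving alone: $docroot/icon" >&2
        return 0
    fi

    ln -s "$icon_src" "$docroot/icon"
}

# Example (adjust both paths for your system):
# link_awstats_icon /usr/share/awstats/wwwroot/icon /home/domain/public_html
```

Running it as the domain’s own user keeps the link’s ownership consistent with the rest of public_html.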

MarkThomas · June 4, 2007, 11:28pm

Thank you. I will try it.

Joe · June 4, 2007, 11:29pm

Hey Mark,

I ended up adding some of our paths to the robots.txt, because the Googlebot was visiting us every day, and polling through the whole site (a few thousand pages, dynamically generated). It was causing our CMS to balloon up to 1.8GB by the end of the day (on a 2GB machine). Once the new website goes live (hopefully tomorrow), I’ll let Google back into the whole site. Having Google index your site is almost entirely a positive thing…but it was getting ridiculous having to restart OpenACS every day just to keep it working.

To do something like that, just add a robots.txt in the root public_html, like this:

User-agent: *
Disallow: /doc/
Disallow: /api-doc/
Disallow: /register/

Where you’d obviously replace the “Disallow: …” bits with the stuff you want Google (and, with “User-agent: *”, every other well-behaved crawler) to leave alone.
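If only Google’s crawler needs reining in, the rules can go in a group addressed to Googlebot rather than to every crawler. Here is a minimal sketch that writes such a file; write_robots is a made-up helper name, the docroot is passed in as an argument, and the three paths are simply the ones from the example above. It will not overwrite an existing robots.txt.

```shell
#!/bin/sh
# Write a robots.txt that restricts only Googlebot.
# write_robots is a hypothetical helper; the Disallow paths are examples.
write_robots() {
    docroot=$1   # e.g. /home/domain/public_html

    # Refuse to overwrite a robots.txt that is already there.
    if [ -e "$docroot/robots.txt" ]; then
        echo "robots.txt already exists, not overwriting" >&2
        return 1
    fi

    cat > "$docroot/robots.txt" <<'EOF'
# Rules for Google's crawler only; other crawlers are unaffected.
User-agent: Googlebot
Disallow: /doc/
Disallow: /api-doc/
Disallow: /register/
EOF
}

# Example (adjust the path for your system):
# write_robots /home/domain/public_html
```

Keep in mind that robots.txt is only a request. Polite crawlers honor it, but nothing forces a less benevolent one to.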

MarkThomas · June 9, 2007, 5:59pm

Hey Joe,

Once again, I was possibly making it more difficult for myself than normal. I had installed Drupal on some of my sites. Drupal’s a nice package, but there are several things to watch out for.

Drupal manages everything within the directory you install it in. That means that you cannot get to the stats directory easily unless you figure out their .htaccess files.
With Drupal, a person can alias their nodes and their paths. It makes it even more fun to block robots from scanning everything. Not only does a person have to block the /node/ stuff, but you also have to block the aliases. I felt like I was in a war there for a little bit, trying to block off everything except for what I wanted to make public. I can hardly wait to see what a less benevolent crawler would do.

MarkThomas · June 9, 2007, 6:01pm

… I think I am going to start looking at the idea of installing the packages into their own subdirectories, regardless of the fact that I don’t like setting up redirection.