New server looks good (plus a surprise!)

September 25, 2008

So far, LyricWiki has been running pretty well with the new server (our 5th server, named “Cochise”) set up with the rest of them. I’ve moved the API completely to that server, which takes a good deal of the stress off of the site itself.

The site has been pretty fast since the new server has been up. However, there have been occasional slow patches, and looking at the CPU usage on the main webserver for the site, it’s definitely still not “calm”.

Surprise 1

So here’s a fun surprise: today I ordered another server! 😀

For the first time in a looong time, we can hopefully stay ahead of demand instead of suffering for a couple of weeks until it’s unbearable and we’re forced to upgrade.

Surprise 2

W00t! This is an uplifting blogpost… lots of surprises! Anyway: the second surprise is that starting in October, I’m going to be decreasing the time I spend at my day-job by 40% so I’ll only be in the office 3 days per week. That’ll give me two whole days more each week where I can work on Motive Force products like LyricWiki. This should help things become much more stable very quickly.

Well, that was an abnormally enjoyable post for this blog – which usually just announces outages! The site still isn’t totally upgraded (because the weekend ended before I could make all of the extensions work), so I have to run back to that. I took snapshots of a bunch of performance stats before and after, so I’ll post those sometime soon.

Thanks for your patience during these past few weeks. Hopefully we’re in a whole new era for the site!


Setting up new server right now

September 20, 2008

The new server exists! I’m setting it up and am hoping I can have it all done before I go to sleep.

There is a decent amount of stuff that needs to be done before it’s all working. If you’re technical, or just like watching lists as they are being completed… I’ll be tracking the upgrade here.

I’ll also be making occasional updates to the @lyricwiki twitter account.

The new server will be another Apache server and is named “Cochise”. That is the name of the last great Apache chief, and also the inspiration for the song Cochise by Audioslave.

There will be downtime for an unknown length of time tonight. I’ll try to keep it to a minimum, but the site is so slow it’s practically down anyway.

PS: Special thanks to our awesome webhost for getting the new box here & set up quickly!

Finally fast again.

August 10, 2008

For the first time all day, the site is moving at what appears to be full-speed.

Also, to answer the earlier-posed question about whether the Squid has to write a ton of extra log data when it restarts: it turns out that is true. Apparently something (either Apache restarting or somehow detecting that the Squid just came back) triggers the MediaWiki install to send a ton of “PURGE” requests to the Squid server, each of which basically tells it to forget about a page it may be caching because it’s probably out of date now. Each request is another line in the log file, so that’s about 800,000 extra lines in the span of a few minutes.
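For the technically curious, here is a rough sketch of how one could check how much of a log is made up of those PURGE lines. It assumes Squid's default ("native") access log format, where the request method is the 6th whitespace-separated field; a customized log format would need a different field index. The function name and the sample lines are made up for illustration and are not taken from the real logs.

```python
from collections import Counter


def count_methods(lines):
    """Count requests per HTTP method in Squid native-format log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Assumption: native format, so the method is the 6th field.
        # Lines too short to contain it are skipped.
        if len(fields) > 5:
            counts[fields[5]] += 1
    return counts


# Hypothetical sample lines, only here to show the shape of the input.
sample = [
    "1218370000.101 1 10.0.0.5 TCP_MISS/200 0 PURGE http://example.org/A - NONE/- -",
    "1218370000.102 1 10.0.0.5 TCP_MISS/404 0 PURGE http://example.org/B - NONE/- -",
    "1218370000.250 12 10.0.0.9 TCP_HIT/200 5120 GET http://example.org/A - NONE/- text/html",
]

counts = count_methods(sample)
print(counts["PURGE"], "PURGE lines out of", sum(counts.values()))
```

To run it against a real file, pass an open file handle instead of the sample list, for example `count_methods(open(path))` with `path` pointing at wherever the access log lives on that box.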

Squid ran out of space again this morning (fixed now)

August 10, 2008

Apparently the site was down while I slept, but I had emails & comments in my inbox about the outage as soon as I got up so I was able to jump on it right away.

The Squid had run out of disk space again. The access log files for one day were 17 gigs. That seems awfully high – maybe we’re getting spidered too hard or the logs go through serious stress after a restart?

I’ll be finding a more permanent solution to the issue, but in the meantime the site is “up” but it’s going to be fairly slow while the cache refills… again.

Squid back up :)

August 9, 2008

It had run out of hard drive space. Moved some log files to another disk and we’re back in business.

Site is back up and looks good

June 7, 2008

It appears that the problems on the site were due to someone (read: me) messing up when they restarted the database-replication!

Thanks to a bunch of helpful problem-reports from a number of users, I had some good data to look at to figure out what was wrong. It was actually pretty easy to figure out once I had all of those problem-pages to look at (I’m talking about you Brian May!).

Bonus pretzel

While I was waiting for the computers to move some massive files around, I had a couple of minutes here and there to make other tweaks to the site. Two somewhat interesting things came out of this time: 1) the “job queue” is getting automatically run every hour now (which keeps things up to date and avoids assigning extra jobs to random users, who would get something like a 2 minute page-load randomly every 10,000 pages) and 2) the road-block page that shows up when the site is shut down for maintenance now has an iframe in it which shows a Google search for the same page and suggests that users click the “Cached” link. This will allow people to see a somewhat-recent version of most of the pages even while the site is down. I dig it.
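If you’re wondering what the iframe trick boils down to, it’s mostly just building a search URL for the page the visitor asked for. Below is a minimal sketch of that idea, not the actual code on the road-block page. The function name, the `site:` restriction and the default domain are assumptions made for the example.

```python
from urllib.parse import urlencode


def cached_search_url(page_title, site="lyricwiki.org"):
    """Build a Google search URL restricted to one site for a page title.

    Both the function name and the default `site` value are hypothetical;
    they only illustrate the approach described above.
    """
    query = 'site:{} "{}"'.format(site, page_title)
    # urlencode percent-escapes the query so it is safe inside a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = cached_search_url("Brian May")
print(url)
```

The resulting URL would go into the iframe’s `src` attribute, so the visitor lands on search results for the page they were trying to reach.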

Special thanks/shoutouts to Kiefer, Redxx, Senvaikis, Teknomunk and WillMak050389 for their help figuring out what was wrong and testing things to make sure they were fixed!

As always, please let me know if you see something strange on the site. Thanks!

Site should be back now

June 7, 2008

The site is back up. I’m relatively sure I fixed everything. I’ll be back on to check things in a bit & I’ll have more details once I find out if things are actually working.