January 28, 2007
As I’ve mentioned before, LyricWiki’s long-term financial stability is going to be taken care of by rolling it into the forthcoming Pedlr. It will still be the cool site you all know and love, but it will technically be a portion of Pedlr, so they will be in the same server farm.
LyricWiki has gotten very stable over the last couple of weeks, serving over 150,000 pages (or SOAP results) a day to hungry lyrics fans everywhere… but it’s about to get upgraded in a big way.
Right now everything is on one high-grade server, but to get ready to handle the anticipated launch of Pedlr, we’re going to be getting more web-servers and moving the database to its own special server with 4 Gigs of RAM (this makes it serve lyrics faster!). We’ll be able to support more users at once, more responses through the lyrics api, and every request will go faster than ever.
Very exciting! I’ll let you know when we have the new hardware ready… 🙂
January 7, 2007
Something is wrong on the server. It’s not slow in the normal sense caused by traffic. It’s completely bogged down: I could barely log in over SSH, it took a full minute just to run ‘ps’ and logging into mysql is taking a long… long time (still not done). I’ll update more once I figure out the problem.
…and I was all happy after the upgrades yesterday
UPDATE1: I tried to restart the mysql server. I issued the command then waited… but it was taking so long that SSH disconnected me. I can’t reconnect, I’ve been trying on and off for about an hour. Bad. During the brief time that I was on I ran a SHOW PROCESSLIST command and noticed that part of the problem is coming from the fact that Plugins using the SOAP don’t stop making requests when the server is slow (humans on the site realize that it is slow or broken and stop browsing). So there were 269 queries running at once and the vast majority of them were SOAP requests. If the plugin you use has an option to turn it off temporarily, I would appreciate it if you do that! :]
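For plugin authors: the pile-up above happens because clients retry a slow server at full speed. A client-side fix is exponential backoff with jitter — a minimal sketch (the `fetch` callback, retry counts, and delays are illustrative, not anything LyricWiki's API requires):

```python
import time
import random

def fetch_with_backoff(fetch, max_retries=5, base_delay=2.0, timeout=10.0):
    """Call fetch() with a timeout; back off exponentially on failure.

    `fetch` stands in for whatever call the plugin uses to hit the
    SOAP endpoint; the delay values here are illustrative.
    """
    for attempt in range(max_retries):
        try:
            return fetch(timeout=timeout)
        except Exception:
            # Sleep 2s, 4s, 8s... plus jitter so clients don't retry in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
    return None  # Give up rather than piling more load on a struggling server
```

A client that backs off like this degrades gracefully alongside the human visitors who simply stop browsing when the site is slow.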
UPDATE2: I called our webhost and they’re going to manually reboot the box and call right back, then I should be able to get everything working.
UPDATE3: The site is back up. I’m poring through logs and such trying to figure out what went wrong. So far I’ve ruled out a DoS attack from any single source, so it appears that most likely traffic just got too high and my setting of max_connections was so high that things just went from bad to worse during a high-traffic time of day.
January 7, 2007
At the board meeting today, we decided on a February 26th launch date for Pedlr.
I can’t get much more into what Pedlr is (yet), but as it relates to LyricWiki, it is a site which will contain LyricWiki and add on a TON of other features (which won’t interfere with the LyricWiki part). The best part is that since it will have a decent business model, it will be able to cover the server expenses of LyricWiki, which takes a lot of bandwidth/space/CPU power to handle its traffic.
Earlier I mentioned that I would be making some improvements to the SOAP webservice. I thought I should take a minute to recap what was done today (developers may be interested?).
- Speed – Automatically rejected the common placeholder format (artist=””, song=”Track ##”) before checking the database.
- Accuracy – Added an additional test so that songs with trailing parentheses get checked for versions without the parentheses. For example, if you did a getSong for (“Disturbed”, “Want (Remix)”), since there is no remix, the webservice would figure out that you actually wanted Want by Disturbed.
- Accuracy – Re-enabled the artist-redirect trick (which was too slow the old way). For example if you did getSong(“Prodigy”, “Action Radar”), the ws detects that there is no page by that name, however “Prodigy” redirects to “The Prodigy” and there IS a page for getSong(“The Prodigy”, “Action Radar”).
- Speed – Replaced the MediaWiki code that checks whether an article exists. That code was written for a different purpose (using an article Title several times on a single page), so it does a whole bunch of extra initialization that wasn’t needed. I just wrote a simple query to replace it and added caching of that single-byte result so that checks for the same title (during the same page load) incur no additional database-query overhead.
- Speed – In the list of most common failed SOAP requests, I added 2-hour caching of the results so that users can go to the page as frequently as needed without running that fairly expensive query each time.
- Speed – Took out the double-logging of SOAP requests. For statistics, hits to the server are logged. Hits to the SOAP are logged. Now the hits to the SOAP are not added to the main statistics (which was resulting in double the amount of queries).
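The accuracy fixes above boil down to a chain of fallbacks, tried cheapest-first. Here’s a rough sketch of that flow — the helper names, data structures, and sample data are hypothetical stand-ins, not LyricWiki’s actual code:

```python
import re

# Hypothetical stand-ins for the real database lookups.
PAGES = {("The Prodigy", "Action Radar"): "lyrics...",
         ("Disturbed", "Want"): "lyrics..."}
REDIRECTS = {"Prodigy": "The Prodigy"}
PLACEHOLDER_TRACK = re.compile(r"^Track \d+$", re.IGNORECASE)

def get_song(artist, song):
    # Speed: reject the common placeholder format before touching the database.
    if not artist or PLACEHOLDER_TRACK.match(song):
        return None
    # 1. Exact match.
    if (artist, song) in PAGES:
        return PAGES[(artist, song)]
    # 2. Accuracy: retry without a trailing parenthetical,
    #    e.g. "Want (Remix)" -> "Want".
    stripped = re.sub(r"\s*\([^)]*\)$", "", song)
    if stripped != song and (artist, stripped) in PAGES:
        return PAGES[(artist, stripped)]
    # 3. Accuracy: follow an artist redirect,
    #    e.g. "Prodigy" -> "The Prodigy".
    target = REDIRECTS.get(artist)
    if target and (target, song) in PAGES:
        return PAGES[(target, song)]
    return None
```

The ordering matters: the placeholder rejection and exact match are cheap, so the costlier fallback lookups only run for requests that would otherwise fail outright.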
The only improvement I would have liked to make, but decided against, was the automatic removal of commonly failed requests after the first time they are successfully found. This was a great suggestion by admin Teknomunk, but I couldn’t figure out a way to do it without slowing down the whole system by issuing an extra query every time there was a successful SOAP request (currently about 40% of SOAP requests). That extra query would be unneeded in the vast majority of cases (it only helps the first time a previous failure is requested successfully). On the flipside, something like this is needed for the site to be self-sufficient (right now it requires me to manually remove entries from the table when I see they’ve been fixed… and a site shouldn’t ever be held up waiting for a single person), so I’m still thinking of a way it could be done efficiently.
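One possible approach: load the failed-request keys into memory once, and only issue the cleanup query when a successful lookup actually matches a known failure. The common case (a key that never failed) then costs a set lookup instead of a database round-trip. A sketch of the idea, with hypothetical hook names standing in for the real table access:

```python
class FailedRequestCache:
    """Tracks known-failed (artist, song) keys so a successful lookup
    only triggers a cleanup query when it actually fixes an entry.

    load_failed_keys / delete_failed_key are hypothetical database
    hooks, not functions from the actual LyricWiki codebase.
    """
    def __init__(self, load_failed_keys, delete_failed_key):
        self._failed = set(load_failed_keys())  # one query at startup
        self._delete = delete_failed_key

    def record_success(self, key):
        # Fast path: most successful requests were never failures,
        # so this is just an in-memory membership check.
        if key in self._failed:
            self._delete(key)        # remove the now-fixed entry
            self._failed.discard(key)
```

The trade-off is that the cache has to be kept in sync across web servers (or rebuilt periodically), but the per-request cost drops to near zero.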
January 4, 2007
Thanks to Teknomunk and remiss, I’ve figured out one of the major problems that has been pwning the speed of the site.
To try to get more funds to get another server to handle the increase in traffic (which has now been driven away a bit by the slow page-loads), I started experimenting with other ad services. I tried TextLinkAds, AdBrite and ContextWeb. The layering of these ad networks was what appeared to be destroying page-loads.
It worked like this:
- TextLinkAds would check their server for any new ads through a fopen() type request (so they loaded another page).
- Since no TextLinkAds have been sold, the ContextWeb code would then be called.
- After a few page-loads ContextWeb no longer seemed to be able to find ads for us (maybe because their spider gave up on our slow pages?) and it would default to calling the AdBrite code.
- AdBrite would make sure that it could compete with the eCPM of Google AdSense (eCPM means effective cost per thousand page loads… the ‘M’ stands for mille, Latin for thousand). Since AdBrite could not get high enough bids, it would then default to AdSense.
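For anyone unfamiliar with the term, eCPM is just revenue normalized to a thousand impressions — a one-liner to illustrate (the function name and sample figures are mine, not any ad network’s API):

```python
def ecpm(revenue, impressions):
    """Effective cost per mille: earnings per 1,000 page loads."""
    return revenue / impressions * 1000

# e.g. $12 earned over 8,000 page views is a $1.50 eCPM
```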
So all of this was happening just to end up calling AdSense anyway. Lame! Guess we’re stuck with AdSense for the moment.
The first page-load is still really slow. UPDATE: Fixed!
The SOAP is still slower than I would like, but that’s another issue (that I’m working on… it’s straightforward, but will take some time to write/test). UPDATE: Also fixed now!
January 3, 2007
Back to the same old grind of server problems.
The cause is unknown; I have restarted MySQL, Apache, and even rebooted the server. The number of users coming to the site is about normal, but the page views are much lower because people aren’t sticking around due to the reeeeally slow load times.
If anyone has any ideas about how to diagnose what could be wrong… I’m listening! 🙂
January 1, 2007
There was an outage today for unknown reasons from about 4:00pm to 7:00pm when we restarted the webserver. I’m looking into possible causes, but it certainly wasn’t traffic (traffic is very low during the holiday break, especially on holidays). Thanks to Sherry for notifying me of the downtime.
On the bright side, this will almost certainly be the last outage of the year!
Sorry for the downtime and thank you for your patience.