December 14, 2008
A post on LifeHacker mentioned that Last.fm has released their lists of the most listened-to songs of 2008. Since their service “scrobbles” (logs) everything that their users listen to, they have a pretty massive data-set. They are UK-based, so there is quite a bit of a UK bias in the results, but they are interesting nonetheless.
There were two somewhat annoying things about the lists: 1) each entry is on a different page, so you have to go to 30 pages to see all of the results, and 2) there are no links to lyrics!
So we’ve taken the liberty of compiling them for you into a nice concise list of links to the lyrics-pages:
http://lyricwiki.org/LyricWiki:Lists/2008/Last.fm
Enjoy!
2 Comments |
Links, LyricWiki, LyricWiki.org, Lyrics, Statistics | Tagged: 2008, coldplay, last.fm, lists, Lyrics |
Posted by Sean Colombo
September 12, 2008
This week, the site has been extremely slow and has even gone down and come back up a couple of times. I searched for a problem for a while, but it appears that we’ve just really hit the wall on how much traffic we can support with our current servers. That’s fairly good timing since we’d been planning to move to more servers for a little while, so I’d already begun to look into it.
Today I ordered another server with the same specs as the current Apache server. This will bring us up to 5 total servers running LyricWiki. For the curious (and tech-savvy): that’s one Squid caching server in front of two Apache web servers, which talk to one MySQL master server and one read-only replica MySQL server.
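As a rough illustration of how a master and a read-only replica can share the work, here is a minimal Python sketch. The host names and the `pick_database` helper are hypothetical and are not taken from the LyricWiki setup; MediaWiki does its own routing internally.

```python
# Hypothetical host names, for illustration only.
MASTER = "db-master"
REPLICA = "db-replica"

# Statements that only read data are safe to send to the read-only replica.
READ_ONLY_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def pick_database(sql: str) -> str:
    """Return the host that should run this SQL statement."""
    stripped = sql.strip()
    # The first word of the statement decides where it goes.
    verb = stripped.split(None, 1)[0].upper() if stripped else ""
    # Anything that might write (or that we cannot classify) goes to the master.
    return REPLICA if verb in READ_ONLY_VERBS else MASTER
```

Sending everything unrecognized to the master is the cautious choice here, since a write that reaches a read-only replica would simply fail.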
To get the server to be as beefy as we need, I had to ask the hosting company to order extra RAM for it. So we’re just waiting for that to be delivered (hopefully around this weekend or very soon after) and we’ll be ready to start working to get the new server pulled into our setup.
In addition to just having more machine-power to handle our traffic, this will give two additional benefits immediately. The first is that we can use the new server to test out the upgrade to the newest version of MediaWiki (the software that runs our site as well as Wikipedia). The second benefit is that now we’ll have two Apache servers – currently the most overworked part of the system – with one running the API and one running the site itself (lyricwiki.org). This will let us more quickly identify when something is wrong with one of those two systems, and it will make sure that problems with either of them are unlikely to affect the other.
Exciting times… stay tuned!
4 Comments |
LyricWiki, LyricWiki.org, Lyrics, OpenSource, Traffic | Tagged: Apache, Lyrics, LyricWiki, MediaWiki, mySQL, servers, Squid, web-hosting, wiki |
Posted by Sean Colombo
August 10, 2008
For the first time all day, the site is moving at what appears to be full-speed.
Also, to answer the earlier-posed question about whether the Squid has to write a ton of extra log data when it restarts: it turns out that it does. Apparently something (either Apache restarting or somehow detecting that the Squid just came back) triggers the MediaWiki install to send a ton of “PURGE” requests to the Squid server. Each one basically tells the Squid to forget about a page it may be caching because it’s probably out of date now. Each request is another line in the log file, so that’s about 800,000 extra lines in the span of a few minutes.
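For anyone curious what one of those “PURGE” requests looks like on the wire, here is a minimal Python sketch. The function names and any host names passed to them are assumptions for illustration; this is not MediaWiki’s own code, and Squid only honors PURGE if its access rules allow that method.

```python
import http.client


def build_purge_request(site_host: str, path: str) -> str:
    """Return the raw HTTP/1.1 text of a PURGE request for one cached page."""
    return f"PURGE {path} HTTP/1.1\r\nHost: {site_host}\r\n\r\n"


def send_purge(squid_host: str, site_host: str, path: str, port: int = 80) -> int:
    """Send one PURGE request to a Squid server and return the HTTP status code."""
    conn = http.client.HTTPConnection(squid_host, port, timeout=5)
    try:
        # http.client accepts any method name, so "PURGE" works like "GET".
        conn.request("PURGE", path, headers={"Host": site_host})
        return conn.getresponse().status
    finally:
        conn.close()
```

Every call like `send_purge` produces one more line in the Squid access log, which is how a burst of purges can add up to hundreds of thousands of lines.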
1 Comment |
LyricWiki, LyricWiki.org, Lyrics, Outages, Traffic | Tagged: MediaWiki, Squid |
Posted by Sean Colombo
August 10, 2008
Apparently the site was down while I slept, but I had emails & comments in my inbox about the outage as soon as I got up so I was able to jump on it right away.
The Squid had run out of memory again. The access log files for one day were 17 gigs. That seems awfully high – maybe we’re getting spidered too hard or the logs go through serious stress after a restart?
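One way to check the spidering theory is to count requests per client and per HTTP method in the access log. The sketch below assumes the log uses Squid’s default native format, where the third whitespace-separated field is the client address and the sixth is the request method; the `summarize` helper is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count requests per client address and per HTTP method."""
    clients: Counter = Counter()
    methods: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        clients[fields[2]] += 1  # client address
        methods[fields[5]] += 1  # request method, e.g. GET or PURGE
    return clients, methods
```

Passing an open file object as `lines` reads the log one line at a time, which matters for a 17-gig file. `clients.most_common(10)` would then show whether a handful of addresses account for most of the traffic.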
I’ll be finding a more permanent solution to the issue, but in the meantime the site is “up” but it’s going to be fairly slow while the cache refills… again.
LyricWiki, LyricWiki.org, Lyrics, Outages, Traffic | Tagged: Squid, squidcache |
Posted by Sean Colombo
April 1, 2008
Don’t worry… be happy.
UPDATE: This was an April Fool’s joke. On April 1st, every page on LyricWiki resulted in a Rick-Rolling. To view the page as it would have been on April 1st, please try this permanent link.
Announcements, LyricWiki, LyricWiki.org, Lyrics | Tagged: announcement, funny, happy, Lyrics, Music, news, worry |
Posted by Sean Colombo
March 30, 2008
I’m proud to announce a long-overdue feature: implied redirects.
Implied redirects make it so that the site can often understand what you’re looking for even if it is misspelled or we don’t have a redirect page for the specific song. For example, we have a redirect from the band name “Of A Revolution” to their preferred form, “O.A.R.”. However, we do not have a redirect from “Of A Revolution:Crazy Game Of Poker” to the correct page, “O.A.R.:Crazy Game Of Poker”. With the new implied-redirects extension, the site will automatically figure out what you meant and display the correct page. To see it in action, go to “Of A Revolution:Crazy Game Of Poker”.
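The artist-name case above can be sketched in a few lines of Python. This is only an illustration of the idea, not the extension’s actual code: the `resolve` function and the two small lookup tables are hypothetical, and the sketch does not attempt the misspelling handling.

```python
from typing import Optional

# Hypothetical stand-ins for the wiki's redirect table and page list.
ARTIST_REDIRECTS = {"Of A Revolution": "O.A.R."}
EXISTING_PAGES = {"O.A.R.:Crazy Game Of Poker"}


def resolve(title: str) -> Optional[str]:
    """Return the page a title should lead to, or None if nothing matches."""
    if title in EXISTING_PAGES:
        return title  # the page exists as requested
    artist, sep, song = title.partition(":")
    if sep and artist in ARTIST_REDIRECTS:
        # Apply the artist-level redirect to the song title.
        candidate = f"{ARTIST_REDIRECTS[artist]}:{song}"
        if candidate in EXISTING_PAGES:
            return candidate
    return None
```

The point is that a single artist-level redirect covers every song by that artist, so nobody has to create a redirect page per song.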
Implied redirects have been active in the API for quite some time, but didn’t work on the site until tonight.
5 Comments |
Announcements, Features, LyricWiki, LyricWiki.org, Lyrics | Tagged: feature, Lyrics, MediaWiki, redirects, release, upgrade, wiki |
Posted by Sean Colombo
November 15, 2007
If you’ve been using the API over the past week, you probably noticed the painfully large percentage of results that were being returned as “Not found”.
The large increase in traffic recently was causing us to get “Too many connections” errors when the API was left alone, so I had to turn on a throttling system which would randomly drop a certain percentage of the requests. Looking into our server logs, I found out that our actual web server (behind our Squid caching server, which serves up 30% of our pages) has been getting over 1 million page requests per day! Wow… that explains the scaling problems.
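A throttle of this kind can be very small. The following Python sketch shows the general idea under stated assumptions: `throttled_lookup`, its parameters, and the `lookup` callback are hypothetical names, and the real API code is not shown in this post.

```python
import random
from typing import Callable, Optional

NOT_FOUND = "Not found"


def throttled_lookup(
    query: str,
    lookup: Callable[[str], str],
    drop_fraction: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Answer a request, randomly dropping roughly drop_fraction of them."""
    rng = rng if rng is not None else random.Random()
    # rng.random() is uniform in [0, 1), so 0.0 never drops and 1.0 always drops.
    if rng.random() < drop_fraction:
        return NOT_FOUND  # dropped before touching the database
    return lookup(query)
```

Dropped requests never reach the database, which is what relieves the connection limit, at the cost of users seeing “Not found” for songs that do exist.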
I was overly busy for most of the week (a drawback of having LyricWiki not be my “day-job”), so I first got to really attack the problem tonight. It appears that everything is back up to speed, and the throttling is turned off. I’ll be keeping an eye on how the site is doing tomorrow during peak traffic time, but I think we should be okay.
I have some more fixes planned for the near-future which should make it so the API can continue to handle increasing traffic. I probably won’t post about them as they happen, but hopefully you’ll notice an increase in the speed that results are served up.
3 Comments |
LyricWiki, LyricWiki.org, Lyrics, Outages | Tagged: api, database, Lyrics, LyricWiki, MediaWiki, scaling, wiki |
Posted by Sean Colombo
November 7, 2007
Yesterday I got the replicated slave database up and running and even made the API use MediaWiki’s built-in database connections, which are persistent, so that should have knocked down the number of connects/disconnects (which are time-expensive).
Today, we’ve still been getting “Too Many Connections” errors… possibly because the MediaWiki persistent connections don’t close very quickly? We’ll be looking into this some more… maybe I need new stats on how much traffic the API is getting.
Anyway, the solution I’ve taken is that during peak times, I keep setting the API to drop a certain percent of requests. This isn’t a cool solution, so I’ll be trying to figure out a better way… anyone have any ideas?
7 Comments |
LyricWiki, LyricWiki.org, Lyrics, Outages | Tagged: wiki MediaWiki |
Posted by Sean Colombo
October 24, 2007
LyricWiki just got better because we’re much faster now at delivering lyrics to the world and much stronger against slashdot-effect or digg-effect problems.
Our server upgrades are continuing along quite smoothly. Last night we got Squid caching up and running. That means that if the wiki delivers a page to a logged-out user once, the rendered page is saved by Squid until it is changed, saving all of the database lookups and processing to turn the WikiText into a full-blown HTML page. This measure is awesome since it not only makes a large majority of the browsing faster, it also makes the site extremely resistant to Slashdotting / Digging / etc. (since those are logged-out users all accessing the same pages – which would be in the cache).
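To make the caching behavior concrete, here is a toy Python model of it. It is a sketch only: the `PageCache` class is hypothetical, and the real Squid is a separate HTTP proxy sitting in front of the web server rather than a Python object.

```python
from typing import Callable, Dict


class PageCache:
    """Toy model of a page cache for logged-out users."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render  # the expensive WikiText-to-HTML step
        self._pages: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, title: str, logged_in: bool = False) -> str:
        if logged_in:
            return self._render(title)  # logged-in users bypass the cache
        if title in self._pages:
            self.hits += 1
            return self._pages[title]
        self.misses += 1
        html = self._render(title)
        self._pages[title] = html
        return html

    def purge(self, title: str) -> None:
        """Forget a page after it is edited so the next request re-renders it."""
        self._pages.pop(title, None)
```

In this model a flood of logged-out visitors to one page costs a single render, and `hits / (hits + misses)` gives the share of cacheable requests served from the cache.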
Currently, only about 30% of our page requests are getting served by the Squid, but that’s partially since the API has people sending all kinds of weird requests at it (varied spellings, capitalizations, etc.). Wikipedia serves around 60% of its pages through their Squids, so we have potential for even more savings as the web-traffic catches up to the API traffic.
Tonight, I’ll be moving on to try to use load-balancing to get our other web-server into the party (this is a bit trickier than it sounds, so it might take a while). Then I’ll try to upgrade the new web-server with APC like I did for the first web-server yesterday. Once that setup is done, I’m going to be begging for a slashdot just to see how well the servers can fare against that kind of onslaught (we can take it!).
LyricWiki, LyricWiki.org, Lyrics, Statistics, Traffic |
Posted by Sean Colombo