Saturday, August 2, 2008

Performance issues during the past few days

I'm currently in Mexico, performing at the Zacatecas International Folklore Festival. I have been in Mexico for just about two weeks now, first having a holiday in Mexico City and Oaxaca, and now playing guitar with the Finnish folk dance group Petkele at the festival. So, I haven't been paying much attention to the service.

But today I got an automatic alarm about the service being pretty slow, so I checked the systems and found that my recent changes to the status message view had introduced a memory leak in the backend system, and the master server was swapping heavily. This has caused some slowness and data loss over the past 4 or 5 days, as you can see in the statistics graphs. I restarted the data collector processes to reclaim the leaked memory, and will look into this in more detail once I'm back home. The leak is slow enough that the service should run just fine for about a week in the meantime.

I'll have to tune the alarm system to be a bit more sensitive to this kind of slowness. I think I forgot to set up the "too much swap IO" alarm, too.
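
For illustration, a "too much swap IO" check could look roughly like the sketch below. It is not the actual alarm code of the service: it assumes a Linux host, where /proc/vmstat exposes the cumulative pswpin and pswpout page counters, and the function names and the threshold of 100 pages per second are hypothetical values picked for the example.

```python
def parse_vmstat(text):
    """Extract the cumulative swap-in/swap-out page counters from /proc/vmstat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_pages_per_second(before, after, interval_seconds):
    """Average number of pages swapped in plus out per second between two samples."""
    delta = sum(after[key] - before[key] for key in ("pswpin", "pswpout"))
    return delta / interval_seconds


def swap_io_alarm(before, after, interval_seconds, threshold=100.0):
    """Return True when the swap rate exceeds the (hypothetical) threshold."""
    return swap_pages_per_second(before, after, interval_seconds) > threshold


def read_vmstat(path="/proc/vmstat"):
    """Take one sample of the swap counters (Linux only)."""
    with open(path) as handle:
        return parse_vmstat(handle.read())
```

In use, the monitor would call read_vmstat() twice, a fixed number of seconds apart, and pass both samples and the interval to swap_io_alarm(). Comparing a rate rather than total swap usage means the alarm fires while the machine is actively thrashing, not merely because some idle pages were swapped out long ago.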

