Wednesday, April 8, 2009

Log4erl issue with high load

Hi,

I've received a report on the Log4erl Google group mailing list about an "out of memory" crash in log4erl. This is an issue I've been aware of for some time now. It's actually a downside of using a concurrent language where you can spawn lots of processes; a similar problem can happen with error_logger.

The problem is that many processes may send log requests to a logger that has to perform IO for each of them. Since IO is slow, requests from many processes accumulate in the logger's mailbox, eventually overwhelming it and causing the "out of memory" crash.
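You can actually watch this happen from the shell by checking the logger's mailbox length while producers hammer it. Here my_logger is a hypothetical registered name standing in for the actual logger process:

    %% message_queue_len grows without bound while the logger falls
    %% behind on IO; my_logger is an assumed registered name.
    {message_queue_len, Len} =
        erlang:process_info(whereis(my_logger), message_queue_len).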

In log4erl's case, and presumably error_logger's, log requests are asynchronous. Basically, the log_manager module uses gen_event:notify/2 to dispatch requests to the appropriate logger, and this operation is non-blocking.
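To make the mechanics concrete, here is a minimal sketch of dispatching through gen_event. The module name console_appender and the {log, Level, Msg} event shape are illustrative assumptions, not log4erl's actual internals:

    %% Minimal gen_event handler sketch; console_appender and the
    %% {log, Level, Msg} event shape are illustrative only.
    -module(console_appender).
    -behaviour(gen_event).
    -export([init/1, handle_event/2, handle_call/2,
             handle_info/2, terminate/2, code_change/3]).

    init(_Args) -> {ok, []}.

    %% The slow IO happens here, inside the event manager's process,
    %% while callers keep piling new events into its mailbox.
    handle_event({log, Level, Msg}, State) ->
        io:format("[~p] ~s~n", [Level, Msg]),
        {ok, State}.

    handle_call(_Request, State) -> {ok, ok, State}.
    handle_info(_Info, State) -> {ok, State}.
    terminate(_Reason, _State) -> ok.
    code_change(_OldVsn, State, _Extra) -> {ok, State}.

Used like this, notify/2 returns immediately no matter how far behind the handler is:

    {ok, _Pid} = gen_event:start_link({local, my_logger}),
    ok = gen_event:add_handler(my_logger, console_appender, []),
    ok = gen_event:notify(my_logger, {log, info, "hello"}).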

You can create multiple loggers, or raise the log level to 'warn' or 'fatal', for example, but this will not really solve the problem. Another option is to change notify/2 to sync_notify/2, but that will slow logging down. Also, I think this is the easy way out rather than the right way to proceed.
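The difference is a single call site (same assumed my_logger manager as above):

    %% Async: returns immediately; the mailbox can grow without bound.
    ok = gen_event:notify(my_logger, {log, info, "hello"}),
    %% Sync: blocks until every handler has processed the event, so
    %% producers can never outrun the logger.
    ok = gen_event:sync_notify(my_logger, {log, info, "hello"}).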

I think the best solution is to use a persistent queue to offload requests from the logger's mailbox. So far, I can only think of using something like memcacheq, so that log_manager saves requests into a queue and loggers fetch them from there. I'm still thinking about this and haven't made up my mind yet. Implementation will not be easy, as we have to take many cases into account (e.g. when the queue is empty). If you have any suggestions on how best to do this, please let me know.
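Just to illustrate the shape of the hand-off, here is a simplified in-memory stand-in. The module name log_queue is hypothetical; it is not memcacheq and not part of log4erl, and a real version would persist requests to disk or an external store instead of holding them in a process:

    -module(log_queue).
    -export([start/0, put/2, fetch/1, loop/1]).

    start() -> spawn(?MODULE, loop, [queue:new()]).

    %% Producers hand requests off here instead of to the logger's
    %% mailbox.
    put(Q, Req) -> Q ! {put, Req}, ok.

    %% The logger pulls at its own pace; blocking in fetch/1 covers
    %% the empty-queue case.
    fetch(Q) ->
        Q ! {fetch, self()},
        receive {log_queue, Req} -> Req end.

    loop(Queue0) ->
        receive
            {put, Req} ->
                loop(queue:in(Req, Queue0));
            {fetch, From} ->
                case queue:out(Queue0) of
                    {{value, Req}, Queue} ->
                        From ! {log_queue, Req},
                        loop(Queue);
                    {empty, _} ->
                        %% Nothing queued yet: wait for the next put,
                        %% answer it directly, and carry on.
                        receive
                            {put, Req} ->
                                From ! {log_queue, Req},
                                loop(Queue0)
                        end
                end
        end.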

Best regards,

Ahmed

Update 01-May-2009:
It turned out that this is mainly because selective receive in Erlang is very slow when there are many messages in the process mailbox. Synchronous logging keeps the mailbox short, since callers block until each event is handled, which is why it actually performs better. I'll change the default behavior in the next version of log4erl from async to sync event notification. This should fix the Erlang crash reported in the log4erl Google group.
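A quick way to see the effect: fill a mailbox with noise, then time a selective receive that has to scan past all of it. This is a self-contained demo, not log4erl code:

    -module(mailbox_demo).
    -export([run/1]).

    %% Fill our own mailbox with N unrelated messages, then time a
    %% selective receive that must scan past all of them before it
    %% can match the message it wants. Returns microseconds.
    run(N) ->
        Self = self(),
        [Self ! {noise, I} || I <- lists:seq(1, N)],
        Self ! wanted,
        {Micros, ok} = timer:tc(fun() ->
                                    receive wanted -> ok end
                                end),
        flush(),
        Micros.

    %% Drain the leftover noise so repeated runs start clean.
    flush() ->
        receive _ -> flush() after 0 -> ok end.

Doubling N roughly doubles the scan time, so a logger whose mailbox has already ballooned gets progressively slower at exactly the moment it needs to catch up.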
