ActiveMQ: KahaDB Journal Files - More Than Just Message Content Bits
I recently came across an issue where the ActiveMQ KahaDB journal files
were continually rolling even though only a handful of small persistent
messages were occasionally being stored by the broker. This seemed very
strange, given that the messages being persisted were only a couple of
kilobytes each and there were relatively few messages actually on a
queue. Something was filling up the 32MB journal files, but I wasn't
sure what it could be. Were there other messages somewhere in the
broker? Had an index been corrupted, causing messages to be written
across multiple journal files? It was pretty strange behavior, but it
can be explained fairly easily. This post describes the actual cause,
and I have written it to remind myself in the future that there is more
in the journal file than just the message content bits.
The KahaDB journal files are used to store persistent messages that have
been sent to the broker. In addition to storing the message content,
the journal files also store information on KahaDB commands and
transactional information. There are several commands for which
information is stored: KahaAddMessageCommand, KahaCommitCommand,
KahaPrepareCommand, KahaProducerAuditCommand,
KahaRemoveDestinationCommand, KahaRemoveMessageCommand,
KahaRollbackCommand, KahaSubscriptionCommand, and KahaTraceCommand.
In this particular case, it was the KahaProducerAuditCommand which was
responsible for the behavior that was observed. This command stores
information about producer ids and message ids which is used for
duplicate detection. The information is kept in a map object which
grows over time. The map is then written to the journal file each time
a checkpoint is run, which by default is every 5 seconds. Over time,
this can begin to use up the space allocated to the journal file,
causing the low-volume, small messages to roll to the next journal
file, which in turn prevents the broker from cleaning up journal files
that still hold referenced messages. Eventually this situation can lead
to Producer Flow Control being triggered by the broker's store limit,
which prevents producers from sending new messages into the broker.
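To get a feel for how quickly this can happen, here is a
back-of-the-envelope sketch in Java. The 32MB journal size and the
5 second checkpoint interval are the figures mentioned in this post;
the serialized audit size (100KB here) is purely an assumed number for
illustration, since the real size depends on how many producer ids and
message ids are being tracked. The class and method names are made up
for this post and are not ActiveMQ APIs.

```java
public class JournalFillEstimate {

    // Number of checkpoints before audit records alone fill one journal file.
    public static long checkpointsToFill(long journalBytes, long auditBytes) {
        // Ceiling division: a record that only partially fits still forces a roll.
        return (journalBytes + auditBytes - 1) / auditBytes;
    }

    // Seconds until the journal file rolls, given the checkpoint interval.
    public static long secondsToFill(long journalBytes, long auditBytes, long checkpointSeconds) {
        return checkpointsToFill(journalBytes, auditBytes) * checkpointSeconds;
    }

    public static void main(String[] args) {
        long journalBytes = 32L * 1024 * 1024; // 32MB journal file (from the post)
        long auditBytes = 100L * 1024;         // ASSUMPTION: 100KB of audit data per checkpoint
        long checkpointSeconds = 5;            // checkpoint interval (from the post)

        long seconds = secondsToFill(journalBytes, auditBytes, checkpointSeconds);
        System.out.println("Journal file rolls roughly every " + (seconds / 60) + " minutes");
    }
}
```

With those assumed numbers a journal file would roll about every 27
minutes without a single message being sent. If one small, unconsumed
message happens to land in each of those files, every file stays
referenced and none of them can be cleaned up.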
This behavior can occur under the following conditions:
- Persistent messages are being sent to a queue
- The messages are not being consumed on a regular basis
- The rate of messages being sent to the broker is low
One way to avoid this is to turn off the producer audit by setting both
failoverProducersAuditDepth and maxFailoverProducersToTrack to 0 on the
KahaDB persistence adapter:
<persistenceAdapter>
  <kahaDB directory="${activemq.base}/data/kahadb" failoverProducersAuditDepth="0" maxFailoverProducersToTrack="0"/>
</persistenceAdapter>
This is something to think about when designing your system. Under
normal circumstances, if you have consumers available to consume the
persistent messages, this condition would probably never occur: as the
journal files roll and messages are consumed, the broker can begin to
clean up old journal files.
There is currently an enhancement request at Apache which will also
help resolve this issue. AMQ-3833 has been opened to enhance the broker
so it will only write the KahaProducerAuditCommand if a change has
occurred since the last checkpoint. This will help reduce the amount of
data that is written to the journal files in between message storage.
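The idea behind that enhancement can be sketched with a simple "dirty"
flag. The following Java is only a toy model of the approach described
in the request, not the actual ActiveMQ change: the class, its fields,
and its methods are all hypothetical, and the "journal write" is just a
counter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AuditCheckpointSketch {

    // Toy stand-in for the producer audit: producer id -> last seen sequence.
    private final Map<String, Long> audit = new LinkedHashMap<>();
    private boolean dirty = false;  // true when the audit changed since the last write
    private int journalWrites = 0;  // how many audit records were "written"

    // Called whenever a message from a producer is stored.
    public void recordMessage(String producerId, long sequence) {
        Long previous = audit.put(producerId, sequence);
        if (previous == null || previous != sequence) {
            dirty = true;
        }
    }

    // Called on every checkpoint; writes the audit only if it has changed.
    public boolean checkpoint() {
        if (!dirty) {
            return false; // nothing new, so skip the journal write
        }
        journalWrites++;  // stand-in for appending an audit record to the journal
        dirty = false;
        return true;
    }

    public int getJournalWrites() {
        return journalWrites;
    }

    public static void main(String[] args) {
        AuditCheckpointSketch broker = new AuditCheckpointSketch();
        broker.recordMessage("producer-1", 1L);
        for (int i = 0; i < 10; i++) {
            broker.checkpoint(); // only the first checkpoint writes anything
        }
        System.out.println("Audit records written: " + broker.getJournalWrites());
    }
}
```

In this model ten checkpoints after a single message produce one audit
write instead of ten, which is the kind of saving that matters when the
broker is mostly idle.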