Why might I get ALL_BROKERS_DOWN at 10 second intervals on a consumer? #227

Closed
@johnrgregg3

I am running a consumer in a test lab, and the producer is on the same machine as my single Kafka broker. That is, it is a single-node cluster, with ZooKeeper running on the same machine as well. The consumer is on the same network, and I am sure there are no connectivity problems between consumer and broker. The consumer gets ALL_BROKERS_DOWN frequently, each time exactly 10, 20, or 30 seconds after the previous occurrence, and when I say exactly, I mean almost to the thousandth of a second. Here is an excerpt from my log file:

[2017-08-03 20:24:12,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:24:22,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:24:42,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:24:52,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:25:02,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:25:22,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)
[2017-08-03 20:25:32,652] [18018] [vocfe_notify_server.py:634] [DEBUG] vocfe_kafka_error(198.18.130.41): code -187, name _ALL_BROKERS_DOWN (former kafka_error_code -192)

There are occasional exceptions to this 10-second rule, but look at all the 652s in the lines above: those are the thousandths of a second, and they are identical on every line.
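To make the spacing concrete, the gaps between the errors can be computed directly from the timestamps in the excerpt above. This is only a minimal sketch using the Python standard library; the helper name `gaps_in_seconds` is made up for illustration and is not part of my application or of any Kafka client.

```python
from datetime import datetime

# Timestamps copied verbatim from the log excerpt above.
STAMPS = [
    "2017-08-03 20:24:12,652",
    "2017-08-03 20:24:22,652",
    "2017-08-03 20:24:42,652",
    "2017-08-03 20:24:52,652",
    "2017-08-03 20:25:02,652",
    "2017-08-03 20:25:22,652",
    "2017-08-03 20:25:32,652",
]


def gaps_in_seconds(stamps):
    """Return the time difference, in seconds, between consecutive timestamps."""
    # The logging format uses a comma before the milliseconds; %f accepts 1-6 digits.
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S,%f") for s in stamps]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(STAMPS)
print(gaps)
# True when every gap is a whole multiple of 10 seconds.
print(all(gap % 10 == 0 for gap in gaps))
```

For this excerpt every gap comes out as a whole multiple of 10 seconds, with no millisecond drift at all.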

I also get MSG_TIMED_OUT on my consumer, and I sometimes manage to receive actual messages between the ALL_BROKERS_DOWN errors. What could this mean? Is there some periodic heartbeat or handshake that takes place at 10-second intervals between the consumer and the broker/cluster? The consumer application itself is not doing anything every 10 seconds.

Labels: status:needs-more-info
