Kafka Consumer Hangs on poll(), Possible Deadlock in Library? #2459

Closed
@kietheros

Description

I am using a Kafka consumer with manual commit. My application has been running for almost a year, but it recently started hanging in the consumer.poll() call.

When I dump the stack traces of the threads, I see the following two threads, both blocked while acquiring locks. This looks like it might be a bug in the library that causes a deadlock. The Kafka broker appears to be working normally and reports no errors.
I have also read [this issue](#1764), but that issue is related to the Kafka broker, whereas my Kafka setup is functioning correctly.

Has anyone encountered a similar issue or found a solution for this?

<Thread(Thread-36 (_run_auto_restart), started 124923968227008)>
....................................................................
    results = self.consumer.poll(
  File "/usr/local/lib/python3.10/site-packages/kafka/consumer/group.py", line 655, in poll
    records = self._poll_once(remaining, max_records, update_offsets=update_offsets)
  File "/usr/local/lib/python3.10/site-packages/kafka/consumer/group.py", line 675, in _poll_once
    self._coordinator.poll()
  File "/usr/local/lib/python3.10/site-packages/kafka/coordinator/consumer.py", line 270, in poll
    self.ensure_coordinator_ready()
  File "/usr/local/lib/python3.10/site-packages/kafka/coordinator/base.py", line 245, in ensure_coordinator_ready
    with self._client._lock, self._lock:
  File "/usr/local/lib/python3.10/threading.py", line 265, in __enter__
    return self._lock.__enter__()  <<<========= LOCK


<HeartbeatThread(post-processor-heartbeat, started daemon 124923819329216)>
  File "/usr/local/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/site-packages/kafka/coordinator/base.py", line 935, in run
    self._run_once()
  File "/usr/local/lib/python3.10/site-packages/kafka/coordinator/base.py", line 993, in _run_once
    self.coordinator.maybe_leave_group()
  File "/usr/local/lib/python3.10/site-packages/kafka/coordinator/base.py", line 766, in maybe_leave_group
    with self._client._lock, self._lock:   <<<======= LOCK
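
For anyone who wants to capture a comparable dump from their own process, here is a minimal sketch using only the Python standard library. It is not necessarily how the traces above were produced, and the helper name `format_thread_stacks` is made up for illustration. It walks `sys._current_frames()` and formats the current stack of every live thread.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current stack of every live thread.

    Hypothetical helper for illustration; not part of any Kafka library.
    """
    # Map thread ident -> Thread object so the dump can show thread names.
    threads_by_id = {t.ident: t for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        thread = threads_by_id.get(ident)
        name = thread.name if thread is not None else "unknown"
        chunks.append(f"<Thread {name!r} ident={ident}>")
        # format_stack returns one string per frame, oldest call first.
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(format_thread_stacks())
```

Calling this from a separate watchdog thread or a signal handler should show where each thread is blocked, since the stuck thread itself cannot run it. The standard-library `faulthandler` module offers a similar dump via `faulthandler.dump_traceback(all_threads=True)`.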
