How does Flink save Kafka offsets? (Big Data)

2 answers

王超

Reply #2 · 2020-07-24 10:12

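Flink integrates well with Kafka on this point. Kafka can maintain the offsets itself, and if checkpointing is enabled Flink also saves them in its checkpoints. Newer Kafka versions keep committed offsets in Kafka itself, while older versions store them in ZooKeeper.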

乔治与佩奇

Reply #3 · 2021-12-03 11:32

Blog Address:http://blog.csdn.net/jsjsjs1789 https://blog.csdn.net/jsjsjs1789/article/details/88956080

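Flink manages offsets in one of two ways:
1. Checkpointing disabled: offset handling relies entirely on Kafka's own API.
2. Checkpointing enabled: when a checkpoint completes, the offsets are committed to Kafka or ZooKeeper.
This post covers only the second case, checkpointing enabled.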
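
To make the two cases concrete, here is a minimal, self-contained sketch of how a commit mode could be derived from the job's settings. Only the name ON_CHECKPOINTS appears in the code quoted in this post; the other two mode names, the three boolean parameters, and the class name are assumptions for illustration, not the connector's source.

```java
// Simplified model of deriving an offset commit mode.
// Illustrative only: not copied from the Flink connector.
class OffsetCommitModeSketch {

    enum OffsetCommitMode { DISABLED, ON_CHECKPOINTS, KAFKA_PERIODIC }

    static OffsetCommitMode fromConfiguration(
            boolean enableAutoCommit,
            boolean enableCommitOnCheckpoint,
            boolean enableCheckpointing) {
        if (enableCheckpointing) {
            // Case 2: checkpointing is on, so commits are tied to completed checkpoints.
            return enableCommitOnCheckpoint
                    ? OffsetCommitMode.ON_CHECKPOINTS
                    : OffsetCommitMode.DISABLED;
        }
        // Case 1: checkpointing is off, so only Kafka's own periodic auto-commit remains.
        return enableAutoCommit
                ? OffsetCommitMode.KAFKA_PERIODIC
                : OffsetCommitMode.DISABLED;
    }

    public static void main(String[] args) {
        System.out.println(fromConfiguration(true, true, true));   // ON_CHECKPOINTS
        System.out.println(fromConfiguration(true, true, false));  // KAFKA_PERIODIC
        System.out.println(fromConfiguration(false, true, false)); // DISABLED
    }
}
```

In this model the Kafka auto-commit flag only matters when checkpointing is off, which matches the split described above.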

notifyCheckpointComplete in FlinkKafkaConsumerBase

 // invoked once a checkpoint has completed
 @Override
 public final void notifyCheckpointComplete(long checkpointId) throws Exception {
  if (!running) {
   LOG.debug("notifyCheckpointComplete() called on closed source");
   return;
  }

  final AbstractFetcher<?, ?> fetcher = this.kafkaFetcher;
  if (fetcher == null) {
   LOG.debug("notifyCheckpointComplete() called on uninitialized source");
   return;
  }

  if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
   // only one commit operation must be in progress
   if (LOG.isDebugEnabled()) {
    LOG.debug("Committing offsets to Kafka/ZooKeeper for checkpoint " + checkpointId);
   }

   try {
    final int posInMap = pendingOffsetsToCommit.indexOf(checkpointId);
    if (posInMap == -1) {
     LOG.warn("Received confirmation for unknown checkpoint id {}", checkpointId);
     return;
    }

    @SuppressWarnings("unchecked")
    Map<KafkaTopicPartition, Long> offsets =
     (Map<KafkaTopicPartition, Long>) pendingOffsetsToCommit.remove(posInMap);

    // remove older checkpoints in map
    for (int i = 0; i < posInMap; i++) {
     pendingOffsetsToCommit.remove(0);
    }

    if (offsets == null || offsets.size() == 0) {
     LOG.debug("Checkpoint state was empty.");
     return;
    }

    // commit the offsets through the kafkaFetcher
    fetcher.commitInternalOffsetsToKafka(offsets, offsetCommitCallback);
   } catch (Exception e) {
    if (running) {
     throw e;
    }
    // else ignore exception if we are no longer running
   }
  }
 }
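
The bookkeeping around pendingOffsetsToCommit can be modelled in a few lines. The sketch below is a stand-alone approximation: the snapshot method, the use of a LinkedHashMap, and the string partition keys such as "topic-0" are assumptions made for illustration, not the connector's actual types.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model of "remember offsets per checkpoint, commit on confirmation".
class PendingOffsetsSketch {

    // checkpoint id -> offsets captured when that checkpoint was taken (insertion order)
    private final LinkedHashMap<Long, Map<String, Long>> pending = new LinkedHashMap<>();

    // Hypothetical counterpart of the snapshot step: record offsets under the checkpoint id.
    void snapshot(long checkpointId, Map<String, Long> offsets) {
        pending.put(checkpointId, offsets);
    }

    // Returns the offsets to commit, or null if the checkpoint id is unknown.
    Map<String, Long> notifyCheckpointComplete(long checkpointId) {
        if (!pending.containsKey(checkpointId)) {
            return null; // corresponds to the "unknown checkpoint id" warning
        }
        Iterator<Map.Entry<Long, Map<String, Long>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Map<String, Long>> entry = it.next();
            long id = entry.getKey();
            Map<String, Long> offsets = entry.getValue();
            it.remove(); // drops older checkpoints and, finally, the confirmed one
            if (id == checkpointId) {
                return offsets;
            }
        }
        return null; // not reached, because the id was found above
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingOffsetsSketch s = new PendingOffsetsSketch();
        s.snapshot(1L, Collections.singletonMap("topic-0", 10L));
        s.snapshot(2L, Collections.singletonMap("topic-0", 20L));
        s.snapshot(3L, Collections.singletonMap("topic-0", 30L));
        System.out.println(s.notifyCheckpointComplete(2L)); // {topic-0=20}
        System.out.println(s.pendingCount());               // 1
        System.out.println(s.notifyCheckpointComplete(1L)); // null
    }
}
```

Confirming checkpoint 2 also discards checkpoint 1, so a late confirmation for checkpoint 1 finds nothing to commit.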

Next, jump to KafkaFetcher

 @Override
 protected void doCommitInternalOffsetsToKafka(
   Map<KafkaTopicPartition, Long> offsets,
   @Nonnull KafkaCommitCallback commitCallback) throws Exception {

  @SuppressWarnings("unchecked")
  List<KafkaTopicPartitionState<TopicPartition>> partitions = subscribedPartitionStates();

  Map<TopicPartition, OffsetAndMetadata> offsetsToCommit = new HashMap<>(partitions.size());

  for (KafkaTopicPartitionState<TopicPartition> partition : partitions) {
   Long lastProcessedOffset = offsets.get(partition.getKafkaTopicPartition());
   if (lastProcessedOffset != null) {
    checkState(lastProcessedOffset >= 0, "Illegal offset value to commit");

    // committed offsets through the KafkaConsumer need to be 1 more than the last processed offset.
    // This does not affect Flink's checkpoints/saved state.
    long offsetToCommit = lastProcessedOffset + 1;

    offsetsToCommit.put(partition.getKafkaPartitionHandle(), new OffsetAndMetadata(offsetToCommit));
    partition.setCommittedOffset(offsetToCommit);
   }
  }

  // record the work to be committed by the main consumer thread and make sure the consumer notices that
  consumerThread.setOffsetsToCommit(offsetsToCommit, commitCallback);
 }
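
The "+1" in the method above exists because a committed Kafka offset names the next record to read, not the last one processed. A minimal sketch of just that conversion, with a hypothetical class name and string partition keys standing in for the real partition types:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative conversion from "last processed offset" to "offset to commit".
class CommitOffsetSketch {

    static Map<String, Long> toKafkaCommitOffsets(Map<String, Long> lastProcessed) {
        Map<String, Long> toCommit = new HashMap<>(lastProcessed.size());
        for (Map.Entry<String, Long> entry : lastProcessed.entrySet()) {
            Long last = entry.getValue();
            if (last == null) {
                continue; // no offset recorded for this partition, nothing to commit
            }
            if (last < 0) {
                throw new IllegalStateException("Illegal offset value to commit");
            }
            // Kafka expects the position of the next record, hence last processed + 1.
            toCommit.put(entry.getKey(), last + 1);
        }
        return toCommit;
    }

    public static void main(String[] args) {
        Map<String, Long> lastProcessed = new HashMap<>();
        lastProcessed.put("orders-0", 41L);
        lastProcessed.put("orders-1", null);
        System.out.println(toKafkaCommitOffsets(lastProcessed)); // {orders-0=42}
    }
}
```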

As shown above, this calls the consumerThread.setOffsetsToCommit method:

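 void setOffsetsToCommit(
   Map<TopicPartition, OffsetAndMetadata> offsetsToCommit,
   @Nonnull KafkaCommitCallback commitCallback) {

  // record the work to be committed by the main consumer thread and make sure the consumer notices that
  /*
   A non-null previous value means KafkaConsumerThread is committing too slowly;
   the new offsets overwrite the old ones.
   After this runs, KafkaConsumerThread picks the value up and calls consumer.commitAsync().

   This is the key step: it assigns nextOffsetsToCommit directly.
   nextOffsetsToCommit is an AtomicReference, so the object reference is updated atomically.
   */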
 
  if (nextOffsetsToCommit.getAndSet(Tuple2.of(offsetsToCommit, commitCallback)) != null) {
   log.warn("Committing offsets to Kafka takes longer than the checkpoint interval. " +
     "Skipping commit of previous offsets because newer complete checkpoint offsets are available. " +
     "This does not compromise Flink's checkpoint integrity.");
  }

  // if the consumer is blocked in a poll() or handover operation, wake it up to commit soon
  handover.wakeupProducer();

  synchronized (consumerReassignmentLock) {
   if (consumer != null) {
    consumer.wakeup();
   } else {
    // the consumer is currently isolated for partition reassignment;
    // set this flag so that the wakeup state is restored once the reassignment is complete
    hasBufferedWakeup = true;
   }
  }
 }
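
The getAndSet handoff is easier to see in isolation. The following sketch keeps only that part; a single Long stands in for the offsets-plus-callback tuple, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal model of a single-slot handoff between the checkpoint side and the consumer thread.
class CommitHandoffSketch {

    private final AtomicReference<Long> nextOffsetToCommit = new AtomicReference<>();

    // Checkpoint side: publish the newest offset.
    // Returns true if an older, not yet committed value was overwritten.
    boolean setOffsetToCommit(long offset) {
        return nextOffsetToCommit.getAndSet(offset) != null;
    }

    // Consumer side: take the pending offset and clear the slot in one atomic step.
    Long pollOffsetToCommit() {
        return nextOffsetToCommit.getAndSet(null);
    }

    public static void main(String[] args) {
        CommitHandoffSketch h = new CommitHandoffSketch();
        System.out.println(h.setOffsetToCommit(100L)); // false
        System.out.println(h.setOffsetToCommit(200L)); // true
        System.out.println(h.pollOffsetToCommit());    // 200
        System.out.println(h.pollOffsetToCommit());    // null
    }
}
```

Skipping the overwritten value is harmless here because a later checkpoint's offsets supersede the earlier ones, which is what the warning message in the original method states.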

nextOffsetsToCommit now holds a value, so next let's look at the run method of KafkaConsumerThread

 @Override
 public void run() {
  // early exit check
  if (!running) {
   return;
  }

  ......
   // main fetch loop
   while (running) {

    // check if there is something to commit (commitInProgress defaults to false)
    if (!commitInProgress) {
     // get and reset the work-to-be committed, so we don't repeatedly commit the same
     // setOffsetsToCommit has already assigned nextOffsetsToCommit, so the value
     // fetched here (commitOffsetsAndCallback) is not null
     final Tuple2<Map<TopicPartition, OffsetAndMetadata>, KafkaCommitCallback> commitOffsetsAndCallback =
       nextOffsetsToCommit.getAndSet(null);

     if (commitOffsetsAndCallback != null) {
      log.debug("Sending async offset commit request to Kafka broker");

      // also record that a commit is already in progress
      // the order here matters! first set the flag, then send the commit command.
      commitInProgress = true;
      consumer.commitAsync(commitOffsetsAndCallback.f0, new CommitCallback(commitOffsetsAndCallback.f1));
     }
    }

    ....
 }
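
The role of the commitInProgress flag in the loop can be sketched as follows. This is a single-threaded model under stated assumptions: the sent list stands in for consumer.commitAsync, and onCommitComplete stands in for the commit callback, which is assumed here to clear the flag (the excerpt above does not show the callback body).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Single-threaded model of "at most one commit in flight at a time".
class CommitLoopSketch {

    private final AtomicReference<Long> nextOffsetToCommit = new AtomicReference<>();
    private boolean commitInProgress = false;
    final List<Long> sent = new ArrayList<>(); // offsets handed to the (simulated) async commit

    void request(long offset) {
        nextOffsetToCommit.set(offset);
    }

    // One pass of the commit check inside the fetch loop.
    void loopIteration() {
        if (!commitInProgress) {
            Long offset = nextOffsetToCommit.getAndSet(null);
            if (offset != null) {
                commitInProgress = true; // set the flag first, then send
                sent.add(offset);        // placeholder for the asynchronous commit call
            }
        }
    }

    // Assumed callback behaviour: the commit finished, so the next one may be sent.
    void onCommitComplete() {
        commitInProgress = false;
    }

    public static void main(String[] args) {
        CommitLoopSketch loop = new CommitLoopSketch();
        loop.request(10L);
        loop.loopIteration();     // sends 10
        loop.request(20L);
        loop.loopIteration();     // skipped: a commit is still in flight
        loop.onCommitComplete();
        loop.loopIteration();     // sends 20
        System.out.println(loop.sent); // [10, 20]
    }
}
```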

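At this point the offset update is complete. We can see clearly that when a checkpoint completes, the relevant commit method is called and the Kafka offsets are committed to the Kafka broker.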
