国产av日韩一区二区三区精品,成人性爱视频在线观看,国产,欧美,日韩,一区,www.成色av久久成人,2222eeee成人天堂

Home Java javaTutorial Belay the Metamorphosis: analyzing Kafka project

Belay the Metamorphosis: analyzing Kafka project

Oct 16, 2024 pm 08:09 PM

您有沒有想過跨國公司的項目源代碼中可能潛藏著哪些錯誤?不要錯過在開源 Apache Kafka 項目中發(fā)現(xiàn) PVS-Studio 靜態(tài)分析器檢測到的有趣錯誤的機會。

Belay the Metamorphosis: analyzing Kafka project

介紹

Apache Kafka 是一個著名的開源項目,主要用 Java 編寫。 LinkedIn 于 2011 年將其開發(fā)為消息代理,即各種系統(tǒng)組件的數(shù)據(jù)管道。如今,它已成為同類產(chǎn)品中最受歡迎的解決方案之一。

準(zhǔn)備好看看引擎蓋下的內(nèi)容了嗎?

附注
只是想簡單說明一下標(biāo)題。它參考了弗朗茨·卡夫卡的《變形記》,其中主角變成了可怕的害蟲。我們的靜態(tài)分析器致力于防止您的項目變身為可怕的害蟲轉(zhuǎn)變?yōu)橐粋€巨大的錯誤,所以對“變形記”說不。

哦不,蟲子

所有的幽默都源于痛苦

這不是我的話;這句話出自理查德·普賴爾之口。但這有什么關(guān)系呢?我想告訴你的第一件事是一個愚蠢的錯誤。然而,在多次嘗試?yán)斫獬绦驘o法正常運行的原因后,遇到如下示例的情況令人沮喪:

@Override
public KeyValueIterator<Windowed<K>, V> backwardFetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  if (keyFrom == null && keyFrom == null) {   // <=
    kvSubMap = kvMap;
  } else if (keyFrom == null) {
    kvSubMap = kvMap.headMap(keyTo, true);
  } else if (keyTo == null) {
    kvSubMap = kvMap.tailMap(keyFrom, true);
  } else {
    // keyFrom != null and KeyTo != null 
    kvSubMap = kvMap.subMap(keyFrom, true, keyTo, true);
  } 
  ....
}

如您所見,這是任何開發(fā)人員都無法避免的事情——一個微不足道的拼寫錯誤。在第一個條件下,開發(fā)人員希望使用以下邏輯表達(dá)式:

keyFrom == null && keyTo == null

分析器發(fā)出兩個警告:

V6001 在“&&”運算符的左側(cè)和右側(cè)有相同的子表達(dá)式“keyFrom == null”。 ReadOnlyWindowStoreStub.java 327、ReadOnlyWindowStoreStub.java 327

V6007 表達(dá)式“keyFrom == null”始終為 false。 ReadOnlyWindowStoreStub.java 329

我們可以明白為什么。對于每個開發(fā)人員來說,這種可笑的打字錯誤都是永恒的。雖然我們可以花很多時間尋找它們,但要回憶起它們潛伏的地方可不是小菜一碟。

在同一個類中,另一個方法中存在完全相同的錯誤。我認(rèn)為稱其為復(fù)制面食是公平的。

@Override
public KeyValueIterator<Windowed<K>, V> fetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  NavigableMap<K, V> kvMap = data.get(now);
  if (kvMap != null) {
    NavigableMap<K, V> kvSubMap;
    if (keyFrom == null && keyFrom == null) {      // <=
      kvSubMap = kvMap;
    } else if (keyFrom == null) {
      kvSubMap = kvMap.headMap(keyTo, true);
    } else if (keyTo == null) {
      kvSubMap = kvMap.tailMap(keyFrom, true);
    } else {
      // keyFrom != null and KeyTo != null
      kvSubMap = kvMap.subMap(keyFrom, true, keyTo, true);
    }
  }
  ....
}

以下是相同的警告:

V6007 表達(dá)式“keyFrom == null”始終為 false。 ReadOnlyWindowStoreStub.java 273

V6001 在“&&”運算符的左側(cè)和右側(cè)有相同的子表達(dá)式“keyFrom == null”。 ReadOnlyWindowStoreStub.java 271, ReadOnlyWindowStoreStub.java 271

不用擔(dān)心——我們不必一次查看數(shù)百行代碼。 PVS-Studio 非常擅長處理此類簡單的事情。解決一些更具挑戰(zhàn)性的事情怎么樣?

可變同步

Java 中 synchronized 關(guān)鍵字的用途是什么?在這里,我將只關(guān)注同步方法,而不是塊。根據(jù) Oracle 文檔,synchronized 關(guān)鍵字將方法聲明為同步,以確保與實例的線程安全交互。如果一個線程調(diào)用該實例的同步方法,則嘗試調(diào)用同一實例的同步方法的其他線程將被阻塞(即它們的執(zhí)行將被掛起)。它們將被阻塞,直到第一個線程調(diào)用的方法處理其執(zhí)行。當(dāng)實例對多個線程可見時,需要執(zhí)行此操作。此類實例的讀/寫操作只能通過同步方法執(zhí)行。

開發(fā)人員違反了 Sensor 類中的規(guī)則,如下面的簡化代碼片段所示。對實例字段的讀/寫操作可以通過同步和非同步兩種方式執(zhí)行。它可能會導(dǎo)致競爭條件并使輸出不可預(yù)測。

private final Map<MetricName, KafkaMetric> metrics;

public void checkQuotas(long timeMs) {                  // <=
  for (KafkaMetric metric : this.metrics.values()) {
    MetricConfig config = metric.config();
    if (config != null) {
      ....
    }
  }
  ....
}  

public synchronized boolean add(CompoundStat stat,      // <=
                                MetricConfig config) {       
  ....
  if (!metrics.containsKey(metric.metricName())) {         
    metrics.put(metric.metricName(), metric);            
  }  
  ....
}  

public synchronized boolean add(MetricName metricName,  // <=
                                MeasurableStat stat, 
                                MetricConfig config) {  
  if (hasExpired()) {
    return false;
  } else if (metrics.containsKey(metricName)) {
    return true;
  } else {
    ....
    metrics.put(metric.metricName(), metric);
    return true;
  }
}

分析器警告如下所示:

V6102 “metrics”字段同步不一致。考慮在所有用途上同步該字段。傳感器.java 49,傳感器.java 254

如果不同的線程可以同時更改實例狀態(tài),則允許此操作的方法應(yīng)該同步。如果程序沒有預(yù)料到多個線程可以與實例交互,則使其方法同步是沒有意義的。最壞的情況下,甚至?xí)p害程序性能。

程序中有很多這樣的錯誤。這是分析器發(fā)出警告的類似代碼片段:

private final PrefixKeyFormatter prefixKeyFormatter; 

@Override
public synchronized void destroy() {                // <=
  ....
  Bytes keyPrefix = prefixKeyFormatter.getPrefix();
  ....
}

@Override
public void addToBatch(....) {                      // <=
  physicalStore.addToBatch(
    new KeyValue<>(
    prefixKeyFormatter.addPrefix(record.key),
    record.value
    ), batch
  );
} 

@Override
public synchronized void deleteRange(....) {        // <=
  physicalStore.deleteRange(
    prefixKeyFormatter.addPrefix(keyFrom),
    prefixKeyFormatter.addPrefix(keyTo)
  );
}

@Override
public synchronized void put(....) {                // <=
  physicalStore.put(
    prefixKeyFormatter.addPrefix(key),
    value
  );
}

分析器警告:

V6102 “prefixKeyFormatter”字段同步不一致。考慮在所有用途上同步該字段。 LogicalKeyValueSegment.java 60、LogicalKeyValueSegment.java 247

Iterator, iterator, and iterator again...

In the example, there are two rather unpleasant errors within one line at once. I'll explain their nature within the part of the article. Here's a code snippet:

private final Map<String, Uuid> topicIds = new HashMap(); 

private Map<String, KafkaFutureVoid> handleDeleteTopicsUsingNames(....) { 
  ....
  Collection<String> topicNames = new ArrayList<>(topicNameCollection);

  for (final String topicName : topicNames) {
    KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

    if (allTopics.remove(topicName) == null) {
      ....
    } else {
      topicNames.remove(topicIds.remove(topicName));      // <=
      future.complete(null);
    }
    ....
  }
}

That's what the analyzer shows us:

V6066 The type of object passed as argument is incompatible with the type of collection: String, Uuid. MockAdminClient.java 569

V6053 The 'topicNames' collection of 'ArrayList' type is modified while iteration is in progress. ConcurrentModificationException may occur. MockAdminClient.java 569

Now that's a big dilemma! What's going on here, and how should we address it?!

First, let's talk about collections and generics. Using the generic types of collections helps us avoid ClassCastExceptions and cumbersome constructs where we convert types.

If we specify a certain data type when initializing a collection and add an incompatible type, the compiler won't compile the code.

Here's an example:

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("str");
    set.add(UUID.randomUUID()); // java.util.UUID cannot be converted to
                                // java.lang.String
  }
}

However, if we delete an incompatible type from our Set, no exception will be thrown. The method returns false.

Here's an example:

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("abc");
    set.add("def");
    System.out.println(set.remove(new Integer(13))); // false
  }
}

It's a waste of time. Most likely, if we encounter something like this in the code, this is an error. I suggest you go back to the code at the beginning of this subchapter and try to spot a similar case.

Second, let's talk about the Iterator. We can talk about iterating through collections for a long time. I don't want to bore you or digress from the main topic, so I'll just cover the key points to ensure we understand why we get the warning.

So, how do we iterate through the collection here? Here is what the for loop in the code fragment looks like:

for (Type collectionElem : collection) {
  ....
}

The for loop entry is just syntactic sugar. The construction is equivalent to this one:

for (Iterator<Type> iter = collection.iterator(); iter.hasNext();) {
  Type collectionElem = iter.next();
  ....
}

We're basically working with the collection iterator. All right, that's sorted! Now, let's discuss ConcurrentModificationException.

ConcurrentModificationException is an exception that covers a range of situations both in single-threaded and multi-threaded programs. Here, we're focusing on single-threading. We can find an explanation quite easily. Let's take a peek at the Oracle docs: a method can throw the exception when it detects parallel modification of an object that doesn't support it. In our case, while the iterator is running, we delete objects from the collection. This may cause the iterator to throw a ConcurrentModificationException.

How does the iterator know when to throw the exception? If we look at the ArrayList collection, we see that its parent, AbstactList, has the modCount field that stores the number of modifications to the collection:

protected transient int modCount = 0;

Here are some usages of the modCount counter in the ArrayList class:

public boolean add(E e) {
  modCount++;
  add(e, elementData, size);
  return true;
}

private void fastRemove(Object[] es, int i) {
  modCount++;
  final int newSize;
  if ((newSize = size - 1) > i)
    System.arraycopy(es, i + 1, es, i, newSize - i);
  es[size = newSize] = null;
}

So, the counter is incremented each time when the collection is modified.

Btw, the fastRemove method is used in the remove method, which we use inside the loop.

Here's the small code fragment of the ArrayList iterator inner workings:

private class Itr implements Iterator<E> {
  ....
  int expectedModCount = modCount;            

  final void checkForComodification() {
  if (modCount != expectedModCount)               // <=
    throw new ConcurrentModificationException();
  }

  public E next() {
    checkForComodification();              
    ....
  }

  public void remove() {
    ....
    checkForComodification();             

    try {
      ArrayList.this.remove(lastRet);   
      ....
      expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
      throw new ConcurrentModificationException();
    }
  }
  ....
  public void add(E e) {
    checkForComodification();            
    try {
      ....
      ArrayList.this.add(i, e);        
      ....
      expectedModCount = modCount;     
    } catch (IndexOutOfBoundsException ex) {
      throw new ConcurrentModificationException();
    }
  }
}

Let me explain that last fragment. If the collection modifications don't match the expected number of modifications (which is the sum of the initial modifications before the iterator was created and the number of the iterator operations), a ConcurrentModificationException is thrown. That's only possible when we modify the collection using its methods while iterating over it (i.e. in parallel with the iterator). That's what the second warning is about.

So, I've explained you the analyzer messages. Now let's put it all together:

We attempt to delete an element from the collection when the Iterator is still running:

topicNames.remove(topicIds.remove(topicName)); 
// topicsNames – Collection<String>
// topicsIds – Map<String, UUID>

However, since the incompatible element is passed to ArrayList for deletion (the remove method returns a UUID object from topicIds), the modification count won't increase, but the object won't be deleted. Simply put, that code section is rudimentary.

I'd venture to guess that the developer's intent is clear. If that's the case, one way to fix these two warnings could be as follows:

Collection<String> topicNames = new ArrayList<>(topicNameCollection);

List<String> removableItems = new ArrayList<>();

for (final String topicName : topicNames) {
  KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

  if (allTopics.remove(topicName) == null) {
    ....
  } else {
    topicIds.remove(topicName);
    removableItems.add(topicName);
    future.complete(null);
  }
  ....
}
topicNames.removeAll(removableItems);

Void, sweet void

Where would we go without our all-time favorite null and its potential problems, right? Let me show you the code fragment for which the analyzer issued the following warning:

V6008 Potential null dereference of 'oldMember' in function 'removeStaticMember'. ConsumerGroup.java 311, ConsumerGroup.java 323

@Override
public void removeMember(String memberId) {
  ConsumerGroupMember oldMember = members.remove(memberId);
  ....
  removeStaticMember(oldMember);
  ....
}

private void removeStaticMember(ConsumerGroupMember oldMember) {
  if (oldMember.instanceId() != null) {
    staticMembers.remove(oldMember.instanceId());
  }
}

If members doesn't contain an object with the memberId key, oldMember will be null. It can lead to a NullPointerException in the removeStaticMember method.

Boom! The parameter is checked for null:

if (oldMember != null && oldMember.instanceId() != null) {

The next error will be the last one in the article—I'd like to wrap things up on a positive note. The code below—as well as the one at the beginning of this article—has a common and silly typo. However, it can certainly lead to unpleasant consequences.

Let's take a look at this code fragment:

protected SchemaAndValue roundTrip(...., SchemaAndValue input) {
  String serialized = Values.convertToString(input.schema(),
                                             input.value());

  if (input != null && input.value() != null) {   
    ....
  }
  ....
}

Yeah, that's right. The method actually accesses the input object first, and then checks whether it's referencing null.

V6060 The 'input' reference was utilized before it was verified against null. ValuesTest.java 1212, ValuesTest.java 1213

Again, I'll note that such typos are ok. However, they can lead to some pretty nasty results. It's tough and inefficient to search for these things in the code manually.

Conclusion

In sum, I'd like to circle back to the previous point. Manually searching through the code for all these errors is a very time-consuming and tedious task. It's not unusual for issues like the ones I've shown to lurk in code for a long time. The last bug dates back to 2018. That's why it's a good idea to use static analysis tools. If you'd like to know more about PVS-Studio, the tool we have used to detect all those errors, you can find out more here.

That's all. Let's wrap things up here. "Oh, and in case I don't see ya, good afternoon, good evening, and good night."

Belay the Metamorphosis: analyzing Kafka project

I almost forgot! Catch a link to learn more about a free license for open-source projects.

The above is the detailed content of Belay the Metamorphosis: analyzing Kafka project. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress AI Tool

Undress images for free

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Difference between HashMap and Hashtable? Difference between HashMap and Hashtable? Jun 24, 2025 pm 09:41 PM

The difference between HashMap and Hashtable is mainly reflected in thread safety, null value support and performance. 1. In terms of thread safety, Hashtable is thread-safe, and its methods are mostly synchronous methods, while HashMap does not perform synchronization processing, which is not thread-safe; 2. In terms of null value support, HashMap allows one null key and multiple null values, while Hashtable does not allow null keys or values, otherwise a NullPointerException will be thrown; 3. In terms of performance, HashMap is more efficient because there is no synchronization mechanism, and Hashtable has a low locking performance for each operation. It is recommended to use ConcurrentHashMap instead.

Why do we need wrapper classes? Why do we need wrapper classes? Jun 28, 2025 am 01:01 AM

Java uses wrapper classes because basic data types cannot directly participate in object-oriented operations, and object forms are often required in actual needs; 1. Collection classes can only store objects, such as Lists use automatic boxing to store numerical values; 2. Generics do not support basic types, and packaging classes must be used as type parameters; 3. Packaging classes can represent null values ??to distinguish unset or missing data; 4. Packaging classes provide practical methods such as string conversion to facilitate data parsing and processing, so in scenarios where these characteristics are needed, packaging classes are indispensable.

What are static methods in interfaces? What are static methods in interfaces? Jun 24, 2025 pm 10:57 PM

StaticmethodsininterfaceswereintroducedinJava8toallowutilityfunctionswithintheinterfaceitself.BeforeJava8,suchfunctionsrequiredseparatehelperclasses,leadingtodisorganizedcode.Now,staticmethodsprovidethreekeybenefits:1)theyenableutilitymethodsdirectly

How does JIT compiler optimize code? How does JIT compiler optimize code? Jun 24, 2025 pm 10:45 PM

The JIT compiler optimizes code through four methods: method inline, hot spot detection and compilation, type speculation and devirtualization, and redundant operation elimination. 1. Method inline reduces call overhead and inserts frequently called small methods directly into the call; 2. Hot spot detection and high-frequency code execution and centrally optimize it to save resources; 3. Type speculation collects runtime type information to achieve devirtualization calls, improving efficiency; 4. Redundant operations eliminate useless calculations and inspections based on operational data deletion, enhancing performance.

What is an instance initializer block? What is an instance initializer block? Jun 25, 2025 pm 12:21 PM

Instance initialization blocks are used in Java to run initialization logic when creating objects, which are executed before the constructor. It is suitable for scenarios where multiple constructors share initialization code, complex field initialization, or anonymous class initialization scenarios. Unlike static initialization blocks, it is executed every time it is instantiated, while static initialization blocks only run once when the class is loaded.

What is the Factory pattern? What is the Factory pattern? Jun 24, 2025 pm 11:29 PM

Factory mode is used to encapsulate object creation logic, making the code more flexible, easy to maintain, and loosely coupled. The core answer is: by centrally managing object creation logic, hiding implementation details, and supporting the creation of multiple related objects. The specific description is as follows: the factory mode handes object creation to a special factory class or method for processing, avoiding the use of newClass() directly; it is suitable for scenarios where multiple types of related objects are created, creation logic may change, and implementation details need to be hidden; for example, in the payment processor, Stripe, PayPal and other instances are created through factories; its implementation includes the object returned by the factory class based on input parameters, and all objects realize a common interface; common variants include simple factories, factory methods and abstract factories, which are suitable for different complexities.

What is the `final` keyword for variables? What is the `final` keyword for variables? Jun 24, 2025 pm 07:29 PM

InJava,thefinalkeywordpreventsavariable’svaluefrombeingchangedafterassignment,butitsbehaviordiffersforprimitivesandobjectreferences.Forprimitivevariables,finalmakesthevalueconstant,asinfinalintMAX_SPEED=100;wherereassignmentcausesanerror.Forobjectref

What is type casting? What is type casting? Jun 24, 2025 pm 11:09 PM

There are two types of conversion: implicit and explicit. 1. Implicit conversion occurs automatically, such as converting int to double; 2. Explicit conversion requires manual operation, such as using (int)myDouble. A case where type conversion is required includes processing user input, mathematical operations, or passing different types of values ??between functions. Issues that need to be noted are: turning floating-point numbers into integers will truncate the fractional part, turning large types into small types may lead to data loss, and some languages ??do not allow direct conversion of specific types. A proper understanding of language conversion rules helps avoid errors.

See all articles