software bug

FindBugs is a great open source tool for detection of software bugs in Java. It uses static analysis to search compiled classes for hundreds of bug patterns and even more can be found using FindSecurityBugs and fb-contrib plugins. However, before my recent contribution there was no general detector for hard-coded passwords and cryptographic keys. Hard-coded password are identical for each installation and can be easily extracted, which is likely to be exploited (see CWE-259 for more information). FindBugs and FindSecurityBugs could already detect this vulnerability, but only for constant database passwords and two very specific cases. I have created a detector (accepted to FindSecurityBugs), which is able to find hard-coded values of Strings, char and byte arrays or BigIntegers used as an input parameter for one of the configured methods such as KeyStore.load or KeySpec constructors.

To add a new detector, we have to create a class that implements Detector or extends a prepared class with helper functionality (I have used OpcodeStackDetector). An instance of BugReporter passed in constructor and its method reportBug are used to report problems (BugInstance objects). We also need to add the class name to findbugs.xml to be executed and edit messages.xml for information about detections. Good start for thinking about the detection logic is writing a bunch of flawed code samples and looking to their bytecode (plugin for IDEA can be used). We can write unit tests for them by mocking BugReporter.

Detector class can use visitor design pattern (if it implements Visitor) to react on events while analyzed class is scanned. I have started by overwriting method sawOpcode, which is called every time an instruction is read. Since we are interested in invocations of specific methods, we need to check if it is one of the invoke instructions and get full called method name such as java/security/KeyStore.load(Ljava/io/InputStream;[C)V, which contains method class with package, argument types ([C is char array, parameters of object types starts with L and ends with semicolon) and return type (V for void). Method name can be obtained by calling methods getClassConstantOperand, getNameConstantOperand and getNameSigOperand inherited from DismantleByteCode. If it is one of the problematic methods (loaded from resource files), we can create a BugInstance, add current analyzed line plus some info to it and report the bug. Now we have a password usage detector but not hard coded password detector, so it is time to eliminate false positives (detections that are not real bugs).

For String passwords, we can utilize OpcodeStackDetector and check nullness of stack.getStackItem(0).getConstant() to detect usage of constant String as a first method parameter. Unfortunately, it is not so easy for the other variable types. To detect that an array is initialized with hard-coded values, I am checking whether instruction for new array creation is followed by push and array store instructions while for example not calling methods. Constant arrays are also converted from constant Strings using methods toCharArray and getBytes. After implementing this, we can detect BigIntegers too, since they can be constructed from Strings or byte arrays.

In terms of so called taint analysis, we are able to detect vulnerability source (hard-coded data) and sink (usage of password or cryptographic function), but bug should be reported only if there is a flow from source to sink (we cannot be sure that hard-coded password is really a password until it is used as a password parameter). In the current implementation, no complex flow analysis is performed, we assume that a taint source followed by a taint sink of a matching type inside the same method body are always related. For this reason, false positives are easy to demonstrate, but are quite uncommon in practice. On the other hand, local hard-coded declarations are forgotten while another method is analyzed (visit method is overwritten to reset the state), so passwords are not detected if they are passed as a parameter and used in another method.

Class fields are also taken into an account – if constant data is stored to them, we remember that and consider it as a taint source when they are read. Because of this, the order of the methods matters and as static initializer section is added to the class end by compiler, its analysis is run ‘manually’ by calling doVisitMethod when class analysis starts. In addition, if the field stores hard-coded data and its name is suspicious (like password, secret or aesKey), the bug is reported immediately, since there is a high bug confidence and if it was used in a different class, it would not be reported otherwise (one the other hand, it can be reported twice now).

You can see the whole code on GitHub. I have mentioned some imperfections, but I think the detector is working quite well. Unfortunately, there is not much information about writing detectors, so creating them can be just a matter of trial and error. If you have an idea for improvement or a new detector, don’t hesitate to contact me or pull the code directly. 🙂