During the development of JFrog Xray's Secrets Detection, we evaluated its capabilities by scanning more than 8 million artifacts in popular open-source package registries. Similarly, for JFrog Xray's new Container Contextual Analysis feature, we once again tested our detection in a large-scale, real-world use case, both to eliminate bugs and to assess the real-world viability of our current solution.

However, unlike the surprising results of our Secrets Detection research (we found far more active access tokens than we bargained for), the results of our scans of Docker Hub container images were in line with what we have been seeing as security engineers for many years.

Namely, when the criterion for a vulnerable system is just "package X is installed," we expect most security alerts to be false positives. And this was exactly the case for the CVEs we found in container images on Docker Hub.

In this post we will detail our research method and findings, and offer some recommendations for developers and security professionals who want to reduce the volume of CVE false positives.

Exploitable vs. "vulnerable package is installed"

Before diving in, let's briefly look at some example vulnerabilities to understand cases where a CVE report could be considered a false positive, even when a vulnerable component exists. This is not an exhaustive list by any means, but it does cover the most common causes of CVE false positives.

Library vulnerabilities

Does the fact that a vulnerable version of Lodash is installed guarantee a vulnerable system?

No. By definition, we cannot determine whether a CVE in a library is exploitable just by noting that the library is installed. This is because a library is not a runnable entity; there must be some other code in the system that uses the library in a vulnerable manner. In this example, even if Lodash is installed, the system may not be vulnerable. There must be some code that calls the vulnerable function, in this case template(), from the vulnerable Lodash library. In many cases there are additional requirements, such as that one of the arguments passed to template() be attacker-controlled.

Other code-related prerequisites may include:

- Whether a mitigating function is called before the vulnerable function.
- Whether specific arguments of the vulnerable function are set to specific vulnerable values.

Service configuration

Does the fact that a vulnerable version of Cassandra is installed guarantee a vulnerable system?

No. In many modern service vulnerabilities (especially ones with severe impact), the vulnerability only manifests in non-default configurations of the service. This is because the default, sane configuration is usually tested the most, either by the developers themselves or simply by the real-world users of the service. In this example, to achieve remote code execution (RCE), the Cassandra service must be configured with three non-default configuration flags (one of them being quite rare).

Other configuration-related prerequisites may include:

- Whether the component is being run with specific command-line arguments or environment variables.
- Whether the vulnerable component was compiled with specific build flags.

Running environment

Does the fact that a vulnerable version of Apache Hadoop is installed guarantee a vulnerable system?

No. In this example, the vulnerability only manifests in a Microsoft Windows environment. Therefore, if the vulnerable component is installed in a Linux environment, it cannot be exploited.

Other environment-related prerequisites may include:

- Whether the vulnerable component is running in a specific distribution (e.g. Debian).
- Whether the vulnerable component is compiled for a specific architecture (e.g. 32-bit Arm).
- Whether a firewall blocks communication to the vulnerable service.

Our research method

In this research, we set out to find what percentage of vulnerability reports actually indicate that the vulnerability is exploitable, when considering two reporting methods:

Naive.
The vulnerability is reported whenever a vulnerable component is installed in the relevant (vulnerable) version range. This is how almost all SCA tools work today.

Context-sensitive. The vulnerability is only reported (or said to be applicable) if the context of the image indicates vulnerable usage of the component. This takes into account the factors discussed in the previous section (code prerequisites, configuration prerequisites, running environment).

We were interested in testing the above in common real-world environments, and in performing this test on as many environments as possible.

We realized that looking at Docker Hub's top "community" images should satisfy both requirements, for two reasons:

- These images are used extremely often. For example, the top 25 images currently have more than 1 billion downloads.
- Community images usually contain both an interesting component and the code that uses the component to some end, which provides a realistic context. This is unlike "official" Docker images, which usually contain standalone components that are left unused and in their default configuration. For example, an Nginx web server by itself with default configuration would probably not be susceptible to any major CVE, but it does not provide a realistic scenario.

Based on these factors, we arrived at the following method:

1. Pull Docker Hub's top 200 community images, in their "latest" tag.
2. Collect from these images the top 10 most "popular" CVEs (sorted by CVE occurrence across all images).
3. Run our contextual analysis on all 200 images.
4. Calculate the false positive rate of the naive method by dividing "non-applicable occurrences" by "total occurrences" for each of the top 10 CVEs.

What were the top 10 CVEs?

We scanned Docker Hub's top 200 community images. The table below lists the CVEs that appeared in the highest number of images.

| CVE ID | CVSS | Short description |
| --- | --- | --- |
| CVE-2022-37434 | 9.8 | zlib through 1.2.12 has a heap-based buffer over-read or buffer overflow in inflate(). Only applications that call inflateGetHeader are affected. |
| CVE-2022-29458 | 7.1 | ncurses 6.3 has an out-of-bounds read and segmentation violation in convert_strings() |
| CVE-2021-39537 | 8.8 | ncurses through v6.2 nc_captoinfo() has a heap-based buffer overflow |
| CVE-2022-30636 | N/A | Golang x/crypto/acme/autocert: httpTokenCacheKey allows limited directory traversal on Windows |
| CVE-2022-27664 | 7.5 | Golang net/http before 1.18.6 DoS because an HTTP/2 connection can hang |
| CVE-2022-32189 | 7.5 | Golang math/big before 1.17.13 Float.GobDecode and Rat.GobDecode DoS due to panic |
| CVE-2022-28131 | 7.5 | Golang encoding/xml before 1.17.12 Decoder.Skip DoS due to stack exhaustion |
| CVE-2022-30630 | 7.5 | Golang io/fs before 1.17.12 Glob DoS due to uncontrolled recursion |
| CVE-2022-30631 | 7.5 | Golang compress/gzip before 1.17.12 Reader.Read DoS due to uncontrolled recursion |
| CVE-2022-30632 | 7.5 | Golang path/filepath before 1.17.12 Glob DoS due to stack exhaustion |
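The false positive rate defined in the method above (non-applicable occurrences divided by total occurrences) is simple arithmetic, and a minimal sketch may make it concrete. The class and function names here are hypothetical, and the counts are made-up placeholders for illustration, not the figures from this research.

```python
from dataclasses import dataclass


@dataclass
class CveTally:
    """Occurrence counts for one CVE across all scanned images (hypothetical type)."""
    total: int        # images where the naive method reports the CVE
    applicable: int   # subset that the contextual scanner marks as applicable

    @property
    def false_positive_rate(self) -> float:
        """Share of naive reports that contextual analysis marks non-applicable."""
        if self.total == 0:
            raise ValueError("CVE was not reported in any image")
        return (self.total - self.applicable) / self.total


def overall_false_positive_rate(tallies: dict[str, CveTally]) -> float:
    """Pool the occurrences of all CVEs, then divide non-applicable by total."""
    total = sum(t.total for t in tallies.values())
    applicable = sum(t.applicable for t in tallies.values())
    if total == 0:
        raise ValueError("no CVE occurrences to tally")
    return (total - applicable) / total


# Placeholder counts for illustration only; NOT the study's data.
example = {
    "CVE-A": CveTally(total=40, applicable=4),
    "CVE-B": CveTally(total=10, applicable=6),
}
```

Note that the pooled rate weights each CVE by how often it occurs, so it generally differs from the plain average of the per-CVE rates.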
How many CVEs were actually exploitable?

We deliberately chose to run the contextual scanners on their most conservative setting; more on that below.

The contextual scanner for each CVE was defined as described in the table below.

| CVE ID | Contextual scanner |
| --- | --- |
| CVE-2022-37434 | Check for 1st-party code that calls "inflateGetHeader" and "inflate" |
| CVE-2022-29458 | Check for invocations of the ncurses "tic" CLI utility |
| CVE-2021-39537 | Check for invocations of the ncurses "cap2info" CLI utility |
| CVE-2022-30636 | Check for Windows OS + 1st-party code that calls "autocert.NewListener" or references "autocert.DirCache" |
| CVE-2022-27664 | Check for 1st-party code that calls "ListenAndServeTLS" (HTTP/2 is only available over TLS) |
| CVE-2022-32189 | Check for 1st-party code that calls "Rat.GobDecode" or "Float.GobDecode" |
| CVE-2022-28131 | Check for 1st-party code that calls "Decoder.Skip" |
| CVE-2022-30630 | Check for 1st-party code that calls "fs.Glob" with non-constant input |
| CVE-2022-30631 | Check for 1st-party code that calls "gzip.Reader.Read" |
| CVE-2022-30632 | Check for 1st-party code that calls "filepath.Glob" with non-constant input |

Running the contextual scanners on all 200 images gave us the following results, per CVE.

[Figure: contextual analysis results per CVE]

And when we tallied the results of all top 10 CVEs together, here's what we found:
[Figure: combined results for the top 10 CVEs]

78% of the CVE cases were found to be non-applicable!

Looking at the current limitations of contextual analysis

Let's examine CVE-2022-30631, which had an extremely high applicability rate. CVE-2022-30631 was the only one that crossed 50% applicability.

In layperson's terms, the prerequisite for this CVE to be exploitable is "Golang is used to extract an attacker-controlled gzip archive." In practice, the scanner will alert if first-party Golang code tries to extract any gzip archive. This is because determining whether a file is attacker-controlled is an extremely hard task, due to the multitude of possible sources affecting the file.

Determining whether a variable comes from constant input or from external/attacker input can be achieved, for example, through data flow analysis. Data flow analysis is performed by some of our scanners, for example the one identifying CVE-2022-30630, which had a much lower applicability rate.
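To give a feel for the "non-constant input" condition, here is a rough sketch of a literal-argument check over Go source text. It is much simpler than data flow analysis and is not how the scanners in this research work: it only asks whether the pattern argument of a Glob call is written as a string literal on the same line. The helper names are hypothetical, and the sketch assumes the packages are imported under their default names.

```python
import re

# Start of a call to filepath.Glob(pattern) or fs.Glob(fsys, pattern).
GLOB_CALL = re.compile(r'\b(?:filepath|fs)\.Glob\(')
# A Go interpreted ("...") or raw (`...`) string literal.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"|`[^`]*`')


def split_call_args(text: str, start: int):
    """Split the arguments of a call whose '(' sits just before index `start`.

    Returns the argument strings, or None if the call does not close in `text`.
    """
    args, current, depth, quote = [], [], 0, None
    i = start
    while i < len(text):
        ch = text[i]
        if quote:                                  # inside a string or rune literal
            current.append(ch)
            if ch == "\\" and quote != "`" and i + 1 < len(text):
                current.append(text[i + 1])        # keep the escaped character
                i += 1
            elif ch == quote:
                quote = None
        elif ch in "\"'`":
            quote = ch
            current.append(ch)
        elif ch in "([{":
            depth += 1
            current.append(ch)
        elif ch in ")]}":
            if depth == 0:                         # closing paren of the Glob call
                args.append("".join(current).strip())
                return args
            depth -= 1
            current.append(ch)
        elif ch == "," and depth == 0:             # top-level argument separator
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    return None


def non_constant_glob_calls(go_source: str) -> list[tuple[int, str]]:
    """Return (line number, line) for Glob calls whose pattern is not a literal."""
    findings = []
    for lineno, line in enumerate(go_source.splitlines(), start=1):
        for match in GLOB_CALL.finditer(line):
            args = split_call_args(line, match.end())
            # Conservative default: anything not provably constant is reported.
            if not args or not STRING_LITERAL.fullmatch(args[-1]):
                findings.append((lineno, line.strip()))
    return findings
```

In this sketch a call such as filepath.Glob("/etc/*.conf") is skipped, while filepath.Glob(os.Args[1]) is reported. A pattern held in a named Go constant is also reported, which is a false positive that real data flow analysis could rule out.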
But when dealing with files, even if the function's file path argument is constant, there is no guarantee that the file is not attacker-controlled, and vice versa. Therefore, we expect the real-world applicability of this CVE to be even lower.

Why 78% is actually a conservative number

From the example above, we can see that some CVEs may have an exaggerated applicability rate, meaning the real-world applicability may be even lower. It is important to discuss why (in the general case) it still makes sense to run conservative scanners. There are two reasons for this: first, because we prefer false positives to false negatives, and second, because of performance considerations.

Preference towards false positives over false negatives

Every technology has its limitations, and this is doubly true when trying to solve computationally infeasible problems such as "can a specific input be controlled by external sources." In certain cases (the easier ones), we can make certain assumptions that make computing the solution possible with very high confidence.

However, in other cases (the harder ones), where 100% confidence is not guaranteed, we should do two things:

- Prefer scanners that tend to show false positives (in our case, show results as applicable when in reality they are not). This is done because, in this case, the result will be examined by an engineer who will assess whether it is truly applicable. In the opposite case, where the vulnerability is flagged as non-applicable, the engineer will assume it can be ignored, and thus the vulnerability will be left unaddressed, which is a much more severe scenario.
- Whenever possible, provide the confidence rate and/or the reason for low confidence in a specific finding, so that even applicable results can be prioritized by security/DevOps engineers.

Performance considerations

A contextual scanner that's based on data flow analysis (for example, a scanner that tries to determine whether a specific function's argument comes from attacker-controlled input) will always face a choice in its implementation: whether to provide more accurate results or to run faster.
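One way to picture this accuracy/speed choice is a depth-bounded search over a call graph. The sketch below is a generic breadth-first search, not the scanner's actual implementation; the function name, the max_depth parameter, and the example graph are all assumptions made for illustration.

```python
from collections import deque


def reaches_sink(call_graph, source, sink, max_depth=None):
    """Breadth-first search from `source` to `sink` over a call graph.

    `call_graph` maps a function name to the functions it calls.
    `max_depth` caps the number of call edges followed; None means unlimited.
    Returns True if `sink` is reachable within the depth budget.
    """
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        func, depth = queue.popleft()
        if func == sink:
            return True
        if max_depth is not None and depth >= max_depth:
            continue  # depth budget exhausted on this path
        for callee in call_graph.get(func, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, depth + 1))
    return False


# Hypothetical call graph: the sink sits three calls away from the source.
graph = {
    "handleUpload": ["parseRequest"],
    "parseRequest": ["unpackArchive"],
    "unpackArchive": ["gzip.Reader.Read"],
}
```

With this graph, a budget of two call edges stops before the sink and reports nothing, while a budget of three (or no limit) finds the path, at the cost of exploring more of the graph. In line with the preference for false positives described earlier, a scanner built this way might treat an exhausted budget as "possibly applicable" rather than "non-applicable".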
For example, the most accurate type of contextual scanner must at least:

- Allow for an unlimited call depth when trying to build an intra-module data flow graph between an attacker-controlled source and the requested sink.
- Consider inter-module calls when building the data flow graph.

These operations greatly increase the scanner's run time. When dealing with a large number of scanned artifacts per minute (as may be required from a JFrog Artifactory/Xray instance), we must strike a delicate balance between the accuracy and the speed of the contextual scanner.

Even when considering the limitations discussed, 78% is still a huge number of vulnerabilities that can be either de-prioritized or ignored. Moreover, we expect this number to become higher as technology advances and as fewer "applicable by default" CVEs are discovered.