A Brief Introduction to Semgrep
Introduction
Semgrep is an amazing static analysis tool that we are excited about. This is part 1 of a two-part series. This post provides high-level information and shows you how to run Semgrep, while Josiah’s part 2 dives much deeper into the details of the rules and how to hunt for specific vulnerabilities.
Semgrep was created from an open-source Facebook project, pfff. All of the current functionality is available for free and is open source, although ReturnToCorp (r2c), the maker of Semgrep, has plans to commercialize enterprise use cases such as CI/CD integrations. A similar tool, LGTM, integrates with GitHub and Bitbucket. It is made by Semmle, which was bought by GitHub, a Microsoft subsidiary. It is worth looking into if Semgrep sounds interesting to you.
Semgrep vs Commercial Static Analysis Tools
I’ve yet to use a commercial static analysis tool that I would recommend for penetration testers. I would be hesitant to recommend many commercial solutions for developers as well. The problem is not that they don’t find anything; it’s that they are generally difficult to use and flag every 10th line of code as a potential vulnerability. Some portion of the findings requires security expertise to review (e.g. is using MD5 really a vulnerability in this case?). It takes someone who understands the code, the tool, and application security to run most tools accurately. That is not insurmountable, but for fairly mediocre output it is generally not worth it. Additionally, some commercial tools excel in a few languages but fall flat on others. It is very important to test any commercial tool against your own code when evaluating it.
For a pentester’s purposes, commercial tools just take too much time to get configured and working, especially considering the pretty weak results. It would be like spending a day getting Word’s spellcheck to work just to find a few errors in a 30-page document. It is much better to spend that time manually bug hunting than dealing with the errors and requirements of tools. Many commercial static analysis tools also require the project to build successfully, which can take several days depending on the project and its complexity. Conversely, the first time I used Semgrep, I had results to work from after just a few minutes of effort.
Semgrep vs Grep
Given the trouble with commercial tools, penetration testers and others often use grep to search the codebase for keywords and areas of concern. This is great for finding potentially vulnerable function calls, although it can generate a lot of results and still miss some cases. In contrast, Semgrep understands the languages it is searching and can detect vulnerabilities that span multiple lines.
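To make the multi-line limitation concrete, here is a minimal sketch that imitates grep's line-by-line matching in Python. The Java snippet and the search pattern are made up for illustration; they are not taken from any of the codebases scanned in this post.

```python
import re

# Hypothetical Java snippet: the same risky call written on one line
# and then split across several lines.
JAVA_SOURCE = '''\
Cipher a = Cipher.getInstance("AES/CBC/PKCS5Padding");
Cipher b = Cipher.getInstance(
    "AES/CBC/PKCS5Padding"
);
'''

# The kind of keyword pattern one might hand to grep.
PATTERN = re.compile(r'Cipher\.getInstance\(\s*"AES/CBC/PKCS5Padding"')


def grep_like(source: str) -> list[int]:
    """Return 1-based line numbers where the pattern matches within a single line."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if PATTERN.search(line)
    ]


def whole_text(source: str) -> list[int]:
    """Return 1-based line numbers where a match starts when line breaks count as whitespace."""
    return [source.count("\n", 0, m.start()) + 1 for m in PATTERN.finditer(source)]


print(grep_like(JAVA_SOURCE))   # → [1]
print(whole_text(JAVA_SOURCE))  # → [1, 2]
```

The line-by-line search misses the second call because no single line contains the whole expression. Searching the whole text works here, but only because the pattern anticipates this exact layout. A tool that parses the code, as Semgrep does, does not depend on how the call is formatted.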
Example:
Let’s start out by running the java ruleset on Android-InsecureBankv2:
docker run --rm -v "C:\source\Android-InsecureBankv2:/src" returntocorp/semgrep --config "https://semgrep.dev/c/p/java"
using config from https://semgrep.dev/p/java. Visit https://semgrep.dev/registry to see all public rules.
downloading config...
running 28 rules...
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/ChangePassword.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
128: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/CryptoClass.java
severity:warning rule:java.lang.security.audit.cbc-padding-oracle.cbc-padding-oracle: Using CBC with PKCS5Padding is susceptible to padding orcale attacks. A malicious actor could discern the difference between plaintext with valid or invalid padding. Further, CBC mode does not include any integrity checks. See https://find-sec-bugs.github.io/bugs.htm#CIPHER_INTEGRITY. Use 'AES/GCM/NoPadding' instead.
55: cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
autofix: javax crypto Cipher.getInstance("AES/GCM/NoPadding");
77: Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
autofix: javax crypto Cipher.getInstance("AES/GCM/NoPadding");
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/DoLogin.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
116: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/DoTransfer.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
131: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
262: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
wip-attackercode/ExploitAES/app/src/main/java/com/android/dns/exploitaes/MainActivity.java
severity:warning rule:java.lang.security.audit.cbc-padding-oracle.cbc-padding-oracle: Using CBC with PKCS5Padding is susceptible to padding orcale attacks. A malicious actor could discern the difference between plaintext with valid or invalid padding. Further, CBC mode does not include any integrity checks. See https://find-sec-bugs.github.io/bugs.htm#CIPHER_INTEGRITY. Use 'AES/GCM/NoPadding' instead.
115: Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
autofix: javax crypto Cipher.getInstance("AES/GCM/NoPadding");
The results seem okay, but not great. Part of this may be because the rules target general Java code rather than Android specifically. r2c’s security audit ruleset, a good ruleset that covers many languages, produces the same results as the java ruleset here, so it is safe to assume that the two share many, or possibly all, of the same Java rules.
Let’s try another ruleset, findsecbugs, to see how the results differ:
docker run --rm -v "C:\source\Android-InsecureBankv2:/src" returntocorp/semgrep --config "https://semgrep.dev/p/findsecbugs"
using config from https://semgrep.dev/p/findsecbugs. Visit https://semgrep.dev/registry to see all public rules.
downloading config...
running 43 rules...
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/ChangePassword.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
128: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/DoLogin.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
116: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
InsecureBankv2/app/src/main/java/com/android/insecurebankv2/DoTransfer.java
severity:warning rule:java.lang.security.audit.crypto.ssl.defaulthttpclient-is-deprecated.defaulthttpclient-is-deprecated: DefaultHttpClient is deprecated. Further, it does not support connections using TLS1.2, which makes using DefaultHttpClient a security hazard. Use SystemDefaultHttpClient instead, which supports TLS1.2.
131: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
262: HttpClient httpclient = new DefaultHttpClient();
autofix: s/DefaultHttpClient/SystemDefaultHttpClient/g
The results are a subset of the previous attempts: the DefaultHttpClient findings appear again, but the CBC padding findings do not. None of these three rulesets generated too many findings.
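Comparing rulesets by eye gets tedious on larger codebases. The sketch below shows one way to diff two runs, assuming each run was saved with Semgrep's `--json` flag and that each entry under `results` carries `check_id`, `path`, and `start.line`. The two inline reports are trimmed stand-ins with shortened rule IDs and paths, not real output.

```python
import json


def finding_keys(report: dict) -> set[tuple[str, str, int]]:
    """Reduce a Semgrep JSON report to a set of (rule id, file path, start line)."""
    return {
        (result["check_id"], result["path"], result["start"]["line"])
        for result in report.get("results", [])
    }


def only_in_first(first: dict, second: dict) -> list[tuple[str, str, int]]:
    """List findings that appear in the first report but not in the second."""
    return sorted(finding_keys(first) - finding_keys(second))


# Trimmed, hypothetical stand-ins for two reports. In practice each would come
# from json.load() on a file saved from a --json run.
java_report = json.loads("""
{"results": [
  {"check_id": "defaulthttpclient-is-deprecated", "path": "DoLogin.java", "start": {"line": 116}},
  {"check_id": "cbc-padding-oracle", "path": "CryptoClass.java", "start": {"line": 55}},
  {"check_id": "cbc-padding-oracle", "path": "CryptoClass.java", "start": {"line": 77}}
]}
""")
findsecbugs_report = json.loads("""
{"results": [
  {"check_id": "defaulthttpclient-is-deprecated", "path": "DoLogin.java", "start": {"line": 116}}
]}
""")

# Print what the first ruleset caught that the second did not.
for rule_id, path, line in only_in_first(java_report, findsecbugs_report):
    print(f"{path}:{line} {rule_id}")
```

Keying on rule ID, path, and line is a simplification. Two different rules that flag the same line would count as separate findings, which may or may not be what you want when triaging.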
Let’s try another codebase, the Damn Vulnerable Java (EE) Application, which has all of the OWASP Top 10 vulnerabilities. The https://semgrep.dev/c/p/java and https://semgrep.dev/p/r2c-security-audit rulesets both found the same two SQL injection issues (one of which is not officially listed in the solutions):
docker run --rm -v "C:\source\dvja:/src" returntocorp/semgrep --config=https://semgrep.dev/p/r2c-security-audit
using config from https://semgrep.dev/p/r2c-security-audit. Visit https://semgrep.dev/registry to see all public rules.
downloading config...
running 202 rules...
ran 202 rules on 23 files: 2 findings
src/main/java/com/appsecco/dvja/services/ProductService.java
severity:warning rule:java.lang.security.audit.formatted-sql-string.formatted-sql-string: Detected a formatted string in a SQL statement. This could lead to SQL injection if variables in the SQL statement are not properly sanitized. Use a prepared statements (java.sql.PreparedStatement) instead. You can obtain a PreparedStatement using 'connection.prepareStatement'.
48: Query query = entityManager.createQuery("SELECT p FROM Product p WHERE p.name LIKE '%" + name + "%'");
src/main/java/com/appsecco/dvja/services/UserService.java
severity:warning rule:java.lang.security.audit.formatted-sql-string.formatted-sql-string: Detected a formatted string in a SQL statement. This could lead to SQL injection if variables in the SQL statement are not properly sanitized. Use a prepared statements (java.sql.PreparedStatement) instead. You can obtain a PreparedStatement using 'connection.prepareStatement'.
75: Query query = entityManager.createQuery("SELECT u FROM User u WHERE u.login = '" + login + "'");
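For readers less familiar with why a formatted SQL string gets flagged, here is a small analogue of the pattern in Python, using the standard library's sqlite3 module. It is not the DVJA code; the table, function names, and payload are hypothetical and only illustrate the difference between a concatenated query and a parameterized one.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(login: str) -> list[tuple]:
    """Vulnerable: attacker-controlled text becomes part of the SQL statement."""
    query = "SELECT login FROM users WHERE login = '" + login + "'"
    return conn.execute(query).fetchall()


def find_user_safe(login: str) -> list[tuple]:
    """Parameterized: the value is passed separately from the SQL text."""
    return conn.execute(
        "SELECT login FROM users WHERE login = ?", (login,)
    ).fetchall()


payload = "nobody' OR '1'='1"
print(len(find_user_unsafe(payload)))  # → 2
print(find_user_safe(payload))         # → []
```

The concatenated version returns every row because the payload rewrites the WHERE clause, while the parameterized version treats the payload as an ordinary, non-matching login.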
Running the xss ruleset against the same codebase produced four findings:
docker run --rm -v "C:\source\dvja:/src" returntocorp/semgrep --config=https://semgrep.dev/p/xss
using config from https://semgrep.dev/p/xss. Visit https://semgrep.dev/registry to see all public rules.
downloading config...
running 60 rules...
ran 60 rules on 172 files: 4 findings
src/main/webapp/WEB-INF/dvja/ProductList.jsp
severity:warning rule:java.lang.security.audit.xss.jsp.no-scriptlets.no-scriptlets: JSP scriptlet detected. Scriptlets are difficult to use securely and are considered bad practice. See https://stackoverflow.com/a/3180202. Instead, consider migrating to JSF or using the Expression Language '${...}' with the escapeXml function in your JSP files.
23:<%= request.getParameter("searchQuery") %>
src/main/webapp/WEB-INF/dvja/common/Footer.jsp
severity:warning rule:java.lang.security.audit.xss.jsp.use-escapexml.use-escapexml: Detected an Expression Language segment that does not escape output. This is dangerous because if any data in this expression can be controlled externally, it is a cross-site scripting vulnerability. Instead, use the 'escapeXml' function from the JSTL taglib. See https://www.tutorialspoint.com/jsp/jstl_function_escapexml.htm for more information.
1:${request.contextPath}
src/main/webapp/WEB-INF/dvja/common/Head.jsp
severity:warning rule:java.lang.security.audit.xss.jsp.use-escapexml.use-escapexml: Detected an Expression Language segment that does not escape output. This is dangerous because if any data in this expression can be controlled externally, it is a cross-site scripting vulnerability. Instead, use the 'escapeXml' function from the JSTL taglib. See https://www.tutorialspoint.com/jsp/jstl_function_escapexml.htm for more information.
10:${request.contextPath}
13:${request.contextPath}
Further Learning
https://tldrsec.com/blog/tldr-sec-035/ and https://tldrsec.com/blog/tldr-sec-037/: these tl;dr sec newsletters are where I first learned about Semgrep. The newsletter is a great resource for modern application security, security automation, and DevSecOps.
We45 has a good video on Semgrep, along with many other great application security videos on their channel.