Writing AngularJS Security Semantic Rules using Semgrep

AngularJS security is something I have looked into in the past. In 2016, I conducted a workshop on AngularJS security at MWR’s MWRICON, which highlighted some common security issues and how they could be exploited. The materials from this workshop can be seen here: MWRICON 2018.

I recently took notice of Semgrep, a lightweight static analysis tool. Semgrep is an interesting tool for code reviewers: it is more expressive than grep and is easily customizable for semantic analysis. Since Semgrep is lightweight and fast, it is easy to integrate into a CI/CD pipeline.

Semgrep

Semgrep is fairly easy to set up and install: pip3 install semgrep. Furthermore, the Semgrep Live Editor can be used to write and save rules on r2c’s system.

Semgrep uses pfff (PHP Frontend For Fun), a static analysis engine previously developed at Facebook and later deprecated. Yoann Padioleau, the developer who initially wrote it, now works for r2c. pfff is written in OCaml and is a set of modules (only a few of which are used as part of Semgrep, I think).

Internally, Semgrep uses a generic parser built on yacc and ocaml-tree-sitter, plus some custom code, to convert each supported programming language into an intermediate language/AST. It is then possible to conduct fuzzy AST-to-AST matching on the result.

Looking at the engine code, there are references to pointer analysis and Datalog, but these don’t seem to be currently used, and dataflow/taint analysis seems to be planned for a future release by r2c.

Semgrep can be considered a good alternative to ESLint since:

It is well documented, and rules are meant to be powered by the community
Semgrep is provided under the LGPL license and is free for commercial use
It is very easy to write custom configurations and semantic rules
It supports popular languages including Python, JavaScript, Java, Go, and C, with PHP and TypeScript support on the way 🥳

As such, I ported multiple rules from AngularJS Security Rules For ESLint to Semgrep. The pull request I made can be seen here: Pull Request to semgrep-rules repository. This blog post will highlight the basics of using Semgrep with some AngularJS examples.

Writing Rules - Strict Contextual escaping (SCE) Example

In the code below, Strict Contextual Escaping (SCE) is disabled by calling $sceProvider.enabled(false). Disabling SCE in an AngularJS application could provide additional attack surface for XSS vulnerabilities.

        
      
var app = angular.module('MyApp', []).config(function ($sceProvider) {
    // ruleid: detect-angular-sce-disabled
    $sceProvider.enabled(false);
});

app.controller('myCtrl', function ($scope) {
    $scope.userInput = 'foo';
    $scope.sayHello = function () {
        $scope.html = "Hello <b>" + $scope.userInput + "</b>!";
    };
});

An example such as the above is easy to detect with Semgrep, since you only need to check whether $sceProvider.enabled is called with false. Live Editor - SCE Disabled

        
      
rules:
- id: my_pattern_id
  pattern: |
    $sceProvider.enabled(false);
  message: |
    Semgrep found a match
  languages:
  - javascript
  severity: WARNING

And because Semgrep matches against the AST rather than raw text, rules like this can be run across large codebases quickly.
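
To make the AST point concrete, here is a small test fixture for the rule above, using the same `// ruleid:` annotation style as the earlier snippet plus `// ok:` for lines that should not be flagged. The `$sceProvider` object here is a hypothetical stand-in so the file runs under plain Node; it is not part of AngularJS. The annotations reflect how I would expect the `my_pattern_id` rule to behave, so treat them as a sketch rather than verified scanner output.

```javascript
// Hypothetical stand-in for AngularJS's $sceProvider so this fixture runs
// under plain Node; a real fixture would use the AngularJS-injected provider.
const enabledCalls = [];
const $sceProvider = {
  enabled(value) {
    // Record only calls that pass an argument (the setter form).
    if (arguments.length > 0) enabledCalls.push(value);
    return this;
  },
};

// ruleid: my_pattern_id
$sceProvider.enabled(false);

// Same AST as above: whitespace and line breaks do not affect the match.
// ruleid: my_pattern_id
$sceProvider
  .enabled( false );

// Different argument, so the pattern should not match.
// ok: my_pattern_id
$sceProvider.enabled(true);

// No argument at all (getter form), so the pattern should not match.
// ok: my_pattern_id
$sceProvider.enabled();
```

A plain-text grep for `$sceProvider.enabled(false)` would miss the second call because of its formatting, which is the practical difference the AST-based match makes.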

Writing Rules - Open Redirect Example

The code below is a common example of an open redirect, which might happen when user input is concatenated into $window.location.href.

        
      
var app = angular.module('MyApp', []);
app.controller('myCtrl', function($scope, $sce) {
    $scope.userInput = 'foo';
    $scope.sayHello = function() {
        $window.location.href = input + '/app/logout';
        input = $scope.input;
        $window.location.href = input + '/app/logout';

        // Data is not coming from user input
        $location.location.location = test;
        $window.location.href = "//untaintedredirect";
    };
});

One way to match code like this is to use metavariables and expression matching with the pattern: | option. Expression matching searches code for the given pattern; the pattern can match a full expression or be part of a subexpression, which can be used to identify concatenation. Metavariables are used to track a value across a specific code scope.

Metavariables can stand in for variables, functions, arguments, classes, object methods, imports, exceptions, and more. Typed metavariables are also supported for Java. More information regarding these two features can be seen here: Pattern Features

        
      
      patterns:
        - pattern-either: 
          - pattern: |
              $SOURCE = $INPUT;
              $window.location.href = $SOURCE + $STATICVALUE;
          - pattern: |
              $window.location.href = $SOURCE + $STATICVALUE;
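
To illustrate what "tracking a value" with a metavariable means, here is a toy matcher in plain JavaScript. It is not how Semgrep is implemented; the node shapes (`Id`, `Member`, `Assign`, `Binary`, `Str`) and helper names are all hypothetical. It assumes a metavariable is an identifier made of `$` followed by uppercase letters, digits, or underscores, so `$SOURCE` is a metavariable while `$window` is an ordinary identifier.

```javascript
// Hypothetical mini-AST builders (not Semgrep's real node types).
const id = (name) => ({ type: 'Id', name });
const member = (object, property) => ({ type: 'Member', object, property });
const assign = (left, right) => ({ type: 'Assign', left, right });
const plus = (left, right) => ({ type: 'Binary', op: '+', left, right });
const str = (value) => ({ type: 'Str', value });

// $UPPERCASE identifiers are metavariables; $window is a normal identifier.
const isMetavar = (node) =>
  node.type === 'Id' && /^\$[A-Z_][A-Z_0-9]*$/.test(node.name);

// Structural equality; good enough here because builders fix the key order.
const sameNode = (a, b) => JSON.stringify(a) === JSON.stringify(b);

function match(pattern, node, bindings) {
  if (pattern === null || typeof pattern !== 'object') return pattern === node;
  if (isMetavar(pattern)) {
    const bound = bindings[pattern.name];
    if (bound === undefined) {
      bindings[pattern.name] = node; // first use: bind
      return true;
    }
    return sameNode(bound, node);    // later use: must be the same node
  }
  if (node === null || typeof node !== 'object') return false;
  const keys = Object.keys(pattern);
  if (keys.length !== Object.keys(node).length) return false;
  return keys.every((k) => k in node && match(pattern[k], node[k], bindings));
}

// Match a list of pattern statements against a list of code statements,
// sharing one set of bindings across all of them.
function matchSequence(patternStmts, codeStmts) {
  const bindings = {};
  const ok = patternStmts.length === codeStmts.length &&
    patternStmts.every((p, i) => match(p, codeStmts[i], bindings));
  return ok ? bindings : null;
}

const href = member(member(id('$window'), 'location'), 'href');

// $SOURCE = $INPUT;  $window.location.href = $SOURCE + $STATICVALUE;
const pattern = [
  assign(id('$SOURCE'), id('$INPUT')),
  assign(href, plus(id('$SOURCE'), id('$STATICVALUE'))),
];

// input = $scope.input;  $window.location.href = input + '/app/logout';
const tainted = [
  assign(id('input'), member(id('$scope'), 'input')),
  assign(href, plus(id('input'), str('/app/logout'))),
];

// Same shape, but the redirect uses a different variable.
const unrelated = [
  assign(id('input'), member(id('$scope'), 'input')),
  assign(href, plus(id('other'), str('/app/logout'))),
];

const result = matchSequence(pattern, tainted);
console.log(result.$SOURCE.name);               // prints "input"
console.log(matchSequence(pattern, unrelated)); // prints null
```

The second case fails because `$SOURCE` was already bound to `input` by the first statement, so it cannot also stand for `other`. That consistency requirement is what lets a pattern follow one value from its assignment to the sink.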

This can be taken one step further using the pattern-inside option and the ellipsis operator. The ellipsis operator can be used to search for specific function calls, for function calls with specific arguments, or for all calls to a specific function regardless of its arguments. Here it can be used to check whether my initial pattern occurs inside the app.controller function.

        
      
            app.controller(..., function($scope,$sce){ 
            ...
            });

Live Editor Example

The ellipsis operator itself is very useful and can be used to check for specific function calls, method calls, function definitions, class definitions, strings, arrays and conditionals.

Another example of using the ellipsis is identifying usage of the angular.element method. angular.element can lead to XSS if after, append, html, prepend, replaceWith, or wrap are used with user input. This can be matched using a pattern such as the one below:

        
      
angular.element($SOURCE).html(...);
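
Since the same pattern applies to each of the six methods listed above, the variants can be generated rather than typed out. The sketch below is one hypothetical way to do that in JavaScript; the function names are mine, and the output is only a text fragment meant to be pasted into a rule, not a complete rule file.

```javascript
// Sink methods taken from the list in the paragraph above.
const SINK_METHODS = ['after', 'append', 'html', 'prepend', 'replaceWith', 'wrap'];

// Build one Semgrep pattern string per sink method.
function buildElementPatterns(methods) {
  return methods.map((m) => `angular.element($SOURCE).${m}(...);`);
}

// Render the patterns as entries of a pattern-either block, using the
// same "pattern: |" block style as the rules in this post.
function toPatternEither(patterns, indent = '  ') {
  const lines = patterns.flatMap((p) => [
    `${indent}- pattern: |`,
    `${indent}    ${p}`,
  ]);
  return ['pattern-either:', ...lines].join('\n');
}

const elementPatterns = buildElementPatterns(SINK_METHODS);
console.log(toPatternEither(elementPatterns));
```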

The same concept can also be applied to match expressions that could be nested deep within another expression. Deep Expression Operator

Semgrep YAML File Breakdown

Patterns/rules can be contributed to the semgrep-rules repository as a YAML file. An example of a pull request can be seen here: Pull Request

An example of a YAML file can be seen below:

        
      
rules:
    - id: detect-angular-translateprovider-useStrategy-method
      patterns:
        - pattern-either: 
          - pattern: |
              $translateSanitization.useStrategy();
        - pattern-inside: |
            app.controller(..., function($scope,$sce){ 
            ...
            });
      message: |
                If the $translateSanitization.useStrategy is set to null or blank this can be dangerous.
      languages:
      - javascript
      severity: WARNING
      metadata:
        references:
            - https://docs.angularjs.org/api/ng/service/$sce#trustAsUrl
            - https://owasp.org/www-chapter-london/assets/slides/OWASPLondon20170727_AngularJS.pdf
    - id: detect-angular-translateprovider-translations-method
      patterns:
        - pattern-either: 
          - pattern: |
              $translateProvider.translations(...,$SOURCE);
        - pattern-inside: |
            app.controller(..., function($scope,$sce){ 
            ...
            });
      message: |
                The use of $translateProvider.translations method can be dangerous if user input is provided to this API.
      languages:
      - javascript
      severity: WARNING
      metadata:
        references:
            - https://docs.angularjs.org/api/ng/service/$sce#trustAsUrl
            - https://owasp.org/www-chapter-london/assets/slides/OWASPLondon20170727_AngularJS.pdf

This can be broken down as:

patterns – Pattern options to match
languages – Language syntax that can be matched

Important YAML keys:

id - ID of a rule; multiple rules can be added to the same file, each identified by its own ID
metadata - Provides additional information such as references/links
message - Message to display when the pattern is matched
pattern - Here we specify the logical option for a pattern to be matched. This can be pattern, pattern-inside, pattern-not, pattern-not-inside, etc.
severity - Severity of the rule
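
As a quick sanity check before submitting rules, the key list above can be turned into a small script. This is a minimal sketch, assuming the rules have already been loaded from YAML into plain JavaScript objects. The set of required keys and the duplicate-ID check reflect my understanding of the rule format rather than an official validator, and `incomplete-example` is a hypothetical rule added only to show a failure.

```javascript
// Keys each rule is assumed to need, plus at least one pattern operator.
const REQUIRED_KEYS = ['id', 'message', 'languages', 'severity'];
const PATTERN_KEYS = ['pattern', 'patterns', 'pattern-either', 'pattern-regex'];

// Return a list of human-readable problems for a single rule object.
function checkRule(rule) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    if (!(key in rule)) problems.push(`missing key: ${key}`);
  }
  if (!PATTERN_KEYS.some((key) => key in rule)) {
    problems.push(`missing one of: ${PATTERN_KEYS.join(', ')}`);
  }
  return problems;
}

// Report rule IDs that appear more than once in the same file.
function findDuplicateIds(rules) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { id } of rules) {
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return [...duplicates];
}

const rules = [
  {
    id: 'detect-angular-translateprovider-translations-method',
    patterns: [
      { 'pattern-either': [{ pattern: '$translateProvider.translations(...,$SOURCE);' }] },
    ],
    message: 'The use of $translateProvider.translations method can be dangerous if user input is provided to this API.',
    languages: ['javascript'],
    severity: 'WARNING',
  },
  // Hypothetical incomplete rule: no languages or severity.
  {
    id: 'incomplete-example',
    pattern: '$sceProvider.enabled(false);',
    message: 'Semgrep found a match',
  },
];

for (const rule of rules) {
  console.log(rule.id, checkRule(rule));
}
console.log('duplicate ids:', findDuplicateIds(rules));
```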

Concluding Thoughts

While Semgrep doesn’t support fully fledged static/program analysis the way CodeQL does, it can be useful for writing quick patterns to search through large codebases. I often find these sorts of features useful for finding interesting entry points during code review, which I can then dig into myself.

I look forward to Semgrep being the successor to other grep-based code scanning projects such as GrepBugs and other open source projects which have died in the past.
