From 447f33415103357c17eb3ed9140f0e84f5135a63 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 8 Oct 2018 09:48:23 +1300 Subject: [PATCH 1/9] Add common namespace to facts. --- docs/concepts/CLQL.md | 102 +++++++++++++++++++++--------------------- 1 file changed, 51 insertions(+), 51 deletions(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index 0f08377..d3921c8 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -53,7 +53,7 @@ To limit the above query to match classes with a particular name, add a "name" p ``` @review.comment -method(depth = any): +common.method(depth = any): name == "myFunc" ``` @@ -67,14 +67,14 @@ This query returns all methods with the name "myFunc". Note that the query decor Properties can be of type string, float, and int. The following finds all int literals with the value 8: ``` -int_lit(depth = any): +common.int_lit(depth = any): value == 8 ``` This query finds float literals with the value 8.7: ``` -float_lit(depth = any): +common.float_lit(depth = any): value: 8.7 ``` @@ -84,7 +84,7 @@ float_lit(depth = any): The comparison operators >, <, >=, and <= are available for floats and ints. The following finds all int literals above negative 3: ``` -int_lit(depth = any): +common.int_lit(depth = any): value: > -3 ``` @@ -95,25 +95,25 @@ int_lit(depth = any): Facts can take any number of facts and properties as children, forming a query with a tree struct of arbitrary depth. A parent-child fact pair will match any parent element even if the child is not a direct descendant. The following query finds all the if statements inside a method called "myMethod", even those nested inside intermediate scopes (for loops etc): ``` -method(depth = any): +common.method(depth = any): name == "myMethod" - if_stmt(depth = any) + common.if_stmt(depth = any) ``` Any fact in a query can be decorated. 
If `class` is decorated, this query returns all classes named "myClass", but only if it has at least one method: ``` -class(depth = any): +common.class(depth = any): name: “myClass” - method(depth = any) + common.method(depth = any) ``` Any fact in a query can have properties. The following query finds all methods named "myMethod" on the all classes named "myClass": ``` -class(depth = any): +common.class(depth = any): name: “myClass” - method(depth = any): + common.method(depth = any): name: “myMethod” ``` @@ -122,15 +122,15 @@ class(depth = any): Facts use depth ranges to specify the depth at which they can be found below their parent. Depth ranges have two zero based numbers, representing the minimum and maximum depth to find the result at, inclusive and exclusive respectively. The following query finds any if statements that are direct children of their parent method, in other words, if statements at depth zero from methods: ``` -method(depth = any): - if_stmt(depth = 0:1) +common.method(depth = any): + common.if_stmt(depth = 0:1) ``` This query finds if statements at (zero based) depths 3, 4, and 5: ``` -method(depth = any): - if_stmt(depth = 3:6) +common.method(depth = any): + common.if_stmt(depth = 3:6) ``` A depth range where the maximum is not greater than the minimum, i.e. `(depth = 5:5})` or `({depth: 6:0)`, will give an error. @@ -138,15 +138,15 @@ A depth range where the maximum is not greater than the minimum, i.e. `(depth = Depth ranges specifying a single depth can be described with a single number. This query finds direct children at depth zero: ``` -method(depth = any): - if_stmt(depth = 0) +common.method(depth = any): + common.if_stmt(depth = 0) ``` Indices in a depth range can range from 0 to positive infinity. Positive infinity is represented by leaving the second index empty. 
This query finds all methods, and all their descendant if_statements from depth 5 onwards: ``` -method(depth = any): - if_stmt(depth = 5:) +common.method(depth = any): + common.if_stmt(depth = 5:) ``` Note: The depth range on top level facts, like `method` in the previous examples, determines the depth from the base context to that fact. In this case the base context contains a single program. However, it can be configured to refer to any context, typically a single repository or the root of the graph on which all queryable data hangs. @@ -158,10 +158,10 @@ Note: The depth range on top level facts, like `method` in the previous examples The following query will find a method with a foreach loop, a for loop, and a while loop in that order: ``` -method(depth = any): - for_stmt - foreach_stmt - while_stmt +common.method(depth = any): + common.for_stmt + common.foreach_stmt + common.while_stmt ``` @@ -173,7 +173,7 @@ method(depth = any): Exlude allows queries to match children that *do not* have a given property or child fact. Excluded facts and properties are children of an `exclude` operator. The following query finds all classes except those named "classA": ``` -class(depth = any): +common.class(depth = any): exclude: name == "classA" ``` @@ -181,8 +181,8 @@ class(depth = any): This query finds all classes with a method that is not called String: ``` -class(depth = any): - method: +common.class(depth = any): + common.method: exclude: name: “String” ``` @@ -190,9 +190,9 @@ class(depth = any): The placement of the exclude operator has a significant effect on the query's meaning - this similar query finds all classes without String methods: ``` -class(depth = any): +common.class(depth = any): exlude: - method: + common.method: name: “String” ``` @@ -201,11 +201,11 @@ The exclude operator in the above query can be read as excluding all methods wit Excluding a fact does not affect its siblings. 
The following query finds all String methods that use an if statement, but don’t use a foreach statement: ``` -method(depth = any): +common.method(depth = any): name: “String” - if_stmt + common.if_stmt exclude: - foreach_stmt + common.foreach_stmt ``` An excluded fact will not return a result and therefore cannot be decorated. @@ -216,10 +216,10 @@ An excluded fact will not return a result and therefore cannot be decorated. Exclusions can be arbitrarily nested. The following query finds methods which only return nil or return nothing, that is, it finds all methods except those with non-nil values in their return statements: ``` -method: +common.method: exclude: - return_stmt(depth = any): - literal: + common.return_stmt(depth = any): + common.literal: exclude: name == "nil" ``` @@ -232,12 +232,12 @@ Facts nested under multiple excludes still do not return results and cannot be d Include allows queries to match patterns without a given parent. The following query is a simple attempt at finding infinitely recursing functions. It works by finding functions that call themselves without an if statement to halt recursion: ``` -func: +common.func: name as funcName exclude: - if_stmt: + common.if_stmt: include: - func_call: + common.func_call: name as funcName ``` @@ -253,12 +253,12 @@ Results under include statements appear as children of the parent of the corresp A fact with multiple children will match against elements of the code that have child1 *and* child2 *and* child3 etc. The `any_of` operator overrides the implicit "and". 
The following query finds all String methods that use basic loops: ``` -method(depth = any): +common.method(depth = any): name: “String” any_of: - foreach_stmt - while_stmt - for_stmt + common.foreach_stmt + common.while_stmt + common.for_stmt ``` @@ -271,13 +271,13 @@ Facts that do not have a parent-child relationship can be compared by assigning The following query compares two classes (which do have a parent-child relationship) and returns the methods which both classes implement: ``` -class(depth = any): +common.class(depth = any): name: “classA” - method: + common.method: name as methodName -class(depth = any): +common.class(depth = any): name: “classB” - method: + common.method: name as methodName ``` @@ -294,9 +294,9 @@ Functions allow users to execute arbitrary logic on variables. There are two typ A resolver function is used on the right hand side of a property assertion. In the following example, we assert that the name property of the method fact is equal to the value returned from the concat function: ``` -class(depth = any): +common.class(depth = any): name as className - method: + common.method: name == concat("New", className) ``` @@ -307,8 +307,8 @@ Asserter functions return a Boolean value and can only be called on their own li The following query uses the inbuilt `regex` function to match methods with capitalised names: ``` -class(depth = any): - method: +common.class(depth = any): + common.method: name as methodName regex(/^[A-Z]/, methodName) // pass in the methodName variable to the regex function and assert that the name is capitalised. 
``` @@ -335,10 +335,10 @@ tenets: This method appears to be a constructor name: constructor-finder query: | - class(depth = any): + common.class(depth = any): name as className @review.comment - method: + common.method: name == newConcat("New", className) ``` @@ -359,7 +359,7 @@ tenets: This method has a long name name: long-method-name query: | - method: + common.method: name as methodName stringLengthGreaterThan(methodName, 15) ``` From 9d55cca6bc27b373bfe70cf6078121df9b5efa9c Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 8 Oct 2018 15:53:15 +1300 Subject: [PATCH 2/9] Document block statements. --- docs/concepts/CLQL.md | 179 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 179 insertions(+) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index d3921c8..1316c43 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -262,6 +262,185 @@ common.method(depth = any): ``` +# Block + +Block statements encapsulate CLQL trees. These subtrees can be repeated with a single argument allowing succint repition of complex patterns. Branches of the subtree can be recombined at the end of the block allowing queries along multiple paths. 
+ +## Linear + +The following query finds functions with triply nested if statements: + +```clql +common.func: + block(repeat = 3): + common.if_stmt: + blockcontinue +``` + +It can be rewritten by expanding the CLQL between `block` and `blockcontinue` 3 times: + +```clql +common.func: + common.if_stmt: + common.if_stmt: + common.if_stmt +``` + +The following finds all functions call by a function called `someFunc`: + +```clql +common.func: + name == "someFunc" + block(repeat = any): + common.func_call(depth = any): + edge("calls"): + common.func: + blockcontinue +``` + +It can be rewritten (into demonstrative pseudocode) by expanding the CLQL betweeen `block` and `blockcontinue` once, twice etc to infinity, and grouping the expansions under an `any_of`: + +```clql +common.func: + name == "someFunc" + any_of: + common.func_call(depth = any): + edge("calls"): + common.func + common.func_call(depth = any): + edge("calls"): + common.func: + common.func_call(depth = any): + edge("calls"): + common.func + ... + common.func_call(depth = any): + edge("calls"): + common.func: + common.func_call(depth = any): + edge("calls"): + common.func: + ... +``` + +## Complex subtrees + +All CLQL elements that are children of `block` but not `blockcontinue` are repeated. 
The following query matches functions with triply nested if statements that all check the same value: + +```clql +common.func: + block(repeat = 3): + common.if_stmt: + common.condition: + common.var: + name as varName + blockcontinue +``` + +It can be rewritten by replacing the blockcontinue statement with the contents of the block statement 3 times and replacing any repeated definition of `varName` with an assertion: + +```clql +common.func: + common.if_stmt: + common.condition: + common.var: + name as varName + common.if_stmt: + common.condition: + common.var: + name == varName + common.if_stmt: + common.condition: + common.var: + name == varName +``` + +## Branching + +A `block` statement can have multiple `blockcontinue` statements. This allows multiple parents to have the same children. These children are defined under a `blockend` statment, rather than as children of the `blockcontinue` statements. It is often used with an `any_of` to match classes of simlar facts such as methods, closures, and functions, for example. 
+ +The following query finds functions or methods with doubly nested for loops: + +``` +common.file: + block: + any_of: + common.func: + blockcontinue + common.method + blockcontinue + blockend: + common.for_stmt: + common.for_stmt +``` + +It can be rewritten by replacing each `blockcontinue` statement with the children of the blockend statement: + +``` +common.file: + any_of: + common.func: + common.for_stmt: + common.for_stmt + common.method + common.for_stmt: + common.for_stmt +``` + +The following query matches all functions with function calls inside doubly nested for/while statements: + +```clql +common.func: + block(repeat = 2): + any_of: + common.for_stmt: + blockcontinue + common.while_stmt: + blockcontinue + blockend: + common.func_call +``` + +It can be rewritten by replacing the `blockcontinue` statements with the children of the `block` statement, and replacing the resulting `blockcontinue` statements with the children of the `blockend` statement. + +```clql +common.func: + any_of: + common.for_stmt: + any_of: + common.for_stmt: + common.func_call + common.while_stmt: + common.func_call + common.while_stmt: + any_of: + common.for_stmt: + common.func_call + common.while_stmt: + common.func_call +``` + +## Nested blocks + +Nested blocks are not yet valid CLQL. The following query is intended to follow the callgraph from `someFunc` and via function calls with multiply nested for loops. It will currently give a parse error. + +```clql +common.func: + name == "someFunc" + block(repeat = any): + block(repeat = 2:): + common.for_stmt: + blockcontinue: + common.func_call(depth = any): + edge("calls"): + common.func: + blockcontinue +``` + +## Decorators + +Some decorators such as `@review.comment` can only be used once per query. Using them in a repeated block will cause an error. +
# Variables From 3da602022930cf94fa66a46949273f591c2bbc65 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 8 Oct 2018 16:01:59 +1300 Subject: [PATCH 3/9] Document edge keyword. --- docs/concepts/CLQL.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index 1316c43..cdb9620 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -262,6 +262,22 @@ common.method(depth = any): ``` +
+ +# Edge + +Facts in AST lexicons refer to nodes in an AST, and the parent/child relationship between facts refers to the parent/child relationship of nodes in the AST. Nodes can have other parent/child relationships that are orthogonal to AST, such as calls. These relationships can be queried with the `edge` keyword. + +The following query finds function calls at the top level of a file and follows the `calls` edge to their definition: + +``` +common.func_call: + edge("calls"): + common.func +``` + +
+ # Block Block statements encapsulate CLQL trees. These subtrees can be repeated with a single argument allowing succint repition of complex patterns. Branches of the subtree can be recombined at the end of the block allowing queries along multiple paths. From 7e25422c838cbcbf4151f58ad8f6920de03a540b Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Tue, 9 Oct 2018 09:33:44 +1300 Subject: [PATCH 4/9] s/block/path --- docs/concepts/CLQL.md | 60 +++++++++++++++++++++---------------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index cdb9620..412509d 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -278,9 +278,9 @@ common.func_call:
-# Block +# Path -Block statements encapsulate CLQL trees. These subtrees can be repeated with a single argument allowing succint repition of complex patterns. Branches of the subtree can be recombined at the end of the block allowing queries along multiple paths. +Path statements encapsulate CLQL trees. These subtrees can be repeated with a single argument allowing succint repition of complex patterns. Branched paths can rejoin allowing a fact to match nodes with different kinds of parents. ## Linear @@ -288,12 +288,12 @@ The following query finds functions with triply nested if statements: ```clql common.func: - block(repeat = 3): + path(repeat = 3): common.if_stmt: - blockcontinue + pathcontinue ``` -It can be rewritten by expanding the CLQL between `block` and `blockcontinue` 3 times: +It can be rewritten by expanding the CLQL between `path` and `pathcontinue` 3 times: ```clql common.func: @@ -307,14 +307,14 @@ The following finds all functions call by a function called `someFunc`: ```clql common.func: name == "someFunc" - block(repeat = any): + path(repeat = any): common.func_call(depth = any): edge("calls"): common.func: - blockcontinue + pathcontinue ``` -It can be rewritten (into demonstrative pseudocode) by expanding the CLQL betweeen `block` and `blockcontinue` once, twice etc to infinity, and grouping the expansions under an `any_of`: +It can be rewritten (into demonstrative pseudocode) by expanding the CLQL betweeen `path` and `pathcontinue` once, twice etc to infinity, and grouping the expansions under an `any_of`: ```clql common.func: @@ -341,19 +341,19 @@ common.func: ## Complex subtrees -All CLQL elements that are children of `block` but not `blockcontinue` are repeated. The following query matches functions with triply nested if statements that all check the same value: +All CLQL elements that are children of `path` but not `pathcontinue` are repeated. 
The following query matches functions with triply nested if statements that all check the same value: ```clql common.func: - block(repeat = 3): + path(repeat = 3): common.if_stmt: common.condition: common.var: name as varName - blockcontinue + pathcontinue ``` -It can be rewritten by replacing the blockcontinue statement with the contents of the block statement 3 times and replacing any repeated definition of `varName` with an assertion: +It can be rewritten by replacing the pathcontinue statement with the contents of the path statement 3 times and replacing any repeated definition of `varName` with an assertion: ```clql common.func: @@ -373,24 +373,24 @@ common.func: ## Branching -A `block` statement can have multiple `blockcontinue` statements. This allows multiple parents to have the same children. These children are defined under a `blockend` statment, rather than as children of the `blockcontinue` statements. It is often used with an `any_of` to match classes of simlar facts such as methods, closures, and functions, for example. +A `path` statement can have multiple `pathcontinue` statements. This allows multiple parents to have the same children. These children are defined under a `pathend` statment, rather than as children of the `pathcontinue` statements. It is often used with an `any_of` to match classes of simlar facts such as methods, closures, and functions, for example. 
The following query finds functions or methods with doubly nested for loops: ``` common.file: - block: + path: any_of: common.func: - blockcontinue + pathcontinue common.method - blockcontinue - blockend: + pathcontinue + pathend: common.for_stmt: common.for_stmt ``` -It can be rewritten by replacing each `blockcontinue` statement with the children of the blockend statement: +It can be rewritten by replacing each `pathcontinue` statement with the children of the pathend statement: ``` common.file: @@ -407,17 +407,17 @@ The following query matches all functions with function calls inside doubly nest ```clql common.func: - block(repeat = 2): + path(repeat = 2): any_of: common.for_stmt: - blockcontinue + pathcontinue common.while_stmt: - blockcontinue - blockend: + pathcontinue + pathend: common.func_call ``` -It can be rewritten by replacing the `blockcontinue` statements with the children of the `block` statement, and replacing the resulting `blockcontinue` statements with the children of the `blockend` statement. +It can be rewritten by replacing the `pathcontinue` statements with the children of the `path` statement, and replacing the resulting `pathcontinue` statements with the children of the `pathend` statement. ```clql common.func: @@ -436,26 +436,26 @@ common.func: common.func_call ``` -## Nested blocks +## Nested paths -Nested blocks are not yet valid CLQL. The following query is intended to follow the callgraph from `someFunc` and via function calls with multiply nested for loops. It will currently give a parse error. +Nested paths are not yet valid CLQL. The following query is intended to follow the callgraph from `someFunc` and via function calls with multiply nested for loops. It will currently give a parse error. 
```clql common.func: name == "someFunc" - block(repeat = any): - block(repeat = 2:): + path(repeat = any): + path(repeat = 2:): common.for_stmt: - blockcontinue: + pathcontinue: common.func_call(depth = any): edge("calls"): common.func: - blockcontinue + pathcontinue ``` ## Decorators -Some decorators such as `@review.comment` can only be used once per query. Using them in a repeated block will cause an error. +Some decorators such as `@review.comment` can only be used once per query. Using them in a repeated path will cause an error.
From de7e08e78488faeb935b3ac4f0424e6bc4998b6c Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Tue, 9 Oct 2018 11:08:01 +1300 Subject: [PATCH 5/9] Reorder path wording. --- docs/concepts/CLQL.md | 121 +++++++++++++++++++++--------------------- 1 file changed, 60 insertions(+), 61 deletions(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index 412509d..869e28e 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -284,37 +284,23 @@ Path statements encapsulate CLQL trees. These subtrees can be repeated with a si ## Linear -The following query finds functions with triply nested if statements: +Say we wanted to find triply nested if statements, our query would look like the following: ```clql -common.func: - path(repeat = 3): - common.if_stmt: - pathcontinue -``` - -It can be rewritten by expanding the CLQL between `path` and `pathcontinue` 3 times: - -```clql -common.func: +common.if_stmt: common.if_stmt: - common.if_stmt: - common.if_stmt + common.if_stmt ``` -The following finds all functions call by a function called `someFunc`: +With paths, we can express the same thing like so: ```clql -common.func: - name == "someFunc" - path(repeat = any): - common.func_call(depth = any): - edge("calls"): - common.func: - pathcontinue +path(repeat = 3): + common.if_stmt: + pathcontinue ``` -It can be rewritten (into demonstrative pseudocode) by expanding the CLQL betweeen `path` and `pathcontinue` once, twice etc to infinity, and grouping the expansions under an `any_of`: +Some queries cannot be written with `path` statements. Say we wanted to find all functions called by `someFunc()` and an arbitrarily long chain of calls. Our query would have to explicitly match either directly called functions, or functions with 1, 2, 3 etc intermediaries to infinity. ```clql common.func: @@ -339,58 +325,56 @@ common.func: ... 
``` +With paths the same query is trivial: + +```clql +common.func: + name == "someFunc" + path(repeat = any): + common.func_call(depth = any): + edge("calls"): + common.func: + pathcontinue +``` + ## Complex subtrees -All CLQL elements that are children of `path` but not `pathcontinue` are repeated. The following query matches functions with triply nested if statements that all check the same value: +Say we wanted to match triply nested if statements that all check the same value, our query would look like the following: ```clql -common.func: - path(repeat = 3): +common.if_stmt: + common.condition: + common.var: + name as varName + common.if_stmt: + common.condition: + common.var: + name == varName common.if_stmt: common.condition: common.var: - name as varName - pathcontinue + name == varName ``` -It can be rewritten by replacing the pathcontinue statement with the contents of the path statement 3 times and replacing any repeated definition of `varName` with an assertion: +With paths our query has much less repitition: ```clql common.func: - common.if_stmt: - common.condition: - common.var: - name as varName + path(repeat = 3): common.if_stmt: common.condition: common.var: - name == varName - common.if_stmt: - common.condition: - common.var: - name == varName + name as varName + pathcontinue ``` +Note that CLQL elements that are children of `path`, not just the `if_stmt`. Also note that repeated definitions of `varName` are replaced with assertions. + ## Branching A `path` statement can have multiple `pathcontinue` statements. This allows multiple parents to have the same children. These children are defined under a `pathend` statment, rather than as children of the `pathcontinue` statements. It is often used with an `any_of` to match classes of simlar facts such as methods, closures, and functions, for example. 
-The following query finds functions or methods with doubly nested for loops: - -``` -common.file: - path: - any_of: - common.func: - pathcontinue - common.method - pathcontinue - pathend: - common.for_stmt: - common.for_stmt -``` - -It can be rewritten by replacing each `pathcontinue` statement with the children of the pathend statement: +Say we wanted to find functions or methods with doubly nested for loops. Without `pathend` we would need to repeat the doubly nested for loop facts under `func` and `method`: ``` common.file: @@ -403,21 +387,22 @@ common.file: common.for_stmt ``` -The following query matches all functions with function calls inside doubly nested for/while statements: +With `pathend` there is no repitition: -```clql -common.func: - path(repeat = 2): +``` +common.file: + path: any_of: - common.for_stmt: + common.func: pathcontinue - common.while_stmt: + common.method pathcontinue pathend: - common.func_call + common.for_stmt: + common.for_stmt ``` -It can be rewritten by replacing the `pathcontinue` statements with the children of the `path` statement, and replacing the resulting `pathcontinue` statements with the children of the `pathend` statement. +Say we wanted to find functions with function calls inside doubly nested for/while statements, our query would have to handle all combinations of for/for for/while, while/for, and while/while: ```clql common.func: @@ -436,6 +421,20 @@ common.func: common.func_call ``` +With paths, we can express the same thing like so: + +```clql +common.func: + path(repeat = 2): + any_of: + common.for_stmt: + pathcontinue + common.while_stmt: + pathcontinue + pathend: + common.func_call +``` + ## Nested paths Nested paths are not yet valid CLQL. The following query is intended to follow the callgraph from `someFunc` and via function calls with multiply nested for loops. It will currently give a parse error. 
From 1e03b7b40c881f8e2cdc079887f77aaa0e9b15a6 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Tue, 9 Oct 2018 11:30:52 +1300 Subject: [PATCH 6/9] Explain pathcontinue and repeat ranges. --- docs/concepts/CLQL.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index 869e28e..cb61c31 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -300,6 +300,10 @@ path(repeat = 3): pathcontinue ``` +Once a query reaches a `pathcontinue` statement it continues from the `path` statement until the path has been repeated the specified number of times. + +## Repeat range + Some queries cannot be written with `path` statements. Say we wanted to find all functions called by `someFunc()` and an arbitrarily long chain of calls. Our query would have to explicitly match either directly called functions, or functions with 1, 2, 3 etc intermediaries to infinity. ```clql @@ -330,13 +334,15 @@ With paths the same query is trivial: ```clql common.func: name == "someFunc" - path(repeat = any): + path(repeat = 1:): common.func_call(depth = any): edge("calls"): common.func: pathcontinue ``` +`repeat = 1:` is a range specifying that the path should be repeated one or more times. + ## Complex subtrees Say we wanted to match triply nested if statements that all check the same value, our query would look like the following: From c17fff3b8afe334add312ee50ed3813e7949ff38 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 17 Dec 2018 11:30:04 +1300 Subject: [PATCH 7/9] Remove nested and any_of paths. 
--- docs/concepts/CLQL.md | 86 +------------------------------------------ 1 file changed, 2 insertions(+), 84 deletions(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index cb61c31..2cf42f1 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -266,7 +266,7 @@ common.method(depth = any): # Edge -Facts in AST lexicons refer to nodes in an AST, and the parent/child relationship between facts refers to the parent/child relationship of nodes in the AST. Nodes can have other parent/child relationships that are orthogonal to AST, such as calls. These relationships can be queried with the `edge` keyword. +Facts in AST lexicons refer to nodes in an AST, and the parent/child relationship between facts refers to the parent/child relationship of nodes in the AST. These nodes can have other parent/child relationships that are orthogonal to AST, such as calls. These relationships can be queried with the `edge` keyword. The following query finds function calls at the top level of a file and follows the `calls` edge to their definition: @@ -376,91 +376,9 @@ common.func: Note that CLQL elements that are children of `path`, not just the `if_stmt`. Also note that repeated definitions of `varName` are replaced with assertions. -## Branching - -A `path` statement can have multiple `pathcontinue` statements. This allows multiple parents to have the same children. These children are defined under a `pathend` statment, rather than as children of the `pathcontinue` statements. It is often used with an `any_of` to match classes of simlar facts such as methods, closures, and functions, for example. - -Say we wanted to find functions or methods with doubly nested for loops. 
Without `pathend` we would need to repeat the doubly nested for loop facts under `func` and `method`: - -``` -common.file: - any_of: - common.func: - common.for_stmt: - common.for_stmt - common.method - common.for_stmt: - common.for_stmt -``` - -With `pathend` there is no repitition: - -``` -common.file: - path: - any_of: - common.func: - pathcontinue - common.method - pathcontinue - pathend: - common.for_stmt: - common.for_stmt -``` - -Say we wanted to find functions with function calls inside doubly nested for/while statements, our query would have to handle all combinations of for/for for/while, while/for, and while/while: - -```clql -common.func: - any_of: - common.for_stmt: - any_of: - common.for_stmt: - common.func_call - common.while_stmt: - common.func_call - common.while_stmt: - any_of: - common.for_stmt: - common.func_call - common.while_stmt: - common.func_call -``` - -With paths, we can express the same thing like so: - -```clql -common.func: - path(repeat = 2): - any_of: - common.for_stmt: - pathcontinue - common.while_stmt: - pathcontinue - pathend: - common.func_call -``` - -## Nested paths - -Nested paths are not yet valid CLQL. The following query is intended to follow the callgraph from `someFunc` and via function calls with multiply nested for loops. It will currently give a parse error. - -```clql -common.func: - name == "someFunc" - path(repeat = any): - path(repeat = 2:): - common.for_stmt: - pathcontinue: - common.func_call(depth = any): - edge("calls"): - common.func: - pathcontinue -``` - ## Decorators -Some decorators such as `@review.comment` can only be used once per query. Using them in a repeated path will cause an error. +Some decorators such as `@review comment` can only be used once per query. Using them in a repeated path will cause an error.
From 258f593b3e487be7b58c1ec3c7d3e25962dd8072 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 17 Dec 2018 11:39:51 +1300 Subject: [PATCH 8/9] Document pathend. --- docs/concepts/CLQL.md | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index 2cf42f1..e244d8c 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -376,6 +376,35 @@ common.func: Note that CLQL elements that are children of `path`, not just the `if_stmt`. Also note that repeated definitions of `varName` are replaced with assertions. +## Pathend + +Suppose we wanted to match triply nested if statements with a function call inside the innermost if statement. Without paths, our query looks like: + +```clql +common.if_stmt: + common.if_stmt: + common.if_stmt: + common.func_call +``` + +With paths, our query looks like: + +```clql +path(repeat = 3): + common.if_stmt: + pathcontinue + pathend: + common.func_call +``` + +## Caveats + +Branching, where a `path` statement has multiple `pathcontinue` statements, is currently not supported. + +Nested paths are not supported. + +Using `any_of` inside a path statement is not supported. + ## Decorators Some decorators such as `@review comment` can only be used once per query. Using them in a repeated path will cause an error. From de2cf7fbc032a10c451f6b47eed67d29a7450fb0 Mon Sep 17 00:00:00 2001 From: BlakeMScurr Date: Mon, 17 Dec 2018 11:44:35 +1300 Subject: [PATCH 9/9] Revert "Add common namespace to facts." This reverts commit 447f33415103357c17eb3ed9140f0e84f5135a63.
--- docs/concepts/CLQL.md | 102 +++++++++++++++++++++--------------------- 1 file changed, 51 insertions(+), 51 deletions(-) diff --git a/docs/concepts/CLQL.md b/docs/concepts/CLQL.md index e244d8c..3c90544 100644 --- a/docs/concepts/CLQL.md +++ b/docs/concepts/CLQL.md @@ -53,7 +53,7 @@ To limit the above query to match classes with a particular name, add a "name" p ``` @review.comment -common.method(depth = any): +method(depth = any): name == "myFunc" ``` @@ -67,14 +67,14 @@ This query returns all methods with the name "myFunc". Note that the query decor Properties can be of type string, float, and int. The following finds all int literals with the value 8: ``` -common.int_lit(depth = any): +int_lit(depth = any): value == 8 ``` This query finds float literals with the value 8.7: ``` -common.float_lit(depth = any): +float_lit(depth = any): value: 8.7 ``` @@ -84,7 +84,7 @@ common.float_lit(depth = any): The comparison operators >, <, >=, and <= are available for floats and ints. The following finds all int literals above negative 3: ``` -common.int_lit(depth = any): +int_lit(depth = any): value: > -3 ``` @@ -95,25 +95,25 @@ common.int_lit(depth = any): Facts can take any number of facts and properties as children, forming a query with a tree struct of arbitrary depth. A parent-child fact pair will match any parent element even if the child is not a direct descendant. The following query finds all the if statements inside a method called "myMethod", even those nested inside intermediate scopes (for loops etc): ``` -common.method(depth = any): +method(depth = any): name == "myMethod" - common.if_stmt(depth = any) + if_stmt(depth = any) ``` Any fact in a query can be decorated. If `class` is decorated, this query returns all classes named "myClass", but only if it has at least one method: ``` -common.class(depth = any): +class(depth = any): name: “myClass” - common.method(depth = any) + method(depth = any) ``` Any fact in a query can have properties. 
The following query finds all methods named "myMethod" on the all classes named "myClass": ``` -common.class(depth = any): +class(depth = any): name: “myClass” - common.method(depth = any): + method(depth = any): name: “myMethod” ``` @@ -122,15 +122,15 @@ common.class(depth = any): Facts use depth ranges to specify the depth at which they can be found below their parent. Depth ranges have two zero based numbers, representing the minimum and maximum depth to find the result at, inclusive and exclusive respectively. The following query finds any if statements that are direct children of their parent method, in other words, if statements at depth zero from methods: ``` -common.method(depth = any): - common.if_stmt(depth = 0:1) +method(depth = any): + if_stmt(depth = 0:1) ``` This query finds if statements at (zero based) depths 3, 4, and 5: ``` -common.method(depth = any): - common.if_stmt(depth = 3:6) +method(depth = any): + if_stmt(depth = 3:6) ``` A depth range where the maximum is not greater than the minimum, i.e. `(depth = 5:5})` or `({depth: 6:0)`, will give an error. @@ -138,15 +138,15 @@ A depth range where the maximum is not greater than the minimum, i.e. `(depth = Depth ranges specifying a single depth can be described with a single number. This query finds direct children at depth zero: ``` -common.method(depth = any): - common.if_stmt(depth = 0) +method(depth = any): + if_stmt(depth = 0) ``` Indices in a depth range can range from 0 to positive infinity. Positive infinity is represented by leaving the second index empty. This query finds all methods, and all their descendant if_statements from depth 5 onwards: ``` -common.method(depth = any): - common.if_stmt(depth = 5:) +method(depth = any): + if_stmt(depth = 5:) ``` Note: The depth range on top level facts, like `method` in the previous examples, determines the depth from the base context to that fact. In this case the base context contains a single program. 
However, it can be configured to refer to any context, typically a single repository or the root of the graph on which all queryable data hangs. @@ -158,10 +158,10 @@ Note: The depth range on top level facts, like `method` in the previous examples The following query will find a method with a foreach loop, a for loop, and a while loop in that order: ``` -common.method(depth = any): - common.for_stmt - common.foreach_stmt - common.while_stmt +method(depth = any): + for_stmt + foreach_stmt + while_stmt ``` @@ -173,7 +173,7 @@ common.method(depth = any): Exlude allows queries to match children that *do not* have a given property or child fact. Excluded facts and properties are children of an `exclude` operator. The following query finds all classes except those named "classA": ``` -common.class(depth = any): +class(depth = any): exclude: name == "classA" ``` @@ -181,8 +181,8 @@ common.class(depth = any): This query finds all classes with a method that is not called String: ``` -common.class(depth = any): - common.method: +class(depth = any): + method: exclude: name: “String” ``` @@ -190,9 +190,9 @@ common.class(depth = any): The placement of the exclude operator has a significant effect on the query's meaning - this similar query finds all classes without String methods: ``` -common.class(depth = any): +class(depth = any): exlude: - common.method: + method: name: “String” ``` @@ -201,11 +201,11 @@ The exclude operator in the above query can be read as excluding all methods wit Excluding a fact does not affect its siblings. The following query finds all String methods that use an if statement, but don’t use a foreach statement: ``` -common.method(depth = any): +method(depth = any): name: “String” - common.if_stmt + if_stmt exclude: - common.foreach_stmt + foreach_stmt ``` An excluded fact will not return a result and therefore cannot be decorated. @@ -216,10 +216,10 @@ An excluded fact will not return a result and therefore cannot be decorated. 
Exclusions can be arbitrarily nested. The following query finds methods which only return nil or return nothing, that is, it finds all methods except those with non-nil values in their return statements: ``` -common.method: +method: exclude: - common.return_stmt(depth = any): - common.literal: + return_stmt(depth = any): + literal: exclude: name == "nil" ``` @@ -232,12 +232,12 @@ Facts nested under multiple excludes still do not return results and cannot be d Include allows queries to match patterns without a given parent. The following query is a simple attempt at finding infinitely recursing functions. It works by finding functions that call themselves without an if statement to halt recursion: ``` -common.func: +func: name as funcName exclude: - common.if_stmt: + if_stmt: include: - common.func_call: + func_call: name as funcName ``` @@ -253,12 +253,12 @@ Results under include statements appear as children of the parent of the corresp A fact with multiple children will match against elements of the code that have child1 *and* child2 *and* child3 etc. The `any_of` operator overrides the implicit "and". The following query finds all String methods that use basic loops: ``` -common.method(depth = any): +method(depth = any): name: “String” any_of: - common.foreach_stmt - common.while_stmt - common.for_stmt + foreach_stmt + while_stmt + for_stmt ``` @@ -418,13 +418,13 @@ Facts that do not have a parent-child relationship can be compared by assigning The following query compares two classes (which do have a parent-child relationship) and returns the methods which both classes implement: ``` -common.class(depth = any): +class(depth = any): name: “classA” - common.method: + method: name as methodName -common.class(depth = any): +class(depth = any): name: “classB” - common.method: + method: name as methodName ``` @@ -441,9 +441,9 @@ Functions allow users to execute arbitrary logic on variables. 
There are two typ A resolver function is used on the right hand side of a property assertion. In the following example, we assert that the name property of the method fact is equal to the value returned from the concat function: ``` -common.class(depth = any): +class(depth = any): name as className - common.method: + method: name == concat("New", className) ``` @@ -454,8 +454,8 @@ Asserter functions return a Boolean value and can only be called on their own li The following query uses the inbuilt `regex` function to match methods with capitalised names: ``` -common.class(depth = any): - common.method: +class(depth = any): + method: name as methodName regex(/^[A-Z]/, methodName) // pass in the methodName variable to the regex function and assert that the name is capitalised. ``` @@ -482,10 +482,10 @@ tenets: This method appears to be a constructor name: constructor-finder query: | - common.class(depth = any): + class(depth = any): name as className @review.comment - common.method: + method: name == newConcat("New", className) ``` @@ -506,7 +506,7 @@ tenets: This method has a long name name: long-method-name query: | - common.method: + method: name as methodName stringLengthGreaterThan(methodName, 15) ```