147 changes: 147 additions & 0 deletions docs/concepts/CLQL.md
@@ -277,6 +277,153 @@ method(depth = any):

<br />

# Edge

Facts in AST lexicons refer to nodes in an AST, and the parent/child relationship between facts refers to the parent/child relationship of nodes in the AST. These nodes can have other relationships that are orthogonal to the AST, such as calls. These relationships can be queried with the `edge` keyword.

The following query finds function calls at the top level of a file and follows the `calls` edge to their definition:

```
common.func_call:
  edge("calls"):
    common.func
```

<br />

# Path

Path statements encapsulate CLQL trees. These subtrees can be repeated with a single argument, allowing succinct repetition of complex patterns. Branched paths can rejoin, allowing a fact to match nodes with different kinds of parents.

## Linear

Say we wanted to find triply nested if statements. Our query would look like the following:

```clql
common.if_stmt:
  common.if_stmt:
    common.if_stmt
```

With paths, we can express the same thing like so:

```clql
path(repeat = 3):
  common.if_stmt:
    pathcontinue
```

Once a query reaches a `pathcontinue` statement it continues from the `path` statement until the path has been repeated the specified number of times.
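
One way to picture this is as textual unrolling. The Python sketch below is an illustrative model only, not CodeLingo's implementation: it assumes a query can be represented as nested `(element, children)` tuples, and the hypothetical helper `unroll` splices the path body in at each `pathcontinue` until the repeat count is reached.

```python
PATHCONTINUE = "pathcontinue"


def unroll(body, repeat):
    """Nest `body` inside itself `repeat` times (hypothetical model of `path`).

    `body` is a list of (element, children) tuples. A ("pathcontinue", [])
    entry marks where the next repetition is spliced in.
    """
    def expand(nodes, remaining):
        out = []
        for name, children in nodes:
            if name == PATHCONTINUE:
                # Splice in another copy of the body unless this is the
                # final repetition, where the marker simply disappears.
                if remaining > 1:
                    out.extend(expand(body, remaining - 1))
            else:
                out.append((name, expand(children, remaining)))
        return out

    return expand(body, repeat)


def render(nodes, indent=0):
    """Render the tuple tree as indented CLQL-like lines."""
    lines = []
    for name, children in nodes:
        lines.append("  " * indent + name + (":" if children else ""))
        lines.extend(render(children, indent + 1))
    return lines


path_body = [("common.if_stmt", [(PATHCONTINUE, [])])]
print("\n".join(render(unroll(path_body, 3))))
```

Under these assumptions, the printed result has the same shape as the hand-written triply nested query above.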

## Repeat range

Some queries cannot be written without `path` statements. Say we wanted to find all functions called by `someFunc()`, either directly or through an arbitrarily long chain of calls. Our query would have to explicitly match directly called functions, or functions with 1, 2, 3, etc. intermediaries, to infinity.

```clql
common.func:
  name == "someFunc"
  any_of:
    common.func_call(depth = any):
      edge("calls"):
        common.func
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          common.func_call(depth = any):
            edge("calls"):
              common.func
    ...
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          common.func_call(depth = any):
            edge("calls"):
              common.func:
                ...
```

With paths the same query is trivial:

```clql
common.func:
  name == "someFunc"
  path(repeat = 1:):
    common.func_call(depth = any):
      edge("calls"):
        common.func:
          pathcontinue
```

`repeat = 1:` is a range specifying that the path should be repeated one or more times.
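
In graph terms, a one-or-more repetition over the `calls` edge describes everything reachable from `someFunc` in at least one call. The Python sketch below illustrates that set with a breadth-first traversal. The call graph, its dict representation, and the helper name `reachable_callees` are all assumptions made for illustration; this is not how CLQL evaluates the query.

```python
from collections import deque


def reachable_callees(call_graph, start):
    """Return every function reachable from `start` via one or more calls.

    `call_graph` is a hypothetical mapping of function name to the names
    of the functions it calls directly.
    """
    seen = set()
    queue = deque(call_graph.get(start, ()))
    while queue:
        func = queue.popleft()
        if func in seen:
            continue  # already visited; also guards against call cycles
        seen.add(func)
        queue.extend(call_graph.get(func, ()))
    return seen


call_graph = {
    "someFunc": ["a", "b"],
    "a": ["c"],
    "c": ["a"],          # a and c call each other
    "unrelated": ["d"],  # never reached from someFunc
}
print(sorted(reachable_callees(call_graph, "someFunc")))  # ['a', 'b', 'c']
```

Note that `someFunc` itself is only in the result if some chain of calls leads back to it, which mirrors the "one or more" lower bound of the range.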

## Complex subtrees

Say we wanted to match triply nested if statements that all check the same value. Our query would look like the following:

```clql
common.if_stmt:
  common.condition:
    common.var:
      name as varName
  common.if_stmt:
    common.condition:
      common.var:
        name == varName
    common.if_stmt:
      common.condition:
        common.var:
          name == varName
```

With paths, our query has much less repetition:

```clql
common.func:
  path(repeat = 3):
    common.if_stmt:
      common.condition:
        common.var:
          name as varName
      pathcontinue
```

Note that all CLQL elements that are children of `path` are repeated, not just the `if_stmt`. Also note that in the repetitions, the repeated definitions of `varName` are replaced with assertions.
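
The definition-to-assertion rewrite can be modelled as follows. This Python sketch is an assumption-laden illustration, not the CLQL engine: it represents a query as nested `(element, children)` tuples, and the hypothetical `unroll` helper keeps `name as varName` in the first repetition and rewrites it to `name == varName` in later ones.

```python
import re

PATHCONTINUE = "pathcontinue"
# Matches a variable definition such as "name as varName".
DEFINITION = re.compile(r"^(\w+) as (\w+)$")


def unroll(body, repeat):
    """Repeat `body`, turning repeated variable definitions into assertions."""
    def expand(nodes, iteration):
        out = []
        for name, children in nodes:
            if name == PATHCONTINUE:
                # Splice in the next repetition until `repeat` is reached.
                if iteration < repeat:
                    out.extend(expand(body, iteration + 1))
                continue
            match = DEFINITION.match(name)
            if match and iteration > 1:
                # Later repetitions assert equality instead of redefining.
                name = f"{match.group(1)} == {match.group(2)}"
            out.append((name, expand(children, iteration)))
        return out

    return expand(body, 1)


body = [
    ("common.if_stmt", [
        ("common.condition", [("common.var", [("name as varName", [])])]),
        (PATHCONTINUE, []),
    ]),
]
expanded = unroll(body, 2)
print(expanded)
```

With `repeat = 2`, the outer `if_stmt` keeps the definition and the inner one carries the `name == varName` assertion, matching the pattern of the hand-written query.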

## Pathend

Suppose we wanted to match triply nested if statements with a function call inside the innermost if statement. Without paths our query looks like:

```clql
common.if_stmt:
  common.if_stmt:
    common.if_stmt:
      common.func_call
```

With paths, our query looks like:

```clql
path(repeat = 3):
  common.if_stmt:
    pathcontinue
    pathend:
      common.func_call
```
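
The equivalence between the two queries can be sketched as an unrolling in which the `pathend` children appear only in the final repetition. The Python below is a hypothetical model, not the CLQL implementation. It assumes a query is a tree of `(element, children)` tuples and that `pathend` sits alongside `pathcontinue` inside the repeated fact.

```python
PATHCONTINUE = "pathcontinue"
PATHEND = "pathend"


def unroll(body, repeat):
    """Repeat `body`, emitting `pathend` children only in the last repetition."""
    def expand(nodes, iteration):
        out = []
        last = iteration == repeat
        for name, children in nodes:
            if name == PATHCONTINUE:
                if not last:
                    out.extend(expand(body, iteration + 1))
            elif name == PATHEND:
                # Assumption: pathend contributes nothing until the path
                # has been repeated the requested number of times.
                if last:
                    out.extend(expand(children, iteration))
            else:
                out.append((name, expand(children, iteration)))
        return out

    return expand(body, 1)


body = [
    ("common.if_stmt", [
        (PATHCONTINUE, []),
        (PATHEND, [("common.func_call", [])]),
    ]),
]
expanded = unroll(body, 3)
print(expanded)
```

Under these assumptions, the result is three nested `common.if_stmt` facts with `common.func_call` inside the innermost one.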

## Caveats

Branching, where a `path` statement has multiple `pathcontinue` statements, is currently not supported.

Nested paths are not supported.

Using `any_of` inside a path statement is not supported.

@mullikine mullikine Dec 17, 2018


Does this mean that I must change this tenet to use a nested any_of? https://github.com/mullikine/codelingo/blob/psr/tenets/codelingo/psr-1/uppercase-class-constants/codelingo.yaml Will the path syntax used in the tenet be supported in the future? Also, is path expanded into a nested any_of before it is evaluated? Knowing this would help me to understand how I'm supposed to use it. For example, I'm questioning whether I should put the depth = any inside the path or the path's children. This raises a couple of questions. Can I have multiple arguments to the path fact, i.e. depth and repeat? Can I have path at the root of the CLQL query?

Contributor Author


Yes it will be supported in the future, that's a key use case.

Can you give an example of where you would be choosing between putting a depth = any inside the path or the path's children?

The path element only takes a repeat argument; facts inside path can take depth arguments. Element, by the way, is the generic term for fact, property, path, any_of, etc.

Yes you can have a path at the root.


@mullikine mullikine Dec 17, 2018


The case I had in mind when asking the question was the tenet I linked to above. Here I have "depth = any" specified for both children. But if a repeat was specified in the path then each nested child would have a "depth = any". I'm unsure what this would do to performance, but I'd imagine you might get a tetration thing going. Also, it kind of makes sense to place the "depth = any" within the path fact because then you're only specifying it once. Either that or place the "depth = any" on a zero-width parent to the path fact. Should this go into discuss?

Contributor Author


I still don't know what you're suggesting. I can't think of any repeat argument that you could pass to path that would replace the need for a depth = any. Can you write up some CLQL?


I made a topic on discuss to continue the conversation.
https://discuss.codelingo.io/t/clql-syntax-new-features/87


## Decorators

Some decorators, such as `@review comment`, can only be used once per query. Using them in a repeated path will cause an error.

<br />

# Variables

Facts that do not have a parent-child relationship can be compared by assigning their properties to variables. A query with a variable will only match a pattern in the code if all properties representing that variable are equal.