Skip to content

Conversation

Octachron
Copy link
Contributor

This PR updates the vendored parsers and related types to mirror the 5.4 version of the OCaml parser and adds a basic support for labeled tuples and bivariance.

The parser updates means that it is possible to locate comments between Longident.t component, and it is always possible to distinguish between (module M): (module S) at least for expressions.

The support for labeled expressions is bit ad-hoc and hackish in this version due to the way that labeled tuple element interact with parenthesis, for instance inside:

let t = ~x:(Fun.id 0), 1

and it could definitively be improved.

- update vendored parsers to mirror upstream at 5.4:
  * introduce locations for Longident.t components
  * distinguish (module M:S) and ((module M):(module S)) for expressions
- support for new syntaxes:
  * bivariance
  * labelled tuples
Copy link
Collaborator

@Julow Julow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow reply. This is awesome :) Thanks a lot!

@@ -3185,10 +3316,12 @@ type_variance:
| MINUS BANG { [ mkvarinj "-" $loc($1); mkvarinj "!" $loc($2) ] }
| BANG MINUS { [ mkvarinj "!" $loc($1); mkvarinj "-" $loc($2) ] }
| INFIXOP2
{ if ($1 = "+!") || ($1 = "-!") then [ mkvarinj $1 $sloc ]
{ if $1 = "+!" || $1 = "-!" || $1 = "+-"|| $1 = "-+"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we could accept anything starting in +, - or ! here ? The standard parser will catch errors there anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the parser point of view, we could accept any INFIXOP2. And I agree that it probably better to accept this when building the CST.

lib/Fmt_ast.ml Outdated
@@ -2822,13 +2863,36 @@ and fmt_expression c ?(box = true) ?(pro = noop) ?eol ?parens
in
let outer_wrap = has_attr && parens in
let inner_wrap = has_attr || parens in
let with_label (lbl, exp) =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead handling special cases for ( ~foo, ~(bar : t) ), etc.. do you think we could represent the concrete syntax in the parsetree ?
I mean something like type tuple_label = Tl_pun of string loc | Tl_constraint of string loc * core_type | Tl_normal of string loc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are case where people don't want to normalize ~x:x, ~y to ~x, ~y ? If yes, it is probably better to separate the two forms in the parsetree. Otherwise, I am not sure if this is that useful?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to make the formatting code simpler and less buggy. Decoding the AST was a major source of bugs in the past.
We generally do the normalization by modifying the AST: https://github.com/ocaml-ppx/ocamlformat/blob/main/lib/Extended_ast.ml#L68

I'd be happy to do the changes if you don't have the time ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After experimenting a bit more, I agree that it feels better to not have a straight mapping of locations with punning. I will send my updated commits after some more tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants