Skip to content

Conversation

sacerdot
Copy link
Collaborator

@sacerdot sacerdot commented Jul 17, 2025

DO NOT MERGE YET

Bugs to be fixed:

  • "lambdapi websearch --require HOLLight.theory_hol" triggers an assert failure due to changes to recursive require/open

Todo:

  • docs and changelog need to be updated
  • also see TODO.md for a list of pending things before merging

New features (to be documented too):

  1. new command "lambdapi deindex"
  2. the current development (i.e. the set of currently required files) is now indexed and deindexed automatically
  3. new filter | "regexp" to restrict the set of results using a regexp over the full qid
  4. "lambdapi websearch" now uses a rocqParser that accepts (also) the syntax of Rocq for fun/forall/exists; it also remaps /\ / ~ to the corresponding LP tokens for logical connectives
  5. new flag "--source" to load a map from LP identifiers to Rocq sources (file + line); the Rocq sources are also shown by websearch
  6. the "--require" flag can now be passed multiple times (no longer necessary since Frederic has implemented recursive open, but nice to have anyway; see bug to be fixed above though)
  7. when an identifier is ambiguous, but after normalization it is no more, the query is now considered non ambiguous
  8. "lambdapi search" now streams the result
  9. Plac in rewriting rule is now supported (handled like a variable... hoping that it makes sense)
  10. bug fix: many spurious justifications were included
  11. bug fix: avoid Stack_overflow by rewriting a few functions using tail recursion

sacerdot added 30 commits July 3, 2025 16:44
It deindex all constants whose path is a suffix of PATH.

Question: what are useful CLI args for deindex? E.g.
- a list of filenames
- a .pkg file (or the fact that it must use the .pkg file)
- a user-provided PATH like it is now?

Moreover notice that there is currently no check that PATH is a well-formed
PATH and A.B matches A.BC as a prefix.
Now the query 'concl >= _ | ".*vsum.*"' terminates on the medium-sized HOL
light extraction
a query like 'concl = _' was returning justifications of the form

_ found in hypothesis

Now only really relevant justifications are returned

Signed-off-by: Claudio Sacerdoti Coen <[email protected]>
1. during indexing using --source=PATH one can add indexing information
   mapping LP identifiers to Coq identifiers and locations
2. when the search results are shown, the Coq code is also printed if
   the LP identifier is found in the map in point 1

The code to generate the file used in 1 should belong to HOL2DK. Right now
we have hacked some best-effort shell script that we will commit in the
HOL2DK_indexing repository
It used to match only prefixes of strings, which was quite weird
Note: there's no easy way to implement the same check for rules.
However it is unlikely to index twice a file that contains only
rewriting rules.
E.g. in last rewriting rule in tests/OK/coercions.lp
1. we used a Timed.ref to update the indexes with the local development
 (i.e. just the files that have been required, not the others) so that it
 is possible to query the local development
2. when doing "require ... as Foo" the user can use Foo.X in the queries;
 (but not regular expressions involving Foo, but that seems reasonable)

We have introduced some "hakish" code here and there to be reviewed. In
particular:

1. we store in Common/Mode if we are running LSP
2. we detected that the current LSP buffer has "fname" set to an URI
   "file:///". We use this feature to distinguish between the buffer
   and the filesystem so that, when showing query results to the user,
   if the text comes from the buffer we can retrieve it
3. we use a finely tuned combination of Stdlib.ref/Timed.ref to "remember"
   things that would be forgotten. E.g.:
   a) we remember the LSP buffer content *after* the end of the compilation
   b) we remember the index *after* the end of the indexing of a single
      file
4. we used a maison stream-like representation of file contents to be able
   to retrieve the text to be shown to the user both from the filesystem
   or from strings (i.e. the content of the LSP buffer)
5. since indexing is now performed even during compilation when things
   enter a signature in LSP mode, we created a huge circular dependency
   in the modules that we break using multiple callbacks and Stdlib.ref
   that are set here and there. Maybe one can do (much) better.
   In particular pure now depends on tool.
- doc and welcome page of websearch to be updated with the new syntax
- in "lambdapi search" and VScode -> and forall cannot be used any more
- the new syntax allows all the LP syntax plus:
  t ::= ... | forall binders, t | exists binders, t | fun binders => t
            | t -> t
- the binders used in the new entries allow also "x ... z : t" that is
  currently rejected by LP (but it is allowed in Coq)

TODO: but for exists, all the Coq's stdlib notations are NOT in place
(e.g. arithmetic/logic connectives, numbers, ...). How should we add them?
Maybe one possible path is to have "special" notation files that can be fed
to websearch when it starts. As for rewriting rules for normalization, the
notation files are supposed to add notation to symbols that have NOT been
put in the environment. It may be feasible: the file that calls Pratter
already uses our find_sym to resolve symbols!
- this basically removes ALL overloaded problems from the HOL export
- the code is written so that the two normalized symbols may be
  unequal (in the sense of =) but have the same path. Maybe one should
  investigate why this is necessary, but for the time being it is safe
  to ignore. [ We are probably comparing a real symbol with a bogus one
  or even two bogus ones that differ ]
…kept

Example: Real was no longer found because it must become R and not Real
In particular one can require (and open) theory_hol.lp and then
require (and open) a notation.lp file to add infix notation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants