
Feature request: preserve source line numbers in AST nodes #64

@lvh

Summary

It would be helpful if the parsed AST nodes included source line number information from the original markdown text.

Use case

When building linting tools that check markdown structure (e.g., heading hierarchy validation), error messages are much more useful when they can reference the specific line number:

Line 15: H1 not allowed in blog posts: 'Introduction'

vs just:

H1 not allowed in blog posts: 'Introduction'

Current behavior

The AST nodes only contain content, not source positions:

(require '[nextjournal.markdown :as md])
(md/parse "## Heading\n\nParagraph")
;; => {:content [{:type :heading, :heading-level 2, :content [...], :attrs {...}}
;;               {:type :paragraph, :content [...]}], ...}

Proposed behavior

Include source position information, perhaps as metadata or additional keys:

{:type :heading
 :heading-level 2
 :content [...]
 :attrs {...}
 :source {:line 1 :column 0}}  ;; or similar
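
To illustrate how a linter might consume such a key, here is a minimal sketch. It runs against a hand-written AST rather than real parser output: the `:source` map is the hypothetical addition proposed above, the `{:type :text :text ...}` leaf shape is an assumption about the inline nodes, and `node-text` and `h1-errors` are illustrative helper names, not part of nextjournal.markdown.

```clojure
;; Hand-written stand-in for a parsed document. The :source key is the
;; hypothetical addition proposed in this issue, not current library output.
(def doc
  {:type :doc
   :content [{:type :heading
              :heading-level 1
              :content [{:type :text :text "Introduction"}]
              :source {:line 15 :column 0}}
             {:type :paragraph
              :content [{:type :text :text "Some text."}]
              :source {:line 17 :column 0}}]})

(defn node-text
  "Concatenates the :text of a node and all of its descendants."
  [node]
  (apply str (:text node) (map node-text (:content node))))

(defn h1-errors
  "Returns one message per top-level H1, prefixed with its line when known."
  [doc]
  (for [{:keys [type heading-level source] :as node} (:content doc)
        :when (and (= :heading type) (= 1 heading-level))]
    (str (when-let [line (:line source)] (str "Line " line ": "))
         "H1 not allowed in blog posts: '" (node-text node) "'")))

(h1-errors doc)
```

Because the line prefix is only added when `:source` is present, the same check would keep working on nodes without position data and simply fall back to the shorter message.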

Notes

The underlying commonmark-java parser can track source spans (SourceSpan, enabled through the parser builder's includeSourceSpans option), so this information can be made available during parsing but is not currently exposed in the AST.
