Change Generation Tools to Allow Insertion of Items #43

@Ryfernandes

Having a client insert new content into a Docling Document at a chosen position, rather than always appending it to the end, seems like a reasonable use case, and it is one I have run up against in testing. It would therefore be useful to add this functionality to the tools in the generation.py file. However, some open questions exist regarding its implementation.

Since we are trying to reduce the total number of tools in pursuit of better small-model performance, it makes sense to me to implement this feature as a refactor of the add_* methods. I have already extended the DoclingDocument methods to include insert_* methods that make this implementation simple, and a client could simply omit the sibling anchor to trigger the default "append to end" behavior. It seems reasonable that the client would be able to grasp this and call the tools correctly based solely on the docstring/parameter annotations. However, I am open to discussion on making this a separate group of tools rather than a refactor.
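To make the refactor option concrete, here is a minimal sketch of a single tool whose optional sibling anchor switches between appending and inserting. Everything in it is an assumption for illustration: `ToyDocument`, `add_paragraph`, the `sibling_anchor` and `after` parameters, and the anchor strings are hypothetical and do not reflect the actual DoclingDocument API or the current tools in generation.py.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document; not the DoclingDocument API."""

    items: list = field(default_factory=list)  # ordered (anchor, text) pairs
    next_id: int = 0

    def new_anchor(self) -> str:
        # Anchors are made-up reference strings used only in this sketch.
        anchor = f"#/texts/{self.next_id}"
        self.next_id += 1
        return anchor

    def index_of(self, anchor: str) -> int:
        for i, (existing, _) in enumerate(self.items):
            if existing == anchor:
                return i
        raise ValueError(f"unknown sibling anchor: {anchor}")


def add_paragraph(
    doc: ToyDocument,
    text: str,
    sibling_anchor: Optional[str] = None,
    after: bool = True,
) -> str:
    """Add a paragraph and return its anchor.

    If sibling_anchor is omitted, the paragraph is appended to the end.
    Otherwise it is placed directly after (or before) the given sibling.
    """
    if sibling_anchor is None:
        position = len(doc.items)  # default: append to end
    else:
        # Resolve the sibling first so an unknown anchor changes nothing.
        position = doc.index_of(sibling_anchor) + (1 if after else 0)
    anchor = doc.new_anchor()
    doc.items.insert(position, (anchor, text))
    return anchor
```

With this shape, a client that never passes `sibling_anchor` sees the same behavior as the existing add_* tools, so the tool count stays the same. The cost is that the docstring has to carry the full explanation of the optional parameters.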

Additionally, most NodeItems are simple to add, but list items have the restriction that their parent must be a list group. The current server seems to track this using a local stack cache, but I wonder whether the insert_* tools need to perform the same check in the same manner, or whether the stack can be done away with entirely. If we can resolve the sibling NodeItem from the cref in the document anchor, how different would it be to resolve and check the parent attribute of that NodeItem rather than source it from the stack? In my mind, removing the stack could make the project simpler, but if there is some other reason for it that I am naively missing, please let me know. The stack can be useful for opening/closing lists, but I feel there should be other ways to model this behavior without it, especially since the stack itself is not exposed to the client.
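As a rough illustration of the stack-free check, the sketch below resolves the sibling from its cref, follows its parent reference, and only allows a list item insertion when that parent is a list. The `Node` and `ToyTree` classes, the label strings, and the cref format are all hypothetical stand-ins, not the real NodeItem model or the server's current logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical node; stands in for a NodeItem in this sketch only."""

    cref: str
    label: str  # e.g. "body", "list", "list_item", "paragraph"
    parent: Optional[str] = None  # cref of the parent node, None for the root
    children: list = field(default_factory=list)  # ordered child crefs


class ToyTree:
    def __init__(self) -> None:
        self.nodes = {"#/body": Node("#/body", "body")}
        self.counter = 0

    def _create(self, label: str, parent_cref: str) -> Node:
        node = Node(f"#/items/{self.counter}", label, parent=parent_cref)
        self.counter += 1
        self.nodes[node.cref] = node
        return node

    def append_child(self, parent_cref: str, label: str) -> str:
        node = self._create(label, parent_cref)
        self.nodes[parent_cref].children.append(node.cref)
        return node.cref

    def insert_list_item_after(self, sibling_cref: str) -> str:
        # Resolve the sibling from its cref instead of consulting a stack.
        sibling = self.nodes.get(sibling_cref)
        if sibling is None:
            raise ValueError(f"unknown sibling: {sibling_cref}")
        if sibling.parent is None:
            raise ValueError("the root node has no siblings")
        parent = self.nodes[sibling.parent]
        # The list-item restriction is checked on the resolved parent.
        if parent.label != "list":
            raise ValueError("list items can only be inserted into a list")
        node = self._create("list_item", parent.cref)
        index = parent.children.index(sibling_cref)
        parent.children.insert(index + 1, node.cref)
        return node.cref
```

This sketch only covers the validation step. It does not answer how opening and closing lists would be modeled without the stack, which is the part still open for discussion.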

I will start blocking out the implementation of the insert_* tools in a draft PR, but will leave the decisions on whether to combine them with the add_* methods and what to do with the stack until they have been discussed.
