-
Notifications
You must be signed in to change notification settings - Fork 319
Feature/compaction #311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature/compaction #311
Conversation
Please note that Claude-Sonnet-3.5 works much better than Gemini-2.5-Flash. |
I think, though, that this is mostly a LLM-related behavior. But I'll need to dig deeper into this. Maybe it is not (only) that Gemini-2.5-Flash is just worse at tool use. |
Some integration tests fail since I merged the latest uv.lock into this branch. It seems like the main change was that mcp was updated from All unit tests are still passing, and the agent runs normally, as far as I can tell. Will try to find the root cause on the weekend. |
storlien wrote this in #286:
|
Hi @janspoerer -- send me a DM on discord if you need help with a key. |
This is the error I got when trying out gemini-2.5-pro.
Had to add
several places in the direct_decorators.py file first. Haven't tried with your latest changes. This occured when your latest commit was "Made linter happy". Can try with your latest changes now. Full traceback:
|
Thanks a lot. Will have a look into this on Monday. I have progressed with the OpenAI truncation as well. Will push the changes now. |
Wait, I see that I should put the params in the RequestParams. Trying it out now |
I changed:
to:
As there was a dependency conflict.
|
I am not sure why GitHub displays the warning "This branch has conflicts that must be resolved." I have merged all changes from the main branch locally into this feature branch. Do you have an idea, @evalstate? Other than that, given that the tests and linting work here, I think this is ready for review. |
Ah my bad, I have of course compared this to my forked main branch. Sorry, will sync my fork. |
Alright, now the tests are being queued... All good :-) |
… into feature/compaction
@janspoerer how are you running the tests? they work from the command line if api keys are set; not otherwise. are api keys strictly needed for these (if so, we can shift to e2e). the |
Sorry for the late reply! Thanks for looking at this. I run the tests using:
And now they are running. I turned off the integration test that required Google API keys. This test is not needed. |
I will use the features from this PR next week in some long-running workflows. The features will thus be somewhat battle-tested (and not only have integration tests and unit tests). |
Update: Still battle-testing. :) |
Good stuff. I'm definitely going to need your help with the merge to 0.3.0..... |
Hi @janspoerer lmk how the testing went. I've been having a look at this to figure out the best way to proceed with merging with 0.3.0. My current thought is to bring the e2e tests over as-is (as the top-level interface remains the same), and then manually merge the functionality (as I think post-refactor we can reduce some of the duplication you've had to deal with). |
Hi @evalstate, sorry for the late answer. I'm building a browser use MCP for work that I wanted to use to test this compaction feature, but the MCP took longer than expected, so my testing got delayed. So far, I could not find bugs with the compaction feature. Maybe to clarify:
Can you maybe elaborate on what you mean with "manually merging"? Thank you :) Next step would thus be: I remove the e2e changes? Anything else that you would like to see being done here? |
No description provided.