Robust parallel_chat_structured(..., convert = TRUE) (fixes #864) #725
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
I don't think flattening is the right solution here; if ...
I suppose it depends on the interpretation of "convert", which, for me, implies that a list would be flattened into a data.frame. If that's not what's wanted, then a user should arguably not specify convert = TRUE. I've reduced the problem to a simpler reprex, added here. The fix seems to be limited to one of:
I'd be happy to revise the PR if you choose which fix you prefer. Leaving it without a fix, though, can cause frustrating (and costly) data loss when a user is unaware that a nested type will trigger this error.
I think you can just leave this to me; I know exactly what data type this should be, and for now I'm pretty certain that's correct, even if it gives you a relatively unusual data structure. (There's no guarantee it will be tidy, but I don't think that's a guarantee that applies to tools like ellmer that need to interface with other systems; you certainly might want to use some tidyr afterward if you are looking for a tidy df.)
OK, sounds good. The main thing is to modify the conversion so that users don't spend the cost and time on the requests only to have it crash when executing the convert step. Close the PR or do with it what you will. Hopefully my test conditions will help.
Solves #684

When convert = TRUE and the schema is an array of objects that themselves contain nested objects (e.g., economic and social), convert_from_type() produced data.frame columns that were themselves data frames. This triggered list2DF() conversion failures and made downstream handling (including tokens/cost) brittle.

In R/chat-structured.R, convert_from_type(): for TypeArray(TypeObject) with declared properties, build columns for each property across items, then flatten any nested data.frame columns by prefixing them with the parent property name before calling list2DF() to build the data.frame.
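The flattening step described above can be sketched in base R as follows. This is a minimal illustration, not the PR's actual code: flatten_nested_cols() is a hypothetical helper name, and the input list is made-up data shaped like the economic/social example.

```r
# Hypothetical helper (not from the PR): replace each data.frame-valued
# column with one column per nested field, named "<parent>.<field>".
flatten_nested_cols <- function(cols) {
  out <- list()
  for (name in names(cols)) {
    col <- cols[[name]]
    if (is.data.frame(col)) {
      # Prefix every nested column with the parent property name
      for (sub in names(col)) {
        out[[paste0(name, ".", sub)]] <- col[[sub]]
      }
    } else {
      out[[name]] <- col
    }
  }
  out
}

# Made-up columns, one entry per declared property across two items
cols <- list(
  id = c("a", "b"),
  economic = data.frame(score = c(1, 2), evidence = c("x", "y")),
  social = data.frame(score = c(3, 4), evidence = c("p", "q"))
)

# list2DF() (base R >= 4.0.0) now only sees plain, equal-length vectors
flat <- list2DF(flatten_nested_cols(cols))
names(flat)
```

Whether flattening or keeping data-frame columns is the right behaviour is exactly what the conversation above debates; the sketch only shows what the prefixing approach produces.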
In R/parallel-chat.R, multi_convert(): coerce token fields to integer with as.integer(turn@tokens) before aggregation. Some providers (e.g., Gemini) return numeric doubles; this prevents the vapply "values must be type 'integer'" error under convert = TRUE.

The test file (tests/testthat/test-parallel-chat-structured.R), using chat_openai_test() and short inline prompts, validates:
- convert = TRUE returns a flattened data.frame with the expected nested column names and includes input/output/cached_input token columns and cost.
- convert = FALSE returns the raw list with the expected nested structure for comparison.
- The as.integer() coercion, ensuring token columns attach cleanly. I added this because I was getting a different error crashing the conversion to data.frame with chat_google_gemini().

The result is a flat, analysis-ready data.frame (e.g., economic.score, social.evidence), addressing the list/data.frame conversion failure reported in #864.

To verify, use a schema with nested objects and call parallel_chat_structured(..., convert = TRUE); check that the resulting data.frame has columns like economic.score, economic.evidence, social.score, social.evidence and that tokens/cost attach without errors.
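The token coercion mentioned for multi_convert() can be illustrated with a small base R sketch. The token vectors below are made-up stand-ins for turn@tokens (assumed here to be a length-3 numeric vector), with the second one mimicking a provider that returns doubles.

```r
# Hypothetical per-turn token counts; the second is stored as doubles
tokens_per_turn <- list(
  c(input = 10L, output = 5L, cached_input = 0L),
  c(input = 12, output = 7, cached_input = 0)
)

# Without coercion, vapply() rejects the double vector because it does
# not match the integer(3) template; capture the message for inspection.
res <- tryCatch(
  vapply(tokens_per_turn, function(x) x, integer(3)),
  error = function(e) conditionMessage(e)
)

# Coercing each vector first makes every result match the template
tokens <- vapply(tokens_per_turn, function(x) as.integer(x), integer(3))
tokens <- t(tokens)  # one row per turn
colnames(tokens) <- c("input", "output", "cached_input")
tokens
```

Note that as.integer() truncates any fractional part, which is harmless only as long as token counts are whole numbers.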