-
Notifications
You must be signed in to change notification settings - Fork 151
Add SmallTag
type for more compact Dual
types
#748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
SmallTag
type for more compute Dual
typesSmallTag
type for more compact Dual
types
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #748 +/- ##
==========================================
- Coverage 89.57% 86.92% -2.66%
==========================================
Files 11 10 -1
Lines 969 1025 +56
==========================================
+ Hits 868 891 +23
- Misses 101 134 +33 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
We could also revive #572, closed it was thought that SciML printing problems were solved in Base. Can you comment on the comparison? |
Base has type-folding now that helps many stack traces, but you still sometimes need a
The main difference is that debugging perturbation issues without the actual function / array types can be very confusing, so it seems useful to keep the original functionality around. Other than that, the implementation should be mostly equivalent (the hash is based in many cases on the Also it's worth mentioning that |
This is an alternative to `Tag` that provides largely the same functionality, but carries around only the hash of the function / array types instead of the full types themselves. This can make these types much less bulky to print and easier to visually scan for.
This provides a convenient interface to ask for a SmallTag.
Latest patch release seems to have tightened this up.
This old version of Julia doesn't have call-site `@inline` / `@noinline` so version-guard against this trick for now.
98b57bb
to
d88bf58
Compare
Chatted on this with @oscardssmith and @topolarity . This is better than using the objectid because it is consistent with respect to precompilation, i.e. in a new session the objectid can change depending on the order that you evaluate functions, while this is consistent. That plus on v1.11 this is const eval-able, and thus using this form of a short tag is type-stable and makes it so you will precompile the autodiff calls if you autodiff w.r.t. the same function. With those two properties though, I would be highly in favor of defaulting to this on v1.11+, since it won't hurt performance or precompilation, and I cannot see a scenario where a normal user would favor the tags based on function types. It would be odd to have huge multi-screen types on v1.10 and then good stack traces on v1.11, but IMO that's not a reason to make it wait longer. I'd like to hear @KristofferC's opinion here. But either way, I think we'd want to set this up as well in ADTypes and DifferentiationInterface @gdalle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/config.jl
Outdated
# SmallTag is similar to a Tag, but carries just a small UInt64 hash, instead | ||
# of the full type, which makes stacktraces / types easier to read while still | ||
# providing good resilience to perturbation confusion. | ||
struct SmallTag{H} | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an actual docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it should be marked public or exported, but so should other symbols (#744)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure this needs a new type at all? Could we just extend Dual
's first type parameter from Union{Nothing, Tag}
to perhaps Union{Nothing, Tag, Integer}
?
julia> ForwardDiff.Dual(pi/2, 1)
Dual{Nothing}(1.5707963267948966,1.0)
julia> ForwardDiff.derivative(x -> @show(x), pi/2)
x = Dual{ForwardDiff.Tag{var"#37#38", Float64}}(1.5707963267948966,1.0)
1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it does need a new type, the obvious name is HashTag
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it should be a type. Otherwise people might start passing 1 by accident. A UInt might be ok though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's safer as a type, so it should be a type.
I'm not really convinced that #724 is a proper fix for the issue it claims to solve - it is still possible for the counter to overlap, e.g., between multiple precompiled packages. I am happy to update the type exports while we're at it though |
I don't feel confident enough to evaluate that one, but I thought it looked related to what you did with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a full review but some comments:
-
If we want to keep two methods of tagging around, is this the right way to access them? It seems quite an obscure internal detail, which should never change any results. So perhaps need not be a new keyword on every user-facing function.
-
I wonder if the implementation can be much simpler, like just storing the integer as the tag, instead of adding more structs.
src/config.jl
Outdated
# SmallTag is similar to a Tag, but carries just a small UInt64 hash, instead | ||
# of the full type, which makes stacktraces / types easier to read while still | ||
# providing good resilience to perturbation confusion. | ||
struct SmallTag{H} | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure this needs a new type at all? Could we just extend Dual
's first type parameter from Union{Nothing, Tag}
to perhaps Union{Nothing, Tag, Integer}
?
julia> ForwardDiff.Dual(pi/2, 1)
Dual{Nothing}(1.5707963267948966,1.0)
julia> ForwardDiff.derivative(x -> @show(x), pi/2)
x = Dual{ForwardDiff.Tag{var"#37#38", Float64}}(1.5707963267948966,1.0)
1.0
src/config.jl
Outdated
@inline function ≺(::Type{SmallTag{H1}}, ::Type{SmallTag{H2}}) where {H1,H2} | ||
tagcount(SmallTag{H1}) < tagcount(SmallTag{H2}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be obvious, but can you explain a bit why this is designed this way? Why calculate hash H
and store that in the type, but call the generated tagcount(SmallTag{H1})
to get a different integer whenever you use it. Why not store the integer from TAGCOUNT
as a type parameter directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just to preserve the current ordering behavior for tags as closely as possible - the tag hashes are not sorted chronologically, so they would make a very different order
Also, am I correct that this doesn't make perturbation confusion impossible, only vanishingly unlikely? |
@gdalle that is correct. With a 64 bit hash, we have a 50% collision chance once you get to 4 billion different tags (that are used in the same operation), but as long as your number of tags is below millions (which should always be the case), the chance of confusion is incredibly unlikely |
IMO we should merge this. |
Would someone care to answer this, and related comments above?
|
If someone happens to be doing ForwardDiff of the same function against a few million different arguments at the same time, then there would be an extremely small probability of a perturbation confusion. |
No I don't think it's related. This PR just uses the current way for the tag incrementer, which could potentially have an issue with some odd multithreading cases, but that PR also isn't a proper fix. But it's pretty unrelated since the current code already has this issue, and so there's no difference in this respect between default and smalltag, just whomever does end up wanting to do something about the multithreading issue will need to hit these two functions. I don't think that's a major maintenance burden to stop a PR like this. |
This was heavily discussed with about 30 people at JuliaCon. Basically I made a "SmallTag triage" JuliaCon Hackathon discussion. Everyone was really excited that there was a fix that was already made and the question was "why is this not merged"? The general consensus about certain topics in here was:
So with this, I think we have answered @gdalle's questions (mostly seemed to just be questions rather than concerns) and @topolarity is making some of these final changes based on @mcabbott's requests, and then two JuliaDiff maintainers (@oscardssmith and I) are sitting together with @topolarity and the community and when it all seems good plan to merge. |
The three of us agreed that using a tag type is more clear and less likely to run into a potential issue. One potential issue is someone putting a type parameter in the wrong spot, so trying to specify a chunksize=1 and instead accidentally setting a tag to 1, and then opening themselves up to perturbation confusion. So using a type and UInt just is a nice safety measure. We note that the exact tag was considered by everyone to just be an implementation detail and not public API anyone should be relying on, so it wouldn't be a breaking change to change this to just an Int in the future. That could be a discussion we can continue, but it doesn't seem like there was much support for that idea. |
@topolarity noted that v1.10 cannot compute a stable hash at compile time, and this means that these hashes cannot be inferred. Because of this, it would not make a good default on v1.10. It should be considered to be removed after the next LTS bump, but as mentioned in the previous comment, the general Julia user did not believe that we should wait for another LTS in order to make this the default and so we would need to live with the two separate defaults for at least the short term future. |
The reason for the keyword argument is due to the potential of hash collision, this gives a method to flip to the version for which it's debuggable without that potential issue. Though as mentioned, it's vanishingly unlikely, but we should at least move with some care to start and give people an easy way to diagnose if it does cause an issue. |
Can you explain what you mean, who are these multiple people? You are envisaging that libraries which call ForwardDiff may want to choose not to use the hash one on their internal calls, even if their end users do? But I don't see why. If this thing is safe enough to turn on for everyone, and I buy that it is, then why must we inflict clutter on them to make it easy to turn off? If it's not so safe and is an experimental feature, that's different. |
Changed to HashTag.
So you want it a bool? Let's just figure out what you want. This is going in circles. Should we just get on a call? I can now make it a bool if that makes you happy.
Its safe for everyone. I don't know what you're getting at. |
It's not going in circles. I'm against clutter, especially user-facing clutter. If this is safe for everyone, then I don't understand the desire for every user-facing function to gain a new keyword to control it, and every user-facing docstring to gain a description of this new keyword's states. Leaving an obscure way to change the internal behaviour back in place seems preferable to that. |
Okay so no keyword argument now? I can do that. |
I am keen not to drive the code in circles. Switching from the keyword to Preferences is some work, and if there are solid arguments against it, better to figure them out before coding it. I just haven't seen them made here. |
Okay, is this form what you were looking for? I'm trying triangulate the 3 difference responses of (1) make it a bool, (2) make it as simple as possible because it's safe so just do it and (3) use Preferences, each of which are competing. I think the simplest version is all that's needed. If it's safe and what everyone should use, does there need to be a Preferences? |
By removing the option, which is effectively reverting the test changes because now the option is gone and what the tests added was tests that the option worked. Because the tests already have inference testing, even the inference test is unnecessary to keep so we're good.
So now the option is gone, the tests which use the option are gone (so basically a revert of that), and the docstring changes that reference the new option are gone. Are we good? |
Ahh actually that extra inline stuff can go... |
We can be civil and discuss pros and cons of all options. Ideally before coding each of them in sequence. I'm against unnecessary user-facing clutter, but open to persuasion about what's necessary. Maybe a bool keyword isn't in fact a great idea, as it used to accept 3 options (nothing, :default, :small) but then 4 (nothing, :default meaning 1.10 vs 1.11 auto, :type and :hash). So that leaves
I still think that the last option is the worst. Although I'm not sure I understand yet what argument Oscar was making for it above. Paging @devmotion and @KristofferC for their thoughts? This isn't a tiny change. |
This was the other PR, which never merged because of relocatability issues and inference on v1.10, so that's a no.
This one is a no and was already removed.
This is what the PR doing. Is that fine? |
And I have also added a Preferences version. So from the list:
I am still confused which one you want, but we now have an implementation of all 4 versions. Please let me know which version you prefer and we can just take that commit. |
Firstly, stuff like
is not a great way to go about things. It puts unfair pressure on the maintainers and it is basically emotional manipulation to merge something because you don't want to be the party pooper against all those "30 people" in a room cherring in champagne about how great this PR is. Secondly, this PR is really hard to review, it has like a gazillion of Update commits etc. If you want to work on this on the hackathon I suggest first cleaning up the PR, make the commits nice etc and then we can take another shot at looking at this objectively. |
clean new dispatches out Update src/config.jl Update src/config.jl Update src/config.jl Update src/config.jl Update src/config.jl Update src/config.jl Update src/config.jl Add a preferences version Update src/config.jl Update src/config.jl
c62e14f
to
5ef9860
Compare
It was a moving target. First I was told that using the keyword argument
Which do you want? We will just do that and then squash. But if every single time the PR is updated it's changed to another then it's going to be a mess. So please let me know which you would be happy with and we will squash to that version. |
Bump @mcabbott is this the form you think is optimal? |
I do think it's worth discussing the pros and cons as @mcabbott suggested There's no need to rush this PR - IIUC we're on the same page that we probably want a change like this, so we just have to discuss and convince ourselves of the details (which seems fair to me) Tackling that:
My personal preference is a combination of (4) and (2), but (3) seems OK too I expect the majority of end users (>90%) will not change this setting, via Preferences or otherwise, so I agree we need good defaults. The hash is probably not a good default on 1.10, because it unfortunately makes the tag constructor type-unstable. However, it does seem like a good choice on 1.11+ My feelings are less strong about the user config piece (happy to defer to your judgment @mcabbott ), but I'm generally reluctant to add preferences for things that are likely to be "hidden" concerns for an end-user. In this case, a lot of indirect users of ForwardDiff exist that have no preference / understanding for the perturbation system / issue, so it seems strange to expect them to know that this preference exists and that they will set it. Moving it to preferences also makes it impossible for a package doing its own AD calculations to have an opinion about perturbation tracking (even if, e.g., that package hits a perturbation bug of some kind and would like to be able to code in a fix for its users) All of that said, it's true that we mostly expect these two options to have similar perturbation tracking capability. I can definitely see the argument that the preference really only affects display / diagnostics, which is arguably in the end user's "domain" Anyway I'm OK with either direction for config Hopefully that at least clarifies the original intent before this PR was hackathon'd |
Maybe the way to see it is that the kwarg was trying to simplify the existing "tag control" in the API. That way, a direct user of ForwardDiff only has to choose from a set of options, instead of constructing a tag themselves like they'd do now if they want to tweak tag behavior. If we'd prefer the original interface though that's fine - packages can still pass in the other tag type if they really want to control this aspect of tag behavior |
This is largely to improve printing and reduce visual noise for packages with large function types (e.g. SciML)
Compare
tag = :small
:to
tag = :default
: