Formalize symbol names reserved for Perl's use in XS #24003
This creates a regular expression pattern of names that we feel free to
expose to XS code's namespace. Hence they are names reserved for our use,
and should any conflicts arise, the module needs to change, not us.
Naturally, the pattern is pretty restrictive.
Any symbol beginning with "PL_"
Any symbol containing perl, Perl, or PERL, usually delimited on
both sides so as to keep it from being part of a larger word.
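As a rough illustration of the two rules above, here is a minimal C sketch that classifies a symbol name. It is not the pattern from this PR. The function name `is_reserved_for_perl` is hypothetical, and "delimited" is assumed here to mean that the neighbouring characters, if any, are not ASCII letters. The real pattern only "usually" requires delimiting, so it may accept names this sketch rejects.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of perl: returns true if `name` falls under
 * the two rules described above, given the assumptions in the lead-in. */
static bool is_reserved_for_perl(const char *name)
{
    static const char *const spellings[] = { "perl", "Perl", "PERL" };

    /* Rule 1: any symbol beginning with "PL_". */
    if (strncmp(name, "PL_", 3) == 0)
        return true;

    /* Rule 2: perl/Perl/PERL that is not embedded in a longer run of letters. */
    for (size_t i = 0; i < sizeof spellings / sizeof spellings[0]; i++) {
        const char *hit = name;
        while ((hit = strstr(hit, spellings[i])) != NULL) {
            /* Left side: start of the name, or a non-letter such as '_'. */
            bool left_ok  = hit == name || !isalpha((unsigned char)hit[-1]);
            /* Right side: end of the name ('\0'), or a non-letter. */
            bool right_ok = !isalpha((unsigned char)hit[4]);
            if (left_ok && right_ok)
                return true;
            hit++;  /* keep scanning past this occurrence */
        }
    }
    return false;
}
```

Under these assumptions, names such as `PL_sv_undef`, `Perl_newSV`, and `my_perl` are classified as reserved, while `properly` and `KEY_END` are not.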
Any other spelling that we expose could be considered to pollute the XS
code space. We have felt free to do that all the time. Any new
function's short name will do that.
And we generally feel free to create macros with arbitrary names which
could conflict with an existing XS name.
Some important potential conflicts are:
New keywords: We create an exposed KEY_foo macro. Some existing
modules use some of these. My grep of CPAN shows maybe a dozen of these
get used; mostly KEY_END.
config.h is full of symbols like HAS_foo, I_bar, and others that are all
exposed. I don't imagine we can claim to reserve any symbol beginning
with either HAS_ or I_. And I don't know what to do here.
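To make the kind of conflict concrete, here is a small C sketch of what an XS module that has its own `KEY_END` runs into. The `#define` stands in for a macro exposed by perl's headers and its value is made up. The enum `editor_key` and its members are hypothetical module code.

```c
/* Stand-in for a macro that perl's headers expose to XS code.
 * The value 42 is invented for this sketch. */
#define KEY_END 42

/* Hypothetical XS module code. Without the #undef below, the enumerator
 * `KEY_END = 2` would be macro-expanded to `42 = 2` and fail to compile. */
#ifdef KEY_END
#  undef KEY_END
#endif

enum editor_key {
    KEY_HOME = 1,
    KEY_END  = 2   /* the module's own meaning of KEY_END */
};
```

The workaround costs the module access to perl's `KEY_END` for the rest of the translation unit, which is the sense in which an exposed but unreserved name pollutes the XS namespace.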
Informally, others and I have used a trailing underscore to
indicate a private symbol. There are a few distributions that use some
of these anyway. And there has been pushback when new short symbols
that use this convention have been added.
I would like to get a formal rule about use of this convention. There
are 200+ of these currently. We could reserve any names with trailing
underscores, or if that is too much, any ending in, say, '_pl_' or
'_PL_'.
We have 3000+ undocumented macro names that don't end in underscores and
which are currently visible to XS code. This number includes the
KEY_foo ones, but not the ones in config.h.
To deal with namespace pollution, we have had the -DNO_SHORT_NAMES
Configure option for use just with embedded perls. This hasn't worked
at least since we added inline functions, and it has only ever applied to
functions. I have a WIP to get this to work again, and to extend it to
work with documented macros. It just occurred to me how to make this be
customizable, so that downstream someone could add a list of symbols
that should only exist as 'Perl_foo', and then recompile, leaving short
names for everything not in the list.
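The customizable variant could look roughly like the following C sketch. This is a guess at the shape of the idea, not the WIP itself: `Perl_frobnicate`, `frobnicate`, and the `PERL_LONG_NAME_ONLY_` macro prefix are all hypothetical, and the interpreter-context argument that real short-name macros pass is omitted.

```c
/* Long name: always exists. Hypothetical function, not a real perl API. */
static int Perl_frobnicate(int x)
{
    return x + 1;
}

/* Hypothetical downstream setting: "frobnicate" is on the list of symbols
 * that should exist only under their Perl_ name. */
#define PERL_LONG_NAME_ONLY_frobnicate

/* What a generated header might do: define the short name only for
 * symbols that are not on the downstream list. */
#ifndef PERL_LONG_NAME_ONLY_frobnicate
#  define frobnicate(x) Perl_frobnicate(x)
#endif

/* With the short name suppressed, embedding code can use it for itself. */
static int frobnicate(int x)
{
    return x * 100;
}
```

Everything not on the list would keep its short name, so only the symbols a downstream user actually collides with need to be spelled `Perl_foo`.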
|
I can't see us restricting new name use to
Do you expect new SV/CV/HV etc flags to use a
Similarly for flags like AMGf_*, OP code macros.
I don't think we can reserve names with trailing underscores; those are in use in too many other code bases.
You know I think the perl codebase is badly polluting, especially at the macro level, but I don't think it's fixable unless we decide on some sort of limited API like the Python one (even that uses some non-Py-prefix names). I think in the general case it's just too big a change in practice.
|
It is not my intent to restrict new names. The purpose is to say that we consider all names that match this pattern, as finally determined, to be fair game for us to use without any consideration of their effect on anyone's namespace. I think we should give consideration to that effect with other names, but the decision is likely to be to go ahead and use any reasonable ones.
It might be that we consider anything beginning with [ACGHS][Vv] to be fair game that needs no consideration either, along with related setters and getters. It's overly clumsy and hard to read to have names that have required prefixes.
But we are now in a position where a Configure call could specify not to use names x,y,z but instead we generate Perl_x, Perl_y, and Perl_z for just those. And this could be expanded to a per-module basis, so the module says: I use x,y,z for my purposes, don't define them for perl's. This latter would require a much bigger embed.h, but it is automatically generated.
I'm liking the idea of reserving symbols that end in
|
I don't think I at all understand the impact of this change. I haven't really looked at the detail of |
|
regen/embed.pl looks at the source and creates various files from it, including embed.h and proto.h. It is run via
But there is an exception list of macros that are to be externally visible even if there is nothing to indicate that they should be. That list was initialized to everything that is currently so visible. That means that this change has null effect. This list is the giant one @leonerd mentioned.
The goal here is to impose future discipline on us. Newly created macros will have to have an explicit visibility specification in order to be seen by the outside world. That is easily accomplished by documenting the macro, which is something that should have been done all along, but there were no immediate negative consequences of not doing so.
Except for the deficiencies in my code that generates the giant list, none of those thousands of macros on it are formally documented. That means they are effectively namespace pollutants. They are symbols that the XS code is stuck with having, and there was no notice given of their existence. That is only a real problem if there are collisions with names the XS code is somewhat likely to use. We should rename any such or remove them from external visibility.
And that is where the pattern comes in. I believe we should have a statement in some pod to the effect that we reserve for perl's use any symbol that matches this pattern. C, C++, and POSIX all have such statements. But they in turn say they won't create symbols that don't match it. We can't do that. So what we can say is that if your code has symbols that match the pattern, we won't change to accommodate you. We'll consider requests to change new symbols we have created that don't match the pattern but clash with yours.
So adding the pattern has no effect on its own, but is a basis for changes to documentation.
Now that the recent changes to embed.pl are in core, it would not be hard for us to allow a module to have an import list of core symbols, or an import-all-but list. The macros excluded by such lists would be accessible only via long names.