From 0c2be0f1bc184965b6ede84a694ae4b0a607f5f6 Mon Sep 17 00:00:00 2001
From: "David A. Wheeler"
Date: Wed, 7 May 2025 18:05:31 -0400
Subject: [PATCH 1/2] Add more discussion with a Python 2->3 example

Signed-off-by: David A. Wheeler
---
 docs/Simplifying-Software-Component-Updates.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/Simplifying-Software-Component-Updates.md b/docs/Simplifying-Software-Component-Updates.md
index d61d0347..61fbe359 100644
--- a/docs/Simplifying-Software-Component-Updates.md
+++ b/docs/Simplifying-Software-Component-Updates.md
@@ -16,6 +16,8 @@ Historically fewer software components were reused, and there were fewer layers
 
 Backward-incompatible changes are also increasingly problematic because most software components are used **indirectly**. When there are many layers of dependencies, it takes time for each layer’s updates to trickle up, introducing a sort of “speed of light” rate limit for updating software. Any delay in updating any intermediate layer impedes updates of all transitive users. For example, [[Wetter2021](https://security.googleblog.com/2021/12/understanding-impact-of-apache-log4j.html)] found in response to the Log4Shell vulnerability that “most artifacts that depend on log4j do so indirectly. The deeper the vulnerability is in a dependency chain, the more steps are required for it to be fixed. […] For greater than 80% of the packages, the vulnerability is more than one level deep, with a majority affected five levels down (and some as many as nine levels down).”
 
+Backward-incompatible changes are even harder to deal with today because of the larger scale of software today. Custom software is often larger, and often depends on many other components. If a new interface must eventually be used, it may be possible to slowly change over time different files and components through a series of releases, though it can be costly and time-consuming. Demanding that "everything change at once" is far more difficult. For example, Python 3.0 was released on 2008-12-03. This was a backwards-incompatible release; transitioning from Python2 to Python3 required all code and libraries to simultaneously change. This transition was notoriously difficult and slow, with [even its creator finding it difficult in his organization](https://www.youtube.com/watch?v=Oiw23yfqQy8&t=1278s). [Python2 support was sunset on 2020-01-1](https://www.python.org/doc/sunset-python-2), yet the [Python Developers Survey 2022](https://lp.jetbrains.com/python-developers-survey-2022/#PythonVersions) found that 7% of Python users overall still used the older Python2, with notable uses in data analysis (29%), web development (19%), and DevOps (23%) — 14 years after Python3's release. As of 2025-05-07, [19.1% of websites using Python use Python2](https://w3techs.com/technologies/details/pl-python). Large scale and many components may simultaneous changes of everything in a program far more difficult.
+
 Developers have created mechanisms to deal with backward incompatibility, but these often create larger problems later. A developer may clone some code; in cloning, code is copied into the project. Unfortunately, these copies may include vulnerabilities, and since their origin is no longer automatically tracked, those vulnerabilities are hidden by the development process and are no longer automatically updated. An alternative is shading, a “variant of cloning where entire packages are cloned and renamed.” This may be done at build time (aka “b-shading”) and some ecosystems have tools specifically to support b-shading (e.g., the Maven shade plugin). While b-shading solves an immediate problem and is trackable by tools, the approach also introduces longer-term risks as it tends to endlessly defer necessary updates. Other kinds of shading are used as well [[Dietrich2023](https://arxiv.org/abs/2306.05534)]. In all cases, alternatives create risks when compared to simply updating a given component to its current version.
 
 In extreme cases, such as the Log4Shell vulnerability, specialized programs were created to directly hotpatch programs to perform updates [[Nalley2021](https://aws.amazon.com/blogs/opensource/hotpatch-for-apache-log4j/)]. This extreme approach is **not** reasonable to apply in “normal” circumstances and risks causing many additional problems.

From d1055ffdb41e943152c7d7a0d1e604187c1a3d9f Mon Sep 17 00:00:00 2001
From: "David A. Wheeler"
Date: Thu, 15 May 2025 11:36:56 -0400
Subject: [PATCH 2/2] Tweak Python text

Signed-off-by: David A. Wheeler
---
 docs/Simplifying-Software-Component-Updates.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/Simplifying-Software-Component-Updates.md b/docs/Simplifying-Software-Component-Updates.md
index 61fbe359..3e636de9 100644
--- a/docs/Simplifying-Software-Component-Updates.md
+++ b/docs/Simplifying-Software-Component-Updates.md
@@ -16,7 +16,7 @@ Historically fewer software components were reused, and there were fewer layers
 
 Backward-incompatible changes are also increasingly problematic because most software components are used **indirectly**. When there are many layers of dependencies, it takes time for each layer’s updates to trickle up, introducing a sort of “speed of light” rate limit for updating software. Any delay in updating any intermediate layer impedes updates of all transitive users. For example, [[Wetter2021](https://security.googleblog.com/2021/12/understanding-impact-of-apache-log4j.html)] found in response to the Log4Shell vulnerability that “most artifacts that depend on log4j do so indirectly. The deeper the vulnerability is in a dependency chain, the more steps are required for it to be fixed. […] For greater than 80% of the packages, the vulnerability is more than one level deep, with a majority affected five levels down (and some as many as nine levels down).”
 
-Backward-incompatible changes are even harder to deal with today because of the larger scale of software today. Custom software is often larger, and often depends on many other components. If a new interface must eventually be used, it may be possible to slowly change over time different files and components through a series of releases, though it can be costly and time-consuming. Demanding that "everything change at once" is far more difficult. For example, Python 3.0 was released on 2008-12-03. This was a backwards-incompatible release; transitioning from Python2 to Python3 required all code and libraries to simultaneously change. This transition was notoriously difficult and slow, with [even its creator finding it difficult in his organization](https://www.youtube.com/watch?v=Oiw23yfqQy8&t=1278s). [Python2 support was sunset on 2020-01-1](https://www.python.org/doc/sunset-python-2), yet the [Python Developers Survey 2022](https://lp.jetbrains.com/python-developers-survey-2022/#PythonVersions) found that 7% of Python users overall still used the older Python2, with notable uses in data analysis (29%), web development (19%), and DevOps (23%) — 14 years after Python3's release. As of 2025-05-07, [19.1% of websites using Python use Python2](https://w3techs.com/technologies/details/pl-python). Large scale and many components may simultaneous changes of everything in a program far more difficult.
+Backward-incompatible changes are even harder to deal with today because of the larger scale of software today. Custom software is often larger, and often depends on many other components. If a new interface must eventually be used, it may be possible to slowly change over time different files and components through a series of releases, though it can be costly and time-consuming. Demanding that "everything change at once" is far more difficult. For example, Python 3.0 was released on 2008-12-03. This was a backwards-incompatible release; transitioning from Python2 to Python3 required all code and libraries to simultaneously change. This transition was notoriously difficult and slow, with [even its creator finding it difficult in his organization](https://www.youtube.com/watch?v=Oiw23yfqQy8&t=1278s). [Python2 support was sunset on 2020-01-01](https://www.python.org/doc/sunset-python-2), yet the [Python Developers Survey 2022](https://lp.jetbrains.com/python-developers-survey-2022/#PythonVersions) found that 14 years after the Python3 release, 7% of Python users overall still used the older Python2, with notable uses in data analysis (29%), web development (19%), and DevOps (23%). As of 2025-05-07, 17 years later, [19.1% of websites using Python use Python2](https://w3techs.com/technologies/details/pl-python). Backward-incompatible changes are difficult to manage.
 
 Developers have created mechanisms to deal with backward incompatibility, but these often create larger problems later. A developer may clone some code; in cloning, code is copied into the project. Unfortunately, these copies may include vulnerabilities, and since their origin is no longer automatically tracked, those vulnerabilities are hidden by the development process and are no longer automatically updated. An alternative is shading, a “variant of cloning where entire packages are cloned and renamed.” This may be done at build time (aka “b-shading”) and some ecosystems have tools specifically to support b-shading (e.g., the Maven shade plugin). While b-shading solves an immediate problem and is trackable by tools, the approach also introduces longer-term risks as it tends to endlessly defer necessary updates. Other kinds of shading are used as well [[Dietrich2023](https://arxiv.org/abs/2306.05534)]. In all cases, alternatives create risks when compared to simply updating a given component to its current version.
 