Skip to content

Conversation

MilesCranmer
Copy link
Member

@MilesCranmer MilesCranmer commented Jun 29, 2025

The .size of an array might be getting set too early; before the memory is actually allocated.

@MilesCranmer MilesCranmer changed the title Set array size when safe to do so Set array size only when safe to do so Jun 29, 2025
@LilithHafner
Copy link
Member

@oscardssmith, can you look at this?

@mbauman mbauman added arrays [a, r, r, a, y, s] backport 1.11 Change should be backported to release-1.11 labels Jun 30, 2025
@MilesCranmer
Copy link
Member Author

MilesCranmer commented Jun 30, 2025

Copying my comment from my premature backport:

Ideally these would be atomic so order doesn't matter. But in case of misuse, I think this order is the least bad.

  • If the array is to shrink, then you ought to modify the length at the start.
  • If the array is to grow, then you ought to modify the length at the end.

This prevents one thread from reading .size and then indexing past the end of the array buffer before .ref has been set.

Now, while this lowers the chance of corruption from thread race, you can still totally get corruption. But this version is slightly less bad.

@vtjnash
Copy link
Member

vtjnash commented Jul 1, 2025

That would require a lock, since there are 3 fields to update (data, offset, length), which isn't really feasible. The described order is still at risk from all sorts of hoisting and sinking operations, as well as concurrent growing and shrinking on threads, but still is probably the best possible option (though I'd suggest making sure they are always exactly consecutively length then memory ref, as that is the only order that allows the CPU to do them as a single cache line update most of the time)

@MilesCranmer
Copy link
Member Author

Happy with whatever, just let me know what

they are always exactly consecutively length then memory ref, as that is the only order that allows the CPU to do them as a single cache line update most of the time

Interesting! Why does order matter if they are getting fused into a single update?

@vtjnash
Copy link
Member

vtjnash commented Jul 1, 2025

There is a write barrier after the store of memref

@MilesCranmer
Copy link
Member Author

Cool. How much would it affect real Julia performance? I would have assumed there aren’t the right effects in place for the compiler to do such an optimization – such as there permitting multiple references to the same object

@vtjnash
Copy link
Member

vtjnash commented Jul 1, 2025

The fusing will only happen if you use that order. Otherwise they will be separate operations.

@MilesCranmer
Copy link
Member Author

Sorry, are you saying that this fusing is already happening in Julia? Or that you’d like me to modify the PR so that this does happen?

If the latter, how much % performance improvement would this bring?

@vtjnash
Copy link
Member

vtjnash commented Jul 1, 2025

You'll need to modify the PR to make sure the setfield of length only occurs either immediately before another setfield, or after any other logic if there is no other setfield

@MilesCranmer
Copy link
Member Author

Current master branch doesn’t have this property; there is logic sitting in between each setfield. Maybe the ordering was profiled and then determined not to actually matter for performance?

@KristofferC KristofferC mentioned this pull request Aug 19, 2025
65 tasks
DilumAluthge added a commit that referenced this pull request Sep 5, 2025
Backported PRs:
- [x] #54840 <!-- Add boundscheck in speccache_eq to avoid OOB access
due to data race -->
- [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package
code (was: "using considered harmful") -->
- [x] #58127 <!-- [DOC] Update installation docs: /downloads/ =>
/install/ -->
- [x] #58202 <!-- [release-1.11] malloc: use jl_get_current_task to fix
null check -->
- [x] #58584 <!-- Make `Ptr` values static-show w/ type-information -->
- [x] #58637 <!-- Make late gc lower handle insertelement of alloca use.
-->
- [x] #58837 <!-- fix null comparisons for non-standard address spaces
-->
- [x] #57826 <!-- Add a `similar` method for `Type{<:CodeUnits}` -->
- [x] #58293 <!-- fix trailing indices stackoverflow in reinterpreted
array -->
- [x] #58887 <!-- Pkg: Allow configuring can_fancyprint(io::IO) using
IOContext -->
- [x] #58937 <!-- Fix nthreadpools size in JLOptions -->
- [x] #58978 <!-- Fix precompilepkgs warn loaded setting -->
- [x] #58998 <!-- Bugfix: Use Base.aligned_sizeof instead of sizeof in
Mmap.mmap -->
- [x] #59120 <!-- Fix memory order typo in "src/julia_atomics.h" -->
- [x] #59170 <!-- Clarify and enhance confusing precompile test -->

Need manual backport:
- [ ] #56329 <!-- loading: clean up more concurrency issues -->
- [ ] #56956 <!-- Add "mea culpa" to foreign module assignment error.
-->
- [ ] #57035 <!-- linux: workaround to avoid deadlock inside
dl_iterate_phdr in glibc -->
- [ ] #57089 <!-- Block thread from receiving profile signal with
stackwalk lock -->
- [ ] #57249 <!-- restore non-freebsd-unix fix for profiling -->
- [ ] #58011 <!-- Remove try-finally scope from `@time_imports`
`@trace_compile` `@trace_dispatch` -->
- [ ] #58062 <!-- remove unnecessary edge from `exp_impl` to `pow` -->
- [ ] #58157 <!-- add showing a string to REPL precompile workload -->
- [ ] #58209 <!-- Specialize `one` for the `SizedArray` test helper -->
- [ ] #58108 <!-- Base.get_extension & Dates.format made public -->
- [ ] #58356 <!-- codegen: remove readonly from abstract type calling
convention -->
- [ ] #58415 <!-- [REPL] more reliable extension loading -->
- [ ] #58510 <!-- Don't filter `Core` methods from newly-inferred list
-->
- [ ] #58110 <!-- relax dispatch for the `IteratorSize` method for
`Generator` -->
- [ ] #58965 <!-- Fix `hygienic-scope`s in inner macro expansions -->
- [ ] #58971 <!-- Fix alignment of failed precompile jobs on CI -->
- [ ] #59066 <!-- build: Also pass -fno-strict-aliasing for C++ -->

Contains multiple commits, manual intervention needed:
- [ ] #55877 <!-- fix FileWatching designs and add workaround for a stat
bug on Apple -->
- [ ] #56755 <!-- docs: fix scope type of a `struct` to hard -->
- [ ] #57809 <!-- Fix fptrunc Float64 -> Float16 rounding through
Float32 -->
- [ ] #57398 <!-- Make remaining float intrinsics require float
arguments -->
- [ ] #56351 <!-- Fix `--project=@script` when outside script directory
-->
- [ ] #57129 <!-- clarify that time_ns is monotonic -->
- [ ] #58134 <!-- Note annotated string API is experimental in Julia
1.11 in HISTORY.md -->
- [ ] #58401 <!-- check that hashing of types does not foreigncall
(`jl_type_hash` is concrete evaluated) -->
- [ ] #58435 <!-- Fix layout flags for types that have oddly sized
primitive type fields -->
- [ ] #58483 <!-- Fix tbaa usage when storing into heap allocated
immutable structs -->
- [ ] #58512 <!-- Make more types jl_static_show readably -->
- [ ] #58012 <!-- Re-enable tab completion of kwargs for large method
tables -->
- [ ] #58683 <!-- Add 0 predecessor to entry basic block and handle it
in inlining -->
- [ ] #59112 <!-- Add builtin function name to add methods error -->

Non-merged PRs with backport label:
- [ ] #59329 <!-- aotcompile: destroy LLVM context after serializing
combined module -->
- [ ] #58848 <!-- Set array size only when safe to do so -->
- [ ] #58535 <!-- gf.c: include const-return methods in
`--trace-compile` -->
- [ ] #58038 <!-- strings/cstring: `transcode`: prevent Windows sysimage
invalidation -->
- [ ] #57604 <!-- `@nospecialize` for `string_index_err` -->
- [ ] #57366 <!-- Use ptrdiff_t sized offsets for gvars_offsets to allow
large sysimages -->
- [ ] #56890 <!-- Enable getting non-boxed LLVM type from Julia Type -->
- [ ] #56823 <!-- Make version of opaque closure constructor in world
-->
- [ ] #55958 <!-- also redirect JL_STDERR etc. when redirecting to
devnull -->
- [ ] #55956 <!-- Make threadcall gc safe -->
- [ ] #55534 <!-- Set stdlib sources as read-only during installation
-->
- [ ] #55499 <!-- propagate the terminal's `displaysize` to the
`IOContext` used by the REPL -->
- [ ] #55458 <!-- Allow for generically extracting unannotated string
-->
- [ ] #55457 <!-- Make AnnotateChar equality consider annotations -->
- [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #50813 <!-- More doctests for Sockets and capitalization fix -->
- [ ] #50157 <!-- improve docs for `@inbounds` and
`Base.@propagate_inbounds` -->

---------

Co-authored-by: Kiran Pamnany <[email protected]>
Co-authored-by: adienes <[email protected]>
Co-authored-by: Gabriel Baraldi <[email protected]>
Co-authored-by: Keno Fischer <[email protected]>
Co-authored-by: Simeon David Schaub <[email protected]>
Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Fons van der Plas <[email protected]>
Co-authored-by: Ian Butterworth <[email protected]>
Co-authored-by: JonasIsensee <[email protected]>
Co-authored-by: Curtis Vogt <[email protected]>
Co-authored-by: Dilum Aluthge <[email protected]>
Co-authored-by: DilumAluthgeBot <[email protected]>
Co-authored-by: DilumAluthge <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] backport 1.11 Change should be backported to release-1.11
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants