Update .gitattributes for the wrongencoding files #13811
base: master
0b6ccf6 to bc8f923
Mark wrongenc.inc files as binary in .gitattributes to prevent encoding conversion issues on different platforms. These files contain intentional Latin-1 encoded content for testing Sphinx's encoding handling and should remain byte-for-byte identical.

Fixes git errors like:
- "failed to encode from UTF-8 to latin-1"
- "patch does not apply" in pre-commit hooks
- stash/unstash failures with these files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
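For readers unfamiliar with the underlying problem, here is a minimal Python sketch of why Latin-1 files are awkward for any tool that assumes UTF-8. It only illustrates the idea of an encoding round trip; it is not Git's implementation, and the sample content and function names are hypothetical.

```python
# Hypothetical sample standing in for a Latin-1 encoded test file.
LATIN1_SAMPLE = "Grüße, café".encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def round_trips_via_utf8(data: bytes, encoding: str = "latin-1") -> bool:
    """Decode from `encoding`, re-encode as UTF-8, convert back, and compare."""
    as_utf8 = data.decode(encoding).encode("utf-8")
    return as_utf8.decode("utf-8").encode(encoding) == data


if __name__ == "__main__":
    print("valid UTF-8:", is_valid_utf8(LATIN1_SAMPLE))
    print("round-trips via UTF-8:", round_trips_via_utf8(LATIN1_SAMPLE))
```

The point of the sketch is that such a file is still text, but a tool has to either treat it as opaque bytes or be told its encoding and convert it correctly in both directions.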
I don't think this is right; the files are text, not binary content. My attempt here was to properly mark the encoding as I understand that Git's internal index is stored in Unicode, whereas we have a few test files that we intentionally have in different encodings.
Does macOS support the Latin-1 codec in general? I imagine there's no support for Windows-1252? (the default Windows codepage)
A third option would be to remove these files from the repo and have the tests write them to disk each time, which might be somewhat clearer in the intent.
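A rough sketch of what that third option could look like, assuming the test content can be expressed as a Python string. The file content, the helper name `write_latin1_fixture`, and the use of a temporary directory are all hypothetical; the real wrongenc.inc text is not shown in this thread.

```python
import tempfile
from pathlib import Path

# Hypothetical content; stands in for the real test file text.
WRONGENC_TEXT = "This file has umlauts: äöü.\n"


def write_latin1_fixture(directory: Path, name: str = "wrongenc.inc") -> Path:
    """Write the fixture as raw Latin-1 bytes and return its path."""
    path = directory / name
    # write_bytes avoids any platform newline or locale-dependent translation.
    path.write_bytes(WRONGENC_TEXT.encode("latin-1"))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fixture = write_latin1_fixture(Path(tmp))
        print(fixture.read_bytes())
```

Generating the file at test time would keep non-UTF-8 bytes out of the repository entirely, so Git would never need to convert or diff them.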
I am on macOS and would like to help, because I have encountered the issue too. The problem is that I do not know how to reproduce it. I did not encounter it while working on some PRs I merged in the last few days, but I definitely had the issue about five days ago, and it was a bit painful because one had to be careful to not do
Emacs on macOS definitely supports the Latin-1 codec, and probably Windows-1252 too, including its EOL handling. There are some Unicode issues when one copies files to external hard disks, where macOS might automatically change the Unicode normalization form, but this looks like something else.
I thought yes
This seems easiest. Although I also cannot replicate it now :( which is extremely confusing. Maybe this was a temporary bug somewhere else (git, iterm, mac?). So maybe this is just resolved and we can close this, and if it comes up again, hopefully someone will find this and not feel despair.
I am not 100% sure, but it looks as if 5cf62e5 fixed the issue for me once I had updated my locale. However, when I try reverting it, I have so far failed to make Git complain again about those wrongly encoded files.
Confusingly, I only made my fork a few days ago, so I would certainly have had this already when I started seeing the issue.
I got the same error (Win11/WSL2). Once I touched those files they were "stuck" and could not be reverted or discarded anymore.
Purpose
There was an existing .gitattributes for these files, but it wasn't working on my system (M2 Mac). I was unable to restore or stash these files, which made rebasing difficult. This approach fixed the Git issues for me. I was getting errors like:
whenever I tried to restore, stash, or rebase.
Full disclosure: this was a level of Git that was beyond me, so this is an AI-assisted PR. Claude summarizes the downsides as:
Although I was able to open them in neovim without issue.
References
It looks like there was a recent similar change: 5cf62e5