Skip to content

C#: Improve some existing manual models. #19940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jul 17, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion csharp/ql/lib/ext/System.IO.model.yml
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ extensions:
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32,System.Boolean)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32,System.IO.FileOptions)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[])", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[])", "", "Argument[0].Element", "Argument[this]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. We do the opposite in CodeQL for Go - byte arrays are often used synonymously with strings, and we always taint the whole array rather than individual elements, just as we would if it was a string. I guess either way works as long as your consistent, but I seem to recall that for Go it was better to taint the whole array because strings can be considered as byte arrays in some circumstances. Just thought I'd mention it.

Copy link
Contributor

@aschackmull aschackmull Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Java, we would also taint a whole byte[] rather than its members - in fact single byte values are typically considered too small to carry taint and used as type-based sanitizers.

Copy link
Contributor Author

@michaelnebel michaelnebel Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both for looking into this!
So the change is probably a bit more controversial than advertised in slack. 😄
To address the questions/comments: Yes, I think it makes sense to consider both byte[] and char[] as "bulk" like datatypes like it is done for Java. The change in this PR solves the reported issue (as it is now) and doesn't break the pattern that is already being used for C# in terms of modelling - but we should probably change this.
Will it be acceptable to merge the models as they are and then as a follow up try and convert all the models (at least all the manual ones) to check what the consequences will be? This might also tie into model generator as well.

- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[],System.Boolean)", "", "Argument[0].Element", "Argument[this]", "taint", "manual"]
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[],System.Int32,System.Int32)", "", "Argument[0].Element", "Argument[this]", "taint", "manual"]
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[],System.Int32,System.Int32,System.Boolean)", "", "Argument[0].Element", "Argument[this]", "taint", "manual"]
Expand Down
13 changes: 13 additions & 0 deletions csharp/ql/lib/ext/System.Runtime.Serialization.model.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
extensions:
- addsTo:
pack: codeql/csharp-all
extensible: summaryModel
data:
- ["System.Runtime.Serialization", "SerializationInfo", False, "AddValue", "(System.String,System.Object)", "", "Argument[1]", "Argument[this]", "taint", "manual"]
- ["System.Runtime.Serialization", "SerializationInfo", False, "AddValue", "(System.String,System.Object,System.Type)", "", "Argument[1]", "Argument[this]", "taint", "manual"]
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetEnumerator", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"]
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetString", "(System.String)", "", "Argument[this]", "ReturnValue", "taint", "manual"]
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetValue", "(System.String,System.Type)", "", "Argument[this]", "ReturnValue", "taint", "manual"]
# Note that SerializationEntry hasn't been modeled yet, so the model below for get_Current will not in itself provide more flow.
- ["System.Runtime.Serialization", "SerializationInfoEnumerator", False, "get_Current", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns a SerializationEntry struct. I'm not familiar with CodeQL for C#, so I just want to check that it's sufficient to taint that struct, and then accessing the fields will propagate the taint, without requiring any extra models.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we would need extra models for that as well. The model itself doesn't bring any value unless models for SerializationEntry is also provided. However, I decided to include it for consistency (such that all properties of SerializationInfoEnumerator are modelled - or at least those where it makes sense to have a model). But I have stop the modelling somehere 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Perhaps a comment on this line saying that it won't work until models for SerializationEntry are added, would save the next person to run into this some time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- ["System.Runtime.Serialization", "SerializationInfoEnumerator", False, "get_Value", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"]
14 changes: 7 additions & 7 deletions csharp/ql/lib/ext/System.Text.model.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,18 +3,18 @@ extensions:
pack: codeql/csharp-all
extensible: summaryModel
data:
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0].Element", "Argument[2]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For many of the models in this file, I don't understand why the input has .Element but the output doesn't. Please can you explain how this works? (Obviously the old model was just wrong, though.)

Copy link
Contributor Author

@michaelnebel michaelnebel Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an excellent question - I didn't want to investigate that as a part of adding the models - but now I have checked: The cause is that for C#, support for implicit reads at sinks hasn't been implemented. That is, if the elements of a collection are tainted but the sink is a collection type, then taint isn't propagated. Since we have a few byte[] sinks, it appears that the models are made like this.
This should probably be addressed (but not as a part of this PR) - will open an issue for it (I think it is just a matter of making a simple implementation for defaultImplicitTaintRead - but it will need to be tested separately).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like all the skeletons are falling out of the closet 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine to me, as long as there's a follow-up issue and we don't totally forget about it.

Copy link
Contributor Author

@michaelnebel michaelnebel Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened issues for both ((1) Implicit reads at sinks and (2) Convert byte[] and char[] models)).

- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.ReadOnlySpan<System.Char>,System.Span<System.Byte>)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0].Element", "Argument[3]", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.ReadOnlySpan<System.Char>,System.Span<System.Byte>)", "", "Argument[0].Element", "Argument[1]", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.String)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", False, "GetBytes", "(System.String,System.Int32,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte*,System.Int32,System.Char*,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0]", "Argument[3]", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte*,System.Int32,System.Char*,System.Int32)", "", "Argument[0].Element", "Argument[2]", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32)", "", "Argument[0].Element", "Argument[3]", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetChars", "(System.ReadOnlySpan<System.Byte>,System.Span<System.Char>)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", False, "GetString", "(System.Byte*,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Text", "Encoding", True, "GetString", "(System.Byte[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
Expand Down
19 changes: 19 additions & 0 deletions csharp/ql/lib/ext/System.Xml.model.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,25 @@ extensions:
extensible: summaryModel
data:
- ["System.Xml", "XmlAttributeCollection", False, "CopyTo", "(System.Xml.XmlAttribute[],System.Int32)", "", "Argument[this].Element", "Argument[0].Element", "value", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[0].Element", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[3]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[0].Element", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[3]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[5]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[3]", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[5]", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Int32,System.Int32,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[0].Element", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.Byte[],System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[0].Element", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[0]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[1]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[0]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[1]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession)", "", "Argument[3]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[1]", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.IXmlDictionary,System.Xml.XmlDictionaryReaderQuotas,System.Xml.XmlBinaryReaderSession,System.Xml.OnXmlDictionaryReaderClose)", "", "Argument[3]", "ReturnValue", "taint", "manual"]
- ["System.Xml", "XmlDictionaryReader", False, "CreateBinaryReader", "(System.IO.Stream,System.Xml.XmlDictionaryReaderQuotas)", "", "Argument[0]", "ReturnValue", "taint", "df-manual"]
- ["System.Xml", "XmlDocument", False, "Load", "(System.IO.Stream)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.Xml", "XmlDocument", False, "Load", "(System.IO.TextReader)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
- ["System.Xml", "XmlDocument", False, "Load", "(System.String)", "", "Argument[0]", "Argument[this]", "taint", "manual"]
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* Explicitly added summary models for all overloads of `System.Xml.XmlDictionaryReader.CreateBinaryReader`. Added models for some of the methods and properties in `System.Runtime.Serialization.SerializationInfo` and `System.Runtime.Serialization.SerializationInfoEnumerator`. Updated models for `System.Text.Encoding.GetBytes`, `System.Text.Encoding.GetChars` and the constructor for `System.IO.MemoryStream`. This generally improves the library modelling and thus reduces the number of false negatives.
Loading