-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C#: Improve some existing manual models. #19940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
4036140
13b40bb
8ee16f6
3ae69d5
064c4fc
5c05ff8
95763dd
eba901f
8f8b042
70bf61d
e9fdca7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/csharp-all | ||
extensible: summaryModel | ||
data: | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "AddValue", "(System.String,System.Object)", "", "Argument[1]", "Argument[this]", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "AddValue", "(System.String,System.Object,System.Type)", "", "Argument[1]", "Argument[this]", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetEnumerator", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetString", "(System.String)", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetValue", "(System.String,System.Type)", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
# Note that SerializationEntry hasn't been modeled yet, so the model below for get_Current will not in itself provide more flow. | ||
- ["System.Runtime.Serialization", "SerializationInfoEnumerator", False, "get_Current", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This returns a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, we would need extra models for that as well. The model itself doesn't bring any value unless models for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That makes sense. Perhaps a comment on this line saying that it won't work until models for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
- ["System.Runtime.Serialization", "SerializationInfoEnumerator", False, "get_Value", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,18 +3,18 @@ extensions: | |
pack: codeql/csharp-all | ||
extensible: summaryModel | ||
data: | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0].Element", "Argument[2]", "taint", "manual"] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For many of the models in this file, I don't understand why the input has There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That is an excellent question - I didn't want to investigate that as a part of adding the models - but now I have checked: The cause is that for C#, support for implicit reads at sinks hasn't been implemented. That is, if the elements of a collection are tainted but the sink is a collection type, then taint isn't propagated. Since we have a few There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks like all the skeletons are falling out of the closet 😄 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That seems fine to me, as long as there's a follow-up issue and we don't totally forget about it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have opened issues for both ((1) Implicit reads at sinks and (2) Convert |
||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.ReadOnlySpan<System.Char>,System.Span<System.Byte>)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char[],System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0].Element", "Argument[3]", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.ReadOnlySpan<System.Char>,System.Span<System.Byte>)", "", "Argument[0].Element", "Argument[1]", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.String)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", False, "GetBytes", "(System.String,System.Int32,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte*,System.Int32,System.Char*,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetBytes", "(System.String,System.Int32,System.Int32,System.Byte[],System.Int32)", "", "Argument[0]", "Argument[3]", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte*,System.Int32,System.Char*,System.Int32)", "", "Argument[0].Element", "Argument[2]", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.Byte[],System.Int32,System.Int32,System.Char[],System.Int32)", "", "Argument[0].Element", "Argument[3]", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetChars", "(System.ReadOnlySpan<System.Byte>,System.Span<System.Char>)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", False, "GetString", "(System.Byte*,System.Int32)", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
- ["System.Text", "Encoding", True, "GetString", "(System.Byte[])", "", "Argument[0].Element", "ReturnValue", "taint", "manual"] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
--- | ||
category: minorAnalysis | ||
--- | ||
* Explicitly added summary models for all overloads of `System.Xml.XmlDictionaryReader.CreateBinaryReader`. Added models for some of the methods and properties in `System.Runtime.Serialization.SerializationInfo` and `System.Runtime.Serialization.SerializationInfoEnumerator`. Updated models for `System.Text.Encoding.GetBytes`, `System.Text.Encoding.GetChars` and the constructor for `System.IO.MemoryStream`. This generally improves the library modelling and thus reduces the number of false negatives. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. We do the opposite in CodeQL for Go - byte arrays are often used synonymously with strings, and we always taint the whole array rather than individual elements, just as we would if it was a string. I guess either way works as long as your consistent, but I seem to recall that for Go it was better to taint the whole array because strings can be considered as byte arrays in some circumstances. Just thought I'd mention it.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Java, we would also taint a whole
byte[]
rather than its members - in fact singlebyte
values are typically considered too small to carry taint and used as type-based sanitizers.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both for looking into this!
So the change is probably a bit more controversial than advertised in slack. 😄
To address the questions/comments: Yes, I think it makes sense to consider both
byte[]
andchar[]
as "bulk" like datatypes like it is done for Java. The change in this PR solves the reported issue (as it is now) and doesn't break the pattern that is already being used for C# in terms of modelling - but we should probably change this.Will it be acceptable to merge the models as they are and then as a follow up try and convert all the models (at least all the manual ones) to check what the consequences will be? This might also tie into model generator as well.