Use file content heuristics to decide file reader. #1962
…sed on the magic number.
…ics detection method.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## dev #1962 +/- ##
==========================================
- Coverage 83.51% 83.48% -0.03%
==========================================
Files 310 310
Lines 54884 54985 +101
Branches 12220 12244 +24
==========================================
+ Hits 45834 45905 +71
- Misses 7786 7859 +73
+ Partials 1264 1221 -43
enum class CaptureFileFormat
{
    Unknown,
    Pcap,
    PcapNG,
    Snoop,
};
ditto: this can be an enum inside of CaptureFileFormatDetector
I suppose. It has internal linkage so it doesn't really matter.
But then we would end up with really long case names: CaptureFileFormatDetector::FileFormat::Pcap?
I guess it's fine? It's all internal anyway...
Tests/Pcap++Test/Tests/FileTests.cpp
Outdated
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());
// ------- IFileReaderDevice::createReader() Factory
// TODO: Move to a separate unit test.
We should add the following to get more coverage:
- Open a snoop file
- Open a file that is not any of the options
- Open pcap files with different magic numbers
- Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions
3d713ab adds the following tests:
- Pcap, PcapNG, Zst file with correct content + extension
- Pcap, PcapNG file with correct content + wrong extension
- Bogus content file with correct extension (pcap, pcapng, zst)
- Bogus content file with wrong extension (txt)
Haven't found a snoop file to add. Do we have any?
> Open pcap files with different magic numbers
Do you mean Pcap content that has just its magic number changed? Because IMO it is reasonable to consider that an invalid format and fail as with regular bogus data.
> Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions
Pending on #1962 (comment).
Move it out if it needs to be reused somewhere.
Libpcap has supported reading this format since 0.9.1. The heuristic detection will identify such a magic number as pcap and leave the final support decision to the pcap backend infrastructure.
@Dimi1010 some CI tests fail...
@seladb I think I found the issue. According to ChatGPT the Winpcap's … The NPcap SDK and drivers provide their own wpcap.dll, which does support nanosecond precision.
The tests we currently have on nanosecond precision pcap files ("nanosecs.pcap") are flawed. We don't use a static base to compare to. We write a packet with the …
Due to the above, the current nanosecond test in the master branch passes with Winpcap, even though it doesn't actually write / read a nanosecond precision pcap. The new tests use a static copy of …
Edit:
The PR adds heuristics based on the file content, which are more robust than deciding based on the file extension.
The new decision model scans the start of the file for its magic number signature. It then compares it to the signatures of supported file types [1] and constructs a reader instance based on the result.
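The decision model described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `detectFormat` is hypothetical, the enum mirrors the `CaptureFileFormat` enum quoted in the review, and the signatures are the publicly documented ones (pcap microsecond and nanosecond magic numbers in both byte orders, the pcapng Section Header Block type, and the 8-byte snoop identifier). The signatures the PR actually checks may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirrors the enum discussed in the review; everything else here is hypothetical.
enum class CaptureFileFormat { Unknown, Pcap, PcapNG, Snoop };

// Classify a capture file from its leading bytes (hypothetical helper).
inline CaptureFileFormat detectFormat(const std::uint8_t* data, std::size_t size)
{
    // Snoop files start with the 8-byte ASCII identifier "snoop\0\0\0".
    static const std::uint8_t snoopMagic[8] = { 0x73, 0x6E, 0x6F, 0x6F, 0x70, 0x00, 0x00, 0x00 };
    if (size >= 8 && std::memcmp(data, snoopMagic, 8) == 0)
        return CaptureFileFormat::Snoop;

    if (size < 4)
        return CaptureFileFormat::Unknown;

    // Assemble the first four bytes in file order (big-endian) so the
    // comparison does not depend on the byte order of the host.
    const std::uint32_t magic = (static_cast<std::uint32_t>(data[0]) << 24) |
                                (static_cast<std::uint32_t>(data[1]) << 16) |
                                (static_cast<std::uint32_t>(data[2]) << 8) |
                                static_cast<std::uint32_t>(data[3]);

    switch (magic)
    {
    case 0x0A0D0D0A:  // pcapng Section Header Block type (same in both byte orders)
        return CaptureFileFormat::PcapNG;
    case 0xA1B2C3D4:  // pcap, microsecond timestamps, written big-endian
    case 0xD4C3B2A1:  // pcap, microsecond timestamps, written little-endian
    case 0xA1B23C4D:  // pcap, nanosecond timestamps, written big-endian
    case 0x4D3CB2A1:  // pcap, nanosecond timestamps, written little-endian
        return CaptureFileFormat::Pcap;
    default:
        return CaptureFileFormat::Unknown;
    }
}
```

A caller would read the first few bytes of the file into a buffer and pass them in. Anything that matches no known signature is reported as `Unknown`, regardless of the file extension.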
A new function createReader has been added due to changes in the public API of the factory:
- … nullptr is returned (previously returned PcapFileReaderDevice).
- … std::runtime_error is thrown (previously returned PcapFileReaderDevice).
- … std::unique_ptr<IFileReaderDevice> instead of IFileReaderDevice*.
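To illustrate the ownership change, here is a minimal, self-contained sketch of a factory that returns `std::unique_ptr` rather than a raw pointer. All names in it (`IReader`, `PcapReader`, `PcapNgReader`, `makeReader`, `FileFormat`) are hypothetical stand-ins, not PcapPlusPlus types. The text above does not preserve which condition yields `nullptr` and which throws, so the sketch simply assumes that an unrecognized format yields `nullptr`.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the library's reader hierarchy.
enum class FileFormat { Unknown, Pcap, PcapNG };

struct IReader
{
    virtual ~IReader() = default;
    virtual std::string name() const = 0;
};

struct PcapReader : IReader
{
    std::string name() const override { return "pcap"; }
};

struct PcapNgReader : IReader
{
    std::string name() const override { return "pcapng"; }
};

// The caller owns the result; no manual delete is needed.
// Assumption: an unrecognized format yields nullptr instead of a default reader.
inline std::unique_ptr<IReader> makeReader(FileFormat format)
{
    switch (format)
    {
    case FileFormat::Pcap:
        return std::unique_ptr<IReader>(new PcapReader());
    case FileFormat::PcapNG:
        return std::unique_ptr<IReader>(new PcapNgReader());
    default:
        return nullptr;
    }
}
```

With this shape, a caller checks the returned pointer before use, and the reader is destroyed automatically when the `unique_ptr` goes out of scope.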