Dimi1010
Collaborator

The PR adds heuristics based on the file content, which are more robust than deciding based on the file extension.

The new decision model scans the start of the file for its magic number signature. It then compares it to the signatures of supported file types [1] and constructs a reader instance based on the result.
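For illustration, magic-number detection of this kind can be sketched as below. This is a minimal sketch, not the code from the PR: `SketchFormat` and `detectFormatSketch` are hypothetical names, and the sketch assumes the caller has already read the first bytes of the file into a buffer. The magic values themselves are the published ones for pcap (microsecond and nanosecond variants, in both byte orders), pcapng (Section Header Block type) and snoop.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical enum for this sketch; it mirrors, but is not, the PR's type.
enum class SketchFormat { Unknown, Pcap, PcapNG, Snoop };

// Classify a buffer holding the first bytes of a capture file.
inline SketchFormat detectFormatSketch(const std::uint8_t* data, std::size_t size)
{
    // Snoop files begin with the 8-byte pattern "snoop\0\0\0".
    static const std::uint8_t snoopMagic[8] = { 0x73, 0x6E, 0x6F, 0x6F, 0x70, 0x00, 0x00, 0x00 };
    if (size >= 8 && std::memcmp(data, snoopMagic, 8) == 0)
        return SketchFormat::Snoop;

    if (size < 4)
        return SketchFormat::Unknown;

    // Interpret the first four bytes as a big-endian integer so that the
    // comparison does not depend on the host byte order.
    const std::uint32_t magic = (static_cast<std::uint32_t>(data[0]) << 24) |
                                (static_cast<std::uint32_t>(data[1]) << 16) |
                                (static_cast<std::uint32_t>(data[2]) << 8) |
                                static_cast<std::uint32_t>(data[3]);

    switch (magic)
    {
    case 0xA1B2C3D4:  // pcap, microsecond timestamps, big-endian file
    case 0xD4C3B2A1:  // pcap, microsecond timestamps, little-endian file
    case 0xA1B23C4D:  // pcap, nanosecond timestamps, big-endian file
    case 0x4D3CB2A1:  // pcap, nanosecond timestamps, little-endian file
        return SketchFormat::Pcap;
    case 0x0A0D0D0A:  // pcapng Section Header Block type (same in both byte orders)
        return SketchFormat::PcapNG;
    default:
        return SketchFormat::Unknown;
    }
}
```

A factory built on top of such a function would map each enumerator to a concrete reader class and treat `Unknown` as "no reader available".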

A new function createReader has been added due to changes in the public API of the factory.

  • If no reader is available to read the file: nullptr is returned. (previously returned PcapFileReaderDevice)
  • If the file could not be opened or does not exist: std::runtime_error is thrown. (previously returned PcapFileReaderDevice)
  • The new function returns std::unique_ptr<IFileReaderDevice> instead of IFileReaderDevice*.
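The contract in the list above can be sketched with stand-in types. Everything named `...Sketch` below is hypothetical and only mimics the described behaviour (throw when the file cannot be opened, return nullptr when no reader matches, otherwise return an owning pointer); the real factory works on the library's own reader classes and recognises more formats than this sketch does.

```cpp
#include <fstream>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the library's reader interface and pcap reader.
struct ReaderSketch
{
    virtual ~ReaderSketch() = default;
    virtual const char* formatName() const = 0;
};

struct PcapReaderSketch : ReaderSketch
{
    const char* formatName() const override { return "pcap"; }
};

// Mimics the described contract:
//  - throws std::runtime_error if the file cannot be opened,
//  - returns nullptr if the content matches no known format,
//  - otherwise returns an owning pointer to a reader.
inline std::unique_ptr<ReaderSketch> createReaderSketch(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        throw std::runtime_error("Could not open file: " + path);

    unsigned char magic[4] = {};
    file.read(reinterpret_cast<char*>(magic), sizeof(magic));

    // For brevity this sketch recognises only little-endian, microsecond pcap.
    const bool isPcap = file.gcount() == 4 && magic[0] == 0xD4 && magic[1] == 0xC3 &&
                        magic[2] == 0xB2 && magic[3] == 0xA1;
    if (isPcap)
        return std::unique_ptr<ReaderSketch>(new PcapReaderSketch());

    return nullptr;
}
```

Callers therefore need two checks: a try/catch for the open failure, and a null check before using the returned reader.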


codecov bot commented Sep 12, 2025

Codecov Report

❌ Patch coverage is 77.68595% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.48%. Comparing base (098dd4b) to head (46418ec).

Files with missing lines Patch % Lines
Pcap++/src/PcapFileDevice.cpp 68.96% 23 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1962      +/-   ##
==========================================
- Coverage   83.51%   83.48%   -0.03%     
==========================================
  Files         310      310              
  Lines       54884    54985     +101     
  Branches    12220    12244      +24     
==========================================
+ Hits        45834    45905      +71     
- Misses       7786     7859      +73     
+ Partials     1264     1221      -43     
Flag Coverage Δ
alpine320 75.93% <64.47%> (-0.03%) ⬇️
fedora42 76.08% <62.82%> (-0.04%) ⬇️
macos-13 81.62% <72.72%> (-0.03%) ⬇️
macos-14 81.62% <72.72%> (-0.03%) ⬇️
macos-15 81.62% <72.22%> (-0.03%) ⬇️
mingw32 70.23% <46.42%> (-0.08%) ⬇️
mingw64 70.22% <46.42%> (+0.03%) ⬆️
npcap ?
rhel94 75.78% <64.47%> (-0.03%) ⬇️
ubuntu2004 60.30% <67.34%> (+0.05%) ⬆️
ubuntu2004-zstd 60.40% <67.34%> (+0.05%) ⬆️
ubuntu2204 75.72% <64.47%> (-0.05%) ⬇️
ubuntu2204-icpx 60.68% <36.98%> (-0.11%) ⬇️
ubuntu2404 75.97% <64.47%> (+<0.01%) ⬆️
ubuntu2404-arm64 75.58% <64.47%> (-0.05%) ⬇️
unittest 83.48% <77.68%> (-0.03%) ⬇️
windows-2022 85.40% <56.33%> (+0.04%) ⬆️
windows-2025 85.43% <56.33%> (+0.02%) ⬆️
winpcap 85.43% <56.33%> (-0.17%) ⬇️
xdp 53.45% <0.00%> (-0.12%) ⬇️

@Dimi1010 Dimi1010 added the API deprecation Pull requests that deprecate parts of the public interface. label Sep 12, 2025
@Dimi1010 Dimi1010 marked this pull request as ready for review September 12, 2025 11:36
@Dimi1010 Dimi1010 requested a review from seladb as a code owner September 12, 2025 11:36
Comment on lines +90 to +96
enum class CaptureFileFormat
{
    Unknown,
    Pcap,
    PcapNG,
    Snoop,
};
Owner

ditto: this can be an enum inside of CaptureFileFormatDetector

Collaborator Author

@Dimi1010 Dimi1010 Sep 15, 2025

I suppose. It has internal linkage so it doesn't really matter.

But then we would end up with really long case names: CaptureFileFormatDetector::FileFormat::Pcap?

Owner

I guess it's fine? It's all internal anyway...
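As a side note on the naming concern in this thread: if the enum is nested, a local alias can keep the case labels short. The sketch below uses hypothetical names (`FormatDetectorSketch`, `describeSketch`) and is only meant to show the alias technique, not the PR's code.

```cpp
#include <string>

// Hypothetical detector class with the enum nested inside it.
class FormatDetectorSketch
{
public:
    enum class FileFormat { Unknown, Pcap, PcapNG, Snoop };
};

// A file-local alias avoids repeating the enclosing class name in every case label.
using FileFormat = FormatDetectorSketch::FileFormat;

inline std::string describeSketch(FileFormat format)
{
    switch (format)
    {
    case FileFormat::Pcap:
        return "pcap";
    case FileFormat::PcapNG:
        return "pcapng";
    case FileFormat::Snoop:
        return "snoop";
    case FileFormat::Unknown:
        break;
    }
    return "unknown";
}
```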

PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());
// ------- IFileReaderDevice::createReader() Factory
// TODO: Move to a separate unit test.
Owner

We should add the following to get more coverage:

  • Open a snoop file
  • Open a file that is not any of the options
  • Open pcap files with different magic numbers
  • Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions
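To make the last item concrete, here is a minimal sketch of what a pcap version check could look like, under the assumption that such a check would accept only version 2.4 (the version written by current libpcap). `isPcapVersion2_4Sketch` is a hypothetical helper, not a function from this PR. In a classic pcap file the 4-byte magic number is followed by two 16-bit fields, version_major and version_minor, stored in the file's byte order.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when the buffer starts with a microsecond pcap magic number
// (either byte order) followed by version 2.4. Hypothetical helper.
inline bool isPcapVersion2_4Sketch(const std::uint8_t* data, std::size_t size)
{
    if (size < 8)
        return false;

    const bool littleEndian = data[0] == 0xD4 && data[1] == 0xC3 && data[2] == 0xB2 && data[3] == 0xA1;
    const bool bigEndian = data[0] == 0xA1 && data[1] == 0xB2 && data[2] == 0xC3 && data[3] == 0xD4;
    if (!littleEndian && !bigEndian)
        return false;

    // Decode the two 16-bit version fields in the byte order implied by the magic.
    const std::uint16_t major = littleEndian
                                    ? static_cast<std::uint16_t>(data[4] | (data[5] << 8))
                                    : static_cast<std::uint16_t>((data[4] << 8) | data[5]);
    const std::uint16_t minor = littleEndian
                                    ? static_cast<std::uint16_t>(data[6] | (data[7] << 8))
                                    : static_cast<std::uint16_t>((data[6] << 8) | data[7]);

    return major == 2 && minor == 4;
}
```

A bogus test file for this case would then be eight bytes such as D4 C3 B2 A1 09 00 09 00: a valid magic number followed by an unsupported version.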

Collaborator Author

3d713ab adds the following tests:

  • Pcap, PcapNG, Zst file with correct content + extension
  • Pcap, PcapNG file with correct content + wrong extension
  • Bogus content file with correct extension (pcap, pcapng, zst)
  • Bogus content file with wrong extension (txt)

Haven't found a snoop file to add. Do we have any?

"Open pcap files with different magic numbers"

Do you mean pcap content that has only its magic number changed? If so, IMO it is reasonable to treat that as an invalid format and fail as with regular bogus data.

"Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions"

Pending on #1962 (comment).

Move it out if it needs to be reused somewhere.
Libpcap has supported reading this format since 0.9.1. The heuristic detection identifies this magic number as pcap and leaves the final support decision to the pcap backend infrastructure.
Owner

seladb commented Sep 21, 2025

@Dimi1010 some CI tests fail...

Collaborator Author

Dimi1010 commented Sep 26, 2025

@Dimi1010 some CI tests fail...

@seladb I think I found the issue. According to ChatGPT, WinPcap's wpcap.dll does not actually support nanosecond-precision files [1], because it is based on libpcap 1.0.x. Nanosecond-precision pcaps were added in libpcap 1.5.

The Npcap SDK and drivers provide their own wpcap.dll, which does support nanosecond precision.

The tests we currently have for nanosecond-precision pcap files ("nanosecs.pcap") are flawed. We don't use a static baseline to compare against. We write a packet with a PcapFileWriterDevice configured for nanosecond precision and then attempt to read it back with a PcapFileReaderDevice. The problem is that the writer reports no error when it is configured to write nanosecond precision but cannot do so because PCAP_TSTAMP_PRECISION_NANO is not defined; it just uses pcap_open_dead, which uses the default precision.

Because of this, the current nanosecond test in the master branch passes with WinPcap, even though it doesn't actually write or read a nanosecond-precision pcap.

The new tests use a static copy of nanoseconds.pcap generated with Npcap as the baseline to compare against. As a result, a PcapFileReaderDevice with the WinPcap backend cannot read that file, and the test fails because open() fails.

Edit:
[1] - Confirmed that by looking at the Winpcap DLL API. It does not have functions like pcap_open_dead_with_tstamp_precision, etc.
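
One way a test could catch the silent fallback described above is to inspect the magic number of the file that was actually written. The sketch below assumes nothing about the library: `TimestampPrecisionSketch` and `precisionFromFileSketch` are hypothetical names, and the function only reads the first four bytes of a file. Classic pcap files use magic 0xA1B2C3D4 for microsecond timestamps and 0xA1B23C4D for nanosecond timestamps, each possibly byte-swapped.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical result type for this sketch.
enum class TimestampPrecisionSketch { Unknown, Microseconds, Nanoseconds };

// Report the timestamp precision declared by a pcap file's magic number.
inline TimestampPrecisionSketch precisionFromFileSketch(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    unsigned char bytes[4] = {};
    // read() fails if the file could not be opened or holds fewer than 4 bytes.
    if (!file.read(reinterpret_cast<char*>(bytes), sizeof(bytes)))
        return TimestampPrecisionSketch::Unknown;

    // Assemble the bytes as a big-endian value, independent of host byte order.
    const std::uint32_t magic = (static_cast<std::uint32_t>(bytes[0]) << 24) |
                                (static_cast<std::uint32_t>(bytes[1]) << 16) |
                                (static_cast<std::uint32_t>(bytes[2]) << 8) |
                                static_cast<std::uint32_t>(bytes[3]);

    if (magic == 0xA1B2C3D4 || magic == 0xD4C3B2A1)
        return TimestampPrecisionSketch::Microseconds;
    if (magic == 0xA1B23C4D || magic == 0x4D3CB2A1)
        return TimestampPrecisionSketch::Nanoseconds;
    return TimestampPrecisionSketch::Unknown;
}
```

A writer test could then assert that a file written with nanosecond precision requested reports `Nanoseconds`, instead of relying on a round trip through the reader.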
