Commit 392bab3

Add comment and TODO to canWeSkipSeeking() (#1041)
1 parent 2c07f58 commit 392bab3

1 file changed: +37 -31 lines changed


src/torchcodec/_core/SingleStreamDecoder.cpp

Lines changed: 37 additions & 31 deletions
@@ -1088,32 +1088,17 @@ void SingleStreamDecoder::setCursor(int64_t pts) {
   cursor_ = pts;
 }

-/*
-Videos have I frames and non-I frames (P and B frames). Non-I frames need data
-from the previous I frame to be decoded.
-
-Imagine the cursor is at a random frame with PTS=lastDecodedAvFramePts (x for
-brevity) and we wish to seek to a user-specified PTS=y.
-
-If y < x, we don't have a choice but to seek backwards to the highest I frame
-before y.
-
-If y > x, we have two choices:
-
-1. We could keep decoding forward until we hit y. Illustrated below:
-
-I P P P I P P P I P P I P P I P
-  x                 y
-
-2. We could try to jump to an I frame between x and y (indicated by j below).
-And then start decoding until we encounter y. Illustrated below:
-
-I P P P I P P P I P P I P P I P
-  x             j   y
-
-(2) is more efficient than (1) if there is an I frame between x and y.
-*/
 bool SingleStreamDecoder::canWeAvoidSeeking() const {
+  // Returns true if we can avoid seeking in the AVFormatContext based on
+  // heuristics that rely on the target cursor_ and the last decoded frame.
+  // Seeking is expensive, so we try to avoid it when possible.
+  // Note that this function itself isn't always that cheap to call: in
+  // particular the calls to getKeyFrameIndexForPts below in approximate mode
+  // are sometimes slow.
+  // TODO we should understand why (is it because it reads the file?) and
+  // potentially optimize it. E.g. we may not want to ever seek, or even *check*
+  // if we need to seek in some cases, like if we're going to decode 80% of the
+  // frames anyway.
   const StreamInfo& streamInfo = streamInfos_.at(activeStreamIndex_);
   if (streamInfo.avMediaType == AVMEDIA_TYPE_AUDIO) {
     // For audio, we only need to seek if a backwards seek was requested
@@ -1136,13 +1121,34 @@ bool SingleStreamDecoder::canWeAvoidSeeking() const {
     // implement caching.
     return false;
   }
-  // We are seeking forwards.
-  // We can only skip a seek if both lastDecodedAvFramePts and
-  // cursor_ share the same keyframe.
-  int lastDecodedAvFrameIndex = getKeyFrameIndexForPts(lastDecodedAvFramePts_);
+  // We are seeking forwards. We can skip a seek if both the last decoded frame
+  // and cursor_ share the same keyframe:
+  // Videos have I frames and non-I frames (P and B frames). Non-I frames need
+  // data from the previous I frame to be decoded.
+  //
+  // Imagine the cursor is at a random frame with PTS=lastDecodedAvFramePts (x
+  // for brevity) and we wish to seek to a user-specified PTS=y.
+  //
+  // If y < x, we don't have a choice but to seek backwards to the highest I
+  // frame before y.
+  //
+  // If y > x, we have two choices:
+  //
+  // 1. We could keep decoding forward until we hit y. Illustrated below:
+  //
+  // I P P P I P P P I P P I P
+  //   x                 y
+  //
+  // 2. We could try to jump to an I frame between x and y (indicated by j
+  // below). And then start decoding until we encounter y. Illustrated below:
+  //
+  // I P P P I P P P I P P I P
+  //   x             j   y
+  // (2) is only more efficient than (1) if there is an I frame between x and y.
+  int lastKeyFrameIndex = getKeyFrameIndexForPts(lastDecodedAvFramePts_);
   int targetKeyFrameIndex = getKeyFrameIndexForPts(cursor_);
-  return lastDecodedAvFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
-      lastDecodedAvFrameIndex == targetKeyFrameIndex;
+  return lastKeyFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
+      lastKeyFrameIndex == targetKeyFrameIndex;
 }

 // This method looks at currentPts and desiredPts and seeks in the
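
For readers who want to experiment with the forward-seek check outside the decoder, here is a minimal standalone sketch of the "same key frame" comparison. It is not the torchcodec implementation: `keyFrameIndexForPts` and `sameKeyFrame` are hypothetical free functions, and the sorted `keyFramePts` vector is an assumed stand-in for whatever key frame index the decoder actually consults in `getKeyFrameIndexForPts`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for getKeyFrameIndexForPts(): returns the index of the
// last key frame whose PTS is <= pts, or -1 if pts precedes every key frame.
// Assumption: keyFramePts is sorted in ascending order.
int keyFrameIndexForPts(const std::vector<int64_t>& keyFramePts, int64_t pts) {
  assert(std::is_sorted(keyFramePts.begin(), keyFramePts.end()));
  // upper_bound finds the first key frame strictly after pts; the key frame
  // that pts depends on is the one just before it.
  auto it = std::upper_bound(keyFramePts.begin(), keyFramePts.end(), pts);
  return static_cast<int>(it - keyFramePts.begin()) - 1;
}

// Forward-seek case only (targetPts > lastDecodedPts): decoding forward
// without seeking is the better choice when both PTS values fall under the
// same key frame, i.e. there is no I frame between them to jump to.
bool sameKeyFrame(
    const std::vector<int64_t>& keyFramePts,
    int64_t lastDecodedPts,
    int64_t targetPts) {
  int lastKeyFrameIndex = keyFrameIndexForPts(keyFramePts, lastDecodedPts);
  int targetKeyFrameIndex = keyFrameIndexForPts(keyFramePts, targetPts);
  return lastKeyFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
      lastKeyFrameIndex == targetKeyFrameIndex;
}
```

With key frames at PTS 0, 4, 8 and 11, `sameKeyFrame(k, 5, 7)` is true (both frames depend on the key frame at PTS 4, so no seek is needed), while `sameKeyFrame(k, 1, 10)` is false (key frames lie between the two positions, so seeking to one may pay off). A binary search over an in-memory vector is cheap; the TODO in the commit suggests the real lookup can be slower in approximate mode, which this sketch does not model.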
