@ikreymer ikreymer commented Oct 3, 2025

The fix is to differentiate more concretely between these two configurations:

# remote cdx + remote proxy loading via LiveWebLoader
remote: cdx+https://remote-archive.example.com/web/cdx

# remote cdx + but *local* WARC loading:
local_coll:
    index_paths: cdx+http://outbackcdx:8080/coll
    archive_paths: /path/to/mywarcs/

The former is for proxying via a remote CDX, while the latter is the recommended config for OutbackCDX.

This supersedes the fix proposed in #917 by adding a conditional that checks which configuration is in use
(remote CDX + local loading, or remote CDX + remote liveweb loading) based on whether archive_paths was provided,
as is needed for usage with OutbackCDX.
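
To illustrate the distinction, here is a minimal sketch of how the two configurations above could be told apart by the presence of archive_paths. The helper name `uses_local_warcs` and the plain-dict representation of the collections are assumptions made for this example, not the code in this PR:

```python
def uses_local_warcs(coll_config):
    """Hypothetical helper: True when 'archive_paths' is explicitly set."""
    # A bare string (e.g. "cdx+https://...") is the shorthand form:
    # remote index with remote loading, so no local WARC paths.
    if isinstance(coll_config, str):
        return False
    return bool(coll_config.get("archive_paths"))


# The two collections from the description, written as a Python dict.
collections = {
    "remote": "cdx+https://remote-archive.example.com/web/cdx",
    "local_coll": {
        "index_paths": "cdx+http://outbackcdx:8080/coll",
        "archive_paths": "/path/to/mywarcs/",
    },
}

modes = {name: uses_local_warcs(cfg) for name, cfg in collections.items()}
```

Under these assumptions, only `local_coll` is treated as remote CDX + local WARC loading.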

Skipping the LiveWebLoader
is needed so that the lookup path can be retried for revisit records, while remote proxying, of course,
does not need to do that.
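
The conditional described above might look roughly like the following sketch. The function names, the loader labels, and the dict-shaped CDX record are hypothetical stand-ins for illustration, not pywb's actual API:

```python
def should_skip_liveweb(cdx, archive_paths_set):
    """Hypothetical check: skip the live web loader only when WARCs are
    local (archive_paths set) and the CDX record already carries a
    filename and offset."""
    has_location = bool(cdx.get("filename")) and cdx.get("offset") not in (None, "")
    return bool(archive_paths_set and has_location)


def pick_loaders(cdx, archive_paths_set):
    """Return the ordered loader labels to try for this record."""
    loaders = ["warc_path_loader"]
    if not should_skip_liveweb(cdx, archive_paths_set):
        # Fully remote setup (or no filename/offset): proxy via live web.
        loaders.append("liveweb_loader")
    return loaders
```

With archive_paths set and a filename/offset present, only the local loader is tried, so a failed load can fall through to the revisit lookup instead of being sent to the live web loader.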

Fixes #865

Types of changes

  • Replay fix (fixes a replay specific issue)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added or updated tests to cover my changes.
  • All new and existing tests passed.

Differentiate between when LiveWebLoader is used for
fully remote loading (e.g. remote CDX + pywb) vs
just a remote index (remote CDX + local WARCs)
by checking if 'archive_paths' has been explicitly set.
If it has, then skip LiveWebLoader when filename/offset
are provided (to fall back to revisit).

@tw4l tw4l left a comment

Makes sense and testing well!

@ikreymer ikreymer merged commit 100295b into main Oct 7, 2025
7 checks passed
@ikreymer ikreymer deleted the issue-865-fix-liveweb-fallback-conditional branch October 7, 2025 00:02
Development

Successfully merging this pull request may close these issues.

Pywb failing to handle self-redirects from OutbackCDX

2 participants