Description
It took me 10 months of frequently using a script to figure out that this is the cause of its infrequent crashes, even though the code suggests it should crash every time.
The issue is as follows: I have a script that switches between session and driver a few times, transferring cookies every time it does so. It starts by extracting information and links from a page with the session, then uses that to step through it on the webdriver, and finally passes it back to the session to download the information it got.
A barebones version of the setup looks like this:
req_s = Session(...)
req_s.get("domain")
# do things
req_s.transfer_session_cookies_to_driver()
req_s.driver.get("domain/specific/path")
# do things with driver
req_s.transfer_driver_cookies_to_session()
# do other things with session
req_s.driver.quit()
req_s.close()
Infrequently, the script crashes on the req_s.transfer_session_cookies_to_driver() line. While I currently install through PyPI and am running 0.4.0, I checked the code, and the issue should still appear in 0.5.0, as the relevant lines didn't change.
Although the transfer_session_cookies_to_driver() function makes sure that there is a domain and a last visited URL in the session, it does not perform an equivalent check for the driver.
requestium/requestium/requestium_session.py
Lines 123 to 131 in 38f5b17
if not domain:
    msg = "Trying to transfer cookies to selenium without specifying a domain and without having visited any page in the current session"
    raise InvalidCookieDomainException(msg)
# Transfer cookies
for c in [c for c in self.cookies if domain in c.domain]:
    cookie = {"name": c.name, "value": c.value, "path": c.path, "expiry": c.expires, "domain": c.domain}
    self.driver.ensure_add_cookie({k: v for k, v in cookie.items() if v is not None})
When it passes the individual cookies to the driver, things get a little bit more interesting:
requestium/requestium/requestium_mixin.py
Lines 68 to 75 in 38f5b17
if override_domain:
    cookie["domain"] = override_domain
cookie_domain = cookie["domain"] if cookie["domain"][0] != "." else cookie["domain"][1:]
try:
    browser_domain = tldextract.extract(self.current_url).fqdn
except AttributeError:
    browser_domain = ""
The webdriver's self.current_url is defined on Selenium's webdriver as a property that executes a command in the current browser window: https://github.com/SeleniumHQ/selenium/blob/2ab802bd4bf1e590e67b5c664fa1c01040f96bf1/py/selenium/webdriver/remote/webdriver.py#L585
However, when there is no active browser window because no request has been made yet, executing a command in the browser window fails. As such, the call to self.current_url raises a NoSuchWindowException, which is not caught because the surrounding try block only handles AttributeError.
I've noticed that when running the browser headless the code will infrequently fail on this line, but when not headless, there appears to be a window from the start, and it will be able to run a command. Given the circumstances, I would have expected this to always give this exception.
As a workaround, I now call req_s.driver.get("domain") before transferring the cookies, which solves the no-window issue.
Environment:
Python 3.12.4
OS: macOS 14.4 (Sonoma)
Webdriver: Undetected Chromedriver
Chrome 139
Relevant packages:
requestium 0.4.0
selenium 4.34.2
requests 2.32.4
undetected-chromedriver 3.5.5
Peculiarities about setup: this particular script's implementation relies on CDP events to intercept network requests. The main usage of Selenium in this approach is to avoid having to extract canvas elements from the page, and instead grab what is inserted into the canvas, then pass that back to the session and download those from there. As there are CSRF tokens involved in order to access those URLs that are passed in session cookies, the transfer is essential for the script to function.
While I don't believe that the use of CDP is relevant for this particular issue, the combined version of the code looks as follows:
req_s = Session(driver=uc.Chrome(headless=True, enable_cdp_events=True))
req_s.driver.add_cdp_listener('Network.requestWillBeSent', handle_network_request)
Stack trace showcasing the issue:
Traceback (most recent call last):
File "/Users/afelix/PycharmProjects/LoghiExperiments/matricula_sandbox_redux.py", line 106, in <module>
req_s.transfer_session_cookies_to_driver()
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/requestium/requestium.py", line 118, in transfer_session_cookies_to_driver
self.driver.ensure_add_cookie({k: v for k, v in cookie.items() if v is not None})
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/requestium/requestium.py", line 227, in ensure_add_cookie
browser_domain = tldextract.extract(self.current_url).fqdn
^^^^^^^^^^^^^^^^
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/undetected_chromedriver/__init__.py", line 806, in __getattribute__
return super().__getattribute__(item)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py", line 583, in current_url
return self.execute(Command.GET_CURRENT_URL)["value"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/selenium/webdriver/remote/webdriver.py", line 454, in execute
self.error_handler.check_response(response)
File "/Users/afelix/.local/share/virtualenvs/LoghiExperiments-BX_9XuM1/lib/python3.12/site-packages/selenium/webdriver/remote/errorhandler.py", line 232, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchWindowException: Message: no such window: target window already closed
from unknown error: web view not found
(Session info: chrome=139.0.7258.67)
Stacktrace:
0 undetected_chromedriver 0x0000000101322e98 undetected_chromedriver + 5918360
1 undetected_chromedriver 0x000000010131a42a undetected_chromedriver + 5882922
2 undetected_chromedriver 0x0000000100de6e20 undetected_chromedriver + 429600
3 undetected_chromedriver 0x0000000100dbb880 undetected_chromedriver + 252032
4 undetected_chromedriver 0x0000000100e670b8 undetected_chromedriver + 954552
5 undetected_chromedriver 0x0000000100e85a2c undetected_chromedriver + 1079852
6 undetected_chromedriver 0x0000000100e5ece3 undetected_chromedriver + 920803
7 undetected_chromedriver 0x0000000100e2b29b undetected_chromedriver + 709275
8 undetected_chromedriver 0x0000000100e2bf81 undetected_chromedriver + 712577
9 undetected_chromedriver 0x00000001012dfba0 undetected_chromedriver + 5643168
10 undetected_chromedriver 0x00000001012e3a54 undetected_chromedriver + 5659220
11 undetected_chromedriver 0x00000001012bb412 undetected_chromedriver + 5493778
12 undetected_chromedriver 0x00000001012e44ff undetected_chromedriver + 5661951
13 undetected_chromedriver 0x00000001012aa3b4 undetected_chromedriver + 5424052
14 undetected_chromedriver 0x0000000101307718 undetected_chromedriver + 5805848
15 undetected_chromedriver 0x00000001013078e0 undetected_chromedriver + 5806304
16 undetected_chromedriver 0x000000010131a001 undetected_chromedriver + 5881857
17 libsystem_pthread.dylib 0x00007ff81b08c18b _pthread_start + 99
18 libsystem_pthread.dylib 0x00007ff81b087ae3 thread_start + 15