@@ -70,21 +70,28 @@ Let's take a look at how the following code is structured:
         def to_item(self):
             ...  # more specific parsing

-    @handle_urls(["dualexample.com", "dualexample.net"], overrides=GenericProductPage)
+    @handle_urls(["dualexample.com/shop/?product=*", "dualexample.net/store/?pid=*"], overrides=GenericProductPage)
     class DualExampleProductPage(ItemWebPage):
         def to_item(self):
             ...  # more specific parsing

 The code above declares that:

-    - For sites that matches the ``example.com`` pattern, ``ExampleProductPage``
+    - For sites that match the ``example.com`` pattern, ``ExampleProductPage``
       would be used instead of ``GenericProductPage``.
-    - The same is true for ``YetAnotherExampleProductPage`` where it is used
-      instead of ``GenericProductPage`` for two URLs: ``dualexample.com`` and
-      ``dualexample.net``.
-    - However, ``AnotherExampleProductPage`` is only used instead of ``GenericProductPage``
-      when we're parsing pages from ``anotherexample.com`` which doesn't contain
-      ``/digital-goods/`` in its URL path.
+    - The same is true for ``DualExampleProductPage`` where it is used
+      instead of ``GenericProductPage`` for two URL patterns which works as:
+
+      - **(match)** https://www.dualexample.com/shop/electronics/?product=123
+      - **(match)** https://www.dualexample.com/shop/books/paperback/?product=849
+      - (NO match) https://www.dualexample.com/on-sale/books/?product=923
+      - **(match)** https://www.dualexample.net/store/kitchen/?pid=776
+      - **(match)** https://www.dualexample.net/store/?pid=892
+      - (NO match) https://www.dualexample.net/new-offers/fitness/?pid=892
+
+    - On the other hand, ``AnotherExampleProductPage`` is only used instead of
+      ``GenericProductPage`` when we're parsing pages from ``anotherexample.com``
+      which doesn't contain ``/digital-goods/`` in its URL path.

 The override mechanism that ``web-poet`` offers could still be further
 customized. You can read some of the specific parameters and alternative ways
@@ -115,10 +122,11 @@ code example below:
         def to_item(self):
             ...  # more specific parsing

-    @primary_registry.handle_urls(["dualexample.com", "dualexample.net"], overrides=GenericProductPage)
-    @secondary_registry.handle_urls(["dualexample.com", "dualexample.net"], overrides=GenericProductPage)
+    @primary_registry.handle_urls(["dualexample.com/shop/?product=*", "dualexample.net/store/?pid=*"], overrides=GenericProductPage)
+    @secondary_registry.handle_urls(["dualexample.com/shop/?product=*", "dualexample.net/store/?pid=*"], overrides=GenericProductPage)
     class DualExampleProductPage(ItemWebPage):
         def to_item(self):
+            ...  # more specific parsing

 If you need more control over the Registry, you could instantiate your very
 own :class:`~.PageObjectRegistry` and use its ``@handle_urls`` to annotate and
@@ -159,11 +167,11 @@ like ``web_poet my_project.page_objects`` would produce the following:

 .. code-block::

-    Use this                                              instead of                                  for the URL patterns                    except for the patterns    with priority    meta
-    ----------------------------------------------------  ------------------------------------------  --------------------------------------  -------------------------  ---------------  ------
-    my_project.page_objects.ExampleProductPage            my_project.page_objects.GenericProductPage  ['example.com']                         []                         500              {}
-    my_project.page_objects.AnotherExampleProductPage     my_project.page_objects.GenericProductPage  ['anotherexample.com']                  ['/digital-goods/']        500              {}
-    my_project.page_objects.DualExampleProductPage        my_project.page_objects.GenericProductPage  ['dualexample.com', 'dualexample.net']  []                         500              {}
+    Use this                                              instead of                                  for the URL patterns                    except for the patterns    with priority    meta
+    ----------------------------------------------------  ------------------------------------------  --------------------------------------  -------------------------  ---------------  ------
+    my_project.page_objects.ExampleProductPage            my_project.page_objects.GenericProductPage  ['example.com']                         []                         500              {}
+    my_project.page_objects.AnotherExampleProductPage     my_project.page_objects.GenericProductPage  ['anotherexample.com']                  ['/digital-goods/']        500              {}
+    my_project.page_objects.DualExampleProductPage        my_project.page_objects.GenericProductPage  ['dualexample.com/shop/?product=*', 'dualexample.net/store/?pid=*']  []                         500              {}

 Organizing Page Object Overrides
 --------------------------------
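Note on the match list added in the first hunk: the six example URLs are consistent with three rules, which the sketch below spells out using only the standard library. The rules are assumptions inferred from those examples (a domain also covers its subdomains, the pattern path acts as a prefix, and ``?name=*`` requires that query parameter to be present). ``simple_match``, ``matches_any`` and ``PATTERNS`` are hypothetical names; this is not the matcher that ``web-poet`` relies on.

```python
from urllib.parse import parse_qs, urlsplit

# The two patterns introduced by this change.
PATTERNS = [
    "dualexample.com/shop/?product=*",
    "dualexample.net/store/?pid=*",
]


def simple_match(pattern: str, url: str) -> bool:
    """Approximate check of a URL against a "domain/path/?param=*" pattern.

    Hypothetical helper for illustration only.
    """
    # Prefix "//" so that urlsplit treats the pattern's domain as a netloc.
    pat = urlsplit("//" + pattern)
    target = urlsplit(url)

    # Assumed rule 1: the domain matches itself and any of its subdomains.
    host = target.hostname or ""
    domain = pat.hostname or ""
    if host != domain and not host.endswith("." + domain):
        return False

    # Assumed rule 2: the pattern path must be a prefix of the URL path.
    if not target.path.startswith(pat.path):
        return False

    # Assumed rule 3: every query parameter named in the pattern must be present.
    required = parse_qs(pat.query, keep_blank_values=True)
    present = parse_qs(target.query, keep_blank_values=True)
    return all(name in present for name in required)


def matches_any(url: str) -> bool:
    """Return True if the URL matches at least one of PATTERNS."""
    return any(simple_match(pattern, url) for pattern in PATTERNS)


if __name__ == "__main__":
    for url in [
        "https://www.dualexample.com/shop/electronics/?product=123",
        "https://www.dualexample.com/on-sale/books/?product=923",
    ]:
        print(url, matches_any(url))
```

Under these assumptions, the ``/on-sale/`` and ``/new-offers/`` URLs fail only on the path prefix, since their domain and query parameter would otherwise qualify.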