 Porter is the all-purpose PHP data importer. She fetches data from anywhere and serves it as a single record or an iterable [record collection](#record-collections), encouraging processing one record at a time instead of loading full data sets into memory at once. Her [durability](#durability) feature provides automatic, transparent recovery from intermittent network connectivity errors by default.

-Porter's interface trichotomy of [providers](#providers), [resources](#resources) and [connectors](#connectors) maps well to APIs. For example, a typical API such as GitHub would define the provider as GitHub, a resource as `GetUser` or `ListRepositories` and the connector could be [HttpConnector][].
+Porter's [interface trichotomy](#overview) of [providers](#providers), [resources](#resources) and [connectors](#connectors) maps well to APIs. For example, a typical API such as GitHub would define the provider as GitHub, a resource as `GetUser` or `ListRepositories` and the connector could be [HttpConnector][].

 Porter provides a dual API for synchronous and [asynchronous](#asynchronous) imports, both of which are concurrency safe, so multiple imports can be paused and resumed simultaneously. Asynchronous mode allows large scale imports across multiple connections to work at maximum efficiency without waiting for each network call to complete.

@@ -149,8 +149,7 @@ Options may be configured using the methods below.
 - `enableCache()` – Enables caching. Requires a `CachingConnector`.
 - `setMaxFetchAttempts(int)` – Sets the maximum number of fetch attempts per connection before failure is considered permanent.
 - `setFetchExceptionHandler(FetchExceptionHandler)` – Sets the exception handler invoked each time a fetch attempt fails.
-
-In synchronous code, import specifications are an instance of `ImportSpecification`
+- `setThrottle(Throttle)` – Sets the asynchronous connection throttle, invoked each time a connector fetches data. Applies to `AsyncImportSpecification` only.

 Record collections
 ------------------
@@ -195,11 +194,15 @@ Programming asynchronously requires an understanding of Amp, the async framework

 ### Throttling

-The asynchronous import model is very powerful because it changes our application's performance from being I/O-bound to being CPU-bound. That is, in the traditional synchronous model, each import operation must wait for the previous to complete before the next begins, meaning the total import time depends how long it takes each import's network I/O to complete. In the async model, since we send many requests concurrently without waiting for the previous to complete, on average each import operation will only take as long as our CPU takes to process it, since we are busy processing another import during network latency.
+The asynchronous import model is very powerful because it changes our application's performance model from I/O-bound, limited by the speed of the network, to CPU-bound, limited by the speed of the CPU. In the traditional synchronous model, each import operation must wait for the previous to complete before the next begins, meaning the total import time depends on how long it takes each import's network I/O to finish. In the async model, since we send many requests concurrently without waiting for the previous to complete, on average each import operation only takes as long as our CPU takes to process it, since we are busy processing another import during network latency (except during the initial "spin-up").
+
+Synchronously, we seldom trip protection measures even for high volume imports, however the naïve approach to asynchronous imports is often fraught with perils. If we import 10,000 HTTP resources at once, one of two things usually happens: either we run out of PHP memory and the process terminates prematurely or the HTTP server rejects us after sending too many requests in a short period. The solution is throttling.

-High volume synchronous imports are, in a way, self-throttling and it is rare to trip protection measures in this mode, however the naïve approach to asynchronous imports is often fraught with perils. For example, when we import 10,000 HTTP resources at once, one of two things usually happens: either we run out of PHP memory and the process is killed or the HTTP server blocks us for sending too many requests in a short period. The solution is throttling.
+[Async Throttle][] is a library included with Porter to throttle asynchronous imports. The throttle works by preventing additional operations starting when too many are executing concurrently, based on user-defined limits. By default, `NullThrottle` is assigned, which does not throttle connections. `DualThrottle` can be used to set two independent connection rate limits: the maximum number of connections per second and the maximum number of concurrent connections. A `DualThrottle` can be assigned by modifying the import specification as follows.

-We provide [Async Throttle][] to throttle asynchronous imports. The Async Throttle is a separate project that does not have any direct integration with Porter because that is not needed. The throttle operates on any Amp promises, such as those returned by Porter. The throttle works by preventing additional operations starting when too many are concurrently executing, based on user-defined limits.
+```php
+(new AsyncImportSpecification)->setThrottle(new DualThrottle)