README.md
Lines changed: 11 additions & 28 deletions
Original file line number
Diff line number
Diff line change
@@ -119,7 +119,7 @@ Porter's API
119
119
`Porter` provides just two public methods for importing data. These are the methods to be most familiar with, where the life of a data import operation begins.
120
120
121
121
* `import(Import): PorterRecords|CountablePorterRecords` – Imports one or more records from the resource contained in the specified import specification. If the total size of the collection is known, the record collection may implement `Countable`; otherwise `PorterRecords` is returned.
122
-
* `importOne(Import): ?array` – Imports one record from the resource contained in the specified import specification. If more than one record is imported, `ImportException` is thrown. Use this when a provider implements `SingleRecordResource`, returning just a single record.
122
+
* `importOne(Import): mixed` – Imports one record from the resource contained in the specified import specification. If more than one record is imported, `ImportException` is thrown. Use this when a provider implements `SingleRecordResource`, returning just a single record.
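
The single-record contract described for `importOne` can be sketched without Porter. The helper below is hypothetical: `takeOne`, the use of `LengthException` in place of `ImportException`, and returning `null` for an empty collection are all assumptions made for illustration, not Porter's implementation.

```php
<?php
// Hypothetical helper mirroring the described behaviour; not Porter's implementation.
function takeOne(Iterator $records)
{
    $records->rewind();

    if (!$records->valid()) {
        return null; // Assumption: an empty collection yields null.
    }

    $one = $records->current();
    $records->next();

    // A second record violates the single-record contract.
    if ($records->valid()) {
        throw new LengthException('More than one record imported.');
    }

    return $one;
}

// The record may be any type, here an array.
var_dump(takeOne(new ArrayIterator([['id' => 1]])) === ['id' => 1]); // bool(true)
```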
123
123
124
124
Overview
125
125
--------
@@ -153,13 +153,13 @@ Options may be configured using the methods below.
153
153
Record collections
154
154
------------------
155
155
156
-
Record collections are `Iterator`s, guaranteeing imported data is enumerable using `foreach`. Each *record* of the collection is the familiar and flexible `array` type, allowing us to present structured or flat data, such as JSON, XML or CSV, as an array.
156
+
Record collections are `Iterator`s, guaranteeing imported data is enumerable using `foreach`. Each *record* of the collection is the `mixed` type, offering the flexibility to present data series in whatever format is most useful for the user, such as an array for JSON data or an object that bundles data with functionality that the user can directly invoke.
157
157
158
158
### Details
159
159
160
160
Record collections may be `Countable`, depending on whether the imported data was countable and whether any destructive operations were performed after import. Filtering is a destructive operation since it may remove records and therefore the count reported by a `ProviderResource` would no longer be accurate. It is the responsibility of the resource to supply the total number of records in its collection by returning an iterator that implements `Countable`, such as `ArrayIterator`, or more commonly, `CountableProviderRecords`. When a countable iterator is used, Porter returns `CountablePorterRecords`, provided no destructive operations were performed.
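
The difference between countable and non-countable iterators can be seen with PHP's built-in types alone. This sketch does not use Porter; it only shows why an `ArrayIterator` can report its size up front while a `Generator` cannot.

```php
<?php
// The built-in ArrayIterator implements both \Iterator and \Countable.
$countable = new ArrayIterator([['id' => 1], ['id' => 2], ['id' => 3]]);

// A generator is an \Iterator but not \Countable: its size is unknown until it is exhausted.
$generator = (static function (): Generator {
    yield ['id' => 1];
    yield ['id' => 2];
})();

var_dump($countable instanceof Countable); // bool(true)
var_dump($generator instanceof Countable); // bool(false)
var_dump(count($countable));               // int(3)
```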
161
161
162
-
Record collections are composed by Porter using the decorator pattern. If provider data is not modified, `PorterRecords` will decorate the `ProviderRecords` returned from a `ProviderResource`. That is, `PorterRecords` has a pointer back to the previous collection, which could be written as: `PorterRecords` → `ProviderRecords`. If a [filter](#filtering) was applied, the collection stack would be `PorterRecords` → `FilteredRecords` → `ProviderRecords`. Normally this is an unimportant detail but can sometimes be useful for debugging.
162
+
Record collections are composed by Porter using the decorator pattern. If provider data is not modified, `PorterRecords` will decorate the `ProviderRecords` returned from a `ProviderResource`. That is, `PorterRecords` has a pointer back to the previous collection, which could be written as: `PorterRecords` → `ProviderRecords`. If a [filter](#filtering) was applied, the collection stack would be `PorterRecords` → `FilteredRecords` → `ProviderRecords`. Normally, this is an unimportant detail but can sometimes be useful for debugging.
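
A decorator stack of this shape can be sketched with PHP's `IteratorIterator`. The class names `RecordLayer`, `ProviderLayer`, `FilteredLayer` and `PorterLayer`, and the `getPrevious()` method, are hypothetical stand-ins chosen for this illustration; they are not Porter's classes.

```php
<?php
// Hypothetical stand-ins for illustration only; these are not Porter's classes.
class RecordLayer extends IteratorIterator
{
    // Returns the collection this layer decorates, or null for the innermost layer.
    public function getPrevious(): ?RecordLayer
    {
        $inner = $this->getInnerIterator();

        return $inner instanceof RecordLayer ? $inner : null;
    }
}

final class ProviderLayer extends RecordLayer {}
final class FilteredLayer extends RecordLayer {}
final class PorterLayer extends RecordLayer {}

// Walks from the outermost layer inwards, collecting class names.
function describeStack(RecordLayer $outer): string
{
    $names = [];

    for ($layer = $outer; $layer !== null; $layer = $layer->getPrevious()) {
        $names[] = get_class($layer);
    }

    return implode(' -> ', $names);
}

$provider = new ProviderLayer(new ArrayIterator([1, 2, 3]));
$stack = new PorterLayer(new FilteredLayer($provider));

echo describeStack($stack), PHP_EOL; // PorterLayer -> FilteredLayer -> ProviderLayer
```

Each layer remains iterable with `foreach`, while the pointer to the previous layer makes the chain of transformations inspectable when debugging.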
163
163
164
164
The stack of record collection types informs us of the transformations a collection has undergone and each type holds a pointer to relevant objects that participated in the transformation. For example, `PorterRecords` holds a reference to the `Import` that was used to create it and can be accessed using `PorterRecords::getImport`.
165
165
@@ -249,7 +249,7 @@ Transformers should also implement the `__clone` magic method if they store any
249
249
Filtering
250
250
---------
251
251
252
-
Filtering provides a way to remove some records. For each record, if the specified predicate function returns `false` (or a falsy value), the record will be removed, otherwise the record will be kept. The predicate receives the current record as an array as its first parameter and context as its second parameter.
252
+
Filtering provides a way to remove some records. For each record, if the specified predicate function returns `false` (or a falsy value), the record will be removed; otherwise, the record will be kept. The predicate receives the current record as its first parameter and context as its second parameter.
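
The predicate contract can be illustrated with a small generator. `filterRecords` is a hypothetical helper written for this sketch, not Porter's API, and the shape of the context argument is an assumption.

```php
<?php
// Hypothetical helper, not Porter's API: lazily keeps records for which the predicate is truthy.
function filterRecords(iterable $records, callable $predicate, $context = null): Generator
{
    foreach ($records as $record) {
        // The predicate receives the record first and the context second.
        if ($predicate($record, $context)) {
            yield $record;
        }
    }
}

$records = new ArrayIterator([
    ['name' => 'Alice', 'active' => true],
    ['name' => 'Bob', 'active' => false],
    ['name' => 'Carol', 'active' => true],
]);

$active = filterRecords($records, static function ($record, $context): bool {
    return $record['active'];
});

// The source was countable, but the filtered result no longer is.
var_dump($active instanceof Countable); // bool(false)

foreach ($active as $record) {
    echo $record['name'], PHP_EOL; // Alice, then Carol
}
```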
253
253
254
254
In general, we would like to avoid filtering because it is inefficient to import data and then immediately remove some of it, but some immature APIs do not provide a way to reduce the data set on the server, so filtering on the client is the only alternative. Filtering also invalidates the record count reported by some resources, meaning we no longer know how many records are in the collection before iteration.
255
255
@@ -281,7 +281,7 @@ The exception handler can be changed by calling `setFetchExceptionHandler`. For
Durability only applies when connectors throw a recoverable exception type derived from `RecoverableConnectorException`. If an unexpected exception occurs the fetch attempt will be aborted. For more information, see [implementing connector durability](#durability-1). Exception handlers receive the thrown exception as their first argument. An exception handler can inspect the recoverable exception and throw its own exception if it decides the exception should be treated as fatal instead of recoverable.
284
+
Durability only applies when connectors throw a recoverable exception type derived from `RecoverableConnectorException`. If an unexpected exception occurs, the fetch attempt will be aborted. For more information, see [implementing connector durability](#durability-1). Exception handlers receive the thrown exception as their first argument. An exception handler can inspect the recoverable exception and throw its own exception if it decides the exception should be treated as fatal instead of recoverable.
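
The interaction between recoverable exceptions and an exception handler might look roughly like the following. Everything here is a hypothetical stand-in: `RecoverableException`, `fetchDurably` and the attempt limit are assumptions for illustration and do not reproduce Porter's retry logic.

```php
<?php
// Hypothetical stand-ins for illustration; not Porter's classes or retry logic.
class RecoverableException extends RuntimeException {}

// Retries $fetch while it throws RecoverableException, calling $handler before each retry.
// The handler may throw to treat the failure as fatal. Other exception types propagate immediately.
function fetchDurably(callable $fetch, callable $handler, int $maxAttempts)
{
    for ($attempt = 1; ; ++$attempt) {
        try {
            return $fetch();
        } catch (RecoverableException $exception) {
            if ($attempt >= $maxAttempts) {
                throw $exception;
            }

            $handler($exception);
        }
    }
}

$calls = 0;
$result = fetchDurably(
    static function () use (&$calls): string {
        if (++$calls < 3) {
            throw new RecoverableException('Temporary failure');
        }

        return 'data';
    },
    static function (RecoverableException $exception): void {
        // Inspect $exception here; throwing would abort the remaining retries.
    },
    5
);

echo $result, ' after ', $calls, ' attempts', PHP_EOL; // data after 3 attempts
```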
285
285
286
286
Caching
287
287
-------
@@ -362,7 +362,7 @@ final class MyProvider implements Provider
362
362
Resources
363
363
---------
364
364
365
-
Resources fetch data using the supplied connector and format it as a collection of arrays. A resource implements `ProviderResource` that defines the following three methods.
365
+
Resources fetch data using the supplied connector and format it as an iterator. A resource implements `ProviderResource`, which defines the following three methods.
366
366
367
367
```php
368
368
public function getProviderClassName(): string;
@@ -371,33 +371,22 @@ public function fetch(ImportConnector $connector): \Iterator;
371
371
372
372
A resource supplies the class name of the provider it expects a connector from when `getProviderClassName()` is called.
373
373
374
-
When `fetch()` is called it is passed the connector from which data must be fetched. The resource must ensure data is formatted as an iterator of array values whilst remaining as true to the original format as possible; that is, we must avoid renaming or restructuring data because it is the caller's prerogative to perform data customization if desired. The recommended way to return an iterator is to use `yield` to implicitly return a `Generator`, which has the added benefit of processing one record at a time.
374
+
When `fetch()` is called, it is passed the connector from which data must be fetched. The resource must ensure data is formatted as an iterator of values whilst remaining as true to the original format as possible; that is, we must avoid renaming or restructuring data because it is the caller's prerogative to perform data customization if desired. The recommended way to return an iterator is to use `yield` to implicitly return a `Generator`, which has the added benefit of processing one record at a time.
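
The `yield` approach can be shown in isolation. In this sketch, `fetchRecords` is a hypothetical stand-in for a resource's `fetch()` method, and a hard-coded JSON string takes the place of the connector's response.

```php
<?php
// Stand-in for a resource's fetch(): the connector is replaced by a hard-coded JSON string.
function fetchRecords(string $json): Iterator
{
    // Because this function contains yield, calling it returns a Generator,
    // and none of the body runs until iteration begins.
    foreach (json_decode($json, true) as $record) {
        yield $record; // Passed through unchanged: no renaming or restructuring.
    }
}

$records = fetchRecords('[{"id": 1, "Name": "foo"}, {"id": 2, "Name": "bar"}]');

var_dump($records instanceof Generator); // bool(true)

foreach ($records as $record) {
    echo $record['id'], ': ', $record['Name'], PHP_EOL;
}
```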
375
375
376
376
The fetch method receives an `ImportConnector`, which is a runtime wrapper for the underlying connector supplied by the provider. This wrapper is used to isolate the connector's state from the rest of the application. Since PHP doesn't have native immutability support, working with cloned state is the only way we can guarantee unexpected changes do not occur once an import has started. This means it's safe to import one resource, make changes to the connector's settings and then start another import before the first has completed. Providers can also safely make changes to the underlying connector by calling `getWrappedConnector()`, because the wrapped connector is cloned as soon as `ImportConnector` is constructed.
377
377
378
378
Providing immutability via cloning is an important concept because resources are often implemented using generators, which implies delayed code execution. Multiple fetches can be started with different settings, but execute in a different order some time later when they're finally enumerated. This issue will become even more pertinent when Porter supports asynchronous fetches, enabling multiple fetches to execute concurrently. However, we don't need to worry about this implementation detail unless writing a connector ourselves.
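
The effect of delayed execution, and how cloning guards against it, can be demonstrated with plain PHP. `DemoConnector` and `lazyFetch` are hypothetical names for this sketch and are not part of Porter.

```php
<?php
// Hypothetical connector with mutable settings; not a Porter class.
final class DemoConnector
{
    public $baseUrl = 'https://example.com/v1';
}

// Reads the connector's settings only when iteration starts (delayed execution).
function lazyFetch(DemoConnector $connector): Generator
{
    yield $connector->baseUrl;
}

$connector = new DemoConnector();

$shared = lazyFetch($connector);         // Holds the same object the caller can still mutate.
$isolated = lazyFetch(clone $connector); // Holds a private snapshot of the settings.

$connector->baseUrl = 'https://example.com/v2'; // Changed after both fetches were started.

echo $shared->current(), PHP_EOL;   // https://example.com/v2
echo $isolated->current(), PHP_EOL; // https://example.com/v1
```

The fetch that received a clone still sees the settings that were in effect when it was started, which is the guarantee the wrapper is described as providing.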
379
379
380
380
### Writing a resource
381
381
382
-
Resources must implement the `ProviderResource` interface. `getProviderClassName()` usually returns a hard-coded provider class name and `fetch()` must always return an iterator of array values.
383
-
384
-
In this contrived example that uses dummy data and ignores the connector, suppose we want to return the numeric series one to three: the following implementation would be invalid because it returns an iterator of integer values instead of an iterator of array values.
385
-
386
-
```php
387
-
public function fetch(ImportConnector $connector): \Iterator
388
-
{
389
-
return new ArrayIterator(range(1, 3)); // Invalid return type.
390
-
}
391
-
```
382
+
Resources must implement the `ProviderResource` interface. `getProviderClassName()` usually returns a hard-coded provider class name and `fetch()` must always return an iterator.
392
383
393
384
Either of the following `fetch()` implementations would be valid.
394
385
395
386
```php
396
387
public function fetch(ImportConnector $connector): \Iterator
397
388
{
398
-
foreach (range(1, 3) as $number) {
399
-
yield [$number];
400
-
}
389
+
    return new ArrayIterator(range(1, 3)); // Valid: ArrayIterator implements \Iterator.
401
390
}
402
391
```
403
392
@@ -406,13 +395,7 @@ Since the total number of records is known, the iterator can be wrapped in `Coun
406
395
```php
407
396
public function fetch(ImportConnector $connector): \Iterator
408
397
{
409
-
$series = function ($limit) {
410
-
foreach (range(1, $limit) as $number) {
411
-
yield [$number];
412
-
}
413
-
};
414
-
415
-
return new CountableProviderRecords($series($count = 3), $count, $this);
398
+
    return new CountableProviderRecords(new ArrayIterator(range(1, $count = 3)), $count, $this);
416
399
}
417
400
```
418
401
@@ -564,7 +547,7 @@ Porter is supported by [JetBrains for Open Source][] products.