Avoid printing to standard output

Several methods in the EDDL API write directly to standard output, so that library users have no control over what gets printed and when. Since this happens within each method's implementation, the Python bindings have no way to prevent it, thus Python library users are also affected. This is particularly inconvenient when using the library from a Jupyter notebook: in order to display output in notebook cells, Jupyter replaces `sys.stdout` with a custom `ipykernel.iostream.OutStream` object. However, `sys.stdout` is just a Python-level abstraction, that *usually* wraps the system's file descriptor for stdout (but not in this case), while the underlying C++ library is writing to the real stdout, so its output ends up in the terminal window where Jupyter was launched (and gets mixed up with logging output from Jupyter itself). This affects several kinds of methods, most notably:

* Info prints such as `Tensor.info()` and `summary(net)`
* Progress from dataset downloaders such as `download_mnist`
* Progress from network training methods such as `fit`

In the case of info prints, the amount of text is relatively small and gets printed all at once, so it should probably be enough to change these methods to return a string containing the informational dump.

The case of methods that periodically report progress is a bit more complicated. One thing that should work fine is to add support for a callback function that would be passed the relevant info at each tick. For instance, this is what Python's standard library does in [urlretrieve](https://docs.python.org/3/library/urllib.request.html#urllib.request.urlretrieve), a tool for downloading remote files:

```python
def urlretrieve(url, filename=None, reporthook=None, data=None):
    # [...]
```

`reporthook` expects a function that takes three arguments: a block count, a block size, and the total file size. For instance, the user could write the following function to pass as `reporthook`:

```python
def print_progress(count, block_size, total_size):
    perc = 100 * count * block_size / total_size
    sys.stdout.write("\r>> Getting %s %.1f%%" % (tar_name, perc))
    sys.stdout.flush()
```

Each time `urlretrieve` wants to report on its progress, it calls the `print_progress` function specified by the user, which now has complete freedom of what to do with the information: in this case it is simply dumped to `sys.stdout`, but it could be used to populate a database or update a chart instead. Even the simple example shown above should be enough to fix the Jupyter problem though.

The relevant C++ methods should be modified to take extra optional `std::function` arguments, which are already used in `newloss` and `newmetric` and thus should be OK to bind in Python.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Avoid printing to standard output #321

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Avoid printing to standard output #321

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions