Releases: quixio/quix-streams
v2.2.0a
What's Changed
Added Windowed Aggregations
- Implemented two types of time-based windows:
  - Tumbling windows slice time into even, non-overlapping periods.
    Example: [0, 10), [10, 20), [20, 30)
  - Hopping windows slice time into even, overlapping periods with a fixed step.
    Example: [0, 10), [1, 11), [2, 12)
- Support for various aggregation functions: `sum`, `count`, `mean`, `min`, `max`, `reduce`
- Two modes of emitting aggregated results:
  - `final` - to emit results when the window is closed
  - `current` - to emit results for each incoming message as it's processed
- Support for out-of-order processing
Find more about Windowed Aggregations in the docs
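For illustration, here is a minimal sketch of a windowed aggregation pipeline. The broker address, topic names, and the `temperature` field are placeholders; consult the docs for the exact option names:

```python
from datetime import timedelta

from quixstreams import Application

app = Application(broker_address="localhost:9092", consumer_group="windowed-agg")
input_topic = app.topic("events", value_deserializer="json")      # placeholder topic
output_topic = app.topic("events-agg", value_serializer="json")   # placeholder topic

sdf = app.dataframe(input_topic)

# Average the "temperature" field over 10-second tumbling windows;
# each window emits a single aggregated result once it closes ("final" mode)
sdf = (
    sdf.apply(lambda value: value["temperature"])
    .tumbling_window(duration_ms=timedelta(seconds=10))
    .mean()
    .final()
)

sdf = sdf.to_topic(output_topic)

app.run(sdf)
```

Swapping `.tumbling_window(...)` for `.hopping_window(...)` with a step, or `.final()` for `.current()`, selects the other window type and emit mode.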
Full Changelog: v2.1.4a...v2.2.0a
v2.1.4a
What's Changed
- Add support for Column families by @daniil-quix in #268
- Update the mapping of broker settings for Quix apps by @daniil-quix in #270
- Fix bug in RocksDB transaction on deleting cached key by @harisbotic in #269
- Add API to create a pre-configured Producer and Consumer on the Application level by @harisbotic in #261
Full Changelog: v2.1.3a...v2.1.4a
v2.1.3a
What's Changed
- Implement better handling for the "all brokers down" error by @peter-quix in #264
Repo cleanup
- Make URLs absolute in the docs by @daniil-quix in #256
- Update repo banner by @SteveRosam in #262
- Update README.md by @SteveRosam in #263
Full Changelog: v2.1.2a...v2.1.3a
v2.1.2a
What's Changed
Full Changelog: v2.1.1a...v2.1.2a
v2.1.1a
What's Changed
- [New] Implemented `StreamingDataFrame.apply(expand=True)` to produce multiple individual values from a collection. [#246]
  You may use `.apply(expand=True)` to produce multiple values to the output topics for a single input message (see the first sketch after this list).
  Docs
[Fix] Make "and" and "or" comparisons lazy in
StreamingSeries
[#248]
In previous versions, code likesdf['a'] | sdf['b']
andsdf['a'] & sdf['b']
was always evaluating both sides of the expression, while usualand
andor
can evaluate the right side lazily.
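A minimal sketch of `expand=True`; the `items` field is a hypothetical placeholder:

```python
# Hypothetical input value: {"items": ["a", "b", "c"]}
# With expand=True, each element of the returned collection is sent
# downstream as an individual message.
sdf = sdf.apply(lambda value: value["items"], expand=True)
```

And a sketch of what the lazy comparisons change, again with placeholder field names:

```python
# With a lazy "or", the right-hand side is evaluated only when the
# left-hand side is falsy, so messages where "a" > 0 no longer trigger
# a lookup of the possibly missing "b" key.
sdf = sdf[(sdf["a"] > 0) | (sdf["b"] > 0)]
```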
Full Changelog: v2.1alpha1...v2.1.1a
v2.1a1
Release v2.1a1
This release brings new features and breaking API changes.
Read on for more details.
What's Changed
Many updates to `StreamingDataFrame` and `StreamingSeries` (ex-`Column`) [#238]
We updated how streaming data is processed by the `StreamingDataFrame` and `StreamingSeries` classes.
They now use the same engine to apply functions to the message values.
This introduces both new features and breaking changes:
1. [breaking] Changes to `StreamingDataFrame` and new methods `.update()` and `.filter()` (see the sketch after this list)
   - Functions passed to `StreamingDataFrame.apply()` should always return a new value. The result of the function will be propagated downstream.
     Previously, `.apply()` was used to both mutate values and generate new ones.
     That is now discouraged, and the `.update()` method should be used instead to mutate data and perform side effects (e.g. to print to the console).
     Docs
   - New method `StreamingDataFrame.update()` to mutate values in place.
     Docs
   - New method `StreamingDataFrame.filter()` to filter values.
     Docs
   - New syntax to filter values - `dataframe[dataframe.apply(<func>)]`.
     `StreamingDataFrame` can now filter data using a similar API to `pandas.DataFrame`.
     Docs
   - New syntax to update values - `dataframe['field'] = dataframe.apply(<func>)`.
     `StreamingDataFrame` can now assign new keys to the values using functions.
     Docs
   - `StreamingDataFrame` doesn't require values to be dictionaries anymore.
   - `StreamingDataFrame` methods now return a new `StreamingDataFrame`.
     Previously, many methods like `.apply()` and `.to_topic()` mutated the state of the `StreamingDataFrame` instance.
     Now all of them return a new instance, and the current instance remains intact.
   - Functions passed to `.apply()` now get only one argument - the message value.
     Previously, functions passed to `.apply()` were provided with at least two positional arguments: the message value and a `MessageContext` with Kafka message metadata.
     Now they don't need to accept `MessageContext`; the current message key and metadata are stored globally for each incoming message.
     To get the current message key and/or the full message metadata, use `quixstreams.message_key()` and `quixstreams.message_context()` in your custom functions.
     Docs
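A minimal sketch tying these changes together; the field names (`price`, `quantity`, `total`) are placeholders:

```python
from quixstreams import message_key

# .apply() must return a new value, which is propagated downstream
sdf = sdf.apply(lambda value: {"total": value["price"] * value["quantity"]})

# .update() mutates the value in place and is the place for side effects
sdf = sdf.update(lambda value: print("Processing:", value))

# .filter() keeps only messages for which the function returns a truthy result
sdf = sdf.filter(lambda value: value["total"] > 0)

# Equivalent pandas-like filtering syntax
sdf = sdf[sdf.apply(lambda value: value["total"] > 0)]

# pandas-like assignment syntax; functions now take only the message value,
# so the message key is fetched via quixstreams.message_key()
sdf["message_key"] = sdf.apply(lambda value: message_key())
```

Since methods now return a new `StreamingDataFrame`, each step reassigns `sdf` instead of relying on in-place mutation.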
New method to check if the key exists in the message value - `StreamingDataFrame.contains()` [#235]
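For example (the `"price"` key is a placeholder):

```python
# Keep only messages whose value contains the "price" key
sdf = sdf[sdf.contains("price")]
```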
New method to clear the local state - `Application.clear_state()` [#239]
Use `Application.clear_state()` to clear the local state of the application.
Quix Streams keeps state data per consumer group, so the state will be cleared only for the particular consumer group.
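A brief sketch, with a placeholder broker address and consumer group:

```python
from quixstreams import Application

app = Application(broker_address="localhost:9092", consumer_group="my-group")

# Deletes the locally stored state for this consumer group only;
# state kept for other consumer groups is untouched
app.clear_state()
```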
[breaking] Changes to logical operators `|`, `&` and `~` in `StreamingSeries` [#238]
The logical operators `|`, `&`, and `~` now always perform logical comparisons.
Previously they worked this way only with booleans; for numbers they performed bitwise operations.
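A sketch of the new semantics, with placeholder field names:

```python
# "a" and "b" are placeholder numeric fields.
# Previously, sdf["a"] & sdf["b"] computed a bitwise AND of the numbers
# (e.g. 10 & 3 == 2); now it is a logical AND of their truthiness.
sdf = sdf[sdf["a"] & sdf["b"]]

# ~ is now a logical NOT rather than a bitwise inversion
sdf = sdf[~(sdf["a"] > 100)]
```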
Support for Python 3.12 [#244]
Quix Streams now works with Python 3.12
Full Changelog: v2.0alpha2...v2.1alpha1
v2.0a2
This is the first alpha release of the new Quix Streams 2.0.
It's a complete rewrite, providing a different API to work with streaming data in Python.
What's Changed
Introduced Streaming DataFrames - the primary API to define the processing pipeline
- It provides a `pandas.DataFrame`-like API to structure the message transformations.
  For the full description of the `StreamingDataFrame` API, please see StreamingDataFrame: Detailed Overview
- It supports stateful operations backed by RocksDB state storage. The details regarding working with state are outlined here - Stateful Processing
- It supports different message serialization formats like strings, integers, doubles, JSON, and Quix formats.
  A detailed overview of serialization in 2.0 can be found here - Serialization
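A minimal end-to-end sketch of such a pipeline. The broker address, topic names, and fields are placeholders, and the exact alpha-release signatures may differ slightly:

```python
from quixstreams import Application

app = Application(broker_address="localhost:9092", consumer_group="example")
input_topic = app.topic("input-topic", value_deserializer="json")
output_topic = app.topic("output-topic", value_serializer="json")

sdf = app.dataframe(input_topic)

# In the 2.0 alphas, functions passed to .apply() also received a
# MessageContext with Kafka metadata (this changed later, in v2.1a1)
sdf = sdf.apply(lambda value, ctx: {"total": value["price"] * value["quantity"]})

sdf = sdf.to_topic(output_topic)

app.run(sdf)
```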
Removed a dependency on .NET
The 2.0 version no longer depends on .NET and is pure Python.
This makes the library more stable, safer, and easier to maintain and update.
Compatibility with Quix Streams 0.5.7
Although 2.0 introduces a totally new API, 0.5.7 and 2.0 are mostly compatible at the data level: consumers and producers from both versions may exchange messages in the same format.
There are some limitations to the types of messages supported in 2.0.
Please see Upgrading from Legacy Quixstreams for more details.
Full Changelog: v0.5.7...v2.0alpha2
v0.5.7
New
- Create leading edge buffer where tag is not part of the key by @peter-quix in #180
Bugfix
- Python app should subscribe by default similarly to C# by @peter-quix in #190
Full Changelog: v0.5.6...v0.5.7
v0.5.6
Improvements
- Reduce debug log verbosity when broker connections are reaped due to inactivity or other by @peter-quix in #184
- Add subscribe/unsubscribe to topics by @peter-quix in #186
- Improved file based storage resiliency and speed by @peter-quix in #185
Bugfixes
- Infinity and NaN no longer cause an exception in timeseries data (regression introduced in 0.5.5 via 3d6a244) by @peter-quix in #185
Full Changelog: v0.5.5...v0.5.6
v0.5.5
New
- Add scalar state type by @harisbotic in #137
- Implement LeadingEdgeBuffer by @harisbotic in #146 and #154
Improvements
- C# Performance improvements by @peter-quix in #141
- Expand coverage of idle connection reap message check for consumer by @peter-quix in #165
- Use client.id when using QuixStreamingClient to connect to Confluent Cloud Kafka (#155) by @jordivicedo in #156
Bug fixes
- Handle nulls better in State store by @peter-quix in #159
- Ensure last message is flushed on dispose by @peter-quix in #160
- Fix python state segfaults by @peter-quix in #163
- Fix reporting of available brokers in producer by @peter-quix in #164
- Revert to empty parameter when missing rather than exception by @peter-quix in #167
- Malloc/memleak fixes by @peter-quix in #169
New Contributors
- @jordivicedo made their first contribution in #156
Full Changelog: v0.5.4...v0.5.5