Releases: neo4j/graph-data-science
Graph Data Science 1.6.5
GDS 1.6.5 is compatible with Neo4j 4.0, 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
- Fixed a bug in gds.graph.create, where default values for array properties would throw for convertible types.
- Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
- Fixed a bug where the BitIdMap node mapping builder (on by default in GDS Enterprise Edition) would not correctly count all nodes in certain situations.
- Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.
GDS 1.7.0-Preview
GDS 1.7.0-preview is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5.
Breaking changes
- This release does not support Neo4j 4.0.x.
- Aligned the returned modelInfo entry names of gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train with the model catalog. They now contain modelName and modelInfo instead of name and info.
- Removed the sharedUpdater parameter from gds.alpha.ml.linkPrediction and gds.alpha.ml.nodeClassification.
- gds.beta.graph.export.csv now exports into a subdirectory called export. Previously, the exported graphs were written directly into the configured directory.
- Renamed all graphalgo packages to gds.
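Client code that reads the modelInfo entries may need to cope with both the old and the new key names during an upgrade. The following is a minimal Python sketch of that idea, not part of GDS: the function name normalize_model_info is hypothetical, and it assumes the entries have already been fetched into a plain dictionary.

```python
# Hypothetical helper, not part of GDS: maps the pre-1.7 entry names
# (name, info) to the names used from 1.7.0-preview on (modelName, modelInfo).
RENAMED_KEYS = {"name": "modelName", "info": "modelInfo"}


def normalize_model_info(entry: dict) -> dict:
    """Return a copy of `entry` that uses the newer key names."""
    normalized = {}
    for key, value in entry.items():
        new_key = RENAMED_KEYS.get(key, key)
        # If a newer key is already present, do not overwrite it with
        # the value stored under the older name.
        if key in RENAMED_KEYS and new_key in normalized:
            continue
        normalized[new_key] = value
    return normalized


old_style = {"name": "my-model", "info": {"classes": [0, 1]}}
print(normalize_model_info(old_style))
```

Entries that already use the newer names pass through unchanged, so the helper can be applied regardless of which GDS version produced the result.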
New features
- New Algorithm: Approximate Maximum K-Cut
  - Includes procedures: gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate].
- Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
  - Includes procedures: gds.alpha.ml.pipeline.linkPrediction.create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate.
- Introduced support for exporting additional node properties, including strings, from the underlying database.
  - Added additionalNodeProperties parameter to gds.graph.export
  - Added additionalNodeProperties parameter to gds.graph.export.csv
- Introduced experimental support for querying the in-memory graph with Cypher
  - Added gds.alpha.create.cypherdb to allow Neo4j to recognize the in-memory graph as a database for Cypher queries
- To make it easier to handle multiple concurrent users, we’ve added a system monitoring procedure, gds.alpha.systemMonitor, to provide an overview of the system's workload and available resources.
- Progress logging is now turned on by default, and no longer requires changing your configuration settings. Progress can be accessed with gds.beta.listProgress.
- GraphSAGE now supports deterministic results with the randomSeed configuration parameter to gds.beta.graphSage.train.
- Improved performance (up to 20x speedup) of weakly connected components, gds.wcc, for undirected graphs by applying a subgraph sampling optimization.
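The bracket shorthand used in these notes, as in gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate], stands for one procedure per listed mode. As an illustration only, here is a small Python sketch that expands the shorthand into full procedure names; the helper name expand_procedures is hypothetical and the shorthand is a convention of these notes, not GDS syntax.

```python
import re
from typing import List

# Matches "<prefix>.[mode1|mode2|...]" as written in the release notes.
_SHORTHAND = re.compile(r"(?P<prefix>[\w.]+\.)\[(?P<modes>[^\]]+)\]")


def expand_procedures(shorthand: str) -> List[str]:
    """Expand the bracket shorthand into a list of full procedure names."""
    match = _SHORTHAND.fullmatch(shorthand)
    if match is None:
        # No bracket group: treat the input as a single procedure name.
        return [shorthand]
    prefix = match.group("prefix")
    return [prefix + mode for mode in match.group("modes").split("|")]


for name in expand_procedures(
    "gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate]"
):
    print(name)
```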
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected gds.beta.graphSage and gds.alpha.spanningTree.
- Supervised Machine Learning (Node Classification & Link Prediction):
  - Fixed a NaN issue in NodeClassification where computations with very small probability values can cause the result to flip to infinity.
  - Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
  - Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.
- Progress Logging:
  - Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
- Node Similarity & KNN:
  - Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
  - Fixed a bug which affected gds.nodeSimilarity.write and gds.alpha.knn.write when being executed in combination with a nodeLabels filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
  - Fixed a bug where gds.nodeSimilarity.[write|mutate] and gds.beta.knn.[write|mutate] wrote duplicate relationships if the input graph is undirected.
- KNN:
  - Fixed a bug in gds.beta.knn where negative values in node properties of type float arrays failed when returning the similarityDistribution.
- Fast RP:
  - FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
- GraphSAGE:
  - Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
  - Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
- Graph Operations:
  - Fixed a bug in gds.graph.removeNodeProperties where removedPropertiesWritten was too large for properties shared across multiple labels.
  - Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
  - Fixed a bug in gds.create.subgraph which could lead to undefined behaviour or an AIOOB exception when executed on GDS Enterprise Edition.
  - Fixed a bug in gds.graph.create, where default values for array properties would throw for convertible types.
Improvements
- Pathfinding: Added existence checks for sourceNode and targetNode to all shortest path procedures in the product tier.
- Improved runtime of gds.fastRP via better workload balancing between threads.
- Lower memory footprint for LinkPrediction and NodeClassification.
- Improved the procedure output of gds.beta.listProgress.
- Scale down scores computed by gds.articleRank.
GDS 1.6.4
GDS 1.6.3
Release Date: July 22, 2021
Breaking changes
- Remove the sharedUpdater parameter from gds.alpha.ml.linkPrediction and gds.alpha.ml.nodeClassification.
New features
Bug fixes
- Fixed a bug which affected gds.nodeSimilarity.write and gds.alpha.knn.write when being executed in combination with a nodeLabels filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
- Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
- Fixed a bug where gds.nodeSimilarity.[write|mutate] and gds.beta.knn.[write|mutate] wrote duplicate relationships if the input graph is undirected.
- Fixed a bug in gds.beta.knn where negative values in node properties of type float arrays failed when returning the similarityDistribution.
Improvements
- Lower memory footprint for LinkPrediction and NodeClassification.
GDS 1.6.2
Release Date: July 8, 2021
GDS 1.6.2 is compatible with Neo4j 4.3, 4.2, 4.1, and 4.0. It is not compatible with Neo4j 3.5.x - for a compatible release, please see GDS 1.1.6
Bug Fixes
- Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
- Fixed a bug in gds.beta.node2vec where using relationship weights would not work when running concurrently.
- Fixed a bug in gds.graph.create where the default value could not be changed for array properties.
1.6.1
Release Date: 17 June 2021
GDS 1.6.1 is compatible with Neo4j 4.0, 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected gds.beta.graphSage and gds.alpha.spanningTree.
- Fixed a NaN issue in NodeClassification where computations with very small probability values can cause the result to flip to infinity.
- Progress logging (gds.beta.listProgress)
  - Fixed a bug where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug with Pregel algorithms logging where progress events would not be released on algorithm completion.
- Fixed a bug regarding mutated node properties that could cause an AIOOB exception.
- Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
- FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
GDS 1.6.0
Release Date: 27 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
  - Added procedures:
    - gds.degree.stream.estimate
    - gds.degree.write.estimate
    - gds.degree.mutate
    - gds.degree.mutate.estimate
    - gds.degree.stats
    - gds.degree.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.degree.stream
    - gds.alpha.degree.write
- Article Rank has been promoted to the product tier
  - Added procedures:
    - gds.articleRank.stream
    - gds.articleRank.stream.estimate
    - gds.articleRank.write
    - gds.articleRank.write.estimate
    - gds.articleRank.mutate
    - gds.articleRank.mutate.estimate
    - gds.articleRank.stats
    - gds.articleRank.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.articleRank.stream
    - gds.alpha.articleRank.write
- Eigenvector Centrality has been promoted to the product tier
  - Added procedures:
    - gds.eigenvector.stream
    - gds.eigenvector.stream.estimate
    - gds.eigenvector.write
    - gds.eigenvector.write.estimate
    - gds.eigenvector.mutate
    - gds.eigenvector.mutate.estimate
    - gds.eigenvector.stats
    - gds.eigenvector.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.eigenvector.stream
    - gds.alpha.eigenvector.write
- AStar has been promoted to the product tier
  - Added procedures:
    - gds.astar.stream
    - gds.astar.stream.estimate
    - gds.astar.write
    - gds.astar.write.estimate
    - gds.astar.mutate
    - gds.astar.mutate.estimate
  - Removed beta procedures:
    - gds.beta.astar.stream
    - gds.beta.astar.stream.estimate
    - gds.beta.astar.write
    - gds.beta.astar.write.estimate
    - gds.beta.astar.mutate
    - gds.beta.astar.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yen's K Shortest Paths has been promoted to the product tier:
  - Added procedures:
    - gds.yens.stream
    - gds.yens.stream.estimate
    - gds.yens.write
    - gds.yens.write.estimate
    - gds.yens.mutate
    - gds.yens.mutate.estimate
  - Removed beta procedures:
    - gds.beta.yens.stream
    - gds.beta.yens.stream.estimate
    - gds.beta.yens.write
    - gds.beta.yens.write.estimate
    - gds.beta.yens.mutate
    - gds.beta.yens.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
  - Added procedures:
    - gds.shortestPath.dijkstra.stream
    - gds.shortestPath.dijkstra.stream.estimate
    - gds.shortestPath.dijkstra.write
    - gds.shortestPath.dijkstra.write.estimate
    - gds.shortestPath.dijkstra.mutate
    - gds.shortestPath.dijkstra.mutate.estimate
  - Removed beta procedures:
    - gds.beta.shortestPath.dijkstra.stream
    - gds.beta.shortestPath.dijkstra.stream.estimate
    - gds.beta.shortestPath.dijkstra.write
    - gds.beta.shortestPath.dijkstra.write.estimate
    - gds.beta.shortestPath.dijkstra.mutate
    - gds.beta.shortestPath.dijkstra.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
  - Added procedures:
    - gds.allShortestPath.dijkstra.stream
    - gds.allShortestPath.dijkstra.stream.estimate
    - gds.allShortestPath.dijkstra.write
    - gds.allShortestPath.dijkstra.write.estimate
    - gds.allShortestPath.dijkstra.mutate
    - gds.allShortestPath.dijkstra.mutate.estimate
  - Removed beta procedures:
    - gds.beta.allShortestPath.dijkstra.stream
    - gds.beta.allShortestPath.dijkstra.stream.estimate
    - gds.beta.allShortestPath.dijkstra.write
    - gds.beta.allShortestPath.dijkstra.write.estimate
    - gds.beta.allShortestPath.dijkstra.mutate
    - gds.beta.allShortestPath.dijkstra.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
  - Added procedures:
    - gds.beta.node2vec.stream
    - gds.beta.node2vec.stream.estimate
    - gds.beta.node2vec.write
    - gds.beta.node2vec.write.estimate
    - gds.beta.node2vec.mutate
    - gds.beta.node2vec.mutate.estimate
  - Removed alpha procedures:
    - gds.alpha.node2vec.stream
    - gds.alpha.node2vec.write
- The parameter centerSamplingFactor is renamed to positiveSamplingFactor
- The parameter contextSamplingExponent is renamed to negativeSamplingExponent
- The maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
- The maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
- The windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- The gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
- Removed the degreeAsProperty configuration parameter from GraphSAGE
  - The same effect can be achieved by using gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
  - Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
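Existing training configurations written for earlier versions need their keys updated to match the renames listed above. The following Python sketch shows one way to do that on the client side. It is an illustration under assumptions, not a GDS utility: the function name migrate_train_config and the link_prediction flag are hypothetical, and the configuration is assumed to be a plain dictionary that is later passed to the train procedure.

```python
# Hypothetical client-side helper, not part of GDS.
# Renames that apply to the train modes of Node Classification and Link Prediction.
RENAMED = {
    "maxStreakCount": "patience",
    "maxIterations": "maxEpochs",
    "minIterations": "minEpochs",
}
# Parameters that were removed from the train modes.
REMOVED = {"windowSize"}


def migrate_train_config(config: dict, link_prediction: bool = False) -> dict:
    """Return a copy of `config` using the GDS 1.6 parameter names."""
    migrated = {}
    for key, value in config.items():
        if key in REMOVED:
            continue  # windowSize no longer exists
        if link_prediction and key == "classRatio":
            key = "negativeClassWeight"  # rename specific to linkPrediction.train
        migrated[RENAMED.get(key, key)] = value
    if link_prediction and "negativeClassWeight" not in migrated:
        # The notes state that negativeClassWeight is mandatory now.
        raise ValueError("negativeClassWeight must be set for link prediction training")
    return migrated


old = {"maxStreakCount": 3, "maxIterations": 100, "windowSize": 5, "classRatio": 2.0}
print(migrate_train_config(old, link_prediction=True))
```

Raising an error for a missing negativeClassWeight is a design choice of this sketch; it surfaces the new mandatory parameter before the procedure call is made.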
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
  - gds.alpha.scaleProperties.stream
  - gds.alpha.scaleProperties.mutate
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure gds.beta.graph.create.subgraph
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
  - gds.alpha.influenceMaximization.celf.stream
  - gds.alpha.influenceMaximization.greedy.stream
- Link Prediction:
  - Added support for storing, loading and publishing Link Prediction models.
  - Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
  - Added write and stream modes to gds.alpha.ml.linkPrediction.predict
    - gds.alpha.ml.linkPrediction.stream
    - gds.alpha.ml.linkPrediction.write
  - Added estimate mode for Link Prediction:
    - gds.alpha.ml.linkPrediction.train.estimate
    - gds.alpha.ml.lin...
GDS 1.6 Preview
Release Date: 20 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
  - Added procedures:
    - gds.degree.stream.estimate
    - gds.degree.write.estimate
    - gds.degree.mutate
    - gds.degree.mutate.estimate
    - gds.degree.stats
    - gds.degree.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.degree.stream
    - gds.alpha.degree.write
- Article Rank has been promoted to the product tier
  - Added procedures:
    - gds.articleRank.stream
    - gds.articleRank.stream.estimate
    - gds.articleRank.write
    - gds.articleRank.write.estimate
    - gds.articleRank.mutate
    - gds.articleRank.mutate.estimate
    - gds.articleRank.stats
    - gds.articleRank.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.articleRank.stream
    - gds.alpha.articleRank.write
- Eigenvector Centrality has been promoted to the product tier
  - Added procedures:
    - gds.eigenvector.stream
    - gds.eigenvector.stream.estimate
    - gds.eigenvector.write
    - gds.eigenvector.write.estimate
    - gds.eigenvector.mutate
    - gds.eigenvector.mutate.estimate
    - gds.eigenvector.stats
    - gds.eigenvector.stats.estimate
  - Removed alpha procedures:
    - gds.alpha.eigenvector.stream
    - gds.alpha.eigenvector.write
- AStar has been promoted to the product tier
  - Added procedures:
    - gds.astar.stream
    - gds.astar.stream.estimate
    - gds.astar.write
    - gds.astar.write.estimate
    - gds.astar.mutate
    - gds.astar.mutate.estimate
  - Removed beta procedures:
    - gds.beta.astar.stream
    - gds.beta.astar.stream.estimate
    - gds.beta.astar.write
    - gds.beta.astar.write.estimate
    - gds.beta.astar.mutate
    - gds.beta.astar.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yen's K Shortest Paths has been promoted to the product tier:
  - Added procedures:
    - gds.yens.stream
    - gds.yens.stream.estimate
    - gds.yens.write
    - gds.yens.write.estimate
    - gds.yens.mutate
    - gds.yens.mutate.estimate
  - Removed beta procedures:
    - gds.beta.yens.stream
    - gds.beta.yens.stream.estimate
    - gds.beta.yens.write
    - gds.beta.yens.write.estimate
    - gds.beta.yens.mutate
    - gds.beta.yens.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
  - Added procedures:
    - gds.shortestPath.dijkstra.stream
    - gds.shortestPath.dijkstra.stream.estimate
    - gds.shortestPath.dijkstra.write
    - gds.shortestPath.dijkstra.write.estimate
    - gds.shortestPath.dijkstra.mutate
    - gds.shortestPath.dijkstra.mutate.estimate
  - Removed beta procedures:
    - gds.beta.shortestPath.dijkstra.stream
    - gds.beta.shortestPath.dijkstra.stream.estimate
    - gds.beta.shortestPath.dijkstra.write
    - gds.beta.shortestPath.dijkstra.write.estimate
    - gds.beta.shortestPath.dijkstra.mutate
    - gds.beta.shortestPath.dijkstra.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
  - Added procedures:
    - gds.allShortestPath.dijkstra.stream
    - gds.allShortestPath.dijkstra.stream.estimate
    - gds.allShortestPath.dijkstra.write
    - gds.allShortestPath.dijkstra.write.estimate
    - gds.allShortestPath.dijkstra.mutate
    - gds.allShortestPath.dijkstra.mutate.estimate
  - Removed beta procedures:
    - gds.beta.allShortestPath.dijkstra.stream
    - gds.beta.allShortestPath.dijkstra.stream.estimate
    - gds.beta.allShortestPath.dijkstra.write
    - gds.beta.allShortestPath.dijkstra.write.estimate
    - gds.beta.allShortestPath.dijkstra.mutate
    - gds.beta.allShortestPath.dijkstra.mutate.estimate
  - The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
  - Added procedures:
    - gds.beta.node2vec.stream
    - gds.beta.node2vec.stream.estimate
    - gds.beta.node2vec.write
    - gds.beta.node2vec.write.estimate
    - gds.beta.node2vec.mutate
    - gds.beta.node2vec.mutate.estimate
  - Removed alpha procedures:
    - gds.alpha.node2vec.stream
    - gds.alpha.node2vec.write
- The parameter centerSamplingFactor is renamed to positiveSamplingFactor
- The parameter contextSamplingExponent is renamed to negativeSamplingExponent
- The maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
- The maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
- The windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- The gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
- Removed the degreeAsProperty configuration parameter from GraphSAGE
  - The same effect can be achieved by using gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
  - Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
  - gds.alpha.scaleProperties.stream
  - gds.alpha.scaleProperties.mutate
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure gds.beta.graph.create.subgraph
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
  - gds.alpha.influenceMaximization.celf.stream
  - gds.alpha.influenceMaximization.greedy.stream
- Link Prediction:
  - Added support for storing, loading and publishing Link Prediction models.
  - Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
  - Added write and stream modes to gds.alpha.ml.linkPrediction.predict
    - gds.alpha.ml.linkPrediction.stream
    - gds.alpha.ml.linkPrediction.write
  - Added estimate mode for Link Prediction:
    - gds.alpha.ml.linkPrediction.train.estimate
    - gds.alpha.ml.lin...
1.5.2
Release Date: 11 May 2021
GDS 1.5 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in FastRPExtended concerning implementation internals. In particular, when propertyDimension == embeddingDimension, the output contained NaNs.
- Fixed a bug where Alpha similarity algorithms in some cases could fail on division by 0 when writing results back.
- Fixed an issue where gds.graph.drop could take a long time when the graph contained node embeddings.
- Fixed a bug where gds.beta.graphSage.train was failing in the presence of array properties.
1.5.1
Release Date: 3 March, 2021
GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys by failing early.
- Fixed potential ArrayIndexOutOfBoundsException when running gds.triangleCount on a relationship-filtered graph.
- Fixed a bug that can lead to inconsistencies when writing or mutating new relationships created from a label-filtered graph.
Improvements
- Progress logging: Removed a "disabled" log message from the database startup when GDS was running in its default configuration. It is replaced with a more elaborate "enabled" message when the progress tracking feature is enabled.
- We now return the name of the current database in the error message if graph is not found.