Releases: neo4j/graph-data-science

Graph Data Science 2.1.7

29 Jul 16:53

GDS 2.1.7 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and Neo4j 4.4 versions ≥ 4.4.9.

For GDS compatibility with earlier releases of Neo4j 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
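
The compatibility rules above can be read as a small lookup. The following Python sketch transcribes them; the function and table names are illustrative and not part of GDS, and Neo4j lines that the note does not mention simply yield `None`.

```python
# Compatibility rules transcribed from the 2.1.7 release note.
# All names here are illustrative; none of them are part of GDS.

def parse_version(text):
    """Turn '4.4.9' into (4, 4, 9) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

# Minimum Neo4j patch release per supported minor line for GDS 2.1.7.
GDS_2_1_7_MINIMUMS = {
    (4, 3): (4, 3, 15),
    (4, 4): (4, 4, 9),
}

# GDS releases the note names for older Neo4j lines.
FALLBACK_GDS = {
    (3, 5): "1.1.7",
    (4, 0): "1.6.5",
    (4, 1): "1.8.8",
    (4, 2): "1.8.8",
}

def recommend_gds(neo4j_version):
    """Return the GDS release the note points to for a Neo4j version."""
    version = parse_version(neo4j_version)
    minor_line = version[:2]
    minimum = GDS_2_1_7_MINIMUMS.get(minor_line)
    if minimum is not None:
        # Earlier 4.3 and 4.4 patch releases are served by GDS 2.1.6.
        return "2.1.7" if version >= minimum else "2.1.6"
    # None means the note names no release for this Neo4j line.
    return FALLBACK_GDS.get(minor_line)

print(recommend_gds("4.4.9"))
print(recommend_gds("4.3.14"))
```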

Breaking Changes

  • Link prediction pipeline training no longer accepts directed graphs. This is because the algorithm & ML techniques used by link prediction pipelines are only defined for undirected graphs.
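
One way to adapt to this change is to project the relevant relationships with `UNDIRECTED` orientation before training. The Python sketch below only assembles the Cypher call and its parameters; the graph name, node label, and relationship type are hypothetical, and nothing is sent to a database.

```python
def undirected_projection(graph_name, node_label, relationship_type):
    """Build a gds.graph.project call that projects one relationship type as undirected."""
    query = (
        "CALL gds.graph.project("
        "$graphName, $nodeProjection, $relationshipProjection)"
    )
    params = {
        "graphName": graph_name,
        "nodeProjection": node_label,
        "relationshipProjection": {
            # The orientation key turns each stored relationship into an undirected one.
            relationship_type: {"orientation": "UNDIRECTED"},
        },
    }
    return query, params

# Hypothetical names, chosen only for illustration.
query, params = undirected_projection("lp-graph", "Person", "KNOWS")
print(query)
print(params)
```

The query and parameter map can then be passed to whichever Neo4j driver session is in use.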

Bug Fixes

  • Fixed a bug where modularityOptimization could incorrectly update modularity values.
  • Fixed a bug where gds.restore did not correctly read values wrapped in quotes

2.1.6

21 Jul 17:05

GDS 2.1.6 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8

Bug Fixes

  • Fixed a bug where relationship types or node labels were not handled correctly when importing previously exported data via Apache Arrow.
  • Fixed a bug where gds.graphSage.[stream|write|mutate] did not use the correct relationship weights when run with concurrency > 1.
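
These notes use a bracket shorthand such as `gds.graphSage.[stream|write|mutate]` to name a family of procedures. The helper below is a small Python sketch, not part of GDS, that expands the shorthand into full procedure names; it accepts both `|` and `,` as separators, since both appear in these notes.

```python
import itertools
import re

# Matches one bracketed group such as "[stream|write|mutate]".
GROUP = re.compile(r"\[([^\]]+)\]")

def expand_procedures(shorthand):
    """Expand every bracketed group in a procedure shorthand into full names."""
    # Splitting on a pattern with one capture group alternates
    # literal text (even indexes) and group bodies (odd indexes).
    pieces = GROUP.split(shorthand)
    choices = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            choices.append([piece])
        else:
            choices.append([alt.strip() for alt in re.split(r"[|,]", piece)])
    return ["".join(combo) for combo in itertools.product(*choices)]

for name in expand_procedures("gds.graphSage.[stream|write|mutate]"):
    print(name)
```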

Graph Data Science 2.1.5

07 Jul 14:04

GDS 2.1.5 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8

Improvements

  • Better error handling for K-means

Graph Data Science 2.1.4

23 Jun 19:26

GDS 2.1.4 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8

Bug fixes

  • Fixed a bug where Neo4j users with admin role could not see all graphs in the catalog on GDS enterprise.
  • Fixed a bug in random graph generation where the resulting graph can end up with an incorrect relationship schema.
  • Fixed a bug where gds.graph.list and gds.graph.drop threw a NullPointerException if the catalog contained graphs that had been created with Cypher Aggregation.

Graph Data Science 1.8.8

23 Jun 19:22

GDS 1.8.8 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Breaking Changes

  • The procedures gds.features.useKernelTracker and gds.features.useKernelTracker.reset have been removed.

Bug fixes

  • Fixed a bug in gds.beta.randomWalk.stream where configuring start nodes could lead to ArrayIndexOutOfBounds exceptions.
  • Fixed a bug in gds.graph.export where the configured database directory would not be respected.
  • Fixed a bug with running Triangle Count on filtered graphs.
  • Fixed a bug in gds.beta.graphSage when using activationFunction: 'RELU', where the training did not always compute the correct gradient.
  • Fixed a bug in gds.louvain.stream that occurred when the consecutiveIds parameter was enabled.
  • Fixed a bug where Neo4j users with admin role could not see all graphs in the catalog on GDS enterprise.

Other Changes

  • Updated version of 'com.google.protobuf' to 3.9.12. This fixes a potential Denial of Service issue (GHSA-wrvw-hg22-4m67).

Graph Data Science 2.1.2

15 Jun 17:04

GDS 2.1.2 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.7

Bug fixes

  • Fixed a bug where checking for business rules around running on a Neo4j cluster could cause the cluster to fail to start.

Graph Data Science 2.1.1

13 Jun 22:44

GDS 2.1.1 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.7

Other Updates

  • Fixed issue with publishing compatibility artifacts

Graph Data Science 2.0.5

10 Jun 19:55

GDS 2.0.5 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.7

Bug Fixes

  • Fixed a bug in CollapsePath where a provided nodeFilter would be ignored. GH 194
  • Fixed a bug in RandomWalk where unconsumed stream results could leave GDS in a state where no further operations were possible
  • Fixed a bug in gds.louvain.stream which could arise when the consecutiveIds parameter was enabled.

Improvements

  • Improved error message for gds.beta.graph.project.subgraph when comparing expressions with incompatible types and one of them is a literal expression.
  • Handled exception where centrality scores could not be used to compute the centralityDistribution. Now we only skip the computation of the distribution but the centrality result is accessible.

    Graph Data Science 2.1.0

    09 Jun 12:50

    GDS 2.1.0 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.7

    Breaking Changes

    • Removed the redundant information of parameter space and split config from the info of the models trained by gds.beta.pipeline.[nodeClassification|linkPrediction].train. The information is now accessible only via the Pipeline Catalog.
    • Removed the label parameter from gds.graph.removeNodeProperties.
    • Supported config parameters are timeoutInSeconds and concurrency

    New Features

    • (Enterprise Only) Apache Arrow and Flight RPC can now be used to improve certain import and export tasks:
    • New Algorithm: K-Means Clustering. Added the following procedures:
      • gds.alpha.kmeans.mutate
      • gds.alpha.kmeans.stats
      • gds.alpha.kmeans.stream
      • gds.alpha.kmeans.write
    • New Algorithm: Leiden. Added the following procedures:
      • gds.alpha.leiden.mutate
      • gds.alpha.leiden.stats
      • gds.alpha.leiden.stream
      • gds.alpha.leiden.write
    • Added new similarity variant, Filtered Node Similarity, to alpha tier, accepting source and target node filters
      • gds.alpha.nodeSimilarity.filtered.mutate
      • gds.alpha.nodeSimilarity.filtered.stream
      • gds.alpha.nodeSimilarity.filtered.write
      • gds.alpha.nodeSimilarity.filtered.stats
    • Added new similarity variant Filtered KNN to alpha tier, accepting source and target node filters
      • gds.alpha.knn.filtered.mutate
      • gds.alpha.knn.filtered.stream
      • gds.alpha.knn.filtered.write
      • gds.alpha.knn.filtered.stats
    • Added new procedures for delta stepping:
      • gds.allShortestPaths.delta.stats
      • gds.allShortestPaths.delta.stats.estimate
    • Added new procedures for BFS:
      • gds.bfs.stats
      • gds.bfs.stats.estimate
    • Added Node Regression Pipelines with the following procedures
      • gds.alpha.pipeline.nodeRegression.create
      • gds.alpha.pipeline.nodeRegression.configureAutoTuning
      • gds.alpha.pipeline.nodeRegression.configureSplit
      • gds.alpha.pipeline.nodeRegression.addLinearRegression
      • gds.alpha.pipeline.nodeRegression.addRandomForest
      • gds.alpha.pipeline.nodeRegression.addNodeProperty
      • gds.alpha.pipeline.nodeRegression.selectFeatures
      • gds.alpha.pipeline.nodeRegression.train
      • gds.alpha.pipeline.nodeRegression.predict.stream
      • gds.alpha.pipeline.nodeRegression.predict.mutate
    • Autotuning Support for Machine Learning Pipelines:
      • Added new procedures gds.alpha.pipeline.[nodeClassification|nodeRegression|linkPrediction].configureAutoTuning.
      • Added syntax to specify ranges for parameters in gds.alpha.pipeline.[linkPrediction|nodeClassification|nodeRegression].addRandomForest, gds.beta.pipeline.[linkPrediction|nodeClassification].addLogisticRegression, and gds.alpha.pipeline.nodeRegression.addLinearRegression.
    • Additional Machine Learning Pipeline Functionality:
      • Exposed learningRate for the LogisticRegression models, which can be added using gds.beta.pipeline.[nodeClassification|linkPrediction].addLogisticRegression
      • Exposed minLeafSize for RandomForest models, which can be added using gds.alpha.pipeline.[nodeClassification|linkPrediction].addRandomForest
      • Exposed criterion for RandomForestClassification models, which can be added using gds.alpha.pipeline.[nodeClassification|linkPrediction].addRandomForest. Also added support for the ENTROPY impurity criterion.
      • Updated structure of modelSelectionStats yield in gds.beta.pipeline.[linkPrediction|nodeClassification].train.
      • Support OUT_OF_BAG_ERROR metric in gds.beta.pipeline.[linkPrediction|nodeClassification].train, which applies only to RandomForest models.
      • Expose batchesPerIteration in gds.beta.graphSage.train to configure the number of batches considered per iteration.
    • Cypher Aggregation now accepts any INTEGER value for source and target nodes
    • Added ShardedIdMap which adds support for external node ids ranging from 0 to Long.MAX_VALUE.
      • The id map is disabled by default and can be enabled via feature toggle USE_SHARDED_ID_MAP.
    • Added procedures for exporting graph properties to the alpha tier
      • gds.alpha.graph.streamGraphProperty
      • gds.alpha.graph.removeGraphProperty
    • Exposed a new string config parameter jobId for graph projection and algorithm procedures, which allows for easier tracking of a job via e.g. gds.beta.listProgress.
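
    As an illustration of the autotuning entries above, the Python sketch below assembles Cypher calls that give a parameter as a range and then limit the number of trials. It assumes the `{range: [low, high]}` map form and the `maxTrials` key as described in the GDS manual; the pipeline name and all values are hypothetical, and nothing is executed against a database.

```python
def range_of(low, high):
    """Express a tunable parameter as a range instead of a fixed value."""
    if low > high:
        raise ValueError("range bounds must be ordered")
    # Assumed map form for ranges; check the GDS manual for your version.
    return {"range": [low, high]}

def autotuning_calls(pipeline_name):
    """Return (query, params) pairs; nothing is sent to a database here."""
    return [
        (
            "CALL gds.alpha.pipeline.nodeRegression.addRandomForest($pipeline, $config)",
            {
                "pipeline": pipeline_name,
                # maxDepth is searched within a range; the tree count stays fixed.
                "config": {"maxDepth": range_of(2, 10), "numberOfDecisionTrees": 50},
            },
        ),
        (
            "CALL gds.alpha.pipeline.nodeRegression.configureAutoTuning($pipeline, $config)",
            # maxTrials is assumed to cap the number of model candidates tried.
            {"pipeline": pipeline_name, "config": {"maxTrials": 5}},
        ),
    ]

for query, params in autotuning_calls("my-regression-pipeline"):
    print(query, params)
```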

    Bug fixes

    • Fixed a bug in gds.beta.pipeline.[nodeClassification|linkPrediction].addNodeProperty where gds.beta.graphSage.mutate could not be added.
    • Fixed a bug where the procedures gds.beta.pipeline.linkPrediction.predict.[mutate|stream] threw an error when given the argument initialSampler.
    • Fixed a bug with running Triangle Count on filtered graphs that could cause an ArrayIndexOutOfBounds Error.
    • Fixed a bug where graphSage.train incorrectly reported didConverge as false.
    • Fixed a bug in CollapsePath where a provided nodeFilter would be ignored (GH 194)
    • Fixed a bug in gds.louvain.stream when the consecutiveIds parameter was enabled.
    • Fixed a bug in RandomWalk where not consuming all stream results could lead to a state where GDS would become unable to run further procedures

    Improvements

    • When a query is failed by the memory guard, information is logged as well as sent to the user in the raised exception.
    • Added new methods to Pregel contexts which allow translating between internal and original node id space.
    • Machine learning pipelines
      • gds.beta.pipeline.[nodeClassification|linkPrediction].train.estimate now takes memory usage of random forest training into account when applicable.
      • gds.beta.pipeline.[nodeClassification|linkPrediction].predict.[mutate|stream|write].estimate now take random forest prediction memory overhead into account.
      • Improve early validation of graph and prediction pipeline in gds.beta.pipeline.[nodeClassification|linkPrediction].predict.
      • Improve memory estimation for gds.beta.pipeline.[nodeClassification|linkPrediction].train.estimate.
      • Improve memory estimation in gds.beta.pipeline.linkPrediction.train.estimate.
      • Add training method specific debug le...

    2.1.0-Preview

    02 Jun 21:49
    Pre-release

    GDS 2.1.0-preview is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.7

    Breaking Changes

    • Removed the redundant information of parameter space and split config from the info of the models trained by gds.beta.pipeline.[nodeClassification|linkPrediction].train. The information is now accessible only via the Pipeline Catalog.
    • Removed the label parameter from gds.graph.removeNodeProperties.
    • Supported config parameters are timeoutInSeconds and concurrency

    New Features

    • (Enterprise Only) Apache Arrow and Flight RPC can now be used to improve certain import and export tasks:
    • New Algorithm: K-Means Clustering. Added the following procedures:
      • gds.alpha.kmeans.mutate
      • gds.alpha.kmeans.stats
      • gds.alpha.kmeans.stream
    • New Algorithm: Leiden. Added the following procedures:
      • gds.alpha.leiden.mutate
      • gds.alpha.leiden.stats
      • gds.alpha.leiden.stream
    • Added new similarity variant Filtered Node Similarity to alpha tier, accepting source and target node filters
      • gds.alpha.nodeSimilarity.filtered.mutate
      • gds.alpha.nodeSimilarity.filtered.stream
      • gds.alpha.nodeSimilarity.filtered.write
    • Added new similarity variant Filtered KNN to alpha tier, accepting source and target node filters
      • gds.alpha.knn.filtered.mutate
      • gds.alpha.knn.filtered.stream
    • Added new procedures for delta stepping:
      • gds.allShortestPaths.delta.stats
      • gds.allShortestPaths.delta.stats.estimate
    • Added new procedures for BFS:
      • gds.bfs.stats
      • gds.bfs.stats.estimate
    • Added Node Regression Pipelines with the following procedures
      • gds.alpha.pipeline.nodeRegression.create
      • gds.alpha.pipeline.nodeRegression.configureAutoTuning
      • gds.alpha.pipeline.nodeRegression.configureSplit
      • gds.alpha.pipeline.nodeRegression.addLinearRegression
      • gds.alpha.pipeline.nodeRegression.addRandomForest
      • gds.alpha.pipeline.nodeRegression.addNodeProperty
      • gds.alpha.pipeline.nodeRegression.selectFeatures
      • gds.alpha.pipeline.nodeRegression.train
      • gds.alpha.pipeline.nodeRegression.predict.stream
      • gds.alpha.pipeline.nodeRegression.predict.mutate
    • Autotuning Support for Machine Learning Pipelines:
      • Added new procedures gds.alpha.pipeline.[nodeClassification|nodeRegression|linkPrediction].configureAutoTuning.
      • Added syntax to specify ranges for parameters in gds.alpha.pipeline.[linkPrediction|nodeClassification|nodeRegression].addRandomForest, gds.beta.pipeline.[linkPrediction|nodeClassification].addLogisticRegression, and gds.alpha.pipeline.nodeRegression.addLinearRegression.
    • Additional Machine Learning Pipeline Functionality:
      • Exposed learningRate for the LogisticRegression models, which can be added using gds.beta.pipeline.[nodeClassification|linkPrediction].addLogisticRegression
      • Exposed minLeafSize for RandomForest models, which can be added using gds.alpha.pipeline.[nodeClassification|linkPrediction].addRandomForest
      • Exposed criterion for RandomForestClassification models, which can be added using gds.alpha.pipeline.[nodeClassification|linkPrediction].addRandomForest. Also added support for the ENTROPY impurity criterion.
      • Updated structure of modelSelectionStats yield in gds.beta.pipeline.[linkPrediction|nodeClassification].train.
      • Support OUT_OF_BAG_ERROR metric in gds.beta.pipeline.[linkPrediction|nodeClassification].train, which applies only to RandomForest models.
      • Expose batchesPerIteration in gds.beta.graphSage.train to configure the number of batches considered per iteration.
    • Cypher Aggregation now accepts any INTEGER value for source and target nodes
    • Added ShardedIdMap which adds support for external node ids ranging from 0 to Long.MAX_VALUE.
      • The id map is disabled by default and can be enabled via feature toggle USE_SHARDED_ID_MAP.
    • Added procedures for exporting graph properties to the alpha tier
      • gds.alpha.graph.streamGraphProperty
      • gds.alpha.graph.removeGraphProperty
    • Exposed a new string config parameter jobId for graph projection and algorithm procedures, which allows for easier tracking of a job via e.g. gds.beta.listProgress.
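
    To illustrate the jobId parameter described above, the Python sketch below tags an algorithm configuration with an explicit job id and builds a matching progress lookup. It assumes that gds.beta.listProgress accepts the job id as its argument; the helper names, config keys, and job id are hypothetical, and no database call is made.

```python
import uuid

def with_job_id(config, job_id=None):
    """Return a copy of an algorithm config that carries an explicit jobId."""
    tagged = dict(config)
    # Fall back to a random UUID string when the caller supplies no id.
    tagged["jobId"] = job_id or str(uuid.uuid4())
    return tagged

def progress_query(job_id):
    """Build a progress lookup for one job (assumed listProgress argument)."""
    return "CALL gds.beta.listProgress($jobId)", {"jobId": job_id}

# Hypothetical algorithm config and job id.
config = with_job_id({"maxIterations": 20}, job_id="nightly-pagerank")
query, params = progress_query(config["jobId"])
print(config)
print(query, params)
```

Choosing the id up front means the same string can be used to look up progress from another session while the job is still running.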

    Bug fixes

    • Fixed a bug in gds.beta.pipeline.[nodeClassification|linkPrediction].addNodeProperty where gds.beta.graphSage.mutate could not be added.
    • Fixed a bug where the procedures gds.beta.pipeline.linkPrediction.predict.[mutate|stream] threw an error when given the argument initialSampler.
    • Fixed a bug with running Triangle Count on filtered graphs that could cause an ArrayIndexOutOfBounds Error.
    • Fixed a bug where graphSage.train incorrectly reported didConverge as false.
    • Fixed a bug in CollapsePath where a provided nodeFilter would be ignored.
    • Fixed a bug in gds.louvain.stream when the consecutiveIds parameter was enabled.
    • Fixed a bug in RandomWalk where not consuming all stream results could lead to a state where GDS would become unable to run further procedures

    Improvements

    • When a query is failed by the memory guard, information is logged as well as sent to the user in the raised exception.
    • Machine learning pipelines
      • gds.beta.pipeline.[nodeClassification|linkPrediction].train.estimate now takes memory usage of random forest training into account when applicable.
      • gds.beta.pipeline.[nodeClassification|linkPrediction].predict.[mutate|stream|write].estimate now take random forest prediction memory overhead into account.
      • Improve early validation of graph and prediction pipeline in gds.beta.pipeline.[nodeClassification|linkPrediction].predict.
      • Improve memory estimation for gds.beta.pipeline.[nodeClassification|linkPrediction].train.estimate.
      • Improve memory estimation in gds.beta.pipeline.linkPrediction.train.estimate.
      • Add training method specific debug level logging during the model selection phase of gds.beta.pipeline.linkPrediction.train, gds.beta.pipeline.nodeClassification.train and gds.alpha.pipeline.nodeRegression.train.
      • Improved logging in Link Prediction and Node Classification training.
      • Reduced computational complexity and constant overhead of random forest training, added via gds.alpha.pipeline.[linkPrediction|nodeClassification].addRandomFor...