Conversation

@antas-marcin antas-marcin (Contributor) commented Nov 17, 2025

This PR switches from the Weaviate Java v5 client to the latest client6.

@antas-marcin antas-marcin requested a review from a team as a code owner November 17, 2025 15:44
@orca-security-eu orca-security-eu bot left a comment

Orca Security Scan Summary

| Status | Check | Issues by priority |
| --- | --- | --- |
| Passed | Infrastructure as Code | high 0, medium 0, low 0, info 0 |
| Passed | SAST | high 0, medium 0, low 0, info 0 |
| Passed | Secrets | high 0, medium 0, low 0, info 0 |
| Passed | Vulnerabilities | high 0, medium 0, low 0, info 0 |

@antas-marcin antas-marcin force-pushed the client6-java-switch branch 3 times, most recently from e43d19b to fbbff3e on November 17, 2025 16:46
@antas-marcin antas-marcin changed the title from "Switch to Weaviate Java Client6" to "Switch spark connector to use Java client6" on Nov 17, 2025
@antas-marcin antas-marcin changed the title from "Switch spark connector to use Java client6" to "Switch spark connector to use Weaviate Java client6" on Nov 17, 2025
@antas-marcin antas-marcin force-pushed the client6-java-switch branch 2 times, most recently from 54875c1 to 694010a on November 26, 2025 12:02
assert(weaviateObject.getProperties.get("content") == "Sam")
assert(weaviateObject.getProperties.get("wordCount") == 5)
assert(weaviateObject.getTenant == "TenantA")
assert(weaviateObject.properties().get("title").equals("Sam"))

Suggested change
assert(weaviateObject.properties().get("title").equals("Sam"))
assert(weaviateObject.properties().get("title") == ("Sam"))

Not sure about Scala, but in Java you actually want to compare strings with .equals, because it compares the contents themselves, whereas == compares object references (which can differ even when the contents are equal).

Never mind, it seems Scala doesn't have that catch (== delegates to equals there) 👍
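For anyone reading along, the Java-side pitfall mentioned above can be reproduced with a minimal sketch like the one below. The class and method names (`StringEqualityDemo`, `sameReference`, `sameContents`) are made up for illustration and are not part of the connector or the client.

```java
// Illustrative only: shows why Java code compares strings with equals().
final class StringEqualityDemo {

    // Reference comparison: true only if both variables point at the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "Sam";
        // new String(...) always allocates a distinct object with equal contents.
        String built = new String("Sam");

        System.out.println(sameReference(literal, built)); // prints "false"
        System.out.println(sameContents(literal, built));  // prints "true"
    }
}
```

In Scala, `==` on references calls `equals` (with a null check), which is why the suggested change above behaves the same as the `.equals` version.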

@g-despot g-despot left a comment

Looks good

build.sbt Outdated
lazy val scalaCollectionCompatVersion = "2.13.0"
lazy val sparkVersion = "4.0.1"
lazy val grpcNettyShadedVersion = "1.76.0"
lazy val weaviateClient6Version = "6.0.0-RC2"

We should be able to use 6.0.0 now ✨

Comment on lines +19 to +20
if (result.isEmpty) throw WeaviateClassNotFoundError(s"Collection ${className} was not found.")
val properties = result.get().properties().asScala

suggestion: if you want to lean into Optional syntax hard

Suggested change
if (result.isEmpty) throw WeaviateClassNotFoundError(s"Collection ${className} was not found.")
val properties = result.get().properties().asScala
val properties = result
.map(c => c.properties().asScala)
.orElseThrow(() => WeaviateClassNotFoundError(s"Collection ${className} was not found."))
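The Optional-chaining shape suggested here can be sketched in plain Java as follows. Everything in this block is a stand-in: `OptionalLookupDemo`, `CollectionNotFoundException`, `findProperties`, and `sampleSchema` are hypothetical names, and a `Map` plays the role of the schema lookup; none of them come from client6.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative only: map-then-orElseThrow instead of isEmpty-check-then-get.
final class OptionalLookupDemo {

    // Hypothetical error type standing in for the connector's not-found error.
    static final class CollectionNotFoundException extends RuntimeException {
        CollectionNotFoundException(String message) {
            super(message);
        }
    }

    // Hypothetical schema: collection name -> property names.
    static Map<String, List<String>> sampleSchema() {
        return Collections.singletonMap("Article", Arrays.asList("title", "wordCount"));
    }

    // Hypothetical lookup that returns an empty Optional for unknown collections.
    static Optional<List<String>> findProperties(Map<String, List<String>> schema, String name) {
        return Optional.ofNullable(schema.get(name));
    }

    // Transform the value if present, otherwise throw; no explicit isEmpty()/get() pair.
    static List<String> propertiesOrThrow(Map<String, List<String>> schema, String name) {
        return findProperties(schema, name)
            .map(Collections::unmodifiableList)
            .orElseThrow(() -> new CollectionNotFoundException(
                "Collection " + name + " was not found."));
    }

    public static void main(String[] args) {
        System.out.println(propertiesOrThrow(sampleSchema(), "Article"));
    }
}
```

The supplier passed to `orElseThrow` only runs when the Optional is empty, so the error message is not built on the happy path.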

Comment on lines 111 to 113
if (weaviateOptions.id == null) {
builder.id(java.util.UUID.randomUUID.toString)
builder.uuid(java.util.UUID.randomUUID.toString)
}

suggestion: this if-block is not strictly necessary

I believe this was originally added because BatchInsert over gRPC does not allow the client to leave out the UUID (unlike REST-based inserts).

It's something I also stumbled upon while working on client6, so in the new version every WeaviateObject gets a random UUID by default.
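As a rough sketch of the "random UUID by default" behaviour described above, a builder can assign the ID up front so callers never need the null check. `UuidDefaultDemo`, `Item`, and `ItemBuilder` are hypothetical names for illustration; this is not the client6 `WeaviateObject` implementation.

```java
import java.util.UUID;

// Illustrative only: a builder that defaults the UUID unless the caller overrides it.
final class UuidDefaultDemo {

    // Hypothetical stand-in for an object that is about to be inserted.
    static final class Item {
        final String uuid;

        private Item(String uuid) {
            this.uuid = uuid;
        }
    }

    static final class ItemBuilder {
        // Assigned when the builder is created, so build() always has an ID.
        private String uuid = UUID.randomUUID().toString();

        // An explicit ID replaces the random default.
        ItemBuilder uuid(String uuid) {
            this.uuid = uuid;
            return this;
        }

        Item build() {
            return new Item(uuid);
        }
    }

    public static void main(String[] args) {
        System.out.println(new ItemBuilder().build().uuid);
    }
}
```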

Comment on lines +115 to 126
val allVectors = ListBuffer.empty[Vectors]
if (vector != null) {
allVectors += Vectors.of(vector)
}
if (vectors.nonEmpty) {
builder.vectors(vectors.map { case (key, arr) => key -> arr.map(Float.box) }.asJava)
val arr = vectors.map { case (key, arr) => Vectors.of(key, arr) }.toArray
allVectors ++= arr
}
if (multiVectors.nonEmpty) {
builder.multiVectors(multiVectors.map { case (key, multiVector) => key -> multiVector.map { vec => { vec.map(Float.box) }} }.toMap.asJava)
val arr = multiVectors.map { case (key, multiVector) => Vectors.of(key, multiVector) }.toArray
allVectors ++= arr
}

note: there's now a Vectors::withVectors method that should let you do sth like:

var vectors = new Vectors()

if (vector != null) {
  vectors = vectors.withVectors(Vectors.of(vector))
}

if (vectors.nonEmpty) {
  // ... so on ...
}

But if the list-syntax works well for you there's also no real need to change it, bc withVectors creates more intermediate objects (more GC work).
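To make the trade-off concrete, here is a sketch of the two accumulation styles, using a hypothetical immutable `NamedVectors` class with a copy-on-`with` method. It only mimics the shape of an immutable "with" API and makes no claim about how client6's `Vectors` is implemented; strings stand in for the actual vectors.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: mutable list accumulation vs. immutable copy-on-"with".
final class VectorAccumulationDemo {

    // Hypothetical immutable container; each with() call allocates a new instance.
    static final class NamedVectors {
        final List<String> names;

        NamedVectors(List<String> names) {
            this.names = Collections.unmodifiableList(new ArrayList<>(names));
        }

        NamedVectors with(String name) {
            List<String> copy = new ArrayList<>(names);
            copy.add(name);
            return new NamedVectors(copy);
        }
    }

    // List style: one mutable buffer, no intermediate containers.
    static List<String> collectWithList(String defaultName, List<String> named) {
        List<String> all = new ArrayList<>();
        if (defaultName != null) {
            all.add(defaultName);
        }
        all.addAll(named);
        return all;
    }

    // "with" style: same result, but one short-lived container per added entry.
    static NamedVectors collectWithCopies(String defaultName, List<String> named) {
        NamedVectors acc = new NamedVectors(Collections.emptyList());
        if (defaultName != null) {
            acc = acc.with(defaultName);
        }
        for (String name : named) {
            acc = acc.with(name);
        }
        return acc;
    }
}
```

Both helpers end up with the same entries in the same order; the difference is only in how many intermediate objects are created along the way.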

dt = p.dataTypes().get(0)
}
})
if ((dt == "geoCoordinates" || dt == "phoneNumber") && valueFromField.isInstanceOf[String]) {

suggestion: you can also use data type names from the library itself, e.g. DataType.GEO_COORDINATES or DataType.PHONE_NUMBER
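The idea behind this suggestion, comparing against named constants rather than repeating string literals, can be sketched as below. The constants here are declared locally and are hypothetical; their values are just the literals from the reviewed line, and whether the library's `DataType` members have the same type or values is an assumption that would need checking against client6.

```java
// Illustrative only: named constants instead of repeated string literals.
final class DataTypeCheckDemo {

    // Hypothetical local constants; values copied from the reviewed condition.
    static final String GEO_COORDINATES = "geoCoordinates";
    static final String PHONE_NUMBER = "phoneNumber";

    // Mirrors the reviewed condition: a structured data type whose value arrived as a String.
    static boolean isStructuredTypeGivenAsString(String dataType, Object value) {
        boolean structured = GEO_COORDINATES.equals(dataType) || PHONE_NUMBER.equals(dataType);
        return structured && value instanceof String;
    }

    public static void main(String[] args) {
        System.out.println(isStructuredTypeGivenAsString("geoCoordinates", "{\"latitude\": 1.0}"));
    }
}
```

A misspelled constant name fails at compile time, whereas a misspelled literal silently never matches.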


override def abort(): Unit = {
// TODO rollback previously written batch results if issue occured
// TODO rollback previously written batch results if issue occurred

question: Is that something Weaviate supports? Rolling things back?

Comment on lines +65 to +82
val moreNested = new Property.Builder("moreNested", DataType.OBJECT).
nestedProperties(Property.text("a"), Property.number("b")).build()

val nestedObjects = new Property.Builder("nestedObjects", DataType.OBJECT_ARRAY)
.nestedProperties(
Property.bool("nestedBoolLvl2"),
Property.date("nestedDateLvl2"),
Property.numberArray("nestedNumbersLvl2"),
moreNested
).build()

new Property.Builder("objectProperty", dataType)
.nestedProperties(
Property.integer("nestedInt"),
Property.number("nestedNumber"),
Property.text("nestedText"),
nestedObjects
).build()

suggestion: you should be able to do

  Property.object("objectProperty", obj -> obj
    .nestedProperties(
      ...,
      Property.numberArray("nestedNumbersLvl2"),
      Property.object("moreNested", more -> more
        .nestedProperties(
          Property.text("a"),
          Property.number("b")
        )
      )
    )
  )

4 participants