From 8e7d3a96e442d8b75eb8c7df4bc654c6b0c816cc Mon Sep 17 00:00:00 2001 From: sbespalov Date: Wed, 18 Mar 2020 10:18:02 +0700 Subject: [PATCH 1/7] issues/1649 Getting started with persistence Documentation --- .../getting-started-with-persistence.md | 218 ++++-------------- 1 file changed, 40 insertions(+), 178 deletions(-) diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md index db9d3c90..bdeff32e 100644 --- a/docs/developer-guide/getting-started-with-persistence.md +++ b/docs/developer-guide/getting-started-with-persistence.md @@ -4,61 +4,36 @@ This page contains explanations and code samples for developers who need to store their entities into the database. -The Strongbox project uses [OrientDB](http://orientdb.com/orientdb/) as its internal persistent storage through the -corresponding `JPA` implementation and `spring-orm` middle tier. Also we use `JTA` for transaction management and -`spring-tx` implementation module from Spring technology stack. +The Strongbox project uses [JanusGraph](https://janusgraph.org/) as its internal persistent storage through the +corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. Also we use `JTA` for transaction management and `spring-tx` implementation module from Spring technology stack. -## OrientDB Studio +## Persistence stack -As you are learning about Strongbox persistence, you may want to explore the existing persistence implementation. -For development environments, Strongbox includes an embedded OrientDB server as well as an embedded instance of -OrientDB Studio. By default, when you run the application from the source tree, you'll use the embedded database -server. However, OrientDB Studio is disabled by default. 
+We using following technology stack to deal with persistence: -### Running OrientDB Studio From Source Tree + - Embedded Cassandra as direct storage (`CassandraDaemon` allows to have the Cassandra instance inside same JVM as the application) + - JanusGraph as Graph DBMS (it is not directly a data storage, it just allows you to have access to data in the form of a graph) + - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database (mainly Gremlin meant here) + - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries means `@org.springframework.data.neo4j.annotation.Query` annotation) + - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevents us to use it for `neo4j-ogm` CRUD operations, these issues will be explained below) + - [neo4j-ogm](https://github.com/neo4j/neo4j-ogm) to map Java POJOs into Vertices and Edges of Graph + - we also use custom `EntityTraversalAdapters`, which implements anonimous Gremlin traversals for CRUD operations under `neo4j-ogm` entities. -To enable OrientDB Studio, you need only to set the property `strongbox.orientdb.studio.enabled` to `true`. 
You -can do this on the Maven command line by running Strongbox as follows: +# Vertices and Edges -``` -$ mvn spring-boot:run -Dspring-boot.run.jvmArguments="-Dstrongbox.orientdb.studio.enabled=true" -``` - -There are two additional properties that can be used to configure OrientDB Studio: - -- `strongbox.orientdb.studio.ip.address` -- `strongbox.orientdb.studio.port` - -### Running OrientDB Studio From The Distribution - -If you're running from the `tar.gz`, or `rpm` distributions, you can start Strongbox as follows to enable OrientDB Studio: - -``` -$ cd /opt/strongbox -$ STRONGBOX_VAULT=/opt/strongbox-vault STRONGBOX_ORIENTDB_STUDIO_ENABLED=true ./bin/strongbox console -``` - -Please, note that the `STRONGBOX_VAULT` environment variable needs to be pointing to an absolute path for this to work. +Unlike a Relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So in terms of Graph every persistent entity should be stored as Vertex or Edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by separate edge instead of just foreign key column in table. In addition to vertices, persistence objects can also be an edges, as an example the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices. -As with the source distribution, you can set additional environment variables to further configure OrientDB Studio: +# Issues of `cypher-for-gremlin` and `neo4j-ogm` -``` -$ export STRONGBOX_ORIENTDB_STUDIO_IP_ADDRESS=0.0.0.0 -$ export STRONGBOX_ORIENTDB_STUDIO_PORT=2480 -``` - -Once the application is running, you can login to OrientDB Studio by visiting -http://127.0.0.1:2480/studio/index.html in your browser. The initial credentials are `admin` and `password`. +First issue was the fact that `cypher-for-gremlin` not fully suport all Cypher syntax that produced by `neo4j-ogm` for CRUD operations. 
In more detail on every CRUD operation `neo4j-ogm` generate Cypher query which then translates into Gremlin by `cypher-for-gremlin`. As a workadound we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`). -![Login Screen](/assets/screenshots/orientdb-studio/login-screen.png) +Another issue is that `cypher-for-gremlin` have some doubtful concept to work with `null` values in Gremlin. They put a lot of noisy tokens into Gremlin traversals which prevents JanusGraph engine to match expected indexes, this causes heavy fullscan on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This was the main reason of why we can't use `neo4j-ogm` for CRUD operations. +Anyway we still using it for custom Cypher queries with `@org.springframework.data.neo4j.annotation.Query` annotation. This is good option to have Cypher queries instead of Gremlin because it looks more clear and takes less time to read existing and write new queries. -After your login, you'll land on the Browse Screen, which allows you to query the embedded database. +## Gremlin Server -![Browse Screen](/assets/screenshots/orientdb-studio/browse-screen.png) +`TODO` -Finally, you can explore the schema defined in the database by clicking `SCHEMA`. - -![Schema Screen](/assets/screenshots/orientdb-studio/schema-screen.png) ## Adding Dependencies @@ -75,173 +50,60 @@ following code snippet to your module's `pom.xml` under the `` sec ``` -Notice that there is no need to define any direct dependencies on OrientDB or Spring Data - it's already done via +Notice that there is no need to define any direct dependencies on JanusGraph or Spring Data - it's already done via the `strongbox-data-service` module. ## Creating Your Entity Class Let's now assume that you have a POJO and you need to save it to the database (and that you probably have at least -CRUD operation's implemented in it as well). 
Place your code under the `org.carlspring.strongbox.domain.yourstuff` -package. For the sake of the example, let's pick `MyEntity` as the name of your entity. +CRUD operation's implemented in it as well). Place your code under the `org.carlspring.strongbox.domain` +package. For the sake of the example, let's pick `PetEntity` as the name of your entity. If you want to store that entity properly you need to adopt the following rules: -* Extend the `org.carlspring.strongbox.data.domain.GenericEntity` class to inherit all required fields and logic from - the superclass. -* Define getters and setters according to the `JavaBeans` coding convention for all non-transient properties in your - class. -* Define a default empty constructor for safety (even if the compiler will create one for you, if you don't define any - other constructors) and follow the `JPA` and `java.io.Serializable` standards. -* Override the `equals() `and `hashCode()` methods according to java `hashCode` contract (because your entity could be - used in collection classes such as `java.util.Set` and if you don't define such methods properly other developers or - yourself will be not able to use your entity). -* _Optional_ - define a `toString()` implementation to let yourself and other developers see something meaningful in - the debug messages. +* Create the interface for your entity with all getters and setters that required to interact with the entity according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. The need for an interface is due to hide the implementation specific to underlying database, such as inheritance strategy. +* Create the entity class which implements the above interface and have the `org.carlspring.strongbox.data.domain.DomainEntity` as the superclass. +* Define a default empty constructor, this would need to create entity instance from `neo4j-ogm` internals. 
The complete source code example that follows all requirements should look something like this: ```java package org.carlspring.strongbox.domain; -import org.carlspring.strongbox.data.domain.GenericEntity; - -import com.google.common.base.Objects; - -public class MyEntity - extends GenericEntity +public class PetEntity + extends DomainEntity + implements Pet { - private String property; - - public MyEntity() - { - } - - public String getProperty() - { - return property; - } + private Integer age; - public void setProperty(String property) + public PetEntity() { - this.property = property; } @Override - public boolean equals(Object o) + public Integer getAge() { - if (this == o) - { - return true; - } - if (o == null || getClass() != o.getClass()) - { - return false; - } - - MyEntity myEntity = (MyEntity) o; - - return Objects.equal(property, myEntity.property); + return age; } @Override - public int hashCode() + public void setAge(Integer age) { - return Objects.hashCode(property); + this.age = age; } - @Override - public String toString() - { - final StringBuilder sb = new StringBuilder("MyEntity{"); - sb.append("property='").append(property).append('\''); - sb.append('}'); - - return sb.toString(); - } } ``` -## Creating a DAO Layer - -First of all you will need to extend the `CrudService` with the second type parameter that corresponds to your ID's data type. Usually it's just strings. - - -!!! tip "To read more about ID's in OrientDB, check the manual" - -```java -package org.carlspring.strongbox.users.service; - -import org.carlspring.strongbox.data.service.CrudService; -import org.carlspring.strongbox.users.domain.MyEntity; - -import org.springframework.transaction.annotation.Transactional; - -/** - * CRUD service for managing {@link MyEntity} entities. 
- * - * @author Alex Oreshkevich - */ -@Transactional -public interface MyEntityService - extends CrudService -{ - - MyEntity findByProperty(String property); +# Gremlin Repositories -} -``` +As mentioned above besides `neo4j-ogm` we were forced to have custom CRUD implementation based on Gremlin. This has its advantages as it allow for us to optimize OGM entities and make them faster then common `neo4j-ogm` provide out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update, read/delete operations. The concrete `EntityTraversalAdapter` provide anonymous traversals for each of the operations on specific entity type. These traversals used in Gremlin based repositories to perform common CRUD operations. The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations. -After that you will need to define an implementation of your service class. - -Follow these rules for the service implementation: - -* Inherit your CRUD service from `CommonCrudService` class; -* Name it like your service interface with an `Impl` suffix, for example `MyEntityServiceImpl`; -* Annotate your class with the Spring `@Service` and `@Transactional` annotations; -* Do **not** define your service class as public and use interface instead of class for injection (with `@Autowired`); - this follows the best practice principles from Joshua Bloch 'Effective Java' book called Programming to Interface; -* _Optional_ - define any methods you need to work with your `MyEntity` class; these methods mostly should be based on - common API form `javax.persistence.EntityManager`, or custom queries (see example below); - -* !!! warning "Avoid query parameters construction through string concatenation!" - Please avoid using query parameter construction through string concatenation! - This usually leads to [SQL Injection](https://en.wikipedia.org/wiki/SQL_injection) issues! 
- Bad query example: - `String sQuery = "select * from MyEntity where proprety='" + propertyValue + "'"`; - What you should do instead is to create a service which does properly assigns the parameters. - Here's an example service: - ```java - @Transactional - public class MyEntityServiceImpl - extends CommonCrudService implements MyEntityService - { - public MyEntity findByProperty(String property) - { - String sQuery = "select * from MyEntity where property = :propertyValue"; - - OSQLSynchQuery oQuery = new OSQLSynchQuery(sQuery); - oQuery.setLimit(1); - - HashMap params = new HashMap(); - params.put("propertyValue", property); - - List resultList = getDelegate().command(oQuery).execute(params); - return !resultList.isEmpty() ? resultList.iterator().next() : null; - } - } - ``` - -## Register entity schema in EntityManager -Before using entities you will need to register them. Consider the following example: +## Creating a `EntityTraversalAdapter` +`TODO` -```java -@Inject -private OEntityManager oEntityManager; +## Creating a `Repository` +`TODO` -@PostConstruct -public void init() -{ - oEntityManager.registerEntityClass(MyEntity.class); -} ``` From bac3f727e78cab8d7b0148b2f77367380d161808 Mon Sep 17 00:00:00 2001 From: sbespalov Date: Wed, 18 Mar 2020 15:30:23 +0700 Subject: [PATCH 2/7] issues/1649 Getting started with persistence Documentation --- .../getting-started-with-persistence.md | 158 ++++++++++++++++-- 1 file changed, 142 insertions(+), 16 deletions(-) diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md index bdeff32e..485fb244 100644 --- a/docs/developer-guide/getting-started-with-persistence.md +++ b/docs/developer-guide/getting-started-with-persistence.md @@ -13,7 +13,7 @@ We using following technology stack to deal with persistence: - Embedded Cassandra as direct storage (`CassandraDaemon` allows to have the Cassandra instance inside same JVM as the application) - 
JanusGraph as Graph DBMS (it is not directly a data storage, it just allows you to have access to data in the form of a graph) - - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database (mainly Gremlin meant here) + - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries means `@org.springframework.data.neo4j.annotation.Query` annotation) - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevents us to use it for `neo4j-ogm` CRUD operations, these issues will be explained below) - [neo4j-ogm](https://github.com/neo4j/neo4j-ogm) to map Java POJOs into Vertices and Edges of Graph @@ -23,13 +23,6 @@ We using following technology stack to deal with persistence: Unlike a Relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So in terms of Graph every persistent entity should be stored as Vertex or Edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by separate edge instead of just foreign key column in table. In addition to vertices, persistence objects can also be an edges, as an example the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices. -# Issues of `cypher-for-gremlin` and `neo4j-ogm` - -First issue was the fact that `cypher-for-gremlin` not fully suport all Cypher syntax that produced by `neo4j-ogm` for CRUD operations. 
In more detail on every CRUD operation `neo4j-ogm` generate Cypher query which then translates into Gremlin by `cypher-for-gremlin`. As a workadound we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`). - -Another issue is that `cypher-for-gremlin` have some doubtful concept to work with `null` values in Gremlin. They put a lot of noisy tokens into Gremlin traversals which prevents JanusGraph engine to match expected indexes, this causes heavy fullscan on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This was the main reason of why we can't use `neo4j-ogm` for CRUD operations. -Anyway we still using it for custom Cypher queries with `@org.springframework.data.neo4j.annotation.Query` annotation. This is good option to have Cypher queries instead of Gremlin because it looks more clear and takes less time to read existing and write new queries. - ## Gremlin Server `TODO` @@ -56,13 +49,14 @@ the `strongbox-data-service` module. ## Creating Your Entity Class Let's now assume that you have a POJO and you need to save it to the database (and that you probably have at least -CRUD operation's implemented in it as well). Place your code under the `org.carlspring.strongbox.domain` +CRUD operations implemented in it as well). Place your code under the `org.carlspring.strongbox.domain` package. For the sake of the example, let's pick `PetEntity` as the name of your entity. If you want to store that entity properly you need to adopt the following rules: -* Create the interface for your entity with all getters and setters that required to interact with the entity according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. The need for an interface is due to hide the implementation specific to underlying database, such as inheritance strategy. 
-* Create the entity class which implements the above interface and have the `org.carlspring.strongbox.data.domain.DomainEntity` as the superclass. +* Create the interface for your entity with all getters and setters that required to interact with the entity, according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. The need for an interface is due to hide the implementation specific details depending on underlying database, such as inheritance strategy. +* Create the entity class which implements the above interface and have the `org.carlspring.strongbox.data.domain.DomainEntity` as the superclass. +* Declare entity class with `@NodeEntity` or `@RelationshipEntity` * Define a default empty constructor, this would need to create entity instance from `neo4j-ogm` internals. The complete source code example that follows all requirements should look something like this: @@ -70,6 +64,7 @@ The complete source code example that follows all requirements should look somet ```java package org.carlspring.strongbox.domain; +@NodeEntity("Pet") public class PetEntity extends DomainEntity implements Pet @@ -96,14 +91,145 @@ public class PetEntity } ``` -# Gremlin Repositories +## Creating a `EntityTraversalAdapter` -As mentioned above besides `neo4j-ogm` we were forced to have custom CRUD implementation based on Gremlin. This has its advantages as it allow for us to optimize OGM entities and make them faster then common `neo4j-ogm` provide out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update, read/delete operations. The concrete `EntityTraversalAdapter` provide anonymous traversals for each of the operations on specific entity type. These traversals used in Gremlin based repositories to perform common CRUD operations. 
The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations. +As mentioned above besides `neo4j-ogm` and `spring-data-neo4j` we were forced to use custom CRUD implementations based on Gremlin. This has its advantages as it allow for us to optimize OGM entities and make them faster then common `neo4j-ogm` provide out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update, read/delete operations. The concrete `EntityTraversalAdapter` provide [Anonymous traversals](http://tinkerpop.apache.org/docs/current/tutorials/gremlins-anatomy/) for each of the operations on specific entity type. These traversals used in Gremlin based repositories to perform common CRUD operations: -## Creating a `EntityTraversalAdapter` -`TODO` +- `fold` to construct entity instance based on Vertex/Edge and it's properties +- `unfold` to extract entity properties into Vertex/Edge and it's properties +- `cascade` to cascade other Vertices/Edges within delete if needed + +The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations. 
+
+Below is a code example of an `EntityTraversalAdapter` implementation for `PetEntity`:
+
+```java
+package org.carlspring.strongbox.gremlin.adapters;
+
+import static org.apache.tinkerpop.gremlin.structure.VertexProperty.Cardinality.single;
+import static org.carlspring.strongbox.gremlin.adapters.EntityTraversalUtils.extractObject;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.tinkerpop.gremlin.process.traversal.Traverser;
+import org.apache.tinkerpop.gremlin.structure.Element;
+import org.apache.tinkerpop.gremlin.structure.Vertex;
+import org.carlspring.strongbox.domain.Pet;
+import org.carlspring.strongbox.domain.PetEntity;
+import org.carlspring.strongbox.gremlin.dsl.EntityTraversal;
+import org.carlspring.strongbox.gremlin.dsl.__;
+import org.springframework.stereotype.Component;
+
+@Component
+public class PetAdapter extends VertexEntityTraversalAdapter
+{
+
+    @Override
+    public Set<String> labels()
+    {
+        return Collections.singleton("Pet");
+    }
+
+    @Override
+    public EntityTraversal fold()
+    {
+        return __.project("uuid", "age")
+                 .by(__.enrichPropertyValue("uuid"))
+                 .by(__.enrichPropertyValue("age"))
+                 .map(this::map);
+    }
+
+    private Pet map(Traverser<Map<String, Object>> t)
+    {
+        PetEntity result = new PetEntity();
+        result.setUuid(extractObject(String.class, t.get().get("uuid")));
+        result.setAge(extractObject(Integer.class, t.get().get("age")));
+
+        return result;
+    }
+
+    @Override
+    public UnfoldEntityTraversal unfold(Pet entity)
+    {
+        EntityTraversal t = __.identity();
+        if (entity.getAge() != null)
+        {
+            t = t.property(single, "age", entity.getAge());
+        }
+
+        return new UnfoldEntityTraversal<>("Pet", t);
+    }
+
+    @Override
+    public EntityTraversal cascade()
+    {
+        return __.identity();
+    }
+
+}
+
+```
 ## Creating a `Repository`
-`TODO`
+
+All the database interactions should be done through repositories. For the compatibility with `spring-data` we use `org.springframework.data.repository.CrudRepository` as a basis for our repositories. 
The base class for implementing `EntityTraversalAdapter`-based repositories is `org.carlspring.strongbox.gremlin.repositories.GremlinRepository`. Further repository implementation depends on the type of entity, for Vertex backed entities it should be `GremlinVertexRepository`.
+In addition to CRUD operations, there is also need the ability to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` and `@org.springframework.data.neo4j.annotation.Query` annotation. So the final repository should be a class that extends `GremlinRepository` and delegates custom `Cypher` queries into `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`.
+
+Putting all above together the repository for the `PetEntity` will looks like below:
+
+```java
+package org.carlspring.strongbox.repositories;
+
+import java.util.List;
+
+import javax.inject.Inject;
+
+import org.carlspring.strongbox.domain.Pet;
+import org.carlspring.strongbox.gremlin.adapters.PetAdapter;
+import org.carlspring.strongbox.gremlin.repositories.GremlinVertexRepository;
+import org.springframework.data.neo4j.annotation.Query;
+import org.springframework.data.repository.query.Param;
+import org.springframework.stereotype.Repository;
+
+@Repository
+public class PetRepository extends GremlinVertexRepository
+        implements PetQueries
+{
+
+    @Inject
+    PetAdapter adapter;
+
+    @Inject
+    PetQueries queries;
+
+    @Override
+    protected PetAdapter adapter()
+    {
+        return adapter;
+    }
+
+    @Override
+    public List<Pet> findByAgeGreater(Integer age)
+    {
+        return queries.findByAgeGreater(age);
+    }
+
+}
+
+@Repository
+interface PetQueries
+        extends org.springframework.data.repository.Repository
+{
+
+    @Query("MATCH (pet:Pet) " +
+           "WHERE pet.age > $age " +
+           "RETURN pet")
+    List<Pet> findByAgeGreater(@Param("age") Integer age);
+
+}
+```
+
+# Issues of `cypher-for-gremlin` and `neo4j-ogm`
+
+The first issue is that `cypher-for-gremlin` does not fully support all the Cypher syntax that `neo4j-ogm` produces for CRUD operations. 
In more detail, on every CRUD operation `neo4j-ogm` generates a Cypher query, which is then translated into Gremlin by `cypher-for-gremlin`. As a workaround, we modify the Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`). + +Another issue is that `cypher-for-gremlin` handles `null` values in Gremlin in a questionable way. It puts a lot of noisy tokens into the Gremlin traversals, which prevents the JanusGraph engine from matching the expected indexes and causes a heavy full scan on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This is the main reason why we can't use `neo4j-ogm` for CRUD operations. +We still use it, however, for custom Cypher queries with the `@org.springframework.data.neo4j.annotation.Query` annotation. Cypher is a good option here because such queries are clearer than Gremlin and take less time to read and write. ``` From ada57b73c71f95fe53f409df88c936a2298cb5ff Mon Sep 17 00:00:00 2001 From: sbespalov Date: Wed, 18 Mar 2020 15:35:02 +0700 Subject: [PATCH 3/7] issues/1649 Getting started with persistence Documentation --- docs/developer-guide/getting-started-with-persistence.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md index 485fb244..346e0e02 100644 --- a/docs/developer-guide/getting-started-with-persistence.md +++ b/docs/developer-guide/getting-started-with-persistence.md @@ -174,7 +174,7 @@ public class PetAdapter extends VertexEntityTraversalAdapter ## Creating a `Repository` All the database interactions should be done through repositories. For the compatibility with `spring-data` we use `org.springframework.data.repository.CrudRepository` as a basis for our repositories. 
The base class for implementing `EntityTraversalAdapter`-based repositories is `org.carlspring.strongbox.gremlin.repositories.GremlinRepository`. Further repository implementation depends on the type of entity, for Vertex backed entities it should be `GremlinVertexRepository`. -In addition to CRUD operations, there is also need the ability to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` and `@org.springframework.data.neo4j.annotation.Query` annotation. So the final repository should be a class that extends `GremlinRepository` and delegates custom `Cypher` queries into `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`. +In addition to CRUD operations, there is also need the ability to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` and `@org.springframework.data.neo4j.annotation.Query` annotation. So the final repository should be a mixin that extends `GremlinRepository` and delegates custom `Cypher` queries into `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`. 
Putting all above together the repository for the `PetEntity` will looks like below: From 2ecd8ab35727fde57a8fa0cdfcfe2918e526e57e Mon Sep 17 00:00:00 2001 From: Martin Todorov Date: Wed, 18 Mar 2020 19:55:54 +0000 Subject: [PATCH 4/7] Update getting-started-with-persistence.md --- .../getting-started-with-persistence.md | 26 +++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md index 346e0e02..5c584a67 100644 --- a/docs/developer-guide/getting-started-with-persistence.md +++ b/docs/developer-guide/getting-started-with-persistence.md @@ -5,23 +5,23 @@ This page contains explanations and code samples for developers who need to store their entities into the database. The Strongbox project uses [JanusGraph](https://janusgraph.org/) as its internal persistent storage through the -corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. Also we use `JTA` for transaction management and `spring-tx` implementation module from Spring technology stack. +corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. Also we use `JTA` for transaction management and the `spring-tx` implementation module from the Spring technology stack. 
 ## Persistence stack
 
-We using following technology stack to deal with persistence:
+We're using the following technology stack to deal with persistence:
 
- - Embedded Cassandra as direct storage (`CassandraDaemon` allows to have the Cassandra instance inside same JVM as the application)
- - JanusGraph as Graph DBMS (it is not directly a data storage, it just allows you to have access to data in the form of a graph)
- - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database
- - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries means `@org.springframework.data.neo4j.annotation.Query` annotation)
- - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevents us to use it for `neo4j-ogm` CRUD operations, these issues will be explained below)
+ - Embedded Cassandra as direct storage (`CassandraDaemon` allows us to have the Cassandra instance inside the same JVM as the application)
+ - JanusGraph as our graph DBMS (it is not directly a data storage, it just allows you to have access to data in the form of a graph)
+ - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database
+ - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries via the `@org.springframework.data.neo4j.annotation.Query` annotation)
+ - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevent us from using it for `neo4j-ogm` CRUD operations, these issues will be explained below)
  - [neo4j-ogm](https://github.com/neo4j/neo4j-ogm) to map Java POJOs into Vertices and Edges of Graph
- - we also use custom `EntityTraversalAdapters`, which implements anonimous Gremlin traversals for CRUD operations under `neo4j-ogm` entities.
+ - We also use custom `EntityTraversalAdapters`, which implement anonimous Gremlin traversals for CRUD operations under `neo4j-ogm` entities.
 
 # Vertices and Edges
 
-Unlike a Relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So in terms of Graph every persistent entity should be stored as Vertex or Edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by separate edge instead of just foreign key column in table. In addition to vertices, persistence objects can also be an edges, as an example the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices.
+Unlike a relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So, in terms of Graph, every persistent entity should be stored as vertex or edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by a separate edge, instead of just a foreign key column in a table. In addition to vertices, persistence objects can also be edges -- for example, the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices.
 
 ## Gremlin Server
 
@@ -30,7 +30,7 @@ Unlike a Relational DBMS, Graph DBMS have vertices and edges, not rows and table
 
 ## Adding Dependencies
 
-Let's assume that you, as a Strongbox developer, need to create a new module or write some persistence code in an
+Let's assume that you, as a Strongbox developer, need to create a new module, or write some persistence code in an
 existing module that does not contain any persistence dependencies yet. (Otherwise you will already have the proper `` section in your `pom.xml`, similar to the one in the example below). You will need to add the following code snippet to your module's `pom.xml` under the `` section:
@@ -54,9 +54,9 @@ package. For the sake of the example, let's pick `PetEntity` as the name of your
 
 If you want to store that entity properly you need to adopt the following rules:
 
-* Create the interface for your entity with all getters and setters that required to interact with the entity, according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. The need for an interface is due to hide the implementation specific details depending on underlying database, such as inheritance strategy.
-* Create the entity class which implements the above interface and have the `org.carlspring.strongbox.data.domain.DomainEntity` as the superclass.
-* Declare entity class with `@NodeEntity` or `@RelationshipEntity`
+* Create the interface for your entity with all the getters and setters that are required to interact with the entity, according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. We need an interface in order to hide the implementation-specific details that depend on the underlying database, such as inheritance strategy.
+* Create the entity class which implements the above interface and extend to `org.carlspring.strongbox.data.domain.DomainEntity`.
+* Declare an entity class with `@NodeEntity` or `@RelationshipEntity`.
 * Define a default empty constructor, this would need to create entity instance from `neo4j-ogm` internals.
 
 The complete source code example that follows all requirements should look something like this:

From 45675da0bf51d859e0da7b392fcf0385f26584df Mon Sep 17 00:00:00 2001
From: Martin Todorov
Date: Wed, 18 Mar 2020 21:01:00 +0000
Subject: [PATCH 5/7] Update getting-started-with-persistence.md

---
 .../getting-started-with-persistence.md | 22 +++++++++----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md
index 5c584a67..24bdbed7 100644
--- a/docs/developer-guide/getting-started-with-persistence.md
+++ b/docs/developer-guide/getting-started-with-persistence.md
@@ -93,11 +93,11 @@ public class PetEntity
 
 ## Creating a `EntityTraversalAdapter`
 
-As mentioned above besides `neo4j-ogm` and `spring-data-neo4j` we were forced to use custom CRUD implementations based on Gremlin. This has its advantages as it allow for us to optimize OGM entities and make them faster then common `neo4j-ogm` provide out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update, read/delete operations. The concrete `EntityTraversalAdapter` provide [Anonymous traversals](http://tinkerpop.apache.org/docs/current/tutorials/gremlins-anatomy/) for each of the operations on specific entity type. These traversals used in Gremlin based repositories to perform common CRUD operations:
+As mentioned above, besides `neo4j-ogm` and `spring-data-neo4j`, we were forced to use custom CRUD implementations based on Gremlin. This has its advantages, as it allows us to optimize OGM entities and make them faster than what the common `neo4j-ogm` provides out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update/read/delete operations. The concrete `EntityTraversalAdapter` provides [Anonymous Traversals](http://tinkerpop.apache.org/docs/current/tutorials/gremlins-anatomy/) for each operation of the specific entity type. These traversals are used in Gremlin-based repositories to perform common CRUD operations:
 
-- `fold` to construct entity instance based on Vertex/Edge and it's properties
-- `unfold` to extract entity properties into Vertex/Edge and it's properties
-- `cascade` to cascade other Vertices/Edges within delete if needed
+- `fold` : to construct entity instance based on vertex/edge and its properties
+- `unfold` : to extract entity properties into vertex/edge and its properties
+- `cascade` : to cascade other vertices/edges within delete if needed
 
 The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations.
 
@@ -173,10 +173,10 @@ public class PetAdapter extends VertexEntityTraversalAdapter
 
 ## Creating a `Repository`
 
-All the database interactions should be done through repositories. For the compatibility with `spring-data` we use `org.springframework.data.repository.CrudRepository` as a basis for our repositories. The base class for implementing `EntityTraversalAdapter`-based repositories is `org.carlspring.strongbox.gremlin.repositories.GremlinRepository`. Further repository implementation depends on the type of entity, for Vertex backed entities it should be `GremlinVertexRepository`.
-In addition to CRUD operations, there is also need the ability to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` and `@org.springframework.data.neo4j.annotation.Query` annotation. So the final repository should be a mixin that extends `GremlinRepository` and delegates custom `Cypher` queries into `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`.
+All the database interactions should be done through repositories. For the compatibility with `spring-data`, we use `org.springframework.data.repository.CrudRepository` as a basis for our repositories. The base class for implementing `EntityTraversalAdapter`-based repositories is `org.carlspring.strongbox.gremlin.repositories.GremlinRepository`. Further repository implementation depends on the type of entity; for vertex-backed entities, it should be `GremlinVertexRepository`.
+In addition to CRUD operations, there is also the need to be able to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` using the `@org.springframework.data.neo4j.annotation.Query` annotation. So, the final repository should be a mixin that extends `GremlinRepository` and delegates custom `Cypher` queries to the `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`.
 
-Putting all above together the repository for the `PetEntity` will looks like below:
+Putting together all the above, the repository for the `PetEntity` will look like below:
 
 ```
 package org.carlspring.strongbox.repositories;
@@ -227,9 +227,9 @@ interface PetQueries
 
 # Issues of `cypher-for-gremlin` and `neo4j-ogm`
 
-First issue was the fact that `cypher-for-gremlin` not fully suport all Cypher syntax that produced by `neo4j-ogm` for CRUD operations. In more detail on every CRUD operation `neo4j-ogm` generate Cypher query which then translates into Gremlin by `cypher-for-gremlin`. As a workadound we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`).
+The first issue that we have, is the fact that `cypher-for-gremlin` does not fully suport all Cypher syntax that is produced by `neo4j-ogm` for CRUD operations. To be more specific, on every CRUD operation, `neo4j-ogm` generates a Cypher query which is then translated to Gremlin by `cypher-for-gremlin`. As a workadound, we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`).
 
-Another issue is that `cypher-for-gremlin` have some doubtful concept to work with `null` values in Gremlin. They put a lot of noisy tokens into Gremlin traversals which prevents JanusGraph engine to match expected indexes, this causes heavy fullscan on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This was the main reason of why we can't use `neo4j-ogm` for CRUD operations.
-Anyway we still using it for custom Cypher queries with `@org.springframework.data.neo4j.annotation.Query` annotation. This is good option to have Cypher queries instead of Gremlin because it looks more clear and takes less time to read existing and write new queries.
+Another issue is that `cypher-for-gremlin` has an ambiguous concept for working with `null` values in Gremlin. They put a lot of noisy tokens into Gremlin traversals which prevents the JanusGraph engine from matching expected indexes. This, in term, causes heavy full-scans on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This was the main reason why we couldn't use the `neo4j-ogm` for CRUD operations.
+
+Either way, we are still using it for custom Cypher queries via the `@org.springframework.data.neo4j.annotation.Query` annotation. This is a good option to have Cypher queries, instead of Gremlin ones, because it looks more clear and takes less time to read and write queries.
 
-```

From d8eed5051e703bfee0a59831e3e2be393d8487d1 Mon Sep 17 00:00:00 2001
From: Steve Todorov
Date: Wed, 18 Mar 2020 23:31:21 +0200
Subject: [PATCH 6/7] Fixing typos and formatting

---
 .../getting-started-with-persistence.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md
index 24bdbed7..71b0eeed 100644
--- a/docs/developer-guide/getting-started-with-persistence.md
+++ b/docs/developer-guide/getting-started-with-persistence.md
@@ -5,7 +5,7 @@ This page contains explanations and code samples for developers who need to
 store their entities into the database.
 
 The Strongbox project uses [JanusGraph](https://janusgraph.org/) as its internal persistent storage through the
-corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. Also we use `JTA` for transaction management and the `spring-tx` implementation module from the Spring technology stack.
+corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. We also use `JTA` for transaction management and the `spring-tx` implementation module from the Spring technology stack.
 
 ## Persistence stack
 
@@ -17,9 +17,9 @@ We're using the following technology stack to deal with persistence:
  - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries via the `@org.springframework.data.neo4j.annotation.Query` annotation)
  - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevent us from using it for `neo4j-ogm` CRUD operations, these issues will be explained below)
  - [neo4j-ogm](https://github.com/neo4j/neo4j-ogm) to map Java POJOs into Vertices and Edges of Graph
- - We also use custom `EntityTraversalAdapters`, which implement anonimous Gremlin traversals for CRUD operations under `neo4j-ogm` entities.
+ - We also use custom `EntityTraversalAdapters`, which implement anonymous Gremlin traversals for CRUD operations under `neo4j-ogm` entities.
 
-# Vertices and Edges
+## Vertices and Edges
 
 Unlike a relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So, in terms of Graph, every persistent entity should be stored as vertex or edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by a separate edge, instead of just a foreign key column in a table. In addition to vertices, persistence objects can also be edges -- for example, the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices.
 
@@ -178,7 +178,7 @@ In addition to CRUD operations, there is also the need to be able to select data
 
 Putting together all the above, the repository for the `PetEntity` will look like below:
 
-```
+```java
 package org.carlspring.strongbox.repositories;
 
 import javax.inject.Inject;
@@ -225,7 +225,7 @@ interface PetQueries
 }
 ```
 
-# Issues of `cypher-for-gremlin` and `neo4j-ogm`
+## Issues of `cypher-for-gremlin` and `neo4j-ogm`
 
 The first issue that we have, is the fact that `cypher-for-gremlin` does not fully suport all Cypher syntax that is produced by `neo4j-ogm` for CRUD operations. To be more specific, on every CRUD operation, `neo4j-ogm` generates a Cypher query which is then translated to Gremlin by `cypher-for-gremlin`. As a workadound, we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`).

From 153ae6b6ed57774a1757219819d137343f41d9c4 Mon Sep 17 00:00:00 2001
From: sbespalov
Date: Thu, 19 Mar 2020 08:38:39 +0700
Subject: [PATCH 7/7] issues/1649 Review comments fixed

---
 docs/developer-guide/getting-started-with-persistence.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md
index 71b0eeed..c57e0b1a 100644
--- a/docs/developer-guide/getting-started-with-persistence.md
+++ b/docs/developer-guide/getting-started-with-persistence.md
@@ -57,7 +57,7 @@ If you want to store that entity properly you need to adopt the following rules:
 * Create the interface for your entity with all the getters and setters that are required to interact with the entity, according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. We need an interface in order to hide the implementation-specific details that depend on the underlying database, such as inheritance strategy.
 * Create the entity class which implements the above interface and extend to `org.carlspring.strongbox.data.domain.DomainEntity`.
 * Declare an entity class with `@NodeEntity` or `@RelationshipEntity`.
-* Define a default empty constructor, this would need to create entity instance from `neo4j-ogm` internals.
+* Define a default empty constructor, as this would be required in order to create entity instances from `neo4j-ogm` internals.
 
 The complete source code example that follows all requirements should look something like this:
 
@@ -99,6 +99,8 @@ As mentioned above, besides `neo4j-ogm` and `spring-data-neo4j`, we were forced
 - `unfold` : to extract entity properties into vertex/edge and its properties
 - `cascade` : to cascade other vertices/edges within delete if needed
 
+Basically, all these operations are implemented using the special `__` class, which represents an anonymous traversal in Gremlin.
+
 The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations.
 
 Below is the code example of `EntityTraversalAdapter` implementation for `PetEntity`: