Adding Document Field DP #733
base: develop
## Intent
Showing the document field is a part of a document.

Comes in hand when representing documents with multiple fields, so that we can link more than one ICE to the same IBE.
The reference to IBE is a distraction. An IBE can already carry any number of ICEs, simply by saying that ICEx generically depends on IBEy.
I suggest that the sentence ends at "multiple fields," (comma changed to period)
Let us rephrase: Document Field shows that you can have different parts of an IBE that each stores a different ICE. Would that be ok?
Not in my view. I think the cases where one will be working with a single physical document are relatively rare, and so having this as a design pattern is a magnet for making mistakes. We've seen that in practice in a variety of cases.
What would be more acceptable, if you really want to talk about IBEs is to have two design patterns, one which uses only ICEs and one IBEs, as this one is, and make very clear in the documentation what the consequences of using one versus the other are. And if that is really desirable, allow the data property to be used directly from the IBE or ICE.
But again, I'm against this. The design patterns will be seen by new users as suggestions on how to build their RDF data, and so we want to be promoting patterns that are likely to be used correctly.
Honestly, I struggle to find any cases where you would want to use the IBE patterns. I should be clear: these aren't just patterns. It's not just a preference about what pattern should be used. They are assertions of fact, and they should properly reflect the reality.
In any case of a digital form, it will not apply, because that form will certainly be displayed or stored on more than one IBE, and the facts will be the same in all cases. That is ICE behavior.
In the case of written forms, almost any form these days will be scanned or otherwise copied, and then we're back to the digital case/multiple IBEs that are entirely interchangeable for the purpose of recording information.
I keep trying to understand why people keep using this pattern. The only explanation I can think of is that, when they give an IRI for an IBE, they interpret it as meaning "some IBE", and think that valid because, of course, every ICE will have some IBE bearer.
But of course that's not what it means. When you give an IRI of a particular it means that specific particular. If you say something is an IBE, that means that sheet of paper, even if that paper has long been shredded and all we have are copies that, from the point of view of the assertion being made, do not necessarily have anything to do with each other. Unless what they have to do with each other is part of the semantics of the term. Such is the case with ICEs.
Another way: Just because you've said an IBE bears an ICE, that doesn't mean that anything said about the information on the one IBE is true of another IBE. That's because, at least, IBEs can carry more than one ICE at a time or across time.
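To make the point above concrete, here is a minimal SPARQL Update sketch, not taken from the PR. The `ex:` namespace, the individuals, and `ex:hasTextValue` are hypothetical placeholders rather than CCO terms, and `obo:BFO_0000101` is assumed to be BFO 2020's "is carrier of".

```sparql
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX ex:  <http://example.org/>   # hypothetical namespace, for illustration only

INSERT DATA {
  # Two distinct bearers (an original sheet and a scan) carry the same content.
  ex:paperForm  obo:BFO_0000101 ex:nameContent .   # is carrier of (assumed IRI)
  ex:scanOfForm obo:BFO_0000101 ex:nameContent .

  # The value is asserted on one specific bearer only.
  ex:paperForm  ex:hasTextValue "Mary" .           # hypothetical data property
}
```

With only these triples and no additional inference rules, the pattern `ex:scanOfForm ex:hasTextValue ?value` has no match: nothing in the data says that what was asserted of the sheet of paper also holds of the scan.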
Alan, am I wrong in understanding that currently the domain of properties such as 'has text value' or 'uses measurement unit' is still limited to IBE? If we are to build a design pattern that is able to connect to data and we have to display the usage of data properties in CCO, then we still have to go through the IBE.
The proper fix is to the properties. From an ontological point of view, all data goes through the content. There's also an outstanding suggestion that these properties be deprecated in favor of a single 'has value' property. In discussion.
I'd like to suggest that we do not publish information entity patterns until the information refactoring is done. All it will do is introduce unnecessary churn. Surely there is enough other noncontroversial content for which patterns can be published.
@giacomodecolle I'd also like to point out that the language "go through the IBE" is not an ontological way to describe a resolution of a problem, which should focus on what is being asserted and the sensibility/correctness of such an assertion. We aren't solely defining "design patterns", we're making statements about the world. Each such statement should be so evaluated.
@alanruttenberg a part of the reason why I see use in modelling information is to use data properties to connect our ontological models to data stored in formats such as datetimes or integers. Currently, the way in which CCO does this is by using data properties such as "has text value", whose domain is an IBE. Previous CCO documentation adopts this design pattern, which requires the user to generate an ICE, connect it to an IBE using a property such as "generically depends on", and then connect the IBE to data using a data property.
My newly formed understanding is that the refactoring changed the taxonomies and labels of ICEs, but didn't change the data properties that would allow us to connect data to the ICE. This now puts us in a conundrum, as people using the development version still have to use the previously existing data property that connects data to the IBE, while the taxonomy has moved to favor the use of ICEs as primary modelling targets. Part of the reason why this design pattern is following the old model is that it focuses on showing how to connect data to a graph using data properties, which still needs to be done by connecting them to the IBE.
I can pause the publication of design patterns on information until this is resolved if the governance board agrees on this course of action, but I would like to have a comment on how long you (the board) think this will take. Besides the design patterns and the work of numerous volunteers in that group, a number of projects I am currently working on depend on this modeling, and I would like to avoid having to refactor everything (including the design patterns) in a few months.
documentation/user-guides/design-patterns/Design Pattern 1/CQ3.sparql
Outdated
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?data ?document_field
This will return all document fields independent of document, which is not something I think would be asked much. It should be qualified by a document that the fields are all part of.
Added another triple in the WHERE clause that gets the document that the field is a part of.
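A rough sketch of what a document-qualified version of the query might look like is given below; it is not the query from the PR. `obo:BFO_0000178` (has continuant part) is the property used elsewhere in this PR, while the `ex:` namespace, `ex:document1`, and `ex:hasTextValue` are hypothetical stand-ins for whichever document IRI and data property are actually used.

```sparql
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX ex:  <http://example.org/>   # hypothetical namespace, for illustration only

SELECT ?document ?document_field ?data
WHERE {
  # Restrict the results to the fields of one given document.
  VALUES ?document { ex:document1 }                 # hypothetical document IRI

  ?document obo:BFO_0000178 ?document_field .       # has continuant part
  ?document_field ex:hasTextValue ?data .           # hypothetical data property
}
```

Dropping the `VALUES` line returns the fields of every document, still paired with the document each one is part of.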

2) How many document fields are in the document?

3) What is the content on these document fields?
This relates to my comment on the SPARQL query. "The document" is used here and is presupposed to be the document in question 2, but that isn't explicit enough. A better style might be: 1) Given document field df1, what document d is df1 part of? 2) How many document fields are there in d? 3) What is the content of the document fields of document d?
Notice also the varying language: "part of" is used in (1), while "in" is used in (2). I suggest making this uniform.
The use of the word "on" in (3) seems to presuppose that you are looking at a concretization of d. Perhaps "associated with" might be more neutral.
"content" is also very general and could equally apply to the ICEs. "What is the literal content of...".
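As an illustration of how the reworded question 2 ("How many document fields are there in d?") could map onto a query, here is a hedged sketch. `ex:d` and the class `ex:DocumentField` are hypothetical placeholders; the type restriction is included because "has continuant part" on its own would also count parts that are not document fields.

```sparql
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>   # hypothetical namespace, for illustration only

SELECT ?document (COUNT(DISTINCT ?document_field) AS ?field_count)
WHERE {
  VALUES ?document { ex:d }                          # the given document d (hypothetical IRI)

  ?document obo:BFO_0000178 ?document_field .        # has continuant part
  ?document_field rdf:type ex:DocumentField .        # hypothetical class IRI
}
GROUP BY ?document
```

Question 3 would reuse the same two triple patterns and add one pattern for the literal value, with the property chosen according to the outcome of the discussion above.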
## Structure
The document has as parts two document fields. One of them carries one ICE, specifically a designative Name.

Despite using names as an example, this pattern can be used with any other ICE.
I'm not sure which part of the pattern you mean can be used with ICEs. Not all ICEs have document fields as parts. If you mean that part of the pattern can be used with other ICEs, then specify which part (for instance, a QR code doesn't have document fields).

SELECT ?document ?document_field
WHERE {
  ?document obo:BFO_0000178 ?document_field . #has continuant part
In writing #317 (comment), a proposal for how SPARQL queries are written, I realized that I consider the current pattern an anti-pattern. Instead of using IBEs (Material copy of a document, Material copy of a document field), the corresponding ICEs should be used. What is being asserted in the design pattern is overwhelmingly more likely to be used with ICEs. As it is, it is making a statement about a single piece of paper (or a display of the same on one computer screen, at one time). Make a photocopy, or display the same thing the next day, and we know nothing about those entities. But of course we expect to.
If you really mean this to apply to one and only one copy, please make that abundantly clear in the documentation by explaining where this pattern is to be applied. Using the labels in develop will help. But don't.
Aside from fixing this, I note that this will be pulled into develop presumably but is not using the labels from develop.
Also consider adopting the style of SPARQL I proposed in #317 (comment)
The proposed modeling, which has two document fields related to the literals "Mary" and "John" respectively, might suggest that document fields are text IBEs, like paragraphs or titles, but I am no longer sure this is the case. Given the definition of Document Field, it seems more like a blank space, or a frame in which to write information. The cco:MaterialCopyOfDocumentField seems more like a cco:InformationMediumArtifact, and it would be nice if the CCO custodians could provide clarity on this. Unfortunately, the only alternatives that I found in CCO that could be plausible parts of Documents are cco:MaterialCopyOfAInformationLine and cco:MaterialCopyOfAnImage, despite the definition of cco:MaterialCopyOfADocument mentioning paragraphs and diagrams.
Secondly, we need to update the labels, like "Material copy of a Document" instead of "Document", given Otte's PR of July 2025.
For completeness, I report here the CCO definitions:
MaterialEntity/MaterialArtifact/IBE
https://www.commoncoreontologies.org/ont00001298
https://www.commoncoreontologies.org/ont00001243
rdfs:label "Material Copy of a Information Line"@en ;
rdfs:label "Material Copy of an Image"@en ;
rdfs:label "Information Medium Artifact"@en ;
Added folder 'Design Pattern 1' which includes the mermaid for document field, an image, description, and SPARQL queries.
Also changed readme to have a directory, to show what the folder contains.