Extending SMTK Resource Query Syntax

Bob_Obara · October 30, 2019, 1:16am

Extending SMTK’s Query Mechanism

Current Formats:

Model Syntax
Attribute Syntax - attribute[type='foo']

So examples used in an association rule could look like:

 <Resource Name="smtk::attribute::Resource" Filter="attribute[type='testDef']" />
 <Resource Name="smtk::model::Resource" Filter="face[string{'pedigree'='zz'}]" />
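To make the two current filter shapes concrete, here is a minimal Python sketch that splits them into their parts. The regular expression and the `parse_filter` helper are hypothetical and cover only the two forms quoted above; they are not SMTK's actual parser.

```python
import re

# Hypothetical pattern covering only the two filter shapes quoted above;
# SMTK's actual grammar accepts more forms than this sketch does.
FILTER_RE = re.compile(
    r"^(?P<component>\w+)\[\s*"                    # component type, then '['
    r"(?:(?P<ptype>\w+)\{\s*'(?P<pname>[^']*)'"    # property type and quoted name
    r"\s*=\s*'?(?P<pvalue>[^'}]*)'?\s*\}"          # property value, quotes optional
    r"|type\s*=\s*'(?P<definition>[^']*)')"        # or an attribute definition type
    r"\s*\]$"
)


def parse_filter(text):
    """Split a current-style filter string into its named parts."""
    match = FILTER_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized filter: {text!r}")
    return {key: value for key, value in match.groupdict().items() if value is not None}


print(parse_filter("face[string{'pedigree'='zz'}]"))
print(parse_filter("attribute[type='testDef']"))
```

Note how the single property predicate is baked into the pattern; supporting several predicates or non-vector values is exactly where a fixed pattern like this stops scaling.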

What Needs to be Extended

The ability to form queries on the resource side of the query, so we can search based on properties of the resource
The ability to have more than one predicate on the component side of the query
Expand beyond always assuming the property is a vector
The ability to make use of parent-child relationships that may exist among the components.
Do we need the ability to search resources based on its name instead of its type?
Side Note: In the attribute XML (and probably in the JSON format as well) - we use a Name key when in reality it should be a Type key

Possible XPATH derived Syntax

Instead of creating a new grammar, John suggested that we should look at existing query grammars such as XPATH, which is used to search HTML/XML structures. There are definitely some similarities between an SMTK resource structure and the structures for which XPATH was designed:

Nodes in XML could be analogous to Persistent Objects
Attributes (with respect to XML) could be analogous to SMTK Properties

In terms of differences:

XPATH was designed for trees and not general graphs - though if you treat the root as the resource itself then at least both do have a single root
Assuming we do treat the resource as the root, then unlike XPATH we might want to put constraints on the root node (for example, only consider those roots with certain properties).
XPATH attributes are uniquely defined by a name while SMTK Properties are uniquely defined by data structure and name so the simple @ syntax will not be enough.

Should the resource be part of the query?

Currently the Resource Manager is handed a string that refers to the type of resource. Using this, the manager matches the resource type either directly or by asking the resource if it is derived from it using the manager’s find method. Instead we could pass the manager a query string itself and have it return the result directly. This would involve delegating part of the query to the resource itself.

Else we will need to extend resource::Manager::find to be a resource query method.

Possible Syntax

/[…] - refers to a resource that matches a set of predicates
- name='…' - resource name constraint
- type='…' - resource type constraint
- @[…] - property constraint
  - name='…' property with name constraint
  - type='…' property with type constraint
  - val='…' property with value constraint
- constraints can be combined using and / or
- /res - resource of name = res - equivalent to /[name='res']
- /res[…] - equivalent to /[name='res' and …]
/res//[…] - returns the set of components in a resource named res that match a set of constraints
- name='…' - component name constraint
- type='…' - component type constraint
- @[…] - property constraint
  - name='…' property with name constraint
  - type='…' property with type constraint
  - val='…' property with value constraint
- constraints can be combined using and / or
a/b - refers to all components named b that are children of a component named a

Possible Examples

/res - refers to the resource named res
/res//alpha - the component named alpha under resource named res
/res//alpha/beta - the component named beta whose parent is named alpha that is part of resource res
/[type='smtk::model::Resource']//[type='face'] - all faces that are part of resources of type model
/[type='smtk::model::Resource' and @[name='alpha' and type='string' and val='a']]//[type='face'] - all faces that are part of model resources that contain a string property called alpha whose value is a.
/[type='smtk::attribute::Resource' and @[name='alpha' and type='string' and val='a']]//[type='attribute' and @[name='beta']] - all attributes that have a property called beta and that are part of attribute resources that contain a string property called alpha whose value is a.
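As a rough illustration of what the second-to-last example asks for, the sketch below evaluates the same two-stage selection over toy data. `Node`, `has_property`, and `select` are hypothetical stand-ins rather than SMTK classes, and the predicates are written by hand instead of being parsed from the query string.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a resource or component (not an SMTK class)."""
    name: str
    type: str
    # Properties are keyed by (type, name) because a name alone is not unique.
    properties: dict = field(default_factory=dict)
    components: list = field(default_factory=list)


def has_property(node, name, type_, val):
    """Mimic @[name='…' and type='…' and val='…']."""
    return node.properties.get((type_, name)) == val


def select(resources, resource_pred, component_pred):
    """Apply the resource predicate first, then the component predicate."""
    return [c for r in resources if resource_pred(r)
            for c in r.components if component_pred(c)]


model = Node("res", "smtk::model::Resource", {("string", "alpha"): "a"},
             [Node("f1", "face"), Node("e1", "edge")])
other = Node("other", "smtk::model::Resource", {}, [Node("f2", "face")])

faces = select(
    [model, other],
    lambda r: r.type == "smtk::model::Resource" and has_property(r, "alpha", "string", "a"),
    lambda c: c.type == "face",
)
print([c.name for c in faces])
```

Only the face in the resource that carries the string property `alpha` is returned; the face in `other` is filtered out at the resource stage.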

johnt · October 31, 2019, 9:52pm

With regard to the first bullet item at the top (“extending the resource side of the query”) and the section “Should the resource be part of the query?”, I would describe it slightly differently. I would say that the general/underlying requirement is to specify a resource query and a component query in cascade. Might be a distinction without a difference, but I think our design should reflect the “two queries in daisy-chain” perspective.

With that in mind, I would prefer keeping the resource and component strings separate (although, yes, you can easily split or join strings).

My preference would be to use the resource/component type as the query’s “test” expression, in front of any predicate logic. This presumes that every query must specify the resource and component type, (unless/until we choose to also support “any”).

In the query string itself, I would also recommend changing “Resource” to “Query” and “Filter” to “Component”, mostly to be more consistent with the notion of two queries in cascade.

Adapting from the current syntax, an example might be:

<Query Resource="smtk::model::Resource[integer{'analysis'=1}]" Component="attribute[type='material']" />

This might be implied, but just in case: I think we should only adopt the @ symbol as syntactic sugar for a textual keyword, presumably “property”. As for combining property selectors, maybe we can use comma as an alias for “and”? So these 4 examples would be equivalent:

property[name='alpha' and type='string' and val='a']
@[name='alpha' and type='string' and val='a']

property[name='alpha',type='string',val='a']
@[name='alpha',type='string',val='a']
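One way to treat these four spellings as equivalent is to normalize them before parsing. The following Python sketch is only an illustration; `normalize_predicate` is a hypothetical helper, and it assumes `@` and `,` carry no other meaning outside single-quoted strings.

```python
def normalize_predicate(text):
    """Rewrite '@' as 'property' and ',' as ' and ', skipping quoted text."""
    out = []
    in_quote = False
    for ch in text:
        if ch == "'":
            in_quote = not in_quote   # toggle on every single quote
            out.append(ch)
        elif ch == "@" and not in_quote:
            out.append("property")
        elif ch == "," and not in_quote:
            out.append(" and ")
        else:
            out.append(ch)
    return "".join(out)


forms = [
    "property[name='alpha' and type='string' and val='a']",
    "@[name='alpha' and type='string' and val='a']",
    "property[name='alpha',type='string',val='a']",
    "@[name='alpha',type='string',val='a']",
]
print({normalize_predicate(f) for f in forms})
```

A caveat with this design: if list-valued properties such as [1,2,3] are ever allowed as values, the comma alias would need bracket awareness too.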

As for searching our more extensive/structured properties, we might get some insight from some of the json-query libraries (search “json path”) or even GraphQL. Not sure if either will be sufficient for our feature set.

Bob_Obara · November 1, 2019, 1:44am

In terms of one or two query strings - since the proposed format uses:
/“resource related query”//“component related query”

//“component related query” is in itself a valid query, meaning that you do not need both pieces to form a valid query. You can also give a query string with both pieces to the resource (and not to the resource manager); in that case the resource would confirm that it matches the resource part of the query.
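A small Python sketch of how a combined string could be split at the first `//` that sits outside brackets and quotes. `split_query` is a hypothetical helper; returning `None` for a missing half is one possible convention, not a decided one.

```python
def split_query(query):
    """Return (resource_part, component_part); either half may be None."""
    depth = 0
    in_quote = False
    for i, ch in enumerate(query):
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote:
            if ch in "[{":
                depth += 1
            elif ch in "]}":
                depth -= 1
            elif depth == 0 and query.startswith("//", i):
                # Found the separator between the two halves.
                return (query[:i] or None), (query[i + 2:] or None)
    return (query or None), None


print(split_query("/res//alpha/beta"))
print(split_query("//[type='face']"))
print(split_query("/[type='smtk::model::Resource']//[type='face']"))
```

A resource handed the full string could check the first half against itself and evaluate only the second half, while a manager would use the first half to choose resources.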

johnt · November 1, 2019, 4:03pm

Fair enough. From outside the box, it seems awkward to specify the resource type in the first string and then put resource predicates in the second string. But if that format is more practicable for us to implement, that’s fine.

So if we go that way, I would also take back my recommendation to change “Filter” to “Component”, and instead stick with the more generic “Filter”.

As for the keyword to reference the resource in the filter string, I would recommend using the full word “resource” instead of “res”.

Regarding the “//” symbol, it looks to me like its function is to set/change the search axis from resource to components. I presume we could potentially generalize that to support attribute or maybe other axes too. Do we want to consider using the explicit “axis::test” syntax? It might just be more clutter, because we can infer the axis from the keyword. Nevertheless, here is a before and after example:

resource[integer{'analysis'=1}]//face[integer{'pedigreeId'=2}]
resource[integer{'analysis'=1}]/component::face[integer{'pedigreeId'=2}]

Related to that, would the proposed, fully-enumerated version of the first line translate to the following?

resource[property[type=integer and name='analysis' and value=1]]//face[property[type=integer and name='pedigreeId' and value=2]]
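If the answer is yes, the translation from the short form to the fully enumerated form is mechanical. A minimal Python sketch, assuming the short form always looks like type{'name'=value}; `expand_properties` is a hypothetical helper.

```python
import re

# Matches the short form: type{'name'=value}
SHORTHAND_RE = re.compile(r"(\w+)\{\s*'([^']*)'\s*=\s*([^}]*?)\s*\}")


def expand_properties(query):
    """Rewrite each type{'name'=value} as property[type=… and name='…' and value=…]."""
    return SHORTHAND_RE.sub(
        lambda m: f"property[type={m.group(1)} and name='{m.group(2)}' and value={m.group(3)}]",
        query,
    )


print(expand_properties("resource[integer{'analysis'=1}]//face[integer{'pedigreeId'=2}]"))
```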

dcthomp · November 5, 2019, 12:27pm

How does this fit in with relative paths in attribute queries (assuming we are aiming to use the same filtering grammar for path queries as well)? Would even relative path queries start with // or is there a third form, assumed to act on components that begins with ./ or ../?

dcthomp · November 5, 2019, 1:03pm

My high-level concerns are in two areas:

Switching to the XPath syntax will require supporting our existing syntax and the new one for a time. That is fine, but we should have a clear plan for the transition.
- Are we going to rewrite the old queries as we read files in (so that saving will produce a new-style query and only the JSON and XML readers refer to the old grammar)?
- Are we going to try to infer which grammar a given filter string uses or require file versioning to be consistent?
The DOM and SMTK’s resource/component model are not precisely identical; or at least the proposal above does not consider all the possibilities:
- Links between resources and/or components, especially if our use of roles expands or we want to support applications that define their own roles.
- Tags appear to only be used by operation specification attributes, but there is no constraint that they must be used this way. Will we ever want to query attributes by the tags present on their definitions?
- Component relationships (find edges bounding faces with property X) are possible as are DOM element relationships, but we have not carefully considered potential differences. For instance, the XPath queries for nth() items make a lot of sense when processing lists of things, but one-to-many relationships in geometric models do not always attach a lot of meaning to order. They do attach meaning to orientation and sense of uses. They also attach meaning in non-binary ways (instead of asking for the third edge of a face, you might ask for the edge of face A that also bounds face B).
- Non-POD-valued properties are possible. Should the query language allow resources to support them? For example, the RGG session might store duct and pin specifications in a struct. If the RGG session wants to allow query strings that filter based on this, how can the grammar support it?

dcthomp · January 13, 2021, 4:11pm

We have been looking at XPath, but CSS also has selectors and might be easier to implement. Note that one of the “advantages” of XPath is that it allows one to refer to parent elements. CSS did not previously allow this but the current working draft for CSS selectors level 4 does via the :has() pseudo-class.

johnt · January 21, 2021, 3:56pm

This might be redundant, but as I am refreshing my mind on this topic, I want to make sure that we are (still) locked in to a two-string format:

“Name” (with possible change to “Type”) for the resource
“Filter” for the component.

and that there is NO practical way to refactor “Filter” into separate “ResourceFilter” and “ComponentFilter” strings, for example.

Also there is no practical way to expand “Name” to include a resource filter string suffix of some kind.

dcthomp · January 26, 2021, 5:25am

I am ambivalent about changing the attribute-system XML syntax from

  <Accepts>
    <Resource Name="_resourceFilterRule_" Filter="_componentFilterRule_"/>
  </Accepts>

to

  <Accepts>
    <Accept ResourceFilter="_resourceFilterRule_" ComponentFilter="_componentFilterRule_"/>
    <!-- and/or -->
    <Accept Filter="_combinedResourceAndComponentFilterRule_"/>
  </Accepts>

I do think it is important to support

rules that select resources,
rules that select components, and
rules that select both resources and components from them all at once.

Furthermore, the parser should not have to be told which type of rule is being provided; it should be able to parse any of the above. The XPath-like syntax suggested by @Bob_Obara uses // to separate resource and component specifiers so that it is unambiguous. Other syntaxes would need to either use the same strategy or indicate somehow whether a rule was supposed to select resources or components.

I’m not sure what this means? Do you mean that the resource type-specifier should accept query strings that match multiple types? If so, that is what we are talking about doing.

johnt · January 26, 2021, 3:04pm

Yes, we’re on the same page. Regarding my comment about expanding “Name” to include a resource filter string, that’s the same idea as your _resourceFilterRule_. For example, if we reuse our current “property” syntax, something like this would be useful:

<Accept
  ResourceFilter="smtk::model::Resource[string{'project-tag'='heat-transfer-mesh'}]"
  ComponentFilter="face[integer{'interior'=1}]">
</Accept>

and the equivalent

<Accept
  Filter="smtk::model::Resource[string{'project-tag'='heat-transfer-mesh'}]//face[integer{'interior'=1}]">
</Accept>

Though I prefer the separate filters, if we need the single filter string, perhaps we should just go with that.
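To show that the two forms carry the same information, here is a small Python sketch that reads either one with the standard library XML parser. `read_accept` is a hypothetical helper, and the naive split on the first `//` assumes that sequence never appears inside a quoted value.

```python
import xml.etree.ElementTree as ET


def read_accept(xml_text):
    """Return (resource_filter, component_filter) from either <Accept> form."""
    elem = ET.fromstring(xml_text)
    combined = elem.get("Filter")
    if combined is not None:
        # Naive split; a real parser would ignore '//' inside quotes.
        resource, _, component = combined.partition("//")
        return resource, component
    return elem.get("ResourceFilter"), elem.get("ComponentFilter")


separate = """<Accept
  ResourceFilter="smtk::model::Resource[string{'project-tag'='heat-transfer-mesh'}]"
  ComponentFilter="face[integer{'interior'=1}]">
</Accept>"""

combined = """<Accept
  Filter="smtk::model::Resource[string{'project-tag'='heat-transfer-mesh'}]//face[integer{'interior'=1}]">
</Accept>"""

print(read_accept(separate) == read_accept(combined))
```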

dcthomp · July 28, 2021, 3:52pm

This came up again. Should we move to CSS or XPath, we need to consider how the following SMTK concepts would be exposed:

Component/resource inheritance. XML elements do not have an inheritance hierarchy, so specifying h1 as a selector is unambiguous. But smtk::graph::Component is not… did we mean to select all graph components (including subclasses) or only precise matches (no subclasses)?
Do we want to allow filtering by links with roles? (e.g., find components linked to A via role 1). We can either have smtk::resource::Resource provide something generic or we can force resource-subclasses that use links to provide grammar rules for them.
For graph resources, CSS and XPath paths only expose a single hierarchy but arcs may have different types. I propose having (1) a “ChildrenOf” query that, if resources provide an implementation, is used to determine children when no arc is specified and (2) arcs appear in selectors as if they were nodes (that just happen to never have properties).
We are assuming SMTK properties map to XML attributes… but SMTK properties are strongly typed and it is possible to have an SMTK int property with the same name as a string property. This could cause complications in CSS/XPath queries. I propose augmenting the CSS syntax slightly by accepting a type-specifier before non-string values. If no type-specifier is provided, we can either reject the selector or attempt to match all property types.
What about resource-specific queries? For example, the attribute resource might want to allow users to query based on advanced-level, or whether an item/association is enabled. We can try to make the syntax easy to re-use so subclasses can have their own custom syntax, but I’m not sure how to go beyond that.

Some of the above might be handled with CSS pseudo-classes/elements, but will all of them?

Also, as the grammar is changed, we will need to consider evaluating invalidity due to version differences (i.e., no mixing v1 and v2 grammars in the same selector).
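The inheritance question in the first item can be made concrete with a toy Python sketch. The classes and the `include_subclasses` flag are hypothetical; the point is only that a selector needs some way to choose between the two behaviors.

```python
class Shape:
    """Toy base component type (not an SMTK class)."""


class Face(Shape):
    pass


class Edge(Shape):
    pass


def select_by_type(components, cls, include_subclasses=True):
    """Select components by type, with or without subclass matches."""
    if include_subclasses:
        return [c for c in components if isinstance(c, cls)]
    return [c for c in components if type(c) is cls]


components = [Shape(), Face(), Edge()]
print(len(select_by_type(components, Shape)))                            # subclass-inclusive
print(len(select_by_type(components, Shape, include_subclasses=False)))  # exact matches only
```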

dcthomp · September 7, 2021, 10:57pm

With the above in mind, here is a CSS version of the “Possible Examples” from the top post.

Possible Examples, part deux: CSS style

Example	Description
`res`	A resource or component of type `res`. An example would be `"smtk::attribute::Resource"` or `"smtk::session::opencascade::edge"`. Any time colons are used in the typename, the typename must be quoted to avoid confusion with the pseudo-element/pseudo-class operator. Components may omit leading namespaces as long as they are unambiguous. For example, `edge` works as long as only the opencascade session defines that as a resource, component, or arc type.
`res alpha`	A component of type `alpha` that is a descendant of a resource of type `res`, where descent is determined by a yet-to-be-implemented query (say `smtk::resource::query::ChildrenOf`).
`res [name="alpha"]`	A component named `alpha` that is a child of a resource of type `res`.
`res alpha>beta`	A component of type beta whose immediate parent is of type alpha that is part of resource `res` (with `alpha` not necessarily a top-level component).
`"smtk::model::Resource" face`	All faces that are part of model resources.
`"smtk::model::Resource" face[alpha="a"]`	All faces that are part of resources of type model that contain a string property called alpha whose value is a.
`"smtk::model::Resource" face[<"std::vector<int>">beta=[1,2,3]]`	All faces with a property named `beta` stored as an array of integers and having value equal to [1, 2, 3] and which are part of model resources.
`"smtk::attribute::Resource"[alpha="a"] *[<string>beta]`	All attributes that have a string property called `beta` and that are part of attribute resources that have a string property called `alpha` whose value is “a”.
`Attribute[type="FluxBC"]:linked-by(AssociationRole, "opencascade::face")`	Any attribute component whose Definition is “FluxBC” that is associated to a component of type `opencascade::face`. Note that the `attribute::Resource` class would have to register `AssociationRole` and `ReferenceRole` with the parser somehow.
`"opencascade::face":linked-from(AssociationRole, Attribute[type="FluxBC"])`	Any opencascade face that is associated to an attribute component whose Definition is “FluxBC”. Note that the `linked-by` pseudo-class is a slow query since it operates in the reverse direction of the link.
`"graph::Resource" compA>arcB>compC`	A graph-resource component of type `compC` that is the destination of an arc of type `arcB` whose source is a component of type `compA`. Arcs should always use CSS’s child-combinator ("`>`") instead of the descendant-combinator (" "). We might also consider defining a new arrow-combinator for arcs: `compA→arcB→compC`.

These examples contain some extensions to handle differences between SMTK resources and the HTML document-object model (DOM):

Strong property types are provided before the property name in angle brackets.
The :linked-by(role, target) and :linked-from(role, source) pseudo-classes are not part of CSS but provide one possible way for arbitrary Link roles to be exposed with some help from a registrar for role IDs.
The new arrow-combinator for arcs that connect graph components (e.g. compA→arcB→compC) would be less ambiguous than re-using the child-combinator.
We might use a unicode character to distinguish between exact typename matches and subclass matches (e.g., consider the opencascade shape component that has subclasses face, edge, vertex, etc… A selector written ≡shape would force exact matches while ≈shape would allow subclass-inclusive matches). This is not shown above.
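As a sketch of the first extension, the Python snippet below parses the text inside one pair of square brackets into an optional type-specifier, a property name, and an optional value. The pattern and `parse_attribute_selector` are hypothetical; quoting rules for type names follow the table above, and values are returned as raw text.

```python
import re

TYPED_ATTR_RE = re.compile(
    r'^(?:<(?:"(?P<quoted>[^"]+)"|(?P<bare>\w+))>)?'  # optional <type> or <"quoted type">
    r'(?P<name>[\w-]+)'                               # property name
    r'(?:=(?P<value>.+))?$'                           # optional =value, kept as raw text
)


def parse_attribute_selector(text):
    """Return (type_specifier, name, raw_value); missing parts are None."""
    match = TYPED_ATTR_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized attribute selector: {text!r}")
    type_spec = match.group("quoted") or match.group("bare")
    return type_spec, match.group("name"), match.group("value")


print(parse_attribute_selector('<"std::vector<int>">beta=[1,2,3]'))
print(parse_attribute_selector("<string>beta"))
print(parse_attribute_selector('alpha="a"'))
```

When the type-specifier comes back as `None`, the caller still has to decide between rejecting the selector and matching every property type, as discussed earlier in the thread.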

Bob_Obara · September 8, 2021, 1:50pm

Nice Summary!

In XPATH you can refer to partial paths; for example, /A//B means all paths that start with A and end with B. Can you do something similar in CSS?

Also, in XPATH you can have comparators: /bookstore/book[price>35.00]. Is something like this supported in CSS?

Note that I don’t have any immediate use cases for the second but the first would correspond to:

/[type='smtk::model::Resource']//[type='face'] - all faces that are part of resources of type model.

dcthomp · September 8, 2021, 2:45pm

Yes. The descendant combinator (space) does this.

No, because CSS doesn’t treat attributes as anything but strings. It would not be difficult to add. The equivalent of /bookstore/book[price>35.0] in modified CSS would be either


`bookstore book[price>35.0]`	descendant combinator – any book inside a bookstore matches.
`bookstore>book[price>35.0]`	child combinator – only books that are immediate children of a bookstore match.
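The difference between the two combinators, together with a numeric comparison, can be sketched in Python over a toy tree. `Element`, `descendants`, and `books_over` are hypothetical names, and the comparison converts the attribute text to a float, which plain CSS would not do.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Toy tree node standing in for a DOM element."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every node below `node`, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def books_over(store, limit, direct_only):
    """Mimic `bookstore>book[price>limit]` or `bookstore book[price>limit]`."""
    candidates = store.children if direct_only else descendants(store)
    return [b for b in candidates
            if b.tag == "book" and float(b.attrs.get("price", "nan")) > limit]


store = Element("bookstore", children=[
    Element("book", {"title": "A", "price": "40.0"}),
    Element("shelf", children=[
        Element("book", {"title": "B", "price": "50.0"}),
        Element("book", {"title": "C", "price": "10.0"}),
    ]),
])

print([b.attrs["title"] for b in books_over(store, 35.0, direct_only=False)])
print([b.attrs["title"] for b in books_over(store, 35.0, direct_only=True)])
```

Book B sits one level down inside a shelf, so only the descendant form finds it.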

johnt · September 10, 2021, 5:55pm

In a template file, would I need to use escape characters? For example, would I need to write <Filter="\"smtk::attribute::Resource\"">, or would our readers/parsers be smart enough to insert double quotes where needed?

I need some help understanding the nomenclature:

The res alpha example returns a resource of type res that includes a component of type alpha
But "smtk::model::Resource" face returns the faces included in the model.

Both examples look like the same format to me (resource-type descendant-type) but they return different things.

This makes me wonder: Does the proposed syntax (only) apply to queries to a single resource?

dcthomp · September 10, 2021, 9:13pm

Either single-quotes or escaped double-quotes will work fine.
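A quick check with Python's standard XML parser, assuming the template is read by a conforming XML parser. In XML a backslash is not an escape character, so "escaped" here is taken to mean the `&quot;` entity; the element and attribute names are borrowed from the existing syntax for illustration only.

```python
import xml.etree.ElementTree as ET

# Option 1: delimit the attribute with single quotes.
single = ET.fromstring("""<Resource Filter='"smtk::attribute::Resource"'/>""")

# Option 2: keep double-quote delimiters and escape the inner quotes as &quot;.
escaped = ET.fromstring("""<Resource Filter="&quot;smtk::attribute::Resource&quot;"/>""")

# Both spellings yield the same attribute value, inner double quotes included.
print(single.get("Filter") == escaped.get("Filter"))
print(single.get("Filter"))
```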

Oops, res alpha selects components of type alpha. I’ll fix the table above. If you wanted to get res out, we would need to implement the CSS selector level 4 syntax that allows querying parents.

johnt · September 13, 2021, 4:32pm

Just double-checking that the asterisk is a wildcard for “any component”, so this would be equivalent:

"smtk::attribute::Resource"[alpha="a"] attribute[<string>beta]

Is there a particular reason you named the selector linked-by instead of linked-to? I would prefer the latter, but perhaps I’m overlooking something.
Suggest modifying the linked-from and linked-to selectors to make the component the first argument and the association role an optional second argument. So this might be supported too: attribute[type="FluxBC"]:linked-to("opencascade::face")
I would vote to use lower-case selector keywords: Attribute → attribute, AssociationRole → association-role