Projects, Roles, & Tasks

Bob_Obara · May 13, 2021, 10:04pm

A new implementation of Project support was recently merged into SMTK. It will be part of SMTK 21.05, and the functionality will be exposed in CMB 21.05. There are some concepts/relationships that are still being ironed out, especially with respect to the next step: Workflow & Task Support.

Basic Anatomy of a Project

A project is basically a container of SMTK Resources that are grouped together so that the user no longer needs to ensure that all of the resources required for their task/workflow are loaded into / saved from ModelBuilder.

SimpleProject

Here you see a project consisting of three resources: a model, a mesh, and an attribute resource. In the new implementation, SMTK Project is derived from SMTK Resource, which means that you can refer to projects in attribute resources and define operations that take in projects as parameters.

Beyond a Simple Bag of Resources

Let’s assume you wanted to define an operator that would process the above project. To access Model 1, the operator could ask the project for its model resources. It could do the same for mesh and attribute resources in order to access Mesh 1 and Attribute 1 respectively.
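The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the SMTK API: the `Resource` and `Project` classes and the `resources_of_type` helper are hypothetical names invented for this example. It shows a project that is itself a resource and that can be asked for its resources by type.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for an SMTK resource (not the real API)."""
    name: str
    kind: str  # e.g. "model", "mesh", "attribute"


@dataclass
class Project(Resource):
    """A project is itself a resource that also holds other resources."""
    resources: list = field(default_factory=list)

    def add(self, resource):
        self.resources.append(resource)

    def resources_of_type(self, kind):
        """Return every contained resource of the given kind."""
        return [r for r in self.resources if r.kind == kind]


project = Project(name="SimpleProject", kind="project")
project.add(Resource("Model 1", "model"))
project.add(Resource("Mesh 1", "mesh"))
project.add(Resource("Attribute 1", "attribute"))

# Because Project derives from Resource, it can be passed wherever a
# Resource is expected (for example, as an operation parameter).
is_resource = isinstance(project, Resource)

# An operation asking the project for its model resources.
model_names = [r.name for r in project.resources_of_type("model")]
```

With a single model in the project, the query by type is unambiguous and returns only Model 1.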

But what if the project had more than one model? Well, the operation could ask for the model resource by name, but that would mean the name is hardwired into the operation and it’s up to the user to name the resource appropriately. That is not a very elegant solution, and it has no UI support. Alternatively, the operation could require the user to specify the required resources directly and bypass the project altogether, though that would reduce the potential benefits of having operations take in projects.

To deal with this issue, SMTK Project introduces the concept of a role. A project role represents what a resource (unique role) or set of resources (non-unique role) represents with respect to the project. Below is an example of a casting project with roles.

The roles provide a conceptualization of the resources stored in a project. Model 1 represents the mold to be used in the casting, while Model 2 represents both the mold and the void region that the material will be poured into. The project also contains 4 different attribute resources:

Attribute 1 - represents the simulation specification for calculating the required view factors
Attribute 2 - represents the simulation specification for preheating the mold
Attribute 3 - represents the simulation specification for pouring the material into the mold
Attribute 4 - represents the simulation specification of cooling the mold and solidifying the material

Now an operation that takes in a project could request a resource based on its role. Instead of asking for Model 1, it can now ask the project for the “Mold” resource. This relieves the user of having to either guess the names of the required resources or pass them in as parameters.
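A role-based lookup might look roughly like the following Python sketch. Again, this is not the SMTK API: the `Project` class and the `assign`, `find_by_role`, and `find_unique` methods are hypothetical, and the role names other than “Mold” are made up for illustration.

```python
class Project:
    """Hypothetical project mapping role names to resource names."""

    def __init__(self):
        self._roles = {}  # role name -> list of resource names

    def assign(self, role, resource_name):
        self._roles.setdefault(role, []).append(resource_name)

    def find_by_role(self, role):
        """Return every resource assigned to the role (non-unique role)."""
        return list(self._roles.get(role, []))

    def find_unique(self, role):
        """Return the single resource of a unique role, or raise."""
        matches = self.find_by_role(role)
        if len(matches) != 1:
            raise LookupError(f"role {role!r} has {len(matches)} resources")
        return matches[0]


casting = Project()
casting.assign("Mold", "Model 1")
casting.assign("Mold and Void", "Model 2")          # assumed role name
for name in ("Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"):
    casting.assign("Simulation Specification", name)  # assumed role name

# The operation asks for the role, not for a hardwired resource name.
mold = casting.find_unique("Mold")
specifications = casting.find_by_role("Simulation Specification")
```

The operation never mentions "Model 1"; renaming the resource would not break it as long as the role assignment is kept.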

Other (Potential) Aspects of Roles

Besides providing a labeling/structure mechanism for the resources within a project, a role can potentially have the following aspects:

Constraints - the ability to restrict a role with respect to the resources that can be assigned to it. This could include:
- Resource Type
- Properties associated to the Resource
- Role Exclusivity - if a resource is already associated with Role X, it cannot be associated with Role Y
- Role Prerequisite - a resource can be associated with Role Y iff it is associated with Role X
- Functional Constraints
Number of Required Resources - how many resources must be associated with the Role
Is Extensible - can the Role have a range of Resources associated with it
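To make these potential aspects a little more concrete, here is a minimal Python sketch of how a role definition could carry them. None of this exists in SMTK as written: the `Role` fields and the `can_assign` function are assumptions, and property-based and functional constraints are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical role definition; field names are illustrative only."""
    name: str
    resource_type: str                           # Resource Type constraint
    required: int = 1                            # Number of Required Resources
    extensible: bool = False                     # Is Extensible
    excludes: set = field(default_factory=set)   # Role Exclusivity
    requires: set = field(default_factory=set)   # Role Prerequisite


def can_assign(role, resource_type, current_roles, assigned_count):
    """Return (ok, reason) for assigning one resource to `role`.

    current_roles: set of role names the resource already holds.
    assigned_count: number of resources already assigned to `role`.
    """
    if resource_type != role.resource_type:
        return False, "wrong resource type"
    if role.excludes & current_roles:
        return False, "excluded by an existing role"
    if not role.requires <= current_roles:
        return False, "missing prerequisite role"
    if assigned_count >= role.required and not role.extensible:
        return False, "role is full"
    return True, "ok"


mold = Role("Mold", "model")
mold_and_void = Role("Mold and Void", "model", excludes={"Mold"})

first_model = can_assign(mold, "model", set(), 0)
second_model = can_assign(mold, "model", set(), 1)
already_mold = can_assign(mold_and_void, "model", {"Mold"}, 0)
```

Here the first assignment to "Mold" is accepted, a second one is rejected because the role is not extensible, and a resource already acting as "Mold" is rejected for the exclusive "Mold and Void" role.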

Roles vs Properties

Similar to Roles, an SMTK Resource can have a set of Properties associated with it. Besides the above potential aspects of Roles, the other major difference is that assigning a Resource to a Role does not affect the state of the Resource itself, while adding a Property to a Resource does. When a Resource is assigned to a Role, it is not modified; only the Project would need to be saved.
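The difference in what needs saving can be shown with a small Python sketch. The `modified` flags, `set_property`, and `assign_role` are hypothetical names for this example (not SMTK calls), and the "units" property is made up.

```python
class Resource:
    """Hypothetical resource that tracks whether it needs saving."""

    def __init__(self, name):
        self.name = name
        self.properties = {}
        self.modified = False

    def set_property(self, key, value):
        self.properties[key] = value
        self.modified = True  # the resource's own state changed


class Project:
    """Hypothetical project that stores role assignments itself."""

    def __init__(self):
        self.roles = {}
        self.modified = False

    def assign_role(self, role, resource):
        self.roles.setdefault(role, []).append(resource)
        self.modified = True  # only the project changed


model = Resource("Model 1")
project = Project()

project.assign_role("Mold", model)
after_role = (model.modified, project.modified)

model.set_property("units", "mm")
after_property = model.modified
```

After the role assignment only the project is marked as modified; the model becomes modified only once a property is set on it.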

Roles vs Attribute Reference Items

Roles do have a very close resemblance to the Attribute Resource’s Reference Item.

Task - A Workflow’s Building Block

Now let’s assume we want to do something with the contents of a Project. As we have already mentioned, we could associate a set of operations with the project, but what about user-driven tasks? For example, the user needs to fill out the Pre-Heat Specification before they can run the Pre-Heat Simulation. Capturing this will be the responsibility of the Task.

For simplicity’s sake, let’s assume we have a Task that will define and run a Truchas Simulation. The Task will have a set of requirements that need to be met in order to start the Task. In this example, let’s assume that the generic Truchas Task has the following requirements:

Heat Transfer Model
Truchas-based Attribute Resource
Optionally an Induction Heating Model

TruchasGenericTask

The Project designer (or user) would then “connect” the Project’s roles (or resources) with the Task’s requirements as shown below:

Here you see the same task instantiated three times, with each instance being given different resources for its requirements. As a result, each instance produces different results needed by the overall workflow. The clouds represent information not stored explicitly by the project. These assets can include the following:

Input files generated by the task
Job submission information (though there is work in progress to have this information stored in the Project)
Simulation Results
Post Processing Results
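The idea of one generic task instantiated several times with different bindings can be sketched as follows. This is a hypothetical Python model, not an SMTK class design: `Requirement`, `Task`, `connect`, `is_ready`, and `truchas_task` are invented names, and the pairing of roles to task instances is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    name: str
    optional: bool = False


@dataclass
class Task:
    """Hypothetical task: named requirements bound to project roles."""
    name: str
    requirements: list
    bindings: dict = field(default_factory=dict)  # requirement -> role name

    def connect(self, requirement_name, role_name):
        known = {r.name for r in self.requirements}
        if requirement_name not in known:
            raise KeyError(requirement_name)
        self.bindings[requirement_name] = role_name

    def is_ready(self):
        """A task can start once every non-optional requirement is bound."""
        return all(r.name in self.bindings
                   for r in self.requirements if not r.optional)


def truchas_task(name):
    """Create a fresh instance of the generic Truchas task."""
    return Task(name, [
        Requirement("Heat Transfer Model"),
        Requirement("Truchas-based Attribute Resource"),
        Requirement("Induction Heating Model", optional=True),
    ])


preheat = truchas_task("Pre-Heat")
preheat.connect("Heat Transfer Model", "Mold")                 # assumed pairing
preheat.connect("Truchas-based Attribute Resource", "Pre-Heat Specification")

pour = truchas_task("Pour")
pour.connect("Heat Transfer Model", "Mold and Void")           # assumed pairing

ready = {t.name: t.is_ready() for t in (preheat, pour)}
```

The Pre-Heat instance is ready because both mandatory requirements are bound, while the Pour instance still lacks its attribute resource. The optional Induction Heating Model does not block either one.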

In many cases the information generated from one task is needed by another. The location and filenames of the results from one task sometimes need to be passed into another. One possible approach would be to have the task “fill in” the relevant information into a requirement that represents the specification of a dependent task as shown below.

Here you can see the tasks optionally taking in an attribute resource that represents the specification of a dependent task.

Note that all of the linkages shown between the Tasks and Project Resources would be persistent.
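One way to picture the “fill in” approach is the Python sketch below. It is purely illustrative: `Specification`, `Task.complete`, the `restart_file` item, and the results path are all hypothetical and do not correspond to SMTK or Truchas names.

```python
class Specification:
    """Hypothetical stand-in for an attribute resource: a bag of items."""

    def __init__(self, name):
        self.name = name
        self.items = {}


class Task:
    """Hypothetical task that may hold a dependent task's specification."""

    def __init__(self, name, spec, downstream_spec=None):
        self.name = name
        self.spec = spec
        self.downstream_spec = downstream_spec  # optional requirement

    def complete(self, results_path):
        """Record where results went and pass that on, if a link exists."""
        if self.downstream_spec is not None:
            # "restart_file" is a made-up item name for this sketch.
            self.downstream_spec.items["restart_file"] = results_path
        return results_path


preheat_spec = Specification("Pre-Heat Specification")
pour_spec = Specification("Pour Specification")

preheat = Task("Pre-Heat", preheat_spec, downstream_spec=pour_spec)
pour = Task("Pour", pour_spec)

# Hypothetical output location produced by the upstream task.
preheat.complete("runs/preheat/results.h5")
pour_input = pour.spec.items.get("restart_file")
```

Because the upstream task writes into the same specification object the downstream task reads, the user does not have to copy file locations by hand.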

Multiple Runs/Variations

In real life, simulations / workflows tend not to be executed just once. Researchers/Engineers will create variations of a Project’s workflow. If we look at the problem from this point of view, then many of the details we have attributed to the Project really belong to a Workflow instance.

Here we have extended the Project to have multiple resources assigned to the same Role. For each workflow instance, the user is allowed to choose a resource from the role, which will then be properly assigned to the Task’s Requirements. In this conceptual figure, the specifications are “owned” by each Workflow instance, but we could instead treat them as we do the model resources and have them belong explicitly to the Project’s roles. The reason I think this might be better is that during the workflow, the attributes are associated with components of the models. If the specifications were selectable, then in certain cases the user would not be able to also select the models.
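A workflow instance that picks one resource per role could be sketched like this in Python. The `WorkflowInstance` class, its `choose` method, and the second model name are hypothetical and only serve to illustrate the idea.

```python
class Project:
    """Hypothetical project whose roles may hold several candidates."""

    def __init__(self, roles):
        self.roles = roles  # role name -> list of resource names


class WorkflowInstance:
    """One variation of the workflow: a choice of one resource per role."""

    def __init__(self, name, project):
        self.name = name
        self.project = project
        self.selection = {}

    def choose(self, role, resource_name):
        if resource_name not in self.project.roles.get(role, []):
            raise ValueError(
                f"{resource_name!r} is not assigned to role {role!r}")
        self.selection[role] = resource_name


# "Model 1 (refined)" is an invented second candidate for the Mold role.
project = Project({"Mold": ["Model 1", "Model 1 (refined)"]})

baseline = WorkflowInstance("baseline", project)
baseline.choose("Mold", "Model 1")

variant = WorkflowInstance("variant", project)
variant.choose("Mold", "Model 1 (refined)")

selections = {w.name: w.selection["Mold"] for w in (baseline, variant)}
```

Both variations share the same project and roles; only the per-instance selection differs.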

Bob_Obara · May 13, 2021, 10:05pm

@johnt @C_Wetterer-Nelson @Ryan_Krattiger @dcthomp @aron.helser

Comments are definitely welcome!

C_Wetterer-Nelson · May 14, 2021, 4:06am

Hey Bob, this is great! The figures are especially helpful. I feel like the graph structure of the project and workflow is readily apparent, and sort of implies a graph editor GUI a la those found in Blender or Unreal Engine… But I don’t want to get ahead of myself.

I’m curious about the representation of Tasks in the pantheon of data structures present in SMTK. From the outset, a Task smells an awful lot like an SMTK Operation with its inputs and outputs, but with the key difference that a Task is persistent, holding on to its links and states (and of course, ought to be serialized when a Project is saved to disk). Do you imagine a Task being able to live on its own, outside a Project (a Resource), or something intrinsically tied to a Project (more like a component), or do you think there is space for something like an Operation Manager/Registrar for Tasks that takes into account all the caveats above?

Thanks!

johnt · May 14, 2021, 1:38pm

This is all good and captures where I think we should be heading. Yes, the recent discussions about smtk::project are clearly wading into the “workflow” space. I hope we can use them to steer the project roadmap without getting too mired in workflow management.

With regard to project roles, I initially see them as an optional feature for users, essentially an alias or tag they can apply to some or all of their project resources for semantic reference. Using roles to predefine workflows or connect workflow elements is a longer-term feature.
In the “Task - A Workflow’s Building Block” section, you describe an approach to have the task “fill in” the information to a downstream task(s). That’s a fair leap, moving from one reusable software component for the generic Truchas task to 3 specialized codes for Truchas-Preheat, Truchas-Pour, and Truchas-Solidify. I would expect these specializations to be unique to individual users and organizations. Coming up with a conceptualization for what I call simulation assets – user data that is not represented by smtk resources – would be a first step toward supporting something like that.
In the last section “Multiple Runs/Variations”, I don’t foresee there being a multiplicity of concurrent workflows as a primary use case, but instead a series of variations to one workflow based on the end user objectives (but maybe that is what the figure is implying). In that sense, simulation development is analogous in many ways to the software development we do. And it would be useful to have analogous tools to capture and annotate these variations, the same way we use git and release notes to track software evolution.
As for Corey’s question, we have started discussing how to prototype a Task class to begin expanding smtk::project. I think we agree it makes sense to use an attribute resource for internal storage for many of the same reasons that smtk operations do. But it is not clear whether a Task class should be implemented as (a) a regular C++ class like operations, (b) an smtk::resource::Component subclass owned by an smtk::project::Project, or (c) a subclass of smtk::attribute::Resource itself.

Bob_Obara · May 14, 2021, 1:42pm

You definitely hit the nail on the head in terms of the similarity between a Task and an Operation. The main conceptual difference is that a Task is carried out by the user while an Operation is a computational action. For example, in a Truchas Workflow, there is currently an implicit single “Task” - Define and Export a Truchas Simulation. Based on the conceptualization so far, a Task could exist outside of a Project and a Project does not have to contain explicit Tasks, as exemplified by the current version of ModelBuilder. However, combining the two should dramatically improve the User Experience.

C_Wetterer-Nelson · May 14, 2021, 9:41pm

I’d be interested to get opinions from simulation practitioners who aren’t also developing their simulation codes. Having a versioning tool a la Git makes a lot of sense to me, as someone who uses Git daily, but I’m nervous about falling into the “we have a hammer so every problem is a nail” situation. It’s possible many users of simulation codes don’t organize themselves and their work in a way that suits a Git-style workflow, and orienting themselves that way may involve a cumbersome learning curve. It seems like the workflow concept presented here could support a mimetic Git work style, but we may not need all the trappings that such a style implies in order to support other, potentially more freeform working styles as well.

Bob_Obara · May 14, 2021, 10:16pm

@chart3388 @amuhsin @Aaron @waj334 - FYI

Neil_Carlson · May 14, 2021, 10:29pm

I’m pleased to see Truchas providing some fodder for your development.