Design Proposal for Changing Tasks Workflows (Major Changes)

Bob_Obara · March 18, 2024, 9:18pm

This thread is for discussing possible major changes to how tasks are connected via adaptors.

How Task Workflows are currently modeled in SMTK.

In SMTK 24.01, a workflow is represented as a set of Tasks that are connected using dependencies and adaptors as shown below.

In this case the red arrow represents a dependency while the thick blue arrow represents an adaptor. The thickness of the arcs indicates that dependencies are relatively lightweight and simply transfer the "Task State" (Completed, Completable, Unavailable, etc.) of Task 1 to Task 2. An adaptor, on the other hand, can do a lot more work: it takes a portion of Task 1's Data State and uses it to modify Task 2's Data State. As a result, there can be many types of Adaptors, since each pair of Task types will require its own adaptor class.

Some Downsides of the Current Design

Data Transfer is Unclear

Since the logic for how the Data State of Task 1 influences the Data State of Task 2 is defined by the Adaptor, it is unclear how Task 1 affects Task 2.

Impossible to have Simple “Wire” Adaptor Class

There are times when a designer only wants to define a simple relationship between two Tasks, for example to “send” an Attribute Resource being manipulated by one Task to another Task that will also manipulate the resource. In the current design, Tasks do not provide any API for manipulating their Data State that is common to all types of Tasks, so a generic “wire” adaptor cannot be written.

Dealing with Group Tasks

Perhaps the biggest issue is with respect to Group Tasks. Let’s assume that Task 2 in the above example is a Group Task.

Let’s consider Adaptor A1, which connects Task 1 and Task 2 and configures the Data State of Task 2. Task 2 is in turn connected to a child Task 2A via Adaptor A2, which then configures that task, and so on until Task 2B, which again configures Task 2 via A3. Finally, Task 3 is configured by A4. Note that in theory, each time Task 2’s Data State is manipulated (via A1 or A3), Adaptors A2 and A4 execute.

Note that all of the above shortcomings get magnified when multiple adaptors are coming into the same Task.

New Approach - Adding Interfaces to Tasks

One of the drawbacks of the current design is that Tasks do not share a common interface API and, as a result, adaptors have to know the type of the Tasks they are modifying (as well as potentially the context in which they are modifying them). This approach adds the concept of Task Ports, which all Tasks can have. A Port (which can be an input, output, or proxy; more on proxy ports later) provides access to certain (read/write) information of the Task it is connected to. Let’s reimagine the first figure using Ports.

In this case Task 1 has an output Port named P1 and Task 2 has an input Port named P2 and an output Port named P3. The logic for how a Task is modified based on information sent to its Port is now contained within the Task, allowing Adaptors in some cases to be simple "wire" connections between Ports. This does not rule out the possibility of Adaptors that perform more complex connections. For example, an Adaptor could convert the information provided by an output Port into a format required by another Task's input Port. Note that dependencies still connect Tasks and not Ports.
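To make the "wire" idea concrete, here is a minimal sketch under stated assumptions. `TaskSketch`, `PortSketch`, and `WireAdaptor` are hypothetical stand-ins invented for illustration (they are not SMTK classes), and the data passed through a port is reduced to a string. The point is only that the adaptor forwards data without knowing anything about the task types on either end.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are not SMTK classes.
struct TaskSketch;

struct PortSketch
{
  std::string name;
  TaskSketch* owner = nullptr;
};

struct TaskSketch
{
  std::string produced;     // what the task offers on its output port
  std::string lastReceived; // the task's "Data State" in this toy model

  // The task, not the adaptor, decides how incoming data changes its state.
  void ingest(const PortSketch& /*port*/, const std::string& data) { lastReceived = data; }
  std::string produce(const PortSketch& /*port*/) const { return produced; }
};

// A "wire" adaptor knows nothing about task types; it only forwards data.
struct WireAdaptor
{
  PortSketch* from = nullptr;
  PortSketch* to = nullptr;

  void transfer() const
  {
    assert(from && to && from->owner && to->owner);
    to->owner->ingest(*to, from->owner->produce(*from));
  }
};
```

An adaptor that converts between formats would differ only in what happens between `produce()` and `ingest()`.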

Proposed Structure of Task Ports

Ports have the following structure:

Name - unique w/r to the Task it belongs to
Label - string to be displayed in graphical UI’s
Raw Task Pointer - points to the Task it belongs to
Direction (Input, Output)
Can support multiple connections (probably input ports only)
Should they know the adaptors using them?

In addition, there should be a way to do the following:

Connect Ports with Adaptors
Process Data by the Task given to an input Port
Send Data to Adaptors connected to an output Port
Ability to determine if 2 Ports can be connected by a specific type of Adaptor
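The two lists above could be sketched roughly as follows. This is an illustration only: `PortSketch`, `canConnect()`, and `connect()` are hypothetical names, and the connection rule (output to input, with multiple connections allowed only where a flag says so) is an assumption drawn from the bullets rather than a settled design. The sketch also sidesteps the open question of whether ports should know their adaptors by storing peer ports directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the fields listed above; not an SMTK API.
enum class Direction
{
  Input,
  Output
};

struct TaskSketch; // only a raw pointer to the owning task is stored

struct PortSketch
{
  std::string name;  // unique within the owning task
  std::string label; // shown in graphical UIs
  TaskSketch* owner = nullptr;
  Direction direction = Direction::Input;
  bool allowMultiple = false;           // probably true only for input ports
  std::vector<PortSketch*> connections; // peers on the other end of each link
};

// Assumed rule: connect an output to an input, and respect allowMultiple.
inline bool canConnect(const PortSketch& from, const PortSketch& to)
{
  if (from.direction != Direction::Output || to.direction != Direction::Input)
  {
    return false;
  }
  return to.allowMultiple || to.connections.empty();
}

inline bool connect(PortSketch& from, PortSketch& to)
{
  assert(&from != &to); // a port cannot be wired to itself
  if (!canConnect(from, to))
  {
    return false;
  }
  from.connections.push_back(&to);
  to.connections.push_back(&from);
  return true;
}
```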

Ports and Group Tasks

When Ports are added to a Group Task, they would be added as a pair of Ports (one for the outside world and one for the Tasks within the Group). Data given to a Group Task’s input Port would be forwarded to the internal output Port, which can be connected to the input Ports of the child Tasks as shown below.

Here the child Task Ports are colored instead of labeled. As you can see, information flow is easier to follow here than in the earlier figure. The user can see that the output Port of Task 2A is connected to the input Ports of Task 2C and Task 2B, while the output of Task 2D is connected to a different Port of Task 2B. The pairs of Ports (P2&P2’ and P3&P3’) are shown with dotted lines connecting them. Note that unlabeled Ports of the same color share the same direction but are not the same Port.
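The paired-port forwarding described for Group Tasks might look something like the sketch below. `ProxyPortPair` and `ChildInput` are hypothetical names, and port data is again reduced to a string. The assumption illustrated is that whatever arrives at the outer Port (say P2) is handed unchanged to every child input Port wired to the inner Port (P2’).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: a child task's input port that records what it receives.
struct ChildInput
{
  std::string name;
  std::vector<std::string> received;
};

// Hypothetical sketch: a Group Task's outer port paired with an inner port.
struct ProxyPortPair
{
  std::string outerName;             // e.g. "P2", seen by the outside world
  std::string innerName;             // e.g. "P2'", seen by the children
  std::vector<ChildInput*> children; // child input ports wired to the inner port

  // Data arriving at the outer port is forwarded unchanged to every child.
  // Returns the number of children that received the data.
  std::size_t forward(const std::string& data) const
  {
    for (ChildInput* child : children)
    {
      assert(child != nullptr);
      child->received.push_back(data);
    }
    return children.size();
  }
};
```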

Modeling information/data-flow through ports

In this design, ports do not hold any state information about the data that was either given to or sent out from a port. Data is either consumed by the task itself or sent via an adaptor to another task’s port to be consumed. However, we do need a mechanism to represent the information being “sent” through the Port. We outline two possible approaches below.

Representing port data with Attributes

Attributes are already used to represent both simulation information and operation parameters, so it may not be much of a stretch to use them to represent Port Data. In theory, the project could hold an internal attribute resource containing all of the different types of Port Data, represented as a set of Attribute Definitions. A Project could be given a default set of Port Data Definitions that could then be changed by the project designer. When a new Port is added to a Task, it would be given one of the Port Data Definitions. Note that when data is generated for a Port, a new Attribute would be created from the Port’s Definition, filled in, and then detached from the internal Attribute Resource of the Project so that it is destroyed when no longer needed.

Determining if ports can be connected

Using a Port’s Attribute Definition, we could determine whether two Ports can be connected by checking whether the input Port’s Definition is a Definition derived from that of the output Port (an isA check). Similarly, we could solve a potential problem with the input Ports of multiple Group Task children being connected to the same proxy Port: in this case the proxy Port’s Attribute Definition would have to be the most derived Definition for which isA holds for all of its siblings’ Port Data.
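Both checks in this paragraph reduce to walking a derivation chain. The sketch below uses a hypothetical `DefinitionSketch` with a single base pointer rather than the real attribute Definition class, so treat it as an illustration of the logic only. `isA()` answers the connectability question for whichever port is chosen as the ancestor side, and `commonBase()` finds the most derived definition shared by two siblings, which is what a proxy Port would need.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an attribute definition with single inheritance.
struct DefinitionSketch
{
  std::string type;
  const DefinitionSketch* base = nullptr;
};

// True when def is ancestor itself or derives from it.
inline bool isA(const DefinitionSketch* def, const DefinitionSketch* ancestor)
{
  assert(ancestor != nullptr);
  for (; def != nullptr; def = def->base)
  {
    if (def == ancestor)
    {
      return true;
    }
  }
  return false;
}

// Most derived definition that both a and b derive from (nullptr if none).
inline const DefinitionSketch* commonBase(const DefinitionSketch* a, const DefinitionSketch* b)
{
  for (; a != nullptr; a = a->base)
  {
    if (isA(b, a))
    {
      return a;
    }
  }
  return nullptr;
}
```

For more than two siblings, `commonBase()` can be folded pairwise over the list.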

Checking port data validity

In this case we would just use the Attribute validity mechanism.

Representing port data with ad hoc classes

In this case we would create a new PortData base class; each subclass would be derived from it and could contain information in any form. Unlike attributes, this would allow the data to include pointers, geometric search structures, and helper methods. On the other hand, it does not enforce a consistent schema that can be validated. The Task class would provide methods to push/pull PortData* objects and dynamic-cast the pointer to the subclass it expects.

class PortData
{
  smtkTypeMacroBase(PortData);
};

class ConcretePortData : public PortData
{ // Any type of data can go here:
  smtkTypeMacro(ConcretePortData);
  smtk::resource::Resource* m_resource;
  vtkSmartPointer<vtkDataObject> m_shape;
};

class Port : public smtk::resource::Component
{
  Task* m_parent; // … and name, UUID, etc.
  std::unordered_set<smtk::string::Token> m_dataTypes;
};

class Task
{
  …
  PortData* produceDataForPort(Port* port) const;
  bool ingestDataForPort(Port* port, PortData* data);
  const std::vector<Port*>& ports();
};

Task subclasses would own

Determining if ports can be connected

Because smtkTypeMacro() now defines methods matchesType(), classHierarchy(), and generationsFromBase(), we can check whether an upstream port’s expected PortData type exactly matches or inherits from the downstream port’s type.

Tasks that have a fixed set of ports will explicitly populate them with the types of data they can accept. Tasks (such as the Group task) that can add/remove ports at run-time would only construct ports given either a sibling or child task, from which the Port’s m_dataTypes values could be set.
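As a rough illustration of the ad hoc approach, the sketch below shows a task rejecting port data of the wrong subclass. It uses plain C++ `dynamic_cast` as a stand-in for the type-introspection methods mentioned above, and every name in it (`PortDataSketch`, `ResourceDataSketch`, `GeometryDataSketch`, `ConsumerTaskSketch`) is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical PortData hierarchy; plain C++ RTTI stands in for the
// type-introspection methods provided by the SMTK type macros.
struct PortDataSketch
{
  virtual ~PortDataSketch() = default;
};

struct ResourceDataSketch : PortDataSketch
{
  std::string resourceName;
};

struct GeometryDataSketch : PortDataSketch
{
  int cellCount = 0;
};

// A task that only understands ResourceDataSketch on its input port.
struct ConsumerTaskSketch
{
  std::string configuredWith;

  // Returns false (and leaves state untouched) when the data is the wrong type.
  bool ingest(const PortDataSketch* data)
  {
    assert(data != nullptr);
    const auto* resource = dynamic_cast<const ResourceDataSketch*>(data);
    if (resource == nullptr)
    {
      return false;
    }
    configuredWith = resource->resourceName;
    return true;
  }
};
```

Checking types up front when ports are connected, as described above, would let the UI refuse a bad connection before any data reaches `ingest()`.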

Open Issues

Operation Tasks

One of the issues that needs to be resolved is how Operation Tasks would work within this new workflow design. Currently, an adaptor can take the attribute information from one Task (say, a FillOutAttribute Task) and modify the parameters of the internal operation within an Operation Task.

We could simply create a Port per Operation Parameter but that could make things relatively complex.

Possible Solution - Attribute Mapping Adaptors

In the new design the adaptor would probably connect the attribute resource output Port of the FillOutAttribute Task to the input Port of the Operation Task. However, unlike the previous examples where the adaptor would simply take the PortData of the output Port and call the setPortData of the other Task using the input Port, the adaptor would need to have access to the input PortData in order to manipulate it directly.

A new type of adaptor that would map attributes/items from one Attribute Resource to another Attribute Resource would need to be designed.

Note: In this design an input Port may be connected to multiple output Ports, with each connection setting a subset of the Operation Task’s parameters.
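One way to picture an attribute mapping adaptor is sketched below. It is deliberately simplified: attributes are modeled as flat name-to-value maps (`ItemMap`), and `AttributeMappingAdaptorSketch` is a hypothetical name. The sketch only illustrates the point in the note above, that several adaptors can each write a different subset of the operation's parameters while leaving the rest untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: attributes modeled as flat item-name -> value maps.
using ItemMap = std::map<std::string, std::string>;

// Copies selected items from a source attribute into operation parameters,
// renaming them according to a list of (sourceItem, parameterItem) pairs.
struct AttributeMappingAdaptorSketch
{
  std::vector<std::pair<std::string, std::string>> mapping;

  // Returns the number of parameters written; unmapped parameters and
  // mappings whose source item is absent are left untouched.
  std::size_t apply(const ItemMap& source, ItemMap& parameters) const
  {
    std::size_t written = 0;
    for (const auto& entry : mapping)
    {
      assert(!entry.second.empty()); // every mapping must name a target parameter
      auto it = source.find(entry.first);
      if (it != source.end())
      {
        parameters[entry.second] = it->second;
        ++written;
      }
    }
    return written;
  }
};
```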

Bob_Obara · March 19, 2024, 9:43pm

@rohith @justin.2.wilson @Aaron FYI

dcthomp · March 20, 2024, 11:43am

There are a variety of similar, workable ways to define ports.

The proposed port definition above assigns a name, label, and direction (or orientation) to each port.

class Port : public smtk::resource::Component
{
public:
  std::string name() const override;
  std::string label() const;

  enum PortOrientation
  {
    Input,
    Output
  };
  PortOrientation orientation() const;

  // What type of data is accepted (either
  // an attribute::Definition type-name or
  // smtk::common::typeName<T>() for the
  // PortData subclass accepted).
  smtk::string::Token dataType() const;
};

The proposal above doesn’t specify how connections among ports are modeled. If we have a port orientation/direction, then we need to provide some method to traverse the connections:

class Port
{
  …
  const std::set<PersistentObject*>& connections() const;
};

However, it is also possible that instead of an orientation, every port could simply provide two sets of connections (input and output). The task that owns a particular port would pay attention to only one of them (either the input or the output).

class AlternativePort : public smtk::resource::Component
{
  std::string name() const override;
  std::string label() const;
  smtk::string::Token dataType() const;
  enum Orientation
  {
    Input,
    Output
  };
  const std::set<PersistentObject*>& connections(Orientation o) const;
};

This would make the Group task easier to model as a single port rather than a pair of ports.