Supporting Units in SMTK's Attribute Resource

Current “Support” of Units

In SMTK 23.01, a designer can set the units of any ValueItemDefinition, which is the base class of the String, Integer, and Double item definitions. The units are stored as a string and are used in the UI to indicate the units in which the value is assumed to be specified.

There is now a call to take this to the next level and formally support units and unit conversion.

How Units Could Be Supported

When the value of a DoubleItem is set with the current setValue(double) method, the value would be assumed to be in the default units specified by the designer. If the value is specified with its own units and they differ from the default, a unit conversion would take place. In either case, the value() method would always return the value in the default units. When a conversion occurs, there would also need to be a way to access the value the user entered and the units they specified.

Specifying a value with units

We could support specifying the value and units either with two separate widgets (a line edit and a combobox) or with a single line edit that has a validator and completer.

Screenshot 2023-02-22 at 4.43.16 PM

Note that the second approach would require the UI to somehow display the default units (in this case I’ve added them to the label).

The UI should also provide a mechanism to display the value in the default/preferred units. This could be as simple as hovering over the widget to show a tooltip, or providing a checkbox to display the converted form. The former would take up less space.
Screenshot 2023-02-22 at 5.03.23 PM

Note: if no default units are specified, SMTK would assume the value is unitless and no unit information would be displayed.

Integration of a unit conversion library

SMTK would delegate the task of doing the conversion to an existing Open-Source library such as:

Proposed API Changes

  1. Move the units specification to DoubleItemDefinition instead of ValueItemDefinition, unless folks can think of use cases where Integers and/or Strings can also benefit from having unit support.
  2. Add one of the two possible APIs to DoubleItem to specify a value with units:
    • setValue(double val, const std::string& units)
    • setValue(const std::string& valueWithUnits)
  3. Add one of the two possible APIs to DoubleItem to retrieve the original information:
    • double value(std::string& units) const
    • std::string value() const - Note that we could also use the existing valueAsString method
  4. Add the ability to save the original entries in the XML/JSON file formats
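
For the single-string form, setValue(const std::string& valueWithUnits), the item would first have to separate the number from the unit text. The following is only a minimal sketch of that step, assuming the number always comes first; splitValueAndUnits is a hypothetical helper, not an SMTK function, and it does not validate the unit text at all.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of SMTK): split text such as "12.5 cm"
// into the number 12.5 and the unit string "cm".
// Returns false when the text does not begin with a number.
bool splitValueAndUnits(const std::string& text, double& value, std::string& units)
{
  const char* begin = text.c_str();
  char* end = nullptr;
  const double parsed = std::strtod(begin, &end);
  if (end == begin)
  {
    return false; // nothing numeric at the start of the text
  }

  // Everything after the number is treated as the unit text; trim blanks.
  const std::string rest(end);
  const std::size_t first = rest.find_first_not_of(" \t");
  const std::size_t last = rest.find_last_not_of(" \t");

  value = parsed;
  units = (first == std::string::npos) ? std::string() : rest.substr(first, last - first + 1);
  return true;
}
```

An empty units string on return would correspond to the "value is in the default units" case described earlier.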

Update May 30, 2023

Developing New Units Library

In the original post we had said that we would be using an existing open source units library. We took a closer look at:

Both of these libraries had conversion issues when dealing with fractional powers. Though the cases in which these are needed are very few, the fact that the libraries process them incorrectly means that if a fractional power is specified in error, the software will in some cases not catch it.

In the case of Boost::units, the focus is on compile-time support rather than the run-time support that is needed in this situation.

As a result we are writing our own open source units library.

API Changes

  • DoubleItemDefinition - no changes are needed since the concept of units is supported by ValueItemDefinition
  • DoubleItem
    • value(int i) - will return the item’s ith value in the units specified in its definition. If the definition’s units are empty, then units are not supported and the value is returned as specified, with no conversion.
    • specifiedValue(int i, std::string& units) - will return the item’s ith value in the units specified in the setValue(…) methods. In this case no conversion is performed.
    • convertValue(int i, const std::string& units, double& val) - will attempt to return the value in the specified units. If no conversion is possible, it will return false.
    • setValue(int i, double val) - assumes that the ith value being specified is in the same units as its definition.
    • setValue(int i, double val, const std::string& units) - will convert the value, if needed, to the definition’s units. If there is no possible conversion between the units specified in the definition and those passed into setValue, no assignment is performed and the method will return false.
    • setStringValue(int i, const std::string& val, const std::string& units)
    • setValueAsString(int i, const std::string& valAndUnits)
      • Edge Cases
        • Either the definition units or the specified units are missing: in this case the method will fail.

In terms of member variables:

  • m_value (current) - the value of the item in the units of the definition.
  • m_specifiedString (new) - the specified value as a string
  • m_specifiedUnits (new) - the specified value of the units (maybe)

@Aaron @justin.2.wilson @rohith @dcthomp @johnt FYI

A more compact form might be:

Screenshot from 2023-02-22 17-53-45

After a user enters a number with units, the right-hand label would (optionally, based on item-configuration) display the converted value:

Screenshot from 2023-02-22 17-51-54

If configured not to display a preview of the converted number, then a tooltip would always be available:

Screenshot from 2023-02-22 17-57-04

The right-hand label could also be cycled by a user click to display/hide the converted value.

Regarding the 2 options

  • I think 2 separate fields would be more intuitive for users. Among the reasons is that it provides better clarity when the value and/or units are invalid.
  • I would also prefer that users get a dropdown list (or some equivalent fixed list) from which to choose the unit type, at least to start. (And maybe later we can add the “enter your own” as a future option based on user demand.) This would require new syntax in our xml and smtk file formats. We should come up with a scheme so that template authors don’t have to copy/paste a lot of lists throughout the xml file.
  • If we go with separate value & unit fields, should we update the value when the user changes units? I presume so, otherwise range checking would get “interesting”. But I might be overlooking some other consequences.
  • I am not sure we need to show the user the equivalent value in solver-input units, but the suggestion of a context menu or tooltip works for me. Anything that doesn’t eat real estate.

Also

  • It might be handy at startup if the user could optionally set a preferred unit system and smtk updates the attdef defaults.
  • I agree that we should move units from value items to just doubles. The only integer example I can think of might be a “time” item that could be set to an integer number of minutes, hours, or days, etc. We can handle that as a special case (2 items) if it ever comes up.

Another alternative that came up during conversations is to use the placeholder text to show units. This would eliminate the need for a second QLabel to the right of the QLineEdit.

Screenshot from 2023-02-28 01-42-16

The downside is that as soon as you start typing, the placeholder text disappears; if you forget the default units you will not be reminded of them. Either a context menu would have to show this or you would have to erase your entered value to see the placeholder text again. Here’s an example of what that might look like:

unit-proto-placeholder+context

Here is a preview of how a double item might look with a units dropdown.

  • To save time, the units list was provided by a Python library called “Pint”.
  • This library finds all aliases with the same dimensionality, hence the “statmho” item which, I now know, is a unit of electrical conductance with base dimensions that resolve to distance/time.
  • The list also includes some other items that are probably not applicable to most use cases (“speed_of_light” maybe). We should provide developers a way to exclude items from the list for a given dimension.
  • Some units libraries also have APIs to add new units/quantities. We could consider adding that feature too.

units-preview-mph-tab

One more follow-up, mostly fyi. I wrote a script to traverse the udunits2-common.xml file and extract all non-SI aliases, organized by “quantity” according to xml comment nodes in the file. I never came up with a good way to format the results, but settled on the following.

| Plane Angle | circle | cycle | degreeE | degreeN | degreeT | degreeW | degree_E | degree_N | degree_T | degree_W | degree_east | degree_north | degree_true | degree_west | degreesE | degreesN | degreesT | degreesW | degrees_E | degrees_N | degrees_T | degrees_W | degrees_east | degrees_north | degrees_true | degrees_west | grade | revolution | rotation | turn |

| Mass | apdram | apothecary_ounce | apothecary_pound | apounce | appound | assay_ton | avoirdupois_ounce | avoirdupois_pound | bag | carat | dr | dram | gr | grain | lb | long_hundredweight | long_ton | pennyweight | pound | scruple | short_hundredweight | short_ton | slug | ton | troy_ounce | troy_pound |

| Length | US_statute_mile | US_survey_feet | US_survey_foot | US_survey_mile | US_survey_yard | arpentlin | barleycorn | big_point | chain | fathom | feet | fermi | foot | ft | furlong | in | inch | international_feet | international_foot | international_inch | international_mile | international_yard | light_year | mi | micron | mil | mile | nmile | parsec | perch | pica | pole | printers_pica | printers_point | rod | yard | yd |

| Angular Velocity | cps | rotation_per_second | rotations_per_second | rpm | rps |

| Mass per unit time (includes flow) | perm_0C | perm_23C | perms_0C | perms_23C |

| Area | acre | circular_mil | darcy |

| Volume | BZ | Canadian_liquid_gallon | Tbl | Tblsp | Tbsp | UK_fluid_ounce | UK_liquid_cup | UK_liquid_gallon | UK_liquid_gill | UK_liquid_ounce | UK_liquid_pint | UK_liquid_quart | US_dry_gallon | US_dry_pint | US_dry_quart | US_fluid_ounce | US_liquid_cup | US_liquid_gallon | US_liquid_gill | US_liquid_ounce | US_liquid_pint | US_liquid_quart | acre_feet | acre_foot | barrel | bbl | board_feet | board_foot | bu | bushel | cc | cup | dry_pint | dry_quart | firkin | fldr | floz | fluid_dram | fluid_ounce | gallon | gill | liquid_cup | liquid_gallon | liquid_gill | liquid_ounce | liquid_pint | liquid_quart | oz | peck | pint | pk | pt | quart | register_ton | stere | tablespoon | tblsp | tbsp | teaspoon | tsp |

| Time | Gregorian_year | Julian_year | common_year | eon | fortnight | jiffy | leap_year | lunar_month | month | shake | sidereal_day | sidereal_hour | sidereal_minute | sidereal_month | sidereal_second | sidereal_year | tropical_month | tropical_year | week | year | yr |

| Volume per time | sverdrup |

| Acceleration | gravity | standard_free_fall |

| Force | dyne | force | force_gram | force_kilogram | force_ounce | force_pound | force_ton | gf | gram_force | grams_force | kgf | kilogram_force | kilograms_force | kip | lbf | ounce_force | ounces_force | ozf | pond | pound_force | poundal | pounds_force | ton_force | tons_force |

| Pressure, Stress | B_SPL | at | atm | atmosphere | barie | barye | cmH2O | cmHg | cm_H2O | cm_Hg | feetH2O | feet_H2O | feet_water | footH2O | foot_H2O | foot_water | ftH2O | fth2o | inHg | in_Hg | inch_H2O_39F | inch_H2O_60F | inch_Hg | inch_Hg_32F | inch_Hg_60F | inches_H2O_39F | inches_H2O_60F | inches_Hg | inches_Hg_32F | inches_Hg_60F | ksi | millimeter_Hg | millimeter_Hg_0C | millimeters_Hg | millimeters_Hg_0C | mmHg | mm_Hg | mm_hg | mmhg | psi | standard_atmosphere | technical_atmosphere | torr |

| Viscosity | St | poise | rhe | stokes |

| Energy, Work, Quantity of Heat | Btu | Btus | EC_therm | IT_Btu | IT_Btus | IT_calorie | TNT | US_therm | bev | cal | calorie | erg | therm | thermochemical_calorie | thm | ton_TNT | tons_TNT | watthour |

| Power, Radiant Flux | BW | Bm | UK_horsepower | VA | boiler_horsepower | electric_horsepower | horsepower | hp | metric_horsepower | refrigeration_ton | shaft_horsepower | ton_of_refrigeration | tons_of_refrigeration | voltampere | water_horsepower |

| Heat | clo |

| Thermodynamic Temperature | degF | degR | deg_F | deg_R | degreeF | degreeR | degree_F | degree_R | degree_fahrenheit | degree_rankine | degreesF | degreesR | degrees_F | degrees_R | degrees_fahrenheit | degrees_rankine | degsF | degsR | degs_F | degs_R | fahrenheit | °F | °R | ℉ |

For the record, I skipped a few quantities, labeled: Lineic Mass (related to fibers), Some “units” that make subsequent definitions easier, Electricity and Magnetism, Illumination, and Miscellaneous.

Further thoughts and opinions:

  • Although the Pint library is really good, I think we have to drop it from consideration for the SMTK units library. Using Pint would require embedding Python and pybind11 into the smtkCore C++ library, which is a Rubicon we shouldn’t be crossing.
  • I think there is general agreement that the application should be able to control the list of compatible unit names displayed in the dropdown box for a given DoubleItem. To do this, I suggest that applications optionally provide a .csv file that is read in at runtime. API and format are TBD, but for a strawman:
    • The .csv file could be applied to individual projects, but I would start with a global file.
    • I would use one row per quantity (e.g., length, mass, speed, force, …)
    • Maybe the first column could be the quantity label
    • And maybe the second column could be the canonical SI representation (e.g., meter-second^-1) even if that is not a supported alias. This would allow the smtk parsing code to check that each of the remaining columns match its dimensionality.
    • It might make sense to have 2 .csv files for (i) alias names and (ii) alias symbols TBD.
    • For convenience, SMTK should probably provide a function to dump out all of the unit names and symbols it recognizes in the same .csv format.
  • Software design specifics haven’t been given a lot of attention, but for a strawman:
    • I would start with a singleton smtk::common::DimensionalUnitsUtility class that would initialize the third-party units library and probably read/write the .csv files mentioned above.
    • It might make sense to also have a smtk::common::DimensionUnits class that would encapsulate the units functionality for both smtk::attribute::DoubleItem and whatever code we implement for the “specify output units” feature. We can decide when we get there…
    • On the UI side, it might also make sense to have a smtk::extension::qtDimensionalUnitsWidget class to handle the units display (editable combobox).
  • The overload(s) of DoubleItem::value() that include unit conversion need some way to return whether the conversion was valid or not. One possibility:
    double smtk::attribute::DoubleItem::value(
        int element, const std::string& units, bool& isValid) const

would require the calling code to check the isValid flag before using the return value.

  • Maybe there should be a function to return the value in whatever units the DoubleItemDefinition specifies. This would presumably always be a valid conversion.
    double smtk::attribute::DoubleItem::valueInDefinitionUnits(
        int element) const
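
The .csv strawman above (one row per quantity, the label in the first column, the canonical SI form in the second, aliases after that) could be read along these lines. This is only a sketch under those assumptions: QuantityRow and parseQuantityRow are hypothetical names, quoted fields are not handled, and the dimensionality check against the canonical SI column is left out because it would need the units library.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one row of the strawman .csv file.
struct QuantityRow
{
  std::string label;                // column 1: quantity label, e.g. "speed"
  std::string canonicalSI;          // column 2: canonical SI representation
  std::vector<std::string> aliases; // remaining columns: unit names to offer
};

// Parse one comma-separated row. Quoting/escaping is not supported.
// Returns false if the label or canonical SI column is missing.
bool parseQuantityRow(const std::string& line, QuantityRow& row)
{
  std::vector<std::string> fields;
  std::istringstream stream(line);
  std::string field;
  while (std::getline(stream, field, ','))
  {
    // Trim surrounding blanks (and a trailing carriage return, if any).
    const std::size_t first = field.find_first_not_of(" \t\r");
    const std::size_t last = field.find_last_not_of(" \t\r");
    fields.push_back(
      first == std::string::npos ? std::string() : field.substr(first, last - first + 1));
  }
  if (fields.size() < 2 || fields[0].empty() || fields[1].empty())
  {
    return false;
  }

  row.label = fields[0];
  row.canonicalSI = fields[1];
  row.aliases.clear();
  for (std::size_t i = 2; i < fields.size(); ++i)
  {
    if (!fields[i].empty())
    {
      row.aliases.push_back(fields[i]);
    }
  }
  return true;
}
```

A row such as "speed, meter-second^-1, mph, ft/s" would yield the label "speed" and two aliases.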

Instead of CSV can we do something more structured like JSON? It is super-easy to deserialize into structures with sets and maps, so having something like:

// A base holding names.
struct Entity
{
  smtk::string::Token m_name;
  smtk::string::Token m_symbol;
  std::unordered_set<smtk::string::Token> m_aliases;
  smtk::string::Token m_description;
};

// One of the seven or so base dimensions.
struct Dimension : Entity { };

// Dimensions mapped to their exponents:
using ExponentMap = std::map<Dimension*, double>;

// Any unit (composite or otherwise.)
struct Unit : Entity
{
  ExponentMap m_dimensions;
};

// The entire unit configuration file.
struct UnitSystem
{
  std::unordered_map<smtk::string::Token, Dimension> m_dimensions;
  std::unordered_map<smtk::string::Token, Unit> m_derivedUnits;
  std::map<ExponentMap, std::unordered_set<Unit*>> m_indexByDimension;
};

The UnitSystem’s m_derivedUnits map would index all the units by name, symbol, and alias. It could even be a multi-map. The JSON equivalents to all these would be very straightforward.
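
As a rough illustration of how such an index could drive the dropdown of compatible units, here is a simplified, self-contained variant. It is a sketch, not the proposed structs: it uses std::string instead of smtk::string::Token, keys the exponent map by dimension name rather than by Dimension pointer so that the ordering is well defined, and the add() and conformalTo() helpers are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Dimension name mapped to its exponent (simplification of ExponentMap).
using ExponentMap = std::map<std::string, double>;

struct Unit
{
  std::string m_name;
  std::string m_symbol;
  std::vector<std::string> m_aliases;
  ExponentMap m_dimensions;
};

struct UnitSystem
{
  // Units owned by canonical name; std::map keeps element addresses stable.
  std::map<std::string, Unit> m_units;
  // Name, symbol, and every alias all point at the same unit.
  std::unordered_map<std::string, const Unit*> m_lookup;
  // Dimensionality mapped to the names of all units that share it.
  std::map<ExponentMap, std::set<std::string>> m_indexByDimension;

  void add(const Unit& unit)
  {
    const auto result = m_units.emplace(unit.m_name, unit);
    const Unit* stored = &result.first->second;
    m_lookup[stored->m_name] = stored;
    m_lookup[stored->m_symbol] = stored;
    for (const auto& alias : stored->m_aliases)
    {
      m_lookup[alias] = stored;
    }
    m_indexByDimension[stored->m_dimensions].insert(stored->m_name);
  }

  // Names of all units with the same dimensionality as the given key,
  // where the key may be a name, symbol, or alias. Empty if unknown.
  std::set<std::string> conformalTo(const std::string& key) const
  {
    const auto unit = m_lookup.find(key);
    if (unit == m_lookup.end())
    {
      return {};
    }
    const auto group = m_indexByDimension.find(unit->second->m_dimensions);
    if (group == m_indexByDimension.end())
    {
      return {};
    }
    return group->second;
  }
};
```

After adding a meter and a foot (both with length exponent 1) and a second, conformalTo("feet") would return the meter and foot entries only.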

I wrote a simple PEGTL parser on Saturday that can be used to reject the ambiguous/confusing expressions that udunits2’s parser accepts. It is here for anyone curious.

You can compile it by placing it in SMTK’s source tree and pointing it to your cmb-superbuild like so:

 c++ -g -O0 -Wall -Wextra -std=c++14 \
  -I. -I../src -I /path/to/superbuild/install/include/ \
  -I /usr/include/udunits2 \
  -isystem /usr/include/qt5 \
  -isystem /usr/include/qt5/QtWidgets \
  -isystem /usr/include/qt5/QtGui \
  -isystem /usr/include/qt5/QtCore \
  -L /usr/lib64/qt5 -lQt5Gui -lQt5Widgets -lQt5Core \
  -ludunits2 -o unit-parser-proto \
  ../src/unit-parser-proto.cxx

As you can see when you run it, it will attempt to parse what you type into a QLineEdit and change the background color to

  • pink when there’s a parse failure
  • red when units are parsed but do not conform to length.

It also prints a bunch of debug crud to the console.

I proposed the .csv format so that developers could configure their units features in a spreadsheet. Maybe we could support both a “simple” version with .csv and an “advanced” version with json?

This works for single units (m, kg, s) but it’s not obvious how to extend, for example, from “m” to “m/s”. Would we need the parser to extract each unit and prefix? That might get tricky.

The parser already recognizes units and operations; it just does nothing with them, since the demo uses udunits to re-parse the validated string and perform the conversion from whatever was entered to meters. Really, we are not that far from having our own unit library.

I still have gaps in understanding udunits. If it’s incremental, can you update your prototype to change “meter” to “meter/sec”?

Sure, just remove the lines that create the destinationUnit variable and replace with

auto* destinationUnit = ut_parse(unitSystem, "m/s", UT_UTF8);

As we have been evaluating the various libraries available for unit conversion, I have been underwhelmed by their feature sets compared to our desired features. Since the unit libraries we are considering would add dependencies to smtkCore, I am also unhappy with some of the luggage. Finally, the example code in the reply above implements a parser that, in effect, provides a shadow unit system on top of udunits2 to prevent its native parser from accepting confusing inputs.

So, I took a little personal time last week and started my own unit library. With 2-3 more days of work, it could do everything we need (parsing dimensions, parsing units defined in terms of dimensions, parsing measurements of quantities that include units, unit conversion, and completion/prompting of units that are conformal to some desired/destination unit or dimension).

Some progress on the homebrewed units library:

Hopefully, unit conversion will be implemented by the end of the week.

I am considering an eventual dependency on Eigen to provide tools for Buckingham Pi theory. If you have an opinion on whether that should be a separate (very small) repository or library rather than included in the units repo/library, let me know.

Update (2023-06-11):
The units repo now provides unit conversion and some tests. There is also

  • a Converter object so that you can perform bulk conversion of values that all share the same units.
  • a command-line conversion tool named unit-converter.

The library has a dependency on Eigen in order to simplify the process of finding the sequence of relevant conversions needed for a user request. More details are in this brief slide set that covers the basic design decisions of the unit library.

Update (2023-07-06):
With this upcoming MR, the units library will properly handle multiple parenthetical expressions. Because it required a large change to the parser, there are additional features. The full list of changes is

  • Multiple parenthetical expressions are now handled. Examples
    • kg (m / s)^2 / (hr / in)
    • ((m/s)^2 kg)^2/N
  • Units and prefixes may have spaces in their names/symbols and still be parsed properly (e.g., it is possible to add a unit named pounds per square inch with symbol psi and use the long name in expressions).
  • A measurement may include addition and subtraction as long as all the operands are convertible.
    • The output units of the expression will match the first (left-most) term in the series. For example:
      2.3 Watt / meter^2/ Kelvin + 2.40326e-06 hp / inch^2 / degF
      will be output as 7.3 W·K^-1·m^-2.
    • If you enter a measurement with incompatible or unconvertible units, the parser will report failure and the returned Measurement will have a NaN value with empty (dimensionless) units.
    • Measurements may also be multiplied or divided. For example, parsing the string 5 m * 10 m will generate a measurement of 50 m^2.
  • Within any unit expression (i.e., the outermost group of terms or any parenthetical sub-expression), once the division operator (/) appears, all following operands are treated as part of the denominator in that (sub)expression.
    • This means kg m / s / s and kg m / s * s and kg m / s^2 all evaluate to the same output unit (Newtons).
    • You can use parenthetical expressions to have multiple numerator and denominator groups: (kg m / s^2) (m / s) evaluates to kg m^2 s^-3.
  • See the testParenthesisParsing() function in units/testing/testBasics.cxx for more examples.
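
The denominator rule described above can be illustrated with a few lines of code. This is not the units library's parser, just a minimal sketch of the rule for flat expressions: flatExponents is a hypothetical function that handles no parentheses and no prefixes, and assumes integer exponents written directly after the unit (for example s^2).

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Return unit symbol -> net integer exponent for a flat unit expression.
// Once '/' has been seen, every later operand counts as a denominator term.
std::map<std::string, int> flatExponents(const std::string& expression)
{
  // Surround operators with blanks so they become separate tokens.
  std::string spaced;
  for (const char c : expression)
  {
    if (c == '/' || c == '*')
    {
      spaced += ' ';
      spaced += c;
      spaced += ' ';
    }
    else
    {
      spaced += c;
    }
  }

  std::map<std::string, int> exponents;
  bool inDenominator = false;
  std::istringstream tokens(spaced);
  std::string token;
  while (tokens >> token)
  {
    if (token == "/")
    {
      inDenominator = true; // sticky for the rest of the expression
      continue;
    }
    if (token == "*")
    {
      continue; // multiplication does not switch back to the numerator
    }
    int power = 1;
    const std::size_t caret = token.find('^');
    if (caret != std::string::npos)
    {
      power = std::atoi(token.c_str() + caret + 1);
      token.erase(caret);
    }
    exponents[token] += inDenominator ? -power : power;
  }

  // Drop units whose exponents cancelled out.
  for (auto it = exponents.begin(); it != exponents.end();)
  {
    if (it->second == 0)
    {
      it = exponents.erase(it);
    }
    else
    {
      ++it;
    }
  }
  return exponents;
}
```

Under this rule, kg m / s / s, kg m / s * s, and kg m / s^2 all produce the same exponents: kg 1, m 1, s -2.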