MISMO V3 Notes: MISMO Myth Buster: We cannot use MISMO Because…

MISMO Myth Buster: We cannot use MISMO Because…

Introduction

“We cannot use MISMO because we are modern people and prefer Jason. XML is so year 2000.”

“We cannot use MISMO because we are people rooted in proven traditions of COBOL copybooks and DASD.”

“We cannot use MISMO because we are people who know and love database systems”

Over the next few blogs we will cover each of these variants of the myth in detail. Before we do the database reason we need some background

Background

MISMO publishes the Reference model (RM) as an XSD file because in that form it is directly usable by the most MISMO constituents. It could just as easily be published in UML or the proprietary model format of some modeling platform. We surveyed the largest organizations in the industry and found no consensus on a modeling platform. Not all modeling platforms accept UML. Most small organization had only passed familiarity with UML if any. So MISMO published its model as and XML Schema.

What is an RM? Here are the things that a reference model should do:

Define data points giving them names, definitions and data types.
Gather together data points that support a similar business concept using normalization rules.
Detail the cardinality of the sets of data points ( Optional, Required, May Occur many times)
Detail the relationship between business concepts.

The MISMO RM does all of this.

One thing that make the XML model a little less general than say a relational model is that an XML model is a tree. Therefore the MISMO RM could be implemented in any implementation platform that supports a tree structure. A tree structure is so traditional you can build it in COBOL copy books on DASD. A tree structure is a proper subset of the relational model so it can be implemented in an SQL relational database. Jason can express tree structures.

The Database case

Suppose you we going to create a brand new database based on the MISMO RM. For now we will accept that you want to do that and leave to a later blog the topic of choosing to use the DB to store the XML or choosing to shred the XML data points into columns of tables. The support for XML columns in Oracle and MS SQL server and others is very good and faster than you probably expect. But for now we will discuss the shredding approach.

Defining the logical level of a database implementation involves these steps.

Define data points giving them names, definitions and data types.
- In SQL this means writing DDL (Data Definition Language)
Gather together data points that support a similar business concept using normalization rules.
- The DDL defines tables and the columns with their data types.
- All of the MISMO data types have SQL equivalent data types
- The MISMO RM is normalized. The XML container elements that contain data points can be viewed as tables.
- Relational model normalization requires that each row of a table have a primary key.
  - If you must have natural keys create a composite key from the natural key for the table. This may take some figuring out.
  - If you are OK with synthetic keys add an auto-increment column.
- An XSLT filter can be written to read the MISMO RM in XSD form and produce the DDL to create all the tables with synthetic keys. (Do not expect MISMO to publish this XSLT or DDL)
Detail the cardinality of the sets of data points ( Optional, Required, May Occur many times)
- In the relational model foreign keys connect table.
- The MISMO RM XML Tree uses 1:1 and 1:many relationships by containment. These are expressed as contained elements that contain other containers.
- Build the foreign key DDL by following the XML container relationships.
- An XSLT can be written to read the MISMO RM XSD representation and produce the DDL.
Detail the relationship between business concepts.
- As above the XML tree defined the relationships between tables.

Implementation specific Roadblocks and Refinements

Column Name Length

Depending on your choice of SQL database there may be some additional work to do. MS SQL can support the 80 character maximum column names in MISMO. Oracle has a smaller limit. In the Oracle case you will need a two column file (XML or CSV does not matter) where one column is the MISMO data point name and the other the Oracle Column name you want. The XSLT will use that as a reference when making the Oracle DDL.

Reused containers

The MISMO RM uses some container repeatedly in several contexts. For example the ADDRESS container is used in 329 contexts in version 3.4 for example here are eight.

//DEAL/ASSETS/ASSET/ASSET_HOLDER/ADDRESS

//DEAL/ASSETS/ASSET/OWNED_PROPERTY/PROPERTY/ADDRESS

//DEAL/ASSETS/ASSET/OWNED_PROPERTY/PROPERTY/HOMEOWNERS_ASSOCIATIONS/HOMEOWNERS_ASSOCIATION/ADDRESS

//DEAL/ASSETS/ASSET/OWNED_PROPERTY/PROPERTY/PROPERTY_UNITS/PROPERTY_UNIT/ADDRESS

//DEAL/COLLATERALS/COLLATERAL/PLEDGED_ASSET/ASSET_HOLDER/ADDRESS

//DEAL/COLLATERALS/COLLATERAL/PLEDGED_ASSET/OWNED_PROPERTY/PROPERTY/ADDRESS

//DEAL/COLLATERALS/COLLATERAL/PLEDGED_ASSET/OWNED_PROPERTY/PROPERTY/HOMEOWNERS_ASSOCIATIONS/HOMEOWNERS_ASSOCIATION/ADDRESS

//DEAL/COLLATERALS/COLLATERAL/PLEDGED_ASSET/OWNED_PROPERTY/PROPERTY/PROPERTY_UNITS/PROPERTY_UNIT/ADDRESS

In a good database design you would not want 329 ADDRESS tables. You would want one with 329 different foreign keyed tables pointing to rows in it. The DDL would need to be further manipulated to accomplish this. It is unclear whether this step could be automated. Probably.

Conclusion

The MISMO Reference model can be implemented in a relational database of your choice.

Next time we will dive deeper into making a DB from the MISMO model.

MISMO V3 Notes

Monday, May 9, 2016

MISMO Myth Buster: We cannot use MISMO Because…

MISMO Myth Buster: We cannot use MISMO Because…

Introduction

“We cannot use MISMO because we are modern people and prefer Jason. XML is so year 2000.”

“We cannot use MISMO because we are people rooted in proven traditions of COBOL copybooks and DASD.”

“We cannot use MISMO because we are people who know and love database systems”

Over the next few blogs we will cover each of these variants of the myth in detail. Before we do the database reason we need some background

Background

What is an RM? Here are the things that a reference model should do:

The MISMO RM does all of this.

The Database case

Implementation specific Roadblocks and Refinements

Column Name Length

Reused containers

Conclusion

No comments:

Post a Comment

Pages

Contributors