Master Data, Supply Chain Master Data and Instance Data

We need to make a clear distinction between traditional Master Data (MD), Supply Chain Master Data (SCMD), and Instance Data (IData). This will help us understand some important differences in various supply chain track and trace technologies.

Master Data

Wikipedia defines “Master Data” like this today:

“…Master Data is that persistent, non-transactional data that defines a business entity for which there is, or should be, an agreed upon view across the organization.”

This isn’t detailed enough for me. MD must include a data element that serves as an identifier. An identifier that refers to a given MD record must be unique within the organization.

Good candidates for MD are customer information, location information, product information and employee information. What these all have in common is that the data behind them rarely change. For example, my company has issued me an employee number. That number is the unique identifier for the MD that describes me to the company. My mailing address, phone number, marital status and social security number rarely change.

Most organizations make use of MD so that they can maintain the definition of these entities in a single place, and they can simply refer to these definitions through the corresponding unique identifier. The identifier provides a quick way to get to the full set of information. In many cases, the identifier can serve as a stand-in for the full set of information.
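To make the idea concrete, here is a minimal sketch of MD kept in a single place and reached through its identifier. The class names, field names and sample values are my own illustrative assumptions, not part of any standard or product.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """One Master Data record (hypothetical fields, for illustration only)."""
    employee_number: str  # the identifier; must be unique within the organization
    name: str
    mailing_address: str
    phone: str


class MasterDataRegistry:
    """Keeps each definition in one place, keyed by its identifier."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        # Enforce the rule that an identifier refers to exactly one record.
        if record.employee_number in self._records:
            raise ValueError(f"duplicate identifier: {record.employee_number}")
        self._records[record.employee_number] = record

    def lookup(self, employee_number):
        # The identifier is the quick way to get to the full set of information.
        return self._records[employee_number]


registry = MasterDataRegistry()
registry.add(EmployeeRecord("E-1001", "Pat Example", "1 Sample St", "555-0100"))

# Elsewhere, a transaction carries only the identifier as a stand-in
# and resolves it to the full record when needed.
timesheet_entry = {"employee_number": "E-1001", "hours": 8}
employee = registry.lookup(timesheet_entry["employee_number"])
```

Nothing outside the registry stores a copy of the address or phone number, so a change is made once and every reference picks it up.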

Supply Chain Master Data

Wikipedia doesn’t yet have a definition for Supply Chain Master Data. I’ve coined the term to describe something that is similar to, but distinctly different from, Master Data as described above. I’ll define it like this:

“Supply Chain Master Data is that persistent, non-transactional data that defines a business entity for which there is, or should be, an agreed upon view across the supply chain.”

The only difference from the definition of MD above is that the definition of the business entity spans the supply chain, not just a single organization. For that to work, you need a standard so that everyone agrees on the definition of the data elements, the identifier and the rules for maintaining the data.

GS1 has defined the GLN (Global Location Number) standard for location identifiers and GTIN (Global Trade Item Number) for product identifiers. When you combine these standards with their GDSN (Global Data Synchronization Network) standard, you have what I have defined as SCMD. The GLN and GTIN standards are carefully defined to ensure that every identifier created by any entity in the supply chain is different from every identifier created by any other entity.
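One small, concrete piece of these standards is the check digit that ends every GTIN and GLN. The sketch below computes it with the usual GS1 weighting (3 and 1 alternating, starting from the rightmost digit before the check digit). The function names are my own, and "00012345678905" is just a sample number. Note that the check digit only catches keying errors. The uniqueness itself comes from the company prefix that GS1 allocates, which this sketch does not model.

```python
def gs1_check_digit(payload: str) -> int:
    """Return the GS1 check digit for a digit string that excludes the check digit."""
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must contain only the digits 0-9")
    total = 0
    # Walk from the rightmost digit; weights alternate 3, 1, 3, 1, ...
    for position, char in enumerate(reversed(payload)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gs1_key(key: str, lengths=(8, 12, 13, 14)) -> bool:
    """Check length and check digit of a GTIN (8/12/13/14 digits) or GLN (13 digits)."""
    if not (key.isascii() and key.isdigit()) or len(key) not in lengths:
        return False
    return gs1_check_digit(key[:-1]) == int(key[-1])


print(is_valid_gs1_key("00012345678905"))  # sample GTIN-14 with a correct check digit
print(is_valid_gs1_key("00012345678906"))  # same number with a wrong last digit
```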

To learn the full details of these GS1 standards, including the rules that surround them, you have to read the GS1 General Specifications. GS1 likes to sell the document to you, but you can usually find it as a free download by searching for it on the internet.

An important characteristic of SCMD that differentiates it from simple MD is that it has a property of ownership. That is, instances of SCMD may be used by entities throughout the supply chain, but each instance is owned (controlled) by only a single entity: the one that created it in the first place.

For example, the manufacturer of a product will generate (create) a GTIN identifier for that product and they will fill in the pertinent data fields that describe it. The manufacturer will use a GDSN service (or some other means) to distribute the SCMD to its trading partners for their use, but only the manufacturer has the right and the responsibility to maintain the content of the data associated with the GTIN. This is pretty easy because, like MD, SCMD should rarely change.
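The ownership rule can be sketched in a few lines. This is not how GDSN is implemented. It is only a toy model, with hypothetical class, party and attribute names, showing that anyone may read a record while only its creator may change it.

```python
class SupplyChainRecord:
    """Toy SCMD record: readable by all parties, writable only by its owner."""

    def __init__(self, gtin, owner, attributes):
        self.gtin = gtin
        self.owner = owner  # the entity that created the GTIN
        self._attributes = dict(attributes)

    def read(self):
        # Trading partners get a copy, so they cannot alter the owner's data.
        return dict(self._attributes)

    def update(self, requester, **changes):
        # Only the owner has the right and the responsibility to maintain the data.
        if requester != self.owner:
            raise PermissionError(f"{requester} does not own GTIN {self.gtin}")
        self._attributes.update(changes)


record = SupplyChainRecord(
    gtin="00012345678905",  # sample GTIN, not a real product
    owner="manufacturer-A",
    attributes={"description": "Example tablets 10 mg", "pack_size": 30},
)

record.update("manufacturer-A", pack_size=60)  # allowed: the owner maintains it

try:
    record.update("wholesaler-B", pack_size=90)  # rejected: not the owner
except PermissionError as error:
    print(error)
```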

Also like MD, supply chain transactions will often refer to the product only by its GTIN as a shorthand way of referring to the full set of information contained in GDSN for that product.

Understanding the concept of SCMD and its characteristics is very important when discussing the characteristics of various pedigree models. I will finally return to the discussion of those in one of my next posts.

Instance Data

Instance Data also doesn’t have a definition in Wikipedia yet, but the name and the concept have recently been raised in some of the work groups within GS1. Here is my definition:

“Instance Data is data that is specific to a small set of instances of a particular serialized object class.”

IData is not a type of master data because there is no identifier involved and so there can be no separate set of information that describes it. Instead, it is simply data that can vary from one instance to the next (an “instance” is simply a single unit of the product). For example, every unit of a serialized pharmaceutical has a lot/batch number associated with it, and a finite set of the serialized instances shares the same lot/batch number.
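Here is a minimal sketch of the lot/batch example, assuming each unit is identified by its GTIN plus a serial number. The class, the function and the sample values are hypothetical. The lot number is stored on each unit rather than looked up through an identifier of its own, and a finite set of units ends up sharing each lot value.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SerializedUnit:
    gtin: str    # refers to the SCMD for the product
    serial: str  # unique per unit of that product
    lot: str     # instance data: carried by the unit itself


def serials_by_lot(units):
    """Group unit serial numbers by the lot/batch value they share."""
    groups = defaultdict(list)
    for unit in units:
        groups[unit.lot].append(unit.serial)
    return dict(groups)


units = [
    SerializedUnit("00012345678905", "S0001", "LOT-A"),
    SerializedUnit("00012345678905", "S0002", "LOT-A"),
    SerializedUnit("00012345678905", "S0003", "LOT-B"),
]

print(serials_by_lot(units))
```

All three units point at the same product definition through the GTIN, yet the lot value differs between them, which is what separates IData from MD and SCMD.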

I can’t think of another real example of instance data, but I suspect that once item-serialization becomes widespread there will be more types of IData defined. One possibility would be for a manufacturer to vary some characteristic of their covert anti-counterfeiting mechanism so that each unit of a product would be easier to tell apart from the others than it is today. This IData would indicate which units received which polarity, or which color, or whatever characteristic is being varied. This type of IData would not be shared with supply chain partners but would be retained by the manufacturer for use in an authentication scheme. Of course, IData can only exist if every unit has a unique serial number on it.

I hope this hasn’t been too technical for you. If you’re still reading, well done. You are now prepared for the next level of discussion of pedigree models.