Ribose Report

Rept 11023
Comments on the "Publication Identifier Syntax for NIST Technical Series Publications"
Standardization

Published 2021-12-22

Classification: unrestricted

Recipient: techpubs@nist.gov




Foreword

Ribose is an award-winning global developer of asymmetric security and standardization technologies trusted by industries with heightened cybersecurity needs. Ribose is a Deloitte Technology FAST 20 and Red Herring Top 100 Global company, and recipient of the CSA APAC Enterprise Award and multiple Stevie® Awards for its innovations.

Organizations that depend on Ribose solutions include Mozilla, the International Telecommunication Union, the International Organization for Standardization, the Internet Engineering Task Force, the British Standards Institution, the Ministry of Defence (UK) and other government agencies.

Ribose is the only cloud service provider (CSP) triple-assured by the Cloud Security Alliance, the first CSP to receive BSI’s Kitemark for Secure Digital Transactions, and the first to achieve certification to the highest security tiers in NIST CSF and MTCS. It is also certified to ISO 9001, ISO 14001, ISO/IEC 20000, ISO 22301, ISO/IEC 27001, ISO/IEC 27017, ISO/IEC 27018, ISO/IEC 27701, ISO 45001 and ISO 50001.


Executive summary

This document provides comments in response to the “Publication Identifier Syntax for NIST Technical Series Publications” document issued August 2021.

Ribose is a technology leader committed to open-source in the publication of machine-readable standards and normative content through its work with international, national and industry standards development organizations.

We laud the innovative and structured approach of the PubID syntax, and have implemented the NIST PubID scheme in our open-source Relaton bibliographic management software. The NIST version of Ribose’s Metanorma open-source publication toolchain now supports citations using PubID.

We have back-tested the implementation of the NIST PubID scheme against the full NIST Library catalog of 19,283 NIST Tech Publications published between 1901 and 2021, including those published by NIST’s predecessor NBS, the National Bureau of Standards.

A list of recommended improvements is detailed in the document, including:

  1. Extend the functionality of the PubID scheme such that it generates a corresponding, deterministic identifier that is easily parsed by machines, for purposes such as DOI;

  2. Extend PubID rendering to support readable full-form identifiers and abbreviated identifiers currently used within NIST Tech Pubs, in addition to the current short-form PubID described in the document;

  3. Extend PubID coverage to the full NIST Tech Pubs history, including to documents published by the predecessor of NIST — NBS;

  4. Support additional “part types” values including insert, addendum and errata that exist in the NIST Tech Pubs history.

We strongly support this work and, given our long involvement with it, welcome any questions or potential forms of collaboration.

Comments on the "Publication Identifier Syntax for NIST Technical Series Publications"

1.  Scope

This document provides comments and proposals in response to the “Publication Identifier Syntax for NIST Technical Series Publications” document issued August 2021.

2.  Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

NIST PubID, NIST. Publication Identifier Syntax for NIST Technical Series Publications. 2021-08.

ISO 690:2021, Information and documentation — Guidelines for bibliographic references and citations to information resources

3.  Terms and definitions

For the purposes of this document, the following terms and definitions apply.

3.1. document identifier

string of text that uniquely identifies a document

3.2. metadata-enhanced document identifier

unique document identifier [term defined in 3.1] that embeds metadata information about the document itself

EXAMPLE

A NIST PubID, such as “NIST SP (IPD) 800-53r5” embeds the document series, document stage and revision information.
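To illustrate how such an identifier carries metadata, the sketch below pulls the series, stage and revision out of the example string with a regular expression. The pattern and the `parse_short_pubid` name are assumptions made for this illustration; they cover only identifiers shaped like the example and are not the NIST PubID grammar.

```python
import re

# Hypothetical pattern for identifiers shaped like "NIST SP (IPD) 800-53r5".
# It is an illustration only, not the grammar defined in NIST PubID.
PUBID_PATTERN = re.compile(
    r"^(?P<series>(?:NIST|NBS)\s+[A-Z]+)\s+"   # series, e.g. "NIST SP"
    r"(?:\((?P<stage>\w+)\)\s+)?"              # optional stage, e.g. "(IPD)"
    r"(?P<number>\d+(?:-\d+)*)"                # report number, e.g. "800-53"
    r"(?:r(?P<revision>\d+))?$"                # optional revision, e.g. "r5"
)


def parse_short_pubid(identifier: str) -> dict:
    """Split an identifier into its embedded metadata elements."""
    match = PUBID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not recognised by this sketch: {identifier!r}")
    return match.groupdict()


parsed = parse_short_pubid("NIST SP (IPD) 800-53r5")
print(parsed)
```

Elements that are absent from an identifier, such as the stage of a final publication, come back as `None`.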

3.3. machine-readable data structure

data structure suited for machine interpretation where its individual data elements are easily discernable

4.  Recommendations

4.1.  Recommendation 1: Generate machine-readable identifiers

4.1.1.  General

Machines are increasingly used to parse and correlate content given in digital formats, and that applies to document identifiers as well.

Machine-readable, format-restricted document identifiers are immensely useful, for example in the naming of DOIs and in encoding within databases.

The perpetual enemy of machine-readability is ambiguity. Built on the PubID data elements, a machine-readable identifier gives a machine a “single-step” understanding: the individual data elements of the identifier can be parsed and understood with minimal effort.

The last sub-bullet of the last bullet in NIST PubID, Clause 3.1 demonstrates consideration of the characters allowed in DOI suffixes, and hints that the PubID scheme complies with the requirements of a DOI suffix, by stating that these characters are allowed: -._,()/.

However, DOIs are used within URLs. The standard for URLs, RFC 3986, Clause 2.3 states that only the following “unreserved” characters never need to be percent-encoded:

“For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers.”

This means that the comma (,) and the open and close parentheses fall outside that set and are subject to percent-encoding, which causes an unnecessary rewrite of the URL for the user.
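The effect can be observed with Python's standard `urllib.parse.quote`, which, when called with `safe=""`, leaves only the RFC 3986 unreserved characters untouched. This is a minimal sketch; the helper name `needs_percent_encoding` is an assumption for illustration, and whether a particular DOI resolver or browser actually rewrites these characters depends on that software.

```python
from urllib.parse import quote

# Punctuation that NIST PubID, Clause 3.1 lists as allowed in identifiers.
PUBID_PUNCTUATION = "-._,()/"


def needs_percent_encoding(char: str) -> bool:
    """Return True if `char` lies outside the RFC 3986 unreserved set.

    quote(..., safe="") leaves only ALPHA, DIGIT, "-", ".", "_" and "~"
    unchanged, so any difference means the character was percent-encoded.
    """
    return quote(char, safe="") != char


# Characters from the PubID list that a strict encoder would rewrite.
rewritten = [c for c in PUBID_PUNCTUATION if needs_percent_encoding(c)]
print(rewritten)

# The example identifier from 3.2, strictly encoded for use in a URL.
encoded_example = quote("NIST SP (IPD) 800-53r5", safe="")
print(encoded_example)
```

The slash is reported as well because `safe=""` treats it like any other reserved character; within a URL path it normally survives as a separator.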

4.1.2.  Proposal

Figure 1 — PubID core data elements and its rendered outputs

Figure 1 shows how the PubID data elements, by providing a machine-readable data structure, enable a stable core from which these various output formats can be generated.

We propose extending the functionality of the PubID scheme such that it generates a corresponding, deterministic identifier that is easily parsed by machines, for purposes such as DOI, and that uses only the ASCII characters in the set of ALPHA, DIGIT, hyphen and period.

In particular, the machine-readable PubID (“MRID”) utilizes periods instead of spaces as element delimiters.

The proposed scheme, by design, is meant to be very similar to existing assignments of DOI to NIST Tech Pubs.

The scheme is presented in Annex B.5.
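The character restriction and the period delimiter proposed in this clause can be checked mechanically. The sketch below is an illustration under assumptions: the function names are hypothetical, and splitting on periods only recovers the top-level elements, not any sub-structure inside the report number.

```python
import re
from typing import List

# Character set proposed above: ALPHA, DIGIT, hyphen and period only.
MRID_CHARSET = re.compile(r"[A-Za-z0-9.-]+")


def is_valid_mrid(identifier: str) -> bool:
    """Check that the identifier uses only the proposed MRID characters."""
    return MRID_CHARSET.fullmatch(identifier) is not None


def split_mrid(identifier: str) -> List[str]:
    """Split an MRID into its period-delimited elements."""
    if not is_valid_mrid(identifier):
        raise ValueError(f"characters outside the MRID set: {identifier!r}")
    return identifier.split(".")


print(is_valid_mrid("NIST.SP.800-53r4"))        # within the proposed set
print(is_valid_mrid("NIST SP (IPD) 800-53r5"))  # spaces and parentheses
print(split_mrid("NIST.IR.8204"))
```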

4.2.  Recommendation 2: Support rendering PubID into long and abbreviated formats

4.2.1.  Background

The NIST PubID scheme (NIST PubID) defines a metadata-enhanced document identifier scheme that allows generation of a unique reference using a set of defined data elements.

A traditional NIST Tech Pub practice, especially in CSRC publications, is to provide the following variant forms of document identifiers within the document itself:

  • Full form, used in the title and the bibliography for citations

    EXAMPLE 1

    “National Institute of Standards and Technology Special Publication 800-27, Revision A”

  • Abbreviated form, used in the “Authority” section

    EXAMPLE 2

    “Natl. Inst. Stand. Technol. Spec. Publ. 800-57 Part 1, Revision 4”

  • Short form, used for inline citations

    EXAMPLE 3

    “In Section 3.2 of SP 800-187…​”

4.2.2.  Proposal

NIST PubID today provides an identifier that can be used as a short form, given the compactness of its syntax.

As the “full form” and “abbreviated form” identifiers already exist within a document, and the PubID data elements already provide sufficient information to generate them, we recommend allowing the generation of the “full form” and “abbreviated form” identifiers as well.

This change can be enacted by creating “full form” and “abbreviated form” generation templates, that is, by extending every data element to also provide its “full form” and “abbreviated form” representation.

The proposed list of data elements (with alternative form representations) is provided in Annex A.
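As a minimal sketch of such templates, the code below stores a full and an abbreviated representation for each data element and renders both forms from the same inputs. The `Forms` class and `render` function are hypothetical names; the strings are taken from Tables A.1 and A.2 and from the examples in 4.2.1, and only one publisher and one series are included for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Forms:
    """Alternative renderings of one data element value."""
    full: str
    abbrev: str


# Values copied from Table A.1 and Table A.2 (subset for illustration).
PUBLISHERS = {
    "NIST": Forms("National Institute of Standards and Technology",
                  "Natl. Inst. Stand. Technol."),
}
SERIES = {
    "SP": Forms("Special Publication", "Spec. Publ."),
}


def render(publisher: str, series: str, number: str,
           part: Optional[str] = None, revision: Optional[str] = None,
           form: str = "full") -> str:
    """Render an identifier in the "full" or "abbrev" form."""
    if form not in ("full", "abbrev"):
        raise ValueError(f"unknown form: {form!r}")
    text = (f"{getattr(PUBLISHERS[publisher], form)} "
            f"{getattr(SERIES[series], form)} {number}")
    # "Part" and "Revision" stay spelled out, as in the examples in 4.2.1.
    if part is not None:
        text += f" Part {part}"
    if revision is not None:
        text += f", Revision {revision}"
    return text


print(render("NIST", "SP", "800-27", revision="A"))
print(render("NIST", "SP", "800-57", part="1", revision="4", form="abbrev"))
```

Both calls reproduce the corresponding examples in 4.2.1 from the same underlying data elements, which is the point of keeping the elements separate from their rendering.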

4.3.  Recommendation 3: Extend PubID coverage to the full NIST Tech Pubs history

4.3.1.  Background

The NIST PubID scheme (NIST PubID) defines a metadata-enhanced document identifier scheme that allows generation of a unique reference to a NIST Tech Pub document without ambiguity.

There are at least two major “pattern series” of identifiers before the introduction of PubID due to historical reasons:

  • NIST publications produced prior to the PubID scheme (1988-)

  • NBS publications, produced under the National Bureau of Standards (the previous name of NIST, 1901 to 1988)

Ribose has implemented the PubID scheme in its Relaton bibliographic software, allowing for the lossless translation of an existing NIST Tech Pub document identifier into (and from) a NIST PubID.

NOTE 1  Relaton is an open-source toolchain developed by Ribose for the publication, retrieval and citation of information resources. It adopts an internal data model from ISO 690:2021 which allows a machine-readable citation to be built from a defined set of data elements. The Relaton format is used to serve bibliographic data from the CalConnect Standards Registry, the IETF BibXML service, and from the NIST CSRC Metanorma endpoint.

The open-source Relaton web service routinely imports two datasets from NIST in order to facilitate citations for NIST authors, including:

  1. the NIST CSRC Metanorma endpoint, which serves CSRC publications;

  2. the NIST Library’s Tech Pubs GitHub repository, which serves an export of CrossRef bibliographic data for NIST Tech Pubs, curated by the NIST Library.

NOTE 2  The CSRC endpoint provides additional bibliographic detail not provided by the CrossRef set, and also provides data on pre-publication stage documents while the latter set only provides data on published documents.

NOTE 3  The Relaton NIST bibliographic dataset was built and offered to the public with the sanction of the relevant parties at NIST.

4.3.2.  Proposal

Ribose has back-tested the full database of 19,283 NIST Tech Publications with the PubID scheme, including with documents published under NBS, and the scheme applies well given minor tweaks. These publications span the publication years of 1901 to 2021.

We have built up a full list of NIST and NBS Tech Pubs series in Annex A.3, and all of them have been shown to work with the PubID scheme.

Adoption of this recommendation requires the creation of a new “publisher” data element (Annex A.2), where it can be “NIST” or “NBS”.

4.4.  Recommendation 4: Support extra part types

4.4.1.  Background

In NIST PubID the part types supported include “Part”, “Volume”, “Section”, “Supplement” and “Index”.

However, in the NIST Library’s collection of NIST Tech Pubs, there are also the following types of documents:

  1. Addendum. “NIST SP 800-38A” has a separately published addendum;

  2. Insert. “NBS CIRC 25 insert” is an insert;

  3. Errata. “NIST SP 801-errata” is a published errata.

4.4.2.  Recommendation

Support these additional part types with the following encoding:

  1. Addendum. “add” followed by potentially a number;

  2. Insert. “ins”;

  3. Errata. “err”.

The full list of part types is given in Annex A.6.
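A lookup table is enough to sketch this encoding. The codes for the existing part types are copied from the machine-readable column of Table A.4, and the three new codes are the ones proposed above, which remain to be confirmed with NIST. The `encode_part` name is an assumption for illustration.

```python
from typing import Optional

# Machine-readable codes per part type (Table A.4); the last three are
# the proposed additions and are still to be confirmed with NIST.
PART_CODES = {
    "Part": "pt",
    "Volume": "v",
    "Section": "sec",
    "Supplement": "sup",
    "Index": "indx",
    "Addendum": "add",
    "Insert": "ins",
    "Errata": "err",
}


def encode_part(part_type: str, number: Optional[int] = None) -> str:
    """Encode a part type, with its optional suffix number."""
    code = PART_CODES[part_type]
    return code if number is None else f"{code}{number}"


print(encode_part("Part", 1))
print(encode_part("Addendum"))
print(encode_part("Addendum", 1))
```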

5.  Summary

Ribose has provided a list of recommendations to enrich the NIST PubID scheme detailed in the “Publication Identifier Syntax for NIST Technical Series Publications” document.

Having been involved with the PubID idea, we believe it is an exemplary work that other organizations could adopt as best practice, and we are committed to being an early adopter of the NIST PubID scheme.

Ribose expresses appreciation to NIST for the opportunity to submit these comments. The authors would especially like to thank Jim Foti of the Computer Security Division, Kathryn Miller and Kate Bucher of the Information Services Office for continuously improving NIST’s approach to developing and publishing the NIST Technical Series of publications.


Annex A
(informative)

Amended elements of the PubID

A.1.  General

This section provides a set of elements that extends the list provided in NIST PubID with additional information.

Notably, it provides each data element with its corresponding rendering text for the “full form”, “abbreviated form” and “machine-readable” formats.

A.2.  Publisher

Table A.1 — Publisher values

Name | Abbrev | Short
National Institute of Standards and Technology | Natl. Inst. Stand. Technol. | NIST
National Bureau of Standards | Natl. Bur. Stand. | NBS

A.3.  Series

When a series has not seen usage of an “abbreviated form”, the value of “N/A” is used.

Table A.2 — Series values

Publisher | Prefix | Name | Abbrev | MR (with Publisher) | Example
NIST | NIST AMS | Advanced Manufacturing Series | Adv. Man. Ser. | NIST.AMS | NIST AMS 200-2
NIST | NIST BSS | Building Science Series | Bldg. Sci. Ser. | NIST.BSS | NIST BSS 181
NBS | NBS BSS | Building Science Series | Bldg. Sci. Ser. | NBS.BSS | NBS BSS 94
NBS | NBS BMS | Building Material Structures Report | N/A | NBS.BMS | NBS BMS 140 Ed. 2
NBS | NBS BRPD-CRPL-D | Basic Radio Propagation Predictions Series | N/A | NBS.BRPD-CRPL-D | NBS BRPD-CRPL-D 209
NBS | NBS BH | Building and Housing Reports | N/A | NBS.BH | NBS BH 18
NBS | NBS CRPL | Central Radio Propagation Laboratory Reports | N/A | NBS.CRPL | NBS CRPL 6-3
NBS | NBS CRPL-F-A | CRPL Ionospheric Data | N/A | NBS.CRPL-F-A | NBS CRPL-F-A 245
NBS | NBS CRPL-F-B | CRPL Solar-Geophysical Data | N/A | NBS CRPL-F-B245 | NBS CRPL-F-B245
NBS | NBS IP | CRPL Ionospheric Predictions | N/A | NBS.IP | NBS IP 25
NBS | NBS CIRC | Circulars | N/A | NBS.CIRC | NBS CIRC 460sup1962
NBS | NBS CIS | Consumer Information Series | N/A | NBS.CIS | NBS CIS 10
NBS | NBS CS | Commercial Standards | N/A | NBS.CS | NBS CS 113-51
NBS | NBS CSM | Commercial Standards Monthly | N/A | NBS.CSM | NBS CSM v9n10
NIST | FIPS PUB | Federal Information Processing Standards Publication | Federal Inf. Process. Stds. | NIST.FIPS | FIPS PUB 202
NIST | NISTGCR | Grant/Contract Reports | N/A | NIST.GCR | NIST GCR 17-917-45
NBS | NBS GCR | Grant/Contract Reports | N/A | NBS.GCR | NBS GCR 77-82
NIST | NIST HB | Handbook | Handb. | NIST.HB | NIST Handbook 150-872
NBS | NBS HB | Handbook | Handb. | NBS.HB | NBS Handbook 137
NBS | NBS HR | Hydraulic Research in the United States | N/A | NBS.HR | NBS HR 14A
NBS | NBS IRPL | Interservice Radio Propagation Laboratory | N/A | NBS.IRPL | NBS IRPL 27
NIST | ITL Bulletin | ITL Bulletin | N/A | NIST.ITLB | NIST ITL Bulletin August 2020
NIST | NIST LC | Letter Circular | N/A | NIST.LCIRC | NIST LC 1136
NBS | NBS LC | Letter Circular | N/A | NBS.LCIRC | NBS LC 1128
NIST | NIST MN | Monograph | Monogr. | NIST.MN | NIST Monograph 175
NBS | NBS MN | Monograph | Monogr. | NBS.MN | NIST Monograph 125, NIST Monograph 125, Supp. 1
NBS | NBS MP | Miscellaneous Publications | N/A | NBS.MP | NBS MP 260e1968
NIST | NIST NCSTAR | National Construction Safety Team Report | Natl. Constr. Tm. Act Rpt. | NIST.NCSTAR | NIST NCSTAR 1-1A
NIST | NIST NSRDS | National Standard Reference Data Series | Natl. Stand. Ref. Data Ser. | NIST.NSRDS | NIST NSRDS 100-2021
NBS | NSRDS-NBS | National Standard Reference Data Series | Natl. Stand. Ref. Data Ser. | NBS.NSRDS | NSRDS-NBS 1
NIST | NISTIR | Interagency or Internal Report | N/A | NIST.IR | NISTIR 8347
NBS | NBSIR | Interagency or Internal Report | N/A | NBS.IR | NBSIR 79-1776
NIST | NIST OWMWP | Office of Weights and Measures White Papers | N/A | NIST.OWMWP | NIST OWMWP 06-13-2018
NBS | NBS PC | Photographic Circulars | N/A | NBS.PC | NBS PC 1
NBS | NBS RPT | Reports | N/A | NBS.RPT | NBS RPT 10394
NIST | NIST PS | Voluntary Product Standards | Prod. Stand. | NIST.PS | NIST PS 20-20
NBS | NBS SIBS | Special Interior Ballistics Studies | N/A | NBS.SIBS | NBS SIBS 1
NBS | NBS PS | Voluntary Product Standards | Prod. Stand. | NBS.PS | NBS PS 15-69
NIST | NIST SP | Special Publication | Spec. Publ. | NIST.SP | NIST SP 800-115
NBS | NBS SP | Special Publication | Spec. Publ. | NBS.SP | NBS SP 500-137
NIST | NIST TN | Technical Note | Tech. Note | NIST.TN | NIST TN 2156
NBS | NBS TN | Technical Note | Tech. Note | NBS.TN | NBS TN 876
NBS | NBS TIBM | Technical Information on Building Materials | N/A | NBS.TIBM | NBS TIBM 61
NIST | NIST TTB | Technology Transfer Brief | N/A | NIST.TTB | NIST TTB 2
NIST | NIST DCI | Data Collection Instruments | Data Collect. Instr. | NIST.DCI | NIST DCI 002
NIST | NIST EAB | Economic Analysis Brief | N/A | NIST.EAB | NIST EAB 3
NIST | NIST Other | Other | Other | NIST.O | Report to the President
NIST | CSRC White Paper | Computer Security Resource Center White Paper | CSWP | NIST.CSWP | NIST.CSWP.04282021
NIST | CSRC Book | Computer Security Resource Center Book | CSRC Book | NIST.CSB | Executive Guide to Computer Security, Metrics to Security
NIST | CSRC Use Case | Computer Security Resource Center Use Case | CSRC Use Case | NIST.CSUC | Wireless Medical Infusion Pumps: Medical Device Security
NIST | CSRC Building Block | Computer Security Resource Center Building Block | CSRC Building Block | NIST.CSBB | Domain Name System-Based Security for Electronic Mail
NIST | JPCRD | Journal of Physical and Chemical Reference Data | J. Phys. & Chem. Ref. Data | JPCRD | (excluded from PubID scheme)
NIST | JRES | Journal of Research of NIST | J. Res. Natl. Inst. Stan. | NIST.JRES | (excluded from PubID scheme)

A.4.  Stage

The stage code element only applies to non-final publications.

In most series, documents are only released as final publications, and therefore their PubIDs will not contain a stage code.

Only some series support stage codes, e.g. SP 800 and SP 1800.

Table A.3 — Stage values

Name | Value
Initial Public Draft | IPD
Second Public Draft (to the Nth Public Draft) | 2PD (… nPD)
Final Public Draft | FPD
Work-in-Progress Draft | WD
Preliminary Draft | PreD
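Because the “nPD” value is a pattern rather than a fixed string, a stage code is most easily checked with a small function. The sketch below assumes that n starts at 2, since the first draft is written “IPD”; that lower bound and the `is_stage_code` name are assumptions for illustration.

```python
import re

# Fixed stage codes from Table A.3.
FIXED_STAGE_CODES = {"IPD", "FPD", "WD", "PreD"}

# Numbered public drafts: "2PD", "3PD", ... (assumed to start at 2).
NUMBERED_DRAFT = re.compile(r"(\d+)PD")


def is_stage_code(code: str) -> bool:
    """Return True if `code` is a stage value listed in Table A.3."""
    if code in FIXED_STAGE_CODES:
        return True
    match = NUMBERED_DRAFT.fullmatch(code)
    return match is not None and int(match.group(1)) >= 2


print([c for c in ["IPD", "2PD", "10PD", "1PD", "DRAFT"] if is_stage_code(c)])
```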

A.5.  Report number

The contents and pattern of the report number are dependent on the series.

Possible values:

  • {sequence number}

  • {subseries}-{sequence number}

  • {sequence number}-{volume}

  • {sequence number}-2

  • {subseries}-{sequence number}-2

  • etc.

A “Part” can also be indicated by an appended alphabetic character to the end.

A.6.  Part

All part types allow a suffix number.

Table A.4 — Part values

Name | Abbrev and Short | MR
Part | Pt. | pt
Volume | Vol. | v
Section | Sec. | sec
Supplement | Suppl. | sup
Index | Index | indx
Addendum | Add. | add (TBC with NIST)
Insert | Ins. | ins (TBC with NIST)
Errata | Err. | err (TBC with NIST)

A.7.  Edition

Table A.5 — Edition values

Name | Abbrev and Short | MR
Revision | Rev. | r
Edition | Ed. | e
Version | Ver. | ver

A.8.  Translation

An ISO 639-2 three-letter code that identifies the language of a document translated from English.

If a document is translated from English, suffix the document with a 3-letter ISO 639-2 code within parentheses.

Raw values seen in legacy DOIs include:

Table A.6 — Translation sample values

Name | Correct value | MR | Legacy values seen in DOI
Spanish | (ESP) | esp | es
Vietnamese | (VIE) | vie | viet
Portuguese | (POR) | por | port
Chinese | (ZHO) | zho | chi

NOTE  Only these 4 languages were seen in the full NIST Tech Pubs database.

A.9.  Update

A.9.1.  General

When a document is updated with errata, the original edition may be reissued to include the errata.

These documents will display the text “includes updates as of…​”.

In this case the document identifier will include the element “Update”.

Table A.7 — Update values

Name | Abbrev and Short | MR
Update | Upd. | u

A.9.2.  Update number

A sequential integer numbering of the update counting from the original document.

The first update is numbered 1, and so forth.

A.9.3.  Update year

The year last updated, shown as a suffix to the identifier.

  • {identifier}:{update-year}


Annex B
(informative)

PubID patterns

B.1.  Presentation

Generally in this order:

  • No update: {series} {stage} {report number}{part}2({translation})

  • With update: {series} {stage} {report number}{part}2({translation})/{update} {update number}:{update year}

B.2.  Full PubID

This is the fully expressed, unambiguous form of the Publication ID.

{publisher} {series} {reportnumber} {part | volume}, {revision} {(draft), optional}

Figure B.1

EXAMPLE 1

National Institute of Standards and Technology Federal Information Processing Standards Publication 199

EXAMPLE 2

National Institute of Standards and Technology Special Publication 800-27, Revision A

EXAMPLE 3

National Institute of Standards and Technology Special Publication 800-39 (Second Public Draft)

B.3.  Abbreviated PubID

This form is used in the Authority section.

{abbrev(publisher)} {abbrev(series)} {reportnumber} {part | abbrev(volume)}, {abbrev(revision)} {(abbrev(draft)), optional}

Figure B.2

  • abbrev(series) represents the abbreviation of the Series title

    NOTE 1  The “update date” ({update-date}) is not represented in this syntax and shall be considered in the final scheme.

EXAMPLE 1

“Natl. Inst. Stand. Technol. Spec. Publ. 800-78-4”

EXAMPLE 2

“Natl. Inst. Stand. Technol. Spec. Publ. 800-116”

EXAMPLE 3

“Natl. Inst. Stand. Technol. Spec. Publ. 800-57 Part 1, Revision 4”

B.4.  Short PubID

The “short form” is used to cite the documents within text.

It is used in these situations:

  1. Locality references. “In Section 3.2 of SP 800-187…​” (the “SP 800-187” is a link).

    NOTE 1  NIST pubs are composed of “Sections” not “Clauses”

  2. A generic document reference. “SP 800-53 describes…​”. This form does not specify a revision or update date.

  3. “All parts”. “The SP 800-57 subseries describes key management…​”.

    NOTE 2  Currently, citations within documents can take a form such as “NISTIR 6885 2003 Edition (February 2003)”, which is long enough to disrupt the reading flow.

The syntax for a Short PubID could be:

{abbrev(series)} {reportnumber} {abbrev(volume)} {abbrev(revision)} {(draft), optional} {edition, optional}

Figure B.3

NOTE 3  For FIPS, {reportnumber} is the full FIPS number, including revision, e.g., 140-2.

Short form date:

  • Month YYYY

EXAMPLE 1

NIST SP 800-53r4 (20150122) supersedes NIST SP 800-53r4 (20140115)

EXAMPLE 2

NIST SP 800-63A (December 2017) supersedes NIST SP 800-63A

EXAMPLE 3

NIST SP 800-57 Part 1 Revision 4 supersedes NIST SP 800-57 Part 1 Revision 3 (“Rev.” is also accepted, and converted to “Revision”)

EXAMPLE 4

NIST SP 800-160 Volume 1 supersedes NIST SP 800-160 (20180103) (“Vol.” is also accepted, and converted to “Volume”)

EXAMPLE 5

Undated form “NIST SP 800-53r4”

Strip Revision and Date from title, only if the Revision and Date are unique for each document number. These are identified as “Rev. {number}”, “Revision {number}” and “(Month YYYY)”, whichever comes first.
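The normalisation in Examples 3 and 4 and the stripping rule above can be sketched with regular expressions. The function names and patterns below are assumptions for illustration; in particular, the sketch does not check the condition that the revision and date must be unique for each document number, which would require access to the catalog.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# "(Month YYYY)" suffix, e.g. "(December 2017)".
_DATE = re.compile(rf"\s*\((?:{MONTHS}) \d{{4}}\)")
# "Rev. {number}" or "Revision {number}", with an optional leading comma.
_REVISION = re.compile(r",?\s*\b(?:Rev\.|Revision)\s*\w+")


def expand_abbreviations(text: str) -> str:
    """Convert the accepted "Rev." and "Vol." inputs to their long words."""
    text = re.sub(r"\bRev\.\s*", "Revision ", text)
    return re.sub(r"\bVol\.\s*", "Volume ", text)


def undated_form(text: str) -> str:
    """Strip the date and revision markers from a short-form citation."""
    return _REVISION.sub("", _DATE.sub("", text)).strip()


print(expand_abbreviations("NIST SP 800-57 Part 1 Rev. 4"))
print(undated_form("NIST SP 800-63A (December 2017)"))
print(undated_form("NIST SP 800-57 Part 1 Revision 4"))
```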

B.5.  Machine-readable PubID

PubID form intended for machine parsing. Special care is taken to eliminate spaces and to limit the character set to alphanumeric characters, hyphens and periods.

The syntax could be:

{publisher}.{series}.[{stage}.]{reportnumber}.{part}.{revision}.[{lang}.]{update-date}

Figure B.4

{publisher}.{series}.[{stage}.]{reportnumber}.{part}.{revision}.[{lang}.][{update}]{update-date}

Figure B.5

Generally, this rule should be able to uniquely identify an edition of a document.

  • {part}

    • Part

      • A “Part 1” document is encoded as “pt1”;

      • When a letter part is indicated, e.g. “800-63A”, we should keep it as part of the reportnumber, but also explicitly indicate the “pt”, e.g. NIST.SP.800-38A.pt-A

    • Volumes

      • “Volume 1” is encoded as “v1”;

  • {revision}

    • “Revision 1” is encoded as “r1”

    • If a superseding edition is a full revision, it will get the next Rev. #.

    • If a superseding edition is just an errata update, we use the update date from the title page (“includes updates as of …​”) to uniquely identify this edition. Preferably in the -yyyymmdd format.

  • {update}

    • “Update 1” is encoded as “u1”

Some examples:

EXAMPLE 1

NIST.SP.800-53r4-20150122 supersedes NIST.SP.800-53r4-20140115

EXAMPLE 2

NIST.SP.800-63A-20171201 supersedes NIST.SP.800-63A

EXAMPLE 3

NIST.SP.800-57pt1r4 supersedes NIST.SP.800-57pt1r3

EXAMPLE 4

NIST.SP.800-160v1 supersedes NIST.SP.800-160-20180103

EXAMPLE 5

NIST.IR.8204.u1-2019 supersedes NIST.IR.8204

EXAMPLE 6

The undated form is NIST.SP.800-53r4
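The period-delimited syntax of Figure B.4 can be sketched as a join over the elements that are present. This is an illustration under assumptions: the `build_mrid` name and its keyword arguments are hypothetical, absent elements are simply omitted, and the output follows the fully delimited layout of Figure B.4 rather than the more compact layout used in the examples above.

```python
import re
from typing import Optional

# Characters permitted in an MRID: ALPHA, DIGIT, hyphen and period.
_MRID_CHARSET = re.compile(r"[A-Za-z0-9.-]+")


def build_mrid(publisher: str, series: str, report_number: str, *,
               stage: Optional[str] = None, part: Optional[str] = None,
               revision: Optional[str] = None, lang: Optional[str] = None,
               update_date: Optional[str] = None) -> str:
    """Join the elements in the Figure B.4 order, skipping absent ones."""
    ordered = [publisher, series, stage, report_number,
               part, revision, lang, update_date]
    mrid = ".".join(element for element in ordered if element)
    if _MRID_CHARSET.fullmatch(mrid) is None:
        raise ValueError(f"characters outside the MRID set: {mrid!r}")
    return mrid


print(build_mrid("NIST", "SP", "800-57", part="pt1", revision="r4"))
print(build_mrid("NIST", "SP", "800-53", stage="IPD", revision="r5"))
```

Because the element order is fixed and every element is built from the same data, the same inputs always yield the same identifier, which is the deterministic property sought in Recommendation 1.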


Bibliography

[1]  RFC 3986, Uniform Resource Identifier (URI): Generic Syntax