ABNF suggestions #858

@ahouseholder

(Comments from @tschmidtb51)

Given my thought process over time and during that review: I'm now fully convinced that we should differentiate between language and x-name, as this improves parsing and validation significantly. Therefore, I suggest the following, along with some other improvements:

  • all lower case except for BCP 47
  • allow for fragment parts also in unregistered namespaces
  • correct extensions in BCP 47
  • explicitly state reverse DNS (except for the length rule, which needs to be in the regex)
!!! info "ABNF Notation"

    ```abnf
    namespace = base-ns [extensions]
    ; Overall namespace must be 3–1000 characters
    ; (Enforced via regex length lookahead)
    
    base-ns = x-base / std-base
    x-base  = "x_" x-name 
    std-base = ns-core
    
    ; ns-core starts with a lowercase letter and may have '.' or '-' separators.
    ; Consecutive '.' or '-' are not allowed.
    ns-core = LOWER ALNUMLOW *( ("." / "-") 1*ALNUMLOW )
    
    x-name = reverse-dns [ "#" fragment-seg ]
    
    ; reverse-dns provides a specification for a reverse DNS name,
    ; which can't be longer than 253 characters
    reverse-dns = label 1*("." label)
    
    ; Each label must be between 1 and 63 characters long
    label = ALNUMLOW [ *61(ALNUMLOWDASH)  ALNUMLOW ]
    
    fragment-seg = 1*ALNUMLOW *( ("." / "-") 1*ALNUMLOW )
    
    extensions = lang-ext [ 1*("/" ext-seg) ]
    
    ; Language extension: either / (empty language extension)
    ; or /<bcp47>/ (BCP-47 language code)
    lang-ext = "/" /  ( "/" bcp47 )
    
    ; Extension segment between slashes.
    ; - Must start with ALPHA
    ; - May have '.' or '-' separators
    ; - Optional '#' section, at most one per segment
    ; - No consecutive '.' or '-'
    ext-seg = bcp47 / "." x-name
    
    ; BCP-47 tag (based on the regex expansion)
    bcp47 = ( 2*3ALPHA
                [ "-" 3ALPHA *2( "-" 3ALPHA ) ]
            / 4*8ALPHA )
            [ "-" 4ALPHA ]
            [ "-" ( 2ALPHA / 3DIGIT ) ]
            * ( "-" ( 5*8ALNUM / DIGIT 3ALNUM ) )
            * ( "-" singleton 1*( "-" ( 2*8ALNUM ) ) )
            [ "-" "x" 1*( "-" 1*8ALNUM ) ]
          / "x" 1*( "-" 1*8ALNUM )
          / "i-default"
          / "i-mingo"

    ; Single alphanumerics
    ; "x" reserved for private use
    singleton = DIGIT       ; 0 - 9
              / %x41-57     ; A - W
              / %x59-5A     ; Y - Z
              / %x61-77     ; a - w
              / %x79-7A     ; y - z
    
    ; Character sets
    LOWER = %x61-7A   ; a-z
    ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
    DIGIT = %x30-39   ; 0-9
    ALNUM = ALPHA / DIGIT
    ALNUMLOW = LOWER / DIGIT
    SPECIALCHAR = DOT / DASH
    DOT = %x2E ; .
    DASH = %x2D ; -
    ALNUMLOWSC = ALNUMLOW / SPECIALCHAR
    ALNUMLOWDASH = ALNUMLOW / DASH
    
    ; Constraints:
    ; - No consecutive "." or "-" in ns-core or ext-seg; exception in label
    ; - Each ext-seg can contain at most one "#".
    ; - Overall namespace is 3–1000 chars.
    ```
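
To make the `x-base` part of the grammar concrete, here is a minimal Python sketch of how the `label`, `reverse-dns`, `fragment-seg`, and `x-name` rules might translate into a regex. The names `LABEL`, `FRAGMENT`, `X_BASE`, and `parse_x_base` are hypothetical and not part of the proposal. The sketch assumes a namespace without extensions, applies the 3 to 1000 character rule through a length lookahead, and checks the 253-character limit on `reverse-dns` in code rather than in the pattern.

```python
import re

# Hypothetical names; each pattern mirrors the ABNF rule it is named after.
ALNUMLOW = "[a-z0-9]"
# label: 1 to 63 characters, no leading or trailing dash
LABEL = rf"{ALNUMLOW}(?:[a-z0-9-]{{0,61}}{ALNUMLOW})?"
# fragment-seg: alphanumeric runs separated by single '.' or '-'
FRAGMENT = rf"{ALNUMLOW}+(?:[.-]{ALNUMLOW}+)*"

# x-base = "x_" reverse-dns [ "#" fragment-seg ]
# Assumption: no extensions follow, so the lookahead covers the whole string.
X_BASE = re.compile(
    rf"(?=.{{3,1000}}\Z)"                   # overall length lookahead
    rf"x_(?P<dns>{LABEL}(?:\.{LABEL})+)"    # at least two labels
    rf"(?:#(?P<frag>{FRAGMENT}))?\Z"
)


def parse_x_base(value: str):
    """Return (reverse_dns, fragment) or None if the value does not match."""
    m = X_BASE.match(value)
    # The 253-character limit on reverse-dns is checked outside the regex here.
    if m is None or len(m.group("dns")) > 253:
        return None
    return m.group("dns"), m.group("frag")


print(parse_x_base("x_com.example#test-1"))
print(parse_x_base("x_example"))
```

Checking the 253-character limit in code keeps the pattern readable. A second lookahead could enforce it inside the regex instead if a single pattern is required.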

Whether `a-a-----a` or `a---a` is valid as a label is still an open question. Based on RFC 1035, I would say yes. For the latter one, `dig` complains about some IDNA2008...
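
As a rough illustration of the open question, the Python sketch below checks both candidates against the `label` rule from the ABNF and against one IDNA2008 restriction. RFC 5891 disallows `--` in the third and fourth character positions of a label, which may be what `dig` is objecting to for `a---a`. That is an assumption, not a confirmed diagnosis. The function names are hypothetical.

```python
import re

# Mirrors the ABNF rule: label = ALNUMLOW [ *61(ALNUMLOWDASH) ALNUMLOW ]
LABEL_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def matches_abnf_label(label: str) -> bool:
    """True if the whole string matches the ABNF label rule."""
    return LABEL_RE.fullmatch(label) is not None


def has_hyphens_in_positions_3_and_4(label: str) -> bool:
    """True if characters 3 and 4 are both '-' (restricted by IDNA2008)."""
    return label[2:4] == "--"


for candidate in ("a-a-----a", "a---a"):
    print(
        candidate,
        matches_abnf_label(candidate),
        has_hyphens_in_positions_3_and_4(candidate),
    )
```

Both candidates satisfy the ABNF `label` rule, but only `a---a` has hyphens in positions three and four. If IDNA2008 compatibility matters for namespaces, that restriction would need an extra constraint on `label`.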

@ahouseholder @sei-vsarvepalli: I hope I corrected all the mistakes I discovered earlier, but please double-check.

Originally posted by @tschmidtb51 in #821 (comment)

Labels: enhancement, integration, integration/blocker, tech/backend, tech/data
