  1. Fabric
  2. FAB-16753

Improve configuration consistency across fabric components and enable generation of sample config documents


Details

    • Epic
    • Status: Backlog
    • Medium
    • Resolution: Unresolved
    • v2.0.0
    • Future
    • fabric-common

    Description

      The following is a point in time copy of https://gist.github.com/sykesm/3a841471ab24ff711540dd78af2bf01d and may be out of date.

      ------

      Fabric Configuration

      Fabric configuration is currently implemented around a configuration library
      called [viper][viper]. Viper reads configuration from files, environment
      variables, and flags, and exposes an API that uses dot qualified keys to
      reference the configuration values (think System Properties on steroids).

      When configuration is read from files, the segments of the configuration key
      are used to walk config file stanzas to the data. Values read from the
      configuration file can be overridden by setting an environment variable that
      maps to the configuration key. Config values can also be sourced from flags.
      Flags take precedence over environment variables and values sourced from files.

      General Issues with Fabric Configuration

      Most of the issues we have with our configuration aren't problems with viper
      as much as how we use it.

      Viper Everywhere

      Viper provides a function based API that makes it easy for Fabric components
      to retrieve configuration values. Unfortunately, the API is so easy to use
      that it has proliferated throughout the code base like a virus. What was
      initially a light touch, easy to use pattern has become a core component with
      several issues:

      • Config is a global singleton, which prevents tests that need different
        config values from running concurrently
      • Easy access to global configuration resulted in code that could not be
        explicitly configured without using the viper API
      • Creation of "utility" layers on top of viper resulted in multiple entry
        points to the configuration data
      • Config information is spread throughout the system without good
        documentation or a single source for the config schema

      Default Configuration

      While viper enables programs to set default configuration values by calling
      SetDefault, Fabric has chosen to use "sample configuration" documents as
      configuration defaults. This results in poor configuration defaults and less
      than ideal sample config documents.

      In many cases, default values were chosen based on assumptions in test cases
      rather than utility in production scenarios.

      Inconsistencies

      Even though the orderer and the peer use viper for configuration, are
      documented together, and live in the same source tree, they use similar, but
      different patterns to obtain configuration information.

      For example, [{{core.yaml}}][core.yaml] uses camelCase for configuration keys
      while the orderer's [{{orderer.yaml}}][orderer.yaml] uses PascalCase.
      While the difference isn't particularly significant[^1], the arbitrary case
      difference makes it harder to template shared configuration values between
      the two.

      In TLS configuration, the orderer allows users to specify a PEM encoded
      certificate block or a path to a file. While this is very flexible, using
      "magic" instead of separate keys requires non-standard config processing and
      error handling.

      Fabric-CA also uses viper but, instead of using camelCase or
      PascalCase, it uses lowercase for configuration keys. Its default
      configuration is also handled differently. If a config file does not exist, it
      will write one to the working directory for future customization.

      There are also places where the names of required configuration values are
      different for no really good reason. An example of this is the MSP
      configuration path. In the peer it's called mspConfigPath while in the
      orderer it's called LocalMSPDir.

      Reflect Based Extensions

      The orderer configuration attempted to avoid some of the viper proliferation
      problems that the peer suffers from. As part of that, a new viperutil
      package with an [{{EnhancedExactUnmarshal}}][enhanced-unmarshal] function
      was created. This package uses a combination of reflection, viper, and
      mapstructure to discover viper configuration keys.

      Unfortunately, the road to hell is paved with good intentions. In addition to
      violating the "[clear is better than clever][clear-proverb]" proverb, the
      config parsing implementation of the orderer ended up relying on
      case-preserving behavior in viper that was deemed to be a bug. This
      bug-as-feature behavior pinned us to an ancient version of viper.

      Environment Variable Overrides

      Historically, fabric has primarily been distributed as docker images. Since
      these images only contain the sample, default configuration values,
      environment variables were used to override the defaults. This is a perfectly
      reasonable mechanism; however, these overrides quickly got out of hand. Take,
      for example, a command we document in our "build your first network" sample:

      CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp
      CORE_PEER_ADDRESS=peer0.org1.example.com:7051
      CORE_PEER_LOCALMSPID="Org1MSP"
      CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt
      

      In addition to the odd gopath references, the reliance on explicit, repeated,
      configuration with differing values while working across peers is error prone
      and confusing - especially when using command line tools.

      Our configuration model and viper further complicate things when mapping
      configuration values to environment variables. For example, the list of client
      root CA certificates for the operations service looks like this in config
      documents:

      operations:
        tls:
          # paths to PEM encoded ca certificates to trust for client authentication
          clientRootCAs:
            files: []
      

      The variable associated with this config is
      prefix_OPERATIONS_TLS_CLIENTROOTCAS_FILES. Since the key is associated with a list,
      how should the list be encoded? Well, in the peer, it looks like this:

      CORE_OPERATIONS_TLS_CLIENTROOTCAS_FILES: '/certs/tls/cacerts/cacert.pem /certs/msp/operationscerts/operationscert-1.pem'
      

      and in the orderer it looks like this:

      ORDERER_OPERATIONS_TLS_CLIENTROOTCAS: '[/certs/tls/cacerts/cacert.pem,/certs/msp/operationscerts/operationscert-1.pem]'
      

      Ignoring the slight variation in environment key name, the encoding of the
      values is quite different. The peer requires a space separated list of files
      while the orderer requires a comma separated list of values enclosed in square
      brackets.
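
To make the difference concrete, here is a minimal sketch of the two parsers a tool would need in order to read both encodings. The function names are hypothetical and this is not the code the peer or orderer actually uses; it only mirrors the two formats shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSpaceList splits a whitespace separated list, the encoding shown
// above for the peer.
func parseSpaceList(s string) []string {
	return strings.Fields(s)
}

// parseBracketList parses "[a,b]", the encoding shown above for the orderer.
func parseBracketList(s string) ([]string, error) {
	s = strings.TrimSpace(s)
	if !strings.HasPrefix(s, "[") || !strings.HasSuffix(s, "]") {
		return nil, fmt.Errorf("expected a value enclosed in square brackets: %q", s)
	}
	inner := strings.TrimSpace(s[1 : len(s)-1])
	if inner == "" {
		return nil, nil
	}
	parts := strings.Split(inner, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts, nil
}

func main() {
	peer := parseSpaceList("/certs/tls/cacerts/cacert.pem /certs/msp/operationscerts/operationscert-1.pem")
	orderer, err := parseBracketList("[/certs/tls/cacerts/cacert.pem,/certs/msp/operationscerts/operationscert-1.pem]")
	fmt.Println(len(peer), len(orderer), err)
}
```

Neither parser accepts the other's input as intended: the bracketed form passed to `parseSpaceList` yields a single element with the brackets still attached, and the space separated form is rejected by `parseBracketList`.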

      Another problem area is where we use variable values as keys in a map. The
      peer system chaincode element is a good example:

          # system chaincodes whitelist. To add system chaincode "myscc" to the
          # whitelist, add "myscc: enable" to the list below, and register in
          # chaincode/importsysccs.go
          system: 
              _lifecycle: enable
              cscc: enable
              lscc: enable
              escc: enable
              vscc: enable
              qscc: enable
      

      When the keys are variable, it may not be possible to map them to valid
      environment variable names. This was highlighted when we called the new
      lifecycle chaincode +lifecycle because + is not among the characters that
      POSIX allows in portable environment variable names.
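
A minimal sketch of the check involved, assuming the usual mapping of a dot-qualified key to an upper case, underscore separated name. POSIX describes portable environment variable names as consisting solely of uppercase letters, digits, and underscore, and not beginning with a digit. The helper names and the `my-scc` key are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// envName converts a dot-qualified config key to an override variable name.
func envName(prefix, key string) string {
	return prefix + "_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}

// isPortableEnvName reports whether name consists solely of uppercase
// letters, digits, and underscores and does not begin with a digit.
func isPortableEnvName(name string) bool {
	if name == "" {
		return false
	}
	for i, r := range name {
		switch {
		case r >= 'A' && r <= 'Z', r == '_':
			// always allowed
		case r >= '0' && r <= '9':
			if i == 0 {
				return false
			}
		default:
			return false
		}
	}
	return true
}

func main() {
	// "my-scc" is a made-up chaincode name used as a map key.
	for _, key := range []string{"chaincode.system.cscc", "chaincode.system.my-scc"} {
		name := envName("CORE", key)
		fmt.Println(name, isPortableEnvName(name))
	}
}
```

Any map whose keys come from user input can produce names that fail this check, which is one reason the rules later in this document ask for constant keys.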

      Goals

      These are the goals of the configuration work:

      • Provide a consistent YAML serialization of configuration for the peer,
        orderer, and CA.
      • Consistent use of camelCase (or snake_case[^1]) for configuration keys.
      • Identical configuration schema for TLS and MSP for all Fabric programs.
      • Remove automatic mapping of environment variable overrides for config
        elements and move to explicitly named overrides where appropriate.
      • Provide a simple mechanism to reference environment variables as values in
        the configuration files.
      • Enable a portable configuration and runtime tree
      • Use the current working directory as the default configuration directory.
      • Ensure all file references within configuration are relative to the
        configuration document.
      • Enable automated generation of documentation and default configuration
        documents directly from code.
      • Rename core.yaml to peer.yaml

      Stretch Goals

      • Extract peer subcommands to a new CLI (called fabric or fabcli) and
        associate it with a correspondingly named file for configuration. Config and
        data for this tool should be sourced from [XDG][XDG] compliant locations by
        default. (e.g. $HOME/.config/fabcli.yaml)
      • Implement an alternative local MSP that is sourced from a single YAML
        document instead of a configuration tree.
      • Stop using [{{mapstructure}}][mapstructure] when decoding config
      • Remove viper from Fabric

      Out of Scope

      These are explicitly out of scope:

      • Dynamic reloading of configuration
      • Updates to the channel config transaction tooling
      • General UX improvements to the command line

      Rules of the Road

      YAML Configuration Documents

      We continue to use YAML for our configuration documents. In addition to
      allowing JSON elements, it supports comments and multi-line block values.

      Default Config Location

      Long-running processes will no longer use /etc/hyperledger/fabric as the
      default config location; instead, the current working directory will be used.
      An explicit configuration directory can still be specified by setting
      FABRIC_CFG_PATH in the environment.

      Command line flags should be provided to specify a specific configuration
      file. The flag will override any default location or environment variable
      value.

      Relative Paths are Context Sensitive

      When relative paths are used in a configuration file, the paths are to be
      interpreted relative to the directory containing the configuration file.

      When relative paths are used on command line flags, the paths are to be
      interpreted relative to the working directory.

      Consistent Naming Convention

      Configuration keys and command line flags will use a consistent naming
      convention. The choice of convention matters much less than ensuring it's used
      consistently.

      The leading candidates for config file keys are camelCase, snake_case, and
      PascalCase.

      The leading candidates for long command line flags are nodelimiter,
      hyphen-delimited, and camelCase.

      Config Keys are Constants

      When mapping a set of configuration values to some entity, avoid patterns
      where an input value is treated as a key:

        mapping: 
          1.2.3.4: 5.6.7.8
          9.8.7.6: 4.3.2.1
      

      Instead, use lists of objects with explicit field names:

        mappings: 
        - from: 1.2.3.4
          to: 5.6.7.8
        - from: 9.8.7.6
          to: 4.3.2.1
      

      The latter form promotes type safety and extensibility.

      The major exception to this rule is when the config model is carrying generic
      or opaque configuration elements. In these cases, an opaque
      map[string]interface{} is appropriate.
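
In Go, the list form corresponds to a slice of structs with fixed field names. The sketch below is one way that could look; the `Mapping` type and its `Validate` method are hypothetical, and the `yaml` tags assume a YAML decoder that is not shown here.

```go
package main

import (
	"fmt"
	"net"
)

// Mapping is one entry in the list form. The keys "from" and "to" are
// constants, so new fields can be added later without changing the shape.
type Mapping struct {
	From string `yaml:"from"`
	To   string `yaml:"to"`
}

// Validate checks that both fields hold IP addresses.
func (m Mapping) Validate() error {
	if net.ParseIP(m.From) == nil {
		return fmt.Errorf("from: %q is not an IP address", m.From)
	}
	if net.ParseIP(m.To) == nil {
		return fmt.Errorf("to: %q is not an IP address", m.To)
	}
	return nil
}

func main() {
	mappings := []Mapping{
		{From: "1.2.3.4", To: "5.6.7.8"},
		{From: "9.8.7.6", To: "4.3.2.1"},
	}
	for i, m := range mappings {
		if err := m.Validate(); err != nil {
			fmt.Printf("mappings[%d]: %v\n", i, err)
		}
	}
}
```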

      Certificates and Keys

      Certificate and key values in configuration will be PEM encoded blocks. If
      certificate chains are used, a multi-block value should be used. The blocks
      must be concatenated such that each certificate certifies the one preceding it;
      the root CA shall be the last certificate in the list.

      Separate keys must be used for certificates and key values and for files that
      contain certificate and key values. For example:

        tls: 
          cert: |
            -----BEGIN CERTIFICATE-----
            Base64–encoded certificate
            -----END CERTIFICATE-----
          certfile: ~
          key: ~
          keyfile: tls/private.key
      

      Notice cert and certfile are elements of the tls configuration. When an
      inline certificate is used, it should be the value associated with cert;
      when a file reference is used, the path should be associated with certfile.

      It is an error to provide values for inline and file references for the same
      configuration element. In the example above, certfile is explicitly nil.

      Certificate pools should always support multiple PEM encoded blocks.

      Referencing Environment Variables

      Instead of exposing all configuration as environment variables, we will allow
      configuration values to be sourced from environment variables by using
      ${env.ENVIRONMENT_NAME} as a value in the configuration file.

        id: ${env.MY_ENV_VAR}
      

      The ${env.NAME} format will not be interpreted if it is embedded within a
      larger value. For example, when MESSAGE_ENV_VAR is set to "msg":

        msg: ${env.MESSAGE_ENV_VAR}
        message: my message is ${env.MESSAGE_ENV_VAR}.
      

      will result in message="my message is ${env.MESSAGE_ENV_VAR}." and
      msg="msg".

      Referencing environment variables is only supported for the basic types used
      in leaf nodes of the configuration.

      Durations

      Durations must always include units compatible with Go's
      [time.ParseDuration][parse-duration] function. A value without a suffix is
      expressed in nanoseconds and is rarely appropriate.

      This means, by extension, that configuration keys should not contain units for
      the duration. (e.g. timeoutInSeconds should not be used)

      Structure Defines Schema

      To support automatic generation of documentation, configuration will be
      expressed as a set of structures in code.

      A config package will be provided to decode configuration documents. The
      package will expose types and functions to resolve relative paths and apply
      configuration defaults to values omitted from the config.

      The config package will also expose functions to write example configuration
      documents with comments. Field level godoc will be used to document the config
      keys and tags will be used to provide the default value and, when required,
      the name of the environment variable that can be used to override what is read
      from configuration.

      // MyConfig is my configuration.
      type MyConfig struct {
          // ID provides a unique identifier.
          ID string `yaml:"id" example:"example-id" env:"MY_CONFIG_ID"`
      
          // Timeout is the maximum time the service will wait for a response.
          Timeout time.Duration `yaml:"timeout" default:"30s"`
      
          // Subsystem configures the subsystem.
          Subsystem *SubsystemConfig `yaml:"subsystem"`
      }
      
      // SubsystemConfig is the configuration structure for subsystem.
      type SubsystemConfig struct {
          // Timeout is the maximum amount of time the subsystem will wait for a
          // response.
          Timeout time.Duration `yaml:"timeout" default:"10s"`
      
          // WorkingStoragePath points to the directory to store temporary data.
          WorkingStoragePath config.Path `yaml:"workingStoragePath" default:"subsys"`
      }
      

      The generated documentation will replace occurrences of the struct field name
      with the appropriate YAML config element name.

      The default configuration for MyConfig would look like this:

      ---
      # id provides a unique identifier.
      id: example-id
      
      # timeout is the maximum time the service will wait for a response.
      timeout: 30s
      
      # subsystem configures the subsystem.
      subsystem: 
        # timeout is the maximum amount of time the subsystem will wait for a
        # response.
        timeout: 10s
      
        # workingStoragePath points to the directory to store temporary data.
        workingStoragePath: subsys
      

      [clear-proverb]: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=14m35s
      [core.yaml]: https://raw.githubusercontent.com/hyperledger/fabric/release-1.4/sampleconfig/core.yaml
      [enhanced-unmarshal]: https://github.com/hyperledger/fabric/blob/0b3146451b92658a6caf45b104d20dc89b46c33f/common/viperutil/config_util.go#L311-L346
      [mapstructure]: https://github.com/mitchellh/mapstructure
      [orderer.yaml]: https://raw.githubusercontent.com/hyperledger/fabric/release-1.4/sampleconfig/orderer.yaml
      [parse-duration]: https://golang.org/pkg/time/#ParseDuration
      [viper]: https://github.com/spf13/viper
      [XDG]: https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html

      [^1]: In the case of the orderer, yaml tags were not specified so the case
      doesn't matter when decoding but does matter when using the viper API.

            sykesm Matthew Sykes