
You are supposing that these will be modeled as class parameters in the first place. Certainly they are data that characterize the configuration to be deployed, and it is possible to model them as class parameters, but that is not the only, and maybe not the best, alternative available to you. Class parametrization is a protocol for declaring and documenting the data on which a class relies, and it enables mechanisms for obtaining that data that non-parametrized classes cannot use, but the same configuration objectives can be accomplished without it.
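
For concreteness, here is a minimal sketch of both routes; the hadoop_head class name and the hiera key are hypothetical, not from the Bigtop codebase:

Code Block

# Route 1: the data dependency is declared as a class parameter
class hadoop_head($heap_size = '1024m') {
    # resources here consume $heap_size directly
}

# Route 2: a non-parametrized class obtains the same data itself,
# e.g. via hiera() (or extlookup() on older setups)
class hadoop_head_alt {
    $heap_size = hiera('hadoop_head::heap_size', '1024m')
}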

rpelavin says:

Seems like class parametrization should be the prime method, but with some exceptions allowed. One exception may be what I would call "fine-tuning parameters": the configuration variables that end up in, for example, the hdfs-site and mapred-site config files. If you relied solely on class parametrization, the issue would be that you may have classes with 30-50 parameters, which is very cumbersome. Instead you might treat these as "global variables" (see the sketch below). Also related is keeping in mind easing the process of iterating on modules and adding new parameters; some of the signatures seem very settled while others will evolve more rapidly.
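
One way to realize that split is sketched below; the hdfs_config class, its hiera keys, and the template path are all hypothetical names for illustration. The coarse knobs stay individually addressable, while the long tail of fine-tuning values is folded into one hash, so adding a tunable needs no signature change:

Code Block

class hdfs_config {
    # a coarse, frequently-set knob, looked up individually
    $namenode_host = hiera('hdfs_config::namenode_host')

    # the long tail of hdfs-site.xml fine-tuning values as a single hash;
    # a new tunable is just a new hash entry, no class signature changes
    $site_overrides = hiera('hdfs_config::site_overrides', {})

    # the template would iterate over $site_overrides to emit properties
    file { '/etc/hadoop/conf/hdfs-site.xml':
        content => template('hdfs_config/hdfs-site.xml.erb'),
    }
}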

How to solve a 'macro substitution of the puppet code' issue (or whether we need to solve it at all)

Current Bigtop code uses quite a bit of dynamic lookups to solve a problem of multiple classes creating a pretty rich state in class-local variables and then calling onto a common piece of code to instantiate configuration files based on that information. Here's an example:

Code Block

class one_of_many {
    if ($::this == 'that') {
        $var1 = 'something'
    } else {
        ....
    }

    include common_code
}

With older versions of Puppet, the common_code class would have access to all the $varN variables simply by virtue of dynamic scope lookups. With newer versions, the only alternative seems to be explicit enumeration of all the variables that common_code requires to perform its task, e.g.:

Code Block
class common_code($var1, ... $varN) {
}
....
class { 'common_code':
    var1 => $var1,
    .....
    varN => $varN,
}

jcbollinger says:

With newer Puppet, the key is that each class needs to know where, proximally, the data it consumes comes from (and coding to that principle is good practice in every version of Puppet). Class parameters are probably the most obvious proximal source on which a class can rely for data (in 2.6+), but classes can also rely on class variables of other classes by referring to them by fully-qualified name, and, as discussed above, classes can obtain shared external data via an appropriate function such as hiera() or extlookup(). To the extent that the current codebase appears generally to declare each class in just one place, the most direct accommodation for Puppet 3 (and good practice generally) would probably be to just qualify all the variable references. Indeed, that was typical advice to anyone preparing to move from Puppet 2.x to Puppet 3. That is not meant to discount the opportunity to perform a deeper redesign, however.
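
As a concrete sketch of the "qualify all the variable references" route, using the class names from the example above:

Code Block

class common_code {
    # read the other class's variables by fully-qualified name;
    # no dynamic scope lookup is involved
    $var1 = $one_of_many::var1
    # ... $var2 .. $varN likewise ...
}

Note that this binds common_code to one named source class; when several different classes must feed it the same way, class parameters or a hiera() lookup remain the more general options.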

How much of the DevOps community would we lose if we focus on Puppet 3?



Puppet 3 provides quite a few enhancements in precisely the areas where we need them (Hiera, etc.). The question is: if we focus on Puppet 3 too much, would we alienate much of the DevOps community and make our Puppet code less useful to them?

rpelavin says:

The answer to this depends on understanding in more detail what you want to deliver and the context in which this system will be used.

First, is the "thing you are delivering" a system that builds and deploys Hadoop stacks, or are you delivering (in addition) a set of reusable Hadoop stack modules?

If the former is the goal, then the following is applicable: if the Puppet modules are viewed as the implementation behind higher-level functions that build and deploy Hadoop stacks, then I think you can make a choice and use the Puppet version/features that make it easiest for you to design the best and most modular components (Puppet 3 being the best choice, I believe). If, on the other hand, users of the system will want to integrate, for example, Puppet modules that build full applications on top of Hadoop stack services, then the question of which Puppet version/features are used becomes important. Related is whether Bigtop-Puppet is viewed as a top-level system that hides and controls the Puppet infrastructure, or instead as a solution whose modules can run in the Puppet environment the end user already has.

I think a good analogy is to external versus embedded databases. Are you looking toward a solution where Puppet is viewed as the "internal configuration management" system, or conversely are you looking at Puppet as an "external configuration management" system, one that is maintained to some extent by a Puppet expert/administrator (the analogy being to a DB admin)?