In a previous post, I did some experiments with gRPC, Protocol Buffers and Terraform. The idea was to turn the Terraform CLI tool into a micro-service thanks to gRPC.

This post is the second part of the experiment. I will go deeper into the code and see if it is possible to create a brand new utility without hacking Terraform. The idea is to import some of the packages that compose the binary and create my own service based on gRPC.

The Terraform structure

Terraform is a binary utility written in Go. The main package resides in the root directory of the terraform repository. As usual with Go projects, the subdirectories hold separate packages.

The whole business logic of Terraform is coded in the subpackages. The “main” package is simply an envelope that kick-starts the utility (env variables, etc.) and initiates the command line.

The cli implementation

The command line flags are instantiated by Mitchell Hashimoto’s cli package. As explained in the previous post, this cli package calls a specific function for every action.

The command package

Every single action fulfills the cli.Command interface and is implemented in the command subpackage. Therefore, every “action” of Terraform has a definition in the command package, and its logic is coded in a Run(args []string) int method (see the doc of the Command interface for a complete definition).
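To make the shape of that interface concrete, here is a minimal sketch. The Command interface is redeclared locally with the same three methods as cli.Command so that the example compiles on its own; echoCommand and runCommand are hypothetical names used only for illustration and are not part of Terraform.

```go
package main

import (
	"fmt"
	"strings"
)

// Command mirrors the method set of cli.Command from github.com/mitchellh/cli.
// It is redeclared here only to keep the sketch self-contained.
type Command interface {
	Help() string
	Run(args []string) int
	Synopsis() string
}

// echoCommand is a hypothetical action, standing in for a Terraform command.
type echoCommand struct{}

func (echoCommand) Help() string     { return "Usage: echo [words...]" }
func (echoCommand) Synopsis() string { return "Print the arguments" }

// Run holds the whole logic of the action and returns an exit code.
func (echoCommand) Run(args []string) int {
	if len(args) == 0 {
		return 1
	}
	fmt.Println(strings.Join(args, " "))
	return 0
}

// runCommand is what any front end (CLI or gRPC) ends up doing:
// hand the arguments to Run and report the exit code.
func runCommand(c Command, args []string) int {
	return c.Run(args)
}

func main() {
	code := runCommand(echoCommand{}, []string{"hello", "world"})
	fmt.Println("exit code:", code)
}
```

Because the front end only depends on Run, swapping the command line for a gRPC handler does not require touching the commands themselves.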

Creating a new binary

The idea is not to hack any of the Terraform packages, which keeps my code easier to maintain. To create a custom service, I will instead implement a new utility, and therefore a new main package. This package will implement a gRPC server, and this server will expose wrappers around the commands declared in Terraform’s command package.

For the purpose of my POC, I will only implement three actions of Terraform:

  • terraform init
  • terraform plan
  • terraform apply

The gRPC contract

In order to create a gRPC server, we need a service definition. To keep it simple, let’s consider the contract defined in the previous post (cf the section: Creating the protobuf contract). I simply add the missing procedure calls:

    syntax = "proto3";

    package pbnhite;

    service Terraform {
        rpc Init (Arg) returns (Output) {}
        rpc Plan (Arg) returns (Output) {}
        rpc Apply (Arg) returns (Output) {}
    }

    message Arg {
        repeated string args = 2;
    }

    message Output {
        int32 retcode = 1;
        bytes stdout = 2;
        bytes stderr = 3;
    }

Fulfilling the contract

As described previously, I am creating a grpcCommands structure that has the required methods to fulfill the contract:

    type grpcCommands struct {}

    func (g *grpcCommands) Init(ctx context.Context, in *pb.Arg) (*pb.Output, error) {
        // ...
    }
    func (g *grpcCommands) Plan(ctx context.Context, in *pb.Arg) (*pb.Output, error) {
        // ...
    }
    func (g *grpcCommands) Apply(ctx context.Context, in *pb.Arg) (*pb.Output, error) {
        // ...
    }

In the previous post, I filled the grpcCommands structure with a map holding the command definitions. The idea was to keep the same CLI interface. As we are now building a completely new binary with only a gRPC interface, we don’t need that anymore. However, we still need to execute the Run method of every Terraform command.

Let’s take the example of the Init command.

Let’s see the definition of the command by looking at the godoc:

    type InitCommand struct {
        Meta
        // contains filtered or unexported fields
    }

This command embeds a substructure called Meta. Meta is defined here and holds the meta-options that are available on all or most commands. Obviously, we need a Meta definition in the command to make it work properly.

For now, let’s add it to grpcCommands globally, and we will see later how to initialize it.

Here is the gRPC implementation of the contract:

    type grpcCommands struct {
        meta command.Meta
    }

    func (g *grpcCommands) Init(ctx context.Context, in *pb.Arg) (*pb.Output, error) {
        // ...
        tfCommand := &command.InitCommand{
            Meta: g.meta,
        }
        ret := int32(tfCommand.Run(in.Args))
        // Keyed fields keep the literal valid even if the generated struct gains extra fields.
        return &pb.Output{Retcode: ret, Stdout: stdout, Stderr: stderr}, err
    }

How to initialize the grpcCommands object

Now that we have a proper grpcCommands type that can be registered with the gRPC server, let’s see how to create an instance. As grpcCommands only contains one field, we simply need to create a meta object.

Let’s simply copy/paste the code from Terraform’s main package, and we now have:

    var PluginOverrides command.PluginOverrides
    meta := command.Meta{
        Color:            false,
        GlobalPluginDirs: globalPluginDirs(),
        PluginOverrides:  &PluginOverrides,
        Ui:               &grpcUI{},
    }
    pb.RegisterTerraformServer(grpcServer, &grpcCommands{meta: meta})

According to the comments in the code, globalPluginDirs() returns the directories that should be searched for globally-installed plugins (not specific to the current configuration). I will simply copy the function into my main package.

About the UI

In the example CLI that I developed in the previous post, what I did was to redirect stdout and stderr to an array of bytes, in order to capture it and send it back to a gRPC client. I noticed that this was not working with Terraform. This is because of the UI! UI is an interface whose purpose is to get the output stream and write it down to a specific io.Writer.

Our tool will need a custom UI.

A custom UI

As UI is an interface (see the doc here), it is easy to implement our own. Let’s define a structure that holds two byte slices called stdout and stderr. Then let’s implement the methods that write into these fields:

    type grpcUI struct {
        stdout []byte
        stderr []byte
    }

    func (g *grpcUI) Output(msg string) {
        g.stdout = append(g.stdout, []byte(msg)...)
    }

Note 1: I omit the methods Info, Warn, and Error for brevity.

Note 2: For now, I do not implement any logic into the Ask and AskSecret methods. Therefore, my client will not be able to ask something. But as gRPC is bidirectional, it would be possible to implement such an interaction.
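For reference, here is one way the omitted methods could look. This is a sketch under a few assumptions of mine: the Ui interface is redeclared locally with the same method set as cli.Ui so the example compiles alone, Output and Info go to stdout while Warn and Error go to stderr, a newline is appended after each message, and Ask and AskSecret simply return an error since no interaction is implemented.

```go
package main

import (
	"errors"
	"fmt"
)

// Ui mirrors the method set of cli.Ui from github.com/mitchellh/cli,
// redeclared here so the sketch compiles on its own.
type Ui interface {
	Ask(string) (string, error)
	AskSecret(string) (string, error)
	Output(string)
	Info(string)
	Error(string)
	Warn(string)
}

type grpcUI struct {
	stdout []byte
	stderr []byte
}

// Compile-time check that grpcUI fulfills the interface.
var _ Ui = (*grpcUI)(nil)

// errNotInteractive is a hypothetical error for the unimplemented prompts.
var errNotInteractive = errors.New("interactive input is not supported")

// Assumption: one message per line, Output/Info on stdout, Warn/Error on stderr.
func (g *grpcUI) Output(msg string) { g.stdout = append(g.stdout, []byte(msg+"\n")...) }
func (g *grpcUI) Info(msg string)   { g.Output(msg) }
func (g *grpcUI) Warn(msg string)   { g.stderr = append(g.stderr, []byte(msg+"\n")...) }
func (g *grpcUI) Error(msg string)  { g.Warn(msg) }

// Ask and AskSecret cannot prompt anyone yet, so they fail explicitly.
func (g *grpcUI) Ask(string) (string, error)       { return "", errNotInteractive }
func (g *grpcUI) AskSecret(string) (string, error) { return "", errNotInteractive }

func main() {
	ui := &grpcUI{}
	ui.Output("Initializing...")
	ui.Error("something went wrong")
	fmt.Printf("stdout=%q stderr=%q\n", ui.stdout, ui.stderr)
}
```

Failing explicitly in Ask seems safer than returning an empty answer, because a command waiting for a confirmation would otherwise proceed silently.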

Now, we can instantiate this UI for every call, and assign it to the meta-options of the command:

    // A fresh UI per call; after Run, the output is read from myUI.stdout and myUI.stderr.
    myUI := &grpcUI{}
    tfCommand.Meta.Ui = myUI
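One detail is worth keeping in mind when reading the captured output back: append on the fields of the UI does not update a slice variable that was merely copied into the structure, so the output has to be read from the UI itself. The sketch below illustrates this; capture is a hypothetical helper, not something that exists in Terraform or in the cli package.

```go
package main

import "fmt"

type grpcUI struct {
	stdout []byte
	stderr []byte
}

func (g *grpcUI) Output(msg string) {
	g.stdout = append(g.stdout, []byte(msg)...)
}

// capture (hypothetical helper) runs fn against a fresh UI and returns
// whatever fn wrote to it.
func capture(fn func(ui *grpcUI)) (stdout, stderr []byte) {
	ui := &grpcUI{}
	fn(ui)
	return ui.stdout, ui.stderr
}

func main() {
	// Copying a nil slice into the struct shares nothing with the variable:
	// append allocates a new backing array owned by the struct field only.
	var stdout []byte
	ui := &grpcUI{stdout: stdout}
	ui.Output("hello")
	fmt.Println(len(stdout), len(ui.stdout)) // prints "0 5"

	// Reading back from the UI gives the expected result.
	out, _ := capture(func(ui *grpcUI) { ui.Output("hello") })
	fmt.Printf("%s\n", out) // prints "hello"
}
```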

So far, so good: we now have a new Terraform binary that works via gRPC, with very little code.

What did we miss?

Multi-stack

It is fun, but not usable for real purposes, because the server needs to be launched from the directory where the tf files are… Therefore the service can only be used for one single Terraform stack… Come on!

Let’s change that and pass the directory where the server needs to work as a parameter of the RPC call. It is as simple as adding an extra field to the message Arg:

    message Arg {
        string workingDir = 1;
        repeated string args = 2;
    }

and then simply changing directory in the implementation of the command:

    func (g *grpcCommands) Init(ctx context.Context, in *pb.Arg) (*pb.Output, error) {
        err := os.Chdir(in.WorkingDir)
        if err != nil {
            return &pb.Output{}, err
        }
        // A fresh UI per call captures the output of this command only.
        myUI := &grpcUI{}
        tfCommand := &command.InitCommand{
            Meta: g.meta,
        }
        tfCommand.Meta.Ui = myUI
        ret := int32(tfCommand.Run(in.Args))
        // Read the output back from the UI, which owns the slices Output appended to.
        return &pb.Output{Retcode: ret, Stdout: myUI.stdout, Stderr: myUI.stderr}, nil
    }

Implementing a new push command

I have a Terraform service. Alright. Can an “Operator” use it?

The service we have deployed is working exactly like Terraform. I have only changed the user interface. Therefore, in order to deploy a stack, the ’tf’ files must be present locally on the host.

Obviously we do not want to give access to the server that hosts Terraform. This is not how micro-services work.

Terraform has a push command that HashiCorp has implemented to communicate with Terraform Enterprise. This command is tied to their closed-source product called “Atlas” and is therefore useless for us.

Let’s take the same principle and implement our own push command.

Principle

The push command will zip all the tf files of the current directory in memory, and transfer the zip via a specific message to the server. The server will then decompress the zip into a unique temporary directory and send back the ID of that directory. Then every other Terraform command can use the id of the directory and use the stack (as before).

Let’s implement a protobuf contract:

    service Terraform {
        // ...
        rpc Push(stream Body) returns (Id) {}
    }

    message Body {
        bytes zipfile = 1;
    }

    message Id {
        string tmpdir = 1;
    }

Note: For now, I assume that the whole zip fits into a single message. I will probably have to implement chunking later.

Then implement the definition in the code of the server:

    func (g *grpcCommands) Push(stream pb.Terraform_PushServer) error {
        workdir, err := ioutil.TempDir("", ".terraformgrpc")
        if err != nil {
            return err
        }
        err = os.Chdir(workdir)
        if err != nil {
            return err
        }

        // For now, the whole archive is expected in the first message.
        body, err := stream.Recv()
        if err != nil {
            // io.EOF here means the client closed the stream without sending anything.
            return err
        }
        // We have all the file
        // Now let's extract the zipfile
        // ...
        return stream.SendAndClose(&pb.Id{
            Tmpdir: workdir,
        })
    }

Going further…

The problem with this architecture is that it’s stateful, and therefore not easily scalable.

A solution would be to store the zip file in a third-party service and identify it with a unique ID, then call the Terraform commands with this ID as a parameter. The Terraform engine would then grab the zip file from the third-party service if needed and process the files.

Implementing a backend micro-service

I want to keep the same logic, therefore the storage service can be a gRPC microservice. We can then have different services (such as s3, google storage, dynamodb, NAS, …) written in different languages.

The Terraform service will act as a client of this “backend” service (take care, it is not the same backend as the one defined within Terraform).

Our Terraform service can then be configured at runtime to call the host/port of the correct backend service. We can even imagine the backend address being served via Consul.

This is a work in progress and may be part of another blog post.

Hip is cooler than cool: Introducing Nhite

I have talked to some people about all this stuff and I feel that people are interested. Therefore, I have set up a GitHub organisation and a GitHub project to centralize what I will do around that.

The project is called Nhite.

There is still a lot to do, but I really think it could make sense to create a community around it. It may turn into a product in the end, or go to my attic of dead projects. Anyway, so far I’ve had a lot of fun!