plicense 23 days ago [-]
To summarize:

1. Your build pipeline has a lot of hermetic actions.

2. To speed it up, you execute these actions remotely in isolated environments, cache the results, and reuse them when possible.

Pretty neat.

You might want to look into https://goo.gl/TB49ED and https://console.cloud.google.com/marketplace/details/google/... if you need a managed service to do just that.

peterwwillis 23 days ago [-]
I suggest a documentation cleanup. The initial README should have blurbs about who should use it, what it's for, how it works, and links to example use cases. A quick start guide steps a user through accomplishing a simple task and links to extended documentation. Extended documentation is the reference guide to the latest code, and should be generated from the code. I would not suggest splitting documentation up into multiple places (a readme here, a lengthy blog post there, plus a discombobulated wiki); all documentation should be accessible from a single portal, with filtering capabilities (search is incredibly difficult to make accurate, whereas filtering is easy and effective).

This whole solution seems like a very custom way to use Docker. You can already create custom Docker images with specific content, use multi-stage builds to cache layers, split pipelines into sections that generate static assets and pull the latest ones based on a checksum of their inputs, etc. I think the cost of maintaining this solution will far outweigh that of just using existing tooling differently.

pasxizeis 20 days ago [-]
Thanks for the suggestions, we'll improve the documentation soon.
rossmohax 24 days ago [-]
In Docker that would be something like:

  COPY Gemfile Gemfile.lock /src/
  RUN bundle install

Even if they don't use Docker to run the application in prod, it can be [ab]used to perform efficient build-layer (build step) caching and distribution.
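
A fuller sketch of that pattern, assuming a Rails-style app (the base image and the precompile step are illustrative, not from the article):

  FROM ruby:2.6
  WORKDIR /src
  # Rebuilt only when the Gemfile or lockfile changes, so gem
  # installation is effectively cached between builds.
  COPY Gemfile Gemfile.lock ./
  RUN bundle install
  # Source changes invalidate only the layers from here down.
  COPY . .
  RUN bundle exec rake assets:precompile
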
pasxizeis 21 days ago [-]
That's what a mistry recipe looks like, with some additional conventions (e.g. params are accessible in the project as files located in `/data/params`). mistry recipes are essentially Dockerfiles.
quickthrower2 23 days ago [-]
The problem with Docker is that if something early on changes, then everything after it needs to be rebuilt.

However, with a build you might have A and B feeding into C. If A changes and B hasn't, you want to rebuild only A and get B from the cache.

caleblloyd 23 days ago [-]
One solution in Docker is multistage builds where A and B can be separated into intermediate images with results copied into C.
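
A minimal sketch of that shape (stage names, base image, and build commands are placeholders):

  # A and B build in independent stages; a change under a/ leaves B's cached layers intact.
  FROM alpine:3.10 AS build-a
  COPY a/ /a/
  RUN /a/build.sh /out-a    # placeholder build step

  FROM alpine:3.10 AS build-b
  COPY b/ /b/
  RUN /b/build.sh /out-b    # placeholder build step

  # C just assembles the results of A and B.
  FROM alpine:3.10 AS c
  COPY --from=build-a /out-a /app/a
  COPY --from=build-b /out-b /app/b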

A pattern that I like to use is two multistage build Dockerfiles- `dev` and `prod`.

The `dev` image has 2 stages. The first stage copies only the files required by the package manager, such as `package.json`, into another directory using a combination of `find` and `cp --parents`, then restores dependencies. The second stage copies the dependencies from the first stage and overlays the source code. The `dev` image is then instantiated to run all tests.

The `prod` image also has 2 stages. The first stage starts with the `dev` image and publishes a production bundle to a directory. The second stage starts with a clean image and copies the production bundle from the first stage.
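
A rough sketch of the `dev` Dockerfile for a Node-style project (image tags, paths, and the manifest-extraction details are my assumptions, not necessarily the setup described above):

  # --- stage 1: extract only the package manager manifests ---
  FROM node:10 AS manifests
  WORKDIR /tmp/src
  COPY . .
  RUN mkdir /manifests && \
      find . -name package.json -not -path '*/node_modules/*' \
        -exec cp --parents {} /manifests \;

  # --- stage 2: restore dependencies, then overlay the source ---
  FROM node:10 AS dev
  WORKDIR /app
  # The install layer stays cached as long as no package.json changed,
  # because Docker checksums the copied manifest tree.
  COPY --from=manifests /manifests ./
  RUN npm install
  COPY . .
  CMD ["npm", "test"]

Keeping the manifest extraction in its own stage means a change to application source alone never invalidates the install layer.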

nstart 23 days ago [-]
Curious what the HN community feels is a "slow deploy". I scanned the article first to find time reductions and still couldn't see how much time a build actually takes at the end of it.

11 minutes is a great time reduction (11 minutes × 30 builds a day = 5.5 hours saved in total).

But I'm still not sure what constitutes a slow build. I assume at some point there's an asymptotic curve of diminishing returns where, in order to shave off a minute, the complexity of the pipeline increases dramatically (caching being a tricky example). So do y'all have any opinions on what makes a build slow for you?

pasxizeis 21 days ago [-]
For us, when assets had to be compiled, a production deploy would take 15-17 minutes. After mistry (when asset compilation time was shaved off), deployment takes ~5 mins.
thinkingkong 23 days ago [-]
I think build/deploy times are secondary. To me the more important part is the time to recover from an outage where a rollback is the fastest or best solution.
arenaninja 22 days ago [-]
The first point isn't so much a change in the build pipeline as it is avoiding the build pipeline altogether and deploying prebuilt artifacts; I can't think of a reason to re-run your build for prod if you have already run it for another environment. In other words, it's recognizing that the build and deployment stages are different.
pasxizeis 21 days ago [-]
Correct. It might sound obvious to some, but if you're an organization from the Rails 0.x era (~2011), you have a lot of legacy code and infrastructure that, albeit not fancy new tech, works.
siscia 23 days ago [-]
This actually touches on a point very close to my work.

We definitely need much more speed in running our pipeline.

The software is mostly C/C++ with a lot of internal dependencies.

Do you guys have any experience in that?

What is worth the complexity and what is not?

Joky 23 days ago [-]
The main difficulty with C/C++ is making the build steps "hermetic". The build system is frequently unaware of which files the preprocessor will touch or look up beforehand.

Build systems like Bazel provide ways to ensure a level of isolation that gives correctness guarantees. They also provide an easy way to statically compute a superset of the files needed for each step of the build, which allows integrating with a distributed service. Some related documentation/discussions can be found starting from https://docs.bazel.build/versions/master/remote-execution.ht... Another good source of information is the online videos from BazelCon: https://conf.bazel.build

danielparks 23 days ago [-]
This is basically parallel-make as a service.

This has been an increasingly difficult problem as more and more pipelines move to containers for testing and building. What other solutions have folks come up with?

jillesvangurp 23 days ago [-]
I'm usually really obsessed with build speeds because I know how much long build times can suck the life out of a team. Slow builds cause a lot of negative behavior and frustration. People sit on their hands waiting for builds to finish, many times per day. It breaks their flow and leads to procrastination. If your build takes half an hour, it's a blocker for doing CI or CD, because it's not really continuous if you need to take 30-minute breaks every time you commit something.

Here are a few tricks I use.

- Use a fast build server. This sounds obvious, but people try to cut costs for the wrong reasons. CPU matters when you are running a build. This is the reason I never liked Travis CI: you could not pay them for faster servers, only for more servers, and they used quite slow instances. When your laptop outperforms your CI server, something is deeply wrong.

- Run your CI/CD tooling in the same data center that your production and staging environments live in, and avoid long network delays from moving e.g. Docker images or other dependencies around the planet. Amazon is great for this as it has local mirrors for a lot of things that you probably need (e.g. Ubuntu and Red Hat mirrors).

- Use build tools that do things concurrently. If you have multiple CPU cores and all but one of them are idling, that's lost time.

- Run tests in parallel. If you do this right, you can max out most of your CPU while your tests are running.

- Learn to test asynchronously and avoid using sleep or other stopgap solutions where your test is basically waiting for something else to catch up, blocking a thread for many seconds during which it does absolutely nothing useful whatsoever. People set timeouts conservatively, so most of that time is wasted. Consider polling instead.

- Avoid expensive cleanups in your integration tests. I've seen completely trivial database applications take twenty minutes to run a few integration tests because somebody decided it was a good idea to rebuild the database schema in between tests. If your tests are dropping and recreating tables, you are going to increase your build time by many seconds for every test you add.

- Randomize test data to avoid tests interacting with each other. So, never re-use the same database ids or other identifiers, and avoid having magical names. This lets you skip deleting data between tests and can save a lot of time. Also, your real-world system is likely to have more than one user, and part of the point of integration tests is finding issues caused by broken assumptions about people doing things at the same time.

- Dockerize your builds and use Docker layers to your advantage. E.g. dependency resolution is only needed if the file that lists the dependencies actually changed. If you are merging pull requests, you can avoid double work because right after the merge the branches are identical and Docker can take advantage of that (a minimal sketch follows this list).
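
For that last point, here is a minimal sketch of the dependency layer for a Gradle/Kotlin project (image tag and file names are illustrative, assuming the Gradle wrapper is checked in):

  FROM openjdk:8-jdk AS build
  WORKDIR /project
  # Copy just the wrapper and build scripts first; this layer and the
  # dependency resolution below are reused until they change.
  COPY gradlew build.gradle.kts settings.gradle.kts ./
  COPY gradle ./gradle
  RUN ./gradlew --no-daemon dependencies
  # A normal commit only touches src/, so the build starts here with
  # all dependencies already cached in the layers above.
  COPY src ./src
  RUN ./gradlew --no-daemon build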

For reference, I have a Kotlin project that builds and compiles in about 3 minutes on my laptop. This includes running over 500 API integration tests against Elasticsearch (running as an ephemeral Docker container). None of the tests delete data (unless that is what we are testing). Our schema is initialized just once.

A cold Docker build for this project on our CI server can take 15 minutes, because it just takes that long to download remote Docker layers, bootstrap all the stuff we need, download dependencies, etc. However, most of our builds don't run cold; typically it takes around 6 minutes from commit to finished deploy, jumping straight into compiling and running tests. Our master branch deploys to a staging environment. When we merge master into our production branch to update production, the Docker images start deploying almost immediately because most of the layers were already built for the master branch and the branches are identical at that point. So a typical warm production push jumps straight to pushing out artifacts and is done in 2 minutes.

deboflo 23 days ago [-]
15 minutes to pull an image is crazy. Run a "docker pull {image}" followed by a "docker build --cache-from {image} ..." to speed up your pulls by 10X.
jillesvangurp 23 days ago [-]
Not pull an image, build an image from scratch.
deboflo 22 days ago [-]
Ah, that makes more sense then.
kardos 23 days ago [-]
Do you have any experience with stitching in a compiler cache here? Is it profitable or is it more complexity than it's worth?
jillesvangurp 23 days ago [-]
Not really worth it on our build since compilation is not that slow. In my experience, you get the biggest time savings from optimizing the process of gathering dependencies and making tests and deployments faster.

With other languages that build dependencies from source, doing that in a separate Docker build step would probably be a good idea, so you can cache the results as a separate Docker layer.
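
A hedged sketch of that idea, with a native dependency compiled in its own stage so the result is cached as a layer (the library name, URL, and versions are made up for illustration):

  FROM debian:buster AS native-deps
  RUN apt-get update && apt-get install -y build-essential curl
  # This layer is rebuilt only when the RUN line below (i.e. the dependency
  # version) changes; otherwise the compiled output comes from the layer cache.
  RUN curl -fsSL https://example.org/libfoo-1.2.tar.gz | tar xz && \
      cd libfoo-1.2 && ./configure && make && make install DESTDIR=/out

  # The application stage just copies the prebuilt artifacts in.
  FROM debian:buster
  COPY --from=native-deps /out /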

deboflo 23 days ago [-]
JavaScript bundles are often a bottleneck in web builds. I wish there were better ways to speed this up.
swsieber 23 days ago [-]
Yarn Plug'n'Play (PnP) is a mechanism developed to speed up yarn install (up to 70% faster).

IIUC, Angular is considering (working on?) using Bazel under the hood to parallelize Angular builds.

deboflo 22 days ago [-]
Thanks, I'll look into those.