A little late to this discussion, but although I'm a big fan of many of the concepts in Concourse, I think it's lacking a lot of polish.<p>For example, I have actually implemented patterns within Jenkins to force all jobs to run inside of containers. A job is just a container + command with some linking using the jenkins pipeline plugin that reads some json configuration to determine how jobs are linked.<p>The primary issues I have surround the fact that my company uses kubernetes, thus we have no insight into the runc containers. Load balancing in concourse is non-existent. If a worker goes down due to load, if you bring it back up, it's going to go down immediately from all the jobs that have been triggered while the worker was offline. Not only that, the resource requirements seem pretty high. Recently a concourse worker stalled because the amount of volumes/images it was caching was over 100 gigs, and not knowing the internals, I wasn't sure what the best way was to clear this cache. Having to tell the infrastructure team that we just need to spend more money is a hard sell when we've upped the cpu, memory, storage, postgres disk, all more than once. I understand that different images have vastly different sizes, and jobs different amounts of work, but their need to be some clear suggestions for sizing. If there exist them now, I apologize but I haven't seen them.<p>So yeah I've had some fun developing in it, but more help making it reliable would be really nice. Also if kubernetes is absolutely the wrong way to run it, which it seems like, I'd have to be provided a better/easier alternative to really become an advocate.<p>Final note: has anyone actually setup metrics/monitoring for concourse that doesn't know BOSH? The docs describing it seem huge unless you already have the infrastructure pieces. Let's setup riemann (we already have statsd and no experience with riemann), emit to influxDB (we have prometheus, no experience with influxDB), then use Grafana (ok we already have that). I just wanted a better idea of disk, cpu, mem, # of containers, lifecycle without having to setup all these new pieces of infrastructure. Finally, just not that interested in BOSH, which all of the example metrics repo's are.