DamageControl / DC-162

DamageControl as a hub in a distributed build farm

    Details

    • Type: New Feature
    • Status: Open
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Number of attachments: 0

      Description

      On one of the big ThoughtWorks projects there are around 10 different continuous integration servers doing different things. Running a DamageControl process on each of them took a lot of resources that could otherwise have been used to run the builds faster. Visibility was also limited, because nobody had an overview of all the different builds.

      DamageControl could act as a hub in a distributed build farm by providing a special DistributedBuildExecutor.

      All build requests and all administration are handled on the central DamageControl server/hub, which runs the ordinary (albeit specially configured) DamageControlServer.

      A slave process runs on each slave node, and the DistributedBuildExecutor communicates with it. The slave process runs the ordinary BuildExecutor with a special hub that forwards all events to the central server (which would do all the logging, reporting and so on). The DistributedBuildExecutor should probably use drb rather than xmlrpc to communicate, since the traffic is Ruby-to-Ruby and needs to be as fast as possible.
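
      The event forwarding described above could look roughly like the following. This is only a sketch under assumptions: CentralHub, ForwardingHub and the publish method are hypothetical names, not DamageControl's actual classes, and events are modelled as plain hashes so that they marshal cleanly over drb.

```ruby
require 'drb/drb'

# Hypothetical hub-side object: collects build events sent by slave nodes.
class CentralHub
  attr_reader :events

  def initialize
    @events = []
    @lock = Mutex.new
  end

  # Called remotely by slaves. The event is a plain Hash.
  def publish(event)
    @lock.synchronize { @events << event }
    true
  end
end

# Hypothetical slave-side "special hub": instead of handling events
# locally, it forwards every event to the central server over drb.
class ForwardingHub
  def initialize(hub_uri)
    @remote = DRbObject.new_with_uri(hub_uri)
  end

  def publish(event)
    @remote.publish(event)
  end
end

# Single-process demo. In a real farm the two halves would run in
# separate processes on different machines.
central = CentralHub.new
server = DRb.start_service('druby://127.0.0.1:0', central) # port 0: pick a free port
slave_hub = ForwardingHub.new(server.uri)
slave_hub.publish({ project: 'demo', type: :build_started })
slave_hub.publish({ project: 'demo', type: :build_complete, status: 'success' })
puts central.events.size
server.stop_service
```

      The ordinary BuildExecutor on the slave would not need to know that its hub is remote, which is the point of swapping the hub rather than the executor.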

      It should be possible to pin the DistributedBuildExecutor for a project to a certain server, so that a specific server can be dedicated to running a specific project (for example, the build might need special software or hardware installed).
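
      As an illustration of pinning, here is a minimal sketch. ExecutorPool, pin and slave_for are hypothetical names, and the round-robin fallback for unpinned projects is an assumption of this example rather than something the proposal specifies.

```ruby
# Hypothetical registry that decides which slave node builds a project.
class ExecutorPool
  def initialize(slaves)
    @slaves = slaves # e.g. drb URIs of the slave processes
    @pinned = {}     # project name => slave
    @next = 0
  end

  # Dedicate a slave to a project (it may have special software or hardware).
  def pin(project, slave)
    raise ArgumentError, "unknown slave: #{slave}" unless @slaves.include?(slave)
    @pinned[project] = slave
  end

  # Pinned projects always get their slave; all others rotate through
  # the full list (assumption: pinned slaves still accept other builds).
  def slave_for(project)
    return @pinned[project] if @pinned.key?(project)
    slave = @slaves[@next % @slaves.size]
    @next += 1
    slave
  end
end

pool = ExecutorPool.new(%w[node-a node-b])
pool.pin('needs-oracle', 'node-b')
puts pool.slave_for('needs-oracle')
```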

      If the slave process still takes too many resources (because it runs Ruby), it should be possible to implement it in C. This will of course not be entirely trivial.

          Activity

          Lars Trieloff added a comment -

          From an IM session with Aslak:

          (21:12:57) Lars Trieloff: Another topic: I was thinking about distributed builds with DC.
          (21:13:36) Aslak Hellesoy: yeah - that's something we've wanted to do for a long while
          (21:13:41) Aslak Hellesoy: using drb
          (21:13:48) Lars Trieloff: what is drb?
          (21:14:53) Lars Trieloff: As most of the commands DC uses are using the Command Line, it should be possible to do everything via a remote shell, e.g. ssh
          (21:15:24) Lars Trieloff: you get security for free and need no special build server process.
          (21:16:00) Lars Trieloff: all you need to do is to redirect the commands to the remote shell and parse the remote shell's output.
          (21:16:56) Lars Trieloff: ok, I know now what drb is
          (21:21:16) Aslak Hellesoy: drb: distributed ruby. an rpc protocol/library
          (21:21:33) Lars Trieloff: I've just googled for the doc.
          (21:22:00) Aslak Hellesoy: how would we deal with things like logs and build artifacts?
          (21:22:18) Aslak Hellesoy: the server needs to get them back somehow. scp maybe?
          (21:22:32) Aslak Hellesoy: i like the idea. it's simple
          (21:23:04) Aslak Hellesoy: in the project config you could declare a set of files that should be copied back upon completion
          (21:23:13) Aslak Hellesoy: wdyt?
          (21:23:46) Lars Trieloff: the files are the real problem.
          (21:24:35) Lars Trieloff: copying the files to be merged back is the simplest solution, but how do you implement looking at the working files?
          (21:24:43) Aslak Hellesoy: just sent you a mail with an image
          (21:25:30) Aslak Hellesoy: we could say that for distributed builds you can't
          (21:25:44) Lars Trieloff: I imagined defining a build executor for each known ssh server, this means a single project can be built on three servers, one for R-1, one for R and one for R+1
          (21:26:21) Lars Trieloff: disabling this option would be ok for the start, later one could implement a ssh-vfs.
          (21:26:41) Aslak Hellesoy: yes - it's not that important as a feature
          (21:27:01) Lars Trieloff: there is another problem - i think
          (21:27:33) Lars Trieloff: is the checkout process currently executed by the buildexecutor?
          (21:28:01) Lars Trieloff: if it is not, this would result in outdated files at the buildserver.
          (21:28:40) Aslak Hellesoy: brb - on phone
          (21:29:59) Aslak Hellesoy: you're right - it is currently done by the buildexecutor
          (21:30:08) Lars Trieloff: this is good.
          (21:30:32) Lars Trieloff: is the changeset determination done by the build executor too?
          (21:30:37) Aslak Hellesoy: but before we can do anything with this there is a number of issues we need to solve
          (21:30:53) Lars Trieloff: right.
          (21:30:53) Aslak Hellesoy: no, that is done by the SCM class (CVS/SVN)
          (21:31:01) Lars Trieloff: excellent.
          (21:31:29) Lars Trieloff: we should try to solve the bugs first
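
          The ssh/scp idea from the session above could be sketched like this. RemoteShellPlan and its methods are hypothetical and not part of DC; the sketch only builds the argument lists, which could then be run with something like Open3.capture2e(*argv) so that the remote shell's output can be parsed. It assumes key-based ssh access and file names without spaces.

```ruby
require 'shellwords'

# Hypothetical helper: plans the commands for one remote build.
class RemoteShellPlan
  def initialize(host, remote_dir)
    @host = host
    @remote_dir = remote_dir
  end

  # argv that runs a build command in the remote working directory.
  def build_command(command)
    ['ssh', @host, "cd #{Shellwords.escape(@remote_dir)} && #{command}"]
  end

  # One scp argv per file declared in the project config to be
  # copied back to the central server upon completion.
  def copy_back_commands(files, local_dir)
    files.map do |file|
      ['scp', "#{@host}:#{File.join(@remote_dir, file)}", File.join(local_dir, file)]
    end
  end
end

plan = RemoteShellPlan.new('build1', '/work/demo')
puts plan.build_command('ant test').inspect
puts plan.copy_back_commands(['build.log'], '/tmp/out').inspect
```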

          Aslak Hellesøy added a comment -

          We should definitely take a closer look at distcc (http://distcc.samba.org/doc.html). Some of the theory and concepts might be applicable to DC. Also see

          http://blogs.codehaus.org/people/vmassol/archives/001006_distributed_build.html

          Aslak Hellesøy added a comment -

          This might not be relevant, but we should check out the distributed ideas described at http://www-sop.inria.fr/oasis/ProActive/doc/api/org/objectweb/proactive/doc-files/p2p.html


            People

            • Assignee: Unassigned
            • Reporter: Jon Tirsen
            • Votes: 0
            • Watchers: 0
