Archiva / MRM-1347

Migrate repository proxy to the new repository API

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: remote proxy
    • Labels:
    • Number of attachments: 0

      Issue Links

        Activity

        Brett Porter added a comment -

        Note that we probably want this to plug into the resolution mechanism, not just be a front-end. For example, requesting a project in browse where the parent POM is not already in the repository should resolve it correctly. This isn't something the proxy is capable of doing today.

        Also, this should be linked to some related issues that can be addressed at the same time. For example, by ensuring we use a streaming API we can address the issues about requests blocking.
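
Illustrative sketch (not part of the original comment): one possible reading of "streaming" here is that the proxy relays a remote artifact to the client in chunks while writing the same bytes to storage, instead of downloading the whole file before responding. The class and method names below are hypothetical and are not Archiva APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamingRelaySketch {
    /**
     * Copies the remote stream to both the client and the storage stream,
     * one buffer at a time, and returns the number of bytes relayed.
     * The whole artifact is never held in memory.
     */
    static long relay(InputStream remote, OutputStream client, OutputStream storage)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = remote.read(buffer)) != -1) {
            client.write(buffer, 0, read);   // the client sees bytes as they arrive
            storage.write(buffer, 0, read);  // the same bytes are cached locally
            total += read;
        }
        client.flush();
        storage.flush();
        return total;
    }
}
```

A real implementation would also need to discard a partially written storage file if the remote stream fails midway; that is left out of this sketch.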

        Maria Odea Ching added a comment -

        From what I understand, these are the things that need to be changed for the repository proxy:

        1. For artifact requests, first check through the metadata resolver whether the artifact exists in the managed repository.
        2. If the artifact is not found:
          1. resolve the artifact from the remote repository
          2. if it is resolved from the remote repository, proxy the artifact
          3. update the metadata of the proxied artifact
        3. Otherwise, if the artifact is found:
          1. get the actual artifact file from storage and return it

        For step 2.1, a new metadata resolver for remote repositories needs to be written. One clarification on this: will a remote repository have its own metadata repository?
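
Illustrative sketch (not part of the original comment): the numbered steps above could be expressed roughly as follows. All types here (`ManagedRepository`, `RemoteRepository`) are hypothetical stand-ins invented for this example, with plain maps in place of the metadata resolver and file storage; they are not Archiva classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ProxyLookupSketch {
    /** Hypothetical view of a remote repository. */
    interface RemoteRepository {
        Optional<byte[]> fetch(String path);
    }

    /** Hypothetical managed repository: a metadata index plus file storage. */
    static final class ManagedRepository {
        final Map<String, String> metadata = new HashMap<>();
        final Map<String, byte[]> storage = new HashMap<>();
    }

    static Optional<byte[]> request(String path, ManagedRepository managed,
                                    RemoteRepository remote) {
        // Step 1: consult the managed repository's metadata first.
        if (managed.metadata.containsKey(path)) {
            // Step 3.1: the artifact is known, so return the stored file.
            return Optional.ofNullable(managed.storage.get(path));
        }
        // Step 2.1: not found locally, so try the remote repository.
        Optional<byte[]> fetched = remote.fetch(path);
        if (fetched.isPresent()) {
            // Step 2.2: proxy (cache) the artifact in local storage.
            managed.storage.put(path, fetched.get());
            // Step 2.3: record metadata for the proxied artifact.
            managed.metadata.put(path, "proxied");
        }
        return fetched;
    }
}
```

With this shape, a second request for the same path is answered from storage without contacting the remote repository again.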

        Brett Porter added a comment -

        There are a few things to clarify here.

        Firstly, I don't think the logic for the proxy needs to change at all, and there are quite a few rules in there about how it does things. There are plenty of tests too, though you might want to shuffle them around to get a better separation of the unit tests and the "integration tests" that go via webdav. What we're looking to achieve is to improve the way it is structured in the code (to the point where it could be turned on or off as a module, rather than being intertwined with the webdav stuff).

        The key to the new architecture is that you want to obtain the metadata first, and only obtain the artifact when that is requested. This can get a bit confusing in Maven since it has its own metadata, POM metadata, and then there is other metadata for plain artifact files. The proxy should become a little "dumber": it shouldn't know anything about repository storage or remote repo formats. It basically sits in between, trying to get metadata / artifacts from storage and going remote if necessary, though it will still need to do a number of filtering operations (convert paths for the remote repo, whitelist/blacklist, search multiple remotes, error handling, determine if it needs to update something already in storage).

        Not sure if that's making sense - we should sketch this out on a wiki page some more.
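
Illustrative sketch (not part of the original comment): a "dumber" proxy along these lines might depend only on a storage abstraction and a list of remote connectors, and limit itself to filtering and fallback. The `Storage` and `Remote` types are hypothetical, content is modelled as a plain string, and path conversion and update checks are omitted to keep the example short.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

final class FilteringProxySketch {
    /** Hypothetical storage abstraction; the proxy never sees files or formats. */
    interface Storage {
        Optional<String> read(String path);
        void write(String path, String content);
    }

    /** Hypothetical remote connector with its own whitelist/blacklist rules. */
    static final class Remote {
        final String id;
        final Predicate<String> whitelist;
        final Predicate<String> blacklist;
        final Function<String, Optional<String>> fetch;

        Remote(String id, Predicate<String> whitelist, Predicate<String> blacklist,
               Function<String, Optional<String>> fetch) {
            this.id = id;
            this.whitelist = whitelist;
            this.blacklist = blacklist;
            this.fetch = fetch;
        }

        boolean accepts(String path) {
            return whitelist.test(path) && !blacklist.test(path);
        }
    }

    static Optional<String> resolve(String path, Storage storage, List<Remote> remotes) {
        // Storage first: metadata or artifact, the proxy does not care which.
        Optional<String> local = storage.read(path);
        if (local.isPresent()) {
            return local;
        }
        // Otherwise search the remotes in order, honouring each one's filters.
        for (Remote remote : remotes) {
            if (!remote.accepts(path)) {
                continue;
            }
            try {
                Optional<String> fetched = remote.fetch.apply(path);
                if (fetched.isPresent()) {
                    storage.write(path, fetched.get());
                    return fetched;
                }
            } catch (RuntimeException e) {
                // Error handling: a failing remote must not stop the search.
            }
        }
        return Optional.empty();
    }
}
```

Because the proxy only talks to these two abstractions, it could in principle be enabled or disabled as a module without touching the webdav layer.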

        Brett Porter added a comment -

        BTW, here are the references I found to the model and repository layer that we'd be aiming to remove. It's probably not a one-for-one substitution in this case due to the redesign.

        • Keys (for debug logging) and RepositoryURL in DRPC
        • ArtifactReference in DRPC, RPC
        • RepositoryConnectors from repo-layer could be folded into this module in some way
        • MetadataTools (and others in that package) used for merging in DRPC can be pushed back into the maven2 repository implementation
        • ManagedRepositoryContent and RepositoryContentFactory are used quite a bit for path calculation in DRPC, which would move to the storage (means rethinking the tmpdir calculation and how to get the connectors)
        • RemoteRepositoryContent is something that needs to be rethought and perhaps added to the metadata repository API


          People

          • Assignee: Unassigned
          • Reporter: Maria Odea Ching
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
              Updated: