Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.4
Fix Version/s: 1.0RC1
Component/s: Client, Http Light Server, Replicator (Harmony)
Labels: None
Description
1) HydraClient will cache the Version information from each GET using a soft reference
2) HydraClient will be responsible for object serialization
3) The server, not the client, will be responsible for incrementing the version
The client interface will look something like this for now:
Serializable get(String key)
put(String key, Serializable object) throws VersionConflictException
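To make the three points above concrete, here is a minimal sketch of how such a client could behave. It is not the actual Hydra implementation: StubServer, SketchHydraClient, the in-memory store, and the rule that an unknown version counts as 0 are all assumptions made for illustration. Only the get/put signatures and VersionConflictException come from the description.

```java
import java.io.*;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class VersionConflictException extends Exception {
    VersionConflictException(String message) { super(message); }
}

// Hypothetical stand-in for the server: it stores bytes and owns the version counter.
class StubServer {
    static final class Entry {
        final long version;
        final byte[] content;
        Entry(long version, byte[] content) { this.version = version; this.content = content; }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<String, Entry>();

    Entry get(String key) { return store.get(key); }

    // Accepts the write only if the caller saw the current version; the server increments it.
    synchronized long put(String key, long seenVersion, byte[] content) throws VersionConflictException {
        Entry current = store.get(key);
        long currentVersion = (current == null) ? 0 : current.version;
        if (seenVersion != currentVersion) {
            throw new VersionConflictException(key + ": saw " + seenVersion + ", server has " + currentVersion);
        }
        store.put(key, new Entry(currentVersion + 1, content));
        return currentVersion + 1;
    }
}

// Hypothetical client: hides serialization and remembers the version seen for each key.
class SketchHydraClient {
    private final StubServer server;
    // Held softly, so the JVM may reclaim remembered versions under memory pressure.
    private final Map<String, SoftReference<Long>> versions =
            new ConcurrentHashMap<String, SoftReference<Long>>();

    SketchHydraClient(StubServer server) { this.server = server; }

    Serializable get(String key) throws IOException, ClassNotFoundException {
        StubServer.Entry entry = server.get(key);
        if (entry == null) return null;
        versions.put(key, new SoftReference<Long>(entry.version));
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(entry.content))) {
            return (Serializable) in.readObject();
        }
    }

    void put(String key, Serializable object) throws IOException, VersionConflictException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        SoftReference<Long> ref = versions.get(key);
        Long seen = (ref == null) ? null : ref.get();
        // Assumption: a never-read or reclaimed version is sent as 0 ("expect no entry").
        long newVersion = server.put(key, seen == null ? 0 : seen, bytes.toByteArray());
        versions.put(key, new SoftReference<Long>(newVersion));
    }
}
```

With two clients sharing one StubServer, a client that read the entry before another client's PUT gets a VersionConflictException on its own PUT and has to GET again before retrying. In this sketch a reclaimed soft reference leads to the same outcome, since the client then sends 0 for an entry the server already holds.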
-------------------------------------------
Some background discussion on this topic.
Hi David,
Thanks for the input and here is my thinking.
1) Hide the Data implementation. 2) Version information, what do we do with that?
I put these two points together since they are closely related. The only reason I am reluctant to expose the Data class is because it contains the Version information. Just like using a version control system, 95% of the time we only care about the content, not the version; the version matters only when there is a conflict. Most VCSs won't even bother to show you the version of each file during a normal update. As framework designers I think we should try hard to minimize the amount of work involved in the normal execution path, since it is the most common one and a slight reduction in complexity can go a long way for the users.
Now just imagine what the client code will look like when the user needs to explicitly deal with the versioning.
[Early: Thread A]
// The data should be the latest one since the server performs version consolidation
Data shoppingCartData = hydraCacheClient.get("shoppingCart-16");
ShoppingCart shoppingCart = (ShoppingCart) deserializer.deserialize(shoppingCartData.getContent());
display(shoppingCart);
[Later: Thread B] - Scenario A - Without local caching
Parameters params = receiveUserInput();
// Get latest shopping cart again
Data shoppingCartData = hydraCacheClient.get("shoppingCart-16");
ShoppingCart shoppingCart = (ShoppingCart) deserializer.deserialize(shoppingCartData.getContent());
updateShoppingCart(shoppingCart, params);
// Do we do it here or leave it to the server?
Version newVersion = shoppingCartData.getVersion().incrementFor(???);
hydraCacheClient.put("shoppingCart-16", new Data(shoppingCartData.getKeyHash(), newVersion, serializer.serialize(shoppingCart)));
Problems with scenario A:
- The extra GET before each PUT is a bit expensive
- Since we always get the latest data before the PUT, it will cause a last-write-wins situation
- Not clear who should be responsible for the version increment and which node to use, since the client probably doesn't have a clue
- A lot of plumbing code
[Later: Thread B] - Scenario B - With local caching
Parameters params = receiveUserInput();
// Get latest shopping cart from local cache
Data shoppingCartData = (Data) session.getAttribute("shoppingCart");
ShoppingCart shoppingCart = (ShoppingCart) deserializer.deserialize(shoppingCartData.getContent());
updateShoppingCart(shoppingCart, params);
// Do we do it here or leave it to the server?
Version newVersion = shoppingCartData.getVersion().incrementFor(???);
hydraCacheClient.put("shoppingCart-16", new Data(shoppingCartData.getKeyHash(), newVersion, serializer.serialize(shoppingCart)));
Problems with scenario B:
- The local cache is not as scalable as the Hydra cache, so it can't hold too many records locally
- Not clear who should be responsible for the version increment and which node to use, since the client probably doesn't have a clue
- Still a lot of plumbing code
[Later: Thread B] - Scenario C - If Hydra client and server can deal with it
Parameters params = receiveUserInput();
// Get current shopping cart, would be better if it can be reconstructed instead of cached
ShoppingCart shoppingCart = (ShoppingCart) session.getAttribute("shoppingCart");
updateShoppingCart(shoppingCart, params);
hydraCacheClient.put("shoppingCart-16", shoppingCart);
Pros for Scenario C:
- A lot less plumbing code, so the user can focus on the business logic
- It makes sense to hide the serialization and versioning code from the client
Problems with Scenario C:
- A local cache is still used
- Need to implement a different API call to retrieve the version information if the client actually needs it
Now let's discuss these 3 scenarios (or any other ones we can think of) and see which one makes sense. I personally prefer Scenario C. Then we can talk about how exactly we can implement these different scenarios.
1) Hide the Data implementation. 2) Version information, what do we do with that?
I put these two points together since they are closely related. The only reason I am reluctant to expose the Data class is because it contains the Version information.
----------------------------------------
But Version is needed for resolving conflicts, isn't it?
// The data should be the latest one since the server performs version consolidation
Data shoppingCartData = hydraCacheClient.get("shoppingCart-16");
How can the server consolidate data? I thought only the client was in the position to do so:
- Ask for shopping cart 16
- Receive two of them because the server has found two parallel unresolvable versions
- Merge them
- Put the merged version back, flagged as resolved
I think I have a misunderstanding of how the conflict resolution will work, from which stem my inadequate answers to your questions.
D.
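For reference, the client-side flow David lists above (ask, receive two parallel versions, merge, put back flagged as resolved) could be sketched roughly as follows. Every name here is hypothetical, and the union merge policy is only one possible choice, not something the thread settles on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// One of several parallel versions returned by a single GET (hypothetical type).
final class CartSibling {
    final String versionLabel; // opaque to the client
    final Set<String> items;
    CartSibling(String versionLabel, Set<String> items) {
        this.versionLabel = versionLabel;
        this.items = items;
    }
}

// What the client would PUT back: the merged content plus the versions it resolves.
final class ResolvedCart {
    final Set<String> items;
    final List<String> resolvedVersions;
    ResolvedCart(Set<String> items, List<String> resolvedVersions) {
        this.items = items;
        this.resolvedVersions = resolvedVersions;
    }
}

final class CartMergeSketch {
    // Assumed merge policy: take the union of the items so that no addition is lost.
    static ResolvedCart merge(List<CartSibling> siblings) {
        Set<String> union = new TreeSet<String>();
        List<String> resolved = new ArrayList<String>();
        for (CartSibling sibling : siblings) {
            union.addAll(sibling.items);
            resolved.add(sibling.versionLabel);
        }
        return new ResolvedCart(union, resolved);
    }

    public static void main(String[] args) {
        List<CartSibling> siblings = Arrays.asList(
                new CartSibling("a", new TreeSet<String>(Arrays.asList("book", "pen"))),
                new CartSibling("b", new TreeSet<String>(Arrays.asList("book", "lamp"))));
        ResolvedCart merged = merge(siblings);
        System.out.println(merged.items + " resolves " + merged.resolvedVersions);
    }
}
```

A union cannot tell a removed item from one that was never added, so an item deleted in one parallel version reappears after the merge. This is the kind of decision that only application code can make, which is why the merge sits on the client in this flow.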
----------------------------------------
Hi David,
Now I see where the confusion is. A while back we had a discussion about how vector clocks would be used to preserve multiple parallel versions so that the client can consolidate them, but because we did not yet understand how the client would use the vector clock, we decided to release v1.0 without vector clock versioning and implement only Increment versioning. With Increment versioning in place the entire Hydra space will pretty much act like an ORM with optimistic locking; in other words, the Hydra space will be operating in the eager (but delayed) consistency mode rather than the eventual consistency mode for now. I think this is a good starting point since, even based on Amazon's experience, most applications will be running with increment versioning instead of vector clocks, so we can add this feature as an improvement in a future release once we know more about how the system will be used.
That being said, a few months after our initial discussion, I think I now have a vague idea of how this can be achieved without introducing too much overhead for the client.
Regular PUT:
try {
    hydraClient.put("shoppingCart", shoppingCart);
} catch (VersionConflictException ex) {
    handleConflict(ex);
}
PUT/GET with consolidation option:
hydraClient.put("shoppingCart", shoppingCart, new ShoppingCartConflictResolver());
hydraClient.get("shoppingCart", new ShoppingCartConflictResolver());
But this is for future releases. For now, since the Increment is a universal counter, the server can easily consolidate the versions for a GET operation using ArbitraryResolver to pick the most recent one. For PUT, just like Hibernate optimistic locking, the Hydra space will reject a PUT operation if the Data it receives is older than the Data produced by a collective GET.
Does this make sense to you?
Nick
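As an illustration of the Increment behaviour Nick describes, here is a rough sketch of a space that consolidates a GET across nodes by picking the highest counter and rejects a PUT based on older data. It is not the real ArbitraryResolver or server code: the class, its methods, and the reading of "most recent" as "highest counter" are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the Hydra space under Increment versioning.
final class IncrementSpaceSketch {
    static final class VersionConflictException extends Exception {
        VersionConflictException(String message) { super(message); }
    }

    static final class Data {
        final long version;
        final String content;
        Data(long version, String content) { this.version = version; this.content = content; }
    }

    // Each map stands in for one node's copy of the space.
    private final List<Map<String, Data>> nodes = new ArrayList<Map<String, Data>>();

    IncrementSpaceSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) nodes.add(new HashMap<String, Data>());
    }

    // Collective GET: ask every node and keep the copy with the highest counter.
    Data get(String key) {
        Data newest = null;
        for (Map<String, Data> node : nodes) {
            Data candidate = node.get(key);
            if (candidate != null && (newest == null || candidate.version > newest.version)) {
                newest = candidate;
            }
        }
        return newest;
    }

    // PUT: reject data older than the consolidated copy; otherwise the server bumps the counter.
    Data put(String key, long seenVersion, String content) throws VersionConflictException {
        Data current = get(key);
        long currentVersion = (current == null) ? 0 : current.version;
        if (seenVersion < currentVersion) {
            throw new VersionConflictException(key + ": based on " + seenVersion + ", space has " + currentVersion);
        }
        Data stored = new Data(currentVersion + 1, content);
        for (Map<String, Data> node : nodes) node.put(key, stored);
        return stored;
    }

    // Demo hook: overwrite one node's copy to simulate a node that missed later writes.
    void writeToNode(int node, String key, Data data) { nodes.get(node).put(key, data); }
}
```

Because the counter is a single number, two copies can always be ordered, so a GET never has to return parallel versions and a lagging node is simply outvoted by any node holding a higher counter. A stale PUT fails the same way an optimistic-locking update does, and the caller is expected to GET again and retry.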