Milyn / MILYN-27

EdiSax issues/improvements offered by Dave Degroff

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Smooks v1.6
    • Component/s: EDI
    • Labels:
      None
    • Number of attachments :
      0

      Description

      Dave Degroff said......

      There are a couple of issues I'd like to raise with the experience I've had with EDI processors:

      1. Would it be possible to enable EDISax to dynamically "discover" delimiters? Here's an example:

      ISA*00* 00 ZZ*3327233476332 *ZZ*INTERNAL_TP *010806*1200*U*00401*000000009*0*T:~

      In this segment, the only "known" when starting to parse is the segment identifier ("ISA"). The byte immediately following this identifier becomes the element delimiter for the remainder of the message ("*").

      The EDI protocol also provides for dynamic discovery of the sub-element delimiter and the segment delimiter. To wit:

      The first element (the bytes found between element delimiters following "ISA"), also known as ISA01, has a defined name and purpose. The next element (ISA02) does as well, and so on to ISA16. ISA16 has a special purpose: defining the component element separator (":" in the example).

      Furthermore, the byte(s) that immediately follow ISA16 (and precede the next segment identifier, "GS") become the segment delimiter. In the example, this is the tilde ("~").

      2. In your example, the only repeating elements are "field" and "segment". Would it be possible to include another collection ("loops"), that would include a predetermined group of segments?

      3. Speaking of loops, a collection of segments may contain the same segments, but may have different meanings for mapping. For example:

      NM1*85*2*PREMIER BILLING SERVICE*****24*587654321
      N3*234 SEAWAY ST
      N4*MIAMI*FL*33111
      NM1*87*2*KILDARE ASSOC*****24*581234567
      N3*2345 OCEAN BLVD
      N4*MIAMI*FL*33111

      The first collection of NM1, N3 and N4 segments represents one entity ("Billing Provider"). The second collection represents a different entity ("Pay-to Provider"). The means of distinguishing between the two collections is the NM101 element ("85" for "Billing Provider"; "87" for "Pay-to Provider").

      Would it be possible to account for conditional content-determining to distinguish between similar objects?
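The delimiter discovery described in point 1 can be sketched roughly as follows. This is a minimal illustration, not EDISax code: the class name `IsaDelimiters` and its `discover` method are hypothetical, and the sketch assumes the message starts with a well-formed ISA segment containing all 16 elements and single-character delimiters.

```java
// Hypothetical helper: discovers X12 delimiters from the leading ISA segment.
final class IsaDelimiters {
    final char element;    // byte right after "ISA"
    final char component;  // value of ISA16
    final char segment;    // byte right after ISA16

    private IsaDelimiters(char element, char component, char segment) {
        this.element = element;
        this.component = component;
        this.segment = segment;
    }

    static IsaDelimiters discover(String message) {
        if (message == null || message.length() < 4 || !message.startsWith("ISA")) {
            throw new IllegalArgumentException("Message does not start with an ISA segment");
        }
        char element = message.charAt(3);

        // Walk past 16 element delimiters; ISA16 starts right after the 16th.
        int pos = 2;
        for (int i = 0; i < 16; i++) {
            pos = message.indexOf(element, pos + 1);
            if (pos < 0) {
                throw new IllegalArgumentException("ISA segment has fewer than 16 elements");
            }
        }
        if (pos + 2 >= message.length()) {
            throw new IllegalArgumentException("ISA segment is truncated");
        }
        char component = message.charAt(pos + 1); // ISA16
        char segment = message.charAt(pos + 2);   // segment delimiter
        return new IsaDelimiters(element, component, segment);
    }
}
```

Counting element delimiters rather than relying on fixed byte offsets keeps the sketch usable on input whose padding has been altered, at the cost of assuming that no ISA element contains the element delimiter itself.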

          Activity

          Ivan Peev added a comment -

          Hi Tom,

          First of all, thanks for the "dumb" EDI parser. I like it when simple things are kept simple. I'm more interested in the EDI parser independent of other "context" solutions. In the short time since my last comment, I was able to come up with a solution for the third point. Here is my proposal:

          1. Extend the segcode attribute of the segment element in the configuration to accept specifications like these:

          segcode="NM1*85"

          and

          segcode="NM1*87"

          where the * separator can differ, depending on the specified field delimiter.

          Then modify line 298 of the existing parser:

          			if(!currentSegmentFields[0].equals(expectedSegment.getSegcode())) {
          

          to

          			if ( !equalSegment( currentSegmentFields, expectedSegment.getSegcode() ) ) {
          

          Here is the code for the equalSegment function:

          		/**
          		 * Return true if the current segment matches segcode.
          		 * @param currentSegment Current segment fields.
          		 * @param segcode Segment code to match; empty fields are skipped.
          		 */
          		private boolean equalSegment( String[] currentSegment, String segcode )
          		{
          			boolean result = false;
          
          			String[] segcodeFields = StringUtils.splitPreserveAllTokens(segcode, mappingModel.getDelimiters().getField());
          
          			int fieldsCount = segcodeFields.length;
          			// The segment needs at least as many fields as the segcode specifies.
          			if ( currentSegment.length >= fieldsCount )
          			{
          				result = true;
          
          				for ( int fieldIndex = 0; fieldIndex < fieldsCount; fieldIndex++ )
          				{
          					String segcodeField = segcodeFields[fieldIndex];
          					if ( segcodeField.isEmpty() )
          					{	// Field not specified. Skip it.
          						continue;
          					}
          
          					if ( !currentSegment[fieldIndex].equals( segcodeField ) )
          					{	// Current segment doesn't equal segcode.
          						result = false;
          						break;
          					}
          				}
          			}
          
          			return result;
          		}
          

          In this way the parser will be able to handle a requirement like the one in the third point. Tom, let me know what you think of the proposal.

          Tom Fennelly added a comment -

          So how would that work if the field you wanted to "switch" on was not the first field in the segment?

          I wonder how it would work if you had an optional XPath-like condition attribute. You could define something like this (see the "where" attribute):

          <medi:segment segcode="NM1" where="field4/a=85" xmltag="xxxxx">
          	<medi:field xmltag="field1" />
          	<medi:field xmltag="field2" />
          	<medi:field xmltag="field3" />
          	<medi:field xmltag="field4">
          		<medi:component xmltag="a" />
          		<medi:component xmltag="b" />
          	</medi:field>
          	<medi:field xmltag="field5" />
          </medi:segment>
          
          Ivan Peev added a comment -

          Tom,

          If you take a closer look at the equalSegment function, you will see that when a delimited value in segcode is empty, it is skipped. So let's say you want to check the second value instead. You would specify it like this:

          segcode="NM1**85"

          Now it will check that the second delimited value after the segment code is 85.

          I was thinking about a more extensive solution like the one you are proposing. The question is whether it is worth it at this point if we don't have a good business case for it. I know for sure that the solution I suggest will work in my particular case, and it is not very intrusive to the current source code. A second benefit is that the specification closely resembles the incoming input data, so it feels natural.
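The skip-empty-field behaviour can be tried in isolation with a sketch like the one below. It is a self-contained approximation of the equalSegment idea, not the parser's code: the class `SegcodeMatcher` is hypothetical, it takes the raw segment text rather than pre-split fields, and it uses `String.split` with a quoted delimiter in place of `StringUtils.splitPreserveAllTokens` so that no external library is needed.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone version of the proposed segcode matching.
final class SegcodeMatcher {
    private final String fieldDelimiter;
    private final String[] segcodeFields;

    SegcodeMatcher(String segcode, String fieldDelimiter) {
        this.fieldDelimiter = fieldDelimiter;
        this.segcodeFields = split(segcode, fieldDelimiter);
    }

    // A limit of -1 keeps empty tokens, so "NM1**85" yields ["NM1", "", "85"].
    private static String[] split(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter), -1);
    }

    boolean matches(String segment) {
        String[] fields = split(segment, fieldDelimiter);
        if (fields.length < segcodeFields.length) {
            return false; // segment is shorter than the segcode specification
        }
        for (int i = 0; i < segcodeFields.length; i++) {
            if (segcodeFields[i].isEmpty()) {
                continue; // field not specified in segcode: skip it
            }
            if (!segcodeFields[i].equals(fields[i])) {
                return false;
            }
        }
        return true;
    }
}
```

With the segments from the description, `new SegcodeMatcher("NM1*85", "*")` accepts the "Billing Provider" NM1 segment and rejects the "Pay-to Provider" one, while `new SegcodeMatcher("NM1**2", "*")` ignores NM101 and accepts both because each has "2" in the following field.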

          Tom Fennelly added a comment -

          Ah OK... I missed that Ivan, thanks.

          Yeah, this does seem nice and clean. I prefer it to the "where" attribute. We can do this.

          Tom Fennelly added a comment -

          Added support for matching segments using a regex pattern. See MILYN-220.
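The details of MILYN-220 are not shown in this issue, so the following is only a rough sketch of what regex-based segment matching could look like. It assumes the pattern is applied to the start of the raw segment text; the class `RegexSegmentMatch` and the sample patterns are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of matching a segment against a regex segcode.
final class RegexSegmentMatch {
    static boolean matches(String segcodeRegex, String segment) {
        // lookingAt() anchors the match at the start of the segment
        // but does not require the pattern to cover the whole segment.
        return Pattern.compile(segcodeRegex).matcher(segment).lookingAt();
    }
}
```

Under that assumption, a pattern such as `NM1\*85` would select the "Billing Provider" segment, and `NM1\*[^*]*\*2` would play the role of the earlier `NM1**2` form by accepting any NM101 value. The field delimiter has to be escaped because `*` is a regex metacharacter.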


            People

            • Assignee:
              Bård Langöy
              Reporter:
              Tom Fennelly
            • Votes:
              2
              Watchers:
              2