
Key: JANINO-96
Type: New Feature
Status: Resolved
Resolution: Fixed
Priority: Minor
Assignee: Arno Unkrig
Reporter: Gary Nunes
Votes: 0
Watchers: 2
Janino

Feature Request: Obtain parameter information for an evaluated expression

Created: 17/Jul/07 06:35 PM   Updated: 20/Dec/07 04:49 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 2.5.12

Time Tracking:
Not Specified

Environment: Running on Windows and Mac OS X under Java JDK 1.5+


Description
I originally posted this feature request on <user@janino.codehaus.org> and was asked to also submit it here (jira.codehaus.org). I've pasted the contents of the email response to my request (from Arno Unkrig). It also contains the original feature request:

---- email contents begin ----
Hi Gary,

thank you for your precise feature request!

Yes, what you want is possible, but the API is not as slick as the one
that you propose. I WILL integrate the feature into the JANINO API, but
before I do that, I want to introduce the ad-hoc solution to you, to
make sure that it does exactly what you need.

Please let me know your thoughts about it...

Also, if you want to make life easier for me, register on
jira.codehaus.org and submit this feature request there!

CU

Arno

Gary Nunes wrote:
> Perhaps this is possible and I just don't know how to do it. In that case
> this request is for an example of how to do it (instead of a request to
> provide the means to do it).
>
>
> Given an expression to be evaluated, e.g.,
> a + b + c
>
> where, in general, it is not known in advance what parameter names appear in
> the expression (perhaps the expression is provided interactively by a user)
> I would like to be able to query Janino regarding information about the
> parameters used in the expression. Something like
> String[] parameters = ExpressionEvaluator.getParameters("a + b + c");
>
> which, in the case of the above expression would return (as a String array):
> "a", "b", "c"
>
> The information would be used to display the set of parameters in a form
> that the user could manipulate to provide parameter values (e.g., as a
> JTable of parameter name + value columns).
>
> After providing the values the user might press a button that then caused
> the execution of an ExpressionEvaluator construct followed by an evaluate,
> e.g.,
> ExpressionEvaluator ee = new ExpressionEvaluator(
> "a + b + c",
> int.class,
> parameters,
> new Class[] { int.class, int.class, int.class }
> );
>
> Integer res = (Integer) ee.evaluate(getParameterValuesFromTable());
>
>
> I hope I've managed to convey what I'm looking for.
>
> Any advice?
>
>
> Regards,
> Gary Nunes

import java.io.*;

import org.codehaus.janino.*;
import org.codehaus.janino.Java.*;
import org.codehaus.janino.util.Traverser;

public class Checker {

    /**
     * An example of how you can "check" your JANINO expression before you execute it: This program
     * reports on all "ambiguous names" in a JAVA expression, and throws a RuntimeException
     * when it detects a method invocation.
     *
     * E.g., in the expression "a.b + c", "a" and "c" are "ambiguous names" which you should
     * map to parameters.
     */
    public static void main(String[] args) throws Exception {
        Parser parser = new Parser(new Scanner(null, new StringReader(args[0])));
        Java.ReturnStatement returnStatement = new Java.ReturnStatement(null, parser.parseExpression().toRvalueOrPE());

        new Traverser() {

            // Report the first identifier of every ambiguous name, then keep traversing.
            public void traverseAmbiguousName(AmbiguousName an) {
                System.out.println("Need a parameter \"" + an.identifiers[0] + "\"");
                super.traverseAmbiguousName(an);
            }

            // Example of a restriction: reject any method invocation in the expression.
            public void traverseMethodInvocation(MethodInvocation mi) {
                throw new RuntimeException("Method invocation forbidden");
            }
        }.traverseReturnStatement(returnStatement);
    }
}
---- email contents end ----

I hope this satisfies your request.

I will also try your ad-hoc solution and let you know how it works.

Regards,
Gary



Arno Unkrig - 19/Jul/07 11:12 PM
Gary Nunes, 2007-07-18:

Arno,

I've tried the Checker class that you provided above and it mostly meets my
needs. There are two problems, one of which I think I can deal with and one,
perhaps not.

First, note that although I used "a + b + c" as my example expression in my
feature request the expressions will, in fact, tend to be more complicated.
For example:
"b0 * P^b1 * (S * (1 - R))^b3 * (S * (1 - R))"

for which Checker returns the output:
Need a parameter "b0"
Need a parameter "P"
Need a parameter "b1"
Need a parameter "S"
Need a parameter "R"
Need a parameter "b3"
Need a parameter "S"
Need a parameter "R"

This illustrates the first problem. Checker returns the same identifier more
than once (e.g., "S" and "R"). This problem I can deal with by removing
duplicates as I collect the identifiers.
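
Removing duplicates while collecting can be sketched with a `LinkedHashSet`, which ignores repeats but keeps the order in which names were first seen. This is only a minimal sketch: `NameCollector` is a hypothetical helper, not part of Janino, and in the Checker above its `add` method would be called from `traverseAmbiguousName` instead of printing.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Janino): collects names, ignoring repeats. */
final class NameCollector {
    // LinkedHashSet drops duplicates but remembers insertion order.
    private final Set<String> names = new LinkedHashSet<>();

    /** Records a name; a name that was seen before is silently ignored. */
    void add(String name) {
        names.add(name);
    }

    /** Returns the distinct names in the order they were first seen. */
    List<String> distinct() {
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        NameCollector collector = new NameCollector();
        // The names reported by Checker for the example expression above.
        for (String n : new String[] {"b0", "P", "b1", "S", "R", "b3", "S", "R"}) {
            collector.add(n);
        }
        System.out.println(collector.distinct()); // [b0, P, b1, S, R, b3]
    }
}
```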

The second problem is illustrated by noting that Java has no intrinsic power
operator ("^" in the formula above) so the formula must actually be
expressed as:
"b0 * pow(P, b1) * pow(S * (1 -R), b3) * (S * (1 - R))"

for which Checker returns (since method invocation is forbidden):
Need a parameter "b0"
RuntimeException: Method invocation forbidden ...

when I would actually like it to return the same identifiers (minus perhaps
the duplicates) as the first test above.

More generally, expressions input by users may contain method invocations of
all kinds (built-in and user defined). I'm not sure I know enough about how
Janino works to resolve this second problem.

Regards,
Gary


Arno Unkrig - 19/Jul/07 11:13 PM
Gary Nunes, 2007-07-19:

Arno,

You may have noticed a small error in the Java version of the more
complicated expression that I sent you. It should have read:

"b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))"

(i.e., the Math. prefixes are needed where the pow methods are invoked).

Hope this error didn't cause you any problems.

Regards,
Gary


Arno Unkrig - 19/Jul/07 11:20 PM
Hi Gary,

> The second problem is illustrated by noting that Java has no intrinsic power
> operator ("^" in the formula above) so the formula must actually be
> expressed as:
> "b0 * pow(P, b1) * pow(S * (1 -R), b3) * (S * (1 - R))"
>
> for which Checker returns (since method invocation is forbidden):
> Need a parameter "b0"
> RuntimeException: Method invocation forbidden ...
>
> when I would actually like it to return the same identifiers (minus perhaps
> the duplicates) as the first test above.
>
> More generally, expressions input by users may contain method invocations of
> all kinds (built-in and user defined). I'm not sure I know enough about how
> Janino works to resolve this second problem.

You missed my point...

public void traverseMethodInvocation(MethodInvocation mi) { throw new RuntimeException("Method invocation forbidden"); }

is only an example of what could be done with the Traverser: in this case, forbidding the user from using method invocations. For your purpose, you would remove the traverseMethodInvocation() method and keep only traverseAmbiguousName().

Your problem #1 is inevitable with this concept, and your solution is correct.

So in summary, the functionality is OK for you, right? If so, I will integrate it into the Janino API.

CU

Arno


Gary Nunes - 20/Jul/07 03:30 AM
Arno,

Thanks much for the help you've been providing.

I think I take your point. I guess my point is that I'm somewhat ignorant of the workings of Traverser so I am unsure of how to get it to do precisely what I desire.

For example, I ran my test formula
"b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))"

with the provided Checker code after removing the traverseMethodInvocation method as you suggested.

The resulting ambiguous names were:

b0
Math
P
b1
Math
S
R
b3
S
R

As you can see the Math class qualifier that is part of the pow method invocation calls is now listed as one of the ambiguous names. Not exactly what I want. My preferred output would be (ignoring duplicates):

b0
P
b1
S
R
b3

i.e., all expression variables listed, including those used as method arguments but not those used as part of a method invocation. Alternatively one might provide something similar to the original output above but with some indication of whether the ambiguous name is used as part of a method invocation or not, e.g., return a tuple with the string and a boolean (where true = part of invocation, otherwise false)

b0, false
Math, true
P, false
b1, false
Math, true
S, false
R, false
b3, false
S, false
R, false

or something similar. Admittedly, this behavior might not be preferable for everyone so the Janino API implementation might allow some specification as to the kind of results preferred (e.g., ignore/include invocation identifiers, ignore/include method names, etc.).

Does any of this make sense?

Regards,
Gary


Arno Unkrig - 20/Jul/07 03:47 AM
Hi Gary,

> For example, I ran my test formula
> "b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))"
> ...
> As you can see the Math class qualifier that is part of the pow method invocation calls is now listed as one of the ambiguous names. Not exactly what I want. My preferred output would be (ignoring duplicates): ...

Gotcha! Here we get to the concept of "ambiguous names" in the Java programming language. The exact definition is in

http://java.sun.com/docs/books/jls/third_edition/html/names.html#6.5.2

The short version is: "Math" is ambiguous: It could be a type, a field, a local variable or a method parameter in this context. The ambiguous name cannot be reclassified before the expression is compiled, but in order to compile it, we must first determine its parameters. A chicken-egg problem.

I see two solutions here:

  • You assume that all ambiguous names are parameters (b0, Math, P, b1, S, R, b3) and compile the expression, which should work fine. Then you remove the parameters one by one and compile again to determine which ANs are really parameters.
  • You decide heuristically which ANs should map to parameters: Assuming that you have no fields nor local variables around, then an AN starting with a capital is assumed to be a type, the others parameters. Effectively, this algorithm computes the correct set of parameters for your example.
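
The second option can be sketched in a few lines of plain Java. Note that, read literally, "starts with a capital" would also classify the single-letter names `P`, `S` and `R` as types. The sketch therefore assumes a slightly narrower rule, which is an assumption of this example and not something stated in the thread: a name counts as a type only if it starts with an upper-case letter followed by a lower-case letter (e.g. `Math`, `System`). `CapitalHeuristic` is a hypothetical helper, not part of Janino.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Janino): guesses parameters by naming convention. */
final class CapitalHeuristic {

    /** Assumed rule: upper-case first letter followed by a lower-case letter means "type". */
    static boolean looksLikeType(String name) {
        return name.length() >= 2
                && Character.isUpperCase(name.charAt(0))
                && Character.isLowerCase(name.charAt(1));
    }

    /** Keeps every ambiguous name that does not look like a type name. */
    static List<String> guessParameters(List<String> ambiguousNames) {
        List<String> parameters = new ArrayList<>();
        for (String name : ambiguousNames) {
            if (!looksLikeType(name)) {
                parameters.add(name);
            }
        }
        return parameters;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("b0", "Math", "P", "b1", "S", "R", "b3");
        System.out.println(guessParameters(names)); // [b0, P, b1, S, R, b3]
    }
}
```

Like any convention-based guess, this misfires as soon as a user names a parameter `Rate` or a class `pt`.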

Any other idea?

One more thing: How would you deal with parameter types? In your example, you define that all parameters are int type. Is that always sufficient for you?

CU

Arno


Gary Nunes - 20/Jul/07 01:53 PM
Arno,

> The short version is: "Math" is ambiguous: It could be a type, a field, a local variable or a method parameter in this context. The ambiguous name cannot be reclassified before the expression is compiled, but in order to compile it, we must first determine its parameters. A chicken-egg problem.
>
> I see two solutions here:
>
>   • You assume that all ambiguous names are parameters (b0, Math, P, b1, S, R, b3) and compile the expression, which should work fine. Then you remove the parameters one by one and compile again to determine which ANs are really parameters.

Well, up to now I'd been thinking in terms of text parsing on the raw expression string but it seems that the information that can be obtained from that approach is limited.

Question: if the expression string were compiled would there then be enough additional information available to determine whether an identifier was of a particular kind, e.g., expression name, type name, method name, etc.? In other words, to use my previous test expression:

"b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))"

after compilation could it be determined that

b0, P, b1, S, R and b3 are expression names
Math is a type name
pow is a method name

or something like that?

>   • You decide heuristically which ANs should map to parameters: Assuming that you have no fields nor local variables around, then an AN starting with a capital is assumed to be a type, the others parameters. Effectively, this algorithm computes the correct set of parameters for your example.

I'm trying to use Janino in two contexts:

  • Within my program a user is allowed to enter a Java expression with arbitrary "standalone" identifiers and method calls (from both Java built-in and other available classes). My program would then look at the expression and generate a table of variables to which values could be assigned by the user (see below for issues of variable typing). Here the issue is to not include in this table the identifiers associated with method calls and one possible problem with a heuristic solution is that adherence to naming conventions cannot always be guaranteed.
  • Also (in another part of my program) a user could specify a set of Java statements to be executed (a Java "Script"). In this case however the script is expected to be "self-contained"...there would be no variable table generated (and thus no need to parse the program for identifiers).

> Any other idea?

Well, if compiling won't provide additional information then I suppose disambiguation of ambiguous names could be done by checking whether any of the names are part of a method invocation expression (either using a feature of the Janino parsing system or by using regular expression tests) and eliminating those names.
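
The regular-expression variant of this idea might look as follows. It is a rough sketch under stated assumptions, not a Janino feature: `RegexNameGuesser` is a hypothetical class, and the pattern simply skips any identifier that is preceded by a dot (a member name) or followed by a dot or an opening parenthesis (a qualifier or a method name). It does not understand string literals or Java keywords, and it would also drop a genuine parameter that is used as a qualifier, as in `point.x`.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Janino): guesses standalone names with a regex. */
final class RegexNameGuesser {

    // (?<![\w$.])      not preceded by an identifier character or a dot
    // [A-Za-z_$][\w$]*+ a whole identifier (possessive, so it is never split)
    // (?!\s*[.(])      not followed by a dot or an opening parenthesis
    private static final Pattern STANDALONE_NAME =
            Pattern.compile("(?<![\\w$.])[A-Za-z_$][\\w$]*+(?!\\s*[.(])");

    /** Returns the distinct standalone identifiers in order of first appearance. */
    static Set<String> guess(String expression) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = STANDALONE_NAME.matcher(expression);
        while (matcher.find()) {
            names.add(matcher.group());
        }
        return names;
    }

    public static void main(String[] args) {
        String expr = "b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))";
        System.out.println(guess(expr)); // [b0, P, b1, S, R, b3]
    }
}
```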

> One more thing: How would you deal with parameter types? In your example, you define that all parameters are int type. Is that always sufficient for you?

This feature of my program is "in progress". The initial implementation will probably assume that all variables obtained from the expression (and displayed in the variable table mentioned above) are of double type. A later implementation may include a column in the variable table where a type can be specified, e.g., something like

Name Type Value
b0 double 12.0
P int 4
... ... ...

where both the Type and Value columns are editable.
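
One way such a table row could be turned into what the evaluator needs (a parameter name, a `Class` for the type, and a boxed value) is sketched below. `ParameterRow` is a hypothetical class for illustration, and only the two types shown in the table are handled.

```java
/** Hypothetical row of the variable table: name, declared type, parsed value. */
final class ParameterRow {
    final String name;
    final Class<?> type;
    final Object value;

    private ParameterRow(String name, Class<?> type, Object value) {
        this.name = name;
        this.type = type;
        this.value = value;
    }

    /** Parses the editable Type and Value cells; only int and double are handled here. */
    static ParameterRow parse(String name, String typeName, String valueText) {
        switch (typeName) {
            case "int":
                return new ParameterRow(name, int.class, Integer.valueOf(valueText.trim()));
            case "double":
                return new ParameterRow(name, double.class, Double.valueOf(valueText.trim()));
            default:
                throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
    }

    public static void main(String[] args) {
        ParameterRow b0 = parse("b0", "double", "12.0");
        ParameterRow p = parse("P", "int", "4");
        System.out.println(b0.name + " " + b0.type + " " + b0.value);
        System.out.println(p.name + " " + p.type + " " + p.value);
    }
}
```

The `name`, `type` and `value` fields of all rows would then fill the parameter-name array, the `Class[]` array and the argument array used in the `ExpressionEvaluator` snippet from the original request.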

Sorry for all the verbiage. I hope some of it was useful.

Regards,
Gary


Arno Unkrig - 22/Jul/07 04:34 PM
Gary,

> Question: if the expression string were compiled would there then be enough additional information available to determine whether an identifier was of a particular kind, e.g., expression name, type name, method name, etc.? In other words, to use my previous test expression:
>
> "b0 * Math.pow(P, b1) * Math.pow(S * (1 -R), b3) * (S * (1 - R))"
>
> after compilation could it be determined that
>
> b0, P, b1, S, R and b3 are expression names
> Math is a type name
> pow is a method name
>
> or something like that?

The answer is: No. You have to know the number, names and types of all parameters before you compile. E.g. if you define that "Math" is a parameter, then "Math.pow(P, b1)" is an invocation of method "pow()" of parameter "Math". In other words: The Java programming language is not laid out for your question "which ambiguous names in this piece of code are parameters?".
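
The effect described here can be reproduced in plain Java, without Janino: the same text `Math.pow(P, b1)` means two different things depending on whether a parameter named `Math` is in scope. The class `Calc` and both method names are made up for this illustration.

```java
/** Illustration only: the meaning of "Math" depends on the declared parameters. */
final class ObscuringDemo {

    /** Hypothetical parameter type that happens to have its own pow() method. */
    static final class Calc {
        double pow(double base, double exponent) {
            return -1.0; // deliberately different from java.lang.Math.pow
        }
    }

    /** No parameter named "Math": the name is reclassified as the type java.lang.Math. */
    static double withType(double P, double b1) {
        return Math.pow(P, b1);
    }

    /** A parameter named "Math": the very same expression now calls Calc.pow on it. */
    static double withParameter(Calc Math, double P, double b1) {
        return Math.pow(P, b1);
    }

    public static void main(String[] args) {
        System.out.println(withType(2.0, 3.0));                  // 8.0
        System.out.println(withParameter(new Calc(), 2.0, 3.0)); // -1.0
    }
}
```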

> Well, if compiling won't provide additional information then I suppose disambiguation of ambiguous names could be done by checking whether any of the names are part of a method invocation expression (either using a feature of the Janino parsing system or by using regular expression tests) and eliminating those names.

May be a feasible approach, but also doesn't work perfectly: Your algorithm would e.g. regard "System.out" as a parameter, although you most probably mean "java.lang.System.out".

> This feature of my program is "in progress". The initial implementation will probably assume that all variables obtained from the expression (and displayed in the variable table mentioned above) are of double type. A later implementation may include a column in the variable table where a type can be specified, e.g., something like
> Name Type Value
> b0 double 12.0
> P int 4
> ... ... ...
>
> where both the Type and Value columns are editable.
>
> Sorry for all the verbiage. I hope some of it was useful.
>
> Regards,
> Gary

New idea: If you let the user edit the parameter types, then he's also free to remove the "wrong" parameter "Math" from the list, right? In other words: your program does not tell the user "your expression has exactly these parameters, please fill them in"; instead, your program "guesses" a set of parameters from the expression, and then lets the user remove, retype and rearrange them. Writing expressions is not bulletproof anyway: the user could make syntax errors, or apply operators to the wrong types (e.g. "2.0 << 3.0") etc. So why try to do a perfect job guessing the parameter names?

CU

Arno


Gary Nunes - 22/Jul/07 05:45 PM
Arno,

> New idea: If you offer the user to edit the parameter types, then he's also free to remove the "wrong" parameter "Math" from the list, right? In other words: Not your program tells the user "your expression has exactly these parameters, please fill them in", but your program "guesses" a set of parameters from the expression, and then offers the user to remove, retype and rearrange them. Writing expressions is not bulletproof, anyway - the user could make syntax errors, or execute operators on the wrong types (e.g. "2.0 << 3.0") etc. So why try to do a perfect job guessing the parameter names?

Yes, I suppose that some combination of reasonable heuristics (like trying to remove method invocation names) and human cognition (manual removal of names) will be required.

If so, the Janino API that you provide might have at least two "entry points" (earlier, for the purpose of illustration, I assumed static methods in class ExpressionEvaluator but you may have a better idea):

// returns all ambiguous names
public String[] getAmbiguousNames(String expression); 

// return ambiguous names after filtering with NameChecker interface
public String[] getAmbiguousNames(String expression, NameChecker nc);

where NameChecker is an interface of the form

public interface NameChecker {
    public boolean isValidName(String name);
}

that can be applied to the names as they are generated to filter the name list (of course I can do this myself after I get the whole list but you know us programmers...lazy).
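
Until such an entry point exists, the filtering step itself is easy to do on the caller's side. The sketch below uses the `NameChecker` interface exactly as proposed above; `NameFilter` is a hypothetical helper, and neither type is part of the Janino API.

```java
import java.util.ArrayList;
import java.util.List;

/** The interface proposed above (a proposal in this thread, not a Janino type). */
interface NameChecker {
    boolean isValidName(String name);
}

/** Hypothetical helper: applies a NameChecker to an already collected name list. */
final class NameFilter {

    /** Returns the names accepted by the checker, in their original order. */
    static String[] filter(String[] names, NameChecker checker) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (checker.isValidName(name)) {
                kept.add(name);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] all = {"b0", "Math", "P", "b1", "S", "R", "b3"};
        // Example checker: reject the one name known to be a class here.
        String[] params = filter(all, name -> !name.equals("Math"));
        System.out.println(String.join(", ", params)); // b0, P, b1, S, R, b3
    }
}
```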

Any other thoughts?

Regards,
Gary


Arno Unkrig - 30/Jul/07 04:22 PM
Hi Gary,

I'd like to keep it more generic, say:

SimpleCompiler.setVerifier(Traverser traverser)
ClassBodyEvaluator.setVerifier(Traverser traverser)
ScriptEvaluator.setVerifiers(Traverser[] traversers)
ExpressionEvaluator.setVerifiers(Traverser[] traversers)

That way one can do arbitrary checks on the generated syntax trees, e.g. extract ambiguous names, and even update the parameter names and types. E.g.

ExpressionEvaluator ee = new ExpressionEvaluator(...);
ee.setVerifier(new Traverser() {
    public void traverseAmbiguousName(Java.AmbiguousName an) {
        System.out.println("Ambiguous name detected: " + an);
        super.traverseAmbiguousName(an);
    }
});
ee.cook("a + b + Math.pow(c, d)");

Would that do?

CU

Arno


Gary Nunes - 30/Jul/07 07:31 PM
Arno,

Sounds good, especially if accompanied by some documentation/examples to highlight its usefulness (I suspect some users of Janino [read: me] may not be familiar enough with the Traverser and associated classes).

This will certainly go a ways in allowing me to use Janino as I've described in previous comments.

Thanks for all the help.

Regards,
Gary


Arno Unkrig - 20/Dec/07 04:49 PM
Hi Gary,

it was a long road, but now I have decided on the "proper" API: Please check out "EE.guessParameterNames()" and "SE.guessParameterNames()" and let me know about your thoughts.

CU

Arno