dmiller edited this page Feb 6, 2011 · 12 revisions

CLR interop is essentially the same as JVM interop. However, we have to make a number of extensions to allow for parts of the CLR object model that are not in the JVM.

Basic interop

All the basic functions for interop mentioned on http://clojure.org/java_interop work as advertised. This includes all of

  • Member access
    • (.instanceMember instance args*)
    • (.instanceMember Classname args*)
    • (Classname/staticMethod args*)
    • Classname/staticField
  • Dot special form
    • (. instance-expr member-symbol)
    • (. Classname-symbol member-symbol)
    • (. instance-expr (method-symbol args*)) or
    • (. instance-expr method-symbol args*)
    • (. Classname-symbol (method-symbol args*)) or
    • (. Classname-symbol method-symbol args*)
  • Instantiation
    • (Classname. args*)
    • (new Classname args*)
  • Assignment
    • (set! (. instance-expr instanceFieldName-symbol) expr)
    • (set! (. Classname-symbol staticFieldName-symbol) expr)
  • Miscellaneous
    • (.. instance-expr member+)
    • (.. Classname-symbol member+)
    • (doto instance-expr (instanceMethodName-symbol args*)*)
    • (instance? Class expr)
    • (memfn method-name arg-names*)
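
As a quick illustration, the forms listed above can be tried against types from the .NET base class library. This is only a sketch: the choice of System.String, System.Math, System.Int32 and System.Text.StringBuilder is ours and does not come from the original page.

```clojure
;; Illustrative only: forms from the list above, applied to BCL types.

;; Instance method call: String.ToUpper()
(.ToUpper "hello")

;; Static method call: Math.Sqrt(double)
(System.Math/Sqrt 16.0)

;; Static field access: Int32.MaxValue
System.Int32/MaxValue

;; Instantiation, in both spellings
(def sb1 (System.Text.StringBuilder.))
(def sb2 (new System.Text.StringBuilder "abc"))

;; doto applies each method call to the instance and returns the instance
(def sb3 (doto (System.Text.StringBuilder.)
           (.Append "a")
           (.Append "b")))
(.ToString sb3)

;; instance? with a CLR type
(instance? System.String "hello")
```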

Most of the array interop functions work.

See below for how to work with generic methods.

The things not implemented yet:

  • bean
  • parse

Reflection

Thanks to the DLR (Dynamic Language Runtime), reflection is no longer as much of a performance hit. Any interop call that cannot be resolved at compile-time gets compiled into a call-site that caches methods discovered at run-time. Thus, if we have an interop call (.m2 c x), and it is called with c's value being an instance of class C1 and x's value being a String, the first call will identify the appropriate method and cache this information. A subsequent call with a C2 and a DateTime will identify and cache a different method. On the next call, the argument values are checked against C1/String and C2/DateTime. If either test succeeds, the corresponding cached method is called, avoiding reflection. If neither matches, reflection is used.
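
The caching idea can be modelled in a few lines of plain Clojure. This is a toy model under our own assumptions, not the DLR implementation: make-call-site and slow-lookup are hypothetical names, and a map keyed by argument types stands in for the chain of type tests a real call site performs.

```clojure
;; Toy model of a caching call site. NOT the DLR implementation.
;; slow-lookup is a hypothetical function standing in for reflection:
;; it takes a vector of argument types and returns the function to invoke.
(defn make-call-site [slow-lookup]
  (let [cache (atom {})]
    (fn [& args]
      (let [k (vec (map type args))            ; cache key: the runtime argument types
            f (or (get @cache k)               ; hit: reuse the resolved target
                  (let [found (slow-lookup k)] ; miss: do the expensive lookup once
                    (swap! cache assoc k found)
                    found))]
        (apply f args)))))

;; Count how often the expensive lookup runs.
(def lookups (atom 0))

(def site
  (make-call-site
    (fn [types]
      (swap! lookups inc)
      (fn [& xs] (count xs)))))

(site 1 "a")   ; first call with these argument types: lookup runs
(site 2 "b")   ; same argument types: served from the cache
(site "a" 1)   ; different argument types: lookup runs again
@lookups       ; number of lookups performed so far
```

The design point is that the cost of method resolution is paid once per distinct combination of argument types rather than once per call.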

ByRef parameters

Support has been added for ByRef parameters. There is no notion of an uninitialized variable in Clojure, so the distinction seen in C# between ref and out parameters is irrelevant. The special syntactic form by-ref wraps local variables in interop calls to indicate a parameter that is to be passed by reference. Upon completion of the call, the local variable will be rebound to the value ‘returned’ from the interop call.

For example, suppose you had the following class defined in C#:

namespace dm.interop
{
    public class C1
    {
        public int m3(int x) { return x; }
        public int m3(ref int x) { x = x + 1; return x + 20; }
        public string m5(string x, ref int y) { y = y + 10; return x + y.ToString(); }
        public int m5(int x, ref int y) { y = y + 100; return x + y; }
    }
}

The following Clojure function, passed an instance of dm.interop.C1 and an Int32, will call the overload of m3 with the by-ref argument.

(defn f3r [c n]
  (let [m (int n)]
     (.m3 c (by-ref m))
     m))  

Note that it is necessary to provide the type hint via (int n). Otherwise, the only argument to match would be a ref Object. This example will use reflection and will match on a first argument of any type that has a method with signature m3(ref int). To avoid the reflection, you could type-hint the variable c.

To get the other overload, the interop call (.m3 c m) will do the trick.

For method m5, consider

(defn f5 [c x y]
  (let [m (int y)
        v (.m5 c x (by-ref m))]
    [v m]))

Then (f5 c1 "help" 12) => ["help22" 22] and (f5 c1 15 20) => [135 120]. In other words, because the m5 call is not resolved at compile-time, reflection will pick the correct overload at run-time.

The same mechanism works for new expressions.

The syntactic form by-ref can be used at the top level of interop calls, as shown. (by-ref can also be used in definterface, deftype and similar mechanisms. See Defining types.) It can only wrap a local variable. Wrapping anything other than a local variable, or using by-ref anywhere other than the top level of an interop call, causes an exception to be thrown.
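
For a by-ref example that does not depend on a custom assembly, the same pattern can be applied to a base class library method with an out parameter. The sketch below assumes System.Int32.TryParse(String, out Int32); the function name parse-int is ours, and ^String is the current spelling of the #^ tag syntax used elsewhere on this page.

```clojure
;; Sketch: calling Int32.TryParse(String, out Int32) through by-ref.
;; parse-int is a hypothetical helper name, not part of ClojureCLR.
(defn parse-int
  "Returns the parsed Int32, or nil when s is not a valid integer."
  [^String s]
  (let [n (int 0)]                               ; typed local that receives the out value
    (when (System.Int32/TryParse s (by-ref n))   ; by-ref sits at the top level of the call
      n)))                                       ; n was rebound by the call

(parse-int "42")
(parse-int "not a number")
```

Because Clojure has no uninitialized locals, the local is given a dummy initial value of the right type before the call, exactly as with a ref parameter.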

params arguments

Consider the following class:

namespace dm.interop
{
    public class C6
    {
        public static int sm1(int x, params object[] ys)
        {
            return x + ys.Length;
        }
        public static int sm1(int x, params string[] ys)
        {
            int count = x;
            foreach (string y in ys)
                count += y.Length;
            return count;
        }
        public static int m2(ref int x, params object[] ys)
        {
            x += ys.Length;
            return ys.Length;
        }
    }
}

Consider calling sm1. There are overloads with params args of type object[] and of type string[]. The first overload can be invoked by

 (dm.interop.C6/sm1 12 #^objects (into-array Object [1 2 3] ))

or

 (dm.interop.C6/sm1 12 #^"System.Object[]" (into-array Object  [1 2 3]))

The second overload can be invoked by

(dm.interop.C6/sm1 12 #^"System.String[]" (into-array String ["abc" "de" "f"]))

or

(dm.interop.C6/sm1 12 #^"System.String[]" (into-array ["abc" "de" "f"]))

Note that when a type name is given as a string in a type tag, it must be namespace-qualified.

For the combination of by-ref and params as given by m2:

(defn c6m2 [x] 
  (let [n (int x)
        v (dm.interop.C6/m2 (by-ref n) #^objects (into-array Object [1 2 3 4]))]
    [n v]))

gen-delegate

The proxy function is especially useful in JVM-land to create listeners and the like. In CLR-land, we need delegates. We have added a gen-delegate macro to assist in creating delegates. Here is a sample use:

(.add_Click button
  (gen-delegate EventHandler [sender args]
    (let [c (Double/Parse (.Text tb))]
      (.set_Text f-label (str (+ 32 (* 1.8 c)) " Fahrenheit")))))
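
A delegate created this way can be handed to any CLR API that expects one, not only event handlers. The following sketch, which assumes the standard System.Threading.Thread and System.Threading.ThreadStart types, runs a Clojure body on a new thread. The names result and worker are ours.

```clojure
;; Sketch: a zero-argument delegate passed to the Thread constructor.
(def result (atom nil))

(def worker
  (System.Threading.Thread.
    (gen-delegate System.Threading.ThreadStart []
      (reset! result :done))))   ; body runs on the new thread

(doto worker
  (.Start)    ; start the thread
  (.Join))    ; wait for it to finish

@result
```

The argument vector given to gen-delegate has to match the parameter list of the delegate type, which is empty for ThreadStart.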

Type references

Clojure uses symbols to name types in two ways:

  • a package-qualified symbol (one containing periods internally) is taken to name the Java class with the same character sequence
  • a namespace may contain a mapping from a symbol to a Java class, via import.

Resolving a symbol is the process of determining the value of a symbol during evaluation.

Identifying types with symbol names works reasonably well for Java because package-qualified class names are lexically compatible with symbols.

Not so for the CLR. CLR typenames can contain arbitrary characters. Backslashes can escape characters that do have special meaning in the typename syntax (comma, plus, ampersand, asterisk, left and right square bracket, left and right angle bracket, backslash). Fully-qualified type names can contain an assembly identifier, which involves spaces and commas. Thus, fully-qualified type names cannot be represented as symbols.

ClojureCLR extends the reader syntax for symbols to allow vertical bar escaping.

Vertical bars are used in pairs to surround the name (or part of the name) of a symbol that has many special characters in it. It is roughly equivalent to putting a backslash in front of every character so surrounded. For example, |A(B)|, A|(|B|)|, and A|(B)| all mean the symbol whose name consists of the four characters A, (, B, and ).

|anything except a vertical bar...<>@#$@#$#$|

To include a vertical bar in a symbol name that is |-escaped, use a doubled vertical bar.

|This has a vertical bar in the name ...  || ...<>@#$@#$#$|

Note: |-escaping defined in ClojureCLR differs from the similar mechanism in CommonLisp in one significant way:

  • CommonLisp allows a literal vertical bar in a symbol name with backslash-escaping: abc\|123 has name “abc|123”. ClojureCLR uses a doubled vertical bar, and only within an escaped symbol name. We would write |abc||123| for the symbol with name “abc|123”.

There is a special interaction of |-escaping with / used to separate namespace from name. Any / appearing |-escaped does not count as a namespace/name separator. Thus,

(namespace 'ab|cd/ef|gh) => nil
(name 'ab|cd/ef|gh)  => "abcd/efgh"

(namespace 'ab/cd|ef/gh|ij) => "ab"
(name 'ab/cd|ef/gh|ij) => "cdef/ghij"

With this mechanism we can make a symbol referring to a type such as:

(|com.myco.mytype+nested, MyAssembly, Version=1.3.0.0, Culture=neutral, PublicKeyToken=b14a123334343434|/DoSomething x y)

(reify
  |AnInterface`2[Int32,String]| 
  (m1 [x] ...)
  I2
  (m2 [x] ...))

Special note should be made of the proper way to refer to generic types (instantiated or not).

|System.Collections.Generic.IList`1[System.Int32]|

This is the official CLR way of referring to the type that would be referred to in C# (with a using directive) as IList<int>. We do not implement C# or Visual Basic lexical conventions for type names. The existence of characters such as the backquote and square brackets forces some type of escaping mechanism.
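
As a sketch of how such a name might be used in practice, the following instantiates a closed generic type through a |-escaped symbol. It assumes that a |-escaped type name can appear wherever a class name symbol is accepted, here in the new form. The choice of List`1 and the name xs are ours.

```clojure
;; Sketch: instantiating a closed generic type via a |-escaped symbol.
(def xs (new |System.Collections.Generic.List`1[System.Int32]|))

(.Add xs (int 1))   ; (int ...) so the argument is an Int32, matching Add(Int32)
(.Add xs (int 2))

(.Count xs)         ; number of elements added so far
```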

I have not yet implemented print-readably for symbols with bad characters. (Nor has ClojureJVM.) This is straightforward and should be done.

Invoking generic methods

!!! Alpha. Experimental. Subject to change. !!!

We cannot yet do full type inferencing a la C#, as in:

Enumerable.Range(1,10).Where( x => x % 2 == 0)

However, if you are willing to explicitly declare types, you can do the following:

(import 'System.Linq.Enumerable)
(def r1 (Enumerable/Range 1 10))     ; Not generic
(seq r1)                             ; => (1 2 3 4 5 6 7 8 9 10)
(def r2 (Enumerable/Repeat (type-args Int32) 2 5))
                                     ; use type-args to supply the type parameters to the generic method
(seq r2)                             ; => (2 2 2 2 2)
(def r3 (Enumerable/Where r1 (sys-func [Int32 Boolean] [x] (even? x))))
                                     ; the sys-func call generates a delegate of type System.Func`2[int,bool]
                                     ; Note that type-args is NOT needed for the Where -- they are inferred
(seq r3)                             ; => (2 4 6 8 10)
(def r4 (Enumerable/Where [1 2 3 4 5] (sys-func [Object Boolean] [x] (even? x))))
                                     ; Clojure types implement interfaces such as IEnumerable`1[Object]
                                     ; so that type inferencing works here
(seq r4)                             ; => (2 4)

Not yet implemented: multi-dimensional arrays

The CLR supports true multi-dimensional arrays in addition to the jagged (ragged) arrays that the JVM supports. Some extensions need to be defined.

See Completing CLR interop for some thoughts on the items below.

Assemblies

The JVM works with class files found in directories and JAR files on the classpath. We have substituted the environment variable clojure.load.path as a mechanism for providing the set of directories to probe when looking for CLJ scripts and compiled-from-CLJ assemblies to load.