ScientificJ a new language on top of Java for scientific programming.


In what follows I have drawn heavily on Fortress, and to a lesser extent on Nice and Groovy. It seeks to address issues raised by the Java Numerics community for high performance Java.

This document is a first attempt at a programming language built on top of the Java class libraries and JVM which is aimed at scientific and technical programming. It is meant to be used in situations where the scientific programming portion is a relatively small portion of the whole project, so the scientific parts may be written in a language much better suited to them, with only a minor impedance mismatch between ScientificJ and Java. It should be possible to choose between implementation languages of ScientificJ and Java on a class by class basis. Java itself has several disadvantages for scientific programming as has been pointed out several times before, these are mainly:
One of the main benefits of Java is that it is not a big language, it is small enough to learn easily as the interaction between portions of the language are below the level at which the combinatorial explosion really explodes. So this language should only add features if a similar number of features are removed from Java.


Casting is not needed as:
Java ScientificJ
final int y = 4;
final byte x = (byte)y;
Integer y = 4;
Byte x = new Byte(y);
List aList = new ArrayList();

if (aList instance of ArrayList) {
   ArrayList thisList = (ArrayList)thisList;
   // use thisList
}
List aList := new ArrayList();

if (aList instance of ArrayList) {
   // use aList which is known to be of type ArrayList
}


With Autoboxing/Unboxing added in Java J2SE 5.0 it is not necessary to use Integer, Float, Double, etc, within ScientificJ code, when required the int, float and double values will be boxed and unboxed when calling a Java language library.

so we have
Java ScientificJ
final int y = 4; Integer y = new Integer(4);
Integer y = 4; // short-hand
y = 4; // type inference
final double x = 3.142 Double x = new Double(3.142);
Double x = 3.142; // short-hand
x = 3.142; // type inference
final Complex c = new Complex(0.0, 1.0); Complex c = new Complex(0.0, 1.0);
Complex c = new Imaginary(1.0); // use imaginary type that does not commit to sign of real part.
Complex c = i; // use imaginary constant
c = i; // type inference

Threads are not used directly in ScientificJ, instead use is made of the java.lang.concurrent which includes higher level concurrency features than threads and synchronized blocks.

Java ScientificJ
final FutureTask<Double> futureA =
new FutureTask<Double>(new Callable<Double>() {
public Double call() {
return A.norm();
}});
executor.execute(futureA);
final FutureTask<Double> futureB =
new FutureTask<Double>(new Callable<Double>() {
public Double call() {
return A.norm();
}});
executor.execute(futureB);
final double x = futureA.get().value() + futureB.get().value();
future Double a = || A ||;
future Double b = || B ||;
Double x = a + b;


ScientificJ also allows parallel for loops rather like Fortress, but unlike Fortress loops are not parallel by default.

parallel for (i <- 1:100) {
V[i] = ...     // long computation which is not dependent on previous iterations of the loop
}

The ScientificJ compiler is free to execute these iterations in parallel or serial on any number of CPUs, using a set of heuristics and policies which are part of the programing and execution environments. One method is to use a  CompletionService from the java.util.concurrent pakage.

input and use of non-ascii characters


To be Java compatible the convention of using the prefix _$ is used. Greek letters are thus:
_$alpha _$beta _$gamma _$delta _$epsilon _$zeta _$eta _$theta _$iota _$kappa _$lambda _$mu _$nu _$xi _$omicron _$pi _$rho _$sigma _$tau _$upsilon _$phi _$chi _$psi _$omega

α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ τ υ φ χ ψ ω

and similarly for the upper case Greek letters _$ALPHA, etc. Various mathematical symbols will also be available (TBD) which will be available as method names.

[Bold brackets, braces and parentheses may be obtained by the values _$openBracket, _$closeBracket, _$openBrace, _$closeBrace, _$openParen, _$closeParen. that give:

[ ]  { } ( )

I'm not sure how they might be used yet, perhaps to enclose initializers?

Matrix A = [ [ 1.0, 0.0, 0.0  ] [ 0.0, 1.0, 0.0 ] [ 0.0, 0.0, 1.0 ] ];  // 3x3 sqaure 2D double matrix.
]

Square Arrays

In Java arrays of 2 or more dimensions are implemented as arrays of arrays. This is very flexible but leads to loss of performance, in ScientificJ we have types:
Each of which has multiple implementation classes. The compiler chooses the concrete class to use by examination of the program or by use of  hints from the programmer.

Matrix<Double> A = [ [ 1.0, 0.0, 0.0  ] [ 0.0, 1.0, 0.0 ] [ 0.0, 0.0, 1.0 ] ];     // 3x3 sqaure 2D matrix.

@sparse Matrix<Double> B;      // sparse NxM rectangular 2D matrix (2D by default, may not be square).

The index sizes may often be deduced from the program text, but where required the index sizes may be added

C = new Matrix<Double>[10,10];

A full set of matrix operators will be included, including transpose, dot and cross vector products, multiply, 

All the basic matrix classes will be implemented in Java, so Matrix<Double> will be mapped onto the MatrixDouble Java class and Vector<Float> will be mapped onto the VectorFloat Java class. The cross product operator in ScientificJ will be mapped onto the method _$by in the VectorFloat Java class. This allows full interoperability between the ScientificJ and Java languages.

Java ScientificJ
final MatrixDouble C = Matrix.doubleImpl(10,10); C = new Matrix<Double>[10,10];
C = new Matrix[10,10]; // double assumed
VectorDouble A = Vector.doubleImpl(10);
A := new Vector[10];
final VectorDouble P = C._$dot(A);
final VectorDouble CP = A._$by(P);
P = C.A;
CP = AxP;
final double x = A.get(5);
A.put(4,2.0);
Double x = A[5];
A[4] = 2.0;
final MatrixDouble S = Matrix.doubleImpl("sparse",1000,1000);
S.put(100,100,1.0);
double d = S.get(100,100);
@sparse Matrix S := new Matrix[1000,1000];
S[100,100] = 1.0;
d = S[100,100];


Assignment

Assignment is performed by two different operators, depending on whether the assignment is permanent or whether it may be changed.
Java ScientificJ
final double x = 1.0;
x = 2.0; //error
Double x = 1.0;
x := 2.0; //error
int y = 5;
y = 6; // OK
Integer y := 5;
y := 6; // OK
y = 7; // error

 For reference classes permanent assignment may not include assigning the value null;


Better syntax for inner classes

If the inner class needs to only implement one method and if all types used may be deduced then a less verbose syntax may be used.

Java ScientificJ
interface Predicate<T>
boolean evaluate(T object);
}
public List<T> sift(List<T> original, Predicate<T> p)
List<T> newList = new ArrayList<T>();
for (T element : original) {
if (p.evaluate(element)) { newList.add(element); }
}
return newList;
}

List<Integer> li = new ArrayList<Integer>();
List<Integer> nli = sift(li, new Predicate<Integer>()
{
boolean evaluate(Integer object) {
return object.intValue() < 10;
}
};
List<Integer> nli = sift(li, (Integer i |  return i <10; ) ) ;

Perhaps we can use the same syntax as for Groovy closures ( i  ->  i <10; )


Ranges


A range is a pair of integer values with the first lower or equal to the second.

Syntax:   x : y (alternatively use the Groovy syntax x .. y)

Generators

A generator produces a sequence of values, or in the case of parallel loops a set of  values (the same ones but in no particular order):
 
i <- { 1, 3, 5, 7, 9 };
i <- sift( 0:10, (Integer i | i mod 2 = 1) );
i <- (new Generator(1:10) ).next( (Integer i | i mod 2 = 1) );

generators may be used withing for loops as an alternative to the usual Java syntax.

parallel for ( i <- 0:9, j <- 0:9) {
M[i,j] = ... // some computation that takes a long time.
}

Array Slicing

Matrix A = new Matrix[10,10];

Vector V = A[1][5:7];  // V has 3 elements, no need for a copy both have final assignment.
Vector V = A[1,5:7];  // alternative indexing style

Vector V = A[1,(new Generator(1:10) ).next( (Integer i | i mod 2 = 1) )];  // A[1,1],A[1,3],A[1,5],A[1,7],A[1,9]


Tuples and multiple return values

parentheses are used for tuples where appropriate.

(Integer, Double) A;      // declare a tuple of an integer and a float.
(Matrix<Double>, Vector<Double>) foo();     // a method returning a tuple.
(Integer, Vector<Double>) A = (2, [10.0, 100.0]);     // tuple
(Integer, Integer) foo() {
...
return (1,2);      // return of a tuple
}


Dimensions and Units

The idea of dimensions and units is to allow dimensional checking of formulas and ensuring the units used are the same (or converting between them).

interface Length extends Dimension<Double> {}
interface Mass extends Dimension<Double> {}
interface Time extends Dimension<Double> {}
interface Meter extends Unit<Length> {}
interface Kilogram extends Unit<Mass> {}
interface Second extends Unit<Time> {}

Kilogram m = 3.0;
Meter x = 5.0;
Second t = 4.0;

take the expression
m * x / t
it has Dimension Mass.Length/Time and units of Meter.Kilogram/Second, but it must have the same Dimension as
x * m / t
which are Dimension Length.Mass/Time and units of Kilogram.Meter/Second. So we may say that

Meter.Kilogram/Second mks = x * m / t;

where the units and dimensions match.

But what happens with units that do not match

interface Gram extends Unit<Mass> {
Kilogram  conversionFactor = 1000.0;
 }

Gram g = 3000.0;
Meter.Kilogram/Second mks2 = x * g / t;     // mks == mks2

An automatic conversion takes place using the conversion factor defined in the Pound interface.

Type aliases are used to cut down the verbage in the language.
type M = Matrix<Double>;
type Area = Length^2;       // alternatively type Area = Length*Length;

All these types get erased during the conversion into Java bytecode so the equivalent Java would only use double variables

Java ScientificJ
double m = 3.0;
double x = 5.0;
double t = 4.0;
double mks = x * m / t
Kilogram m = 3.0;
Meter x = 5.0;
Second t = 4.0;
Meter.Kilogram/Second mks = x * m / t;
double g = 3000.0;
double mks2 = x * (g/Gram.conversionFactor) / t;
Gram g = 3000.0;
Meter.Kilogram/Second mks2 = x * g / t;

An example


(Vector<Float>, Float) conjGrad(Matrix<Float> A, Vector<Float> x)
{
cgit_max = 25;
Vector<Float> z = 0;
Vector<Float> r = x;
Vector<Float> p = r;
Float _$rho = r^T * r;
for ( j <- 1:cgit_max) {
q = A * p;
_$alpha = _$rho / p^T * q;
z := z + _$alpha * p;
r := r - _$alpha * q;
_$rho_0 = _$rho;
_$rho := r^T r;
_$beta = _$rho / _$rho_0;
p := r + _$beta * p;
}
return (z, || x - A * z||)
}

As displayed on a compatible IDE

(Vector<Float>, Float) conjGrad(Matrix<Float> A, Vector<Float> x)
{
cgitmax = 25;
Vector<Float> z = 0;
Vector<Float> r = x;
Vector<Float> p = r;
Float ρ = rT * r;
for ( j <- 1:cgitmax) {
q = A p;
α = ρ / pT q;
z := z + α p;
r := r - α q;
ρ0 = ρ ;
ρ := rT r;
β = ρ / ρ 0;
p := r + β p;
}
return (z, || x - A * z||)
}