ScientificJ
a new language on top of Java for scientific programming.
In what follows I have drawn heavily on Fortress,
and to a lesser extent on Nice
and Groovy. It seeks to
address issues raised by the Java Numerics community
for high performance Java.
This document is a first attempt at a programming language built on top
of the Java class libraries and JVM which is aimed at scientific and
technical programming. It is meant to be used in situations where the
scientific programming portion is a relatively small portion of the
whole project, so the scientific parts may be written in a language
much better suited to them, with only a minor impedance mismatch
between ScientificJ and Java. It should be possible to choose between
implementation languages of ScientificJ and Java on a class by class
basis. Java itself has several disadvantages for
scientific programming as has been pointed out several times before,
these are mainly:
- No square arrays.
- No value objects other than
those built in, in particular no complex
type, no interval type and no range type or the ability to add new
value object types.
- Clunky syntax for
operations, e.g. A.times(B.transpose()) instead of
the more natural A
Bt
or 2 * PI * r instead of 2 π r.
- No operator overloading
- BigDecimal and BigInteger
not integrated into the language
- Not easy to declare map, set
and matrix data.
- Not possible to declare
methods to be pure functions (i.e. without side effects and depending
only on input parameters).
- Clunky syntax for anonymous
classes.
- Hard to create programs that use multiple processors efficiently
(especially important now dual core processors are becoming common).
One of the main benefits of Java is that it is not a big language, it
is small enough to learn easily as the interaction between portions of
the language are below the level at which the combinatorial explosion
really explodes. So this language should only add features if a similar
number of features are removed from Java.
- Casting
- Object, primitive distinction
- Direct access to threads,
synchronization, etc.
Casting is not needed as:
| Java |
ScientificJ |
final int y = 4;
final byte x = (byte)y; |
Integer y = 4;
Byte x = new Byte(y); |
List aList = new
ArrayList();
if (aList instance of ArrayList) {
ArrayList thisList = (ArrayList)thisList;
// use thisList
} |
List aList := new
ArrayList();
if (aList instance of ArrayList) {
// use aList which is
known to be of type ArrayList
} |
With Autoboxing/Unboxing added in Java J2SE 5.0 it is not necessary to
use Integer, Float, Double, etc, within ScientificJ code, when required
the int, float and double values will be boxed and unboxed when calling
a Java language library.
so we have
| Java |
ScientificJ |
| final int y = 4; |
Integer y = new
Integer(4);
Integer y = 4; // short-hand
y = 4; // type inference |
| final double x = 3.142 |
Double x = new
Double(3.142);
Double x = 3.142; // short-hand
x = 3.142; // type inference |
| final Complex c = new
Complex(0.0, 1.0); |
Complex c = new
Complex(0.0, 1.0);
Complex c = new Imaginary(1.0); //
use imaginary type that does not
commit to sign of real part.
Complex c = i;
// use imaginary constant
c = i;
// type inference |
Threads are not used directly in ScientificJ, instead use is made of
the java.lang.concurrent which includes higher level concurrency
features than threads and synchronized blocks.
- Atomic Boolean, Integer,
Long, and
arrays of Int and Long
- Futures
- (possibly others locks,
conditions, etc.)
| Java |
ScientificJ |
|
|
final FutureTask<Double> futureA = new FutureTask<Double>(new Callable<Double>() { public Double call() { return A.norm(); }}); executor.execute(futureA); final FutureTask<Double> futureB = new FutureTask<Double>(new Callable<Double>() { public Double call() { return A.norm(); }}); executor.execute(futureB); final double x = futureA.get().value() + futureB.get().value();
|
future Double a = || A
||;
future Double b = || B
||;
Double x = a + b; |
ScientificJ also allows parallel for loops rather like Fortress, but
unlike Fortress loops are not parallel by default.
parallel for (i <- 1:100) {
V[i] = ... // long computation which is not dependent
on previous iterations of the loop
}
The ScientificJ compiler is free to execute these iterations in
parallel or serial on any number of CPUs, using a set of heuristics and
policies which are part of the programing and execution environments.
One method is to use a CompletionService
from the java.util.concurrent pakage.
input and use of
non-ascii characters
To be Java compatible the convention of using the prefix _$ is used.
Greek letters are thus:
_$alpha _$beta _$gamma _$delta _$epsilon _$zeta _$eta _$theta _$iota
_$kappa _$lambda _$mu _$nu _$xi _$omicron _$pi _$rho _$sigma _$tau
_$upsilon _$phi _$chi _$psi _$omega
α β γ δ ε ζ
η θ ι κ λ
μ ν ξ ο π ρ σ
τ υ φ χ ψ ω
and similarly for the upper case Greek letters _$ALPHA,
etc. Various mathematical symbols will also be available (TBD)
which will be available as method names.
[Bold brackets, braces and
parentheses may be obtained by the values _$openBracket,
_$closeBracket, _$openBrace, _$closeBrace, _$openParen, _$closeParen.
that give:
[ ] { } ( )
I'm not sure how they might be
used yet, perhaps to enclose initializers?
Matrix A = [ [ 1.0, 0.0, 0.0 ] [ 0.0,
1.0, 0.0 ] [ 0.0, 0.0, 1.0 ] ]; // 3x3 sqaure 2D double
matrix.
]
Square Arrays
In Java arrays of 2 or more dimensions are implemented as arrays of
arrays. This is very flexible but leads to loss of performance, in
ScientificJ we have types:
Each of which has multiple implementation classes. The compiler chooses
the concrete class to use by examination of the program or by use of
hints from the programmer.
Matrix<Double> A = [ [ 1.0, 0.0,
0.0 ] [ 0.0,
1.0, 0.0 ] [ 0.0, 0.0, 1.0 ] ];
// 3x3 sqaure 2D matrix.
@sparse Matrix<Double> B; // sparse NxM rectangular 2D matrix (2D by
default, may not be square).
The index sizes may often be deduced from the program text, but where
required the index sizes may be added
C = new
Matrix<Double>[10,10];
A full set of matrix operators will be included, including transpose,
dot and cross vector products, multiply,
All the basic matrix classes will be implemented in Java, so
Matrix<Double> will be mapped onto the MatrixDouble Java
class and Vector<Float> will be mapped onto the
VectorFloat Java class. The cross product operator in ScientificJ will
be mapped onto the method _$by in the VectorFloat Java class.
This allows full interoperability between the ScientificJ and Java
languages.
| Java |
ScientificJ |
| final MatrixDouble C = Matrix.doubleImpl(10,10); |
C = new
Matrix<Double>[10,10];
C = new Matrix[10,10]; // double assumed |
VectorDouble A = Vector.doubleImpl(10);
|
A := new Vector[10]; |
final VectorDouble P =
C._$dot(A);
final VectorDouble CP = A._$by(P); |
P = C.A;
CP = AxP;
|
final double x = A.get(5);
A.put(4,2.0); |
Double x = A[5];
A[4] = 2.0; |
final MatrixDouble S = Matrix.doubleImpl("sparse",1000,1000);
S.put(100,100,1.0);
double d = S.get(100,100); |
@sparse Matrix S := new Matrix[1000,1000];
S[100,100] = 1.0;
d = S[100,100]; |
Assignment
Assignment is performed by two different operators, depending on
whether the assignment is permanent or whether it may be changed.
| Java |
ScientificJ |
final double x = 1.0;
x = 2.0; //error |
Double x = 1.0;
x := 2.0; //error |
int y = 5;
y = 6; // OK |
Integer y := 5;
y := 6; // OK
y = 7; // error |
For reference classes permanent assignment may not include
assigning the value null;
Better syntax for inner classes
If the inner class needs to only implement one method and if all types
used may be deduced then a less verbose syntax may be used.
| Java |
ScientificJ |
interface Predicate<T>
boolean evaluate(T object);
}
public List<T> sift(List<T> original,
Predicate<T> p)
List<T> newList = new
ArrayList<T>();
for (T element : original) {
if (p.evaluate(element)) {
newList.add(element); }
}
return newList;
}
List<Integer> li = new ArrayList<Integer>();
|
|
List<Integer> nli = sift(li, new
Predicate<Integer>()
{
boolean evaluate(Integer object) {
return object.intValue() < 10;
}
};
|
List<Integer> nli = sift(li, (Integer i | return
i <10; ) ) ;
|
Perhaps we can use the same syntax
as for Groovy closures ( i -> i <10; )
Ranges
A range is a pair of integer values with the first lower or equal to
the second.
Syntax: x : y (alternatively use the Groovy syntax x .. y)
Generators
A generator produces a sequence of values, or in the case of parallel
loops a set of values (the same ones but in no particular order):
i <- { 1, 3, 5, 7, 9 };
i <- sift( 0:10, (Integer i | i mod 2 = 1) );
i <- (new Generator(1:10) ).next( (Integer i | i mod 2 = 1) );
generators may be used withing for loops as an alternative to the usual
Java syntax.
parallel for ( i <- 0:9, j <-
0:9) {
M[i,j] = ... // some computation that takes a long time.
}
Array Slicing
Matrix A = new Matrix[10,10];
Vector V = A[1][5:7]; // V has
3 elements, no need for a copy both have final assignment.
Vector V = A[1,5:7]; //
alternative indexing style
Vector V = A[1,(new Generator(1:10) ).next( (Integer i | i mod 2 = 1)
)]; //
A[1,1],A[1,3],A[1,5],A[1,7],A[1,9]
Tuples and multiple return
values
parentheses are used for tuples where appropriate.
(Integer, Double) A;
// declare a tuple of
an integer and a float.
(Matrix<Double>, Vector<Double>) foo();
//
a method returning a tuple.
(Integer, Vector<Double>) A = (2, [10.0, 100.0]);
//
tuple
(Integer, Integer) foo() {
...
return (1,2); // return of a tuple
}
Dimensions and Units
The idea of dimensions and units is to allow dimensional checking of
formulas and ensuring the units used are the same (or converting
between them).
interface
Length extends Dimension<Double> {}
interface Mass extends Dimension<Double> {}
interface Time extends Dimension<Double> {}
interface Meter extends Unit<Length> {}
interface Kilogram extends Unit<Mass> {}
interface Second extends Unit<Time> {}
Kilogram m = 3.0;
Meter x = 5.0;
Second t = 4.0;
take the expression
m
* x / t
it has Dimension Mass.Length/Time and units of Meter.Kilogram/Second,
but it must have the same Dimension as
x
* m / t
which are Dimension Length.Mass/Time and units
of Kilogram.Meter/Second. So we may say that
Meter.Kilogram/Second
mks = x * m / t;
where the units and dimensions match.
But what happens with units that do not match
interface
Gram extends Unit<Mass> {
Kilogram conversionFactor
= 1000.0;
}
Gram g = 3000.0;
Meter.Kilogram/Second mks2 = x * g / t;
// mks ==
mks2
An automatic conversion takes place using the conversion factor defined
in the Pound interface.
Type aliases are used to cut down the verbage in the language.
type
M = Matrix<Double>;
type Area = Length^2; // alternatively type Area =
Length*Length;
All these types get erased during the conversion into Java bytecode so
the equivalent Java would only use double variables
| Java |
ScientificJ |
double m = 3.0;
double x = 5.0;
double t = 4.0;
double mks = x * m / t |
Kilogram m = 3.0;
Meter x = 5.0;
Second t = 4.0;
Meter.Kilogram/Second
mks = x * m / t; |
double g = 3000.0;
double mks2 = x * (g/Gram.conversionFactor) / t; |
Gram g = 3000.0;
Meter.Kilogram/Second mks2 = x * g / t; |
An example
(Vector<Float>,
Float)
conjGrad(Matrix<Float> A, Vector<Float> x)
{
cgit_max
= 25;
Vector<Float> z = 0;
Vector<Float> r = x;
Vector<Float> p = r;
Float _$rho = r^T * r;
for ( j <- 1:cgit_max) {
q
= A * p;
_$alpha = _$rho / p^T * q;
z := z + _$alpha * p;
r := r - _$alpha * q;
_$rho_0
= _$rho;
_$rho := r^T r;
_$beta = _$rho / _$rho_0;
p := r + _$beta * p;
}
return (z, || x - A * z||)
}
As displayed on a compatible IDE
(Vector<Float>,
Float) conjGrad(Matrix<Float>
A, Vector<Float>
x)
{
cgitmax
= 25;
Vector<Float>
z = 0;
Vector<Float>
r = x;
Vector<Float>
p = r;
Float
ρ = rT
* r;
for
( j <- 1:cgitmax)
{
q
= A p;
α
= ρ / pT
q;
z
:= z + α p;
r
:= r - α q;
ρ0
= ρ ;
ρ
:= rT
r;
β
= ρ / ρ
0;
p
:= r + β p;
}
return
(z, || x - A * z||)
}