Current goals

Today I was trying to finish recursive vector code, and of course lots of other things interfered. Several things happened today:
  • Fixed the radix expansion of DftCt with left recursion in Maude. It will now produce a tail-recursive program, which can be converted into a loop. This is a nice parallel to what FLAME is doing.
  • I talked to Franz, and there are several issues with the scalability of Spiral. We need to be able to compose several different kinds of code: real, complex, FMA, fixed-point, vector, SMP, recursive, and so forth.
Automatic iterative DFT construction is very nice, and easy to do in Maude. I will do this one day. Franz also showed me how to get a large sample of iterative variants automatically (Pease, Stockham, Korn-Lambiotte, etc.).
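
As a generic illustration of the recursion-to-loop idea (not the Maude output, and not a Spiral-generated variant), the sketch below puts a textbook recursive radix-2 FFT next to an iterative one and compares both with a naive DFT. All function names are made up for this example, and the input length is assumed to be a power of two.

```python
import cmath


def dft_naive(x):
    """O(n^2) DFT by definition, used only as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def fft_recursive(x):
    """Textbook recursive radix-2 decimation-in-time FFT."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_iterative(x):
    """Same transform as a loop over stages: permute once, then butterflies."""
    n = len(x)
    bits = n.bit_length() - 1
    a = [0j] * n
    for i in range(n):
        a[bit_reverse(i, bits)] = x[i]
    size = 2
    while size <= n:                      # one pass per recursion level
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a


x = [1, 2, 3, 4, 0, -1, 0.5, 2j]
ref = dft_naive(x)
print(max(abs(p - q) for p, q in zip(fft_iterative(x), ref)))
```

The `while size <= n` loop plays the role of the recursion: each iteration corresponds to one level of the recursive version, which is roughly what a tail-recursive expansion buys.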

The second main issue is the scalability of Spiral. Here we came up with a list of TODOs:
  1. RulesFor(pattern, ...) (for example [TRC, [DFT, @]])
  2. Composable unparsers
    • C99Unparser, TRC, TCR, TCC
    • Composable CodeRuleTree, or at least one should resolve all conflicts within different modes
    • This is complete when TCR(RDFT(..)) produces asymmetric code (real input, complex output)
  3. SPL/DP options record expansion. Franz also suggested the idea of a "worksheet". A worksheet would keep the options along with all intermediate results (i.e. ruletree, formula, sigma-spl formula, and finally code). This should simplify conflict resolution.
  4. Eliminate all global variables (2 and 3 basically do this)
  5. DP should return code + file where it is stored.
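
A minimal sketch of what such a worksheet could look like, assuming a plain record with one slot per stage. The class name, field names, and option keys are all hypothetical, and the stage values are placeholder strings rather than real Spiral objects.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Worksheet:
    """Options plus every intermediate result of one derivation (hypothetical layout)."""
    opts: Dict[str, Any] = field(default_factory=dict)
    ruletree: Optional[Any] = None
    formula: Optional[Any] = None
    sigma_spl: Optional[Any] = None
    code: Optional[str] = None
    code_file: Optional[str] = None   # item 5: where the generated code is stored

    def stage(self):
        """Name of the last consecutive stage that has been filled in, or None."""
        done = None
        for name in ("ruletree", "formula", "sigma_spl", "code"):
            if getattr(self, name) is None:
                break
            done = name
        return done


# Each stage would read its options from ws.opts instead of a global variable.
ws = Worksheet(opts={"datatype": "real", "vector": False})
ws.ruletree = "placeholder ruletree"
ws.formula = "placeholder formula"
print(ws.stage())
```

Because the options travel with the results, two worksheets with conflicting modes can coexist without touching any shared state.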

Micro-agenda

  • X dct3
  • P dags for Markus
  • hashing issues c2|c3(5)
  • ^T breaks, nontriv path
  • port/del legacy funcs
  • small lib
Transposition and hashing need writing. I will probably post them here. I need to document the problem, think carefully and come up with a robust solution.

Maude

Equational and rewrite logic programming with reflection.

Feedparser usage

>>> import feedparser
>>> d = feedparser.parse("http://sparch.blogspot.com/atom.xml")
>>> print(d['entries'][0]['content'][0]['value'])

Rewrite Systems

Support for rewriting modulo associativity (A) and associativity-commutativity (AC):
ELAN (A, AC)
Maude (A, AC)
Stratego (no AC)

Installing intel compiler without rpm

sed -i 's/opt/intel_cc_80/g' icc
sed -i 's/opt/intel_cc_80/g' iccvars.sh
sed -i 's/opt/intel_cc_80/g' iccvars.csh
sed -i 's/opt/intel_cc_80/g' icpc

Identifier shadowing

Current priorities:
imports -> pkgs -> Global

Suppose you have 'q' in imports.
A failed load(q) creates a valueless identifier 'q' in pkgs.

Typing 'q' then causes the error "q must have a value".

From algsimp.ml:

(*
* simplify patterns of the form
*
* ((c_1 * a + ...) + ...) + (c_2 * a + ...)
*
* The pattern includes arbitrary coefficients and minus signs.
* A common case of this pattern is the butterfly
* (a + b) + (a - b)
* (a + b) - (a - b)
*)
(* this whole procedure needs much more thought *)
and deepCollectM maxdepth l =

My suggestion:
  • Incorporate (a+b)+(a-b) simplification, i.e. Sum + Sum
  • Figure out the difference between c1*a+c2*a -> (c1+c2)*a and c1*a + c1*b -> c1*(a+b)
  • Implement "deep" versions of two collects above
  • Is (a+b)+(a-b) a special case of collect? It looks so, just with c1=1 and c2=-1.
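
To make the two collects concrete, here is a small sketch, assuming a sum is a flat list of (coefficient, variable) pairs. That representation and the function names are made up for illustration; this does not reproduce the data types in algsimp.ml or what deepCollectM does.

```python
from collections import defaultdict


def collect_by_var(terms):
    """c1*a + c2*a -> (c1+c2)*a; terms whose coefficients cancel are dropped."""
    acc = defaultdict(int)
    for c, v in terms:
        acc[v] += c
    return [(c, v) for v, c in acc.items() if c != 0]


def factor_by_coeff(terms):
    """c1*a + c1*b -> c1*(a+b); returns a list of (coeff, [vars])."""
    groups = defaultdict(list)
    for c, v in terms:
        groups[c].append(v)
    return list(groups.items())


def add(s, t):
    """Sum + Sum on flat term lists."""
    return s + t


def sub(s, t):
    """Sum - Sum: negate every coefficient of the second sum."""
    return s + [(-c, v) for c, v in t]


a_plus_b = [(1, "a"), (1, "b")]
a_minus_b = [(1, "a"), (-1, "b")]

print(collect_by_var(add(a_plus_b, a_minus_b)))   # butterfly (a+b)+(a-b)
print(collect_by_var(sub(a_plus_b, a_minus_b)))   # butterfly (a+b)-(a-b)
print(factor_by_coeff([(3, "a"), (3, "b"), (5, "c")]))
```

In this model the butterfly is just collect_by_var applied after the two sums are merged, with coefficients 1 and -1. A "deep" version would first have to flatten nested sums into one such list.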

Dimensionless

Working with Xu and Riddhi, we got the dimensionless rule to produce good code. They left today.

full_merge_tensor_chains is the main function for simplification. It tries to break and match two composed tensors, and then compose the individual terms.

The function is very general, but probably rather slow, and this adds to rewriting overhead.

GAP Speed

We need to improve speed of compilation. Things to consider:
  1. Function call overhead (300 ns on mini/1.6 GHz, about 500 cycles)
  2. Slow record access
  3. String operations with var.id
  4. Algorithms, of course.
GAP-to-C compiler?
For records we should have fast int->int hashing (like in tab), and vtables to get superclass fields compiled-in.
Maybe also a form of strong typing?
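
One way to picture the "compiled-in superclass fields" idea is to resolve field names to integer slots once, with a subclass layout extending its parent's so that parent offsets stay valid. The sketch below is a toy model under that assumption: `Layout` and the field names are hypothetical, and it says nothing about how GAP records are actually implemented.

```python
class Layout:
    """Maps field names to fixed integer slots; a child layout extends its parent's."""

    def __init__(self, fields, parent=None):
        # Copy the parent's slots first so inherited fields keep their offsets.
        self.slots = dict(parent.slots) if parent else {}
        for name in fields:
            if name not in self.slots:
                self.slots[name] = len(self.slots)

    def new(self):
        """Allocate an empty record as a flat list, one cell per slot."""
        return [None] * len(self.slots)


base = Layout(["id", "t"])
derived = Layout(["range"], parent=base)

# Resolve the name to an index once ("compile time") ...
ID = base.slots["id"]

# ... then every access is a plain integer index, valid for both layouts.
rec = derived.new()
rec[ID] = "x1"
print(rec)
```

The name lookup happens once per field rather than once per access, which is the part that would replace string-keyed record access.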