Current goals
Today I was trying to finish the recursive vector code, and of course lots of other things interfered. Several things happened today:
- Fixed the radix expansion of DftCt with left recursion in Maude. It will produce a tail-recursive program, which can be converted into a loop. This has a nice parallel with what FLAME is doing.
- I talked to Franz and there are several issues with scalability of Spiral. We need to be able to compose several different codes: real, complex, FMA, fixed-point, vector, SMP, recursive and so forth.
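The remark about turning a tail-recursive program into a loop can be illustrated with a toy sketch. This is not the Maude DftCt code: the function names and the idea of peeling one radix factor per step are assumptions made only for illustration.

```python
def stages_recursive(n, radix, stages=()):
    """Tail-recursive version: the recursive call is the last action taken."""
    if n < 1 or radix < 2:
        raise ValueError("need n >= 1 and radix >= 2")
    if n == 1:
        return list(stages)
    if n % radix != 0:
        raise ValueError("n must be a power of radix")
    # Nothing is left to do after this call returns, so it is a tail call.
    return stages_recursive(n // radix, radix, stages + (radix,))


def stages_loop(n, radix):
    """Same computation with the tail call replaced by a loop."""
    if n < 1 or radix < 2:
        raise ValueError("need n >= 1 and radix >= 2")
    stages = []
    while n != 1:
        if n % radix != 0:
            raise ValueError("n must be a power of radix")
        stages.append(radix)   # the accumulator argument becomes a local variable
        n //= radix            # the changed argument becomes an assignment
    return stages


print(stages_loop(64, 4))  # [4, 4, 4]
```

The arguments of the tail call become loop variables, which is the general shape of the conversion.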
The second main issue is scalability of Spiral. Here we came up with a list of TODOs:
- RulesFor(pattern, ...) (for example [TRC, [DFT, @]])
- Composable unparsers
- C99Unparser, TRC, TCR, TCC
- Composable CodeRuleTree, or at least one should resolve all conflicts within different modes
- This is complete when TCR(RDFT(..)) produces asymmetric code (real input, complex output)
- SPL/DP options record expansion. Franz also suggested the idea of a "worksheet". A worksheet would keep the options along with all intermediate results (i.e. ruletree, formula, sigma-spl formula, and finally code). This should simplify conflict resolution.
- Eliminate all global variables (2 and 3 basically do this)
- DP should return code + file where it is stored.
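A rough sketch of what a "worksheet" record might look like, written in Python only to make the idea concrete. The class name, field names, and placeholder values are all hypothetical; none of this is existing Spiral code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Worksheet:
    """Hypothetical container: options plus every intermediate result."""
    options: dict                      # the SPL/DP options record
    ruletree: Optional[Any] = None     # filled in after rule expansion
    formula: Optional[Any] = None      # SPL formula
    sigma_spl: Optional[Any] = None    # sigma-SPL formula
    code: Optional[Any] = None         # final generated code
    code_file: Optional[str] = None    # where the code was stored

    def stages_done(self):
        """Names of the pipeline stages that already have a result."""
        names = ("ruletree", "formula", "sigma_spl", "code")
        return [n for n in names if getattr(self, n) is not None]


ws = Worksheet(options={"mode": "real"})   # placeholder options
ws.ruletree = "<ruletree placeholder>"     # placeholder result
print(ws.stages_done())                    # ['ruletree']
```

Because the options travel with the results, nothing needs to be read from global variables, and returning the worksheet also covers "code + file where it is stored".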
Micro-agenda
- X dct3
- P dags for Markus
- hashing issues c2|c3(5)
- ^T breaks, nontriv path
- port/del legacy funcs
- small lib
Feedparser usage
>>> import feedparser
>>> d = feedparser.parse("http://sparch.blogspot.com/atom.xml")
>>> print(d['entries'][0]['content'][0]['value'])
Rewrite Systems
ELAN (A, AC)
Maude (A, AC)
Stratego (no AC)
Installing intel compiler without rpm
sed -i 's|<INSTALLDIR>|/opt/intel_cc_80|g' icc
sed -i 's|<INSTALLDIR>|/opt/intel_cc_80|g' iccvars.sh
sed -i 's|<INSTALLDIR>|/opt/intel_cc_80|g' iccvars.csh
sed -i 's|<INSTALLDIR>|/opt/intel_cc_80|g' icpc
Identifier shadowing
Current priorities: imports -> pkgs -> Global
Suppose you have 'q' in imports.
A failed load(q) creates a valueless identifier 'q' in pkgs.
Typing 'q' then causes the error "q must have a value".
From algsimp.ml
(*
* simplify patterns of the form
*
* ((c_1 * a + ...) + ...) + (c_2 * a + ...)
*
* The pattern includes arbitrary coefficients and minus signs.
* A common case of this pattern is the butterfly
* (a + b) + (a - b)
* (a + b) - (a - b)
*)
(* this whole procedure needs much more thought *)
and deepCollectM maxdepth l =
My suggestion:
- Incorporate (a+b)+(a-b) simplification, i.e. Sum + Sum
- Figure out the difference between c1*a+c2*a -> (c1+c2)*a and c1*a + c1*b -> c1*(a+b)
- Implement "deep" versions of two collects above
- Is (a+b)+(a-b) a special case of collect? It looks so, but with c1 = 1 and c2 = -1.
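To make the two kinds of collect concrete, here is a small Python sketch. It is not the algsimp.ml code: representing a sum as a flat list of (coefficient, variable) pairs and all function names are assumptions for illustration only.

```python
from collections import defaultdict


def collect_by_var(terms):
    """c1*a + c2*a -> (c1+c2)*a: sum the coefficients of each variable."""
    acc = defaultdict(int)
    for coeff, var in terms:
        acc[var] += coeff
    # Drop variables whose coefficients cancelled out.
    return {var: c for var, c in acc.items() if c != 0}


def collect_by_coeff(terms):
    """c1*a + c1*b -> c1*(a+b): group the variables sharing a coefficient."""
    groups = defaultdict(list)
    for coeff, var in terms:
        groups[coeff].append(var)
    return dict(groups)


def negate(terms):
    """Negate every term, so x - y can be written as x + negate(y)."""
    return [(-c, v) for c, v in terms]


a_plus_b = [(1, "a"), (1, "b")]
a_minus_b = [(1, "a"), (-1, "b")]

# Butterfly: the coefficients are just +1 and -1.
print(collect_by_var(a_plus_b + a_minus_b))           # {'a': 2}
print(collect_by_var(a_plus_b + negate(a_minus_b)))   # {'b': 2}
print(collect_by_coeff([(3, "a"), (3, "b")]))         # {3: ['a', 'b']}
```

In this flat representation the butterfly falls out of the first collect once the nested sums are merged, which suggests the "deep" versions mostly need a flattening step with a depth limit.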
Dimensionless
We got the dimensionless rule to produce good code with Xu and Riddhi. They left today.
full_merge_tensor_chains is the main function for simplification. It tries to break and match two composed tensors so that the individual terms can be composed.
The function is very general, but probably rather slow, and this adds to rewriting overhead.
GAP Speed
We need to improve speed of compilation. Things to consider:
- Function overhead (300ns on mini/1.6ghz, about 500 cycles)
- Slow record access
- String operations with var.id
- Algorithms, of course.
For records we should have fast int->int hashing (like in tab), and vtables to get superclass fields compiled-in.
Maybe also a form of strong typing?
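One possible reading of "vtables to get superclass fields compiled-in" is sketched below in Python. This is not GAP code: the Layout class and the field names are hypothetical, and the point is only that field names are resolved to integer slots once, so each access becomes an index rather than a string lookup.

```python
class Layout:
    """Hypothetical record layout: maps field names to fixed integer slots."""

    def __init__(self, fields, parent=None):
        # Superclass fields keep their slot numbers in every subclass.
        slots = dict(parent.slots) if parent is not None else {}
        for name in fields:
            if name not in slots:
                slots[name] = len(slots)
        self.slots = slots

    def new(self):
        """Create an empty record with one cell per slot."""
        return [None] * len(self.slots)


base = Layout(["id", "type"])
derived = Layout(["size"], parent=base)

# Resolve the names once ("compile time"), then use plain integer indexing.
ID = derived.slots["id"]
SIZE = derived.slots["size"]

rec = derived.new()
rec[ID] = "x1"
rec[SIZE] = 8
print(derived.slots)   # {'id': 0, 'type': 1, 'size': 2}
```

Since 'id' has the same slot in the base and derived layouts, code written against the base layout would work unchanged on derived records.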