Flexible Computation Research Laboratory
The aim of this laboratory is to develop techniques, tools and educational material for flexible computation. (A presentation1 on this laboratory is available.)
What is flexible computation?
A key aspect of flexible computation is well defined flexible execution, i.e. values read during program execution, are independent of the permitted program execution orders. Parallel and/or sequential program execution orders with certain properties are permitted, and with all such executions, the values read from the variables should be identical.
Also, it is highly desirable that the language used for expressing flexible computation, is usable as a (simplified) hardware definition language. This will give the flexibility to implement solutions in software and/or hardware.
Simple Flexible Language (SFL)2 is a language enabling flexible execution. A prototype compiler for it was developed3. This language also enables hardware block diagrams to be generated from programs. Subsequently we have adapted this material and developed courses on flexible algorithms for beginners4,5,6,7.
Flexible execution is achieved by enhancing the scope rules of variables and defining where they may be initialized, where they may be updated, where they may be read. Updates are performed by suitable composite assignments which given these refined scope rules, ensure that the values read are well defined. Examples of such suitable assignments are once only assignment, "or=", "and=", "-=", etc. Thus this approach can extend or replace once only assignment with suitable composite assignments, ensuring well defined read values given these scope rules.
The life cycle of a variable enabling flexible execution is as follows.
1) Allocated and initialized possibly to a language specified default value (once only),
2) Updated using suitable composite assignments (sequentially in any order; sometimes in parallel - see below),
3) Read (in parallel),
4) Freed (once only).
Topics being researched
We concentrate research efforts on the following topics.
Embedded language for flexible computation
An embedded language (EFL) with these characteristics has been developed incorporating the scope rules supporting well defined flexible execution. This approach allows embedding flexible execution of code written in existing programming languages. The sequential parts of the computation (including all the Input/Output) are expressed in the host language only. The parallel parts of the computation are expressed in the embedded language only. EFL simplifies parallel programming (a) by embedding blocks capable of well defined flexible computation inside code written in an imperative host language (our first implementation is for Python), and (b) by freeing the programmer from all the details of the underlying native parallel programming technicalities required by the host language parallel programming constructs. A pre-compiler has been developed to convert EFL embedded blocks of code into native parallel code in the host language.
For further details see http://flexcomp.jct.ac.il/EFLimp .
Hardware support for flexible computation
We have built a prototype of shared OR memory using a FPGA which implements the lock free OR= composite assignment for two processors. A comparison of the time to execute in parallel regular assignments WITHOUT locking as opposed to regular assignments WITH locking was made. Avoiding locking made timings many times faster. Locking and unlocking are performed by functions which activate the FPGA hardware, making things much slower. Regarding the hardware implementation of OR= allowing well defined parallel execution WITHOUT locking, it was found to be a few percent slower than regular assignment WITHOUT locking. Therefore, a speedup of many times can be expected when using OR= WITHOUT locking, as opposed to using regular assignments WITH locking. These remarks apply to timings of single instructions.
Running complete programs, a speedup of about 30% to 40% was obtained when using OR= as opposed to using regular assignments WITH locking.
The surface area of the hardware implementation was two to three times larger than the area used for regular assignment because of the way logic cells are allocated on a FPGA. It is estimated that a custom implementation (using an ASIC) would require a surface area 30% to 40% more than the area used for regular assignment. Using capacitance-based DRAM memory should require even less additional surface area.
We would like to develop capacitor based memories to implement “OR=“ as this has the potential of a very simple implementation. Also, there is no need to read the contents of the memory to perform the “OR=". This also has the potential of parallel execution of OR=, given the electrical properties of capacitors and semiconductors.
Educational material for flexible computation
Educational material has been developed for beginners. However, it has concentrated on flexible algorithms; the material on hardware block diagrams being treated in brief. Further development of the educational material is desirable, for example:
- Develop an integrated course on flexible algorithms and digital logic
- Develop an advanced course on flexible algorithms
- Develop a course for secondary schools.
- Develop an internet based system for course preparation and teacher training, in flexible computation8.
Work in progress
1) Investigation of EFL usability by a case study implementation of task trees .
2) Using the method of statistical usage testing and operational reliability estimation based on operational profiles and test suite quality indicators to help programmers to achieve freedom from violations of once only assignment with high reliability.
3) Implementing parallel data structures using EFL.
4) Developing a version of the EFL pre-compiler using MPI.
5) Combining EFL with Transactional Memory (TM).
6) Rewriting the DEEPSAM package for finding the global minimum of the potential energy surface of proteins using EFL.
Please contact email@example.com for further information.