M1RI Development Blog: 2014

Friday, August 22, 2014

Google Summer of code 2014 ending

M1RI right now is in great shape, GF(3) multiplication memory leaks that temporarily caused havoc are gone, leaving the multiplication speed competitive.

The Google summer of code was fantastic. My mentors, Martin Albrecht, and Clément Pernet were extraordinary and talented mentors, the former of which I was lucky enough to meet at a developer meeting.

Thanks to the organizers of the Google Summer of Code, the lmonade team, and Sage.

Sunday, July 20, 2014

Week 8

This week has been about implementing missing portions of the API, and working on a issue on the said API.

I added a submatrix function that returns a separate matrix with its own values. This works on numbers of columns and rows matrices. Furthermore, matrices can be multiplied with scalar multiplication.

Optimizations, as far as inlines and versions of functions ignoring null check, are going to be added later, to avoid the risk of errors arising from premature optimization.

Null values are being stamped out as acceptable for use in matrix functions.

Documentation is far more thorough, printing parameters and and with verbose explanations.

Next week I’ll focus on making tests as thorough as possible, getting thorough tests written that verbosely and thoroughly check for errors.

Tuesday, July 15, 2014

More polished

I've updated a lot of documentation, which is now written to compile nicely in doxygen.   Libpng is no longer assumed to be installed, and will compile without.   Furthermore, issues the old API have become more apparent, with tests finding more useless input.

I've written out revised schedule for the project at this point.   Making sure everything works as bug free as possible, with a understandable interface is top priority.

July 20 - - an implementation over GF(3), GF(5) and GF(7) of this API (or a variant
thereof)

------------------------------------------------------------------------------------------------------------
m3d_t *m3d_create(rci_t m, rci_t n);

void m3d_free(m3d_t *A);

m3d_t *m3d_concat(m3d_t *C, const m3d_t *A, const m3d_t *B);

m3d_t *m3d_stack(m3d_t *C, const m3d_t *A, const m3d_t *B);

m3d_t *m3d_submatrix(m3d_t *S, const m3d_t *M, const rci_t lowr, const rci_t lowc, const rci_t highr, const rci_t highc);

m3d_t *m3d_init_window(const m3d_t *A, const rci_t lowr, const rci_t lowc, const rci_t highr, const rci_t highc);

void m3d_free_window(m3d_t *A);

m3d_t *m3d_add(m3d_t *C, const m3d_t *A, const m3d_t *B);
m3d_t *m3d_sub(m3d_t *C, const m3d_t *A, const m3d_t *B);

m3d_t *m3d_addmul_naive(m3d_t *C, const m3d_t *A, const m3d_t *B);
m3d_t *m3d_mul_naive(m3d_t *C, const m3d_t *A, const m3d_t *B);
m3d_t *m3d_mul_scalar(m3d_t *C, const long a, const m3d_t *B);

void m3d_rand(m3d_t *A);

m3d_t * m3d_copy(m3d_t *B, const m3d_t *A);

void m3d_set_ui(m3d_t *A, long value);

long m3d_read_elem(const m3d_t *A, const rci_t row, const rci_t col);

void m3d_add_elem(m3d_t *A, const rci_t row, const rci_t col, const long elem);

void m3d_write_elem(m3d_t *A, const rci_t row, const rci_t col, const long elem);

int m3d_cmp(m3d_t *A, m3d_t *B);

int m3d_is_zero(const m3d_t *A);

void m3d_add_multiple_of_row(m3d_t *A, rci_t ar, const m3d_t *B, rci_t br, long x, rci_t start_col);

void m3d_add_row(m3d_t *A, rci_t ar, const m3d_t *B, rci_t br, rci_t start_col);

void m3d_rescale_row(m3d_t *A, rci_t r, rci_t start_col, const long x);

void m3d_row_swap(m3d_t *M, const rci_t rowa, const rci_t rowb);

void m3d_copy_row(m3d_t* B, rci_t i, const m3d_t* A, rci_t j);

void m3d_row_add(m3d_t *M, const rci_t sourcerow, const rci_t destrow);

void m3d_print(const m3d_t *M);
------------------------------------------------------------------------------------------------------
    I have most of these written out, though a few, such at concat don't have m5d and m7d versions at the moment.

July 27 - test cases covering these functions with multiple inputs, at least more complicated ones like arithmetic.

August 3 - clear documentation of what each of these functions explaining what these functions do (with their inputs) and what conditions they assume.

Have a lot of these written at this point.

August 17 - an implementation of Strassen for at least m3d , with documentation, test cases and timing.    Worked on a lot of m3d_strassen but need to test input and accuracy.

Monday, July 7, 2014

New API andDocumentation Standards

The API of M1RI has been updated for the better. Names across m3d, m5d, and m7d, have been synchronized.   Documentation in format for generation in doxygen is being added. New Basic functions that were not planned before have been added.

Until now m1ri_die, which displays an error message, has been underused up until this point. Many functions have simply done nothing if improper data is entered.   This is being remedied.    For example, in m3d_strassen, if the matrix to be written to is already allocated, and of improper dimensions, it outputs "m3d_strassen: Provided return matrix has wrong dimensions.", while calling m1ri_die (forces the m1ri to close).

Many void functions now return pointers.   This also coincides with several functions not requiring preallocated pointers.    An example is m3d_create, which returns an m3d_t * instead of void.

Evalutions and post Seattle

Evaluations where very productive for this project. A new API is being configured, adding memory allocations and documenting, and in a format that Doxygen will read. I changed some of the direction, and will focus on documentation, and insuring ease of use.

My mentors have fantastic and helpful.

Tuesday, June 24, 2014

Week 5

To start, the last handful of days have consisted of me making it to Seattle for bug days a meeting for Sage development, consisting mainly of bugs but also of other, development, like in my case M1RI.   My mentor, Martin Albrecht will be coming to this event too.

I've worked on bitpacking, changing bitsliced groups into of rows of bits, have a output consisting of numerous shades of grey.   Libpng is doing the low level work. I had ran into a problem, not properly setting up autotools to detect the library in the Makefile.am and this has now thankfully been fixed.

Done but in need of work, m*d_classic and m*d_strassen found here are in a state that has large portions of output, and has the amount of work reminiscent of the final state. What needs checking still is logic errors, which will be the main focus of next week.

A point of history that is noteworthy and relevant to this is this version of Strassen is not the original version but an optimization described in this paper called Strassen-Winograd.   This is the mathematician,   Shmuel Winograd who discovered the the asymptotically fast Coppersmith Winograd Algorithm which has not found widespread use currently due to the massive constant required for it to become efficient for matrix multiplication.

Monday, June 16, 2014

Week 4

Next:
Matrix multiplication proper is the next frontier. The code is already quite far along, with a 64*64 base case, strassen, and padding for multiplication. Peeling is another goal to work on, cutting off portions of the matrix for when the matrix does not form a square. This is best for when the input would otherwise waist a lot of time on unused values, from unused padding, and worst when the padding needed is minimal.

Issues:
After dealing with an issue of configuring autotools, libpng does produce output, albeit not colorful or conforming to inputs.

Sunday, June 8, 2014

Input/Output and Permutation matrices

I/O is contained within the m1ri_io. At this point printing to ascii text is finished. Within the next week, input files for matrices will become a feature. Libpng, is optional, being less portable and is detected by autotools before functions relying upon it are used.

Permutation matrices are rather important and interesting. Using lapack style permutation matrices, they consist of an array where the array index and the respective value stored at index represent a row/column to be swapped. These will serve a major purpose later, being part of PLE decomposition, improving performance when working with sparse matrices.

Sunday, June 1, 2014

Second week of Gsoc

Today marks the end of the first two weeks of coding, the first milestone of the GSOC.    I’ve accomplished thorough tests for arithmetic, mod 3, mod 5, and mod 7, in matrices referred to as m3d_t , m5d_t, and m7d_t respectively.

I have a need to polish up a few functions, such as m*d_quarter, splitting a matrix into fourths not working on matrices less than 64 * 64.   An optimal way to do Hadamard GF(5), and GF(7) is still unknown, though that's a low priority issue at the moment.

Coming soon, I'll start working libpng read and write, along with ascii input into M1RI.   File input will make testing easier, and png files will add png files, along with being convenient, could make interesting art, depending on how numbers are represented.   The smallest footprint involves shades of grey, though primary and secondary colors corresponding to values seems more interesting.

Sunday, May 25, 2014

1st week of GSOC

This week I've got back into the swing of things, and have gotten tests worked on.

For m3d, tests for addition and subtraction are complete, and hadamard needs to be tested.

Addition and subtraction are tested by adding variable a to b, writing the result to c, then subtracting c from b and writing that result to variable d. Variable d is tested for equality to a.

For m5d and m7d, tests are still a work in progress, hadamard needs to be finalized.

A big issue had been remembering which representation was which and a confusion I had when communicating, over which bit was on which bitslice.

A large amount of the milestone needs to be finished, which at this point is mostly tests, which has more or less logic errors. The next week, I have laid out much better, with a clear vision of how how tests will work, relying on the communicative and associative property.

Thursday, May 22, 2014

GF(7) Elementwise multiplication needs further optimization.

I used a logic friday boolean logic optimization software to attempt to find an optimal set element wise multiplication (Hadamard). Unfortunately, this yielded disappointing results for the moment, not less than 70 bit operations. Possible ways to optimize include temporary bit slice during multiplication. This would mean that for the duration of the equation, the elements could be operated on as if they had four bits instead of three. Such as was first suggested here http://arxiv.org/abs/0901.1413 for GF(5), though a better representation has since been found for GF(5).

Tuesday, May 13, 2014

Submatrices, Addition, Subtraction, and Hadamard

The first major milestone of this project will be coding and setting up tests for the major building blocks of any feasible Matrix library.   To start, wrappers for rand, free, and other essential library functions, will be worked on.Basic functions like m1ri_rand, and m1ri_free, have not appeared to any problems at this point, and seam to be working well.

Arithmetic in the form of element wise addition and multiplication (Hadamard Multiplication) is the first major area that is needing worked on.   For GF(3), referred to as m3d, hadamard is implimented as element wise multiplication in that representation is trivial, taking 3 bit operations, half as many as is required for vector addition.    GF(5), and GF(7) on the other hand, optimized in their representations for addition and traditional matrix multiplication, Boolean expression for element wise multiplication still found.

   Tests with Autotools are an area that needs work, while simple benchmarks have been used, they've always been written on the fly and edited soon afterwards, and the current "tests" will succeed as long as the program doesn't have any syntax errors.    To improve upon this, tests that check for logic errors with a pass fail on the aforementioned functions will be written in the following .   Throughout this summer, testing for logic errors will be be a major theme of the project.   This not only will make further work easier to build off of but will add to its legitimacy.

Sunday, April 27, 2014

Accepted into the Google Summer of Code!

I'll be sponsored by Google this summer when working on the project, because I've been accepted into the Google Summer of Code. My mentors are Martin Albrecht a postdoc Cryptographer, and Clément Pernet , a renowned Mathematician. Here is our developer group, which will be updated along with this blog as the summer progresses. The source code will can be found at https://github.com/grudy/m1ri.

Sunday, March 30, 2014

What and Why M1RI?

What is M1RI?
As stated in the proposal:
"The goal of this project is to impliment a matrix library with bitslicing techniques described by Tom Boothby and Robert Bradshaw here http://arxiv.org/abs/0901.1413. This takes the “Method of Four Russians”, an algorithm made for efficient logical matrix algorithms, and using bitslicing to extend that to matrices over larger finite fields. GF(3), GF(5), and GF(7) matrices will be the scope of this project.

Why?
This technique allows for very fast matrix calculations. It allows low level arithmetic to collide with the efficiency of modern CPU word sizes.

Linear algebra has an elegant aesthetic but, delving deep into, unearths arcane knowledge about the nature of mathematics, and computer science. Getting to learn more of this along the way of this project will be quite fulfilling.

How?

This project is going to be written in C99, using autotools for portability and for testing.

Links