Z-ordering (Morton ordering)

big_gie · 2009-11-05 23:38:10

Hi,

I need to sort a list of objects based on their position into a Z-oder[1]. Everywhere I look, z-order is used by interleaving bit representation of integers. In my case, the objects have a double precision[2] position in three dimension.

Somebody told me it would not work with double precision because the mantissa has a different meaning then the exponent. But I was not convinced and tried anyway. Actually, since the exponent is stored in higher significant bits then the mantissa, I think I could interleave the bits from the exponents, and then interleave the mantissa. So I implemented that. Hell, I even did it manually (with 32 bits floats, not 64 bits doubles).

But I cannot make it work reliably. I'd like to get some comments on that...

My procedure was as follow. Place 16 particles on a 4x4 grid. The positions in x and y dimensions are 1, 2, 3 and 4 bohrs[3] but expressed in meters. The initial ordering (as found in memory) is:

13      14      15      16


9       10      11      12


5       6       7       8


1       2       3       4

The 16 [x,y] positions are, in meters:

[5.29177240552557e-11 , 5.29177240552557e-11]
[1.05835448110511e-10 , 5.29177240552557e-11]
[1.58753172165767e-10 , 5.29177240552557e-11]
[2.11670896221023e-10 , 5.29177240552557e-11]
[5.29177240552557e-11 , 1.05835448110511e-10]
[1.05835448110511e-10 , 1.05835448110511e-10]
[1.58753172165767e-10 , 1.05835448110511e-10]
[2.11670896221023e-10 , 1.05835448110511e-10]
[5.29177240552557e-11 , 1.58753172165767e-10]
[1.05835448110511e-10 , 1.58753172165767e-10]
[1.58753172165767e-10 , 1.58753172165767e-10]
[2.11670896221023e-10 , 1.58753172165767e-10]
[5.29177240552557e-11 , 2.11670896221023e-10]
[1.05835448110511e-10 , 2.11670896221023e-10]
[1.58753172165767e-10 , 2.11670896221023e-10]
[2.11670896221023e-10 , 2.11670896221023e-10]

As you might "see", the x dimension is incremented first, then y.

I then converted these decimals to binary representation[4]:

[00101110011010001011110000010000 , 00101110011010001011110000010000]
[00101110111010001011110000010000 , 00101110011010001011110000010000]
[00101111001011101000110100001100 , 00101110011010001011110000010000]
[00101111011010001011110000010000 , 00101110011010001011110000010000]
[00101110011010001011110000010000 , 00101110111010001011110000010000]
[00101110111010001011110000010000 , 00101110111010001011110000010000]
[00101111001011101000110100001100 , 00101110111010001011110000010000]
[00101111011010001011110000010000 , 00101110111010001011110000010000]
[00101110011010001011110000010000 , 00101111001011101000110100001100]
[00101110111010001011110000010000 , 00101111001011101000110100001100]
[00101111001011101000110100001100 , 00101111001011101000110100001100]
[00101111011010001011110000010000 , 00101111001011101000110100001100]
[00101110011010001011110000010000 , 00101111011010001011110000010000]
[00101110111010001011110000010000 , 00101111011010001011110000010000]
[00101111001011101000110100001100 , 00101111011010001011110000010000]
[00101111011010001011110000010000 , 00101111011010001011110000010000]

Then, I manually interleaved the x and y bits. For example, for the first position [1,1] bohr, I first put the two binary numbers one after the other. Secondly, I add a space between each digit, and third shift the lower one by adding another space at the front of it. The bit interleaving just then appears with the bits aligned:

1)
00101110011010001011110000010000
00101110011010001011110000010000
2)
0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0 0
0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0 0
3)
0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0 0
 0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0 0
4)
0000110011111100001111001100000011001111111100000000001100000000

Doing this manual procedure for all the combinations, I get the following:

0000110011111100001111001100000011001111111100000000001100000000    r1
0000110011111100101111001100000011001111111100000000001100000000    r2
0000110011111110000111001110100011000101111100100000000110100000    r3
0000110011111110001111001100000011001111111100000000001100000000    r4
0000110011111100011111001100000011001111111100000000001100000000    r5
0000110011111100111111001100000011001111111100000000001100000000    r6
0000110011111110010111001110101011000101111100100000000110100000    r7
0000110011111110011111001100000011001111111100000000001100000000    r8
0000110011111101001011001101010011001010111100010000001001010000    r9
0000110011111101101011001101010011001010111100010000001001010000    r10
0000110011111111000011001111110011000000111100110000000011110000    r11
0000110011111111001011001101010011001010111100010000001001010000    r12
0000110011111101001111001100000011001111111100000000001100000000    r13
0000110011111101101111001100000011001111111100000000001100000000    r14
0000110011111111000111001110100011000101111100100000000110100000    r15
0000110011111111001111001100000011001111111100000000001100000000    r16

Note that I added a name at the end to keep track of which line represent which positions. I saved that to a file, cat'ed it, and piped to "sort":

$ cat to_sort.txt | sort
0000110011111100001111001100000011001111111100000000001100000000    r1
0000110011111100011111001100000011001111111100000000001100000000    r5
0000110011111100101111001100000011001111111100000000001100000000    r2
0000110011111100111111001100000011001111111100000000001100000000    r6
0000110011111101001011001101010011001010111100010000001001010000    r9
0000110011111101001111001100000011001111111100000000001100000000    r13
0000110011111101101011001101010011001010111100010000001001010000    r10
0000110011111101101111001100000011001111111100000000001100000000    r14
0000110011111110000111001110100011000101111100100000000110100000    r3
0000110011111110001111001100000011001111111100000000001100000000    r4
0000110011111110010111001110101011000101111100100000000110100000    r7
0000110011111110011111001100000011001111111100000000001100000000    r8
0000110011111111000011001111110011000000111100110000000011110000    r11
0000110011111111000111001110100011000101111100100000000110100000    r15
0000110011111111001011001101010011001010111100010000001001010000    r12
0000110011111111001111001100000011001111111100000000001100000000    r16

I now have an (almost) Z-order list. Here is a what it produces:

13      14      15      16
|  \    | \     |  \    |
|    \  |  \    |    \  |
9       10  |   11      12
   \        |      \
     \      |        \
5       6   |   7 ----- 8
|  \    |   |      \
|    \  |    \       \
1       2       3 ----- 4

As you see, the 7 and 4 are inverted from what it should give (an "N" ordering, instead of "Z", but that is equivalent and irrelevant). Now if I automate all this, I get similar weird results. It is even worse when the system is big. As examples, I have ploted the ordering for a small grid (the previous example)[5], a bigger grid[6] and a big disk[7]

Now I cannot belive that Z-ordering does not work with double precision data since I almost got it. But then, I cannot correctly get what I want... Did anybody worked with that before? Does anyone can comment on the procedure? If you spot a mistake?

I absolutely need to make this working...

Thanx a lot.

[1] http://en.wikipedia.org/wiki/Z-order_%28curve%29
[2] http://en.wikipedia.org/wiki/Double_precision
[3] http://en.wikipedia.org/wiki/Bohr_radius
[4] http://www.binaryconvert.com/convert_float.html
[5] http://nbigaouette.inrs-emt.homelinux.n … _small.png
[6] http://nbigaouette.inrs-emt.homelinux.n … id_big.png
[7] http://nbigaouette.inrs-emt.homelinux.n … r_disk.png

Edit: fixed Z-order wikipedia URL

Last edited by big_gie (2009-11-06 15:45:14)

Ranguvar · 2009-11-06 05:09:53

*Whoosh*

sr · 2009-11-06 05:27:32

Stupid question: why not just do the z-ordering in bohrs, (which will be integers) and then convert back to metres?

From what I can see, the almost z-ordering for the 4x4 block may just be happening by accident The larger grids have much worse errors.

big_gie · 2009-11-06 15:44:00

Ranguvar wrote:

*Whoosh*

I know... It's not for the faint of heart

sr wrote:

Stupid question:

There is no such thing as a stupid questions, just stupid answers

sr wrote:

why not just do the z-ordering in bohrs, (which will be integers) and then convert back to metres?

I also tried that. The problem is that the positions covers a wide range of scales: New particles are created a tiny fraction of a bohr away, while others are leaving the system on the meter scale... Scaling to bohr and casting to integers would make far away particles overflow the integer size.
I tried different kind of manipulation like that, with mixed success. If I had good result I would have kept them, but they are no better then the original "directly from double" so I would prefer fixing this one instead, since there should be no loss of information in that interleaving.

keenerd · 2009-11-06 18:05:44

I need to do some catching up (the interleaving algo is new to me). One possible glitch with using IEEE floating points strikes me immediately. To save a bit, the implied "1." is always removed from the mantissa, leaving only the fractional part. This could throw off any bitwise operations.

One other possibility would be to change the interleaving algo to make it more float friendly. You can think of a floating point number as a binary run length encoding, with the exponent holding a run of zeros. If would work if you un-encoded the floating point, but then it would be easier to use some sort of BigDecimal strut/class.

Doing some more reading, it looks like the interleaving only works on homogeneous data. Unsigned ints only. Looks like there is active research for good floating algos, for example: http://compgeom.com/~piyush/papers/tvcg_stann.pdf

Also, you cross posted on comp.lang.c. They've given you sound advice.

tavianator · 2009-11-06 18:52:00

The interleaving Z-curve algorithm only works when the coordinates are specified in a way which gives correct comparisons when the values are compared as binary integers. This is often, but not always true for IEEE floats, hence your algorithm "almost" works. In fact, the C standard doesn't specify the representation for floats, so this becomes a portability concern too. Instead, why not try:

- Add each floating-point x, y, z, etc. coordinate to a single list, and sort it.
- Replace each coordinate in your data with its integer numerical index in the sorted list.
- In "random" data there should be no ties (equal coordinates), but it may be wise to treat consecutive equal elements as having the same index.
- Apply Z-ordering to this adjusted data.

I'm not sure if this'll work, but I think it should

tavianator · 2009-11-06 20:52:22

Actually never mind, positive IEEE floats always compare correctly when treated as binary integers. The issue lies in the fact that the binary representation of a float is not proportional to its value. I think your method works only when the exponents are equal. So I'm pretty sure my previous method won't work either.

You could always use bigints (GMP etc.) and represent coordinates in bohrs. That might be about the cleanest way.

big_gie · 2009-11-07 05:05:07

keenerd wrote:

I need to do some catching up (the interleaving algo is new to me). One possible glitch with using IEEE floating points strikes me immediately. To save a bit, the implied "1." is always removed from the mantissa, leaving only the fractional part. This could throw off any bitwise operations.

The 1. is removed, but it is removed for all numbers. So all numbers have the same information as others.

keenerd wrote:

One other possibility would be to change the interleaving algo to make it more float friendly. You can think of a floating point number as a binary run length encoding, with the exponent holding a run of zeros. If would work if you un-encoded the floating point, but then it would be easier to use some sort of BigDecimal strut/class.

I'm not sure I followed... But the next part is the best.

keenerd wrote:

Doing some more reading, it looks like the interleaving only works on homogeneous data. Unsigned ints only. Looks like there is active research for good floating algos, for example: http://compgeom.com/~piyush/papers/tvcg_stann.pdf

That's really interesting. I have searched last spring and summer for similar information but could not find anything. The paper's Algorithm 1 on page 3 will probably help me. I implemented it but it is not working for now. For some reason I can't find for now it fails to correctly compare two points. I'll need more time to understand it thoroughly. Thanx a lot for that link!

@tavianator
I base everything on the IEEE 754 double precision format, which is used almost everywhere, at least on the machines I am/will be using.

I'll report back about Kumar's paper.

big_gie · 2009-11-17 00:12:23

big_gie wrote:

keenerd wrote:
Looks like there is active research for good floating algos, for example: http://compgeom.com/~piyush/papers/tvcg_stann.pdf
That's really interesting. I have searched last spring and summer for similar information but could not find anything. The paper's Algorithm 1 on page 3 will probably help me. I implemented it but it is not working for now. For some reason I can't find for now it fails to correctly compare two points.

I have wrote a small test program that implements the paper's Algorithm 1 on page 3. It is not working though. I am pretty sure the code does not have a bug, so I think the error comes from understanding Algorithm 1. This explication should cover the algorithm:

http://compgeom.com/~piyush/papers/tvcg_stann.pdf wrote:

The algorithm takes two points with ﬂoating point coordinates. The relative order of the two points is determined by the pair of coordinates who have the ﬁrst differing bit with the highest exponent. The XOR MSB function computes this by ﬁrst comparing the exponents of the coordinates, then comparing the bits in the mantissa, if the exponents are equal. Note that the MSDB function on line 14 returns the most signiﬁcant differing bit of two integer arguments. This is calculated by ﬁrst XORing the two values, then shifting until we reach the most signiﬁcant bit. The EXPONENT and MANTISSA functions return those parts of the ﬂoating point number in integer format.

Now, here is the code I have wrote think it would implement the algorithm:

// **************************************************************
// Double precision masks
//      0 11111111111 0000000000000000000000000000000000000000000000000000
//      http://www.binaryconvert.com/result_unsigned_int.html?hexadecimal=000007FF
const uint64_t double_mask_exponent = uint64_t(2047) << uint64_t(52);
//      0 00000000000 1111111111111111111111111111111111111111111111111111
const uint64_t double_mask_mantissa = ((~double_mask_exponent) << uint64_t(1)) >> uint64_t(1);
//      1 00000000000 0000000000000000000000000000000000000000000000000000
const uint64_t double_mask_sign = ~((~uint64_t(0)) >> uint64_t(1));

// **************************************************************
uint64_t EXPONENT(const double *a)
{
    // Get the double precision binary representation into an integer
    // Necessary for the logical AND
    uint64_t double_in_uint64_t;
    memcpy(&double_in_uint64_t, a, sizeof(double));

    // Bitwise AND (&)
    return double_mask_exponent & double_in_uint64_t;
}

// **************************************************************
uint64_t MANTISSA(const double *a)
{
    // Get the double precision binary representation into an integer
    // Necessary for the logical AND
    uint64_t double_in_uint64_t;
    memcpy(&double_in_uint64_t, a, sizeof(double));

    // Bitwise AND (&)
    return double_mask_mantissa & double_in_uint64_t;
}

// **************************************************************
uint64_t XOR_MSB(const double *a, const double *b)
/**
 * See Algorithm 1 of reference.
 * The relative order of the two points is determined by the
 * pair of coordinates who have the ﬁrst differing bit with
 * the highest exponent. The XOR MSB function computes this
 * by ﬁrst comparing the exponents of the coordinates, then
 * comparing the bits in the mantissa, if the exponents
 * are equal.
 */
{

    uint64_t x = EXPONENT(a);
    uint64_t y = EXPONENT(b);
    uint64_t z = 0;

    if (x == y)
    {
        z = MOI_MSDB(MANTISSA(a), MANTISSA(b));
        x = x - z;
        return x;
    }

    if (y < x) return x;
    else       return y;
}

// **************************************************************
uint64_t MSDB(uint64_t a, uint64_t b)
/**
 * Returns the Most Signiﬁcant Differing Bit of two integer arguments
 * See http://compgeom.com/~piyush/papers/tvcg_stann.pdf
 */
{
    if (a == b)
    {
        // Doing a XOR would be useless, as it would give 0
        return uint64_t(0);
    }
    // XOR the too integers. The result will have 0's at bit positions
    // where a and b are the same, and 1 where they are different.
    const uint64_t xor_ab = a ^ b;
    uint64_t i = 64;

    // OR a 0 value to get 1111... Then rigth shift that to get 01111...
    // And finally OR that again to get 100000...
    // We will shift that bit until we find the MSB of xor_ab
    uint64_t xor_ab_msb = ~((~uint64_t(0)) >> uint64_t(1));

    // Until the xor_ab's bit at position xor_ab_msb is 0, do the loop
    while ((xor_ab_msb & xor_ab) == uint64_t(0))
    {
        // Right shift by one the MSB's position
        xor_ab_msb = xor_ab_msb >> uint64_t(1);
    }

    return xor_ab_msb;
}

// **************************************************************
bool Compare_ZOrder(const double a[3], const double b[3])
/**
 * Function to compare two double precision 3D positions in Z-order.
 * @param   a   Position 1
 * @param   b   Position 2
 * @return  true    if a  < b
 *          false   else
 */
{

    // **********************************************************
    // See http://compgeom.com/~piyush/papers/tvcg_stann.pdf
    uint64_t x   = 0;
    uint64_t y   = 0;
    uint64_t dim = 0;

    int d;
    for (d = 0 ; d < 3 ; d++)
    {         //  &(a[d])
        y = XOR_MSB(a+d, b+d);
        if (x < y)
        {
            x   = y;
            dim = d;
        }
    }

    return (a[dim] < b[dim]);
}

Now it does not work at all. I cannot use Compare_ZOrder() for sorting due to Z-Order. The paper talks about its homepage: http://sites.google.com/a/compgeom.com/stann/ where a package exist.

I downloaded it and checked the source, and was surprised to see that the functions are not doing what I though they should be doing. For example, sep_float.hpp at line 143 implements the msdb() function. It does so by using the linux's frexp() function, which is used to get a representation of the number "fraction * 2^exp" where fraction is between 0.5 (inclusive) and 1 (exclusive) and exponent is the real exponent (not the biased from double precision).

So the package is playing with frexp() while I was using directly the bit representation... I still don't see the relation between the package and the algorithm from the paper...

Anybody has a clue??? Thanx...

sr · 2009-11-17 13:14:14

My recollection of IEEE-754 is hazy, but the (fraction, exp) pair from frexp() is almost the binary representation that you're using, modulo endianness: it just adds back the implicit MSB 1 to the binary representation of the mantissa, and removes the offset from the exponent.

Sorry I can't help more

Edit: wikipedia to the rescue: float64

Yet another edit:

From the source in STANN, there is:

The msdb of the two floating point numbers is NOT computed based on the internal representation of the floating point number. The answer is computed based on a theoretical integer representation of the numbers.

so it's definitely not what you're trying to do, since this sounds like it converts the numbers into a general binary representation, and then finds the highest power of 2 where they differ.

Last edited by sr (2009-11-17 13:31:16)

Arch Linux

#1 2009-11-05 23:38:10

Z-ordering (Morton ordering)

#2 2009-11-06 05:09:53

Re: Z-ordering (Morton ordering)

#3 2009-11-06 05:27:32

Re: Z-ordering (Morton ordering)

#4 2009-11-06 15:44:00

Re: Z-ordering (Morton ordering)

#5 2009-11-06 18:05:44

Re: Z-ordering (Morton ordering)

#6 2009-11-06 18:52:00

Re: Z-ordering (Morton ordering)

#7 2009-11-06 20:52:22

Re: Z-ordering (Morton ordering)

#8 2009-11-07 05:05:07

Re: Z-ordering (Morton ordering)

#9 2009-11-17 00:12:23

Re: Z-ordering (Morton ordering)

#10 2009-11-17 13:14:14

Re: Z-ordering (Morton ordering)

Board footer