Ed's Blog

A PhD Student's Musings

Pcmpistri Revisited: Aggregation Modes

Warning: This blog post is not going to be useful to you unless you are trying to understand the pcmpistri instruction. It is not self-contained. See Section 4.1 of the Intel Manual for reference.

A while back I wrote a blog post about how bad the documentation is for the x86 instruction pcmpistri. I wrote the previous blog post on pcmpistri as I was trying to implement the behavior of pcmpistri only for the control byte 0xc. Now I’m trying to add a more general implementation to BAP. This involves reading and understanding the documentation, specifically focusing on the various aggregation modes that effectively control the behavior of pcmpistri. The problem with the documentation is that it’s a literal translation of what the processor is doing, i.e., a description of the circuit that implements the instruction, rather than an abstract description of what’s going on. A more abstract view is helpful when not implementing hardware. In this blog post, I give pseudocode that computes the intres1 intermediate result variable for each of the four aggregation modes.

I hope these are helpful in understanding the various aggregation modes. Even though each one of these cases is simple, it was difficult for me to get here from the documentation. However, you should take these pseudocode algorithms with a grain of salt. I will update them as I find any problems, but I haven’t tested these yet. Each pseudocode listing is a direct translation from my notes of the Intel manual. I have also yet to formally demonstrate that these are equivalent to the specifications in the documentation. I hope to write a blogpost about how you can use BAP’s verification condition algorithms to show such an equivalence.

Notation

r is the register operand, rm is the register or memory operand, and u is the upper bound on the number of elements inside each operand (i.e., 16 for byte-size elements, and 8 for word-size).

Pseudocode

EqualAny

EqualAny sets intres1[j] if rm[j] is a character listed in r.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
intres1 = 0;

for (int j = 0; j < u; j++) {
  // Once the string ends, we cannot find more characters
  if (rm[j] == 0) break;

  for (int i = 0; i < u; i++) {
    // Once the character set ends, stop looking
    if (r[i] == 0) break;

    if (rm[j] == r[i]) {
      // We found a match, move to the next character in string
      intres1[j] = 1; break;
    }
  }
}

Ranges

Ranges sets intres1[j] if r[i] <= rm[j] <= r[i+1] for some even index i. r is a list of ranges.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
intres1 = 0;

for (int j = 0; j < u; j++) {

  if (rm[j] == 0) break;

  for (int i = 0; i < u; i += 2) {
    // XXX: Not sure if next line is correct
    // Intel Manual underspecifies
    if (r[i] == 0 || r[i+1] == 0) { break; }

    // Check if rm[j] is in the range
    if (r[i] <= rm[j] && rm[j] <= r[i+1]) { intres1[j] = 1; break; }
  }
}

EqualEach

EqualEach sets intres1[j] iff rm[j] = r[j].

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
bool rnull = 0, rmnull = 0;
intres1 = -1;

for (int i = 0; i < u; i++) {
   if (r == 0) { rnull = 1; }
   if (rm == 0) { rmnull = 1; }

   if (rnull && rmnull) { break; }
   else
   // If exactly one string is null
   if (rnull != rmnull) { intres1[i] = 0; }
   else
   // If both strings are valid.
   if (rm[i] != r[i]) { intres[i] = 0; }
   // Fall through: strings are valid and the same
}

EqualOrdered

EqualOrdered sets intres1[j] iff the substring in r is present at rm[j].

1
2
3
4
5
6
7
8
9
10
11
12
intres1 = -1;

for (int j = 0; j < u; j++) {
  for (int i = 0, int k = j; i < u-j; k++, i++) {
    // end of substring
    if (r[i] == 0) { break; }
    // the string ended before the substring
    else if (rm[k] == 0) { intres1[j] = 0; break; }
    // the string and substring differed
    else if (r[i] != rm[k]) { intres1[j] = 0; break; }
  }
}

Musings

Ideally we’d be able to compute each intres1[j] independently, but it doesn’t seem to be the case. break can be thought of as using implied state, i.e., whether we executed break or not.

Comments