Didier Stevens

Tuesday 20 January 2015

YARA Rule: Detecting JPEG Exif With eval()

Filed under: Forensics,Malware — Didier Stevens @ 20:39

My first release of 2015 was a new YARA rule to detect JPEG images with an eval() function inside their Exif data.

Such images are not new, but I needed an example to develop a complex YARA rule:

rule JPEG_EXIF_Contains_eval
{
    meta:
        author = "Didier Stevens (https://DidierStevens.com)"
        description = "Detect eval function inside JPG EXIF header (http://blog.sucuri.net/2013/07/malware-hidden-inside-jpg-exif-headers.html)"
        method = "Detect JPEG file and EXIF header ($a) and eval function ($b) inside EXIF data"
    strings:
        $a = {FF E1 ?? ?? 45 78 69 66 00}
        $b = /\Weval\s*\(/
    condition:
        uint16be(0x00) == 0xFFD8 and $a and $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) - 0x06)
}

Here is an example of such an image:

[Image: 20150120-204954]

The YARA rule has 3 conditions that must be satisfied:

  1. JPEG magic header FFD8, tested with: uint16be(0x00) == 0xFFD8
  2. Exif structure: FF E1 ?? ?? 45 78 69 66 00
  3. eval function inside Exif data, tested with a regular expression: \Weval\s*\(

Condition 1 is straightforward: the file must start with FFD8. I test this with uint16be(0x00) == 0xFFD8 instead of searching for {FF D8} at 0x00, because FF D8 is a very short string and searching for it can cause performance problems (YARA warns you when it compiles rules with such short strings).
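As a rough illustration outside YARA, the same test can be sketched in Python. The function name is made up for this example; only the comparison with 0xFFD8 comes from the rule.

```python
import struct


def starts_with_jpeg_magic(data: bytes) -> bool:
    """Mimic YARA's uint16be(0x00) == 0xFFD8 on a byte buffer."""
    if len(data) < 2:
        return False  # too short to hold the two magic bytes
    # ">H" reads one unsigned 16-bit integer in big-endian byte order.
    (magic,) = struct.unpack_from(">H", data, 0)
    return magic == 0xFFD8


print(starts_with_jpeg_magic(b"\xff\xd8\xff\xe1"))  # JPEG-like start
print(starts_with_jpeg_magic(b"\x89PNG"))           # not a JPEG
```

Reading the two bytes as one integer is a single comparison at a fixed offset, which is why it avoids the cost of scanning the whole file for a 2-byte string.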

Condition 2 checks for the presence of the Exif data header. Bytes 3 and 4 (?? ??) encode the length of the Exif data as a big-endian 16-bit integer.
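A minimal Python sketch of this step, assuming the whole file is already in memory: the regular expression mirrors the hex string $a, and the helper name find_exif_segment is hypothetical.

```python
import re
import struct

# Same pattern as $a: FF E1, two wildcard bytes, then "Exif" and a NUL byte.
# re.DOTALL lets "." match any byte value, including 0x0A.
EXIF_HEADER = re.compile(rb"\xff\xe1..Exif\x00", re.DOTALL)


def find_exif_segment(data: bytes):
    """Return (offset, length_field) of the first Exif header, or None."""
    match = EXIF_HEADER.search(data)
    if match is None:
        return None
    offset = match.start()  # corresponds to @a in the YARA rule
    # The two wildcard bytes hold the length, stored big-endian.
    (length_field,) = struct.unpack_from(">H", data, offset + 2)
    return offset, length_field


print(find_exif_segment(b"\xff\xd8\xff\xe1\x00\x20Exif\x00\x00"))
```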

Condition 3 checks for the presence of the eval function. To reduce the false positives that a search for the bare string eval would produce, we use a regular expression that matches eval, optionally followed by whitespace characters (\s*), and then an opening parenthesis: \(.

We also don’t want letters or digits directly before eval (a string like deval must not match): eval must be the start of a word. In regular expressions, you normally express this with a word boundary: \b. So our regular expression would be \beval\s*\(. Unfortunately, YARA’s regular expression engine does not support word boundaries, so I had to come up with something else: I match any non-word character (anything that is not a letter, digit or underscore) with \W.

Be warned that there is a small difference between \W and \b. \b also matches at the beginning of a string (like ^), while \W has to match an actual character. So the regular expression I use is \Weval\s*\(.
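The difference between the two expressions can be tried out in Python, whose re module does support \b. This is only a sketch of the regex behaviour, not of YARA’s engine, and the sample strings are invented for the demonstration.

```python
import re

with_boundary = re.compile(r"\beval\s*\(")  # the expression we would like to use
with_nonword = re.compile(r"\Weval\s*\(")   # the expression the rule uses

samples = ["x = eval(", ";eval (", "deval(", "eval("]
for sample in samples:
    print(
        repr(sample),
        "\\b:", bool(with_boundary.search(sample)),
        "\\W:", bool(with_nonword.search(sample)),
    )
```

Both expressions reject deval(. Only the last sample separates them: eval( at the very start of the input matches with \b, but not with \W, because there is no character in front of it for \W to consume.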

The eval function must also be found inside the Exif data. We don’t want to trigger on the eval function if it is found somewhere else in the image. That’s where YARA’s in ( .. ) syntax comes in.

The first 18 bytes of the Exif structure are various headers which we ignore, so our eval function $b must start at @a + 0x12 or further.
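The post does not break down these 18 bytes. The sketch below is my reading of the usual Exif APP1 layout (marker, length, "Exif" identifier, TIFF header), so treat the field breakdown as an assumption; the length value 0x0040 is just an example.

```python
import struct

# Assumed layout of the start of an Exif APP1 segment (not spelled out in the post).
marker = b"\xff\xe1"                             # APP1 marker: 2 bytes
length = struct.pack(">H", 0x0040)               # segment length: 2 bytes (example value)
exif_id = b"Exif\x00\x00"                        # identifier plus padding: 6 bytes
tiff_header = b"MM" + struct.pack(">HI", 42, 8)  # byte order, magic 42, IFD offset: 8 bytes

fixed_headers = marker + length + exif_id + tiff_header
print(len(fixed_headers), hex(len(fixed_headers)))
```

Under that assumption the fixed part adds up to 0x12 bytes, which is the offset the rule skips before it starts looking for $b.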

The total size of the Exif structure is given by the expression 0x02 + uint16be(@a + 0x02): 2 bytes for the FF E1 marker plus the length stored right after it. We add this to the start of the Exif header (@a) to get the end of the structure: @a + 0x02 + uint16be(@a + 0x02). Finally, we have to subtract the size of the string matched by the regular expression, so that the whole match lies inside the Exif structure. Unfortunately, YARA has no function to calculate this length, so we use the minimum length our regular expression can match: 6 characters. Our eval function $b must therefore start no further than @a + 0x02 + uint16be(@a + 0x02) - 0x06. Putting all this together gives: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) - 0x06)
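To make the offset arithmetic concrete, here is a Python sketch that emulates the complete condition on an in-memory buffer. It is an approximation under stated assumptions, not YARA’s implementation: the function names and the make_jpeg helper are hypothetical, and the test images are tiny fabricated byte strings rather than real photos.

```python
import re
import struct

EXIF_HEADER = re.compile(rb"\xff\xe1..Exif\x00", re.DOTALL)  # plays the role of $a
EVAL_CALL = re.compile(rb"\Weval\s*\(")                       # plays the role of $b
MIN_MATCH_LEN = 6  # shortest $b match: one non-word byte + "eval" + "("


def eval_search_range(data: bytes, a: int) -> tuple:
    """Return the inclusive (low, high) offsets at which $b may start."""
    (length_field,) = struct.unpack_from(">H", data, a + 0x02)
    low = a + 0x12                                  # skip the 18 header bytes
    high = a + 0x02 + length_field - MIN_MATCH_LEN  # last start that can still fit
    return low, high


def jpeg_exif_contains_eval(data: bytes) -> bool:
    """Emulate: uint16be(0) == 0xFFD8 and $a and $b in (low .. high)."""
    if data[:2] != b"\xff\xd8":
        return False
    header = EXIF_HEADER.search(data)  # first occurrence, like @a
    if header is None:
        return False
    low, high = eval_search_range(data, header.start())
    return any(low <= m.start() <= high for m in EVAL_CALL.finditer(data))


def make_jpeg(exif_payload: bytes, trailer: bytes = b"") -> bytes:
    """Build a tiny fake JPEG (hypothetical helper, for the demo only)."""
    body = b"Exif\x00\x00" + b"MM\x00\x2a\x00\x00\x00\x08" + exif_payload
    segment = b"\xff\xe1" + struct.pack(">H", len(body) + 2) + body
    return b"\xff\xd8" + segment + trailer


inside = make_jpeg(b";eval(base64_decode('x'));")
outside = make_jpeg(b"harmless", trailer=b"\xff\xd9 ;eval(")
print(jpeg_exif_contains_eval(inside), jpeg_exif_contains_eval(outside))
```

The second sample contains eval( only after the end of the Exif structure, so it falls outside the range and does not trigger.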

FYI: Victor told me that he plans to add a string length function to YARA, so our condition will then become: $b in (@a + 0x12 .. @a + 0x02 + uint16be(@a + 0x02) - &b)

You can find all my YARA rules here: YARA Rules.

6 Comments »

  1. Why don’t you use the “fullword” statement and check for eval?
    It does the “\b” on string or regex expressions.

    Comment by mapache14 — Wednesday 21 January 2015 @ 6:30

  2. @mapache14 because I need \b on the left of the regex, but not on its right.

    Comment by Didier Stevens — Wednesday 21 January 2015 @ 7:17

  3. @mapache14 Here is a test to illustrate why I don’t use fullword:

    rule:

    rule testing_regex_fullword
    {
    strings:
    $re = /eval\s*\(/
    $re_fullword = /eval\s*\(/ fullword
    condition:
    $re or $re_fullword
    }

    text file:
    .eval(
    .eval(base64("1234"))
    .eval( base64("1234"))

    result:
    yara32.exe -s test-regex-fullword.yara test-regex-fullword.txt
    testing_regex_fullword test-regex-fullword.txt
    0x1:$re: eval(
    0x9:$re: eval(
    0x20:$re: eval(
    0x1:$re_fullword: eval(
    0x20:$re_fullword: eval(

    You can see that "$re_fullword = /eval\s*\(/ fullword" did not match the string ".eval(base64("1234"))"

    Comment by Didier Stevens — Wednesday 21 January 2015 @ 9:41

  4. Thanks for your rule and the explanation.
    I’ll use it with some strings instead of one regex which includes a “*” character that makes the regex matching pretty slow.

    I’ll test it tonight in the hotel, but would rather use the following:
    $s1 = "eval" fullword ascii
    $s2 = "eval(" ascii
    $s3 = "base64"
    condition:
    $s1 and ( $s2 or $s3 )

    Regex is much slower than simple string matching. This is not very important if you use only a few rules on a few files. But this matters if you add this rule to a bigger rule set and let these rules scan a whole system or directory with a lot of files.
    You should at least restrict the regex using a maximum length:
    $re_fullword = /eval[\s]{0,20}\(/

    Thanks for your work,

    Best regards,
    Florian

    Comment by Florian Roth — Wednesday 21 January 2015 @ 13:04

  5. @Florian My experience with performance tuning (in general) is that you have to profile first.

    You have to know what code will execute most often (for example inside loops), and then you optimize that code for speed.

    With YARA, it’s not easy to know.

    Since you propose a rule that is less generic than mine, I would profile both rules and only keep the less generic rule if it provides a significant improvement in speed.

    Comment by Didier Stevens — Wednesday 21 January 2015 @ 13:21

  6. […] Now that YARA version 3.3.0 supports word boundaries in regular expressions, I’ve updated my YARA Rule for Detecting JPEG Exif With eval(). […]

    Pingback by Update: YARA Rule JPEG_EXIF_Contains_eval | Didier Stevens — Sunday 15 February 2015 @ 11:21

