This question is a follow-up to that question.

To set the context of this question, consider Null-free programming. This is a technique to masquerade a sequence of instructions (shellcode) as a string. In the C programming language, the byte 0 marks the end of a string, so the instruction sequence must be designed not to contain any such byte, otherwise it would be truncated by the string-manipulation function being abused.
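As a rough illustration of the constraint described above, the sketch below models what a C string function would copy from a payload and checks whether a byte sequence is null-free. The helper names `c_string_view` and `is_null_free` are hypothetical, chosen only for this example; the two encodings shown (`B8 00 00 00 00` for `mov eax, 0` and `31 C0` for `xor eax, eax`) are the usual 32-bit x86 encodings.

```python
def c_string_view(payload: bytes) -> bytes:
    """Model what a C string function such as strcpy would copy:
    every byte up to, but not including, the first 0 byte."""
    return payload.split(b"\x00", 1)[0]


def is_null_free(payload: bytes) -> bool:
    """True if the payload contains no 0 byte at all."""
    return 0 not in payload


# mov eax, 0   -> B8 00 00 00 00 (the immediate is four 0 bytes)
MOV_EAX_ZERO = bytes.fromhex("b800000000")
# xor eax, eax -> 31 C0 (same effect on eax, no 0 byte)
XOR_EAX_EAX = bytes.fromhex("31c0")

if __name__ == "__main__":
    for name, code in [("mov eax, 0", MOV_EAX_ZERO), ("xor eax, eax", XOR_EAX_EAX)]:
        print(name, code.hex(), "null-free:", is_null_free(code),
              "survives copy:", c_string_view(code).hex())
```

The first encoding would be cut off after its opcode byte, which is why null-free shellcode typically zeroes a register with `xor` instead.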

The IA-32 and x86-64 instruction sets, with their variable-length instructions and no alignment requirement, allow instructions for task B to be decoded at an offset within an existing stream of instructions for task A. This technique was used occasionally in the early days of personal computing (1980s) in order to save space.
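To make the idea concrete, here is a minimal sketch that decodes the same six bytes from two different starting offsets. The `decode` function is a toy written for this example, not a real disassembler: it only knows the four encodings used here (`90` nop, `C3` ret, `B8 imm32` mov eax, `31 C0` xor eax, eax), and the byte sequence itself is an assumption chosen for illustration.

```python
def decode(code: bytes, start: int) -> list:
    """Toy decoder: walks the bytes from `start` and stops at ret
    or at the first byte it does not recognize."""
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0x90:                              # nop, 1 byte
            out.append("nop")
            i += 1
        elif op == 0xC3:                            # ret, 1 byte
            out.append("ret")
            break
        elif op == 0xB8 and i + 5 <= len(code):     # mov eax, imm32, 5 bytes
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, {imm:#010x}")
            i += 5
        elif code[i:i + 2] == b"\x31\xc0":          # xor eax, eax, 2 bytes
            out.append("xor eax, eax")
            i += 2
        else:
            out.append(f"(unknown byte {op:#04x})")
            break
    return out


# Task A at offset 0; task B hides inside the immediate of the mov.
CODE = bytes.fromhex("b8909031c0c3")

if __name__ == "__main__":
    print("offset 0:", decode(CODE, 0))
    print("offset 1:", decode(CODE, 1))
```

Starting at offset 0 the bytes read as a single `mov` followed by `ret`; starting one byte later, the immediate operand of that `mov` reads as `nop; nop; xor eax, eax` before the same `ret`.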

Has the technique of embedding code within code, starting at an offset within the first instruction, already been used, say, as one way to fool anti-virus detection? Does it have a name? If it is useful and has already been used, what is an example? If the attacker is writing the code to start with, it is enough for task A to do nothing in an ostensibly harmless way, which may leave enough leeway to do anything that one could want as task B.

Peter Cordes
Pascal Cuoq
  • The answer is "yes" to all your questions, but does that really help? – Kerrek SB Jul 05 '14 at 12:09
    @KerrekSB “—What is an example? —Yes.” is pretty crummy as an answer, and while “—Does it have a name? —Yes.” works, the answerer can go as far as providing said name in their answer. That will help me google it for further information (I had a hard time finding null-free programming even knowing it existed). – Pascal Cuoq Jul 05 '14 at 12:11
  • Note: I am not asking for a book chapter on the subject. The name of the technique, if it already exists, would answer my question. – Pascal Cuoq Jul 05 '14 at 12:31
  • A [random internet search](https://www.google.com/search?q=malware+hide+offset+instruction) I just made up on the spot suggests "steganography", but the search results themselves seem to contain further terminology. – Kerrek SB Jul 05 '14 at 12:34
  • @KerrekSB Steganography is the branch of computer science concerned with hiding anything into anything. It is a good search term to think of, but it won't lead me to documentation any more than “zeroless programming” led me to http://www.blackhatlibrary.net/Shellcode/Null-free – Pascal Cuoq Jul 05 '14 at 12:42
  • I am not aware of a name specifically for the "two instructions in one" obfuscation ("interleaving" comes to mind, but it's already taken), but this technique is often used in combination with [*Opaque Predicates*](http://reverseengineering.stackexchange.com/q/1669/262). You might get more answers at RE.SE. – DCoder Jul 05 '14 at 13:04
  • I wrote about [what I call "skipping instructions"](https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code/235553#235553) in a codegolf answer, a purely machine-code-size optimisation. It also links to [an ACEGALS description](https://pushbx.org/ecm/doc/acegals.htm#skipping) of those. – ecm May 12 '22 at 14:59

1 Answer

Yes, this has surely been used in situations where obfuscating code is useful: not only in virus programming but also, for example, in software protection and to hinder reverse engineering.

I have used it myself a few times for size coding competitions, and seen several examples in other people's entries.

This technique has naturally been invented and re-invented many times for different processors, so you will find several different names for it. I found names like "overlapping instructions" and "instruction scission".

Some resources:

Jump into the middle of instruction - in IA-32
What is “overlapping instructions” obfuscation?
A new instruction overlapping technique for anti-disassembly and obfuscation of x86 binaries

Glorfindel
Guffa