
The Structure of an EXE File

From the document The Little Black Book of Computer Viruses (pages 61-68)

The EXE header has two parts: a fixed-length portion, and a variable-length table of pointers to segment references in the Load Module, called the Relocation Pointer Table. Since any virus which attacks EXE files must be able to manipulate the data in the EXE Header, we’d better take some time to look at it. Figure 10 is a graphical representation of an EXE file. The meaning of each byte in the header is explained in Table 1.

When DOS loads the EXE, it uses the Relocation Pointer Table to modify all segment references in the Load Module. After that, the segment references in the image of the program loaded into memory point to the correct memory location. Let’s consider an example (Figure 11): Imagine an EXE file with two segments.

The segment at the start of the load module contains a far call to the second segment. In the load module, this call looks like this:

Address      Assembly Language     Machine Code
0000:0150    CALL FAR 0620:0980    9A 80 09 20 06

From this, one can infer that the start of the second segment is 6200H (= 620H x 10H) bytes from the start of the load module.

Figure 10: The layout of an EXE file. (The EXE File Header, which contains the Relocation Pointer Table, is followed by the EXE Load Module.)

Figure 11: An example of relocating code. (On disk, the Relocation Pointer Table entry 0000:0153 points at the segment word of CALL FAR 0620:0980, located at 0000:0150 in the Load Module. In RAM, the Load Module sits above DOS and the PSP at 2130:0000, the call at 2130:0150 reads CALL FAR 2750:0980, and Routine X is at 2750:0980.)

Table 1: Structure of the EXE Header.

Offset  Size  Name                   Description

0       2     Signature              These bytes are the characters M and Z in every EXE file and identify the file as an EXE file. If they are anything else, DOS will try to treat the file as a COM file.

2       2     Last Page Size         Actual number of bytes in the final 512 byte page of the file (see Page Count).

4       2     Page Count             The number of 512 byte pages in the file. The last page may only be partially filled, with the number of valid bytes specified in Last Page Size. For example, a file of 2050 bytes would have Page Count = 5 and Last Page Size = 2.

6       2     Reloc Table Entries    The number of entries in the relocation pointer table.

8       2     Header Paragraphs      The size of the EXE file header in 16 byte paragraphs, including the Relocation Pointer Table. The header is always a multiple of 16 bytes in length.

0AH     2     MINALLOC               The minimum number of 16 byte paragraphs of memory that the program requires to execute. This is in addition to the image of the program stored in the file. If enough memory is not available, DOS will return an error when it tries to load the program.

0CH     2     MAXALLOC               The maximum number of 16 byte paragraphs to allocate to the program when it is executed. This is normally set to FFFF Hex, except for TSRs.

0EH     2     Initial ss             The initial value of the stack segment relative to the start of the code in the EXE file, when the file is loaded. This is modified dynamically by DOS when the file is loaded, to reflect the proper value to store in the ss register.

10H     2     Initial sp             The initial value to set sp to when the program is executed.

12H     2     Checksum               A word oriented checksum value such that the sum of all words in the file is FFFF Hex. If the file is an odd number of bytes long, the last byte is treated as a word with the high byte = 0. Often this checksum is used for nothing, and some compilers do not even bother to set it properly. The INTRUDER virus will not alter the checksum.

14H     2     Initial ip             The initial value for the instruction pointer, ip, when the program is loaded.

16H     2     Initial cs             The initial value of the code segment relative to the start of the code in the EXE file. This is modified by DOS at load time.

18H     2     Relocation Tbl Offset  Offset of the start of the relocation table from the start of the file, in bytes.

1AH     2     Overlay Number         The resident, primary part of a program always has this word set to zero. Overlays will have different values stored here.
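To make the layout in Table 1 concrete, here is a minimal Python sketch that unpacks the fixed-length portion of the header and derives the file size and the start of the Load Module from it. The names ExeHeader, parse_exe_header, file_size_from_header and load_module_offset are hypothetical, chosen only for this illustration. The sketch assumes every field is a little-endian 16-bit word, and it assumes the common convention that a Last Page Size of 0 means the final page is completely full, which Table 1 does not spell out.

```python
import struct
from typing import NamedTuple

# Fixed-length portion of the header: a 2-byte signature followed by
# thirteen little-endian words, 28 (1CH) bytes in total, as in Table 1.
HEADER_FORMAT = "<2s13H"


class ExeHeader(NamedTuple):
    signature: bytes          # offset 00H
    last_page_size: int       # offset 02H
    page_count: int           # offset 04H
    reloc_entries: int        # offset 06H
    header_paragraphs: int    # offset 08H
    minalloc: int             # offset 0AH
    maxalloc: int             # offset 0CH
    initial_ss: int           # offset 0EH
    initial_sp: int           # offset 10H
    checksum: int             # offset 12H
    initial_ip: int           # offset 14H
    initial_cs: int           # offset 16H
    reloc_table_offset: int   # offset 18H
    overlay_number: int       # offset 1AH


def parse_exe_header(data: bytes) -> ExeHeader:
    """Unpack the first 28 bytes of an EXE file into named fields."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("too short to contain an EXE header")
    header = ExeHeader(*struct.unpack_from(HEADER_FORMAT, data))
    if header.signature != b"MZ":
        raise ValueError("missing MZ signature; DOS would treat this as a COM file")
    return header


def file_size_from_header(header: ExeHeader) -> int:
    """File size in bytes implied by Page Count and Last Page Size."""
    if header.last_page_size == 0:
        # Assumption: 0 means the last 512 byte page is completely used.
        return header.page_count * 512
    return (header.page_count - 1) * 512 + header.last_page_size


def load_module_offset(header: ExeHeader) -> int:
    """Byte offset in the file at which the Load Module begins."""
    return header.header_paragraphs * 16


# A made-up header describing the 2050 byte file from the Page Count example.
sample = parse_exe_header(struct.pack(
    HEADER_FORMAT, b"MZ", 2, 5, 0, 2, 0, 0xFFFF, 0, 0x100, 0, 0, 0, 0x1C, 0))
print(file_size_from_header(sample))   # 2050
print(load_module_offset(sample))      # 32
```

The sample header is invented for the demonstration and does not come from any real program. To inspect an actual file, one would pass its first 28 bytes to parse_exe_header.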

The Relocation Pointer Table would contain a vector 0000:0153 to point to the segment reference (20 06) of this far call. When DOS loads the program, it might load it starting at segment 2130H, because DOS and some memory resident programs occupy locations below this. So DOS would first load the Load Module into memory at 2130:0000. Then it would take the relocation pointer 0000:0153 and transform it into a pointer, 2130:0153, which points to the segment in the far call in memory. DOS will then add 2130H to the word in that location, resulting in the machine language code 9A 80 09 50 27, or CALL FAR 2750:0980 (see Figure 11).
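The arithmetic in this example can be checked with a short Python sketch that mimics what the loader does to a load module image held in memory. The function apply_relocations and the 200H byte dummy image are assumptions made for this illustration, not DOS code. Each relocation pointer is treated as a segment:offset pair relative to the start of the Load Module, and the load segment is added to the 16-bit word found there.

```python
import struct


def apply_relocations(image: bytearray, relocs, load_segment: int) -> bytearray:
    """Add load_segment to every segment word named by a relocation pointer.

    relocs is a list of (segment, offset) pairs, both relative to the
    start of the load module, as stored in the Relocation Pointer Table.
    """
    for seg, off in relocs:
        pos = seg * 16 + off                       # position within the image
        (value,) = struct.unpack_from("<H", image, pos)
        fixed = (value + load_segment) & 0xFFFF    # 16-bit wraparound
        struct.pack_into("<H", image, pos, fixed)
    return image


# Dummy load module: all zeros except the far call at offset 0150H.
image = bytearray(0x200)
image[0x150:0x155] = bytes.fromhex("9A 80 09 20 06")   # CALL FAR 0620:0980

# One relocation pointer, 0000:0153, and a load segment of 2130H.
apply_relocations(image, [(0x0000, 0x0153)], 0x2130)

print(image[0x150:0x155].hex(" ").upper())   # 9A 80 09 50 27
```

The offset part of the far call (80 09) is left alone. Only the segment word changes, from 0620H to 0620H + 2130H = 2750H.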

Note that a COM program requires none of these calisthenics since it contains no segment references. Thus, DOS just has to set the segment registers all to one value before passing control to the program.

Infecting an EXE File

A virus that is going to infect an EXE file will have to modify the EXE Header and the Relocation Pointer Table, as well as adding its own code to the Load Module. This can be done in a whole variety of ways, some of which require more work than others. The INTRUDER virus will attach itself to the end of an EXE program and gain control when the program first starts. This will require a routine similar to that in TIMID, which copies program code from memory to a file on disk, and then adjusts the file.

INTRUDER will have its very own code, data and stack segments. A universal EXE virus cannot make any assumptions about how those segments are set up by the host program. It would crash as soon as it finds a program where those assumptions are violated. For example, if one were to use whatever stack the host program was initialized with, the stack could end up right in the middle of the virus code with the right host. (That memory would have been free space before the virus had infected the program.) As soon as the virus started making calls or pushing data onto the stack, it would corrupt its own code and self-destruct.

To set up segments for the virus, new initial segment values for cs and ss must be placed in the EXE file header. Also, the old initial segments must be stored somewhere in the virus, so it can pass control back to the host program when it is finished executing. We will have to put two pointers to these segment references in the relocation pointer table, since they are relocatable references inside the virus code segment.

Adding pointers to the relocation pointer table brings up an important question. To add pointers to the relocation pointer table, it may sometimes be necessary to expand that table’s size. Since the EXE Header must be a multiple of 16 bytes in size, relocation pointers are allocated in blocks of four 4-byte pointers. Thus, if we can keep the number of segment references down to two, it will be necessary to expand the header only every other time.

Alternatively, the virus may choose not to infect the file, rather than expanding the header. There are pros and cons for both possibilities. On the one hand, a load module can be hundreds of kilobytes long, and moving it is a time-consuming chore that can make it very obvious that something is going on that shouldn’t be.

On the other hand, if the virus chooses not to move the load module, then roughly half of all EXE files will be naturally immune to infection. The INTRUDER virus will take the quiet and cautious approach that does not infect every EXE. You might want to try the other approach as an exercise, and move the load module only when necessary, and only for relatively small files (pick a maximum size).

Suppose the main virus routine looks something like this:

VSEG    SEGMENT
VIRUS:
        mov     ax,cs                   ;set ds=cs for virus
        mov     ds,ax
        .
        .
        .
        mov     ax,SEG HOST_STACK       ;restore host stack
        cli
        mov     ss,ax
        mov     sp,OFFSET HOST_STACK
        sti
        jmp     FAR PTR HOST            ;go execute host

Then, to infect a new file, the copy routine must perform the following steps:

1. Read the EXE Header in the host program.

2. Extend the size of the load module until it is an even multiple of 16 bytes, so cs:0000 will be the first byte of the virus.

3. Write the virus code currently executing to the end of the EXE file being attacked.

4. Write the initial values of ss:sp, as stored in the EXE Header, to the locations of SEG HOST_STACK and OFFSET HOST_STACK on disk in the above code.

5. Write the initial value of cs:ip in the EXE Header to the location of FAR PTR HOST on disk in the above code.

6. Store Initial ss=SEG VSTACK, Initial sp=OFFSET VSTACK, Initial cs=SEG VSEG, and Initial ip=OFFSET VIRUS in the EXE header in place of the old values.

7. Add two to the Relocation Table Entries in the EXE header.

8. Add two relocation pointers at the end of the Relocation Pointer Table in the EXE file on disk (the location of these pointers is calculated from the header). The first pointer must point to SEG HOST_STACK in the instruction

        mov     ax,SEG HOST_STACK

The second should point to the segment part of the

        jmp     FAR PTR HOST

instruction in the main virus routine.

9. Recalculate the size of the infected EXE file, and adjust the header fields Page Count and Last Page Size accordingly.

10. Write the new EXE Header back out to disk.

All the initial segment values must be calculated from the size of the load module which is being infected. The code to accomplish this infection is in the routine INFECT in Appendix B.
