UsageExamples
Usage examples of pefile
Introduction

Simple code snippets to get you started.

Loading a PE file

Import the module and parse a file.

    import pefile
    pe = pefile.PE('/path/to/pefile.exe')

Optionally, setting the fast_load argument to True will prevent parsing the directories. In large PE files this can make loading significantly faster, and it might be a good idea to use it if none of the information from the data directories is needed.

    import pefile
    pe = pefile.PE('/path/to/pefile.exe', fast_load=True)

A later call to the full_load() method will parse the missing information.

It's also possible to parse raw PE data directly:

    pe = pefile.PE(data=str_object_with_pe_file_data)

Reading and writing standard header members

Once the PE file is successfully parsed, the data is readily available as attributes of the PE instance.

    pe.OPTIONAL_HEADER.AddressOfEntryPoint
    pe.OPTIONAL_HEADER.ImageBase
    pe.FILE_HEADER.NumberOfSections

All of these values support assignment

    pe.OPTIONAL_HEADER.AddressOfEntryPoint = 0xdeadbeef

and a subsequent call to

    pe.write(filename='file_to_write.exe')

will write the modified file to disk.

All the structures and members defined in the PE format should be available with the same names. Some convenient shortcuts exist, for instance the sections list. Usually, all the structures containing a member Characteristics (or similar fields of flags) will contain attributes set to True or False according to the value of the corresponding flag.

Notes about the write support

Starting from pefile 1.2 it's possible to write back any changes made to the PE file. Be careful with this functionality, as it is not very intelligent about reconstructing the PE file: it will not displace structures when that would be needed because a new section or structure has been added. The rule of thumb is that if there's room for an additional header or structure to fit, there will be no problem and pefile will write it. All other modifications, i.e. changing individual values in header or structure members, should work well.

One possibly useful application of this is correcting malformed headers, which some malware uses to make certain analysis tools malfunction.

Iterating through the sections

Sections are added to a list accessible as the attribute sections in the PE instance. The common structure members of the section header are reachable as attributes.

    for section in pe.sections:
        print (section.Name, hex(section.VirtualAddress),
            hex(section.Misc_VirtualSize), section.SizeOfRawData )

Output

    ('.text', '0x1000L', '0x6D72L', 28160L)
    ('.data', '0x8000L', '0x1BA8L', 1536L)
    ('.rsrc', '0xA000L', '0x8948L', 35328L)

Listing the imported symbols

Each directory, if it exists in the PE file being processed, has an entry as DIRECTORY_ENTRY_directoryname in the PE instance. The imported symbols can be listed as follows:

    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(entry.dll)
        for imp in entry.imports:
            print('\t%s %s' % (hex(imp.address), imp.name))

Output

    comdlg32.dll
        0x10012A0L PageSetupDlgW
        0x10012A4L FindTextW
        0x10012A8L PrintDlgExW
    [snip]
    SHELL32.dll
        0x1001154L DragFinish
        0x1001158L DragQueryFileW

Listing the exported symbols

Similarly, the exported symbols can be listed as follows:

    for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
        print('%s %s %d' % (hex(pe.OPTIONAL_HEADER.ImageBase + exp.address), exp.name, exp.ordinal))

Output

    0x7ca0ab4f SHUpdateRecycleBinIcon 336
    0x7cab44c0 SHValidateUNC 173
    0x7ca7b0aa SheChangeDirA 337
    0x7ca7b665 SheChangeDirExA 338
    0x7ca7b3e1 SheChangeDirExW 339
    0x7ca7aec6 SheChangeDirW 340
    0x7ca8baae SheConvertPathW 341

Dumping all the information

    print(pe.dump_info())

This will produce a full textual dump of all the parsed information. Check FullDump0x90, FullDumpTinyPE or FullDumpKernel32 for examples.

Retrieving the bytes at the entry point

We can use pefile together with tools like pydasm to build a small disassembler. A toy example might look like the following. We first fetch the entry point address, then retrieve 100 bytes starting at the entry point, and loop through the data disassembling as we go:

    import pydasm

    ep = pe.OPTIONAL_HEADER.AddressOfEntryPoint
    ep_ava = ep + pe.OPTIONAL_HEADER.ImageBase
    data = pe.get_memory_mapped_image()[ep:ep+100]
    offset = 0
    while offset < len(data):
        i = pydasm.get_instruction(data[offset:], pydasm.MODE_32)
        if not i:
            break  # stop at bytes that cannot be decoded
        print(pydasm.get_instruction_string(i, pydasm.FORMAT_INTEL, ep_ava+offset))
        offset += i.length

Output

    push byte 0x70
    push dword 0x1001888
    call 0x1006ca8
    xor ebx,ebx
    push ebx
    mov edi,[0x100114c]
    call edi
    cmp word [eax],0x5a4d
    jnz 0x1006b1d
    mov ecx,[eax+0x3c]
    add ecx,eax
    cmp dword [ecx],0x4550
    jnz 0x1006b1d
    movzx eax,[ecx+0x18

Parsing only selected directories

Sometimes we might not want to process an entire file if it's very large. Parsing can be time consuming in some cases, and we might only be interested in a subset of the information provided by the headers and directories. It is possible to tell pefile to load only a minimal set of the headers (up to the NT Headers) with the fast_load keyword argument and leave the directories unprocessed. The directories can be parsed later on, on demand.
The following example loads the basic headers and then parses most of the directories, skipping the relocation information (the commented-out line could have been left out altogether).

    import sys
    import pefile

    pe = pefile.PE(sys.argv[1], fast_load=True)
    pe.parse_data_directories(directories=[
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_IMPORT'],
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_EXPORT'],
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_RESOURCE'],
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_DEBUG'],
        # pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC'],  # Do not parse relocations
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_TLS'],
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT'],
        pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT']])
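
Since a DIRECTORY_ENTRY_directoryname attribute is only present when the file has that directory and it has been parsed, code that runs after a fast_load or a partial parse may want to guard its access. The following is a minimal sketch, not part of pefile: the helper name list_imports is hypothetical, and it relies only on the DIRECTORY_ENTRY_IMPORT, dll, imports, address and name attributes used in the import listing earlier on this page.

```python
def list_imports(pe):
    """Return (dll, address, name) tuples for every imported symbol.

    Returns an empty list when the object has no DIRECTORY_ENTRY_IMPORT
    attribute (assumed here to mean the import directory is absent or
    was not parsed).
    """
    results = []
    # getattr with a default avoids an AttributeError when the
    # attribute was never created on the instance.
    for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):
        for imp in entry.imports:
            results.append((entry.dll, imp.address, imp.name))
    return results
```

Called on the pe instance from the example above, this should return the parsed imports, and an empty list if the import directory had been left out of the directories argument.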
pe.OPTIONAL_HEADER.NumberOfSections => pe.FILE_HEADER.NumberOfSections?
Could pefile get the "DigitalSign"?
In your changelog, you say:
# Now it's possible to modify the version information by directly assigning new values to the keys, for instance pe.FileInfo[0].StringTable[0].entries['OriginalFilename'] = 'NewName.exe'
# Other common keys are: LegalCopyright, InternalName, FileVersion, CompanyName, ProductName, ProductVersion, FileDescription, OriginalFilename
This is EXACTLY what I want to do (grab the company name from the exe). However, I'm at a loss as to how to do this. When I type: print pe.FileInfo[0], it returns: Length, ValueLength, Type. When I print that with StringTable[0], I get an error saying there is no attribute. Any help would be appreciated!
It would be nice to have support to modify import and export table entries. any plans?
plusbryan:
I needed to extract the file version info, which I found in the VS_FIXEDFILEINFO attribute. If your file has company info, it may be there too.
something like:

    from pefile import PE
    pe = PE('location_of_your_file.exe')
    print pe.VS_FIXEDFILEINFO

that should print some info, which might include the company name.
I found that section.Name was returning nulls at the end of the returned name string, e.g. .text\x00\x00\x00, .data\x00\x00\x00, which was driving my output parsing insane until I discovered it.
Is it returning that way by design (startOffset-endOffset), or should that have been filtered out for a regular returned string? I didn't see this documented so I'm just wondering. I fixed my issue by just using
section.Name = str(section.Name).replace("\0", "")
Thanks
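
For context, the Name field of a PE section header is a fixed 8-byte field, so shorter names are padded with null bytes. A minimal sketch of stripping that padding without assigning back to section.Name follows; the helper name clean_section_name is hypothetical and not part of pefile, and it assumes the name arrives as either a byte string or a text string.

```python
def clean_section_name(raw_name):
    """Return a section name with its trailing null padding removed."""
    if isinstance(raw_name, bytes):
        # Decode leniently: section names are not guaranteed to be
        # valid text, so undecodable bytes are replaced, not raised on.
        raw_name = raw_name.decode('utf-8', 'replace')
    # Only trailing nulls are padding; anything else is left untouched.
    return raw_name.rstrip('\x00')
```

With a parsed file this could be used as [clean_section_name(s.Name) for s in pe.sections], which leaves the section objects themselves unmodified.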
Viruses and trojans often read overlay data appended to the end of the PE file.
Here's a sample:

    import pefile

    def get_overlay(filename):
        pe = pefile.PE(filename, fast_load=True)
        filebuffer = pe.write()
        # get the last section offset and length
        s = pe.sections[-1]
        if (len(filebuffer) > s.PointerToRawData + s.SizeOfRawData):
            return filebuffer[s.PointerToRawData + s.SizeOfRawData:]
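
The sample above assumes that the last entry in the section table is also the last section in the file. A variation that instead takes the furthest end of raw data over all sections can be sketched as follows. The names overlay_offset and overlay_bytes are hypothetical helpers, not pefile functions, and the code relies only on the PointerToRawData and SizeOfRawData members used in the sample. It makes no attempt to account for data that headers or directories may reference outside any section.

```python
def overlay_offset(sections):
    """Return the file offset just past the raw data of all sections,
    or None when there are no sections."""
    # Each section's raw data ends at PointerToRawData + SizeOfRawData.
    ends = [s.PointerToRawData + s.SizeOfRawData for s in sections]
    return max(ends) if ends else None


def overlay_bytes(file_data, sections):
    """Return the bytes located after all section data, or None if the
    file ends where the section data ends."""
    start = overlay_offset(sections)
    if start is not None and len(file_data) > start:
        return file_data[start:]
    return None
```

Following the sample above, this could be called as overlay_bytes(pe.write(), pe.sections).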