[Android Original] Unicorn, a powerful tool for assembly and disassembly

Richor posted on 2019-9-19 15:40
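First, let's talk about what Unicorn is actually good for.
Unicorn is an excellent cross-platform CPU emulation framework. Whatever the host platform, it can emulate native code for the Arm, Arm64 (Armv8), M68K, Mips, Sparc, and X86 (including X86_64) instruction sets.
That sounds rather official, so here is an example. When you study OLLVM-obfuscated code, tracking down function addresses is a real headache. With Unicorn you can print the address each native function is registered at, together with its name and signature. Let me demonstrate Unicorn's power on a .so from 某音 (Douyin):
[Asm]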
RegisterNatives dvmClass=com/ss/android/common/applog/UserInfo, name=getUserInfo, signature=(ILjava/lang/String;[Ljava/lang/String;)Ljava/lang/String;, [email protected][libcms.so]0x2c6c5
RegisterNatives dvmClass=com/ss/android/common/applog/UserInfo, name=getUserInfo, signature=(ILjava/lang/String;[Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;, [email protected][libcms.so]0x2c6dd
RegisterNatives dvmClass=com/ss/android/common/applog/UserInfo, name=getUserInfoSkipGet, signature=(ILjava/lang/String;[Ljava/lang/String;)Ljava/lang/String;, [email protected][libcms.so]0x2c7b1
RegisterNatives dvmClass=com/ss/android/common/applog/UserInfo, name=getUserInfo, signature=(I[Ljava/lang/String;[Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;, [email protected][libcms.so]0x2c7d1
RegisterNatives dvmClass=com/ss/android/common/applog/UserInfo, name=getPackage, signature=(Ljava/lang/String;)V, [email protected][libcms.so]0x2e0dd


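This quickly gives you the addresses of the UserInfo functions, which makes both hooking and live debugging far less work.
(Back when I had to hunt for function addresses by hand, it nearly drove me to tears.)

All right, let's get started with Unicorn.
Unicorn quick start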
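Multi-architecture
Unicorn is an emulation framework built on the QEMU emulator. It supports the Arm, Arm64 (Armv8), M68K, Mips, Sparc, and X86 (including X86_64) instruction sets.

Multi-language
Unicorn provides bindings for many languages, including C/C++, Python, and Java. Its DLL can also be called from other languages such as 易語言 (Easy Programming Language) and Delphi, so it has a bright future.

Thread safety
Unicorn was designed with thread safety in mind, so code can be emulated concurrently, which makes it much more practical.

Virtual memory
Unicorn uses a virtual memory mechanism that isolates the virtual CPU's memory from the real CPU's memory. Memory is manipulated through the following APIs:

uc_mem_map
uc_mem_read
uc_mem_write
When mapping memory with uc_mem_map, both address and size must be aligned to 0x1000, that is, they must be integer multiples of 0x1000; otherwise the call fails with UC_ERR_ARG. How to allocate and manage memory dynamically and implement libc's malloc will be covered in a later lesson.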
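The alignment rule can be handled with a small helper that rounds an arbitrary range out to page boundaries before mapping. This is only a sketch: `align_mapping` and `PAGE_SIZE` are names made up for illustration, not part of Unicorn.

```python
PAGE_SIZE = 0x1000  # mapping granularity required by uc_mem_map


def align_mapping(address, size, page_size=PAGE_SIZE):
    """Return (base, length), both page-aligned, covering [address, address + size)."""
    base = address & ~(page_size - 1)                                 # round start down
    end = (address + size + page_size - 1) & ~(page_size - 1)         # round end up
    return base, end - base


base, length = align_mapping(0x10010, 0x1234)
print(hex(base), hex(length))
```

The returned pair can then be passed to mem_map, for example `mu.mem_map(base, length)`, so the call does not fail with UC_ERR_ARG because of an unaligned address or size.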
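Hook mechanism
Unicorn's hook mechanism makes it convenient to control the virtual CPU programmatically.
Unicorn supports many different types of hooks.
They fall roughly into the following groups (each name is a Unicorn constant passed as the first argument of hook_add):
Instruction execution

UC_HOOK_INTR
UC_HOOK_INSN
UC_HOOK_CODE
UC_HOOK_BLOCK
Memory access

UC_HOOK_MEM_READ
UC_HOOK_MEM_WRITE
UC_HOOK_MEM_FETCH
UC_HOOK_MEM_READ_AFTER
UC_HOOK_MEM_PROT
UC_HOOK_MEM_FETCH_INVALID
UC_HOOK_MEM_INVALID
UC_HOOK_MEM_VALID
Exception handling

UC_HOOK_MEM_READ_UNMAPPED
UC_HOOK_MEM_WRITE_UNMAPPED
UC_HOOK_MEM_FETCH_UNMAPPED
Call hook_add to add a hook. Unicorn hooks are chained rather than overriding like traditional hooks: you can add several hooks of the same type at once, and Unicorn calls each handler in turn. A hook callback also has a scope (see the begin parameter of hook_add).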
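The chaining and scoping described above can be tried with two UC_HOOK_CODE hooks on the same emulator. This is a minimal sketch, assuming the unicorn Python package is installed (the demo is skipped otherwise). `record` and `run_demo` are illustrative names, and the two ARM instructions are `mov r0, #0x37` and `sub r1, r2, r3`.

```python
try:
    from unicorn import Uc, UC_ARCH_ARM, UC_MODE_ARM, UC_HOOK_CODE
except ImportError:  # unicorn is not installed; the demo below is skipped
    Uc = None

ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"  # mov r0, #0x37 ; sub r1, r2, r3
ADDRESS = 0x10000


def record(uc, address, size, seen):
    # user_data is the list that this particular hook appends to
    seen.append(address)


def run_demo():
    mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)
    mu.mem_map(ADDRESS, 0x1000)
    mu.mem_write(ADDRESS, ARM_CODE)

    everywhere, second_only = [], []
    # Hook 1: no begin/end given, so it covers every executed instruction.
    mu.hook_add(UC_HOOK_CODE, record, everywhere)
    # Hook 2: same type, but limited to the second instruction only.
    mu.hook_add(UC_HOOK_CODE, record, second_only,
                begin=ADDRESS + 4, end=ADDRESS + 4)

    mu.emu_start(ADDRESS, ADDRESS + len(ARM_CODE))
    return everywhere, second_only


if Uc is not None:
    everywhere, second_only = run_demo()
    print([hex(a) for a in everywhere], [hex(a) for a in second_only])
```

Both hooks stay registered side by side: the first list collects every executed address, while the second only collects addresses inside its begin/end range.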
Let's write a simple example.
First install the Unicorn package:
[Asm]
pip install unicorn

Then create a new .py file:
[Python]
from unicorn import *
from unicorn.arm_const import *

ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"


# mov r0, #0x37;
# sub r1, r2, r3
# Test ARM

# callback for tracing instructions
def hook_code(uc, address, size, user_data):
    print(">>> Tracing instruction at 0x%x, instruction size = 0x%x" % (address, size))


def test_arm():
    print("Emulate ARM code")
    try:
        # Initialize emulator in ARM mode (ARM_CODE is ARM-encoded, not Thumb)
        mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)  # create the Uc object

        # map 2MB memory for this emulation
        ADDRESS = 0x10000
        mu.mem_map(ADDRESS, 2 * 1024 * 1024)
        mu.mem_write(ADDRESS, ARM_CODE)  # copy ARM_CODE into the mapped memory; only bytes are accepted

        # set register values before starting the emulator
        mu.reg_write(UC_ARM_REG_R0, 0x1234)
        mu.reg_write(UC_ARM_REG_R2, 0x6789)
        mu.reg_write(UC_ARM_REG_R3, 0x3333)
        # add an instruction-level hook (uncomment to trace the first instruction)
        # mu.hook_add(UC_HOOK_CODE, hook_code, begin=ADDRESS, end=ADDRESS)

        # emulate machine code in infinite time (start the emulator)
        mu.emu_start(ADDRESS, ADDRESS + len(ARM_CODE))
        print("Emulation done")
        # read back the register results
        r0 = mu.reg_read(UC_ARM_REG_R0)
        r1 = mu.reg_read(UC_ARM_REG_R1)
        print(">>> R0 = 0x%x" % r0)
        print(">>> R1 = 0x%x" % r1)
    except UcError as e:
        print("ERROR: %s" % e)

test_arm()

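I have commented all the key parts, so it should be pretty clear.
Let's look at the result of running it:

R0 has become 0x37 and R1 = 0x3456.
But we never explicitly did anything to R1 above, so why does it hold a value?

This is where the second assembly tool, Capstone, comes in.
ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0" is in fact a pair of instructions that operate on the registers.
Let's use Capstone to translate the bytes and see which instructions they are.
First install the package:
[Asm]
pip install capstone

Create a .py file:
[Python]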
from capstone import *
from capstone.arm import *

CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"

md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
for i in md.disasm(CODE, 0x1000):
    print("%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))

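Check the output:

That should be easy to read. It is plain ARM code: the second instruction is sub r1, r2, r3, that is, R1 = R2 - R3 = 0x6789 - 0x3333 = 0x3456.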
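The two 32-bit words in ARM_CODE can also be decoded by hand. The sketch below is not from the original post: `decode` and `OPCODES` are illustrative names, and it only handles the two ARM data-processing opcodes that appear here (no shifts, no condition codes).

```python
import struct

ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"

# Only the two data-processing opcodes that occur in ARM_CODE (illustrative table).
OPCODES = {0b0010: "sub", 0b1101: "mov"}


def decode(word):
    """Decode one ARM data-processing instruction word into text."""
    opcode = (word >> 21) & 0xF   # bits 24..21 select the operation
    rn = (word >> 16) & 0xF       # first source register
    rd = (word >> 12) & 0xF       # destination register
    if (word >> 25) & 1:          # I bit set: operand 2 is a rotated 8-bit immediate
        rot = ((word >> 8) & 0xF) * 2
        imm8 = word & 0xFF
        value = ((imm8 >> rot) | (imm8 << (32 - rot))) & 0xFFFFFFFF
        operand2 = "#0x%x" % value
    else:                         # operand 2 is register Rm (shift bits ignored here)
        operand2 = "r%d" % (word & 0xF)
    name = OPCODES[opcode]
    if name == "mov":             # mov has no first source register
        return "mov r%d, %s" % (rd, operand2)
    return "%s r%d, r%d, %s" % (name, rd, rn, operand2)


words = struct.unpack("<2I", ARM_CODE)  # the instruction words are little-endian
for w in words:
    print(decode(w))

# The sub computes R1 = R2 - R3 with the values written before emulation.
print(hex(0x6789 - 0x3333))
```

This explains the R1 value: nobody wrote R1 directly, but the second instruction stores R2 - R3 in it.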

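Next, you surely want to know how to print addresses, and how to make Unicorn single-step like an ordinary emulator, right?

無名 wrote a debugger for exactly this, so let's look at its source code.
(I'm a die-hard fan of 無名.)
[Python]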
from unicorn import *
from unicorn import arm_const
from unicorn.arm_const import *
import sys
import hexdump
import capstone as cp

BPT_EXECUTE = 1
BPT_MEMREAD = 2
UDBG_MODE_ALL = 1
UDBG_MODE_FAST = 2

REG_ARM = {arm_const.UC_ARM_REG_R0: "R0",
           arm_const.UC_ARM_REG_R1: "R1",
           arm_const.UC_ARM_REG_R2: "R2",
           arm_const.UC_ARM_REG_R3: "R3",
           arm_const.UC_ARM_REG_R4: "R4",
           arm_const.UC_ARM_REG_R5: "R5",
           arm_const.UC_ARM_REG_R6: "R6",
           arm_const.UC_ARM_REG_R7: "R7",
           arm_const.UC_ARM_REG_R8: "R8",
           arm_const.UC_ARM_REG_R9: "R9",
           arm_const.UC_ARM_REG_R10: "R10",
           arm_const.UC_ARM_REG_R11: "R11",
           arm_const.UC_ARM_REG_R12: "R12",
           arm_const.UC_ARM_REG_R13: "R13",
           arm_const.UC_ARM_REG_R14: "R14",
           arm_const.UC_ARM_REG_R15: "R15",
           arm_const.UC_ARM_REG_PC: "PC",
           arm_const.UC_ARM_REG_SP: "SP",
           arm_const.UC_ARM_REG_LR: "LR"
           }

REG_TABLE = {UC_ARCH_ARM: REG_ARM}


def str2int(s):
    if s.startswith('0x') or s.startswith("0X"):
        return int(s[2:], 16)
    return int(s)


def advance_dump(data, base):
    PY3K = sys.version_info >= (3, 0)
    generator = hexdump.genchunks(data, 16)
    retstr = ''
    for addr, d in enumerate(generator):
        # 00000000:
        line = '%08X: ' % (base + addr * 16)
        # 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
        dumpstr = hexdump.dump(d)
        line += dumpstr[:8 * 3]
        if len(d) > 8:  # insert separator if needed
            line += ' ' + dumpstr[8 * 3:]
        # ................
        # calculate indentation, which may be different for the last line
        pad = 2
        if len(d) < 16:
            pad += 3 * (16 - len(d))
        if len(d) <= 8:
            pad += 1
        line += ' ' * pad

        for byte in d:
            # printable ASCII range 0x20 to 0x7E
            if not PY3K:
                byte = ord(byte)
            if 0x20 <= byte <= 0x7E:
                line += chr(byte)
            else:
                line += '.'
        retstr += line + '\n'
    return retstr


def _dbg_trace(mu, address, size, self):
    self._tracks.append(address)
    if not self._is_step and self._tmp_bpt == 0:
        if address not in self._list_bpt:
            return

    if self._tmp_bpt != address and self._tmp_bpt != 0:
        return

    return _dbg_trace_internal(mu, address, size, self)


def _dbg_memory(mu, access, address, length, value, self):
    pc = mu.reg_read(arm_const.UC_ARM_REG_PC)
    print("memory error: pc: %x access: %x address: %x length: %x value: %x" %
          (pc, access, address, length, value))
    _dbg_trace_internal(mu, pc, 4, self)
    mu.emu_stop()
    return True


def _dbg_trace_internal(mu, address, size, self):
    self._is_step = False
    print("======================= Registers =======================")
    self.dump_reg()
    print("======================= Disassembly =====================")
    self.dump_asm(address, size * self.dis_count)

    while True:
        raw_command = input(">")
        if raw_command == '':
            raw_command = self._last_command
        self._last_command = raw_command
        command = []
        for c in raw_command.split(" "):
            if c != "":
                command.append(c)
        try:
            if command[0] == 'set':
                if command[1] == 'reg':  # set reg regname value
                    self.write_reg(command[2], str2int(command[3]))
                elif command[1] == 'bpt':
                    self.add_bpt(str2int(command[2]))
                else:
                    print("[Debugger Error]command error see help.")

            elif command[0] == 's' or command[0] == 'step':
                # self._tmp_bpt = address + size
                self._tmp_bpt = 0
                self._is_step = True
                break
            elif command[0] == 'n' or command[0] == 'next':
                self._tmp_bpt = address + size
                self._is_step = False
                break

            elif command[0] == 'r' or command[0] == 'run':
                self._tmp_bpt = 0
                self._is_step = False
                break
            elif command[0] == 'dump':
                if len(command) >= 3:
                    nsize = str2int(command[2])
                else:
                    nsize = 4 * 16
                self.dump_mem(str2int(command[1]), nsize)
            elif command[0] == 'list':
                if command[1] == 'bpt':
                    self.list_bpt()
            elif command[0] == 'del':
                if command[1] == 'bpt':
                    self.del_bpt(str2int(command[2]))
            elif command[0] == 'stop':
                exit(0)
            elif command[0] == 't':
                self._capstone = self._capstone_thumb
                print("======================= Disassembly =====================")
                self.dump_asm(address, size * self.dis_count)
            elif command[0] == 'a':
                self._capstone = self._capstone_arm
                print("======================= Disassembly =====================")
                self.dump_asm(address, size * self.dis_count)
            elif command[0] == 'f':
                print(" == recent ==")
                for i in self._tracks[-10:-1]:
                    print(self.sym_handler(i))
            else:
                print("Command Not Found!")

        except Exception:  # a bare except would also swallow the SystemExit raised by 'stop'
            print("[Debugger Error]command error see help.")


class UnicornDebugger:
    def __init__(self, mu, mode=UDBG_MODE_ALL):
        self._tracks = []
        self._mu = mu
        self._arch = mu._arch
        self._mode = mu._mode
        self._list_bpt = []
        self._tmp_bpt = 0
        self._error = ''
        self._last_command = ''
        self.dis_count = 5
        self._is_step = False
        self.sym_handler = self._default_sym_handler
        self._capstone_arm = None
        self._capstone_thumb = None

        if self._arch != UC_ARCH_ARM:
            mu.emu_stop()
            raise RuntimeError("arch:%d is not supported! " % self._arch)

        if self._arch == UC_ARCH_ARM:
            capstone_arch = cp.CS_ARCH_ARM
        elif self._arch == UC_ARCH_ARM64:
            capstone_arch = cp.CS_ARCH_ARM64
        elif self._arch == UC_ARCH_X86:
            capstone_arch = cp.CS_ARCH_X86
        else:
            mu.emu_stop()
            raise RuntimeError("arch:%d is not supported! " % self._arch)

        if self._mode == UC_MODE_THUMB:
            capstone_mode = cp.CS_MODE_THUMB
        elif self._mode == UC_MODE_ARM:
            capstone_mode = cp.CS_MODE_ARM
        elif self._mode == UC_MODE_32:
            capstone_mode = cp.CS_MODE_32
        elif self._mode == UC_MODE_64:
            capstone_mode = cp.CS_MODE_64
        else:
            mu.emu_stop()
            raise RuntimeError("mode:%d is not supported! " % self._mode)

        self._capstone_thumb = cp.Cs(cp.CS_ARCH_ARM, cp.CS_MODE_THUMB)
        self._capstone_arm = cp.Cs(cp.CS_ARCH_ARM, cp.CS_MODE_ARM)

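        # start with the disassembler that matches the emulator mode;
        # the 'a' and 't' commands switch between ARM and Thumb later
        if self._mode == UC_MODE_THUMB:
            self._capstone = self._capstone_thumb
        else:
            self._capstone = self._capstone_arm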

        if mode == UDBG_MODE_ALL:
            mu.hook_add(UC_HOOK_CODE, _dbg_trace, self)

        mu.hook_add(UC_HOOK_MEM_UNMAPPED, _dbg_memory, self)
        mu.hook_add(UC_HOOK_MEM_FETCH_PROT, _dbg_memory, self)

        self._regs = REG_TABLE[self._arch]

    def dump_mem(self, addr, size):
        data = self._mu.mem_read(addr, size)
        print(advance_dump(data, addr))

    def dump_asm(self, addr, size):
        md = self._capstone
        # mem_read returns a bytearray; convert it so Capstone always gets bytes
        code = bytes(self._mu.mem_read(addr, size))
        count = 0
        for ins in md.disasm(code, addr):
            if count >= self.dis_count:
                break
            print("%s:\t%s\t%s" % (self.sym_handler(ins.address), ins.mnemonic, ins.op_str))
            count += 1

    def dump_reg(self):
        result_format = ''
        count = 0
        for rid in self._regs:
            rname = self._regs[rid]
            value = self._mu.reg_read(rid)
            if count < 4:
                result_format = result_format + '  ' + rname + '=' + hex(value)
                count += 1
            else:
                count = 0
                result_format += '\n' + rname + '=' + hex(value)
        print(result_format)

    def write_reg(self, reg_name, value):
        for rid in self._regs:
            rname = self._regs[rid]
            if rname == reg_name:
                self._mu.reg_write(rid, value)
                return
        print("[Debugger Error] Reg not found:%s " % reg_name)

    def show_help(self):
        help_info = """
        # commands
        # set reg <regname> <value>
        # set bpt <addr>
        # n[ext]
        # s[tep]
        # r[un]
        # dump <addr> <size>
        # list bpt
        # del bpt <addr>
        # stop
        # a/t change arm/thumb
        # f show ins flow
        """
        print(help_info)

    def list_bpt(self):
        for idx in range(len(self._list_bpt)):
            print("[%d] %s" % (idx, self.sym_handler(self._list_bpt[idx])))

    def add_bpt(self, addr):
        self._list_bpt.append(addr)

    def del_bpt(self, addr):
        self._list_bpt.remove(addr)

    def get_tracks(self):
        for i in self._tracks[-100:-1]:
            # print (self.sym_handler(i))
            pass
        return self._tracks

    def _default_sym_handler(self, address):
        return hex(address)

    def set_symbol_name_handler(self, handler):
        self.sym_handler = handler


def test_arm():
    print("Emulate ARM code")
    ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"
    # mov r0, #0x37
    # sub r1, r2, r3
    try:
        # Initialize emulator in ARM mode (the bytes above are ARM-encoded, not Thumb)
        mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)

        # map 2MB memory for this emulation
        ADDRESS = 0x10000
        mu.mem_map(ADDRESS, 2 * 1024 * 1024)
        mu.mem_write(ADDRESS, ARM_CODE)

        mu.reg_write(UC_ARM_REG_SP, 0x1234)
        mu.reg_write(UC_ARM_REG_R2, 0x6789)

        # debugger attach
        udbg = UnicornDebugger(mu)
        udbg.add_bpt(ADDRESS)

        # emulate machine code in infinite time
        mu.emu_start(ADDRESS, ADDRESS + len(ARM_CODE))
        r0 = mu.reg_read(UC_ARM_REG_SP)
        r1 = mu.reg_read(UC_ARM_REG_R1)
        print(">>> SP = 0x%x" % r0)
        print(">>> R1 = 0x%x" % r1)
    except UcError as e:
        print("ERROR: %s" % e)

test_arm()

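Let's look at the result of running it:

The register values and the disassembled instructions are both displayed.
Next you type commands: step, run, next. Don't they look a lot like F8, F9, F10 (step into, step over, run)?
Feel free to try them yourself; I'll just type run.

All the values are printed.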
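The stopping rules behind step, next, and run can be condensed into one pure function. This is a sketch that mirrors the checks in _dbg_trace from the debugger source above. `should_break` and its parameters are names introduced here for illustration and are not part of the debugger.

```python
def should_break(address, is_step, tmp_bpt, breakpoints):
    """Decide whether the debugger prompt appears before the instruction at address.

    is_step     -- True after the 'step' command
    tmp_bpt     -- address + size set by 'next', or 0 when no 'next' is pending
    breakpoints -- addresses added with 'set bpt'
    """
    if not is_step and tmp_bpt == 0:
        # 'run': stop only at user breakpoints
        return address in breakpoints
    if tmp_bpt != 0 and tmp_bpt != address:
        # 'next' is pending: skip everything until the temporary breakpoint
        # (user breakpoints are not checked in the meantime)
        return False
    # 'step', or the temporary breakpoint of 'next' has been reached
    return True


print(should_break(0x10004, True, 0, []))              # after 'step'
print(should_break(0x10004, False, 0x10004, []))       # after 'next', target reached
print(should_break(0x10004, False, 0, [0x10000]))      # after 'run', no breakpoint here
```

In the debugger itself the prompt is shown by _dbg_trace_internal whenever this condition holds, and the command typed at the prompt sets the state for the following instructions.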
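These are just the basics of Unicorn. Experts have already built many powerful reverse-engineering tools on top of it; if you are interested, go and look for them.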


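Recommended
OP | Richor posted on 2019-9-19 17:48
75769837 posted on 2019-9-19 17:45
https://bbs.pediy.com/thread-253868.htm
How is this really any different from that thread, its content, and the Douyin .so calling project it published ...

Did you not read the post properly? I said the debugger was written by 無名. As for calling the Douyin .so, if you are that capable, go ahead and build one yourself.
Recommended
75769837 posted on 2019-9-19 17:45
Richor posted on 2019-9-19 16:26
Fine, then go and "repost" something that prints the 某音 .so addresses for me.

https://bbs.pediy.com/thread-253868.htm
Is there any real difference between this post and that thread, its content, and the Douyin .so calling project it published?
2#
高苗苗 posted on 2019-9-19 16:01
3#
TwilightZ posted on 2019-9-19 16:02
Thanks for sharing, OP. Showing my support!
4#
三楓大神 posted on 2019-9-19 16:17
Where is the tool?
5#
75769837 posted on 2019-9-19 16:23
Wasn't this just copied over from 看雪 (Kanxue)?
With a few pictures added.
6#
OP | Richor posted on 2019-9-19 16:26
75769837 posted on 2019-9-19 16:23
Wasn't this just copied over from 看雪 (Kanxue)?
With a few pictures added.

Fine, then go and "repost" something that prints the 某音 .so addresses for me.
7#
qybl999 posted on 2019-9-19 16:44
How do I download the tool?
8#
Zeno___Lee posted on 2019-9-19 17:40
Could you share a link for learning assembly from scratch?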