I'm trying to write an assembly language program to print out the first argument that it's called with:
(Using nasm & ld on a Core2Duo running Arch-x86_64)
segment .data
string  db 0
segment .text
global _start
_start:
    pop rcx             ;argc
    cmp rcx,2           ;if no arguments, then
    jl noArgs           ;jump (in order to prevent Segfault)
    pop rcx             ;argv[0] (program name)
    mov [string],rcx    ;first argument
    mov rdx,0
measureLen:             ;measure length of rcx, put result in rdx
    cmp byte [rcx], 0
    jz endMeasure
    inc rdx             ;length of rcx
    inc rcx
    jmp measureLen
endMeasure:
    mov rcx,string
    mov rax,4
    mov rbx,1
    int 80h             ;print
    mov rax,1
    mov rbx,0
    int 80h             ;exit returning 0
noArgs:
    mov rax,1
    mov rbx,-1
    int 80h             ;exit returning -1
output:
$ nasm -f elf64 test.asm
$ ld -o test test.o
$ ./test firstarg
cLj$
I can find sample code for nasm to do this, but only in 32-bit x86 assembly. I know I can just write it in C or use its libraries, but it's bugging me and I don't want to give in like that.
Thanks for any help.
Last edited by gopher292 (2009-04-07 20:52:13)
I've never used 64-bit assembly or assembly on linux before but I'll take a guess. In the following snippet...
endMeasure:
    mov rcx,string
    mov rax,4
    mov rbx,1
    int 80h             ;print
... I think you need to use mov rcx, [string]. Without the brackets, you are moving the address of string into rcx. rcx is like a char * pointer in C, but string is really like a char ** pointer, because earlier you stored an address into it.
So right now you are storing the address of the label "string" into rcx, then the print syscall is printing what that address in rcx points to (what string's value is) which is... another address you had put there earlier.
I think string should be a different datatype as well in order to store an address. For 32bit I'd use a doubleword (dd) but maybe 64bit uses a quadword? (dq?). Since it is only a byte, the address stored is probably overflowing and overwriting instructions.
I hope that helps. I could be wrong, I'm a bit rusty.
edit:
Actually, after seeing the segments in your code I realized instructions probably aren't being overwritten, because segments are padded with empty data. But if you had more data after "string", that data would be overwritten.
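A minimal sketch of the address-versus-contents distinction described above. It is not the original program: msg, msglen and ptr are names made up for illustration, and it assumes the 64-bit syscall interface (write = 1, exit = 60) rather than int 80h. Here ptr is declared with dq so it has room for a full 64-bit address, and the brackets load the pointer stored in it rather than the address of ptr itself.

```nasm
; Hypothetical demo, not the poster's code.
section .data
msg     db "hello", 10          ; the characters themselves
msglen  equ $ - msg             ; assembled-time length of msg
ptr     dq msg                  ; a quadword holding the ADDRESS of msg (like char **)

section .text
global _start
_start:
    mov rsi, [ptr]              ; brackets: load the pointer stored in ptr
                                ; (mov rsi, ptr would load the address of ptr instead)
    mov rdx, msglen             ; number of bytes to write
    mov rdi, 1                  ; fd 1 = stdout
    mov rax, 1                  ; 64-bit sys_write
    syscall
    mov rdi, 0                  ; exit status 0
    mov rax, 60                 ; 64-bit sys_exit
    syscall
```

It can be built the same way as the program in the first post (nasm -f elf64, then ld).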
Last edited by juster (2009-04-07 01:06:30)
There are some differences between x86 and x86_64 at machine level.
This can help you http://www.openvms-rocks.com/~hophet/docs/X86-64.txt
juster:
Thanks for the help.
I tried mov rcx,[string] but then it doesn't print anything at all.
There should be another pop rcx, though, to get the first argument instead of the program name.
After changing to mov rcx,[string] and running this through EDB, I found that the int 80h used to print the string returned fffffffffffffff2 to rax instead of the length of the printed string. I haven't been able to find any information on this error code anywhere.
djgera:
Thanks for the doc.
I read over it but don't see where there would be any issues with the code I have. I did learn I should use syscall instead of int 80h, but I don't think that's the problem, since I was able to print out data I hard-coded in using the old int 80h method.
ok I downloaded nasm this time and it worked for me when I changed each "segment" to "section"... as well as changing all the registers to 32bit ones of course.
After reading djgera's very helpful link and playing around with [32bit] assembler myself this morning, it looks like you are mixing together 32bit and 64bit calling convention and instructions. It would probably be less confusing if you just went all the way in either direction. Assigning to 64bit registers might work, since the 32bit EAX register is inside RAX. The point is this could be confusing when, for example, you examine the registers like for your return value of int 80h.
I'm guessing since the int 80h is for 32bit calling convention it returns 32 bit values. So half of fffffffffffffff2 is your 32bit return value. And since ffffffff is -1 in two's complement, this is an error value according to the write(2) manpage. But I don't understand why the ffffffff is in the high dword, when I'd assume it would be in the lower dword (which is the eax register).
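Another way to read that value, offered as a sketch rather than a diagnosis: taken as a signed 64-bit number, fffffffffffffff2 is -14. Linux system calls report failure by returning a negated errno in rax, and errno 14 is EFAULT (bad address). The snippet below assumes the 64-bit syscall interface and provokes a failing write on purpose. The null buffer is chosen only for illustration and is not taken from the code in this thread. It then negates rax and passes it on as the exit status.

```nasm
; Hypothetical demo: turn a failed syscall's return value into an exit status.
section .text
global _start
_start:
    mov rdi, 1          ; fd 1 = stdout
    mov rsi, 0          ; deliberately invalid buffer address (assumption for the demo)
    mov rdx, 5          ; pretend there are 5 bytes to write
    mov rax, 1          ; 64-bit sys_write
    syscall             ; on failure rax holds -errno
    neg rax             ; flip the sign so a failure becomes a positive errno
    mov rdi, rax        ; pass it on as the exit status
    mov rax, 60         ; 64-bit sys_exit
    syscall
```

After running it in a terminal, echo $? shows the status. If the write failed with EFAULT that would be 14. Other setups (for example stdout redirected somewhere that never reads the buffer) may behave differently.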
You are basically creating a 64 bit executable running in 64 bit mode, with a 64 bit stack of quadwords, calling a 32 bit OS interface... welcome to the black magic of x86 backwards-compatibility! On windows xp I could still execute 16-bit executables, I don't know about linux. I first learned assembly writing 16-bit dos programs on windows xp ... bizarre!
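To make the two conventions concrete, here is a small sketch that prints the same hard-coded string once through each interface. msg and msglen are names invented for this example. The int 80h half assumes the kernel has 32-bit emulation enabled and that the program is linked statically with plain ld, so that msg sits at an address that fits in 32 bits.

```nasm
; Hypothetical demo of both syscall conventions from a 64-bit program.
section .data
msg     db "hi", 10
msglen  equ $ - msg

section .text
global _start
_start:
    ; 32-bit convention: call number in eax, arguments in ebx, ecx, edx
    mov eax, 4              ; write in the 32-bit syscall table
    mov ebx, 1              ; fd 1 = stdout
    mov ecx, msg            ; only the low 32 bits of the pointer are passed
    mov edx, msglen
    int 80h

    ; 64-bit convention: call number in rax, arguments in rdi, rsi, rdx
    mov rax, 1              ; write in the 64-bit syscall table
    mov rdi, 1              ; fd 1 = stdout
    mov rsi, msg            ; full 64-bit pointer
    mov rdx, msglen
    syscall

    mov rax, 60             ; exit in the 64-bit syscall table
    mov rdi, 0              ; exit status 0
    syscall
```

The int 80h path only sees 32 bits of each argument. That is harmless for a label in .data, but a pointer to a command-line argument lives on the 64-bit stack and would most likely be cut off, which would fit the failed write reported above.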
AMD PDFs for x86_64
http://www.amd.com/us-en/Processors/Dev … 44,00.html
NASM Chapter 11: Writing 64-bit Code (Unix, Win64)
http://www.nasm.us/doc/nasmdo11.html
There are also references at the bottom of djgera's link.
Solved! Apparently it was the 32-bit calls that were messing it up.
This code works:
section .data
string  dq 0                ;holds a 64-bit pointer, so reserve a quadword
section .text
global _start
_start:
    pop rcx                 ;argc
    cmp rcx,2               ;if no arguments, then
    jl noArgs               ;error.
    pop rcx                 ;argv[0] (program name)
    pop rcx                 ;first argument
    mov [string],rcx
    mov rdx,0
measureLen:
    cmp byte [rcx], 0
    jz endMeasure
    inc rdx                 ;length of rcx
    inc rcx
    jmp measureLen
endMeasure:
    mov rsi,[string]        ;buffer: pointer to the first argument
    mov rdi,1               ;fd 1 = stdout (must be set before the write)
    mov rax,1               ;sys_write
    syscall
    mov rdi,0               ;exit status 0
    mov rax,3ch             ;sys_exit
    syscall
noArgs:
    mov rdi,1               ;exit status 1
    mov rax,3ch             ;sys_exit
    syscall
Thanks both of you! I guess the old 32-bit function call was only reading half of the rax register, and so tried to print from an address that didn't exist or was completely different. (?)
It would be simpler just to push it back on the stack, too, then pop it off when needed instead of using a variable for it. (For this trivial piece of code, anyway.)
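A sketch of that push/pop variant, assuming the same 64-bit syscall numbers as above (write = 1, exit = 3ch = 60) and with stdout set explicitly in rdi. The pointer is pushed back before the length loop advances rcx, then popped straight into rsi for the write, so no variable in .data is needed.

```nasm
; Sketch: print the first argument, keeping the pointer on the stack.
section .text
global _start
_start:
    pop rcx                 ; argc
    cmp rcx, 2              ; fewer than 2 means no argument was given
    jl noArgs
    pop rcx                 ; argv[0] (program name)
    pop rcx                 ; argv[1] (first argument)
    push rcx                ; keep a copy of the pointer on the stack
    mov rdx, 0              ; rdx will hold the length
measureLen:
    cmp byte [rcx], 0       ; stop at the terminating zero byte
    jz endMeasure
    inc rdx
    inc rcx
    jmp measureLen
endMeasure:
    pop rsi                 ; recover the pointer as the write buffer
    mov rdi, 1              ; fd 1 = stdout
    mov rax, 1              ; sys_write
    syscall
    mov rdi, 0              ; exit status 0
    mov rax, 60             ; sys_exit
    syscall
noArgs:
    mov rdi, 1              ; exit status 1
    mov rax, 60             ; sys_exit
    syscall
```

Like the version above, this prints the argument without a trailing newline.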
Thanks.