Tiny Core Linux

Tiny Core Base => TCB Q&A Forum => Topic started by: Paulo on March 30, 2013, 07:28:08 AM

Title: TC and memory
Post by: Paulo on March 30, 2013, 07:28:08 AM
Hi

I need to use memory directly, at a fixed address that is accessible by various apps (and scripts),
which pretty much rules out using 'malloc'.

Writing to and reading from a fixed location is the easy part, and I have decided to use 0x7C00 to 0x7DFF, which is where the first-stage boot sector
is copied to RAM and used before control is passed to the kernel, and is thus no longer needed.
However, this is only 512 bytes long and I need a bit more, so I have been thinking of also using the area used by the Real Mode Interrupt Vector Table,
which will give me another 1024 bytes.

My question then is:
Does TC make use of, or ever go back to, Real Mode (or the BIOS after booting) for anything, including changing screen resolution?
I don't expect that it would but thought it wiser to double check with those that know more about these things.

Thanks.
Title: Re: TC and memory
Post by: curaga on March 30, 2013, 07:59:36 AM
You're trying to access the RAM behind the kernel's back - that will end up blowing up ;)

It sounds like you're doing a driver in userspace for something that needs DMA at that address. In that case, you should get a kernel driver.
Alternatively, you could try restricting the kernel's access with the mem boot option.

If this is not a driver, but merely IPC (inter-process communication), there are much saner ways.


edit: Our kernel reserves the first 64kb for BIOS use, but then you'd be going behind the back of the BIOS.
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 08:12:27 AM
Hi curaga

Quote
You're trying to access the RAM behind the kernel's back - that will end up blowing up

Normally I would agree with you, but remember that the address range/s I want to use are never touched by the kernel anyway, certainly not the first stage loader.
I can't see the kernel allocating this to a process.

Quote
It sounds like you're doing a driver in userspace for something that needs DMA at that address. In that case, you should get a kernel driver.

In a roundabout way, yes.
What I'm actually doing is reading external data and writing it to RAM as fast as possible.
I'm using assembler to maximize speed and minimize overheads associated with higher level languages.
It's a simple logic analyzer to capture some "glitches" and it just seems that making a kernel driver is quite a long winded way of doing it.
The reasons I'm trying to avoid a kernel driver are my limited knowledge of them and the fact that if a kernel driver goes bonkers for whatever reason, it pretty much
takes the kernel with it, whereas in userspace that won't happen.

Quote
If this is not a driver, but merely IPC (inter-process communication), there are much saner ways.

I suppose in a way it could be classified as IPC and my motivation for using direct memory access is partly to learn and partly because I need the speed.
I would, however, be very interested in other methods that may be available for IPC.

EDIT:

In reply to your edit,

Quote
edit: Our kernel reserves the first 64kb for BIOS use, but then you'd be going behind the back of the BIOS.

But does TC make use of BIOS at all after booting?
Title: Re: TC and memory
Post by: Rich on March 30, 2013, 11:31:47 AM
Hi Paulo
Curaga is right. This is a very bad idea, not just because you are essentially deceiving the kernel, but because there
are simpler ways to accomplish this.
Quote
I need to use memory directly, at a fixed address that is accessible by various apps (and scripts),
which pretty much rules out using 'malloc'.
If you believe that, it's time to re-examine the architecture of your project.
Quote
It's a simple logic analyzer to capture some "glitches" ...
So you need to watch for a trigger condition, and start collecting data once it's met. This is the "real time" part of your
application. Then there is the post processing (display, instruction dis-assembly, histograms, whatever) which does
not have to occur in "real time".
For the first part, you most certainly can malloc the RAM you need and use that. Or you can declare a local block of
RAM as part of your program; with luck, it will all fit in the processor's cache. Once the data has been captured, you
save it to a file, and then the post processing can be handled.
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 01:15:27 PM
Hi Rich

What you and curaga say does make sense, however I was thinking of doing the project in two parts.

1) The "real time" part in either C or ASM, haven't quite decided yet.

2) The "non real time" display stuff in gtkdialog as it's very versatile and easy to make a nice GUI.

Since I'm not a sadomasochist, I don't intend to make a GUI in either C or ASM, hence the gtkdialog part.
However, this creates a bit of a problem: when the first program terminates after writing data to memory obtained with malloc,
the allocated memory is reclaimed and the data is lost.
Hence my idea of writing to an unused portion of RAM, so that I'm free to read it with any other binary or script.

Am I missing something here?
Title: Re: TC and memory
Post by: bmarkus on March 30, 2013, 01:19:56 PM
Quote
Am I missing something here?

What you are missing is LINUX itself. You are planning a program for a bare system without considering the operating system. For example LINUX provides interprocess communication mechanisms.
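One of those mechanisms, as a minimal sketch rather than a recommendation for this particular project: a file-backed shared mapping. The helper names map_shared() and unmap_shared() are made up for illustration, and the path is whatever the caller picks. Assuming /dev/shm is a tmpfs mount on the system in question, a file placed there stays in RAM and outlives the process that wrote it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes of the file at `path` as memory
 * shared between processes. The file is created if it does not exist.
 * Returns NULL on failure. */
void *map_shared(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED makes writes visible to every process mapping the file. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a mapping obtained from map_shared(). */
int unmap_shared(void *p, size_t size)
{
    return munmap(p, size);
}
```

The capture program would write into the returned pointer as if it were a malloc'd block, and any other program calling map_shared() with the same path sees the same bytes. Because the backing object is an ordinary file, scripts can also read it with the usual tools.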
Title: Re: TC and memory
Post by: Rich on March 30, 2013, 01:25:53 PM
Hi Paulo
Quote
Am I missing something here?
Yes you are, read:
Quote
For the first part, you most certainly can malloc the RAM you need and use that. Or you can declare a local block of
RAM as part of your program; with luck, it will all fit in the processor's cache. Once the data has been captured, you
save it to a file, and then the post processing can be handled.
Prior to terminating, you save the contents of the capture buffer to a file.
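A minimal sketch of that last step, assuming the capture buffer is a plain array of bytes. The function names save_capture() and load_capture() are hypothetical, as is any file name passed to them.

```c
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes from `buf` to `path` as raw binary.
 * Returns 0 on success, -1 on error. */
int save_capture(const char *path, const unsigned char *buf, size_t len)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(buf, 1, len, fp);

    /* fclose flushes buffered data, so its result matters as well. */
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* Read up to `max` bytes from `path` into `buf`.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long load_capture(const char *path, unsigned char *buf, size_t max)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    size_t got = fread(buf, 1, max, fp);
    fclose(fp);
    return (long)got;
}
```

The capture program calls save_capture() once, after the time-critical loop has finished, so the file I/O never competes with the sampling itself. The display side (or any script) then reads the file at its leisure.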
Title: Re: TC and memory
Post by: bmarkus on March 30, 2013, 01:27:41 PM
Why hack an operating system when it offers a solution?
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 03:39:30 PM
Good thing I'm wearing my fire proof suit   ;)

OK, I relent, saving to a file it is.
I just thought the whole idea of Linux was being able to experiment with different ways of doing things
even if some of them may be a bit unorthodox.

I'm still curious though if TC does use BIOS/Real Mode in any way.

Title: Re: TC and memory
Post by: Rich on March 30, 2013, 04:27:42 PM
Hi Paulo
Quote
I'm still curious though if TC does use BIOS/Real Mode in any way.
I think the Linux kernel in general pretty much ignores the BIOS. I have an old machine from around 1997 or so with
a 320Gbyte hard drive in it. I set that drive to be non existent in the BIOS setup because the machine couldn't boot
with a drive that large present, and Tinycore finds and mounts it anyway. As far as applications are concerned, maybe
wine uses it, I don't know. I do know that Xorg or one of the video driver packages has an int10 module, and the Xorg
development packages contain xf86int10.h.
Quote
I just thought the whole idea of Linux was being able to experiment with different ways of doing things
even if some of them may be a bit unorthodox.
You are of course free to experiment, but unorthodox solutions are best saved for when an orthodox solution does
not exist.
Title: Re: TC and memory
Post by: tinypoodle on March 30, 2013, 04:45:05 PM
Quote
I'm still curious though if TC does use BIOS/Real Mode in any way.

The closest to that I could think of would be elksemu, but I suspect that won't work with the Core stock kernel:
Code: [Select]
# CONFIG_BINFMT_MISC is not set
http://linux.die.net/man/1/elks
https://github.com/lkundrak/dev86/tree/master/elksemu
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 04:47:29 PM
Hi Rich

Quote
I think the Linux kernel in general pretty much ignores the BIOS
My thoughts as well.

Quote
As far as applications are concerned, maybe
wine uses it, I don't know.

Never used Wine myself, but if it's anything like XP, then BIOS functions are emulated via the NTVDM
and not the real BIOS.
In fact I have found quite a few discrepancies between the real BIOS and the way Microsoft thinks it should be implemented.

Quote
I do know that Xorg or one of the video driver packages has an int10 module, and the Xorg
development packages contain xf86int10.h.

Very interesting.
Surely there must be quite a performance hit going back and forth between PM and RM?

Quote
You are of course free to experiment, but unorthodox solutions are best saved for when an orthodox solution does
not exist.

Fair enough.
I did do some experimenting just for fun and the section reserved for the first stage boot loader seems OK to mess about with but I will
use a file to save the memory block allocated by calling 'malloc'.

Thanks for the answers.


Title: Re: TC and memory
Post by: Rich on March 30, 2013, 04:56:51 PM
Hi Paulo
Another advantage to saving to a file is your capture program can save the data in whatever format is convenient for
the post processing program using fprintf, be it binary, hex, ASCII, or something else.
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 04:57:42 PM
Hi tinypoodle

Never heard of ELKS before but looks very interesting.
Still have an old 286 knocking about somewhere, maybe it's just the thing for it.
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 05:01:24 PM
Hi Rich

Quote
Another advantage to saving to a file is your capture program can save the data in whatever format is convenient for
the post processing program using fprintf, be it binary, hex, ASCII, or something else.

Microsoft Excel format?
Only kidding, but yes good point, it will make life easier when it comes to displaying the data.
Never used fprintf in C, only printf, will investigate fprintf.
Title: Re: TC and memory
Post by: Rich on March 30, 2013, 05:11:01 PM
Hi Paulo
Quote
Microsoft Excel format?
You could if you knew the data format used.
Fprintf is exactly the same as printf except it takes a file pointer as its first argument.
Title: Re: TC and memory
Post by: Rich on March 30, 2013, 05:23:17 PM
Hi Paulo
Quote
You could if you knew the data format used.
Nonsense, you could just save it in a comma separated format, Excel should be able to handle reading that.
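As a rough sketch of the comma separated idea: the column names and the one-byte-per-sample layout below are assumptions made for illustration, and write_csv() is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* Write a header row, then one "index,value" line per captured sample.
 * Returns 0 on success, -1 if any write failed. */
int write_csv(FILE *fp, const unsigned char *samples, size_t n)
{
    if (fprintf(fp, "index,value\n") < 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        /* Same format string you would hand to printf; only the
         * leading FILE pointer is extra. */
        if (fprintf(fp, "%zu,%u\n", i, (unsigned)samples[i]) < 0)
            return -1;
    }
    return 0;
}
```

Passing stdout as the first argument prints the same rows to the terminal, which is a handy way to check the format before pointing it at a file opened with fopen().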
Title: Re: TC and memory
Post by: Paulo on March 30, 2013, 05:27:27 PM
Hi Rich

Quote
You could if you knew the data format used

Interestingly Microsoft dumped their own crappy Excel format as of Office7 (I think).
The latest Excel (and Word) actually use xml.
Excel creates files with a .xlsx extension but all it is are zipped xml files.
Try it out, create a simple Excel worksheet then save it, then change the extension to .zip
and have a look at the contents.

Actually Microsoft got into quite hot water over using xml which is an open standard and they tried to
use it as their own.

As regards fprintf, it looks pretty handy, thanks for mentioning it.


Title: Re: TC and memory
Post by: tinypoodle on March 30, 2013, 06:20:52 PM
BTW, elksemu is included in the Dev86 extension.
Title: Re: TC and memory
Post by: genec on March 30, 2013, 07:26:06 PM
Office 2007 is Office 12.
Title: Re: TC and memory
Post by: genec on March 30, 2013, 09:48:33 PM
Quote
1) The "real time" part in either C or ASM, haven't quite decided yet.

Am I missing something here?
To quote the code of someone with a notable amount of ASM experience: "asm-mode sucks".  Unless you know your ASM is highly optimized, my understanding is that GCC's output will easily do at least as well in far less time.  If you compare C to Java or Perl, sure, there's often a difference in overhead.
Title: Re: TC and memory
Post by: Rich on March 31, 2013, 02:10:02 AM
Hi genec
Someone proficient in assembly might be able to generate faster or smaller code than GCC, but would probably
be hard pressed to do so. To do so, you need to be familiar with the execution time and the number of bytes required
by each instruction the processor supports, in order to select the optimum instructions for the task at hand. If someone
wants to see the assembly code GCC generates, add:
Code: [Select]
-Wa,-alh=YourProgramsName.lst -masm=intel
when you invoke GCC. The .lst file will contain a mix of the C source and assembly language. Then add the -Os or -O2
directive and try following the assembler produced.
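A small file to try this on, offered only as a starting point. The file name zero.c and both functions are made up for the experiment, and -masm=intel applies to x86 targets only.

```c
/* zero.c: deliberately trivial functions for inspecting compiler output.
 *
 * Build a listing with the flags mentioned above (file names are
 * placeholders):
 *   gcc -O2 -c zero.c -Wa,-alh=zero.lst -masm=intel
 */

/* Returns a constant zero; look at how the compiler clears the register. */
int zero(void)
{
    return 0;
}

/* A simple loop gives the optimizer slightly more to work with. */
unsigned sum_bytes(const unsigned char *p, unsigned n)
{
    unsigned total = 0;
    for (unsigned i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

On x86 with optimization enabled, GCC typically clears the return register in zero() with xor eax, eax rather than a mov of an immediate zero, though the exact listing depends on the GCC version and target.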
Title: Re: TC and memory
Post by: Paulo on March 31, 2013, 03:09:57 AM
ASM does involve a bit more work to optimize it fully.
As Rich mentioned, this involves looking through the pages of the Intel datasheets for the processor involved
and checking clock cycles and number of bytes taken by each instruction.

In the "good old days" of the 186, 286 and to some extent the 386, ASM programmers would try and squeeze
every last drop of performance by even saving a clock cycle or two by using:

Code: [Select]
xor eax, eax

Instead of:

Code: [Select]
mov eax, 0

These days, processors are so fast and have features like caching, prefetching and so on that it makes no difference.
(Although that is not to say that optimization is no longer required).

A lot of great optimization tricks can also be learnt from de-compiling and studying the source of computer games from the DOS era.

I know that gcc can produce pretty tight code but I have only used it when I want to mix C and inline ASM.
For purely ASM I use FASM as it has extensive macro capabilities thus making one's life just a bit easier.
By using the macros as includes, it allows the use of highly optimised functions which are easier to "tune" than the equivalent C .h files and built-in functions.
It can also produce executables in PE and ELF formats as well as flat binaries.

I guess at the end of the day, a lot of it also comes down to personal preference and to what is used, gcc, nasm or fasm.
Title: Re: TC and memory
Post by: tinypoodle on March 31, 2013, 06:38:40 PM
Quote
A lot of great optimization tricks can also be learnt from de-compiling and studying the source of computer games from the DOS era.

Why that when there are open source Linux games?

e.g.:
http://www.deater.net/weave/vmwprod/tb1/tb_asm.html
Title: Re: TC and memory
Post by: Paulo on April 01, 2013, 06:14:58 AM
Hi tinypoodle

Quote
Why that when there are open source Linux games?

e.g.:
http://www.deater.net/weave/vmwprod/tb1/tb_asm.html

Nothing wrong with "peeking" at Linux games too, however that specific example is not a great one
as it uses AT&T syntax whereas Intel syntax is not only more widespread but less crazy.
(Sorry but in my opinion, AT&T syntax is just weird).

I mentioned DOS games because:

1) There are more of them, I certainly have lots of older ones from when I had time to play them.

2) The older DOS games are written to do more with less as opposed to the Linux ones which generally rely
on the user having more modern processors.
Also remember that some old DOS games were actually in the COM format, which made them easier to understand as there is no header,
no relocation tables and all that other stuff; pretty much the same as a flat binary but starting at 0x100.
The mere fact that they were in COM format restricted their size to 64 KB max, which in itself forced the programmer to resort to lots
of optimization techniques to get the whole game under that limit.

3) By de-compiling executables, one also learns about reverse engineering and the use of disassemblers
which, although not crucial to being a good programmer, does tend to teach more about the makeup of executables, their sections
and the data layout whether it be in COM, MZ, PE or ELF format.

4) The challenge of being able to "see" the source from an executable that might be packed and/or protected. ;)
Title: Re: TC and memory
Post by: curaga on April 01, 2013, 09:34:08 AM
Quote
But does TC make use of BIOS at all after booting?

This is not my area of expertise, but at least the vesa modesetting code calls the BIOS. (Xvesa, Xorg vesa module, or vesa framebuffer). The VGA text console setup likewise.

Quote
xor eax, eax

Heh, I remember the first time I saw that in GCC's output and wondered why the heck did it do that. Not exactly a clear way to set it to zero.
Title: Re: TC and memory
Post by: Paulo on April 01, 2013, 01:37:30 PM
Hi curaga

Quote
This is not my area of expertise, but at least the vesa modesetting code calls the BIOS. (Xvesa, Xorg vesa module, or vesa framebuffer). The VGA text console setup likewise.

I'm actually surprised that a modern 32 bit protected mode O.S. like Linux would use BIOS to switch video modes instead of using drivers to interact directly with the VGA registers.
Although having said that I can understand why it's done that way as it's much more work to first query the PCI bus, then find the vendor ID, bus number the card is on, get the subfunctions
then having to see what card/chip set is being used as they all have their quirks, then get the PCI BARs and plug the correct values into the required registers and sometimes it has to be done in a certain order on some cards.

Quote
Heh, I remember the first time I saw that in GCC's output and wondered why the heck did it do that. Not exactly a clear way to set it to zero.

Yup, the idiosyncrasies of the x86 architecture, where it's more "economical" to xor a register with itself rather than load it with an immediate constant of zero.
When we get to direct and indirect addressing (in Real Mode) then it becomes a veritable hornet's nest as to the use of 'lea' (and others) as it depends not only on the processor
used (386, 486, 586 and up) but also which segment:offset we are using/loading/updating.
Title: Re: TC and memory
Post by: curaga on April 01, 2013, 04:06:29 PM
Quote
I'm actually surprised that a modern 32 bit protected mode O.S. like Linux would use BIOS to switch video modes instead of using drivers to interact directly with the VGA registers.

You have the choice here - in TC we default to the vesa driver (which uses the BIOS), since it's both smaller and works with most cards. But the proper drivers are also available (see the Xorg extensions) for when you need gpu acceleration or non-vesa resolutions.
Title: Re: TC and memory
Post by: genec on April 01, 2013, 08:39:32 PM
Quote
Hi genec
Someone proficient in assembly might be able to generate faster or smaller code than GCC, but would probably
be hard pressed to do so.
Agreed (and also my point).  ASM is best for the final stages of compilers (like GCC), for special items like MBRs and microcode, and for utilizing interfaces not provided through the language.  An example of the latter I'm familiar with is the implementation of Syslinux's intcall() to call BIOS ISRs (while intcall() should be used in most code).
Title: Re: TC and memory
Post by: genec on April 01, 2013, 09:16:43 PM
Quote
In the "good old days" of the 186, 286 and to some extent the 386, ASM programmers would try and squeeze
every last drop of performance by even saving a clock cycle or two by using:

Code: [Select]
xor eax, eax

Instead of:

Code: [Select]
mov eax, 0

These days, processors are so fast and have features like caching, prefetching and so on that it makes no difference.
(Although that is not to say that optimization is no longer required).
It's still common today, especially in optimized (GCC) or specialized (MBR, microcode) code.  An immediate as above is 6 bytes (66 B8 00 00 00 00) as opposed to 3 bytes (66 31 C0).
Title: Re: TC and memory
Post by: Paulo on April 02, 2013, 02:38:37 AM
Hi genec
You are quite right about microcode and especially the MBR, where every byte is precious as one only has 507 bytes to play with (512, minus 3 for the jump, minus 2 for the end marker).
I don't use gcc for asm but I'm sure there is an option to set the optimization level (-O ?).
When I write bootloaders and kernels for embedded x86 based boards where the absolute position of certain data structures and calls are crucial, I prefer to do my own optimization and not have to wrestle with the compiler as to what code goes where.
That is why I use Fasm.
Sometimes relying on the compiler to automatically do all the optimizations does not yield the expected results.
Title: Re: TC and memory
Post by: genec on April 02, 2013, 08:59:45 PM
Quote
Hi genec
You are quite right about microcode and especially the MBR, where every byte is precious as one only has 507 bytes to play with (512, minus 3 for the jump, minus 2 for the end marker).
I don't use gcc for asm but I'm sure there is an option to set the optimization level (-O ?).
When I write bootloaders and kernels for embedded x86 based boards where the absolute position of certain data structures and calls are crucial, I prefer to do my own optimization and not have to wrestle with the compiler as to what code goes where.
That is why I use Fasm.
Sometimes relying on the compiler to automatically do all the optimizations does not yield the expected results.

440 is an MBR; 507 for a non-FAT VBR.  Floppies only have a VBR.

I've used NASM and GAS (GCC Assembler) and often prefer NASM but GAS has some nice portability.
Title: Re: TC and memory
Post by: Rich on April 13, 2013, 04:32:26 PM
Hi Paulo
Here is something else to consider, gcc also keeps the instruction mix in mind when it generates code. You might
want to check out chapters 12, 13, 20, 21, and 22 of this link:
http://www.phatcode.net/res/224/files/html/index.html
If you want more, check out:
http://www.agner.org/optimize/
Optimizing in assembly should usually be the last choice. Focusing on what you are asking the compiler to do and
how you implement algorithms will yield far higher returns. Case in point, I was sorting ~600,000 records using qsort.
The compare function that I provided qsort with performs a string comparison on the first field, and if those are equal,
does the same for the second field. After some thought, I realized I could compare the pointers for the first field to
determine if they were equal, and skip straight to the second string compare. That one tiny change reduced execution
time from 1.54 seconds to 1.05 seconds. Gcc compiled that compare function into 102 bytes of code. I wrote the
function in assembly and got it down to 67 bytes. The "big" payoff for the time spent hand optimizing the code, run
time dropped to 0.95 seconds. Qsort calls this function 7,258,316 times and I saved a whole 0.1 seconds. Not the
best use of my time. If I really decide I want to make this run faster, I will find a way to change how I deal with the data.
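For anyone curious, the pointer shortcut described above might look roughly like this. The record layout is a guess made for illustration (two string fields), and it assumes records with the same first field frequently point at the same string, so the pointer test often succeeds.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout: sorted by `first`, then by `second`. */
struct record {
    const char *first;
    const char *second;
};

/* Comparison function in the form qsort expects. */
int compare_records(const void *pa, const void *pb)
{
    const struct record *a = pa;
    const struct record *b = pb;

    /* Identical pointers mean identical strings, so the first strcmp
     * can be skipped entirely in that case. */
    if (a->first != b->first) {
        int r = strcmp(a->first, b->first);
        if (r != 0)
            return r;
    }

    /* First fields are equal: fall through to the second field. */
    return strcmp(a->second, b->second);
}

/* Sort `n` records in place. */
void sort_records(struct record *recs, size_t n)
{
    qsort(recs, n, sizeof recs[0], compare_records);
}
```

The result is the same whether or not the pointers happen to match; the pointer test only saves a string comparison when they do. That is why a change to what the comparison is asked to do can pay off more than shaving instructions from it.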
Title: Re: TC and memory
Post by: Paulo on April 14, 2013, 03:35:30 AM
Hi Rich

Quote
and I saved a whole 0.1 seconds. Not the best use of my time.

Sounds like corporate meetings, spending hours to take down minutes.  :)

On a serious note, you are of course correct that one has to make a trade off between time spent
on optimizing the code versus how much it will shave off the execution times.

However there are times where one has to "hand make" the code where a specific function or block of code
has to take a certain amount of time.
An example of this was when I wrote an ATAPI driver for a small microcontroller project several years back.
Some CD-ROMs were very fussy about how much time could elapse between sending the ATAPI packets
and reading the resulting returned data.

Another example is when code is position /offset dependent, I find that the compiler tends to re-organize things
the way it wants to and thus "breaks" the code.
I know there are ways to tell it in a make file but it's just too much bother hence I prefer the manual method.

So yes, manual optimization is sometimes required but for most cases, it's simply not worth it.
Title: Re: TC and memory
Post by: curaga on April 14, 2013, 01:50:15 PM
@Rich

For kicks, try writing it in C++ using std::sort. It inlines the comparison function, and your bottleneck seems to be function calls.
Title: Re: TC and memory
Post by: Paulo on April 14, 2013, 02:25:10 PM
I'm wondering how the Perl version will stack up.
http://perldoc.perl.org/sort.html
Title: Re: TC and memory
Post by: tinypoodle on April 14, 2013, 09:11:48 PM
Quote
http://perldoc.perl.org/sort.html

 :o

(http://upurs.us/image/47759.png)

Way to go to discredit perl...   >:(
Title: Re: TC and memory
Post by: Paulo on April 15, 2013, 09:04:45 AM
Hi tinypoodle
Don't know what happened there.
The site is obviously rubbish but strangely I have not
had trouble with it before.
Anyway the point I was trying to make is perhaps the
sort function of perl is slightly quicker than the C version
once the perl interpreter loads the script and runs.
It would be an interesting exercise to run strace on both
the C and perl versions using the same data and compare results.