
Author Topic: Compiler flags and semi-portable extensions for "newer" pcs?  (Read 4526 times)

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Compiler flags and semi-portable extensions for "newer" pcs?
« on: April 13, 2012, 09:53:49 AM »
The wiki page on creating extensions currently indicates that, for compatibility, the -march=i486 and -mtune=i686 options should be set.

This is obviously working very well for compatible extensions with Tiny Core, but I've been searching everywhere and can't find a good explanation of why those options were chosen (aside from it obviously working well!).  For example, why not just use -m32 instead, which seems more generic?  Is it that i486 provides some performance improvements over m32 (which seems to go back to support 386 instruction sets, per the docs), while also going back far enough to meet a minimum supportable hardware platform for Tiny Core?

Related to this, and assuming it is rooted in the minimum hw requirements, what if I want to make a portable TC extension for movement between my own pcs that can squeak out a bit more performance?  For example, I want to make a custom python extension (or even just a portable dir for frugal installs) that I can move around my servers that has presumably better performance than would be obtained with -march=i486 and -mtune=i686, when I am certain that all the machines/CPUs I will be running it on are < 5 years old.

I'm not looking to make this insanely complicated, just looking for a "better" set of compiler flags while remaining reasonably portable.  For example, I'm wary of running into compatibility issues with the support libraries needed for the python compilation.  To scope this, what I mean is that I'd ideally like to be able to stick with something similar to this simple python compilation chain, but with "modernized-but-still-portable" compiler flags:

Code:
tce-load -wi compiletc bzip2-lib bzip2-dev openssl-1.0.0 openssl-1.0.0-dev sqlite3 sqlite3-dev ncurses ncurses-dev
export CFLAGS="-march=i486 -mtune=i686 -Os -pipe"
export CXXFLAGS="-march=i486 -mtune=i686 -Os -pipe"
export LDFLAGS="-Wl,-O1"
wget http://python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2
tar xjf Python-2.7.3.tar.bz2
cd Python-2.7.3
./configure --prefix=/mnt/sda1/python --enable-ipv6
make
make test
sudo make altinstall

Since the support libs like bzip2 are presumably compiled with -march=i486 and -mtune=i686, does that lock in everything else down the chain to use those as well?  I'm out of my depth here, but do know that binary compatibility is a non-trivial issue.

Also - what if I wanted to port around 64 bit python (or other extension)?  I can't find any good information on best practices for making 64 bit extensions for TC.  I've got my 64-bit TC working (with core64.gz and vmlinuz64, which were tough enough to find already!).  It seems like -m64 would be a good and portable bet there, but don't really know.  Does -m64 overlap and eliminate the need for the -march and -mtune flags?


Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11759
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #1 on: April 13, 2012, 10:14:48 AM »
Hi qopit
The  -march=  option dictates the minimum processor required for the software to run.
The  -mtune=  option I'm less certain of. I've seen the following in some GCC documentation:
Quote
-march: Generate instructions for the machine type CPU-TYPE. Specifying -march=CPU-TYPE implies -mtune=CPU-TYPE.
And found the following at another site:
Quote
-mtune={cpu} : Tunes the binary for given cpu on x86 architecture. This means that the binary will still execute on a 386 (if it did in the first place), while also including optimization for the cpu supplied to the '-mtune=' switch.
Draw your own conclusions.
I don't think libraries that have been compiled with the current settings will cause any trouble.
Can't comment on how to deal with 64 bit.

Online Juanito

  • Administrator
  • Hero Member
  • *****
  • Posts: 14876
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #2 on: April 13, 2012, 10:34:35 AM »
-m32 and -m64 let gcc know whether to compile 32-bit or 64-bit applications/libraries.

In core64, the kernel has been compiled 64-bit to allow the use of more than 4gb of ram, but core64 will not work with 64-bit applications/libraries as-is.

As regards using different optimisations (see http://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options), I'm not sure you'd see a big difference unless you were running something cpu-intensive and had also recompiled the base and dependent libs with the same optimisations.

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #3 on: April 13, 2012, 12:10:34 PM »
Thanks, guys.  I suppose the part that is bugging me is that it seems quite restrictive to limit the instruction set used to the 486 instruction set.  Presumably the large number of new instructions since i486 should add a significant performance increase for directly compiled code?

I definitely understand that for TC it is a good idea to limit to i486 so that older hardware can be brought back to life with TC.  What I'm after is figuring out what newer instruction set is further along the commonly supported timeline than i486 such that code compiled with that instruction set would be sure to run on CPUs built in the last 5 years.  That is certainly more suited to a gcc forum, I guess.  What would be awesome is a timeline of instruction sets, showing common Intel+AMD processors and what instruction sets they supported (ideally in gcc -march speak  :)).  I can't find such a beast, though.

Side note: While looking into the instruction set issue I found a great article on the instruction set war between Intel and AMD (and VIA).  It is worth a read/skim:
http://www.agner.org/optimize/blog/read.php?i=25

For the separate 64-bit aspect of my post... Juanito, what exactly did you mean by this:
Quote
In core64, the kernel has been compiled 64-bit to allow the use of more than 4gb of ram, but core64 will not work with 64-bit applications/libraries as-is.

Does that mean that getting python to be able to access > 4GB of memory is not a simple task on core64?  I've certainly not been able to achieve it so far... straight compilation gave me a 32-bit python.  Adding CFLAGS=-m64 didn't work, because the gcc in compiletc.tcz doesn't have 64 bit support and fails immediately.  I don't see a gcc64.tcz, or a compiletc64.tcz, so now I wonder if I'll be able to run any 64-bit apps at all.  But that leaves me completely confused as to what core64 gets you, so my confusion must be more complete than just that :).  I obviously need to look into the 64 bit core in more detail, but have struggled with this quest so far.
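One way to confirm what a given gcc build can do is to try compiling an empty program with the flag in question. The following is only a sketch: cc_accepts is a made-up helper name, and it assumes mktemp is available and that the compiler is reachable as gcc (or through $CC).

```shell
#!/bin/sh
# cc_accepts FLAG...: succeed if the compiler named by $CC (default: gcc)
# can compile and link an empty C program with the given flags.
# Hypothetical helper, not part of compiletc or Tiny Core.
cc_accepts() {
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/probe.c"
    ${CC:-gcc} "$@" "$tmpdir/probe.c" -o "$tmpdir/probe" 2>/dev/null
    status=$?
    rm -rf "$tmpdir"
    return $status
}

# Report which word sizes this compiler (and its installed libs) can target.
for flag in -m32 -m64; do
    if cc_accepts "$flag"; then
        echo "$flag: ok"
    else
        echo "$flag: not supported here"
    fi
done
```

A flag can be rejected either because the compiler was built without that target or because the matching libraries are missing at link time; this probe does not tell the two cases apart.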

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #4 on: April 13, 2012, 12:37:06 PM »
On the core64 thing... is it that core64 enables the OS to support processes that total up to more than 4 GB of RAM, but "as-is" we can only have 32-bit applications due to there only being a support suite of 32-bit libraries easily available?

As a totally crude/artificial example to paraphrase that: is it that core64 could support 3x 32-bit processes, each of which only has access to 4 GB of RAM?  While supporting 64-bit processes that can themselves access > 4GB of RAM would require a full 64-bit support chain of gcc, libs, etc?

I found a related thread with this statement:
There are no 64 bit binaries or libraries.
User space is 32 bit.

and I'm shakily interpreting that as above.  I don't quite get the user space thing or how the 32-bit addressing inside a process would work (if I'm right in my understanding).  I'll need to read up on it.
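To see which side of the 32/64-bit line a particular binary sits on, the ELF header can be read directly: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit files. A rough sketch, where elf_bits is a made-up helper name and dd, od and tr are assumed to behave like their coreutils versions:

```shell
#!/bin/sh
# elf_bits FILE: print 32 or 64 according to the EI_CLASS byte (offset 4)
# of an ELF file; return non-zero if FILE does not start with the ELF magic.
# Hypothetical helper for illustration only.
elf_bits() {
    magic=$(dd if="$1" bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    class=$(dd if="$1" bs=1 skip=4 count=1 2>/dev/null | od -An -tu1 | tr -d ' \n')
    case $class in
        1) echo 32 ;;   # ELFCLASS32
        2) echo 64 ;;   # ELFCLASS64
        *) return 1 ;;
    esac
}
```

Running something like elf_bits /bin/busybox next to uname -m shows the split between kernel and user space: uname -m reports what the kernel was built for, while elf_bits reports what the binary itself was built for.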

This is throwing me back to the days of the "extended" vs "expanded" memory and the BS paging involved in accessing that <shudder>.  Is it similar?  I've not had to deal with memory management like this in a very long time and I've avoided the 32-bit/64-bit thing... the OS just makes memory available to me.  But I've not done any 64-bit development, so I'm now confused.

Offline bmarkus

  • Administrator
  • Hero Member
  • *****
  • Posts: 7183
    • My Community Forum
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #5 on: April 13, 2012, 12:56:41 PM »
qopit: Are you sure you need a 64-bit system?
Béla
Ham Radio callsign: HA5DI

"Amateur Radio: The First Technology-Based Social Network."

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11759
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #6 on: April 13, 2012, 01:02:45 PM »
Hi qopit
Changing  -march  to a more recent processor will only provide a performance gain if the software running on it
makes use of faster instructions available on it. And then, the level of increased performance you get is a function
of what percentage of execution time is spent using those faster instructions. If an application spends most of its time
performing disk I/O and a small amount of time executing instructions that could be optimized, you won't see much
of a performance gain. All the above holds true for applications, libraries, and the kernel.
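
The relationship described above is usually written as Amdahl's law: if a fraction p of the run time is spent in code that becomes s times faster, the overall speedup is 1 / ((1 - p) + p / s). A small sketch to play with the numbers; overall_speedup is a made-up name and the inputs are invented for illustration, not measurements:

```shell
#!/bin/sh
# overall_speedup P S
#   P = fraction of run time spent in code that benefits (0..1)
#   S = how many times faster that code becomes
# Prints the overall speedup according to Amdahl's law.
overall_speedup() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / s) }'
}

overall_speedup 0.10 2   # mostly I/O bound: prints 1.05
overall_speedup 0.90 2   # mostly CPU bound: prints 1.82
```

So doubling the speed of code that accounts for only 10% of the run time buys about 5% overall.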


Online curaga

  • Administrator
  • Hero Member
  • *****
  • Posts: 11065
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #7 on: April 13, 2012, 01:14:38 PM »
Quote
Is it that i486 provides some performance improvements over m32 (which seems to go back to support 386 instruction sets, per the docs), while also going back far enough to meet a minimum supportable hardware platform for Tiny Core?

It's both that, and that glibc also has 486 as the minimum. 386 did not have some atomic instructions, some swap instructions etc. 486 as a baseline also guarantees we run on about 99% of existing x86 hardware. Some eBoxes for example have only 486-capable cpus, despite being 1 GHz.

Quote
For example, I want to make a custom python extension (or even just a portable dir for frugal installs) that I can move around my servers that has presumably better performance than would be obtained with -march=i486 and -mtune=i686, when I am certain that all the machines/CPUs I will be running it on are < 5 years old.

For custom builds, just do a survey around your hw and use the minimum set ;)
If all your hw is indeed <5 years old, and does not include Atoms, they all should be 64-bit capable. x64 requires sse2.

Therefore you could use "-march=pentium-m".
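
A hardware survey along those lines could be as simple as checking the flags line of /proc/cpuinfo on each machine. The sketch below is an assumption-laden illustration: has_cpu_flags and suggest_march are made-up helper names, and the mapping "cmov + mmx + sse + sse2 means pentium-m is safe" is a simplification based on what gcc documents for that -march value.

```shell
#!/bin/sh
# has_cpu_flags FILE FLAG...: succeed only if every FLAG appears on the
# first "flags" line of FILE (same layout as /proc/cpuinfo on x86 Linux).
# Hypothetical helper for illustration only.
has_cpu_flags() {
    cpuinfo=$1
    shift
    flags=$(sed -n '/^flags/{s/^[^:]*://p;q;}' "$cpuinfo" 2>/dev/null)
    for want in "$@"; do
        case " $flags " in
            *" $want "*) ;;      # flag present, keep checking
            *) return 1 ;;       # flag missing
        esac
    done
    return 0
}

# suggest_march FILE: print a conservative -march choice for this machine.
# Falls back to the Tiny Core baseline when any required flag is missing.
suggest_march() {
    if has_cpu_flags "$1" cmov mmx sse sse2; then
        echo "-march=pentium-m"
    else
        echo "-march=i486"
    fi
}
```

Run suggest_march /proc/cpuinfo on every box and build with the lowest answer, for example export CFLAGS="$(suggest_march /proc/cpuinfo) -Os -pipe". Since -march implies the matching -mtune, a separate -mtune is only needed to tune for something newer than the baseline.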

Quote
On the core64 thing... is it that core64 enables the OS to support processes that total up to more than 4 GB of RAM, but "as-is" we can only have 32-bit applications due to there only being a support suite of 32-bit libraries easily available?

Yes.

@Rich:
Recent gcc is quite good in producing optimized code even when you don't explicitly use say SSE.
The only barriers that can stop you are the ones you create yourself.

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #8 on: April 13, 2012, 01:55:21 PM »
bmarkus:
I'm sure I need a 64-bit system since processes will definitely chew up more than 4GB of memory in total at one time.  Upon further investigation/learning it seems I don't need 64-bit applications, though... as long as each process doesn't need to access more than 3GB of memory (which seems to be the userland default for linux... need to check core64, though).

For python, this means splitting tasks up with multiprocessing (or os.fork) vs threads to avoid total combined memory access > 3 GB (no single task will do it).  I'll also be using postgresql, but it is fork-based and each process there shouldn't need more than 3 GB either.

So... it appears that 32-bit apps on a 64-bit core should be fine from a memory perspective.  Or at least so I understand so far :).  If I want to look at the trade-off between use of 64bit instructions and dragging 64-bit pointers everywhere, I'll have to do some benchmarking at some point.

Rich:
Thanks for the comments.  What you say makes perfect sense.  I have no idea what % of the time would be spent using the new/faster instruction sets: it could be high enough to be quite beneficial, or there may be no real point in beefing up the instruction set past what is available on an 80486.  eg: Why not drop down to i386?  The instruction set history shows that not much was added with the 486, at least from an integer instructions perspective.  My guess is that the integrated FPU in the 80486 may have something to do with it, but this guess only makes sense if  -march=i486 includes the i487 instruction set where -march=i386 doesn't include the i387 set... which may make sense since it wasn't integrated.  But then again, not all 486 chips had the FPU either.

If I'm right about -march=i486 being the "first" to provide FPU instructions, it is a clear minimum baseline to require, but has there been nothing beyond that that is of large enough benefit to bump the baseline?

In the end I'm sure I'll stick with the recommended/proven -march=i486 and -mtune=i686, but I figure the questions were worth asking.

At the very least, looking into this has made me have a better understanding of what the Gentoo crowd is about.  It also makes me wonder what the heck the larger distros (Debian, RHEL, or even Windows) do regarding instruction set usage for performance+portability of libs, or how much is abstracted out by the OS-provided libraries.

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #9 on: April 13, 2012, 02:01:58 PM »
Curaga: Awesome.  Thanks!  I read yours after starting/posting my reply #8... I should have edited it based on your post.

Online curaga

  • Administrator
  • Hero Member
  • *****
  • Posts: 11065
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #10 on: April 13, 2012, 02:19:01 PM »
SSE* can have a noticeable benefit, but at the cost of dropping support for older cpus.

GCC on x86 always creates float instructions; it's up to the kernel to emulate them if there is no fpu. We have that disabled in our kernel, since it would just add size when we can have a fpu as a minimum requirement.

Offline gerald_clark

  • TinyCore Moderator
  • Hero Member
  • *****
  • Posts: 4254
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #11 on: April 13, 2012, 04:37:46 PM »
qopit, I for example have several machines that run 486 compatible processors.
Anything not compiled with -march=i486 will segfault.
These are not old machines either.
Many embedded computers use 486-on-a-chip type processors.

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11759
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #12 on: April 14, 2012, 12:45:39 AM »
Hi curaga
I agree, gcc does a fine job when it comes to optimization. Occasionally I will reorder a sequence of instructions
based on the variables they are manipulating to see if gcc will produce smaller output. I'd say about 19 times out
of 20 there is no change because the optimization routines are that good.

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11759
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #13 on: April 14, 2012, 01:49:03 AM »
Hi qopit
When I'm looking for performance in a program I'm working on, compiler flags are the last thing I look at. In fact, even
though I've played with them on occasion out of curiosity, I always use  -march=i486 -mtune=i686 -Os -Wall in the end.
If I want more performance, I look at the program to see which sections lend themselves to optimization. While gcc does
a good job of generating optimized code, a careful review of program flow in many cases will produce a much greater
performance boost than any compiler switches will. All it takes is a suboptimal algorithm, loop, or even worse, compound
loop, to produce a significant impact on performance.

Offline qopit

  • Jr. Member
  • **
  • Posts: 81
Re: Compiler flags and semi-portable extensions for "newer" pcs?
« Reply #14 on: April 14, 2012, 08:18:50 AM »
Rich:
Code quality/structure is clearly FAR more important than incremental benefits obtainable with compiler optimization.  I'm mostly interested in the "best" (for performance AND compatibility) compiler settings from an academic/curiosity perspective, as well as for code which is "done" or that I don't have control over myself.

This thread, and associated side research, has been very informative on that front.  Thanks, all!