Discussion:
What -mfpu option is used with neon, vfpv3 and vfpd32 flag?
Jeffrey Walton
2016-07-22 01:33:38 UTC
Hi Everyone,

I'm looking at the features of a BeagleBone Black. Its /proc/cpuinfo is below.

I think the vfpd32 cpu flag means I have 32 D-registers. The neon and
vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16,
but I'm not sure what that is.

My question is, what GCC ARM option is used when we encounter the
neon, vfpv3 and vfpd32 flags?

Thanks in advance.

**********

$ cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 996.14
Features : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc08
CPU revision : 2

Hardware : Generic AM33XX (Flattened Device Tree)
Revision : 0000
Serial : 0000000000000000
Jim Wilson
2016-07-22 03:30:16 UTC
Post by Jeffrey Walton
I think the vfpd32 cpu flag means I have 32 D-registers. The neon and
vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16,
but I'm not sure what that is.
neon implies vfpv3 and 32 D-registers and asimd/neon support, so that
part is correct. It isn't obvious to me whether you have the
half-precision float support. The "half" printed by the kernel means
that half-word loads are supported, which is only false for some
obsolete parts I think. The kernel doesn't appear to be checking to
see if the hardware has half-precision float support or not, so you
can't determine that from /proc/cpuinfo.
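
For anyone who wants to check these flags from a script, here is a minimal Python sketch that pulls the flag set out of the Features line. The helper name cpu_features and the SAMPLE string are made up for illustration; the sample text is copied from the /proc/cpuinfo output at the top of the thread. Note that finding "half" here says nothing about half-precision float, as explained above.

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


# Sample copied from the BeagleBone Black output earlier in the thread.
SAMPLE = """\
processor : 0
model name : ARMv7 Processor rev 2 (v7l)
Features : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU part : 0xc08
"""

if __name__ == "__main__":
    flags = cpu_features(SAMPLE)
    # "half" means half-word loads, not half-precision float.
    for name in ("neon", "vfpv3", "vfpd32", "vfpv4"):
        print(name, name in flags)
```

On a real board you would pass the contents of /proc/cpuinfo instead of SAMPLE.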

Jim
Jeffrey Walton
2016-07-22 03:48:31 UTC
Post by Jim Wilson
Post by Jeffrey Walton
I think the vfpd32 cpu flag means I have 32 D-registers. The neon and
vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16,
but I'm not sure what that is.
neon implies vfpv3 and 32 D-registers and asimd/neon support, so that
part is correct. It isn't obvious to me whether you have the
half-precision float support. The "half" printed by the kernel means
that half-word loads are supported, which is only false for some
obsolete parts I think. The kernel doesn't appear to be checking to
see if the hardware has half-precision float support or not, so you
can't determine that from /proc/cpuinfo.
Thanks Jim.

Is there an arm-msr-tools or similar that has setuid so we can access the MSRs?

My thinking is, I can tell people to install arm-msr-tools so we can
query for the features directly. I want to avoid telling people to run
a test script as root.

Jeff
Jim Wilson
2016-07-22 04:15:01 UTC
Post by Jeffrey Walton
Is there an arm-msr-tools or similar that has setuid so we can access the MSRs?
I'm not familiar with any such tool, but I haven't looked for one
before. I found an x86 msr-tools project at github with a web search.
It seems to be a standard part of debian/ubuntu x86 distros. I don't
see an obvious arm equivalent.

You could ask the kernel developers to add a hardware capability
(hwcap) check for half-precision fp and emit that info into the
/proc/cpuinfo file, though it would take a while for that to be
implemented and propagate to your users.
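
As a rough illustration of how the existing hwcap bits can be read without root, the sketch below decodes the AT_HWCAP entry from an auxiliary vector (on Linux a process can read its own from /proc/self/auxv). Everything here is an assumption to verify against your kernel headers: the bit positions are recalled from the 32-bit ARM uapi asm/hwcap.h, the helper names are invented, and the parser assumes a little-endian layout. It cannot report half-precision support, since no such bit is decoded here.

```python
import struct

AT_HWCAP = 16  # auxv key for the hardware capability word

# Assumed bit positions for 32-bit ARM; check asm/hwcap.h on your system.
ARM_HWCAP_BITS = {
    "half": 1 << 1,
    "vfp": 1 << 6,
    "neon": 1 << 12,
    "vfpv3": 1 << 13,
    "vfpv3d16": 1 << 14,
    "vfpv4": 1 << 16,
    "vfpd32": 1 << 19,
}


def hwcap_from_auxv(data, word_size=4):
    """Scan raw auxv bytes (little-endian key/value pairs) for AT_HWCAP."""
    fmt = "<II" if word_size == 4 else "<QQ"
    entry = struct.calcsize(fmt)
    for offset in range(0, len(data) - entry + 1, entry):
        key, value = struct.unpack_from(fmt, data, offset)
        if key == 0:  # AT_NULL terminates the vector
            break
        if key == AT_HWCAP:
            return value
    return None


def decode_hwcap(hwcap):
    """Return the names of the assumed ARM hwcap bits set in hwcap."""
    return {name for name, bit in ARM_HWCAP_BITS.items() if hwcap & bit}


if __name__ == "__main__":
    # Synthetic auxv with neon, vfpv3 and vfpd32 set, then AT_NULL.
    word = (1 << 12) | (1 << 13) | (1 << 19)
    fake = struct.pack("<II", AT_HWCAP, word) + struct.pack("<II", 0, 0)
    print(sorted(decode_hwcap(hwcap_from_auxv(fake))))
```

On a 32-bit ARM board you would feed it open("/proc/self/auxv", "rb").read() instead of the synthetic bytes.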

Jim
Jeffrey Walton
2016-07-22 04:13:11 UTC
Post by Jim Wilson
Post by Jeffrey Walton
I think the vfpd32 cpu flag means I have 32 D-registers. The neon and
vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16,
but I'm not sure what that is.
neon implies vfpv3 and 32 D-registers and asimd/neon support, so that
part is correct. It isn't obvious to me whether you have the
half-precision float support. The "half" printed by the kernel means
that half-word loads are supported, which is only false for some
obsolete parts I think. The kernel doesn't appear to be checking to
see if the hardware has half-precision float support or not, so you
can't determine that from /proc/cpuinfo.
OK, so looking at
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0002a/ch01s03s02.html,
it appears the "minimum" for the -mfpu option is VFPv3-D16. Since I
have 32 D-regs, I can use VFPv3-D32, which should equate to
-mfpu=neon-vfp3 (which does not seem to exist).

I can't use -mfpu=neon-vfpv4 because vfpv4 is not signaled, so the
board could be missing the half-precision and fma extensions implied by vfpv4.

So I guess the question is, what do I use for -mfpu=neon-vfp3 (or
-mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?

Thanks again for the help with this.
Jim Wilson
2016-07-22 04:19:18 UTC
Post by Jeffrey Walton
So I guess the question is, what do I use for -mfpu=neon-vfp3 (or
-mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.
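
The rule of thumb from this thread can be written down as a small Python heuristic. This is only a sketch: suggest_mfpu is a made-up helper, the mapping beyond the neon case is my own assumption about which existing GCC -mfpu names fit which kernel flags, and it deliberately never picks an -fp16 variant because /proc/cpuinfo does not tell us whether half-precision is present.

```python
def suggest_mfpu(flags):
    """Map kernel feature flags to a conservative GCC -mfpu value, or None."""
    flags = set(flags)
    if "neon" in flags:
        # neon already implies vfpv3 and 32 D registers.
        return "neon-vfpv4" if "vfpv4" in flags else "neon"
    if "vfpv4" in flags:
        return "vfpv4" if "vfpd32" in flags else "vfpv4-d16"
    if "vfpv3" in flags or "vfpv3d16" in flags:
        return "vfpv3" if "vfpd32" in flags else "vfpv3-d16"
    if "vfp" in flags:
        return "vfp"
    return None


if __name__ == "__main__":
    # Features line from the BeagleBone Black in this thread.
    bbb = "half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32".split()
    print("-mfpu=" + suggest_mfpu(bbb))
```

For the BeagleBone Black flags this yields neon, which matches the answer above.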

Jim
Jeffrey Walton
2016-07-22 04:21:08 UTC
Post by Jim Wilson
Post by Jeffrey Walton
So I guess the question is, what do I use for -mfpu=neon-vfp3 (or
-mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.
Perfect, thanks.

Jeff
Richard Earnshaw
2016-07-22 09:14:44 UTC
Post by Jeffrey Walton
Post by Jim Wilson
Post by Jeffrey Walton
So I guess the question is, what do I use for -mfpu=neon-vfp3 (or
-mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.
Perfect, thanks.
Jeff
_______________________________________________
linaro-toolchain mailing list
https://lists.linaro.org/mailman/listinfo/linaro-toolchain
According to https://beagleboard.org/black, this board contains a
Cortex-A8. So -mfpu=neon is correct.
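
The same conclusion can be cross-checked from the "CPU implementer" and "CPU part" lines of the /proc/cpuinfo output at the top of the thread. The Python sketch below is illustrative only: mcpu_from_cpuinfo is a hypothetical helper, and the part-number table is a small subset recalled from ARM's documentation, so treat the entries as assumptions to verify.

```python
ARM_LTD = 0x41  # "CPU implementer" value for ARM Ltd.

# Assumed subset of ARM part numbers mapped to GCC -mcpu names.
CORTEX_A_PARTS = {
    0xC05: "cortex-a5",
    0xC07: "cortex-a7",
    0xC08: "cortex-a8",
    0xC09: "cortex-a9",
    0xC0F: "cortex-a15",
}


def mcpu_from_cpuinfo(text):
    """Return a GCC -mcpu name for the first CPU listed, or None if unknown."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    try:
        implementer = int(fields["CPU implementer"], 16)
        part = int(fields["CPU part"], 16)
    except (KeyError, ValueError):
        return None
    if implementer != ARM_LTD:
        return None
    return CORTEX_A_PARTS.get(part)


if __name__ == "__main__":
    sample = "CPU implementer : 0x41\nCPU architecture: 7\nCPU part : 0xc08\n"
    print(mcpu_from_cpuinfo(sample))
```

With the values posted for this board (implementer 0x41, part 0xc08) the lookup gives cortex-a8, so the command line would combine -mcpu=cortex-a8 with -mfpu=neon.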


https://community.arm.com/groups/tools/blog/2013/04/15/arm-cortex-a-processors-and-gcc-command-lines

R.