New subject: HiKey: ARM TF BL1 hangs when compiled with GCC 4.9

20 Mar 2015


      ...
On Mar 19, 2015, at 9:24 PM, Tyler Baker tyler.baker@linaro.org wrote:
FYI, not sure if you are on 96boards dev list
Thanks Tyler!
[I'm not on the mailing list, so will have to reply in a new thread.]
...
---------- Forwarded message ----------
From: Jerome Forissier jerome.forissier@linaro.org
Date: 19 March 2015 at 11:18
Subject: [Dev] HiKey: ARM TF BL1 hangs when compiled with GCC 4.9
To: dev dev@lists.96boards.org
Hi all,
I am running the ARM Trusted Firmware on my HiKey board. I found that BL1 hangs if I build it with the version of GCC that comes with my Ubuntu 14.10 distribution [1] or with another 4.9 build from Linaro [2]. However, it works as expected if I use the 4.8 Linaro build [3] as recommended on the HiKey UEFI wiki [4].
Basically I found two separate issues, and I'm not sure if they are bugs in GCC or HiKey ATF. Here is the story...
With GCC 4.9 [1], the boot hangs, LED#2 blinks and "00000000f20003e8" is printed on UART0 about every second. Let's call this bug #1. The hang occurs in hi6220_pll_init() [5]; execution never gets past this line:
    mmio_write_32(0x0, 0xa5a55a5a);
So I checked the objdump outputs (bl1.dump).

Working compiler [3] gives:
  f98041f0:   d2800000        mov     x0, #0x0                        // #0
  f98041f4:   528b4b41        mov     w1, #0x5a5a                     // #23130
  f98041f8:   72b4b4a1        movk    w1, #0xa5a5, lsl #16
  f98041fc:   b9000001        str     w1, [x0]

Bad compiler [1] produces:
  f9804184:   d2800000        mov     x0, #0x0                        // #0
  f9804188:   b900001f        str     wzr, [x0]
  f980418c:   d4207d00        brk     #0x3e8


What?! Is there some kind of smart detection in the compiler assuming that one shouldn't write to address zero?
Yes, this is exactly correct. The compiler assumes that it is compiling normal user-level application code, and treats a dereference of the null address as undefined behavior, which it is free to optimize as it sees fit. In this case the compiler seems to have decided that the value in the w1 register is never used, because it is live only on the path that leads to the write-to-null. Note that the compiler still kept the write itself, to keep parity in thrown exceptions between the original and optimized code.
Any code that references the null address as a valid memory location should use the -fno-delete-null-pointer-checks compiler flag. This flag is often added automatically for toolchains that target bare-metal, and, I'm guessing, you are using a "normal" aarch64-linux-gnu toolchain.
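To make the pattern concrete, here is a minimal sketch, assuming mmio_write_32 is a thin inline wrapper around a volatile 32-bit store (the actual HiKey ATF definition may differ). Once such a helper is inlined with a constant address of 0, the compiler sees a store through a null pointer, which is the case the flag above is meant to cover. The write_and_read_back helper is hypothetical; it only exists so the sketch can be exercised on a hosted machine against an ordinary variable rather than address 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the MMIO helper; the real ATF definition may differ. */
static inline void mmio_write_32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/*
 * Hypothetical hosted self-check: writes through the helper to a local
 * variable that stands in for a device register, then reads it back.
 * On the board, the call in question is mmio_write_32(0x0, 0xa5a55a5a),
 * which only makes sense when address 0 is valid memory on the target.
 */
static uint32_t write_and_read_back(uint32_t value)
{
    uint32_t fake_reg = 0;

    mmio_write_32((uintptr_t)&fake_reg, value);
    assert(fake_reg == value);
    return fake_reg;
}
```

For the firmware build itself, the change would presumably be to add -fno-delete-null-pointer-checks to the compiler flags; where that belongs in the ATF makefiles is not covered here.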
...
If I change address to 0x4, the code goes past this location but later hangs with the same LED status as above (b0100) and code "0000000096000021" on the console. This is bug #2.
I tracked it down to the initialization of some structures on the stack when entering usb_handle_control_request() [6]. Looks like an alignment issue, since removing the packed attribute on struct usb_endpoint_descriptor [7] fixes the bug.
Let us (Linaro TCWG) know if this doesn't go away with -fno-delete-null-pointer-checks. The best way is to file a bug against the GCC product at bugs.linaro.org.
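On the alignment suspicion for bug #2, the sketch below shows what the packed attribute changes for a structure shaped like a USB endpoint descriptor. The field list follows the standard USB endpoint descriptor and is an assumption here; the struct in the HiKey ATF tree [7] may be declared differently, and the struct and function names are illustrative. With packing, the size drops from 8 to 7 bytes and the required alignment from 2 to 1, so the 16-bit field is no longer guaranteed to sit on a 2-byte boundary when such a struct is placed on the stack or in an array. Whether that is what triggers the hang is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout (standard USB endpoint descriptor), packed. */
struct ep_desc_packed {
    uint8_t  bLength;
    uint8_t  bDescriptorType;
    uint8_t  bEndpointAddress;
    uint8_t  bmAttributes;
    uint16_t wMaxPacketSize;
    uint8_t  bInterval;
} __attribute__((packed));

/* Same fields with natural alignment, for comparison. */
struct ep_desc_natural {
    uint8_t  bLength;
    uint8_t  bDescriptorType;
    uint8_t  bEndpointAddress;
    uint8_t  bmAttributes;
    uint16_t wMaxPacketSize;
    uint8_t  bInterval;
};

/* Checks the size and alignment difference described above. */
static void check_layout(void)
{
    assert(sizeof(struct ep_desc_packed) == 7);
    assert(_Alignof(struct ep_desc_packed) == 1);
    assert(sizeof(struct ep_desc_natural) == 8);
    assert(_Alignof(struct ep_desc_natural) == 2);
}
```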
...
So... What kind of bugs do you guys think we have here, and who should I report them to?
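One way to narrow down the kind of fault: both console codes are consistent with AArch64 exception syndrome (ESR_ELx) values, though the thread does not say which register the firmware prints, so treat that as an assumption. Under that reading, the sketch below splits each value into the EC, IL and ISS fields of the ARMv8-A layout. 0xf20003e8 gives EC 0x3c (BRK instruction) with immediate 0x3e8, which matches the brk #0x3e8 in the bad disassembly, and 0x96000021 gives EC 0x25 (data abort taken without a change in exception level) with fault status 0x21 (alignment fault). The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* ESR_ELx layout (ARMv8-A): EC = bits [31:26], IL = bit 25, ISS = bits [24:0]. */
static unsigned esr_ec(uint64_t esr)  { return (unsigned)((esr >> 26) & 0x3f); }
static unsigned esr_il(uint64_t esr)  { return (unsigned)((esr >> 25) & 0x1); }
static unsigned esr_iss(uint64_t esr) { return (unsigned)(esr & 0x1ffffff); }

/* Checks the two console values against the decoding described above. */
static void check_console_codes(void)
{
    const uint64_t bug1 = 0x00000000f20003e8ULL;  /* printed with GCC 4.9, bug #1 */
    const uint64_t bug2 = 0x0000000096000021ULL;  /* printed for bug #2 */

    assert(esr_ec(bug1) == 0x3c);               /* BRK instruction (AArch64) */
    assert((esr_iss(bug1) & 0xffff) == 0x3e8);  /* BRK immediate */
    assert(esr_ec(bug2) == 0x25);               /* data abort, same exception level */
    assert((esr_iss(bug2) & 0x3f) == 0x21);     /* fault status: alignment fault */
    assert(esr_il(bug1) == 1 && esr_il(bug2) == 1);
}
```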
[1] aarch64-linux-gnu-gcc (Ubuntu/Linaro 4.9.1-16ubuntu6) 4.9.1
[2] aarch64-linux-gnu-gcc (Linaro GCC 2014.11) 4.9.3 20141031 (prerelease)
[3] aarch64-linux-gnu-gcc (crosstool-NG linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04) 4.8.3 20140401 (prerelease)
[4] https://github.com/96boards/documentation/wiki/UEFI
[5] https://github.com/96boards/arm-trusted-firmware/blob/bbd623798cb775c4c0445c...
[6] https://github.com/96boards/arm-trusted-firmware/blob/bbd623798cb775c4c0445c...
[7] https://github.com/96boards/arm-trusted-firmware/blob/bbd623798cb775c4c0445c...
--
Maxim Kuvyrkov
www.linaro.org
