On Fri, Mar 20, 2015 at 6:58 AM, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> wrote:
> On Mar 19, 2015, at 9:24 PM, Tyler Baker <tyler.baker@linaro.org> wrote:
>
> FYI, not sure if you are on 96boards dev list

Thanks Tyler!

[I'm not on the mailing list, so will have to reply in a new thread.]

>
> ---------- Forwarded message ----------
> From: Jerome Forissier <jerome.forissier@linaro.org>
> Date: 19 March 2015 at 11:18
> Subject: [Dev] HiKey: ARM TF BL1 hangs when compiled with GCC 4.9
> To: dev <dev@lists.96boards.org>
>
>
> Hi all,
>
> I am running the ARM Trusted Firmware on my HiKey board. I found that BL1 hangs if I build it with the version of GCC that comes with my Ubuntu 14.10 distribution [1] or with another 4.9 build from Linaro [2]. However, it works as expected if I use the 4.8 Linaro build [3] as recommended on the HiKey UEFI wiki [4].
>
> Basically I found two separate issues, and I'm not sure if they are bugs in GCC or HiKey ATF. Here is the story...
>
> With GCC 4.9 [1], the boot hangs, LED#2 blinks and "00000000f20003e8" is printed on UART0 about every second. Let's call this bug #1. The hang occurs in hi6220_pll_init() [5]; execution never gets past this line:
>     mmio_write_32(0x0, 0xa5a55a5a);
>
> So I checked the objdump outputs (bl1.dump).
>
> - Working compiler [3] gives:
>     f98041f0:   d2800000        mov     x0, #0x0                        // #0
>     f98041f4:   528b4b41        mov     w1, #0x5a5a                     // #23130
>     f98041f8:   72b4b4a1        movk    w1, #0xa5a5, lsl #16
>     f98041fc:   b9000001        str     w1, [x0]
>
> - Bad compiler [1] produces:
>     f9804184:   d2800000        mov     x0, #0x0                        // #0
>     f9804188:   b900001f        str     wzr, [x0]
>     f980418c:   d4207d00        brk     #0x3e8
>
> What?! Is there some kind of smart detection in the compiler assuming that one shouldn't write to address zero?

Yes, this is exactly correct.  The compiler assumes that it is compiling normal user-level application code, and treats a reference to the null address as undefined behavior, which it is free to optimize as it sees fit.  In this case the compiler seems to have decided that the value in the w1 register is never used, because it is live only on the path that leads to the write to null.  Note that the compiler still kept the write itself, to keep parity in thrown exceptions between the original and the optimized code.

Any code that references the null address as a valid memory location should be compiled with the -fno-delete-null-pointer-checks flag.  This flag is often added automatically by toolchains that target bare metal, and I'm guessing you are using a "normal" aarch64-linux-gnu toolchain.
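
For illustration, here is a minimal sketch of the kind of access involved. The helper name sketch_mmio_write_32 is hypothetical; it is only modelled on the mmio_write_32() call quoted above and is not copied from the ATF sources.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for an MMIO write helper (an assumption, not the
 * ATF implementation). The volatile qualifier stops the store from being
 * dropped as dead code, but it does not by itself tell the compiler that
 * address zero is a valid location.
 */
static inline void sketch_mmio_write_32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/*
 * The pattern from hi6220_pll_init() as quoted in this thread: a store of
 * 0xa5a55a5a to address 0x0. Without -fno-delete-null-pointer-checks the
 * compiler may treat this path as undefined and replace it with a trap.
 * This function is only meant to be called on a target where address 0 is
 * actually mapped.
 */
static inline void sketch_write_magic_to_zero(void)
{
    sketch_mmio_write_32((uintptr_t)0x0, 0xa5a55a5aU);
}
```

Built with something along the lines of "aarch64-linux-gnu-gcc -O2 -ffreestanding -fno-delete-null-pointer-checks", the store of the magic value should be emitted as in the "working compiler" listing rather than being turned into a brk.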

Yes, I am using the "normal" (non-bare-metal) toolchain, with the -ffreestanding flag. Now I understand what's happening.

>
> If I change address to 0x4, the code goes past this location but later hangs with the same LED status as above (b0100) and code "0000000096000021" on the console. This is bug #2.
>
> I tracked it down to the initialization of some structures on the stack when entering usb_handle_control_request() [6]. Looks like an alignment issue, since removing the packed attribute on struct usb_endpoint_descriptor [7] fixes the bug.
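
One way to cross-check the alignment theory is to decode the code printed on the console. The sketch below assumes (this is an assumption, the thread does not say so) that "0000000096000021" is an AArch64 exception syndrome (ESR_ELx) value. Under that assumption, bits [31:26] give the exception class and, for a data abort, bits [5:0] give the fault status code. The function and macro names are made up for this example.

```c
#include <stdint.h>

/* Field values from the ARMv8-A ESR_ELx layout. */
#define SKETCH_EC_DATA_ABORT_SAME_EL 0x25u /* data abort, no change in EL */
#define SKETCH_DFSC_ALIGNMENT_FAULT  0x21u /* alignment fault */

/* Exception class: ESR bits [31:26]. */
static inline unsigned sketch_esr_ec(uint64_t esr)
{
    return (unsigned)((esr >> 26) & 0x3fu);
}

/* Data fault status code: ESR bits [5:0] (meaningful for data aborts). */
static inline unsigned sketch_esr_dfsc(uint64_t esr)
{
    return (unsigned)(esr & 0x3fu);
}

/* Returns 1 if the syndrome describes a same-EL data abort caused by an
 * alignment fault, 0 otherwise. */
static inline int sketch_esr_is_alignment_fault(uint64_t esr)
{
    return sketch_esr_ec(esr) == SKETCH_EC_DATA_ABORT_SAME_EL &&
           sketch_esr_dfsc(esr) == SKETCH_DFSC_ALIGNMENT_FAULT;
}
```

If the printed value really is a syndrome register, 0x96000021 decodes to a data abort with an alignment fault status, which would be consistent with the packed-struct observation above.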


Let us (Linaro TCWG) know if this doesn't go away with -fno-delete-null-pointer-checks.  The best way is to file a bug against the GCC product at bugs.linaro.org.

-fno-delete-null-pointer-checks does make the problem go away. So no compiler issue here ;-)

Thanks for the explanation.

-- 
Jerome