
More ODROID adventure, or "Young man, in this household we obey Ohm's Law"

Wed 9 Apr 2014 by mskala

As of last update, I had given up on setting up the ODROID U3 without connecting a monitor to it, and was shopping for a micro-HDMI cable. Here are some further notes.

Finding the micro-HDMI cable proved to be easy. (London Drugs had it, $6.99.) But I found that what I thought was an HDMI monitor at school, wasn't, and this cable wasn't all I needed. I ended up taking a crash course in modern computer video standards.

Subsequent to SVGA, which was the last time I paid close attention to this stuff, it seems that we've been subjected to DVI, HDMI, and DisplayPort, in that order. The monitor at school that I thought was HDMI was actually a DVI monitor fed by a computer with a DisplayPort port, through an adapter cable. These three standards are sufficiently backward compatible that a more recent computer can drive an older monitor through a passive adapter; but going in the other direction (for instance, an HDMI computer trying to drive a DisplayPort monitor) may require more elaborate electronics, as may interfacing any of these to SVGA. DVI has about five different versions, but they're fortunately all compatible enough that I didn't have to worry about that; and HDMI has at least regular and "micro" versions, probably also a "mini," which are more or less compatible electrically except that the connectors need to be adapted.

When all the shouting was over I ended up with a regular-HDMI to some-flavour-of-DVI adapter - one with a female regular-HDMI connection on it, which apparently is not the most common way for this type of adapter to be built. That plus the micro-HDMI to regular-HDMI cable I already had add up to being able to plug the ODROID into a DVI monitor; and my monitor at home has a DVI port, so I don't even need to drag the whole works to school anymore. While I was shopping for these I also bought a dollar-store $3 USB keyboard, because although I have a couple of spare keyboards at home, they're all mini-PS/2.

And, while I was at it, I picked up a new 8G microSD card, because I was fed up with the bus resets from the Hardkernel-supplied one, and didn't want microSD card misbehaviour to muddy the waters of the other problems I was debugging. It is still on my to-do list to go through my receipts and figure out how much all these extra purchases have cost me. More than the price of the ODROID U3 itself, that much is certain.

I didn't keep detailed notes, and don't really want to go through blow by blow all the subsequent debugging I did, but I'll hit a few highlights.

As I'd guessed, and despite assurances to the contrary from Hardkernel support, Hardkernel's XUbuntu operating system image requires configuration done with a local keyboard and monitor before it will be able to operate headless. It seems to work as well as any other OS image I've tried so far, apart from that.

I mentioned the "Quiet Giant" Debian image last time. Another one I tried, despite my dissatisfaction with Arch Linux, was Mani Dhillon's ARM Arch image. I tried several OS images because of the other problems I'm about to describe.

"ARM" isn't just one architecture, in the sense of a "uname -m" result. Just as a Linux PC may be x86_64 or i386 (or probably a few others), a machine called "ARM" may be any of several different things. And as a 64-bit PC processor can also run in 32-bit mode, which is only partially compatible with 64-bit mode, a recent ARM processor can also run in modes that are compatible with earlier ARM processors, and these different modes are at best partially compatible with each other. It matters which one your kernel and userland are built for.

As far as I can tell, the main difference between the different architecture strings (values of "uname -m") that my hardware could potentially support, is not word length but something called "hard float" or "soft float." Like really old PC processors, some ARM processors do not contain floating point hardware and must emulate it. The difference between "hard float" and "soft float" is not just whether floating-point is done in hardware or software; that would be too easy. Actually the issue is whether floating-point arguments to functions are passed in the hardware FPU registers (which you can't do unless you have an FPU) or in the CPU's general-purpose integer registers, which doesn't require an FPU but entails extra work moving things back and forth between the CPU and the FPU. Floating point emulation, if necessary, is handled more or less transparently and doesn't require specially compiled software. Even if you think you will seldom, or never, need to pass floating point numbers as arguments to functions, there are various other incompatible changes in the ABI between "hard float" and "soft float"; the floating-point thing is the main but not the only difference between them.

The ODROID U3 is capable of doing hard float, and hard float is better when you can use it, but (despite the insistence of some people who don't really know what they're talking about) hard float is not intrinsically better by a big enough margin for it to be a big deal. The bigger deal for me is that since I want to start with some other distro's OS image (and, importantly, kernel) and then overwrite it with a Slackware userland, both distributions must be the same architecture for that to work. That's why I went through several OS images - I was looking for one that would be compatible with Slackware ARM, which is soft float. I subsequently discovered Alien's ARM, which is Slackware ARM recompiled for hard float, but it's even more experimental and hackish than the other stuff I'm already facing.

Architecture strings I've seen for my hardware apparently include at least "armhf," "armv7l," and "armv7hl." I don't know which of these are hard or soft float, I've been warned that the Linux kernel sometimes lies about its own architecture on ARM, and I have not been able to find anywhere a list of possible ARM architecture names, the differences between them, or their cross-compatibility with each other.
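One way to sidestep the naming confusion is to ask the binaries themselves rather than the kernel. The following is a minimal Python sketch, not something I ran as part of the debugging above. It assumes the binary is a 32-bit ARM ELF file built with a toolchain that records the float ABI in the ELF header flags (the 0x200 and 0x400 bits, as defined in glibc's elf.h for EABI version 5). Older toolchains may leave both bits clear, in which case it can only report "unmarked". The function names are my own invention for illustration.

```python
import struct

EM_ARM = 40                      # e_machine value for 32-bit ARM
EF_ARM_ABI_FLOAT_SOFT = 0x200    # e_flags bit: soft-float calling convention
EF_ARM_ABI_FLOAT_HARD = 0x400    # e_flags bit: hard-float calling convention


def arm_float_abi(header: bytes) -> str:
    """Classify the first 40 bytes of a file as hard float, soft float, or neither."""
    if len(header) < 40 or header[:4] != b"\x7fELF":
        return "not ELF"
    if header[4] != 1:           # EI_CLASS: 1 means a 32-bit ELF file
        return "not 32-bit ARM"
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 means little-endian
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine != EM_ARM:
        return "not 32-bit ARM"
    # In a 32-bit ELF header, e_flags sits at byte offset 36.
    (flags,) = struct.unpack_from(endian + "I", header, 36)
    if flags & EF_ARM_ABI_FLOAT_HARD:
        return "hard float"
    if flags & EF_ARM_ABI_FLOAT_SOFT:
        return "soft float"
    return "unmarked"


def check_file(path: str) -> str:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return arm_float_abi(f.read(40))
```

Running check_file on something like /bin/ls from each mounted OS image (a path chosen only as an example) would give a quick hint as to whether two userlands are likely to be compatible, without booting either of them.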

All the OS images I've tried are fragile with respect to unclean shutdowns. If the system goes down due to power loss (or overheating, or general instability, see below...) it's likely to leave the microSD "disk" in such a state that on the next reboot, it will demand user input from the keyboard to proceed, whether the keyboard exists or not. This is, of course, out of the question in an embedded application, and in fact it makes me wonder about my whole idea of using Linux in this synth project at all. I think in order to have satisfactory stability I'm going to have to either switch to something completely different from Linux, or (more likely my choice) organize things so that the system continues running out of its initial RAM disk indefinitely, with all the important "disk" material in read-only mounts, instead of proceeding on the normal trajectory that starts with a RAM disk and then replaces it with a read-write mounted disk partition. I can't afford to mount anything read-write that is required to bring up the system, because it could become inoperable on power loss. And unless I can find an embedded Linux distribution actually intended for real-world use which has already solved this problem, it means I have to hack the distribution's startup scripts fairly deeply, which in turn means I have to acquire a solid understanding of the wacky ARM boot sequence.

There is a known issue involving the CPU temperature sensor. It can become, in one of my fellow users' words, "stucked" at 50 degrees. The sensor just keeps reporting 50 degrees regardless of the actual temperature. That is a bigger problem than it may sound, because (especially when using the supplied fanless heat sink, which I've been told 10000 times is adequate) the temperature sensor is mission-critical. This board depends heavily on CPU frequency scaling to control its overall heat output. The way things are supposed to work is that as it gets warmer, it automatically slows down, which makes it cool off, and it never overheats. With a stucked sensor the CPU never slows down, so it becomes hotter and hotter until it does overheat and crash, which is an unclean shutdown, which means it is unlikely to come up again without user intervention on the next boot. This happened to me the first time that I got it running in a headless configuration for any significant length of time. Fortunately, it appears that although Hardkernel have not been able to reproduce the problem reliably, the problem is associated with the HDMI connection (specifically, plugging and unplugging it during boot) and since I won't be doing that in actual operation, it's unlikely to affect me once I get past the debug phase.

There is a known issue involving "leakage current" through the ODROID U3 HDMI connection. Apparently HDMI monitors send a non-trivial amount of their own power back at the computer through the cable, and this confuses the power control circuitry on the ODROID U3 sufficiently that if you plug an HDMI monitor into the ODROID before you plug in the ODROID's power supply, then when you do plug in the ODROID's power supply, it will not boot; but its red power LED will shine dimly from the HDMI-supplied current, even before native power is supplied. You have to then manually press the power button built into the ODROID to make it boot. Hardkernel suggests manually soldering a resistor and capacitor onto the board to basically simulate a power-button press when power is applied, if you have what they describe as one of the "few industrial applications" that require the computer to actually work. I've observed this issue, but fortunately, I don't plan to use it with an HDMI monitor in normal operation, so this too will be irrelevant to me once I get it configured. This issue is strangely reminiscent of another known issue in which power coming into the ODROID through its USB ports can not just confuse, but destroy, a power controller chip. The suggested solution to that one involves removing (or at least cutting pins on) the surface-mount chip in question, and adding a jumper - a modification they did at the factory for my board. Next thing, I suppose we'll have the power controller chip confused by power coming in through the power connection.

Funnily enough...

Throughout these adventures I struggled with the general unreliability of the boot process. Even with a perfectly good OS image, freshly fscked, at the best of times when I'd connect power there would be only about a 20% chance that the ODROID would actually boot. In a good boot, the red power light comes on as soon as power is applied, and the blue "heartbeat" LED comes on dimly a split second later. Then after a second or so the heartbeat LED starts to flash brightly - pairs of two bright flashes that look like they're separated by about 0.1 second between the flashes in a pair and 1.0 second between pairs. The green and yellow lights on the Ethernet port are also supposed to come on, and blink according to Ethernet traffic. In a bad boot, it goes through the dim-blue stage and usually gets in one or two bright blue flashes, but then the blue LED goes off and stays off. The Ethernet lights never come on. And examining the entrails on the microSD card afterward, sometimes the filesystem has errors, sometimes it's uncleanly unmounted but has no errors, and sometimes it is cleanly unmounted (most likely, never mounted in the first place). The overall story seems to be that it is getting partway into the boot and then just dying. It survives for a variable length of time - sometimes it stops before it goes read-write on the filesystem, sometimes after, and occasionally it survives all the way into the network initialization and can accept SSH connections. This, of course, isn't a good way for an embedded system, or any system, to operate.

While we're talking about heartbeats, of course the "Heartbleed" news broke in the middle of all this too and I had to spend a bunch of time running around securing everything. Stay safe, kids.

So. It doesn't help at all that a bad software image (such as soft float Slackware overlaid on hard float Ubuntu, which was one of the things I tried before I learned the difference between soft float and hard float) will also produce very similar symptoms to these. So when faced with a bad boot, I have no idea whether it's the software (in which case it'll continue to fail no matter how many times I try) or the apparently hardware-related issue that boots fail about 80% of the time regardless of the software, in which case it might work if I tried again. And each trial requires transferring the microSD card back to the desktop computer to fsck the filesystem. All this made debugging rather stressful.

One interesting thing I noticed was that successive failed boots in a short interval would fail faster. If the first one got off five heartbeat flashes before dying and I pressed the power button to try again, the second might only give three flashes, and the third attempt only two flashes. Waiting several minutes before trying again seemed to improve the chances of a good boot. That sure looked like something accumulating from one boot attempt to the next, but decaying over longer time periods, and given that I'm not operating a nuclear fission reactor here, there aren't many things that could accumulate in this system and later decay according to that pattern except excess heat.

My longest successful run so far was overnight, last night. I SSHed into the device and had it report its temperature readings every two seconds, and (not having had an HDMI monitor plugged in recently) it wasn't "stucked" and actually reported plausible readings. That worked until the morning when, while writing a comment on the Hardkernel support BBS, I tried to open a second SSH connection into the ODROID and it immediately crashed.
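For anyone wanting to do similar logging, here is a rough Python sketch of a two-second temperature logger that also flags a reading that never changes, which is what the "stucked" sensor looks like from the outside. It is a reconstruction for illustration, not the exact command I ran. The sysfs path is an assumption and will depend on the kernel in use; the guess that values over 1000 are millidegrees is also an assumption; and the function names and the 30-reading window are arbitrary choices of mine.

```python
import time

# Assumed sysfs location of the CPU temperature; adjust for the kernel in use.
TEMP_PATH = "/sys/class/thermal/thermal_zone0/temp"


def parse_temp(raw: str) -> float:
    """Convert a raw reading to degrees C, treating large values as millidegrees."""
    value = float(raw.strip())
    return value / 1000.0 if value > 1000 else value


def looks_stuck(readings, window=30):
    """Return True if the last `window` readings are all exactly the same."""
    recent = readings[-window:]
    return len(recent) == window and len(set(recent)) == 1


def monitor(path=TEMP_PATH, interval=2.0, window=30):
    """Print a timestamped reading every `interval` seconds until interrupted."""
    readings = []
    while True:
        with open(path) as f:
            readings.append(parse_temp(f.read()))
        readings = readings[-window:]   # keep only what looks_stuck needs
        flag = "  (possibly stuck)" if looks_stuck(readings, window) else ""
        print(f"{time.strftime('%H:%M:%S')} {readings[-1]:.1f} C{flag}", flush=True)
        time.sleep(interval)
```

Calling monitor() over an SSH session gives a running log. With the defaults, a minute of identical readings gets flagged, which on a board that is actually doing work is at least suspicious.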

Hardkernel's stock response to all boot-related problems is that it must be your power supply's fault, especially if you did not buy your power supply from them. Among other things, it's your power supply's fault because you didn't plug in your power supply at all; your power supply is the wrong voltage; it doesn't have the positive rail on the inside as opposed to the outside of the coaxial plug; and it doesn't supply enough current. The ODROID U3 specifically requires a supply rated to deliver at least 2A at 5.00V plus or minus 0.25V.

Although Hardkernel's credibility isn't great at this point, I did test the power pretty carefully a couple days ago just to rule that out. My power supply is one I got from Digi-Key that is rated for up to 5A at 5V. I tested its open-circuit voltage as 5.18V, which is a little high but within specification. With the ODROID attached, the voltage fluctuates a bit, but a stable reading during boot seemed to be about 4.78V. That's on the low side, and it's worrying for such a powerful supply to droop that much under what should be way less than the rated load, but again, it is within specification. I did see some numbers on my voltmeter that were a fair bit lower, but I ignored them as the digital voltmeter's basic tendency to be confused by rapidly-changing input. (Where's a good old tube-input analog meter when you need one?) I even switched to ammeter mode and measured the current drawn by the ODROID - a little under 200mA in the closest thing to normal operation I managed to observe while doing this, with a spike to maybe 500mA at one point. That's all within spec, too.

But after being told yet again this evening by Hardkernel support that I ought to be using a power supply capable of at least 2A, it occurred to me to test the power in a different way. The thing is, my power supply does not actually plug directly into the ODROID. My plan is that the power supply should plug into a power distribution circuit, with wires from that to the ODROID and to other things that consume 5V. That's why I bought a much beefier power supply than the ODROID needs. That circuitry will all be soldered together if I ever manage to complete the device that the ODROID is meant to be part of; but in this initial testing phase, I have the power supply plugged into a matching power jack, and a pair of alligator clip leads (they look like jumper cables sized for Barbie and Ken's car) from there to the naked pigtails of a different-sized power plug that mates with the ODROID. The clip leads I was using were very thin (no better than 22 AWG, maybe even smaller), and roughly 25 years old. On a hunch, I tried the voltmeter at the other end.

Stable-ish voltage at the ODROID-plug pigtails during a failing boot: 4.78V. At the power jack, the other end of the clip leads, under the same conditions: 5.15V. The power supply wasn't drooping from 5.18V to 4.78V due to the 200mA load. It was only drooping from 5.18V to 5.15V, which is quite reasonable, and then the clip leads were eating the remaining 0.37V, with an apparent resistance of something like 1.85 Ohm. If the ODROID ever really did pull 2A of current through that, it would lose ten times as much, that is 3.70V, possibly seeing (if the power supply under those conditions drooped as far as its own specs would allow) as little as 1.05V delivered to the ODROID's power jack. I suspect things weren't really quite as bad as that, because there are some conservative assumptions in that calculation and the ODROID may possibly have some decent power decoupling (though I sure don't see anything that looks like an electrolytic capacitor on the board...), but it certainly seems likely that at least some of my problems were caused by good old E=IR in the clip leads, leading to a brownout of the ODROID.
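The arithmetic in that paragraph is simple enough to write down as a few lines of Python, using only the numbers measured above. The function names are made up for illustration, and the 2A worst case assumes the clip leads behave as a plain fixed resistance, which is the same conservative assumption as in the text.

```python
def lead_resistance(v_source, v_load, current):
    """Resistance implied by the voltage lost between two points at a given current."""
    return (v_source - v_load) / current


def voltage_at_load(v_source, resistance, current):
    """Voltage left at the load after the drop across a series resistance."""
    return v_source - resistance * current


# Measured: 5.15 V at the power jack, 4.78 V at the pigtails, about 200 mA drawn.
r_leads = lead_resistance(5.15, 4.78, 0.200)

# Worst case: supply at the bottom of its tolerance (5.00 V - 0.25 V), 2 A drawn.
v_worst = voltage_at_load(5.00 - 0.25, r_leads, 2.0)

print(f"clip lead resistance: {r_leads:.2f} Ohm")
print(f"worst-case voltage at the ODROID: {v_worst:.2f} V")
```

Plugging in 0.1 Ohm for the resistance instead gives a drop of only 0.2V at the same 2A, which is the sort of difference that matters against a 0.25V tolerance.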

It's not easy to measure small resistances without equipment that I either don't own, or don't want to unpack from storage right now; and the test leads plugged into my multimeter at the moment themselves have about half an Ohm of resistance and probably should be replaced, but I did manage to dig out some other clip leads that test as 0.1 Ohm or less. I didn't use these ones first because they are unshrouded and with them in the circuit I have to wrap everything with electrical tape to keep them from shorting out. But the bottom line is that with the replacement clip leads, the ODROID U3 came up and accepted SSH connections on the first attempt. That was about three hours ago. I'm not going to trust it until I see it run overnight and survive several reboots, because this could merely be one of the 20% successful boots I was also getting before the swap, but it looks promising.
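How many good boots in a row would it take before luck stops being a plausible explanation? A back-of-envelope Python sketch, assuming each boot attempt is an independent trial with the old 20% success rate. That assumption is certainly not exactly true, given that rapid retries seemed to fail faster, so treat the result as a rough guide only. The function names and the 1% threshold are arbitrary choices for illustration.

```python
def chance_of_streak(p_success, n):
    """Probability of n successes in a row if each attempt succeeds with p_success."""
    return p_success ** n


def boots_needed(p_success, threshold):
    """Smallest streak length whose chance probability falls below threshold."""
    if not 0 < p_success < 1 or not 0 < threshold < 1:
        raise ValueError("p_success and threshold must be strictly between 0 and 1")
    n = 1
    while chance_of_streak(p_success, n) >= threshold:
        n += 1
    return n


# With the old 20% success rate, how long a streak is needed before the
# "just got lucky" explanation drops below 1%?
print(boots_needed(0.20, 0.01))
```

Under those assumptions a single good boot proves very little, since it would have happened one time in five anyway, but a short run of consecutive good boots quickly becomes hard to attribute to chance.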

