Tuesday, September 22, 2015

Qt 5.5 on Raspberry Pi 2 (1)

Qt 5.5 cross-compilation for Raspberry Pi in Fedora 22

The Qt framework is one of the best-known solutions for cross-platform C++ development, with quite a bit of history behind it. While it has made a significant turn towards (semi-)scripted declarative development for mobile platforms in recent years, by focusing on QML, it still offers a big, consistent and well-documented library of C++ classes, ranging from process and thread control, networking and unified database interfaces, all the way up to widgets and other complex GUI elements.
Raspberry Pi (RPi) is a family of miniature low-cost ARM-based computers built for enthusiasts and educational purposes that, thanks to its relatively powerful integrated GPU, can punch quite a bit above its price category, at least in some scenarios. One of the more popular operating system choices for RPi computers is Debian-based Raspbian, whose "official" version is available for download in the form of an image file, ready for writing to a (micro)SD card, the RPi's non-volatile storage of choice.
Raspberry Pi 2 (RPi2) is the latest and the most powerful member of the RPi family. It features a major improvement in the CPU area, scaling up from a single-core 700MHz ARM11 to a quad-core 900MHz ARM Cortex-A7. Note that the Cortex family is newer and more powerful, so the lower number is misleading here: the Cortex-A7 is the better core than the ARM11. See, for example, this presentation for a simple description of the move from "classic" ARMs to the Cortex series.

Why cross-compile?

With the advent of the quad-core RPi2, a lot of people advocated moving development from workstations and desktops down to the RPi2 itself. On one hand, it is a reasonable suggestion: setting up a cross-compilation environment is painful. Cross-compilation means that we use one machine (traditionally called "build") to produce binary code for a different type of machine ("host"). We need it in this scenario because most desktops run on the Intel (x86) architecture, which is completely different from ARM. In that respect, developing directly on the "host" makes sense: all the needed libraries and compilers are already present and meant for the ARM architecture.
On the other hand, running demanding applications (and any moderately complex development endeavor will include those) on the RPi2 is still an exercise in patience, bordering on futility. While RPi2 engineers can be rightfully proud of having mass-produced a $35 quad-core system, it is nowhere near the raw performance of any reasonable desktop workstation (i3, let alone i5 or i7). It suffers additionally from relatively slow SD card access, which is especially visible in large compilation projects with many input and output files. The bottom line is that the RPi(2) can do some amazing things, for its price. When it comes to general computing power (think of a task like compiling the Linux Kernel), it is hopelessly outgunned by any semi-serious desktop machine younger than 5 years.
The reason is simple, and it is the same one that makes it very difficult for laptops to reach desktop performance in the same price range: total available electrical power. While desktops commonly slurp up hundreds of Watts, laptops tend to stay below the 100W mark (including the screen) and the RPi2 sits at around 3W! No amount of optimization will let a computer that belongs to the same or a similar technological tier as the competition bridge a gap of two orders of magnitude in available power.
The difference is such that we can run our development inside a virtual machine on a desktop host and still get a much better overall experience than while doing it directly on an RPi.

Environment

I based my attempt on this very detailed tutorial, written for Qt 5.4 and (probably) a Linux Mint host. However, I will be running a fully updated 64-bit Fedora 22 host (inside a virtual machine) and compiling Qt 5.5 for RPi.

Qt 5.5

It is available from http://www.qt.io/download/ and you will have to click through several nag screens, indicating that, yes, you are indeed doing this for an open-source or private/educational use. The standard download is just a small GUI installer which you can run directly. It will ask for Qt Account credentials, but that step can be skipped. After it pulls the meta information from the Qt servers, pick a location for the Qt install, and then select a subset of packages for installation. As per the aforementioned tutorial, I chose the sources, to be compiled for RPi, and the desktop gcc pre-built components. The tools/Qt Creator item cannot be unselected, apparently, even though Qt Creator is available independently, through Fedora repositories.

Qt component selection



This selection required 2.24GB of space and took a while to complete.

Raspbian 

Get it from https://www.raspberrypi.org/downloads/raspbian/ - at the time of this writing, it is a version based on Debian Wheezy (release date May 5th 2015). The zip file is around 1GB in size and it unpacks to a 3.2GB image (.img) file. Keep the original of either zip or img file! Modifications will be done to the file later on, and it is usually easier to copy the original back from local storage than having to download it again, in case something goes wrong.
The image file is made by copying the data byte-for-byte from an SD card (actually from its corresponding block device). A common method of accessing image file contents under Linux is so-called loop-mounting. To be able to read from (and, in some cases, write to) files that are inside an image file, the system "pretends" that the image file is actually a block device, like a hard disk. However, if the image file contains more than one file system partition, an additional step is typically required before loop-mounting: finding out the partition layout of the image file. This can be accomplished with the fdisk command. In the case of the Raspbian image, the layout looks like this:

[miroslav@localhost Raspbian]$ fdisk -l 2015-05-05-raspbian-wheezy.img
Disk 2015-05-05-raspbian-wheezy.img: 3.1 GiB, 3276800000 bytes, 6400000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xa6202af7

Device                          Boot  Start     End Sectors Size Id Type
2015-05-05-raspbian-wheezy.img1        8192  122879  114688  56M  c W95 FAT32 (LBA)
2015-05-05-raspbian-wheezy.img2      122880 6399999 6277120   3G 83 Linux

This tells us that there is a small (56MB) FAT32 partition at the start of the image, followed by a larger (3GB) Linux partition. We don't want the first one, because it contains only boot information. The full filesystem is located inside the second, Linux, partition. We need to know exactly where the second partition is located inside the image file. Some simple math will help: note that the sector size is 512 bytes, and that the second partition starts at sector 122880. This translates to 512 bytes/sector * 122880 sectors = 62914560 bytes offset from the image start. Note this number. It may change between different Raspbian versions.
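The same arithmetic can be left to the shell. The following is only a small sketch: the function name partition_offset is made up for this illustration, and the two numbers are the ones read from the fdisk listing above.

```shell
#!/bin/sh
# Hypothetical helper: byte offset of a partition inside a disk image.
# $1 = start sector (the "Start" column of fdisk -l)
# $2 = sector size in bytes (the "Units" line of fdisk -l)
partition_offset() {
    echo $(( $1 * $2 ))
}

# Values taken from the listing of 2015-05-05-raspbian-wheezy.img.
partition_offset 122880 512
```

For a different Raspbian release, only the start sector (and, in principle, the sector size) would need to be replaced with whatever fdisk reports for that image.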
We can now perform the loop-mounting procedure. First, we need a mount point which we will use to access the files:


[root@localhost ~]# mkdir /mnt/rasp-pi-rootfs

Then, we perform the loop-mounting itself:

[root@localhost ~]# mount -o loop,offset=62914560 /home/miroslav/Projects/Raspbian/2015-05-05-raspbian-wheezy.img /mnt/rasp-pi-rootfs/


Note the use of the calculated offset in this command. Also note that the Qt 5.4 tutorial I used has a nasty typo in that command line. If you just copy & paste it, it will fail for no obvious reason and with no error message (it just prints out the help). The issue is that the tutorial text uses some sort of long dash character in "-o", which the command line parser doesn't understand, but which is almost indistinguishable from the proper character on screen.
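A quick way to catch this class of problem is to check a pasted command for bytes outside the plain ASCII range before running it. The sketch below is just one possible approach; the helper name count_non_ascii is made up, and the "bad" command assumes the offending character was an en dash, which may or may not be what the tutorial actually contained.

```shell
#!/bin/sh
# Hypothetical helper: count the bytes outside the 7-bit ASCII range in a string.
count_non_ascii() {
    printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

good='mount -o loop,offset=62914560 image.img /mnt/rasp-pi-rootfs/'
# Same command, with a UTF-8 en dash (bytes E2 80 93) in place of the hyphen.
bad="mount $(printf '\342\200\223')o loop,offset=62914560 image.img /mnt/rasp-pi-rootfs/"

for cmd in "$good" "$bad"; do
    if [ "$(count_non_ascii "$cmd")" -gt 0 ]; then
        echo "suspicious characters: $cmd"
    else
        echo "plain ASCII: $cmd"
    fi
done
```

The first command is reported as plain ASCII and the second as suspicious, even though the two look nearly identical in most terminal fonts.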
We should now be able to see the files from the Linux partition inside the image file:

[miroslav@localhost Raspbian]$ ls -l /mnt/rasp-pi-rootfs/
total 92
drwxr-xr-x.   2 root root  4096 May  7 00:58 bin
drwxr-xr-x.   2 root root  4096 May  7 00:23 boot
drwxr-xr-x.   3 root root  4096 May  7 00:58 dev
drwxr-xr-x. 104 root root  4096 May  7 01:31 etc
drwxr-xr-x.   3 root root  4096 May  7 00:20 home
drwxr-xr-x.  14 root root  4096 May  7 01:15 lib
drwx------.   2 root root 16384 May  7 00:10 lost+found
drwxr-xr-x.   2 root root  4096 May  7 00:12 media
drwxr-xr-x.   2 root root  4096 Jan 11  2015 mnt
drwxr-xr-x.   6 root root  4096 May  7 01:24 opt
drwxr-xr-x.   2 root root  4096 Jan 11  2015 proc
drwx------.   2 root root  4096 May  7 00:12 root
drwxr-xr-x.   7 root root  4096 May  7 01:23 run
drwxr-xr-x.   2 root root  4096 May  7 01:15 sbin
drwxr-xr-x.   2 root root  4096 Jun 20  2012 selinux
drwxr-xr-x.   2 root root  4096 May  7 00:12 srv
drwxr-xr-x.   2 root root  4096 Oct 13  2013 sys
drwxrwxrwt.   2 root root  4096 Jan 11  2015 tmp
drwxr-xr-x.  10 root root  4096 May  7 00:12 usr
drwxr-xr-x.  11 root root  4096 May  7 00:12 var


It looks ok. Different versions of Raspbian might have slightly different folder layout and different timestamps, but in general this is what you would expect to see.


Cross-compiler

To be able to produce binary code for RPi on a desktop PC, we need a cross-compiler. The one used with RPi is publicly available from a git repository, and we can simply clone it in:

[miroslav@localhost Projects]$ mkdir rpi-tools
[miroslav@localhost Projects]$ cd rpi-tools/
[miroslav@localhost rpi-tools]$ git clone https://github.com/raspberrypi/tools.git
Cloning into 'tools'...
remote: Counting objects: 17851, done.
remote: Total 17851 (delta 0), reused 0 (delta 0), pack-reused 17851
Receiving objects: 100% (17851/17851), 325.16 MiB | 364.00 KiB/s, done.
Resolving deltas: 100% (12185/12185), done.
Checking connectivity... done.
Checking out files: 100% (15867/15867), done.

The cross-compiler needs some 32-bit libraries, and since this is a 64-bit Fedora 22, I had to add them using the new command line package manager, dnf:

[root@localhost ~]# dnf install glibc.i686 libstdc++.i686 zlib.i686

Accept all the dependencies and install. You can verify that the cross-compiler can be started by asking it for version info:

[miroslav@localhost ~]$ /home/miroslav/Projects/rpi-tools/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-g++ -v
Using built-in specs.
(...)
Thread model: posix
gcc version 4.8.3 20140106 (prerelease) (crosstool-NG linaro-1.13.1-4.8-2014.01 - Linaro GCC 2013.11)

If it fails with a message like "/lib/ld-linux.so.2: bad ELF interpreter: No such file or directory" there is probably a problem with missing 32-bit libraries.
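The usual reason for that error is that the toolchain binaries are 32-bit executables running on a 64-bit system. The file command would normally tell you this directly; the sketch below reads the same two fields from the ELF header by hand, mostly to show what is being checked. The helper name elf_info is made up, it assumes a little-endian ELF header, and the demonstration runs on a hand-made 20-byte header rather than a real binary.

```shell
#!/bin/sh
# Hypothetical helper: report the word size and machine type of an ELF file.
# Assumes a little-endian header (true for x86 and for Raspbian ARM binaries).
elf_info() {
    file_path=$1
    # Byte 4 (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$file_path" | tr -d ' ')
    # Bytes 18-19 (e_machine), low byte first.
    set -- $(od -An -tu1 -j18 -N2 "$file_path")
    machine=$(( $1 + 256 * $2 ))
    case $class in
        1) bits=32 ;;
        2) bits=64 ;;
        *) bits=unknown ;;
    esac
    case $machine in
        3)  arch=x86 ;;
        40) arch=ARM ;;
        62) arch=x86-64 ;;
        *)  arch="machine $machine" ;;
    esac
    echo "$bits-bit $arch"
}

# Demonstration: 32-bit class, little-endian, e_machine = 40 (octal 050).
header=$(mktemp)
printf '\177ELF\001\001\001\000\000\000\000\000\000\000\000\000\003\000\050\000' > "$header"
elf_info "$header"
rm -f "$header"
```

Pointed at the arm-linux-gnueabihf-g++ binary, the same helper could confirm whether it really is a 32-bit x86 executable; pointed at a library inside the mounted Raspbian image, it would be expected to report ARM.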

Fixing the absolute symlinks


At this point, we have the Raspbian root filesystem mounted, with all of its libraries, and we also have the cross-compiler. One might think that we can start the Qt cross-compile process, but that is not the case. For some reason, a problem persists in Raspbian releases: some of the libraries have absolute symbolic links pointing to them (see, for example, this for a short introduction to Linux library placement and naming). Ordinarily, when the root filesystem partition is used on an RPi, absolute symbolic links are not a problem. Whether we link to libsomething.so.1.0.0 on an RPi absolutely (e.g. /lib/libsomething.so.1.0.0) or relatively (like, say, ../../lib/libsomething.so.1.0.0) is not an issue, because both ways point to the same file. However, in our setup the root filesystem is mounted under /mnt/rasp-pi-rootfs, so the absolute path (/lib/libsomething.so.1.0.0) would either point to Fedora's own library (which we definitely don't want, since it is probably not even the same version and it is compiled for Intel instead of ARM) or, more commonly, to nothing at all, i.e. we would have a broken link.
To fix the situation, we can use a script that is a part of "cross-compile-tools". Sadly, the Gitorious link given by the original tutorial is no longer valid, but we can use one of the clones, such as https://github.com/darius-kim/cross-compile-tools :

[miroslav@localhost Projects]$ git clone https://github.com/darius-kim/cross-compile-tools
Cloning into 'cross-compile-tools'...
remote: Counting objects: 56, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 56 (delta 18), reused 56 (delta 18), pack-reused 0
Unpacking objects: 100% (56/56), done.
Checking connectivity... done.

The point of interest is the script called "fixQualifiedLibraryPaths". However, it needs one minor intervention before it can help us build Qt. If you examine the script, the key function is "adjustSymLinks", which is called only once, for the $ROOTFS/usr/lib top level folder, $ROOTFS being the folder in which our Raspbian root filesystem is mounted. Since the function only descends to a depth of one (see the "find . -maxdepth 1 (...)" line), it won't change the absolute links in the $ROOTFS/usr/lib/arm-linux-gnueabihf folder, which are also needed and broken. To remedy this, we can simply add another call to "adjustSymLinks", like so (use any text editor):

(...)
adjustSymLinks $ROOTFS/usr/lib "../.."
adjustSymLinks $ROOTFS/usr/lib/arm-linux-gnueabihf "../../.."
(...)

The script is now ready, and we can use it to modify the root filesystem:

[root@localhost ~]# cd /home/miroslav/Projects/cross-compile-tools/
[root@localhost cross-compile-tools]# ./fixQualifiedLibraryPaths /mnt/rasp-pi-rootfs /home/miroslav/Projects/rpi-tools/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-gcc
Passed valid toolchain
Adjusting the symlinks in /mnt/rasp-pi-rootfs/usr/lib to be relative
Adjusting the symlinks in /mnt/rasp-pi-rootfs/usr/lib/arm-linux-gnueabihf to be relative
./libnss_nisplus.so
../../../lib/arm-linux-gnueabihf/libnss_nisplus.so.2
./libresolv.so
../../../lib/arm-linux-gnueabihf/libresolv.so.2
./libanl.so
../../../lib/arm-linux-gnueabihf/libanl.so.1
./libcrypt.so
../../../lib/arm-linux-gnueabihf/libcrypt.so.1
./libnss_compat.so
../../../lib/arm-linux-gnueabihf/libnss_compat.so.2
./libthread_db.so
../../../lib/arm-linux-gnueabihf/libthread_db.so.1
./libz.so
../../../lib/arm-linux-gnueabihf/libz.so.1.2.7
./libutil.so
../../../lib/arm-linux-gnueabihf/libutil.so.1
./libnss_hesiod.so
../../../lib/arm-linux-gnueabihf/libnss_hesiod.so.2
./libdl.so
../../../lib/arm-linux-gnueabihf/libdl.so.2
./libnsl.so
../../../lib/arm-linux-gnueabihf/libnsl.so.1
./libnss_nis.so
../../../lib/arm-linux-gnueabihf/libnss_nis.so.2
./libm.so
../../../lib/arm-linux-gnueabihf/libm.so.6
./libBrokenLocale.so
../../../lib/arm-linux-gnueabihf/libBrokenLocale.so.1
./librt.so
../../../lib/arm-linux-gnueabihf/librt.so.1
./libnss_dns.so
../../../lib/arm-linux-gnueabihf/libnss_dns.so.2
./libnss_files.so
../../../lib/arm-linux-gnueabihf/libnss_files.so.2
./libcidn.so
../../../lib/arm-linux-gnueabihf/libcidn.so.1
./libpng12.so
../../../lib/arm-linux-gnueabihf/libpng12.so.0
Testing for existence of potential debian multi-arch dir: /mnt/rasp-pi-rootfs/usr/lib/arm-linux-gnueabihf
Debian multiarch dir exists, adjusting
Adjusting the symlinks in /mnt/rasp-pi-rootfs/usr/lib/arm-linux-gnueabihf to be relative

We see that a number of symbolic links have been fixed.
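To double-check the result, one could list any symbolic links that still point to absolute targets. The sketch below is not part of cross-compile-tools; list_absolute_links is a made-up helper, and the demonstration builds a throwaway directory that merely imitates the sysroot layout.

```shell
#!/bin/sh
# Hypothetical helper: print every symlink directly inside directory $1
# whose target is an absolute path (i.e. starts with "/").
list_absolute_links() {
    find "$1" -maxdepth 1 -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        case "$target" in
            /*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# Throwaway directory standing in for a mounted root filesystem.
demo=$(mktemp -d)
mkdir -p "$demo/usr/lib/arm-linux-gnueabihf" "$demo/lib/arm-linux-gnueabihf"
touch "$demo/lib/arm-linux-gnueabihf/libz.so.1.2.7"
# One link left absolute (broken outside the RPi), one already made relative.
ln -s /lib/arm-linux-gnueabihf/libz.so.1.2.7 "$demo/usr/lib/arm-linux-gnueabihf/libz.so"
ln -s ../../../lib/arm-linux-gnueabihf/libz.so.1.2.7 "$demo/usr/lib/arm-linux-gnueabihf/libz-fixed.so"

list_absolute_links "$demo/usr/lib/arm-linux-gnueabihf"
rm -rf "$demo"
```

Only the absolute libz.so link is printed. Run against /mnt/rasp-pi-rootfs/usr/lib/arm-linux-gnueabihf after the fix, the helper would be expected to print nothing.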

Cross-compiling Qt 5.5

Finally, we can start the cross-compilation process. Using neat shortened variables from the original tutorial:

[miroslav@localhost ~]$ export RPI_SYSROOT=/mnt/rasp-pi-rootfs
[miroslav@localhost ~]$ export RPI_TOOLCHAIN=~/Projects/rpi-tools/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin/arm-linux-gnueabihf-



..and the magical configuration line:

[miroslav@localhost Src]$ cd ~/Projects/Qt/5.5/Src/
[miroslav@localhost Src]$ ./configure -opengl es2 -device linux-rasp-pi-g++ -device-option CROSS_COMPILE=$RPI_TOOLCHAIN -sysroot $RPI_SYSROOT -opensource -confirm-license -optimized-qmake -reduce-exports -release -make libs -prefix /usr/local/qt5pi -skip qtwebkit


It will produce output with a summary of detected and selected features. One good thing to see was this:


EGLFS Raspberry Pi . yes 

Things look promising! We might be able to use the eglfs platform on RPi. Now, to start the actual compilation process (and go get a coffee or two :)):

[miroslav@localhost Src]$ time gmake -j2
(...tons of output...)
real    40m58.931s
user    72m8.167s
sys    6m30.914s

Not too bad for a virtual machine running on 2 cores of an i5 CPU. I've used -j2 to have make start 2 jobs in parallel, because my virtual machine is using 2 cores. In general, if you have N cores to use for building, -jN or -jN+1 are good choices.
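The job count does not have to be hard-coded. The following sketch asks the system for the number of available cores and only prints the resulting command; build_jobs is a made-up helper name, and the fallback to a single job is my own choice.

```shell
#!/bin/sh
# Hypothetical helper: number of CPUs available for the build, falling back to 1.
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}

njobs=$(build_jobs)
# Only print the command here; the real build would be started from the Qt Src folder.
echo "would run: gmake -j$njobs"
```

Inside a virtual machine, this reports the cores assigned to the guest, not those of the physical CPU, which is the number that matters for the build.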
We also want to install this:

[root@localhost ~]# cd /home/miroslav/Projects/Qt/5.5/Src
[root@localhost Src]# make install
(...kilos of output...)
[root@localhost Src]# sync

The cross-compiled Qt 5.5 libraries are now installed inside our Raspbian image, in a /usr/local/qt5pi folder, the location chosen by our configure call prior to compilation. We also executed the sync command, to make sure everything is definitely written onto the disk and into the Raspbian image. The image should be saved, but before that we should unmount it:

[root@localhost ~]# umount /mnt/rasp-pi-rootfs

If an error message appears telling you that the filesystem is busy, check if you have any open files or command prompts inside the mounted RPi root filesystem image, close them, and then try again. Copy the Raspbian image and name it in a way that will tell you it contains the cross-compiled Qt 5.5. Next comes the installation of that new Raspbian image to an SD card, configuration of Qt Creator IDE on the development machine and finally a test application executing on RPi!


Tuesday, August 25, 2015

Linux bridging of VLAN interfaces and bridge IDs

Or, there can be only one

Linux has been offering a decent set of network bridging features for quite some time now. It may not always boast the absolutely latest and greatest of network protocols, but it gets the job done, and does so with commendable stability. I have had literally hundreds (possibly thousands) of Linux software bridges running in rather complex meshed network topologies for extended periods of time with no issues directly related to the implementation.
Basic bridge configuration under Linux is a reasonably simple affair, provided you have some basic understanding of network protocols. Tutorials are available for a number of popular distributions. Delving deeper and handling specific, more complex situations is, however, a bit more demanding. With the advent of virtualization technologies, Linux bridging became increasingly common, as most host (and guest) operating systems had to implement it in some form, in order to enable relatively unimpeded access to the outside network for all involved parties. This resulted in a slew of bug reports (usually against libvirt) and different solutions for some commonly encountered issues, one of which stems from the way the Bridge ID is (re)calculated when individual ports are dynamically added to or removed from the bridge. See, for example, this report.
The bridge ID is an 8-byte value used to uniquely identify a bridge within a network running Spanning Tree Protocol (STP). STP itself is not mandatory, unless you have a meshed topology (or, actually, any topology that can include closed loops). In that case, STP becomes a must, if you want to prevent the dreaded switching loops and the accompanying broadcast storms and network outages. The Bridge ID's interpretation has historically developed from 2 bytes of priority followed by the 6 bytes of the lowest MAC address among the bridge's ports into a more complex form, a process nicely illustrated here. Note that the "extension", or, more accurately, reinterpretation of the first 2 bytes of the Bridge ID to include the Extended system ID was made to cater for different VLANs, so that the same physical bridge/port combination would be mapped to a separate bridge ID, depending on the VLAN it is a member of. The Linux bridge control command line utility, brctl, doesn't make the distinction between old and new interpretations at all, instead allowing one to explicitly control all 2 bytes (16 bits) of the priority/extended ID field, using the brctl setbridgeprio syntax. An important thing to note is that, for STP to function correctly (also known as: to avoid all of the ugliest demons of hell to come forth, spawn offspring with the nastiest of the gremlins and then let them loose on your network), bridge IDs have to be unique! This is why the MAC address is used as a part of the Bridge ID (MACs are guaranteed, or at least assumed, to be globally unique), and VLAN IDs were added later to cover the cases where the same port MACs are used to carry the traffic for otherwise distinct VLANs.
The issue with the Linux bridging setup (or, at least, its CentOS 6.x implementation) arises when you want to bridge traffic from different VLANs, after the VLANs have been terminated and the tags stripped off. A common use case would be an L2 separation of different traffic groups within one physical Ethernet trunk. The way this is commonly set up under Linux is (shown symmetrically for clarity):
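Under the newer interpretation, the 16-bit field is split into a 4-bit priority (so the priority moves in steps of 4096) and a 12-bit Extended system ID. Since brctl takes the whole field as one number, the value has to be composed by hand. The sketch below shows that composition; the helper name bridge_prio_field is made up, and using the VLAN ID as the extended ID is the convention assumed here.

```shell
#!/bin/sh
# Hypothetical helper: combine a priority (a multiple of 4096, 0..61440)
# with a 12-bit extended system ID (here: the VLAN ID) into the 16-bit field.
bridge_prio_field() {
    prio=$1
    ext_id=$2
    if [ "$prio" -lt 0 ] || [ "$prio" -gt 61440 ] || [ $(( prio % 4096 )) -ne 0 ]; then
        echo "priority must be a multiple of 4096 between 0 and 61440" >&2
        return 1
    fi
    if [ "$ext_id" -lt 0 ] || [ "$ext_id" -gt 4095 ]; then
        echo "extended system ID must fit in 12 bits" >&2
        return 1
    fi
    echo $(( prio + ext_id ))
}

# Default priority 32768 (0x8000) combined with VLAN IDs 2 and 3.
for vlan in 2 3; do
    field=$(bridge_prio_field 32768 "$vlan")
    printf 'VLAN %d: field %d (0x%04x)\n' "$vlan" "$field" "$field"
done
```

The resulting decimal values (32770 and 32771 here) are what one would pass to brctl setbridgeprio for the respective bridges.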

A simple VLAN trunking
The pair of eth0s form a trunk that carries three distinct VLANs (1, 2 and 3, using default naming), but the eth0.x endpoints give us untagged frames belonging only to the appropriate VLANs. This ensures traffic isolation between each of the VLANs, and enables us to do whatever we want with each of the eth0.x endpoints -- give them arbitrary IP addresses, bridge them further, etc. For more examples and a more in-depth explanation of VLAN bridging on Linux, see this article. So far, so good. One thing to note here is that only two distinct, real MACs are in play: those belonging to the eth0 interfaces on both systems. All of the eth0.x (virtual) interfaces share the MAC of the physical trunk interface they are derived from. Now, what we might want to do once we have the VLAN trunking configured as above is to dynamically connect some other actual (physical) interfaces through our VLAN pipes:

Passing external traffic through our VLANs
We want to pipe network traffic between ethA and ethX, and between ethB and ethY, through Systems 1 and 2. One way to do it is by bridging ethA and ethX to the input/output points for VLAN 2, eth0.2, and doing the same with ethB and ethY via VLAN 3 (eth0.3). While this configuration might be somewhat unusual, the ability of Linux bridges to add and remove individual ports on the fly might be worth the effort if the setup needs to be dynamic in nature. Let's assume that we have to run STP on all bridges (both br2s and br3s), because a loop might be present between the ethA/ethX and ethB/ethY pairs, in the topology outside of systems 1 and 2. Then, if the numerical value of eth0's MAC on either side is less than the corresponding values for ethA and ethB (ethX and ethY for system 2), both br2 and br3 will end up with the same Bridge ID! This happens because the algorithm for bridge ID calculation goes through all the ports that belong to a bridge and picks the MAC with the smallest numerical value as the MAC address portion of the bridge ID. But, in our case, both the eth0.2 and eth0.3 ports have the same MAC, copied from the actual eth0 MAC. If it is smaller than the other bridge port's MAC, it will be used, leading to multiple bridges with the same Bridge ID. As far as I know, at least on CentOS 6.x, no automagic effort will be made by the command line tools or system configuration scripts to use the VLAN ID of the bridged ports to configure the Extended system ID of the bridges they are a part of. This can be overcome by issuing appropriate brctl setbridgeprio commands to split the bridge IDs, of course. CentOS 6 network configuration scripts don't allow for setting up the priority of bridges, but that ability can be added easily enough, by editing the /etc/sysconfig/network-scripts/ifup-eth file to include something along the lines of:

[ -n "${PRIO}" ] && /usr/sbin/brctl setbridgeprio ${DEVICE} ${PRIO}

in the bridge configuration section. After that change, any PRIO=... settings in persistent network configuration files will be applied, whenever the device in question is a bridge. Other distributions may already support bridge priority setting at boot time (well, actually, at network bringup time), or would require somewhat different solutions to add the feature.
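The collision, and the way distinct priorities resolve it, can be worked through on paper. The sketch below does just that with made-up MAC addresses; lowest_mac and bridge_id are hypothetical helpers, and the priority.mac output format is meant to resemble what brctl show prints. It is a model of the selection rule described above, not the Kernel's code.

```shell
#!/bin/sh
# Hypothetical helper: pick the numerically lowest MAC from the arguments.
# Assumes lowercase, colon-separated, zero-padded MACs, so that byte-wise
# sorting gives the same order as numeric comparison.
lowest_mac() {
    printf '%s\n' "$@" | LC_ALL=C sort | head -n 1
}

# Hypothetical helper: format a bridge ID as priority.mac (hex, no colons).
bridge_id() {
    printf '%04x.%s\n' "$1" "$(printf '%s' "$2" | tr -d ':')"
}

# Made-up addresses; eth0's MAC is shared by eth0.2 and eth0.3.
eth0_mac=00:11:22:33:44:55
ethA_mac=00:aa:bb:cc:dd:01
ethB_mac=00:aa:bb:cc:dd:02

# Default priority on both bridges: the two IDs collide.
bridge_id 32768 "$(lowest_mac "$eth0_mac" "$ethA_mac")"   # br2
bridge_id 32768 "$(lowest_mac "$eth0_mac" "$ethB_mac")"   # br3

# Distinct priority fields (default priority plus VLAN ID) split them.
bridge_id 32770 "$(lowest_mac "$eth0_mac" "$ethA_mac")"   # br2
bridge_id 32771 "$(lowest_mac "$eth0_mac" "$ethB_mac")"   # br3
```

The first two lines of output are identical (8000.001122334455), while the last two differ only in the priority part (8002 and 8003), which is enough for STP to tell the bridges apart.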
The other issue related to the Bridge IDs and the extended ID field is somewhat more obfuscated. The Kernel version powering the CentOS 6 series (2.6.32) and, quite possibly, the latest 4.x Kernel versions too, have a quirk when it comes to updating STP settings in reaction to a change in one of the bridge's ports. The change in question might be adding or removing the entire port from the bridge, or a change of the port's MAC address while it is a member of a bridge. As already explained, this should trigger a set of recalculations, because the smallest numerical MAC value of all the bridge's ports might have changed. The Kernel includes a function called br_stp_change_bridge_id(). It is provided with the new, smallest, numerical MAC value to apply to the bridge, and then it does two things: first it updates the bridge's own MAC address (correctly), and then it goes through all of the bridge's ports, checking to see if the old Bridge ID (prior to the MAC change) was the root and/or designated bridge for any of the ports. If so, it updates those values as well. The piece of code that does the second task is:

list_for_each_entry(p, &br->port_list, list) {
    if (ether_addr_equal(p->designated_bridge.addr, oldaddr))
        memcpy(p->designated_bridge.addr, addr, ETH_ALEN);

    if (ether_addr_equal(p->designated_root.addr, oldaddr))
        memcpy(p->designated_root.addr, addr, ETH_ALEN);
}

The oldaddr is the original MAC address, addr is the new address. The ether_addr_equal() function compares the two MAC addresses and returns true if they are equal. Older Kernel versions use !compare_ether_addr() instead, but the effect is the same. Can you spot the problem?
Having concluded that several bridges on the same system can, in fact, share the MAC portion of their Bridge IDs, we can see that the condition for updating the root and/or designated bridge is too broad: comparing the MAC portions alone might lead to false positives. For example, if two bridges with IDs 8000:00:11:22:33:44:55 and 8001:00:11:22:33:44:55 existed on the same system, updating the MAC of the second bridge from 00:11:22:33:44:55 to 00:66:77:88:99:AA would cause any ports that had 8000:00:11:22:33:44:55 as their root and/or designated bridge to change those values to 8000:00:66:77:88:99:AA - a non-existent bridge. In this case, we would have to compare the full Bridge IDs, like so:

bridge_id old_bridge_id;
(...)
memcpy(&old_bridge_id, &br->bridge_id, sizeof(old_bridge_id));
(...)
list_for_each_entry(p, &br->port_list, list) {
    if (!memcmp(&p->designated_bridge, &old_bridge_id, sizeof(old_bridge_id)))
        memcpy(p->designated_bridge.addr, addr, ETH_ALEN);

    if (!memcmp(&p->designated_root, &old_bridge_id, sizeof(old_bridge_id)))
        memcpy(p->designated_root.addr, addr, ETH_ALEN);
}
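The difference between the two checks can be seen without touching the Kernel. The sketch below models bridge IDs as priority.mac strings with made-up values; matches_mac_only and matches_full_id are hypothetical helpers standing in for the two comparison strategies, not the Kernel functions themselves.

```shell
#!/bin/sh
# Bridge IDs are modelled as "priority.mac" strings; all values are made up.

# Compares only the MAC part, like the check on the address alone.
matches_mac_only() {
    [ "${1#*.}" = "${2#*.}" ]
}

# Compares priority and MAC together, like a check on the full Bridge ID.
matches_full_id() {
    [ "$1" = "$2" ]
}

changing_bridge=8001.001122334455   # the bridge whose MAC is being updated
port_designated=8000.001122334455   # a port's designated bridge: a different bridge

if matches_mac_only "$port_designated" "$changing_bridge"; then
    echo "MAC-only check: match (false positive)"
fi
if ! matches_full_id "$port_designated" "$changing_bridge"; then
    echo "full-ID check: no match (correct)"
fi
```

Both messages are printed: the MAC-only comparison treats the two different bridges as the same one, while the full comparison keeps them apart.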

Whether this is something to be concerned about, I am not sure. Given the scope of Linux network stack deployment in real world, I guess that even this (very) particular scenario would show up sooner or later, prompting a fix. On the other hand, if you're comfortable with rolling your own bridging Kernel module.. it never hurts to be (a bit more) sure.


Thursday, August 20, 2015

Eclipse rendering issues on Fedora 22 Mate

Mate is your best mate in VM Fedora

I've learned the hard way that using modern, 3D-accelerated, effects-blazing desktop environments inside VirtualBox Linux guests is not something one could hope to result in any semblance of speed. I cannot help but wonder if that is a very unusual use-case, because it is seldom covered by any of the mandatory Linux news articles praising the latest iteration of Gnome3, KDE5 or even Cinnamon. Anyway, when I decided to try out the latest Fedora, I was happy to notice that the official Mate spin is available for download. As the blurb on the linked page says, Mate aims at high productivity and performance (and the aging, nostalgic, GTK2 crowd), although you can switch to the built-in Compiz window manager to get some bling from it. I would stick to the default, simple, but very usable and fast, Marco window manager. Even with VirtualBox's haphazard support for graphics acceleration in Linux guests, it would work nicely on my trusty i5.
Things went well, Fedora 22 Mate was installed and fully updated inside the bleeding edge VirtualBox (5.0.2), together with guest additions. The Fedora graphical installer still makes no sense whatsoever, unless you were born on a planet where GUI designers always had key confirmation buttons like Ok and Done tucked away in top corners of their forms, but at least it worked ok with all-default settings. Installing Fedora without having to analyze the latest rage in its default partitioning scheme is a major benefit of using a virtual machine. Had this been a bare metal, or, even worse, a not-so-bare metal (i.e. with some OSes already installed on it) installation, I'd have spent much more time on research. I still remember the time when Fedora 2 happily munched up the MBR of my work PC when I was trying to set up a dual-boot with Windows.

So far, so good. The desktop was fast and responsive, and, after the VirtualBox guest additions were added, resized properly on the fly, together with the VM window. I wanted to do some C/C++ development, so after pulling in and compiling some stuff in the command line, I decided to install Eclipse. Now that is one package I cannot help but have a love-hate relationship with. On the one hand, it is hugely powerful, with plugins supporting many programming languages and code sharing and maintenance aspects, flexible remote execution and debugging, etc. It was adopted and adapted for use by many commercial vendors (see, for example, this one). In a lot of ways it is a de-facto standard, especially in the field of cross-platform IDEs. On the other hand, it is sometimes insanely slow, difficult to properly configure, subject to arbitrary vanishing of previously working plugins, can bleed memory like a stuck pig (and lasts about as much, when it happens) and is frustratingly prone to errors in configuration files caused by improper shutdowns. I guess that with great power comes great respons... quirkiness and this is where virtual machines come into play. Once you set up Eclipse (and the rest of the development environment) just the way you like it, you can make a copy of the VM and share it with other developers, knowing that it will work on their boxes without much/any additional begging and hair-tearing.

..or is it?

Fedora 22 comes with Eclipse 4.5 (Mars), a brand new edition that I have never used until now. Much to my dismay, I have discovered that it now grows fur when massaged.
You can read that last sentence again, until it sinks in. I'll wait.
Still not convinced? Ok, let me illustrate the point: 

Furry fonts as featured in Eclipse 4.5 Mars on Fedora 22 Mate


Notice the blurry/bold fonts in the Project Explorer and Outline windows, to the left and right? No, it's not a feature. Well, a furry feature perhaps. Here's the zoomed up version:

In all of their furry glory





To top it off, the fonts aren't like this (furry) when you start the application. It only happened when I scrolled up and down the Project Explorer window a few times. The more I scrolled, the.. furrier they got. Doing something that would repaint the window, like sizing or moving around, got rid of the effect temporarily.
At this point, the look on my face could have been, pretty accurately, described as o.O -- I am used to various idiosyncrasies of fresh Linux distributions, but I must admit that seeing text rendered and then re-rendered over itself with slight offset when scrolling through a common tree-view control is new. After a while, I sighed and resigned myself to watching the fleeting fun furry fonts until the next update hit. Surely, people have been complaining about this? Onwards and upwards! I went on to install a Mercurial version control plugin for Eclipse. Since it is not, apparently, in any of the official repositories, I would have to add it directly through Eclipse, and it turned out that Eclipse now has a cool new feature called the Marketplace to help with the, sometimes cumbersome, process of adding new plugins directly into Eclipse. The Marketplace, itself, is in the repositories, and it's called eclipse-mpc (really?), so one short command and a few prompts later:

dnf install eclipse-mpc

..we should be in business! Restarting Eclipse and going to Help/Eclipse Marketplace opens up a new marketplace window that promptly proceeds to pull newest plugin data from their servers... and then does nothing.

No plugins listed, not even a single furry font worth of them

Um, ok. Now this is frustrating. I've had my own share of problems with Eclipse accessing the web through firewalls and proxies, but I don't think this is one such case: it seemed to read the pages ok (a progress bar showed it downloading the info), and the old, manual way of adding plugins still kinda works, even though it needs Internet access too.
Thinking it might be a rendering issue (again), since rendering was clearly broken in this version after all, I went on a Google spree, which rewarded me with this gem. Apparently, people had been having similar issues (but not completely the same, check out the gallery of attachments to that bug report for extra hilarity) for more than a year. Now, nowhere does it mention the Mate desktop environment, but the workaround proposed in comment #21 suddenly made a lot of sense. It proposed adding an obscure Eclipse startup setting to the eclipse.ini file (located at /usr/lib64/eclipse/eclipse.ini on 64-bit Fedora 22) which apparently forces the use of GTK version 2 by the launcher:

--launcher.GTK_version
2
 
Mate is GTK2-based, with GTK3 support still experimental, which is why that rang a bell. Honestly, I have no idea if this is definitely related to the choice of Mate as a desktop environment, and I'm not installing the Gnome3 desktop just to see if the case could be repeated there, too. Having said that, I figure people would actually report this a lot, if it were happening in the default Fedora 22 Workstation setup, which uses Gnome3. It also has something to do with the version of Eclipse, since the people who did complain about it in that bugzilla thread mentioned that it came about after upgrading to Eclipse Mars 4.5. It might be that Mars simply changed the default GTK version from 2 to 3, and that broke some setups.
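One detail worth knowing when editing eclipse.ini is that launcher options have to appear before the -vmargs line, because everything after it is handed to the JVM, and that the option name and its value go on separate lines. The sketch below applies the edit with those two rules in mind; force_gtk2 is a made-up helper, and the demonstration works on a tiny made-up ini file rather than the real /usr/lib64/eclipse/eclipse.ini.

```shell
#!/bin/sh
# Hypothetical helper: add the two-line GTK2 launcher option to an
# eclipse.ini-style file, placing it before -vmargs if that line exists.
force_gtk2() {
    ini=$1
    if grep -q -- '^--launcher.GTK_version$' "$ini"; then
        return 0    # already present, leave the file alone
    fi
    tmp=$(mktemp)
    awk '
        $0 == "-vmargs" && !done { print "--launcher.GTK_version"; print "2"; done = 1 }
        { print }
        END { if (!done) { print "--launcher.GTK_version"; print "2" } }
    ' "$ini" > "$tmp" && cat "$tmp" > "$ini"
    rm -f "$tmp"
}

# Demonstration on a made-up, minimal ini file.
demo_ini=$(mktemp)
printf '%s\n' '-startup' 'plugins/launcher.jar' '-vmargs' '-Xmx1024m' > "$demo_ini"
force_gtk2 "$demo_ini"
cat "$demo_ini"
rm -f "$demo_ini"
```

Running the helper a second time on the same file changes nothing, so it would be reasonably safe to re-apply after a package update replaces the ini file. Keeping a backup copy of the real file before editing is still a good idea.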
The workaround helped in my case, both with the furry fonts (or lack thereof), and the Marketplace finally shone in all of its glory:

Now with actual items!