DIY NAS Server: Refreshing the Odroid XU-4
Sunday, June 9, 2019
Continuing the series of articles about the Odroid XU-4 and my attempt to turn it into a NAS server without installing an all-in-one system, I want to share a trick that can help the board’s components run somewhat better.
Ideally, the board stays on all the time. Since it draws at most 20 watts, keeping it running permanently is cheap, and it is much more convenient for the people who will use the server. The alternative would be turning it on whenever it is needed, but one of the tasks I want the board to handle is updating a backup of my most important files in a Backblaze bucket at certain hours of the day, so keeping it on is the best option.
Before leaving things as they are, though, I noticed that the board gets hotter than I’d like, to the point that it is quite noticeable when you touch its underside. It isn’t dangerous, but since the board will run for many days without a rest, I thought it would be good to mitigate the problem.
The Odroid XU-4 fan
The folks at HardKernel, the board’s manufacturer, thought about this when designing the hardware of this little giant. The extra power of the hardware also makes the temperature rise more quickly. To compensate, the Odroid XU-4 has a small fan integrated in the top part of the device. The fan turns on when the system determines that the temperature of the 8 CPU cores or the GPU is higher than desirable and needs to come down a bit. With its default configuration, the fan has four states: off, plus three rotation speeds (gentle, intermediate and close to maximum).
The system configuration works with three temperature thresholds. When the average temperature of the 5 sensors (the GPU is counted separately) crosses one of these thresholds, the corresponding fan speed is applied. All of this happens automatically, although it’s also possible to control the fan speed manually or to keep the fan from turning on at all, though I can’t see why you would want the latter.
By default, the system will behave like this:
- When the hardware temperature is less than 60 degrees, the fan will be off (speed 0).
- If the temperature is between 60 and 69 degrees, the fan turns on at a speed of 120. (The Odroid website labels this figure “WPM” without explaining it. It is most likely PWM, pulse-width modulation, in which case the number is a duty-cycle setting on a scale of 0 to 255 rather than rotations per minute.)
- If the temperature is between 70 and 79 degrees, the speed value increases to 180.
- Finally, if the temperature exceeds 79 degrees, the speed is set almost to maximum, at 240. The maximum allowed for this device is apparently 255.
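The default behaviour in the list above can be summarised as a small shell function. This is only an illustrative sketch of the table, not the logic the kernel actually runs; the function name fan_speed_for_temp is made up for this example, and it assumes the temperature is given in whole degrees.

```shell
# Illustrative only: mirrors the default table described above.
# fan_speed_for_temp is a hypothetical helper, not part of the Odroid tooling.
# Argument: temperature in whole degrees.
fan_speed_for_temp() {
    temp_c=$1
    if [ "$temp_c" -lt 60 ]; then
        echo 0      # below the first threshold: fan off
    elif [ "$temp_c" -lt 70 ]; then
        echo 120    # 60 to 69 degrees
    elif [ "$temp_c" -lt 80 ]; then
        echo 180    # 70 to 79 degrees
    else
        echo 240    # above 79 degrees, close to the 255 maximum
    fi
}
```

Calling fan_speed_for_temp 58 prints 0, which matches a board sitting just under the first threshold with its fan off.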
From a Linux terminal, you can check the temperature sensors for the 4 A15 cores (the “big” ones) and the GPU. To do so, you simply read 5 files that the kernel creates at boot. These are the file locations:
- /sys/devices/virtual/thermal/thermal_zone0/temp
- /sys/devices/virtual/thermal/thermal_zone1/temp
- /sys/devices/virtual/thermal/thermal_zone2/temp
- /sys/devices/virtual/thermal/thermal_zone3/temp
- /sys/devices/virtual/thermal/thermal_zone4/temp
The first 4 files (thermal_zone0 to thermal_zone3) correspond to the CPU cores, and the last one to the GPU. In practice, all the sensors report nearly the same temperature (give or take a degree) because they sit together on the same board, so when one part of the CPU heats up, the heat spreads to the rest of the components almost immediately. In other words, you don’t need to check all 5 sensors: reading zone 0 is enough, and the other zones will show practically the same value.
Here is an example of the check from an SSH session. Lines that start with a dollar sign are commands I typed, and lines without it are the command’s output. I’ll use this convention for all commands from now on:
$ cat /sys/devices/virtual/thermal/thermal_zone0/temp
58000
This output means that, in my case, the board is at about 58 degrees. The sensor reports thousandths of a degree, so you just divide the number by 1000 to get the temperature in degrees.
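The division can be wrapped in a small helper so the reading comes out directly in degrees. This is a minimal sketch: the name read_temp_c is invented for this example, and the optional argument (a path to a temp file) exists only so the function can be tried against any file.

```shell
# read_temp_c is a hypothetical helper name used only for this example.
# It reads a thermal zone "temp" file and prints whole degrees.
read_temp_c() {
    # Default to zone 0 when no file is given.
    temp_file=${1:-/sys/devices/virtual/thermal/thermal_zone0/temp}
    raw=$(cat "$temp_file") || return 1
    # The kernel reports thousandths of a degree, so divide by 1000.
    echo $((raw / 1000))
}
```

With the reading from the example above (58000), read_temp_c prints 58. The integer division drops any fraction of a degree, which is enough precision for this purpose.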
Adjusting temperature thresholds
I live in a fairly hot city, and the default thresholds don’t suit my conditions well. With the Odroid XU-4 simply powered on (idle, with almost nothing installed), this is what happened over time:
- The powered-on system eventually reached 60 degrees.
- The fan turned on for a couple of minutes.
- The temperature dropped to 58 degrees or a bit more.
- Back to the first step.
The problem was that you couldn’t touch the board for any length of time without feeling how hot it was. On top of that, 60 degrees seemed to me less than ideal for something meant to stay on permanently. So I ended up lowering the temperature thresholds a bit.
Since this is Linux, where practically everything is a file, we can check the 3 temperature thresholds for each of the sensors in this processor. For this we’ll use files found in the same directories as the temperature files from the earlier example. Temperature thresholds are called trip points, so this command displays the 3 trip points associated with each zone:
$ cat /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/trip_point_{0,1,2}_temp
# results
60000
70000
80000
60000
70000
80000
60000
70000
80000
60000
70000
80000
This one-line command is shorthand. What it really does is read the trip_point_0_temp file, then trip_point_1_temp and trip_point_2_temp, and repeat this for each of the four zones listed in the command. In other words, each of these temperature sensors on the Odroid has 3 trip points. Each trip point is associated with a fan speed, and by default the lowest one, number 0, has the lowest fan speed. If the current temperature is below all of the trip points, the fan stays off. On my board the temperature was 58 degrees, so none of the trip points was active when I measured it.
The trip point files are ordinary text files that you can write to. Although the documentation talks about an average, in reality if you, say, set trip point 0 of zone 0 to 30 degrees, that’s exactly when it triggers, even if the average of the other sensors says something different. Temperatures must always be written in the same format (degrees multiplied by 1000). For example, to make the first trip point trigger at 30 degrees, which starts the fan rotating, run the following command:
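Because the raw output is just a column of numbers, it can help to label each value with its zone and trip point. The following is a rough sketch of that idea using explicit loops; show_trip_points is a made-up name, and the optional base-directory argument is an assumption added so the function can be pointed at a test directory instead of the real files.

```shell
# show_trip_points is a hypothetical helper for this example.
# Optional argument: base directory (defaults to the real sysfs path).
show_trip_points() {
    base=${1:-/sys/devices/virtual/thermal}
    for zone in 0 1 2 3; do
        for trip in 0 1 2; do
            file="$base/thermal_zone$zone/trip_point_${trip}_temp"
            # Skip files that are missing or unreadable.
            [ -r "$file" ] || continue
            echo "zone $zone trip $trip: $(cat "$file")"
        done
    done
}
```

The loops visit the files in the same order as the one-line command, zone by zone and then trip point by trip point, so the values appear in the same sequence, only labelled.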
$ echo 30000 | sudo tee /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/trip_point_0_temp
$ cat /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/trip_point_0_temp
# results.
30000
30000
30000
30000
Keep in mind that when changing a trip point you should generally use this syntax, or else change the same trip point by hand for thermal_zone0, thermal_zone1, thermal_zone2 and so on. The second command simply verifies that the changes were written correctly, which is common practice when touching hardware parameters of this kind.
After a couple of seconds, if the reported temperature is above 30 degrees, the fan starts rotating and the board dissipates more heat. In my case the temperature dropped about 18 degrees after several minutes, which is very welcome.
$ cat /sys/devices/virtual/thermal/thermal_zone0/temp
40000
Good, now it looks much better!
Apply the changes permanently
An important point about changing hardware settings this way is that the change only lasts until the board is restarted or turned off. After either, the first trip point goes back to 60 degrees. Fortunately, we can make the change persistent by putting it in the /etc/rc.local file. This file is executed at the end of the normal system startup, just before any user session, and ideally it should contain only simple commands that adjust some system parameter, that is, lines not complex enough to need a systemd unit or something similar.
Editing this file involves one simple but important trick. Once it’s open, you’ll see many comment lines (they start with a number sign, #). Go to the end of the file and write your commands just before the final line, “exit 0”. That line ends the script, so anything placed below it is never executed. These are the contents of a “new”, unmodified /etc/rc.local file:
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
if [ -f /aafirstboot ]; then /aafirstboot start ; fi
exit 0
- The line “if [ -f /aafirstboot ]; then /aafirstboot start ; fi” is specific to the Ubuntu system for Odroid. Normally in other Linux systems it doesn’t appear.
Now that it’s clear how to edit this file (add your lines just before the last one), let’s continue. To move the trip points to values that suit me better and keep the change across reboots, I’ve set the first trip point to 30 degrees, the second to 50 and the third to 70. I pasted this just before the “exit 0”, like so:
# Activate trip_points at the following temperatures: 30°C, 50°C, 70°C
# These variables will be used for convenience in the rest of the script.
# To change the temperatures you just have to edit them.
TRIP_POINT_0=30000
TRIP_POINT_1=50000
TRIP_POINT_2=70000
# here the text is written to the corresponding files.
# First trip_point
echo $TRIP_POINT_0 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp
echo $TRIP_POINT_0 > /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_temp
echo $TRIP_POINT_0 > /sys/devices/virtual/thermal/thermal_zone2/trip_point_0_temp
echo $TRIP_POINT_0 > /sys/devices/virtual/thermal/thermal_zone3/trip_point_0_temp
# second trip_point
echo $TRIP_POINT_1 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_1_temp
echo $TRIP_POINT_1 > /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_temp
echo $TRIP_POINT_1 > /sys/devices/virtual/thermal/thermal_zone2/trip_point_1_temp
echo $TRIP_POINT_1 > /sys/devices/virtual/thermal/thermal_zone3/trip_point_1_temp
# third trip_point
echo $TRIP_POINT_2 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_2_temp
echo $TRIP_POINT_2 > /sys/devices/virtual/thermal/thermal_zone1/trip_point_2_temp
echo $TRIP_POINT_2 > /sys/devices/virtual/thermal/thermal_zone2/trip_point_2_temp
echo $TRIP_POINT_2 > /sys/devices/virtual/thermal/thermal_zone3/trip_point_2_temp
exit 0
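The twelve echo lines can also be written as a loop. The sketch below is one possible compact version, assuming the same four zones and three trip points; the function name apply_trip_points and its optional base-directory argument are made up for this illustration. It uses a plain for loop rather than the {0,1,2,3} brace syntax because /etc/rc.local runs under /bin/sh, which on Ubuntu is normally dash and does not expand braces.

```shell
TRIP_POINT_0=30000
TRIP_POINT_1=50000
TRIP_POINT_2=70000

# apply_trip_points is a hypothetical helper; the optional argument is the
# base directory, defaulting to the real sysfs location.
apply_trip_points() {
    base=${1:-/sys/devices/virtual/thermal}
    for zone in 0 1 2 3; do
        echo "$TRIP_POINT_0" > "$base/thermal_zone$zone/trip_point_0_temp"
        echo "$TRIP_POINT_1" > "$base/thermal_zone$zone/trip_point_1_temp"
        echo "$TRIP_POINT_2" > "$base/thermal_zone$zone/trip_point_2_temp"
    done
}
```

In /etc/rc.local, the function definition would be followed by a line that simply calls apply_trip_points, placed before the “exit 0”.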
To check that it worked, restart the operating system with the sudo reboot command and read the trip point files again once it’s back up.
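After the reboot, the trip point files can be compared against the values that were expected. The following sketch shows one way to automate that comparison; check_trip_points is a hypothetical name, and the fourth argument (a base directory) is an assumption added so the function can be tried outside the board.

```shell
# check_trip_points is a hypothetical helper for this example.
# Usage: check_trip_points EXPECTED_0 EXPECTED_1 EXPECTED_2 [BASE_DIR]
# Returns 0 when every zone matches, 1 otherwise.
check_trip_points() {
    base=${4:-/sys/devices/virtual/thermal}
    status=0
    for zone in 0 1 2 3; do
        trip=0
        for expected in "$1" "$2" "$3"; do
            file="$base/thermal_zone$zone/trip_point_${trip}_temp"
            # An unreadable file counts as a mismatch.
            actual=$(cat "$file" 2>/dev/null) || actual=""
            if [ "$actual" != "$expected" ]; then
                echo "zone $zone trip $trip: expected $expected, got ${actual:-nothing}"
                status=1
            fi
            trip=$((trip + 1))
        done
    done
    return $status
}
```

For the values used in this article, the call would be check_trip_points 30000 50000 70000. It prints nothing when everything matches and one line per mismatch otherwise.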
Emulate temperatures
One last thing worth describing is that you can emulate the board’s temperature. That way you don’t have to push the board to heavy load to see whether the fan responds correctly to your changes. The technique is useful for testing changes to the temperature thresholds or to the fan speed scale and, as usual, it simply consists of writing to another file. For example, to make the Odroid believe it’s at 85 degrees:
$ echo 85000 | sudo tee /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/emul_temp
# results.
85000
Now, you can check the temperature and see if the change has taken effect:
$ cat /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/temp
# results.
85000
85000
85000
85000
This file accepts whatever value you write to it and uses it in place of the real temperature. That is, while it holds a value other than 0 (which means disabled), the temperature sensor is ignored and the value in this file is used instead. To return the board to its normal state (I suppose it isn’t great to have the fan at full power all the time either), set the file back to 0:
$ echo 0 | sudo tee /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/emul_temp
# results
0
$ cat /sys/devices/virtual/thermal/thermal_zone{0,1,2,3}/temp
# results
42000
41000
45000
41000
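Since it’s easy to forget the reset, the two steps can be bundled so that the emulated value is always cleared afterwards. This is a minimal sketch under a few assumptions: the names set_emul_temp and emulate_temp_for are invented for this example, the functions write to the files directly (so they must run as root on the board), and the optional base-directory argument is there only for trying them elsewhere.

```shell
# set_emul_temp and emulate_temp_for are hypothetical helpers.
# set_emul_temp VALUE [BASE_DIR]: write VALUE to emul_temp in every zone.
set_emul_temp() {
    value=$1
    base=${2:-/sys/devices/virtual/thermal}
    for zone in 0 1 2 3; do
        echo "$value" > "$base/thermal_zone$zone/emul_temp"
    done
}

# emulate_temp_for VALUE SECONDS [BASE_DIR]: fake the temperature for a
# while, then write 0 to switch emulation off again.
emulate_temp_for() {
    set_emul_temp "$1" "$3"
    # Give the fan time to react while the fake temperature is in place.
    sleep "$2"
    set_emul_temp 0 "$3"
}
```

For example, emulate_temp_for 85000 30 would hold the emulated 85 degrees for 30 seconds and then restore the real readings. If the function is interrupted during the wait, the reset doesn’t happen and the 0 still has to be written by hand.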
Conclusion
After reviewing most of the settings for the integrated fan, I’ve kept the modified trip points and, thanks to the rc.local file, the change is applied at every system startup. The board now runs cooler in uninterrupted daily use, and it has been working like this for about 14 days without any problem.
More information about configuring the fan on this board can be found in the Odroid XU-4 Wiki (in English).