Self balancing robot improvements

I was dissatisfied with the robot's performance, so I experimented a lot with various aspects to improve it. Along the way the robot, however simple in concept, proved to be a pretty complex system which depends on many factors I didn't anticipate. This variety of factors, all of which influence its movements, makes it very difficult to troubleshoot problems. So now I'll try to recall my adventures with it in order. Oh, and my objective (best case scenario) is to make it stable to the point that it stands completely still. This is what I found the most difficult.

Better motors

Long (tall) version with hard tires.

I decided that my first set of motors (Dagu DG02S 48:1) had too little torque at low RPM (driven from a DRV8835 H-bridge with PWM). The robot was able to stand, but wiggled (video in the previous post). So I changed them to JK28HS32 stepper motors, which proved to be even weaker. They simply have too little torque across the whole RPM range for the weight of my robot. Other disadvantages:

  • They draw tons of current even when the robot is stationary. This is because ungeared steppers (at least small ones) spin freely when unpowered and thus need current to hold them in position.
  • More complex electronics (one full H-bridge per motor instead of half an H-bridge per brushed motor), and much more complex programming.
  • Intense vibration even in half-stepping mode, and I used 200-step ones. The solution would be to use smaller wheels with soft tires, or to implement microstepping, which would in turn reduce the incremental torque. Every time I use one of these steppers and dig into the subject, I realize how complicated a stupid motor can get (a sketch of the half-step sequence follows below).
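For reference, "half stepping" boils down to walking a two-coil (bipolar) stepper through the eight-state sequence below. This is a generic illustration, not my driver code; +1/0/-1 stand for the coil current direction.

#include <array>
#include <cstdint>

// One full electrical cycle of half stepping for coils A and B.
// Full stepping uses only every second entry; microstepping replaces
// the +-1 levels with finer sine/cosine values.
const std::array<std::array<int8_t, 2>, 8> halfStepSequence = {{
        {+1, 0}, {+1, +1}, {0, +1}, {-1, +1},
        {-1, 0}, {-1, -1}, {0, -1}, {+1, -1},
}};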

So, not without disappointment, I moved to another set of motors, which happened to be the Brushless DC Motor with Encoder 12V 159RPM from DFRobot (I found them on AliExpress, BTW). To my consternation it didn't help much. The advantages of the newly mounted motors are their high torque and ease of interfacing (they have the controller stuffed inside), but the cons are:

  • They have quite some backlash.
  • They run on 12V, so I had to rework some of the electronics.
  • They could be a little bit faster, though in the end I was able to stabilize the thing quite satisfactorily.

Geared BLDC motors

To sum up: high quality motors eliminate one point of failure and let you focus on other aspects when something is failing. My ideal motors, the ones I would like to have, are:

  • Precise in terms of control. A wide range of RPMs, i.e. able to spin super slow or quite fast with decent torque in all situations. What is a decent torque? Dunno. 2.4 kg*cm?
  • Fast. I would love to have 300RPM.
  • Minimal backlash.
  • Encoders built in.
  • Not to mention power efficiency and ease of programming, but these are not the most important things.


So then I finally decided to stop there with changing motors and not replace them for a fourth time, but to tune the PID and fiddle with the software instead. I replaced my fusion algorithm, from a simple complementary one to Madgwick's, an open-source implementation of which is available online; it is part of the PhD thesis of some clever guy (Sebastian Madgwick). I cannot stress how far superior this algorithm is over my humble thing. But no luck with that either. If I increased Kp and Ki it would oscillate vigorously (Kd was not of much help there), and when I decreased Kp, and especially Ki, the robot would always drift away, gaining speed and tipping over.

Playing with the dreaded thing, which would not stand at all and wiggled like a drunk, I recalled that in many YouTube videos (user upgrdman among my favorites on the subject) people were driving their robots on soft surfaces like carpets, and that made me wonder. So I put my robot on a blanket and it helped a little: it reduced both the oscillations and the drifting. I have a theory that a soft surface (and/or soft tires) first damps vibrations, but also that when the tire flattens under the weight, it somewhat blocks further movement. Hard tires do not have this effect, and on hard surfaces they spin whenever the robot is only slightly out of balance, whereas on soft ones this slight error can be counteracted by the deformed wheel to some very small extent. Think of dry sand on a beach and big soft wheels; I think even when unpowered, the robot would have a chance to stand still there.

So I thought about changing the wheels (and tires), because at that time I used pretty hard wheels from a hardware store (furniture casters), and I ordered 70mm and 110mm squishy wheels for model aircraft. And then, while waiting for the wheels, I made the hasty decision to shorten the thing.
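For context, the "simple complementary one" I started from boils down to something like this. A minimal sketch of a generic complementary filter, not my exact code, and obviously much cruder than Madgwick's algorithm:

// Fuse the gyro rate (deg/s) and the accelerometer-derived pitch (deg) into one angle.
// alpha close to 1.0 trusts the integrated gyro; the small remainder slowly pulls the
// estimate towards the accelerometer so it cannot drift forever.
float complementaryPitch (float prevPitch, float gyroRateDegS, float accelPitchDeg, float dt)
{
        const float alpha = 0.98f;
        return alpha * (prevPitch + gyroRateDegS * dt) + (1.0f - alpha) * accelPitchDeg;
}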


Bear in mind that I can be completely wrong here! As far as I understand, with a normal pendulum, the longer the string, the higher the linear velocity of the pendulum bob. I assumed the same holds for an inverted pendulum, thus the taller the robot, the higher the velocity of its top end when tipping over, and so, I concluded, the faster the lower-end movements needed to counteract the moving top. At that time I thought my motors were too slow, so it seemed like a good idea to shorten the body. So I did. And the improvement was negligible, if there was any at all. Also, from the pendulum frequency equation I knew that shortening the length increases the pendulum's frequency, so I understood that my algorithm would have to react faster from now on.
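For the record, the pendulum frequency equation I had in mind is the small-angle formula

f = \frac{1}{2\pi} \sqrt{\frac{g}{L}}

so halving the length L raises the frequency only by a factor of \sqrt{2} \approx 1.4. In the simplest point-mass model of the inverted pendulum the same \sqrt{g/L} term sets how fast a tilt error grows, which is why a shorter robot needs a faster control loop, but not dramatically so.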

Mechanical issues

Then I realized that the wheels had come loose. I ordered a bunch of adapters and fixed them in place. Still no luck.

Scratching head and tuning PID

I couldn’t sleep but visualized my dangling robot. I tried to tune the PID controller a lot. And still every time I increased Ki it oscillated, and when decreased it drifted away. Too much Ki and it oscillated left and right, and too much Kd and it oscillated but in like one direction. Like hiccup. Derivative term damps output when error decreases, so in theory the robot should decelerate near sweet point and make full stop, but instead it would stop for a moment (like millisecond moment), and start in the same direction, then stop and then again, all in sudden sequence.

nRF24L01+ distraction

I was frustrated and sick and tired, so I decided to do something different as a break from the main subject. I got into the nRF24 code and telemetry. I wanted to make full telemetry for the project, as user upgrdman did; I thought it would help me debug. Then I ran into other problems and procrastinated a little bit more with refactoring my SPI library and so on. I had curious adventures with the nRF though, but that is a completely different story. I use an excellent application called KST2 for plotting the data. From the plots I observed that the robot was drifting due to sudden integral peaks. It was like the robot was balancing for a moment, then the pitch plot would show a slight shift in one direction, and just after that the integral part grew, the robot started to move and tried to catch up, the integral increased and increased, the robot sped up, and then the output saturated and that was it. The only thing which helped a little was to increase Ki (it amplified the reaction to the integral, so the robot drove faster and was able to catch up and straighten), but it also increased the oscillations.
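One generic way of taming such runaway integral peaks is to clamp the integrator once it exceeds what the output can actually act on (anti-windup). A tiny sketch of the idea only; this is not what ultimately fixed my robot:

#include <algorithm>

// Clamp the integral term so that once the motor command saturates,
// the integrator stops accumulating error it can never act on.
float clampIntegral (float integral, float integralLimit)
{
        return std::max (-integralLimit, std::min (integral, integralLimit));
}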

Main loop frequency

In an act of desperation, not really hoping for a change for the better, I increased the main loop frequency from 100Hz to 1kHz. I could do that because I use a very fast STM32F4 (compared to the Arduinos and ATmegas that everybody seems to love so much) with an FPU. And that was it. It calmed the robot down so much that I could increase Ki to values not possible before. Then I mounted the 110mm squishy wheels and it helped even more, for the reasons described above and because bigger wheels give more speed. That's all for now, I'm a little bit tired of this project, but in the future I plan to (a minimal sketch of the faster loop pacing follows after the list below):

  • Program encoders.
  • Program remote control (I bought myself nice Devo 7E transmitter).
  • Experiment with all (or most) of the parameters I talked about. Height vs. loop frequency, various wheels and so on.
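And the promised loop-pacing sketch: roughly what running the control loop at a fixed 1 kHz looks like. Generic code; millisecondTick, readSensorsAndFuse and pidUpdateAndDriveMotors are placeholder names, not my actual functions:

#include <cstdint>

// Placeholder: a counter incremented every 1 ms, e.g. from the SysTick interrupt.
extern volatile uint32_t millisecondTick;

// Placeholders for the real work done each iteration.
void readSensorsAndFuse ();
void pidUpdateAndDriveMotors (float dt);

void controlLoop ()
{
        const uint32_t periodMs = 1;            // 1 kHz instead of the old 100 Hz (10 ms).
        uint32_t next = millisecondTick;

        while (true) {
                while (millisecondTick < next) { /* busy wait until the next slot (wraparound ignored in this sketch) */ }
                next += periodMs;

                readSensorsAndFuse ();
                pidUpdateAndDriveMotors (periodMs / 1000.0f);
        }
}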

Wheels galore

Self balancing robot first tests

I have built a self balancing robot, and here I want to post some notes regarding the problems I encountered. It was (as usual) more difficult than I anticipated, but I guess every hacker knows this feeling.


The frame is made of 8mm aluminium pipes, two sheets of 2mm solid transparent polycarbonate, and aluminium angle bars. The pieces are screwed together with threaded rods and self-locking nuts. Making the frame was the easiest part, so I won't write much about it.


The motors are labeled "Dagu DG02S 48:1", and I am disappointed, because they have very low torque at low RPM and are, I think, of low quality (their low price also suggests that). The consequence is that they are unable to move my robot gently and precisely; instead they start all of a sudden at a couple of RPM. At very low PWM duty cycles they are unable to move the wheels at all, even with no load. The lowest speed I can get out of them is a couple of RPM, which sounds OK, but I think many of my problems arise from them being unable to gently move the wheels a few millimeters back and forth. Such gentle movement is, in my opinion, crucial for the steady (nearly stationary) operation you see in certain YouTube videos. So the next tests (if there are any) will be performed with steppers or geared brushless motors. Oh, and there are no encoders either.
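Since these motors refuse to move below a certain duty cycle, one common workaround (not something I had in the robot at the time) is to remap the controller output so it skips the dead zone. A sketch, where minDuty is a value you would have to find experimentally:

#include <cmath>

// Remap a normalized command (-1..1) so that any non-zero request starts
// at the minimum duty cycle that actually makes the motor turn.
float compensateDeadband (float command, float minDuty)
{
        if (command == 0.0f) {
                return 0.0f;
        }
        float sign = (command > 0.0f) ? 1.0f : -1.0f;
        return sign * (minDuty + (1.0f - minDuty) * std::fabs (command));
}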


If the motors are the brawn, the electronics are the brain. The main board is an STM32F4-DISCO and is connected to the battery via custom perfboards with connectors. On the top of the robot there is a single 18650 Samsung cell paired with a cheap LiPo charger and a switch. I chose it because it is a lot cheaper per mAh than those silverish ones. The battery sits inside a bracket ordered on AliExpress. As for the accelerometer and gyroscope, the super popular MPU 6050 does the job (also AliExpress). The motors are driven by a Pololu DRV8835, and last but not least an nRF24L01+ is used for connectivity with the robot, which is crucial for tuning the PID controller without dangling cables, which would degrade stability and shift the center of mass. Like shooting at a moving target.

Oh, and I wanted to control the robot using a cheap CX-10 remote, but it failed completely. I based my code on Deviation and others' work, but after many hours I gave up. Then I got a Syma X5HW for Christmas, and with its remote I finally had success (on the first try, after like 10 minutes of coding; not every day does something like this happen. But of course I only had to modify the parameters of my CX-10 code). At first I confused the binding channel number (I set 0 instead of 8), and I was still able to receive data, but only from about 5 cm away. Then, after setting it correctly to channel 8, the range increased dramatically.
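For the curious, on the nRF24L01+ side the channel is just the RF_CH register, so getting it right boils down to a single SPI register write along these lines. A sketch only; spiTransfer and the chip-select handling stand in for whatever SPI layer you use:

#include <cstdint>

// Placeholder for a byte-wide SPI exchange, with CSN already asserted around the transaction.
uint8_t spiTransfer (uint8_t byte);

// nRF24L01+ command and register addresses from the datasheet.
constexpr uint8_t W_REGISTER = 0x20; // 001A AAAA
constexpr uint8_t RF_CH = 0x05;

// channel 0..125 selects the frequency 2400 + channel MHz.
void setRfChannel (uint8_t channel)
{
        spiTransfer (W_REGISTER | RF_CH);
        spiTransfer (channel & 0x7F);
}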


With the electronics and motors in place comes programming, the most difficult part. First I got the robot's peripherals working (the Arduino library called i2cdevlib was very helpful), so I was able to read raw data from the MPU 6050, send basic commands via nRF, and spin the motors. Then, the most challenging part was to implement:

  • pitch angle calculation (a small sketch follows after this list),
  • fusion of gyroscope and accelerometer data (this was helpful),
  • PID tuning (though implementing a PID is dead simple, tuning it is a completely different story). Most helpful for me in this regard was probably the Wikipedia article on the PID controller. In my case, the most (positive) impact came from the integral part, while the D part seems to have minimal influence.
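As for the pitch angle calculation mentioned above, the accelerometer-only part is just trigonometry on the raw axes. A small sketch; the axis mapping depends on how the MPU 6050 is mounted, so treat x/y/z here as placeholders:

#include <cmath>

// Pitch estimated from the gravity vector alone (valid only when the robot is not
// accelerating); the gyro-fused version is what actually gets used in the loop.
float accelPitchDeg (float ax, float ay, float az)
{
        return std::atan2 (ax, std::sqrt (ay * ay + az * az)) * (180.0f / 3.14159265f);
}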

I’ll try to force myself to get into detail on the topics above, as I only scratched the surface.

Implementing zoom with moving pivot point using libclutter

I’m making this post, because I struggled with this functionality a lot! I was implementing it (to the extent I was happy with) in the course of 4 or even 5 days! Ok, first goes an animated gif which shows my desired zoom behavior (check out Inkscape, or other graphics programs, they all work the same in this regard):

Desired zoom functionality

Basically the idea is that the center of the scale transformation is where the mouse pointer is on the screen, so the object under the cursor does not move while scaling, whereas the other objects around it do. My first and most obvious implementation (which didn't work as expected) was this:

/*
 * Point center is in screen coordinates. This is where the mouse pointer is on the screen.
 * The layout is : GtkWindow has a ClutterStage which contains the ClutterActor scaleLayer
 * (in this function named self), which contains those blue
 * circles you see on the animated gif.
 * ScaleLayer is simply a huge invisible plane which contains all the objects a user
 * interacts with. The user can pan and zoom it as he wishes.
 */
void ScaleLayer::zoomOut (const Point &center)
{
        double scaleX, scaleY;
        // This gets our actual zoom factor.
        clutter_actor_get_scale (self, &scaleX, &scaleY);
        // We decrease the zoom factor since this is a zoomOut method.
        float newScale = scaleX / 1.1;

        float cx1, cy1;
        // Like I said, 'center' is in screen coords, so we convert it to scaleLayer coords.
        clutter_actor_transform_stage_point (self, center.x, center.y, &cx1, &cy1);

        float scaleLayerNW, scaleLayerNH;
        clutter_actor_get_size (self, &scaleLayerNW, &scaleLayerNH);
        // We set the pivot point.
        clutter_actor_set_pivot_point (self, double(cx1) / scaleLayerNW, double(cy1) / scaleLayerNH);
        // And finally perform the scaling. Fair enough, isn't it?
        clutter_actor_set_scale (self, newScale, newScale);
}

Here is the outcome of the above:

Zoom fail

It is fine until you move the mouse cursor, which changes the pivot point (the center of the scale transformation) while the scale is not 1.0. I dunno why this happens. Apparently I do not understand affine transformations as well as I thought, or there is a bug in libclutter (I doubt it). The solution is to convert the pivot point from screen to scaleLayer coordinates before scaling (as I did), and again after scaling. The difference is in scaleLayer coordinates, so it must be converted back to screen coordinates, and the result can be used to cancel the offset you see on the second gif. Here is my current implementation:

void ScaleLayer::zoomIn (const Point &center)
{
        double x, y;
        clutter_actor_get_scale (self, &x, &y);

        if (x >= 10) {
                return; /* Don't zoom in any further (body reconstructed from context). */
        }

        double newScale = x * 1.1;

        if (newScale >= 10) {
                newScale = 10;
        }

        scale (center, newScale);
}


void ScaleLayer::zoomOut (const Point &center)
{
        ClutterActor *stage = clutter_actor_get_parent (self);

        float stageW, stageH;
        clutter_actor_get_size (stage, &stageW, &stageH);

        float dim = std::max (stageW, stageH);
        double minScale = dim / SCALE_SURFACE_SIZE + 0.05;

        double scaleX, scaleY;
        clutter_actor_get_scale (self, &scaleX, &scaleY);

        if (scaleX <= minScale) {
                return;
        }

        scale (center, scaleX / 1.1);
}


void ScaleLayer::scale (Point const &c, float newScale)
{
        Point center = c;
        float cx1, cy1;

        if (center == Point ()) {
                if (impl->lastCenter == Point ()) {
                        float stageW, stageH;
                        ClutterActor *stage = clutter_actor_get_parent (self);
                        clutter_actor_get_size (stage, &stageW, &stageH);
                        impl->lastCenter = Point (stageW / 2.0, stageH / 2.0);
                }

                center = impl->lastCenter;
        }
        else {
                impl->lastCenter = center;
        }

        clutter_actor_transform_stage_point (self, center.x, center.y, &cx1, &cy1);
        float scaleLayerNW, scaleLayerNH;
        clutter_actor_get_size (self, &scaleLayerNW, &scaleLayerNH);
        clutter_actor_set_pivot_point (self, double(cx1) / scaleLayerNW, double(cy1) / scaleLayerNH);
        clutter_actor_set_scale (self, newScale, newScale);

        // Idea taken from here :
        float cx2, cy2;
        clutter_actor_transform_stage_point (self, center.x, center.y, &cx2, &cy2);

        ClutterVertex vi1 = { 0, 0, 0 };
        ClutterVertex vo1;
        clutter_actor_apply_transform_to_point (self, &vi1, &vo1);

        ClutterVertex vi2 = { cx2 - cx1, cy2 - cy1, 0 };
        ClutterVertex vo2;
        clutter_actor_apply_transform_to_point (self, &vi2, &vo2);

        float mx = vo2.x - vo1.x;
        float my = vo2.y - vo1.y;

        clutter_actor_move_by (self, mx, my);
}

The whole project is here :
Here’s the thread which pointed me in right direction :

Cross compilation with GCC & QtCreator for ARM Cortex M

I used Eclipse CDT for years for C/C++ and was disappointed by its bulkiness, slowness and memory usage. I did mostly embedded work, and sometimes GTK+ desktop apps (even OpenGL once or twice). I looked for a replacement, tried a dozen or more IDEs and editors, and finally settled on QtCreator (my main criteria were: great code navigation (Eclipse often gets confused by serious, templated C++ code), great code completion, and CMake integration). I am satisfied for now (it's been a year), and I use it for everything but Qt. But all of a sudden, when a new version appeared, I came across a minor flaw in the CMake builder, which reported an error like "Can't link a test program". Obviously it can't, because it used the host compiler instead of the ARM one.

So in versions prior to 4.0.0 I used to configure my project with CMake like this:

cd build
cmake ..

Then, from QtCreator, I simply compiled the project and it worked flawlessly. But since QtCreator 4.0.0 it started to invoke cmake in every possible situation, be it IDE startup or saving a CMakeLists.txt file. And it did it with -DCMAKE_CXX_COMPILER=xyz, where "xyz" was the path configured in Tools -> Options -> Build & Run -> Compilers. If I ran cmake manually, without this CMAKE_CXX_COMPILER variable set, everything was OK. I saw on the Internet that many people had the same problem and used QtCreator for embedded work like I do (see comments here).

So I decided that instead of forcing QtCreator to stop invoking cmake, or invoking it with different parameters, I should fix my CMakeLists.txt so it would run the way QtCreator wants it. The solution I found was CMAKE_FORCE_C_COMPILER and CMAKE_FORCE_CXX_COMPILER, documented here. My CMakeLists.txt looks like this:


include (stm32.cmake)

PROJECT (robot1)


LIST (APPEND APP_SOURCES "src/stm32f4xx_it.c")
LIST (APPEND APP_SOURCES "src/syscalls.c")
LIST (APPEND APP_SOURCES "src/system_stm32f0xx.c")
LIST (APPEND APP_SOURCES "src/config.h")
LIST (APPEND APP_SOURCES "src/stm32f0xx_hal_conf.h")




And the “toolchain file” is like this:

SET (TOOLCHAIN_PREFIX "/home/iwasz/local/share/armcortexm0-unknown-eabi" CACHE STRING "")
SET (TARGET_TRIPLET "armcortexm0-unknown-eabi" CACHE STRING "")
SET (CUBE_ROOT "/home/iwasz/workspace/stm32cubef0")
SET (CRYSTAL_HZ 16000000)
SET (STARTUP_CODE "src/startup_stm32f072xb.s")

# Force the cross compiler as described below (these three lines are a reconstruction;
# adjust the gcc/g++ paths to your own toolchain layout).
INCLUDE (CMakeForceCompiler)
CMAKE_FORCE_C_COMPILER ("${TOOLCHAIN_PREFIX}/bin/${TARGET_TRIPLET}-gcc" GNU)
CMAKE_FORCE_CXX_COMPILER ("${TOOLCHAIN_PREFIX}/bin/${TARGET_TRIPLET}-g++" GNU)

SET (CMAKE_C_FLAGS "-std=gnu99 -fdata-sections -ffunction-sections -Wall" CACHE INTERNAL "c compiler flags")
SET (CMAKE_CXX_FLAGS "-std=c++11 -Wall -fdata-sections -ffunction-sections -MD -Wall" CACHE INTERNAL "cxx compiler flags")
SET (CMAKE_EXE_LINKER_FLAGS "-T ${LINKER_SCRIPT} -Wl,--gc-sections" CACHE INTERNAL "exe link flags")



Two most important changes were:

  • Using CMAKE_FORCE_CXX_COMPILER macro instead of simply setting CMAKE_CXX_COMPILER var.
  • Including the toolchain file (and thus setting / forcing the compiler) before PROJECT macro.

My configuration as of writing this:

  • Qt Creator 4.0.2, Based on Qt 5.7.0 (GCC 4.9.1 20140922 (Red Hat 4.9.1-10), 64 bit), Built on Jun 13 2016 01:05:36, From revision 47b4f2c738
  • Host system : Ubuntu 15.10, Linux ingram 4.2.0-38-generic #45-Ubuntu SMP Wed Jun 8 21:21:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Host GCC : gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010
  • Target GCC : armcortexm0-unknown-eabi-gcc (crosstool-NG crosstool-ng-1.21.0-74-g6ac93ed – iwasz) 5.2.0

BlueNRG + STM32F7-DISCO tests


  • Tested the 32-bit cross-compiler from Launchpad; it worked OK, but the binaries were much bigger, and the debugger lacked Python support, which is required by QtCreator, the IDE of my choice (for all C/C++ development, including embedded).
  • Made my own compiler with crosstool-NG (master from Git). I used my own tutorial, which can be found somewhere on this blog.
  • Verified that a "hello world" / "blinky" program works. It did not at first, because I copied code for the STM32F4 and messed up the clock configuration. Examples from STM32CubeF7 helped.
  • I ported my project which aimed to bring my BlueNRG dev board to life. The project consists of a USB vendor-specific class for debugging (somewhat more advanced, and tuned for the purpose than simple CDC class), slightly adapted BlueNRG example code from ST (for Nucleo), and some glue code.
  • I switched to the STM32F7-disco because I couldn't get the BlueNRG to work with the STM32F4-disco. The F7 has an Arduino connector on the back, so I thought it would be better than the bunch of loosely connected wires the F4 required. At that time I thought there was something wrong with my wiring.
  • On the STM32F7-disco BLE still would not work. I hooked up a logic analyzer and noticed that MISO and CLK were silent. It turned out that I had to use the alternative CLK configuration on the BlueNRG board, which required moving a 0R resistor from R10 to R11. Although it was small (0402), at first I had problems desoldering it, probably because of the lead-free solder used. See image.


  • The next problem I had was with the SPI_CS pin. According to the BlueNRG dev board docs, I wanted to use the D8 pin (this is the PI2 pin on the STM32F7-disco), but it didn't work. A glance at the schematics revealed that SPI_CS is connected to A1, which is marked as "alternative" in the docs. So IMHO this is an error in the docs.
  • Final pin configuration that worked was:
// SPI Reset Pin
// MISO (Master Input Slave Output)
// MOSI (Master Output Slave Input)
// IRQ
// !!!
  • The next thing that got me puzzled for quite some time was that after the initial version query command, which seemed to work OK (I got some reasonable looking responses), the communication hung. The IRQ pin went high and wouldn't go low, so the µC thought there was data to read and was continuously reading and reading.
  • Only after I read in the UM1755 user manual that IRQ goes HIGH when there is data from the NRG to the µC, and goes HI-Z when there is no data, did I double check the schematics and find that the BlueNRG dev board has no pull-down resistor on it. The ST example code I had also didn't configure the IRQ pin with a pull-down, so I wonder how this could work. Maybe the Nucleo boards have a pull-down resistor soldered there.
  • For now (after 3 or 4 nights, and I consider myself quite experienced) I got this (see the screenshot): Screenshot from 2016-03-21 02-09-30
  • “Iwona” is probably my neighbor’s phone.
  • A quick update made the next morning: my Linux computer is able to connect to the test project and query it (right now it simply counts up):

Screenshot from 2016-03-21 13-05-25

Internet thermal printer part 2

The printer I bought and described in the previous post really disappointed me. I didn't spend a huge amount of time on it (say 3-4 evenings), but I dug into the subject so deep that I couldn't help myself and did some more hacking. First of all I wanted to know whether my cool looking but quite useless printer could be used in some other way (i.e. the printing head), and whether it is the main board that is broken or the head itself. If the former were true, and the head was OK, I would try to communicate with the head directly and thus pretty much reimplement the broken printer main board. But if the head were broken, I couldn't do anything but abandon the project or find another printer. And it's funny, because, as you may have seen at the end of my previous post, this is exactly what I wrote not to do. But I just love it. When the work you do all day, every day is stupid and pointless, when you are constantly bothered with more and more irrelevant things, and after all day you are tired and discouraged, what would you do after arriving home (excluding household duties :D)? Grab a beer, sit and watch TV? Hell no! Grab a beer and tinker some more! It calms me down, you know (unless I'm stuck for too long). The printer head used in my Intermec PW40 is a Seiko Instruments (SII) LTP3445 (datasheet here) and it is obsolete. New designs are encouraged to use the LTPV445.

So what I did was solder a bunch of wires to the main board to be able to speak directly to the printing head. The resulting wiring looks like this:

Connected directly to the thermal printer head.

Then I grabbed signals with a logic analyzer and an oscilloscope to figure out what was malfunctioning (i.e. while the printer was operating). In my opinion the main board is broken, because printing short strings like 'A\r\n' works OK, and all the signals seem to be correct (i.e. 832 bits per row are transferred and quite a few rows are present). But when longer strings are submitted, the whole transmission appears to be corrupted at some point; the serial data burst is clearly shorter, as if interrupted. Unfortunately I only made a screenshot of the correct transmission (A\r\n), and don't have the corrupted one now (and the board is not operational since I removed the FFC socket). Here's the screen:

Correct transmission to the thermal head issued by the original Intermec PW40 main board. Letter 'A' is being transmitted.

The next step was to wire up some circuitry to actually drive the head while it was still soldered to the original main board. I didn't want to break the board at that point, but later that ceased to be a priority :D My setup consists of:

Breadboard looks like this:

The circuit. You can see that the printer is more or less intact, i.e. the head is mounted on the main board and the plastic frame. Later on I decided to disconnect the head from the original main board.

The shifters are controlled with 3.3V and output 5V for the head's logic. The whole contraption is powered from a laboratory power supply set to 5V with a low current limit, to prevent smoke and fire in case of errors in the wiring on my side. The setup drew about 0.1A when idle and 2.5A when feeding the paper. Driving the motor was pretty easy; I had driven a stepper motor before, so I rather quickly caught up with this one. But the head took more time, and at some (thankfully short) point I was stuck. First, the DST signal (DST is for power and thus temperature) circuitry on the main board is secured with some (I believe) TTL logic. The idea is that if the thermistor tells the µC that the head is overheating, the µC shuts the head down. This is the first protection mechanism, implemented in software (BTW the manual says that when overheating, the head may cause skin burns, smoke and even fire; it is a thermal head after all). But there is another protection mechanism, done in hardware, which shuts down the head if the software one malfunctions. I believe the two mechanisms are ANDed by some TTLs. The protection mechanisms pull DST down in case of trouble. In my case, when two logic circuits were actually connected to the head, this caused problems, because the original main board, which was not powered, pulled DST down all the time. The solution was to cut the trace, and that was it (if not cut, DST would stay low no matter what level I was trying to drive it to; the oscilloscope showed only 100mV level changes, obviously too small to be useful).

My transmission. A 12*8 pixel wide bar strip. 12 x 0xaa.

But still no luck after the DST problem was resolved, so I decided that something else on the original main board was interfering and I needed to disconnect the head from it and connect to it directly. I didn't have a spare FFC socket though (Molex, 25 pin, 1.25 mm pitch, rare and obsolete), so after obtaining a wrong one from Farnell (I bought 1 mm pitch instead of 1.25, duh!) I soldered the wires directly to the FFC cable. It looks awful, but is rigid:

Wires soldered directly to the FFC strip.

Still no luck! What the hell! The logic analyzer still happily showed correct bursts of data, so for the third time I rewired the breadboard and checked the levels with an oscilloscope. And a curious thing was revealed: all levels (shifted up) were 0-4V instead of 0-5V. I had absolutely no idea why. My power supply is a cheap one, but can 1 or 2 amps of load cause a 1V drop? Must investigate further. EDIT: my cheap counterfeit Saleae logic analyzer must have a somewhat low input impedance, and that was what caused the significant voltage drop on the logic signals. Disappointing. In the picture below you can see (far left) that only after increasing the voltage repeatedly did the printer start to print:

The first successful printout.

I’m excited!

Internet thermal printer

The idea is shamelessly stolen from this project. EDIT : It evolved… What this project is intended to be:

  • A toy printer (for my son) with some light and sound signals, connected to the Internet and accessible via some web interface. Anyone with the password (basic auth configured in .htaccess) could send a graphic and/or text message to the printer, which would immediately flash the light and beep the buzzer, as well as print the message. Protocol to the printer (on the network level): whatever, a.k.a. my own & super-simple.

What this project shall not become EDIT : It evolved.. (note to myself because I tend to complicate things):

  • An over-engineered wireless CUPS compatible Postscript full featured printer which also makes coffee.

After deciding that I would try to make such a thing, which took approximately 1 second after seeing Jim's site, I went to Allegro (the local eBay; BTW we do have eBay here in Poland, but Allegro seems to be winning the battle) and found something printer-ish and seemingly broken, with some parts missing. It is an Intermec PW40 mobile printer. Useful links I found on this printer:

  • Manuals – Intermec site (those are for PW50, but I assume they are compatible in some way).
  • Intermec community – they even have forum, and some community around the site.

Photos after dismantling the thing:

Looks like it uses ESC/P like Jim's printer, and a 7.2V battery pack as well. Looks promising (at least some standard language). Elements found on the main board of the PW40:

I’ve written the LTC chip looks promising, because it connects the printer to the outside world, and gives a hint where to start hacking. It translates RS232 high voltage levels to TTL, but since I wanted to drive the printer directly from some µC I needed to bypass the LTC. After some research I determined what follows : RS 232 port (this with RJ socket) is connected to pins 14 (232 input) and 15 (232 output). Corresponding TTL pins are : pin 13 (logic output), and 12 (logic input). So as far as I am reasoning correctly :

  • Pin 13 is connected to the Toshiba’s RX pin.
  • Pin 12 is connected to the Toshiba’s TX pin.
  • Whole device can be powered from 12V supply (I read that somewhere).
  • Let’s try it! Seems to work. At least PC and the printer are communicating. Wiring looks like this:

Costs so far:

  • Printer : 25PLN ($8).
  • 10 rolls of thermal paper 20PLN ($7)

Intermec provides a CUPS driver for Linux which lets you use their printers as regular printers in the OS. Apparently the PW40 isn't supported. I successfully compiled and installed the software, but printing a random text file gave me gibberish. After that I tried to communicate with the printer in the ESC/P language directly, but with no luck. I described my problems on the Intermec forums and am still waiting for a reply. In short, the problem is that I don't really know for sure whether it is me doing something wrong, or the printer is broken (it was sold at auction as broken, but the seller couldn't tell for sure whether it really is). So after two evenings the situation is that I am able to print only one character in a row. If I send more than one character to print, it hangs. To make matters worse, my printer won't print a self-test page as described in the manual; it feeds the paper a little and that's all. On the other hand, I found a datasheet of the printer head used in my printer, but using it directly would be a triumph of form over content, I'm afraid, and I don't have enough time for that (i.e. making my own printer from scratch). But I'm overambitious, you know, so who knows…
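For illustration, "communicating in ESC/P directly" meant pushing raw byte sequences of roughly this kind down the serial line. A minimal sketch only; whether the PW40 really honours plain ESC/P is exactly what I could not confirm:

#include <cstdint>

// A minimal ESC/P probe: ESC @ (initialize), some text, then LF to print the line.
const uint8_t escpProbe[] = {
        0x1B, 0x40,                 // ESC @ : initialize printer
        'H', 'E', 'L', 'L', 'O',
        0x0A,                       // LF : print the buffer contents and feed one line
};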

This is the only thing it can print. If I try to print more than one character in a row, it hangs.

The any key

…which is in fact a one-button HID keyboard which you can reprogram to be any key or combination of keys you wish (open source hardware and software). Links for a start:

And a quick video (blurry, one shot):

At some point, after a few battles I bravely fought with STM32s, I wanted to learn something new. I've been on Texas Instruments' site a few times because I wanted to learn more about the BeagleBone Black and the Sitara CPU that sits on it, and I spotted the Tiva microcontrollers somewhere on the page. After quick research they looked very promising: they can be easily programmed with the GCC stack under Linux, they have affordable starting platforms (they call them LaunchPads, and they cost $13 and $20 for the TM4C123 and TM4C129 respectively) and, what is most important for me, they have well written peripheral libraries and documentation (i.e. at that time I could only rely on opinions from the Web, but after my first project I can definitely confirm that).

My button assembled

So I started a new, simple project, which I had previously tried to make with STMs and had countless problems with (here is the link). I got an EK-TM4C123GXL LaunchPad and it's great. Somewhere in the near future I'll try to write another post explaining how to start development on Linux with the GCC toolchain on this board, but for now I can only assure you that getting started was as easy and quick as one evening (I used my own cross-compiler, which is described in a previous post here). The project aims to construct a one-button USB HID keyboard which can be reprogrammed so that pressing the button sends any key code the user wishes, or even a combination of keys if necessary. I imagined that it would be super cool to have something like that on my desk at work, and if someone comes by and interrupts my work, I would ostentatiously hit the big red button which stops the music in my headphones and ask: "what? once again?".

TI provides an excellent peripheral library with many examples for many evaluation boards. Furthermore they have a great USB library which is neatly divided into four tiers dependent on each other. At the very top is the highest level tier, called the "Device Class API", which lets one implement typical devices in just a few lines of code (I mean simple HID, DFU, CDC etc.). ST does not have that! The Device Class API is great, but in fact quite inflexible. For example, a HID keyboard can have only one interface, which is not enough if one wants to implement something more sophisticated. Here are instructions for designing a HID keyboard with additional multimedia capabilities (which I wanted so badly). Microsoft recommends that there should be at least two USB interfaces in such a device. One should implement an ordinary keyboard compatible with the BOOT protocol, so the keyboard is operational during system start-up, when there is no OS support yet, and the other one should implement the rest of the desired keys, such as play/pause, volume up, volume down and so on. I saw quite a few USB interface layouts conforming to these recommendations over the net, including my own keyboard connected to my computer as I write this, so I assume this is the right way to do it. And here is an example of a USB interface layout as mentioned earlier. HID reports are also provided.

So I moved to the lower level tiers, and it was not so simple. Here you can find all the code that is inside the button. All the magic is done in main.c, which could be split into several smaller files, but who cares. Firstly there are the USB descriptors. Standard and HID ones:

const tConfigSection *allConfigSections[] = {

Next you have the callbacks. My code is heavily based on the TI examples, but in some places it is simplified where no advanced functionality is needed. Custom requests are handled in onRequest, where you can find the bits responsible for sending and receiving the configuration from the host (using another program running on a PC, which is linked below). The configuration (i.e. what key combination should be sent to the host when the "any-key" is pressed) is stored in EEPROM (functions readEeprom and saveEeprom). And of course in the main function you can find the main loop with button polling and report sending.

After connecting the device to a Linux PC it introduces itself as a two-interface HID device, which is silently recognized by Linux (and not so silently by Windows, which searches for some drivers for it). What distinguishes this HID keyboard from others is that it recognizes two additional control requests from the host PC, which let the user store and retrieve the combination of keys this device sends when pressed. These requests are prepared in a PC application which looks like this:

Any key host app

Every button on the main screen can be toggled (in the picture above the "play/pause" one is turned on), which immediately sends the configuration data, which is then stored in EEPROM. After closing the host application (which releases the USB device back to the OS) the button works as programmed, in the situation depicted above behaving as a play/pause keyboard button. Play/pause was my initial intention and I am using it with this function right now, but a friend of mine used it in a presentation (as arrow down), and I also tested ctrl-alt-del, ctrl-shift-t (Eclipse CDT's open element), and power, among others. The maximum number of simultaneously pressed keys which can be simulated is 8 for control keys (like ctrl, shift, alt etc.) and 6 for regular ones.
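Coming back to the descriptors for a moment: to give an idea what the "multimedia" part looks like, here is a generic consumer-control report descriptor for eight one-bit media keys. It is a sketch based on the standard HID usage tables, not the exact bytes from my sources:

#include <cstdint>

// One-byte report, one bit per media key, exposed on the second (non-boot) interface.
const uint8_t consumerControlReportDescriptor[] = {
        0x05, 0x0C,        // Usage Page (Consumer)
        0x09, 0x01,        // Usage (Consumer Control)
        0xA1, 0x01,        // Collection (Application)
        0x15, 0x00,        //   Logical Minimum (0)
        0x25, 0x01,        //   Logical Maximum (1)
        0x75, 0x01,        //   Report Size (1)
        0x95, 0x08,        //   Report Count (8)
        0x09, 0xCD,        //   Usage (Play/Pause)
        0x09, 0xE9,        //   Usage (Volume Increment)
        0x09, 0xEA,        //   Usage (Volume Decrement)
        0x09, 0xB5,        //   Usage (Scan Next Track)
        0x09, 0xB6,        //   Usage (Scan Previous Track)
        0x09, 0xB7,        //   Usage (Stop)
        0x09, 0xE2,        //   Usage (Mute)
        0x09, 0x30,        //   Usage (Power)
        0x81, 0x02,        //   Input (Data, Variable, Absolute)
        0xC0               // End Collection
};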

Any key internals

So there you have it. Feel free to post questions etc. I am also wondering about a "mass production experiment" which would aim to make, say, 10 of those things (with a cheaper micro of course!) and sell them on Tindie (I have never sold anything made by myself yet). What do you think? Would you buy one of these? What would be a reasonable price for it (it is only one button after all… + the PC app)? I made some very rough calculations and the total cost of one device (assuming production of 100 pcs) would be somewhere around $10, when using an MSP430 as the µC and importing casings from China. Not to mention boxes to pack the stuff, soldering (probably in some kind of reflow oven) and shipping it all. So for now it seems overwhelming to me, but who knows.

And for something completely different : what happens when you connect a USB device with VBUS and GND shorted:

Jun  4 08:58:52 diora kernel: [  998.928731] hub 2-1:1.0: over-current condition on port 1
Jun  4 08:58:52 diora kernel: [  999.136827] hub 2-1:1.0: over-current condition on port 2
... and you can hear humming in headphones connected to the PC.

EDIT : User jancumps on the EEVBlog forums pointed out, that there is an ongoing indiegogo campaign for a similar idea. Looks quite the same as mine :D

EDIT2 : Dave did a review of the "serious button"; this is not my design, it only looks the same:

EDIT3 (09/2014) : Another one on indiegogo with goal of $100k!

Toolchain for Cortex-M4/M0/M7

Last edit : Dec 2016

EDIT : do not accidentally press ctrl-c. It exits menuconfig immediately.
Important links:

These are brief instructions for creating your own GCC-based toolchain for a Cortex-M4 microcontroller, heavily based on this post. I tried a few precompiled ones which I found on the Internet, but always wondered how to make one configured specifically for my micro, not for "ARM" in general. Toolchains generated by the following method were tested by me on an ST STM32F407 and a Texas Instruments TIVA-C TM4C123 (i.e. one toolchain for these two µCs, since they both include the same CPU). My setup as I write this:

  • Host operating system : Ubuntu 14.04 – 16.10
  • Kernel : 3.13.0-24-generic – 4.8.0-30-generic
  • Few GB of free space on HD.

Making a toolchain is hard, therefore wise people all over the net developed tools to simplify the process. A few years ago, when I attempted to build a GCC toolchain, I struggled with lack of information, the complexity of the process, and a variety of recipes which all seemed extremely complex, and at some point in the process I got stuck on a problem I couldn't solve. Then I found crosstool-NG. It may seem funny, but all this stuff was new to me, and I was looking for the best way possible to finish the task, some "canonical" way of building a cross-compiler, and for me, crosstool-NG is exactly this. Let's grab the newest version from its website and follow the installation instructions (this step will build only crosstool-NG itself; read the EDIT note below before doing this):

mkdir my-toolchain
cd my-toolchain
# Pay attention to which version is the newest. As of writing this, the newest
# was 1.19.0, but the "header-file" incorrectly indicated version 1.18.0.
tar jxvf crosstool-ng-1.19.0.tar.bz2 
cd crosstool-ng-1.19.0/
# Resolve some dependencies. EDIT Ubuntu 15.04 wants libtool-bin as well.
sudo apt-get install bison flex gperf texinfo gawk libtool libtool-bin automake libncurses5-dev libexpat-dev help2man
# Provide a prefix to some destination which PATH points to.
./configure --prefix=/home/iwasz/local/
make install

EDIT (2016-12) It seems that the newest official release available today is 1.22, which is more than a year old. The older ct-ng is, the older the GCC and libraries it provides, which may cause problems on newer systems. For example, making a cross-gcc version 5.3 using ct-ng version 1.22 on Ubuntu 16.10 (which uses GCC 6.2) resulted in a compilation error during the GCC stage. Thus I think the best option is to start with the development ct-ng from the GitHub repo. Instructions are here, but basically you only need to:

git clone
cd crosstool-ng
./configure --prefix=/home/iwasz/local/
make install

Now we perform some setup. All features of our future tool-chain will be set during this step:

# cd back, so we are in "my-toolchain" directory again.
cd ..
mkdir staging
cd staging
ct-ng  menuconfig

The last command brings the following menu-config tool:


Paths and misc options

  • Try features marked as EXPERIMENTAL : Y
  • Prefix : ${HOME}/local/share/${CT_TARGET}. Provide a destination folder that suits your needs; give it a descriptive name if you plan to host more than one cross-compiler.
  • Number of parallel jobs : 8 (depends on host capabilities of course).
  • EDIT minor : Uncheck "Render the toolchain read-only" (I find it annoying to have a read-only directory among my stuff; it's problematic to delete later, you have to chmod etc.).
  • Check “Debug Crosstool-NG”, “Save intermediate steps”, and “gzip saved states” as described here.


Target options

  • Target Architecture : arm
  • (cortexm4) Suffix to the arch-part (used to break the build!). EDIT : tried with this turned on in 1.21.0, and it works.
  • Use the MMU : N
  • Architecture level : armv7-m (EDIT: armv7e-m was probably added since gcc-4.9). As you can find here, the ARM architecture for the Cortex-M4 is ARMv7E-M. In the GCC manual (type /, then -march a few times) we can find that, among many others, the available values for -march are armv7, armv7-a, armv7-r, armv7-m. Unfortunately armv7e-m was invalid (if someone could elaborate on that, it would be perfect), so I chose the most similar armv7-m option. EDIT here : I've found that they added armv7e-m in recent versions of GCC.
    EDIT : armv6-m for the Cortex-M0. You can always check it in the "programming manual" of every STM32 part, in the chapter entitled something like "About the STM32 Cortex-M0 processor and core peripherals".
  • Emit assembly for CPU : cortex-m4 (full list of available options can be found in GCC manual somewhere near -mcpu phrase). Or here.
  • Tune for CPU : empty (empty because -mcpu was provided. -mtune is similar to -mcpu, but -mcpu restricts us to one CPU only, while -mtune tries to do its best to optimize for particular CPU while still retaining the possibility to compile for other CPUs).
  • Use specific FPU : fpv4-sp-d16. The Cortex-M4 can have an FPU, but not necessarily (with the FPU it is called a Cortex-M4F, without it a Cortex-M4). The fact is I found this option somewhere on the net, and I am a little bit confused on the topic of the FPU.
  • Floating point : hardware (FPU). EDIT : the M0 has no FPU.
  • Default instruction set mode (thumb).

03-target-options

Toolchain options

  • Add some cool Toolchain ID string.
  • Set “Tuple’s vendor string” to none.

04-toolchain-options

Operating System. Set Target OS to bare-metal:

05-operating-system
Binary utilities

  • Binary format: (Flat) use ELF
  • binutils version (2.22) – latest which is not marked as EXPERIMENTAL. EDIT just recently I went with 2.24 (EXPERIMENTAL), and everything seems to be OK.


C compiler

  • Show Linaro versions : Y
  • gcc version (linaro-4.8-2013.06-1)
  • C++ : Y

07-c-compiler

C-library

  • C library (newlib)
  • newlib version (2.0.0 (EXPERIMENTAL)) – the newest, and works OK.
  • Disable the syscalls supplied with newlib : Y – I provide my own syscalls in every program. BTW I had some problems when this option was checked (crt0 missing?).

08-c-library

Debug facilities

  • gdb : Y


Then dig into "GDB", check "Show Linaro versions", choose the newest Linaro one, and set "Enable python scripting" to N (it caused build problems for me). EDIT: QtCreator requires Python support in GDB:


Exit menu-config (few times ESC, and save when prompted) and finally build the toolchain:

ct-ng build
tail -f build.log # in another console (not necessary if debug options were set)

The build process takes some time (30-60 minutes), and if at some point, for some reason, the build fails, the first place to check is the build.log file in the staging directory (that is why I pasted the tail -f command earlier, but of course it does not matter how you display the file). For example, in my case, crosstool-NG decided to fail with this:

... kilobytes, megabytes of logs ....
[ALL  ]    checking whether to use python... yes
[ALL  ]    checking for python... /usr/bin/python
[ALL  ]    checking for python2.7... no
[ERROR]    configure: error: python is missing or unusable
[ERROR]    make[2]: *** [configure-gdb] Error 1
[ALL  ]    make[2]: Leaving directory `/home/iwasz/Documents/my-toolchain/staging/.build/arm-unknown-eabi/build/build-gdb-cross'
[ERROR]    make[1]: *** [all] Error 2
[ALL  ]    make[1]: Leaving directory `/home/iwasz/Documents/my-toolchain/staging/.build/arm-unknown-eabi/build/build-gdb-cross'
[ERROR]  >>
[ERROR]  >>  Build failed in step 'Installing cross-gdb'
[ERROR]  >>        called in step '(top-level)'
[ERROR]  >>
[ERROR]  >>  Error happened in: CT_DoExecLog[scripts/functions@257]
[ERROR]  >>        called from: do_debug_gdb_build[scripts/build/debug/]
[ERROR]  >>        called from: do_debug[scripts/build/]
[ERROR]  >>        called from: main[scripts/]
[ERROR]  >>
[ERROR]  >>  For more info on this error, look at the file: 'build.log'
[ERROR]  >>  There is a list of known issues, some with workarounds, in:
[ERROR]  >>      '/home/iwasz/local/share/doc/crosstool-ng/ct-ng.1.19.0/B - Known issues.txt'
[ERROR]  (elapsed: 58:52.70)

I didn’t thought long on this one (apt-get install libpython2.7-dev maybe???), but disabled the python support for GDB (I modified the instructions accordingly, so hopefully you haven’t had the same error). But in case you had, you should resolve the error (maybe change the configuration using menuconfig, or resolve the problem in other ways, depending on the cause) and rerun ct-ng, or refer to this stack-overflow thread for more info on speeding up the process after build has failed.

Edit Feb 2015 : I recently made a cross-compiler x86_64 -> i686 to be able to make 32-bit binaries on my 64-bit box. Statically linked binaries made with it crashed with the message:

FATAL: kernel too old

Following suggestions found here, I found that indeed, my binaries were built for a kernel newer than mine (output of the file command):

ELF 32-bit LSB  executable, Intel 80386, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.15.4, not stripped

But my uname -a is:

Linux xxx 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

The solution is to instruct crosstool-NG to compile glibc/eglibc with older kernel support. Invoke ct-ng menuconfig and:

  1. Go into “C-library”
  2. Go into “Minimum supported kernel version (Specific kernel version) —>”
  3. Check “(X) Specific kernel version”
  4. Press "ESC ESC" and make sure that "(2.6.9) Minimum kernel version to support" is set.

EDIT (mar 2016). For the second time I encountered an error like this:

[ALL  ]    /usr/bin/install: cannot stat ‘…/.build/src/newlib-linaro-2.2.0-2015.01/libgloss/arm/linux.specs’: No such file or directory

According to this, and especially this, the error is caused by a bug in newlib itself. User 'bhundven' suggested that the 'hf' suffix appended to 'eabi' (making it 'eabihf') is causing problems, so I turned off 'Target options —> append 'hf' to the tuple (EXPERIMENTAL)', and it helped.

STM32F407 DMA early tests

Research notes. Useful links:

DMA is a peripheral that can copy data between other peripherals and memory, or between memory and memory. In the early days it used to be implemented in the form of a separate IC, but in modern µCs it is of course integrated into the chip itself.

STM32 DMA peripherals are able to copy data from memory to a peripheral, from a peripheral to memory, and from memory to another place in memory (for example from RAM to FLASH, as a StdPeriph example shows). There are two DMA controllers, DMA1 and DMA2, and both have 8 streams. I see a stream as some kind of physical, bi-directional connection between the DMA controller and some other peripheral. Those 16 streams cover all (most?) peripherals, meaning that one stream is connected to more than one peripheral. For example, if one wants to send data using USART1, he has to use exactly DMA2_Stream7, and if he wants to receive data from SPI3_RX, he has to use DMA1_Stream0 or DMA1_Stream2, because apparently SPI3_RX is connected to both of those streams (see tables 43 and 44 in the reference manual of the STM32F407).

DMA works automatically, meaning that if there is some new data it will be copied without user code (and the CPU) being involved. This is possible thanks to channels, which I imagine as signals (like GUI signals, if you know what I mean) connected between the DMA controller and the peripheral (there is also something called an "arbiter" between them). A peripheral can send a request (signal) to the DMA if it has new data, and the DMA can then process it. At the same time the DMA acknowledges that it has got this new portion of data, and the peripheral releases the request. Each stream can "listen" on its 8 channels, so there are 2 controllers * 8 streams * 8 channels = 128 configuration combinations, and that way every peripheral can have its own communication path to the DMA.

Streams have configurable priorities in case two or more streams request the DMA controller's attention. If two or more streams have the same priority, the stream with the lower number wins. The bit of hardware called the "arbiter" manages those priorities and decides which stream goes first.

So here comes the first DMA test I wrote (tested on STM32F407-DISCOVERY). It writes to USART1:

#include <stm32f4xx.h>
#include "logf.h"

/*
 * For printf, and USART1 in general.
 */
void initUsart (void)
{
        RCC_APB2PeriphClockCmd (RCC_APB2Periph_USART1, ENABLE);

        GPIO_InitTypeDef gpioInitStruct;
        RCC_AHB1PeriphClockCmd (RCC_AHB1Periph_GPIOB, ENABLE);
        gpioInitStruct.GPIO_Pin = GPIO_Pin_6 | GPIO_Pin_7;
        gpioInitStruct.GPIO_Mode = GPIO_Mode_AF;
        gpioInitStruct.GPIO_Speed = GPIO_High_Speed;
        gpioInitStruct.GPIO_OType = GPIO_OType_PP;
        gpioInitStruct.GPIO_PuPd = GPIO_PuPd_UP;
        GPIO_Init (GPIOB, &gpioInitStruct);
        GPIO_PinAFConfig (GPIOB, GPIO_PinSource6, GPIO_AF_USART1); // TX
        GPIO_PinAFConfig (GPIOB, GPIO_PinSource7, GPIO_AF_USART1); // RX

        USART_InitTypeDef usartInitStruct;
        usartInitStruct.USART_BaudRate = 9600;
        usartInitStruct.USART_WordLength = USART_WordLength_8b;
        usartInitStruct.USART_StopBits = USART_StopBits_1;
        usartInitStruct.USART_Parity = USART_Parity_No;
        usartInitStruct.USART_Mode = USART_Mode_Rx | USART_Mode_Tx;
        usartInitStruct.USART_HardwareFlowControl = USART_HardwareFlowControl_None;
        USART_Init (USART1, &usartInitStruct);
        USART_Cmd (USART1, ENABLE);
}

uint8_t myStrlen (char const *s)
{
        uint8_t len = 0;
        while (*s++) {
                ++len;
        }
        return len;
}
/*
 * Test1
 */
void initDma (char const *outputBuffer)
{
        /*
         * Reset DMA Stream registers (for debug purposes). For the DMA2_Stream7 explanation read on.
         * It also disables the stream. The stream must be disabled prior to configuring it; otherwise it can
         * misbehave.
         */
        DMA_DeInit (DMA2_Stream7);

        /*
         * Check if the DMA Stream is disabled before enabling it.
         * Note that this step is useful when the same Stream is used multiple times:
         * enabled, then disabled then re-enabled... In this case, the DMA Stream disable
         * will be effective only at the end of the ongoing data transfer and it will
         * not be possible to re-configure it before making sure that the Enable bit
         * has been cleared by hardware. If the Stream is used only once, this step might
         * be bypassed.
         */
        while (DMA_GetCmdStatus (DMA2_Stream7) != DISABLE) {
        }

        /* Configure the DMA stream. */
        DMA_InitTypeDef dmaInitStructure;

        /*
         * Possible values for DMA_Channel are DMA_Channel_[0..7]. Refer to table 44 in the reference manual
         * mentioned earlier. USART1_RX communicates with the DMA via streams 2 and 5 (both on channel 4).
         * USART1_TX uses stream 7 / channel 4.
         */
        dmaInitStructure.DMA_Channel = DMA_Channel_4;

        /*
         * Possible values : DMA_DIR_PeripheralToMemory, DMA_DIR_MemoryToPeripheral,
         * DMA_DIR_MemoryToMemory.
         */
        dmaInitStructure.DMA_DIR = DMA_DIR_MemoryToPeripheral;

        /* Why is DMA_PeripheralBaseAddr of type uint32_t? Shouldn't it be void *? */
        dmaInitStructure.DMA_PeripheralBaseAddr = (uint32_t)&(USART1->DR);
        dmaInitStructure.DMA_Memory0BaseAddr = (uint32_t)outputBuffer;

        /*
         * The only valid values here are : DMA_PeripheralDataSize_Byte, DMA_PeripheralDataSize_HalfWord,
         * DMA_PeripheralDataSize_Word.
         */
        dmaInitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;

        /*
         * I guess that for memory it is always good to use DMA_MemoryDataSize_Word (32 bits), since this is
         * a 32 bit micro, but I haven't checked that. Here I use Byte instead for easier DMA_BufferSize
         * calculations.
         */
        dmaInitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;

        /*
         * Length of the data to be transferred by the DMA. The unit of this length is DMA_MemoryDataSize when
         * the direction is from memory to peripheral, or DMA_PeripheralDataSize otherwise. Since I set both
         * sizes to one byte, I simply put strlen here.
         */
        dmaInitStructure.DMA_BufferSize = myStrlen (outputBuffer);

        /*
         * DMA_PeripheralInc_Disable means to read or to write to the same location every time.
         * DMA_MemoryInc_Enable increases the memory or peripheral location after each read/write.
         */
        dmaInitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
        dmaInitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;

        /* DMA_Mode_Normal or DMA_Mode_Circular here. */
        dmaInitStructure.DMA_Mode = DMA_Mode_Normal;

        /* DMA_Priority_Low, DMA_Priority_Medium, DMA_Priority_High or DMA_Priority_VeryHigh */
        dmaInitStructure.DMA_Priority = DMA_Priority_VeryHigh;

        /* DMA_FIFOMode_Disable means direct mode, DMA_FIFOMode_Enable means FIFO mode. FIFO is good. */
        dmaInitStructure.DMA_FIFOMode = DMA_FIFOMode_Disable;

        /*
         * DMA_FIFOThreshold_1QuarterFull, DMA_FIFOThreshold_HalfFull, DMA_FIFOThreshold_3QuartersFull or
         * DMA_FIFOThreshold_Full.
         */
        dmaInitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_Full;

        /*
         * Specifies whether to use single or burst mode. If burst, then it specifies how many "beats"
         * to use. DMA_MemoryBurst_Single, DMA_MemoryBurst_INC4, DMA_MemoryBurst_INC8 or
         * DMA_MemoryBurst_INC16.
         */
        dmaInitStructure.DMA_MemoryBurst = DMA_MemoryBurst_Single;
        dmaInitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;

        /* Configure the DMA, but still leave it turned off. */
        DMA_Init (DMA2_Stream7, &dmaInitStructure);

        /* DMA_FlowCtrl_Memory, DMA_FlowCtrl_Peripheral */
        DMA_FlowControllerConfig (DMA2_Stream7, DMA_FlowCtrl_Memory);

        /* Enable DMA interrupts. */
        DMA_ITConfig (DMA2_Stream7, DMA_IT_TC | DMA_IT_HT | DMA_IT_TE | DMA_IT_DME | DMA_IT_FE, ENABLE);

        /* Enable the DMA Stream. */
        DMA_Cmd (DMA2_Stream7, ENABLE);

        /*
         * And check if the DMA Stream has been effectively enabled.
         * The DMA Stream Enable bit is cleared immediately by hardware if there is an
         * error in the configuration parameters and the transfer is not started (i.e. when
         * a wrong FIFO threshold is configured ...)
         */
        uint16_t timeout = 10000;
        while ((DMA_GetCmdStatus (DMA2_Stream7) != ENABLE) && (timeout-- > 0)) {
        }

        /* Check if a timeout condition occurred. */
        if (timeout == 0) {
                /* Manage the error: to simplify the code, enter an infinite loop. */
                while (1) {
                }
        }
}
int main (void)
{
        /* This would be a function parameter or something like that. */
        char *outputBufferA = "Ala ma kota, a kot ma ale, to jest taki wierszyk z czytanki dla dzieci, ktora jest tylko w Polsce.\r\n";
        char *outputBufferB = "Wlazl kotek na plotek i mruga. Ladna to piosenka nie dluga. Nie dluga, nie krotka lecz w sam raz.\r\n";

        /*
         * Enable the peripheral clock for DMA2. I want to use DMA with USART1, so according to
         * table 44 in the reference manual for the STM32F407 (RM0090) this is the DMA2 peripheral.
         * The description in stm32f4xx_dma.c advises to do this as the first operation.
         * Spent two fu.king nights on this. Docs say to use RCC_AHB1PeriphResetCmd, but use
         * RCC_AHB1PeriphClockCmd instead!!!
         */
//        RCC_AHB1PeriphResetCmd (RCC_AHB1Periph_DMA2, ENABLE);
        RCC_AHB1PeriphClockCmd (RCC_AHB1Periph_DMA2, ENABLE);

        /*
         * Enable the USART1 device as usual.
         */
        initUsart ();
        initDma (outputBufferA);

        /*
         * The DMA stream is turned on now and waits for DMA requests. As far as I know, if this
         * were a memory-to-memory transfer, it would start immediately without enabling any
         * channels. But for peripherals one has to enable the channel for requests. After the following
         * statement, you should see data on the serial console.
         * This statement enables the DMA internals in the USART (the stuff which communicates with the DMA
         * controller). (The call was missing from the listing; reconstructed from this description.)
         */
        USART_DMACmd (USART1, USART_DMAReq_Tx, ENABLE);

        /* Waiting for the end of the data transfer. */
        while (USART_GetFlagStatus (USART1, USART_FLAG_TC) == RESET) {
        }
        while (DMA_GetFlagStatus (DMA2_Stream7, DMA_FLAG_TCIF7) == RESET) {
        }

        logf ("It worked, and didn't hang\r\n");

        /* Clear the DMA Transfer Complete flag. */
        DMA_ClearFlag (DMA2_Stream7, DMA_FLAG_TCIF7);

        /* The DMA has to be initialized once again, AFAIK, to send another portion of data. */
        initDma (outputBufferB);

        /* Try to start it again (again a reconstructed call, see the comment above). */
        USART_DMACmd (USART1, USART_DMAReq_Tx, ENABLE);

        /* Waiting for the end of the data transfer. */
        while (USART_GetFlagStatus (USART1, USART_FLAG_TC) == RESET) {
        }

        logf ("It worked again\r\n");

        /* Infinite loop. */
        while (1) {
        }
}