The Future of Developer Tools for IoT, ThingMonk 2017

ThingMonk is an annual London conference that brings together the people building and shaping the Internet of Things. This year I spoke at the conference on ‘The Future of Developer Tools for IoT’. This talk looks at emerging and future trends in the developer tools space. Check out the slides and feedback from the audience, as well as reference links at the end. Plus thanks to Marcel Bruch & Codetrails for input on AI tools. Be sure to share your thoughts on how you see developer tools shaping up to scale for building the Internet of Things.


GDB’s MI is not a Debug Protocol

While looking to the future of debugger tooling, it is still important to consider the prior art and the solutions that have stood the test of time. For embedded development, gdb is high on that list, so it is worth considering if gdb’s interface could be the basis of a debug protocol.

If you’ve used gdb to debug C/C++ code then you are probably aware of MI, the machine interface layer used to communicate between the debugger backend and the IDE front end. MI is not only used by gdb but has also been adopted by lldb (the de facto debugger for Swift) and more recently by clrdbg (.NET Core). MI defines a rich set of functionality, from standard debug run control and breakpoints up to advanced features for multi-process debug, reverse debugging and dynamic printf. With MI being pretty pervasive and supporting such rich functionality, it is tempting to think it might make the basis of a good debug protocol. However, in practice it lacks some of the qualities of a good protocol:

1. A Specification

We once had the opportunity to work on a project where the brief was to integrate into Eclipse IDE/CDT a custom debugger that ‘implemented the MI spec’. We can tell you we learnt the hard way that MI has plenty of useful documentation but no spec to speak of. This matters when you get into the nitty-gritty of implementation details. For example: what syntax should be used to notify the client when a bad condition has been set on a breakpoint?

The documentation does not necessarily reflect what the code does. Some commands or command variants are inconsistent with the source code, or the documentation doesn’t reflect platform-dependent issues. For example, -exec-step-instruction in practice takes an argument (e.g. -exec-step-instruction 1) even though this is not documented.

The main message here is that documentation, even good documentation as in the case of gdb, is not the same as a protocol specification, so one can’t blindly implement to the docs (and if you think it’s just a case of looking at the code… well, which version? See #4 below).

2. Clean Interfaces with no Idiosyncrasies

This piece of code from Visual Studio’s MIEngine demonstrates how rife MI is with idiosyncrasies. The code launches a debugger that will communicate over MI, i.e. gdb, lldb or clrdbg. There are special cases for each tool that an IDE just shouldn’t need to know about:

  • Different ways of specifying a working directory depending on the tool
  • Environment variables are set differently: before launch for gdb/lldb, after launch for clrdbg
  • Details of which Operating System the debugger is being run on

[Image: miengine]

And this is even before you launch MI. In Eclipse CDT, just after launching MI, the IDE has to know about and issue commands for all sorts of things, e.g. ‘set print sevenbit-strings on’. C’mon, really, seriously? Tom sums it up nicely:

It is an oddity that currently an MI consumer must check gdb’s host charset in order to know how to decode its output.

Once you get into actual debugging there’s a fair amount of ‘need-to-know’ for special cases & exceptions. A protocol needs to steer clear of implementation details, but in the case of MI these have all too often leaked in.

3. Fit for Purpose

As MI was not specifically designed to be a protocol, it is not surprising that some of its behaviour makes it unfit to be one. For example:

  • If your program prints to stdout, then that can corrupt the output stream of MI, breaking the instructions.
  • In some cases GDB responds twice to a single command. Eclipse CDT, for example, has a special MIAsyncErrorProcessor class just to manage such cases.
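To make the first point concrete, here is a minimal sketch of how a client might classify lines arriving on the debugger's output stream. The record prefixes (^, *, +, =, ~, @, & and the (gdb) prompt) come from the MI output syntax; the function name, the labels and the sample session are made up for illustration.

```python
import re

# Record prefixes from the GDB/MI output syntax; the labels are our own
# names for this sketch.
MI_PREFIXES = {
    "^": "result",
    "*": "exec-async",
    "+": "status-async",
    "=": "notify-async",
    "~": "console-stream",
    "@": "target-stream",
    "&": "log-stream",
}

# An optional numeric token may precede the record prefix.
_RECORD = re.compile(r"(\d*)([\^*+=~@&])")


def classify_mi_line(line):
    """Return a label for one line read from the debugger's stdout."""
    stripped = line.strip()
    if stripped == "(gdb)":
        return "prompt"
    match = _RECORD.match(stripped)
    if match:
        return MI_PREFIXES[match.group(2)]
    # Not an MI record, e.g. text printed by the program being debugged
    # when it shares the same stream.
    return "unknown"


# Invented sample session: the fourth line is program output, not MI.
session = [
    '=thread-group-added,id="i1"',
    "(gdb) ",
    "1^running",
    "Hello from the inferior!",
    '*stopped,reason="exited-normally"',
]
labels = [classify_mi_line(line) for line in session]
print(labels)
```

The unknown line is the benign case. If the program happens to print something like `*** starting ***`, this classifier would report it as an exec-async record, which is exactly the kind of stream corruption described above.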

4. Versioning

A good protocol has defined versions that clients and servers can adapt to. With each new version of GDB, MI has subtle differences that make client implementation long-winded and difficult to maintain. For example, in Eclipse CDT’s gdb debugger implementation (DSF), separate classes are created to manage the differences in MI between versions of gdb. There are 5 different breakpoint classes, 7 different run control classes, etc. And this is just gdb versions, let alone lldb or clrdbg – imagine trying to implement wide-scale support for all those in a new IDE!

[Image: debug_versions]
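The pattern behind those per-version classes can be sketched roughly as follows. This is not CDT's code: the handler names and version thresholds are hypothetical, and the sketch only shows the general idea of picking an implementation from the debugger's reported version.

```python
import re

# Hypothetical handler names and version thresholds, for illustration only.
# Entries must be listed in ascending version order.
RUN_CONTROL_BY_MIN_VERSION = [
    ((7, 0), "RunControl_7_0"),
    ((7, 4), "RunControl_7_4"),
    ((7, 10), "RunControl_7_10"),
]


def parse_gdb_version(banner):
    """Extract (major, minor) from a banner such as 'GNU gdb (GDB) 7.12.1'."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no version number found in: " + banner)
    return int(match.group(1)), int(match.group(2))


def select_run_control(version):
    """Pick the newest handler whose minimum version does not exceed `version`."""
    chosen = None
    for minimum, handler in RUN_CONTROL_BY_MIN_VERSION:
        if version >= minimum:
            chosen = handler
    if chosen is None:
        raise ValueError("unsupported gdb version: %d.%d" % version)
    return chosen


version = parse_gdb_version("GNU gdb (GDB) 7.12.1")
print(version, select_run_control(version))
```

Every service (breakpoints, run control and so on) needs a table like this, and every new debugger release means revisiting all of them. A versioned protocol would let the client negotiate capabilities once instead.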

Conclusion

While feature-rich and ubiquitous, gdb’s MI is a reasonable syntax, but not a good debug protocol.  A good protocol needs much more than that – clean interfaces, fit for purpose, a spec & versioning – if it is really going to make common debugger implementations easier.

What can Eclipse developers learn from Team Sky’s aggregation of marginal gains?

The concept of marginal gains, made famous by Team Sky, has revolutionized some sports. The principle is that if you make 1% improvements in a number of areas, in the long run the cumulative gains will be hugely significant. And in that vein, a 1% decline here-and-there will lead to significant problems further down the line.

So how could we apply that principle to the user experience (UX) of Eclipse C/C++ Development (CDT) tools? What would happen if we continuously improved lots of small things in Eclipse CDT? Such as the build console speed? Or a really annoying message in the debugger source window? It is still too soon to analyse the impact of these changes but we believe even the smallest positive change will be worth it. Plus it is a great way to get new folks involved with the project. Here’s a guest post from Pierre Sachot, a computer science student at IUT Blagnac who is currently doing open-source work experience with Kichwa Coders. Pierre has written an experience report on fixing his very first CDT UX issue.

Context

This week I worked with Yannick on fixing the CDT CSourceNotFoundEditor problem – the unwanted error message that Eclipse CDT shows when users are running the debugger and step into a function that lives in another project file. When Eclipse CDT users ran the debugger on a C project, a window would open on screen. This window was both alarming in appearance and obtrusive. In addition, the message itself was unclear. For example, it could display “No source available for 0x02547”, which is irrelevant to users because they do not have access to this memory address. Several users had complained about it and expressed a desire to disable the window (see: stack overflow: “Eclipse often opens editors for hex numbers (addresses?) then fails to load anything”). In this post I will show you how we replaced CSourceNotFoundEditor with a better user experience display.


Getting Started With Gerrit on Eclipse CDT

This is a guest post from Yannick Mayeur, a computer science student at IUT Blagnac who is currently doing open-source work experience with Kichwa Coders. It was originally one of his weekly write-ups which can be found here.

You are probably familiar with the pull request system of GitHub that programmers use to contribute to an open-source project. Gerrit (named after the Dutch designer Gerrit Rietveld) is basically an improved version of this system. Gerrit allows the committer to give more precise feedback on each line of code edited, and allows other members of the team to review those changes. Gerrit is used by the Eclipse CDT community. In this blog post I will show you how to efficiently get started with it.

The required tools & knowledge

Having Git is basically all you need to clone the sources and push them. If you want to edit them in a good environment, use the Eclipse Java IDE. Knowing the basics of Git is also required, though I think you could pick up Git as you go along with a bit of trial and error.

How to get the sources of CDT

Cloning the sources to your computer is an easy but essential task.

The repository URL is: git://git.eclipse.org/gitroot/cdt/org.eclipse.cdt

To clone use the following command:

git clone git://git.eclipse.org/gitroot/cdt/org.eclipse.cdt

Once you have the files, go to Bugzilla and find a bug you want to fix.

Pushing the changes to Gerrit

Now comes the tricky part. In order for you to be able to push your change, a few requirements have to be met.

Pushing the Limits of Xtext for C/C++ Linker Scripts


Xtext is the popular Eclipse language development framework for domain specific languages. Its sweet spot is JVM-languages and it is excellent for languages where you can define the grammar yourself. But how well can Xtext cope with a non-JVM language that has undergone decades of evolution?

In our case, we want to see if we can take advantage of Xtext to create an editor for C/C++ linker scripts in CDT. Linker scripts are used to specify the memory sections, layouts and how code relates to these sections. Linker scripts consist of the ld command language, and this is what a simple typical script might look like:

MEMORY {
 RAM : ORIGIN = 0x0, LENGTH = 0x2000
 ROM : ORIGIN = 0x80000, LENGTH = 0x10000
}

SECTIONS {
 .text : { *(.text) *(.text.*) } > ROM
 .rodata : { *(.rodata) *(.rodata.*) } > ROM
 .data : { *(.data) *(.data.*) } > RAM
 .bss : { _bss = .; *(.bss) *(.bss.*) *(COMMON) _ebss = .; _end = .; } > RAM
}

Alternatives to Xtext

Besides using Xtext, it’s worth considering some of the other options available for this task:

  • Roll-your-own – the existing C/C++ Editor in CDT does this. It gives full control, the best error recovery and supports bidirectionality (recreating source from the abstract syntax tree, or AST), but it is a last resort as it would be an incredible amount of work that would take a long time to get right.
  • ANTLR – write your own ANTLR grammar. But since ANTLR is already used in Xtext, we may as well use Xtext and get the benefits of Eclipse editor integration.
  • Reuse linker’s bison grammar – would give perfect parsing, but it is a no-go because i) it’s GPL, ii) it generates C code, not Java, and iii) the requirements for editing are much more demanding than for linking; for example, it would not support bidirectionality (i.e. you can’t recreate the linker file from the AST).

Benefits of Xtext

The Xtext framework additionally provides these nice features we are interested in:

  • Parsing, lexing & AST generation
    • serialisation support is particularly important to support bidirectionality and preserve users’ comments, whitespace etc.
  • Rich Editor Features
    • syntax highlighting
    • content assist
    • validation & error markers
    • code folding & bracket matching
  • Integrated Outline editor
  • Ecore model generation which can be used for integration with UI frameworks such as EMF Forms, Sirius, etc.

Linker Script Parsing Challenges

When we talk about the ld command language being a non-JVM language, here are some specific challenges related to what that means.

  1. Crazy Identifiers! The following are valid identifiers in linker scripts:
    • .text
    • *
    • hello*.o
    • "spaces are ok, just quote the identifier"
    • this+is-another*crazy[example]
  2. Identifier or Number? Things that appear to be identifiers may actually be numbers:
    • a123 – identifier
    • a123x – number
    • 123y – identifier
    • 123h – number
  3. Identifier or Expression?

In the grammar, 2+3, for example, can be either an identifier or an expression depending on context:

SECTIONS {
 .out_name : {
  file*.o(.text.*)
  2+3(*)
  symbol = 2+3;
 }
}

The first 2+3 is a filename, so almost anything that can be a filename is allowed there. The second 2+3 is an expression to be assigned to symbol.
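The identifier-or-number examples listed earlier can be captured in a small classifier. This is a simplified sketch, not the ld lexer: it assumes only the rules implied by those four examples (hex digits followed by an h or x suffix form a number), plus plain decimal and 0x-prefixed constants as used in the MEMORY block. The function name is invented.

```python
import re

# Simplified rules inferred from the examples in the text; the real ld
# lexer accepts further forms that are not modelled here.
_PLAIN_DECIMAL = re.compile(r"[0-9]+")
_PREFIXED_HEX = re.compile(r"0[xX][0-9a-fA-F]+")
_SUFFIXED_HEX = re.compile(r"[0-9a-fA-F]+[hHxX]")


def classify_token(token):
    """Label a token as 'number' or 'identifier' under the simplified rules."""
    for pattern in (_PLAIN_DECIMAL, _PREFIXED_HEX, _SUFFIXED_HEX):
        if pattern.fullmatch(token):
            return "number"
    return "identifier"


for token in ["a123", "a123x", "123y", "123h", "0x2000", ".text"]:
    print(token, "->", classify_token(token))
```

Note how a123 and a123x differ only by the trailing suffix, which is why a lexer rule of the usual "identifiers start with a letter" kind does not work for this language.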

Resolutions

Here’s what we did to support the linker language as far as we could:

  1. Custom Xtext grammar – as extending the XType grammar does not make sense, the main job is to craft the grammar to understand all the specifics of linker script identifiers and expressions. This involves iterating as we add support for more and more language features, and it is still a work in progress.
  2. Limited Identifier Support – in some cases we opted to not support certain identifiers unless they are escaped (double-quoted). While linker scripts theoretically support such identifiers (e.g. 1234abcd) we have not found a single case yet of an identifier that would actually need escaping. If one did crop up, the user could adjust it to work with the editor (e.g. “1234abcd”).
  3. Context-Based Lexing – knowing the difference between an identifier and an expression would require context-based lexing rules. However, this will not work with the ANTLR lexer. We have the option to replace it with a custom or external lexer, which can be considered in the future if desirable.
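To illustrate what context-based lexing means for the 2+3 example, here is a toy hand-written lexer with two modes. It is only a sketch of the idea and has nothing to do with Xtext's or ANTLR's APIs; the mode names, token labels and the rule "an = starts an expression, a ; ends it" are simplifying assumptions.

```python
import re

# Token patterns per lexer mode. Mode names and token labels are invented
# for this sketch.
FILE_MODE = re.compile(r"\s*(?:(?P<NAME>[^\s(){};=]+)|(?P<PUNCT>[(){};=]))")
EXPR_MODE = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_.][\w.]*)"
    r"|(?P<OP>[-+*/])|(?P<PUNCT>[(){};=]))"
)


def lex(text):
    """Tokenise text, switching rules depending on the current context."""
    tokens = []
    pattern = FILE_MODE
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError("cannot lex at offset %d" % pos)
        kind = match.lastgroup
        value = match.group(kind)
        tokens.append((kind, value))
        pos = match.end()
        # Simplifying assumption: '=' starts an expression, ';' ends it.
        if value == "=":
            pattern = EXPR_MODE
        elif value == ";":
            pattern = FILE_MODE
    return tokens


print(lex("2+3(*)"))
print(lex("symbol = 2+3;"))
```

In file-name mode 2+3 comes out as a single NAME token, while after the = sign the same characters are split into NUMBER, OP and NUMBER. A context-free lexer has to pick one of those readings up front, which is the limitation described above.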

Conclusion

Xtext is a great language development framework. While Xtext may not be able to support every theoretical case of the long-lived linker script command language, it can be used to provide a very high level of support for the common features. Support for context based lexing in the future would enable a higher level of language support. Xtext can be used to provide a rich language editor with syntax colouring, command completion, integrated outline view & more in a relatively short space of time. A powerful linker script editor is another great feature for C/C++ developers that use CDT, the reference C/C++ IDE in the industry.