GDB’s MI is not a Debug Protocol

While looking to the future of debugger tooling, it is still important to consider the prior art and the solutions that have stood the test of time. For embedded development, gdb is high on that list, so it is worth considering if gdb’s interface could be the basis of a debug protocol.

If you’ve used gdb to debug C/C++ code then you are probably aware of MI, the machine interface layer used to communicate between the debugger backend and the IDE front end. MI is not only used by gdb but has also been adopted by lldb (the de facto debugger for Swift) and, more recently, by clrdbg (.NET Core). MI defines a rich set of functionality, from standard debug run control and breakpoints up to advanced features for multi-process debug, reverse debugging and dynamic printf. With MI being pretty pervasive and supporting such rich functionality, it is tempting to think it might make the basis of a good debug protocol. However, in practice it lacks some of the qualities of a good protocol:

1. A Specification

We once had the opportunity to work on a project where the brief was to integrate into Eclipse IDE/CDT a custom debugger that ‘implemented the MI spec’. We can tell you we learnt the hard way that MI has plenty of useful documentation but no spec to speak of. This matters when you get into the nitty-gritty of implementation details. For example: what syntax should be used to notify when a bad condition has been created on a breakpoint?

The documentation does not necessarily reflect what the code does: some commands or command variants are inconsistent with the source code or don’t reflect platform-dependent issues. For example, -exec-step-instruction in practice takes an argument (e.g. -exec-step-instruction 1) even though this is not documented.

The main message here is that documentation, even good documentation as in the case of gdb, is not the same as a protocol specification, so one can’t blindly implement to the docs (and if you think it’s just a case of looking at the code… well, which version? See #4 below).

2. Clean Interfaces with no Idiosyncrasies

The launch code in Visual Studio’s MIEngine demonstrates how rife MI is with idiosyncrasies. It launches a debugger that will communicate over MI, i.e. gdb, lldb or clrdbg. There are special cases for each tool that an IDE just shouldn’t need to know about:

  • Different ways of specifying a working directory depending on the tool
  • Environment variables are set differently: before launch for gdb/lldb, after launch for clrdbg
  • Details of which Operating System the debugger is being run on


And this is even before you launch MI. In Eclipse CDT, just after launching MI, the IDE has to know about and issue commands for all sorts of things, e.g. ‘set print sevenbit-strings on’. C’mon, really, seriously? Tom sums it up nicely:

It is an oddity that currently an MI consumer must check gdb’s host charset in order to know how to decode its output.

Once you get into actual debugging there’s a fair amount of ‘need-to-know’ for special cases and exceptions. A protocol needs to steer clear of implementation details, but in the case of MI these have all too often leaked in.

3. Fit for Purpose

As MI was not specifically designed to be a protocol, it is not surprising that a few of its behaviours make it unfit to be one. For example:

  • If your program prints to stdout, that output can corrupt the MI output stream and break the client’s handling of it.
  • In some cases GDB responds twice to a single command. Eclipse CDT, for example, has a special MIAsyncErrorProcessor class just to manage such cases.
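
To make the stdout problem concrete, here is a minimal Java sketch of how a client might sort MI output lines by their leading character (‘^’ for result records, ‘*’, ‘+’ and ‘=’ for async records, ‘~’, ‘@’ and ‘&’ for stream records, as described in the gdb documentation). The class and method names are our own invention for illustration, not code from CDT or MIEngine, and the parsing is deliberately simplified.

```java
/** Sketch only: classify lines from a gdb/MI output stream by their prefix. */
final class MiLineClassifier {

    enum Kind {
        RESULT, EXEC_ASYNC, STATUS_ASYNC, NOTIFY_ASYNC,
        CONSOLE_STREAM, TARGET_STREAM, LOG_STREAM, PROMPT, UNKNOWN
    }

    static Kind classify(String line) {
        String s = line.trim();
        if (s.equals("(gdb)")) {
            return Kind.PROMPT;
        }
        // Result and async records may start with a numeric token, e.g. "12^done".
        int i = 0;
        while (i < s.length() && Character.isDigit(s.charAt(i))) {
            i++;
        }
        if (i == s.length()) {
            return Kind.UNKNOWN; // empty line, or digits only
        }
        switch (s.charAt(i)) {
            case '^': return Kind.RESULT;
            case '*': return Kind.EXEC_ASYNC;
            case '+': return Kind.STATUS_ASYNC;
            case '=': return Kind.NOTIFY_ASYNC;
            // Stream records carry no token, so only accept them at position 0.
            case '~': return i == 0 ? Kind.CONSOLE_STREAM : Kind.UNKNOWN;
            case '@': return i == 0 ? Kind.TARGET_STREAM : Kind.UNKNOWN;
            case '&': return i == 0 ? Kind.LOG_STREAM : Kind.UNKNOWN;
            default:  return Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        String[] lines = {
            "=thread-group-added,id=\"i1\"",
            "12^done",
            "*stopped,reason=\"breakpoint-hit\"",
            "~\"Hello from gdb\\n\"",
            "Hello from the program being debugged", // raw stdout, no prefix
            "^done",                                  // raw stdout that looks like MI
            "(gdb)"
        };
        for (String line : lines) {
            System.out.println(classify(line) + "  " + line);
        }
    }
}
```

A line the debuggee prints itself carries no prefix, so the best a client can do is guess. Worse, if the program happens to print something like ^done, the client cannot tell it apart from a real result record.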

4. Versioning

A good protocol has defined versions that clients and subscribers can adapt to. With each new version of GDB, MI has subtle differences that make client implementation long-winded and difficult to maintain. For example, in Eclipse CDT’s gdb debugger implementation (DSF), separate classes are created to manage differences in MI between versions of gdb. There are 5 different breakpoint classes, 7 different run control classes, etc. And this is just gdb versions, let alone lldb or clrdbg. Imagine trying to implement wide-scale support for all those in a new IDE!
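
The kind of bookkeeping this forces on a client can be sketched in a few lines of Java. Everything here is hypothetical: the version thresholds and the service names are made up for illustration and are not taken from CDT. The point is only that, without protocol versions, the client ends up dispatching on the debugger’s own release number.

```java
/** Sketch only: pick a version-specific implementation from a gdb version string. */
final class GdbVersionDispatch {

    /** Compares dotted version strings numerically; returns <0, 0 or >0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing components count as 0, so "7.7" equals "7.7.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Hypothetical thresholds and names; newest supported variant wins. */
    static String breakpointServiceFor(String gdbVersion) {
        if (compareVersions(gdbVersion, "7.7") >= 0) {
            return "Breakpoints_7_7";
        }
        if (compareVersions(gdbVersion, "7.4") >= 0) {
            return "Breakpoints_7_4";
        }
        if (compareVersions(gdbVersion, "7.2") >= 0) {
            return "Breakpoints_7_2";
        }
        return "Breakpoints_Base";
    }

    public static void main(String[] args) {
        for (String v : new String[] { "6.8", "7.3", "7.10", "7.12.1" }) {
            System.out.println(v + " -> " + breakpointServiceFor(v));
        }
    }
}
```

Note that the comparison has to be numeric per component: a plain string comparison would sort 7.10 before 7.7. Multiply this by every service (breakpoints, run control, and so on) and every MI-speaking debugger, and the maintenance cost becomes clear.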


Conclusion

While feature-rich and ubiquitous, gdb’s MI is a reasonable syntax, but not a good debug protocol. A good protocol needs much more than that – clean interfaces, fitness for purpose, a spec and versioning – if it is really going to make common debugger implementations easier.

What about a Common Debug Protocol?

‘From then on, when anything went wrong with a computer, we said it had bugs in it.’ – Grace Hopper

As developers, we all know languages and frameworks are emerging and changing at breakneck speed. And the tools just can’t keep up. On top of that, there’s the move to tools in the cloud, which promise the ultimate in developer convenience. While this move is inevitable, the current tools still have some way to go in terms of the functionality offered.

As a result, momentum is building around solutions that work for multi-language support in multiple environments. The language server protocol (LSP) has emerged as the chosen way for various IDEs and editors to keep pace with all the different language changes. For new cloud IDEs like Eclipse Che, it is a vital part of the roadmap, providing an effective way to deal with the sheer scale of the problem. The LSP solves this problem by having a server for each language, with a common protocol that all front ends can use to communicate with it. One of the key ideas of the LSP is that the IDE knows as little as it can, delegating down to the language server to do the specifics.

Even for established IDEs such as Eclipse, LSP has a lot to offer. When Eclipse came on the scene more than 15 years ago it was a massive step forward when it came to sharing common UI parts between multiple languages. But that framework is far too cumbersome for today’s rate of change. The LSP takes things to the next level and makes the split between UI and backend even more absolute. Lots of progress has been made on the LSP4E & LSP4J projects.

LSP is not just for Eclipse projects. Originating from Microsoft, it also forms the basis for language support in VS Code. Just last month there were 27 protocol implementations, today there are 35 and counting. Truly impressive growth.

With language support taken care of, it’s time to ask the question about debugging. Again, when Eclipse first emerged it offered a state-of-the-art debugger framework: a common debug interface that separated UI and backend and allowed for implementations in multiple languages (Java, C/C++, Python, etc.). But with little investment in recent years, it is difficult to see this as being up to the job of being a general framework for the next generation of tools.

Similarly to LSP, Microsoft provides a Debug Protocol with a number of adapters written, many already being used in VS Code. So far there has not been a rush to adopt it in the same way as LSP, but is this the next logical step for tool developers? Especially for nascent platforms such as Eclipse Che, where debug support is at minimum viable demo levels and will likely rise up the priority list soon.

Yet the whole area of debugging is pretty substantial. In order to provide rich debug features, tools need to look at supporting a whole range of functionality, for example:

  • Launching
  • Processes, threads, etc
  • Stack Traces
  • Run Control (step, continue, run to, etc)
  • Breakpoints, watchpoints
  • Variables
  • Source Code Lookup
  • I/O, Console support
  • Expressions, etc

So all of those features. For every language we use please. In every IDE we might want to use, thanks. Oh, and don’t forget to make it all work asynchronously.
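
To give a feel for what one of those features looks like on the wire, here is a small Java sketch of framing a single request in the style of Microsoft’s Debug Protocol: a JSON body with seq, type and command fields, preceded by a Content-Length header. That shape reflects the published protocol as we understand it; the helper names are ours, and the JSON is built by hand with no escaping, so treat it as an illustration rather than a client.

```java
import java.nio.charset.StandardCharsets;

/** Sketch only: frame a debug-protocol style request with a Content-Length header. */
final class DebugProtocolFraming {

    /** Builds a minimal request body. No escaping: command must be a plain identifier. */
    static String requestBody(int seq, String command) {
        return "{\"seq\":" + seq + ",\"type\":\"request\",\"command\":\"" + command + "\"}";
    }

    /** Prefixes the body with its length in bytes (not characters) and a blank line. */
    static String frame(String json) {
        int length = json.getBytes(StandardCharsets.UTF_8).length;
        return "Content-Length: " + length + "\r\n\r\n" + json;
    }

    public static void main(String[] args) {
        // "threads" asks the debug adapter for the list of threads.
        String message = frame(requestBody(1, "threads"));
        System.out.println(message);
    }
}
```

The same envelope carries launching, breakpoints, stack traces and the rest, which is exactly what lets a front end stay ignorant of the debugger behind the adapter.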

But hey, do developers still even use richly-featured debuggers? Is it even feasible to have a single protocol or framework cover these for a wide range of languages? The answers to all these may not be so clear at the moment. However, as long as there are still bugs to be found in software, we will still need good-quality debuggers and a solution to make them quickly available for multiple languages in multiple tools.

Getting Started with Jekyll (on Windows!)

 This week Kichwa Coders’ intern Jean Philippe found out the hard way that when it comes to building websites, having the right tools for the job is vital to success. Follow his progress as he explores the potential of using Jekyll to build a user-friendly, easy to maintain static website on Windows.

What is Jekyll and why do we use it?

Jekyll uses Markdown – a text-to-HTML conversion tool – to create a blog-aware static website that doesn’t require a huge amount of maintenance. Once you have created the structure you just have to add your own Markdown file and Jekyll will add it to the website. The appeal of Jekyll for many users is that it allows content editors to edit the site without knowing how to code. After some rudimentary experience I can now create a basic Jekyll website.

How easy is it to get started with Jekyll?

This week I built my first website using Jekyll. I had some initial difficulties understanding how to use it, but once I’d got the basics I was able to come up with ideas on how to get the best out of it. Before you can install Jekyll you need to install Ruby and Bundler. I’m on Windows, so at first it was hard to install Jekyll, as it is more suited to Linux: Linux users are more familiar with the command line, and it’s easier to install Ruby and Bundler on Linux. Then I found a website that helped. However, when I attempted to build a new project with the command “jekyll new newproject” I got this: [screenshot of the command output]

This wasn’t what I was expecting.

Untested Code is like Schrödinger’s Cat – Dead or Alive?


If every line of untested code is like Schrödinger’s cat – potentially dead or alive – how important is it to ‘open the box’ properly and know for sure if the code will leap out and run?

The received wisdom that if a piece of code hasn’t been tested you can assume it won’t work is proof – if any were needed – that coders will always expect the worst-case scenario when creating code. Unlike Schrödinger, a coder will not waste time mulling over the metaphysical possibilities of whether their code might be dead or alive or even dead AND alive at the same time – they need certainty, and as quickly as possible. However, any amount of testing will only be worthwhile if the quality of that testing is high. In this blog Yannick Mayeur, a Kichwa Coders intern, describes how he kept his fur on whilst improving the testing coverage of Eclipse January.

An introduction to JUnit

This week I was reintroduced to JUnit, having forgotten most of what I had learned about it at the University Institute of Technology back home. JUnit is a unit testing framework. It is used to test the different methods of a program to see whether or not the intended behaviour is working. It is often said that a method that is not tested is a method full of bugs, and after a week of testing I can confirm that this saying is indeed grounded in truth.

My job this week was to improve the test coverage of Eclipse January. You can calculate the coverage of a program using the EclEmma plug-in. I worked on the DatasetUtils class, improving the coverage from 47% to almost 58%, and fixing bugs along the way in two pull requests (https://github.com/eclipse/january/pull/178 and https://github.com/eclipse/january/pull/188).

Seeing that bugs can exist in untested code written by people that know a lot more about what they are doing than I do, really showed me the importance of testing.

How I did it

This is a test I have written for the method “crossings”. Writing this test helped me highlight some unexpected behaviour in the way it works.

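@Test
public void testCrossings3() {
	// y crosses the value 1 three times between x = 1 and x = 4
	Dataset yAxis = DatasetFactory.createFromObject(new Double[] {
			0.5, 1.1, 0.9, 1.5 });
	Dataset xAxis = DatasetFactory.createFromObject(new Double[] {
			1.0, 2.0, 3.0, 4.0 });
	// the three crossings are expected to be merged into one at x = 2.5
	List<Double> expected = new ArrayList<Double>();
	expected.add(2.5);
	// look for crossings of y = 1, merging nearby ones (proportion 0.5)
	List<Double> actual = DatasetUtils.crossings(xAxis, yAxis, 1,
			0.5);
	assertEquals(expected, actual);
}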

This shows what the values look like: [figure: plot of the test values]

The expected behaviour of the method, as written in the test, is that the 3 crossing points are merged into one at 2.5. But this wasn’t what was happening: the code was using “>” instead of “>=”. Left untested, this bug would probably never have been discovered.
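
I can’t reproduce the January source here, so the following is a self-contained sketch of how an off-by-one comparison of this kind can stop exactly three crossings from being merged. The loop condition, the method names and the merge window (proportion times the x range) are all assumptions made for illustration; the real fix is in the pull requests linked above.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a strict versus inclusive loop bound when merging crossings. */
final class CrossingsSketch {

    /** Linearly interpolated x positions where y crosses yValue. */
    static List<Double> crossings(double[] x, double[] y, double yValue) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < y.length - 1; i++) {
            double y0 = y[i];
            double y1 = y[i + 1];
            boolean crosses = (y0 < yValue && y1 >= yValue) || (y0 > yValue && y1 <= yValue);
            if (crosses) {
                double fraction = (yValue - y0) / (y1 - y0);
                result.add(x[i] + fraction * (x[i + 1] - x[i]));
            }
        }
        return result;
    }

    /** Replaces each run of three crossings that fits inside the window with its mean. */
    static List<Double> mergeTriples(List<Double> values, double window, boolean inclusiveBound) {
        List<Double> vals = new ArrayList<>(values);
        int i = 0;
        // Hypothetical bug: with ">" a list of exactly three values is never examined.
        while (inclusiveBound ? vals.size() - i >= 3 : vals.size() - i > 3) {
            if (Math.abs(vals.get(i + 2) - vals.get(i)) < window) {
                double mean = (vals.get(i) + vals.get(i + 1) + vals.get(i + 2)) / 3.0;
                vals.subList(i, i + 3).clear();
                vals.add(i, mean);
            } else {
                i++;
            }
        }
        return vals;
    }

    public static void main(String[] args) {
        double[] x = { 1.0, 2.0, 3.0, 4.0 };
        double[] y = { 0.5, 1.1, 0.9, 1.5 };
        List<Double> raw = crossings(x, y, 1.0);
        double window = 0.5 * (x[x.length - 1] - x[0]);
        System.out.println("raw:       " + raw);
        System.out.println("strict:    " + mergeTriples(raw, window, false));
        System.out.println("inclusive: " + mergeTriples(raw, window, true));
    }
}
```

With the strict bound the three raw crossings come back untouched; with the inclusive bound they collapse to a single value at (approximately) 2.5, which is what the test expects.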

Conclusion

Discovering bugs like this one is crucial. When users employ this method they are almost certainly expecting the same behaviour that I was, and therefore won’t understand why their code isn’t working – especially if they can’t see the original code of the method and only have access to its Javadoc. I hope that correcting bugs like this one will create a smoother user experience for coders in the future.


Woohoo! Java 9 has a REPL! Getting Started with JShell and Eclipse January

With Java 9 just around the corner, we explore one of its most exciting new features – the Java 9 REPL (Read-Eval-Print Loop). This REPL is called JShell and it’s a great addition to the Java platform. Here’s why.

With JShell you can easily try out new features and quickly check the behaviour of a section of code. You don’t have to create a long-winded dummy main or JUnit test – simply type away. To demonstrate the versatility of JShell, I am going to use it in conjunction with the Eclipse January package for data structures. Eclipse January is a set of libraries for handling numerical data in Java; think of it as a ‘numpy for Java’.

Install JShell

JShell is part of Java 9, currently available in an Early Access version from Oracle and other sources. Download and install Java 9 JDK from http://jdk.java.net/9/ or, if you have it available on your platform, you can install with your package manager (e.g. sudo apt-get install openjdk-9-jdk).

Start a terminal and run JShell: [screenshot of a JShell session]

As you can see, JShell allows you to type normal Java statements, leave off semi-colons, run expressions, access expressions from previous outputs, and achieve many other short-cuts. (You can exit JShell with Ctrl-D.)

Using JShell with Eclipse January

To use Eclipse January, you need to:

1. Download January:

Get the January 2.0.2 jar (or the older January 2.0.1 jar).

2. Download the dependency jars:

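The January dependencies are available from Eclipse Orbit. They are:

  • org.apache.commons.lang_2.6.0.v201404270220.jar
  • org.apache.commons.math3_3.5.0.v20160301-1110.jar
  • org.slf4j.api_1.7.10.v20170428-1633.jar
  • org.slf4j.binding.nop_1.7.10.v20160301-1109.jar

3. Run JShell again, but add all the jars you downloaded to the classpath (remember to be in the directory you downloaded the jars to):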

Windows:

"c:\Program Files\Java\jdk-9\bin\jshell.exe"  --class-path org.eclipse.january_2.0.2.v201706051401.jar;org.apache.commons.lang_2.6.0.v201404270220.jar;org.apache.commons.math3_3.5.0.v20160301-1110.jar;org.slf4j.api_1.7.10.v20170428-1633.jar;org.slf4j.binding.nop_1.7.10.v20160301-1109.jar

Linux:

jshell --class-path org.eclipse.january_2.0.2.v201706051401.jar:org.apache.commons.lang_2.6.0.v201404270220.jar:org.apache.commons.math3_3.5.0.v20160301-1110.jar:org.slf4j.api_1.7.10.v20170428-1633.jar:org.slf4j.binding.nop_1.7.10.v20160301-1109.jar

Some notes:

  • In some versions of jshell the command line argument is called -classpath instead of --class-path.
  • If you are using git bash as your shell on Windows, add winpty before calling jshell and use colons to separate the path elements.


Then you can run through the different types of January commands. Note JShell supports completions using the ‘Tab’ key. Also use /! to rerun the last command.

Import classes

Start by importing the needed classes:

import org.eclipse.january.dataset.*

(No need for semi-colons and you can use the normally ill-advised * import)

Array Creation

Eclipse January supports straightforward creation of arrays. Let’s say we want to create a 2-dimensional array with the following data:

[1.0, 2.0, 3.0,
 4.0, 5.0, 6.0,
 7.0, 8.0, 9.0]

First we can create a new dataset:

Dataset dataset = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 })
System.out.println(dataset.toString(true))

This gives us a 1-dimensional array with 9 elements, as shown below:

[1.0000000, 2.0000000, 3.0000000, 4.0000000, 5.0000000, 6.0000000, 7.0000000, 8.0000000, 9.0000000]

We then need to reshape it to be a 3×3 array:

dataset = dataset.reshape(3, 3)
System.out.println(dataset.toString(true))

The reshaped dataset:

[[1.0000000, 2.0000000, 3.0000000],
 [4.0000000, 5.0000000, 6.0000000],
 [7.0000000, 8.0000000, 9.0000000]]
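
For readers without the jars to hand, the index arithmetic behind a reshape like this can be sketched in plain Java. This is not how January implements reshape (the helper below is hypothetical and copies the data); it only shows the row-major mapping that the output above implies.

```java
import java.util.Arrays;

/** Sketch only: row-major reshape of a flat array, without Eclipse January. */
final class ReshapeSketch {

    /** Copies a flat array into a rows x cols array in row-major order. */
    static double[][] reshape(double[] flat, int rows, int cols) {
        if (rows * cols != flat.length) {
            throw new IllegalArgumentException(
                    "Cannot reshape " + flat.length + " elements to " + rows + "x" + cols);
        }
        double[][] out = new double[rows][cols];
        for (int i = 0; i < flat.length; i++) {
            // Element i lands in row i / cols, column i % cols.
            out[i / cols][i % cols] = flat[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] flat = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        System.out.println(Arrays.deepToString(reshape(flat, 3, 3)));
    }
}
```

The total number of elements has to stay the same, which is why the sketch rejects shapes whose product differs from the length of the flat array.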

Or we can do it all in just one step:

Dataset another = DatasetFactory.createFromObject(new double[] { 1, 1, 2, 3, 5, 8, 13, 21, 34 }).reshape(3, 3)
System.out.println(another.toString(true))

Another dataset:

[[1.0000000, 1.0000000, 2.0000000],
 [3.0000000, 5.0000000, 8.0000000],
 [13.000000, 21.000000, 34.000000]]

There are methods for obtaining the shape and number of dimensions of datasets:

dataset.getShape()
dataset.getRank()

Which gives us:

jshell> dataset.getShape()
$8 ==> int[2] { 3, 3 }

jshell> dataset.getRank()
$9 ==> 2

Datasets can also be created easily from a range or from a random function:

Dataset dataset = DatasetFactory.createRange(15, Dataset.INT32).reshape(3, 5)
System.out.println(dataset.toString(true))

[[0, 1, 2, 3, 4],
 [5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14]]


import org.eclipse.january.dataset.Random //specify Random class (see this is why star imports are normally bad)
Dataset another = Random.rand(new int[]{3,5})
System.out.println(another.toString(true))

[[0.27243843, 0.69695728, 0.20951172, 0.13238926, 0.82180144],
 [0.56326222, 0.94307839, 0.43225034, 0.69251040, 0.22602319],
 [0.79244049, 0.15865358, 0.64611131, 0.71647195, 0.043613393]]

Array Operations

The org.eclipse.january.dataset.Maths class provides rich functionality for operating on the Dataset classes. For instance, here’s how you could add 2 Dataset arrays:

Dataset add = Maths.add(dataset, another)
System.out.println(add.toString(true))

Or you could do it as an in-place addition. The example below creates a new 3×3 array and then adds 100 to each element of the array.

Dataset inplace = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }).reshape(3, 3)
inplace.iadd(100)
System.out.println(inplace.toString(true))

[[101.0000000, 102.0000000, 103.0000000],
 [104.0000000, 105.0000000, 106.0000000],
 [107.0000000, 108.0000000, 109.0000000]]

Slicing

Datasets simplify extracting portions of the data, known as ‘slices’. For instance, given the array below, let’s say we want to extract the data 2, 3, 5 and 6.

[1, 2, 3,
 4, 5, 6,
 7, 8, 9]

This data resides in the first and second rows and the second and third columns. For slicing, indices for rows and columns are zero-based. A basic slice consists of a start and stop index, where the start index is inclusive and the stop index is exclusive. An optional increment may also be specified. So this example would be expressed as:

Dataset dataset = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }).reshape(3, 3)
System.out.println(dataset.toString(true))
Dataset slice = dataset.getSlice(new Slice(0, 2), new Slice(1, 3))
System.out.println(slice.toString(true))

Slice of dataset:

[[2.0000000, 3.0000000],
 [5.0000000, 6.0000000]]
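
The start/stop/increment convention can also be spelled out in plain Java, without January on the classpath. The helpers below are hypothetical and only handle positive increments; they are meant to show which indices a slice selects, not how the library does it.

```java
import java.util.Arrays;

/** Sketch only: start-inclusive, stop-exclusive slicing with an optional step. */
final class SliceSketch {

    /** Indices start, start + step, ... that are strictly less than stop. */
    static int[] indices(int start, int stop, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive in this sketch");
        }
        int count = Math.max(0, (stop - start + step - 1) / step);
        int[] idx = new int[count];
        for (int k = 0; k < count; k++) {
            idx[k] = start + k * step;
        }
        return idx;
    }

    /** Picks the given rows and columns out of a 2-dimensional array. */
    static double[][] slice(double[][] data, int[] rows, int[] cols) {
        double[][] out = new double[rows.length][cols.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < cols.length; c++) {
                out[r][c] = data[rows[r]][cols[c]];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 } };
        // Rows 0..1 and columns 1..2, mirroring Slice(0, 2) and Slice(1, 3).
        double[][] picked = slice(data, indices(0, 2, 1), indices(1, 3, 1));
        System.out.println(Arrays.deepToString(picked));
        // With a step of 2 the row indices are 0 and 2, i.e. every other row.
        System.out.println(Arrays.toString(indices(0, 3, 2)));
    }
}
```

Because the stop index is exclusive, the number of selected elements is simply stop minus start when the increment is 1, which makes adjacent slices easy to line up.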

Slicing and array manipulation functionality is particularly valuable when dealing with 3-dimensional or n-dimensional data.

Wrap-Up

For more on Eclipse January see the following examples and give them a go in JShell:

  • NumPy Examples shows how common NumPy constructs map to Eclipse Datasets.
  • Slicing Examples demonstrates slicing, including how to slice a small amount of data out of a dataset too large to fit in memory all at once.
  • Error Examples demonstrates applying an error to datasets.
  • Iteration Examples demonstrates a few ways to iterate through your datasets.
  • Lazy Examples demonstrates how to use datasets which are not entirely loaded in memory.

Eclipse January is a ‘numpy for Java’, but until now users have not really been able to play around with it in the same way they would with numpy in Python.

JShell provides a great way to test drive libraries like Eclipse January. There are a couple of features that would be nice-to-have such as a magic variable for the last result (maybe $_ or $!) and maybe a shorter way to print a result (maybe /p :-). But overall, it is great to have and finally gives Java the REPL and ability to be used interactively that many have gotten so used to with other programming languages.

In fact we will be making good use of JShell for the Eclipse January workshop being held at EclipseCon France; see details and register here: https://www.eclipsecon.org/france2017/session/eclipse-january
