Friday, January 1, 2021

Cypress How-To: different logging mechanisms for debugging.

Introduction

As I wrote in an earlier blog post, we now have the cypress.io test framework integrated into the Collabora Online codebase. We use this test framework to run automatic tests against the code base as part of the CI process. When a test fails, it's crucial to have proper logging to make it easier to find out why the test fails. We also need a way to add extra logging messages for debugging a test case in detail. In this post, I'll try to summarize the different logging mechanisms available for cypress tests.

Cypress log

Cypress itself has a logging capability, which is used to record the steps made by a test case. For every command that a cypress test runs, we get a line in the cypress log. This log is the first thing we can see in case of a test failure. From this log, we can find out at which step the test failure occurred. This log is displayed, on the one hand, in the command line (see below), but it is also saved under the cypress_test/cypress/logs folder:

The cypress command log above shows, on the one hand, the cause of the failure: Expected to find element: `#fontnamecomboboxs`, but never found it. It seems we don't have a DOM element with the ID fontnamecomboboxs. The log also shows at which step we get that failure. In this specific test case we do the following steps before the failure: 1) load the test document, 2) switch to editing mode (mobile), 3) select all text in the document, 4) open the mobile wizard. After the fourth step we try to click an item in the mobile wizard that is missing, so the test fails.
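For illustration, such a failure boils down to a single command whose selector never matches anything in the DOM:

// This command fails the test after the command timeout expires,
// because no element with the given ID ever appears in the DOM:
cy.get('#fontnamecomboboxs')
 .click();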

By default, this cypress log contains logging for cypress commands. However, we can add additional logging to the test code using the cy.log() method. Anything we print with this method will be shown in the cypress log. For example, we use this logging function in helper methods to mark the beginning and the end of the method, so it's easier to understand the log. See the openMobileWizard() method for example:
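Roughly, such a helper might look like the following minimal sketch (the selectors here are assumptions for illustration, not necessarily the ones used in the code base):

function openMobileWizard() {
 cy.log('Opening mobile wizard - start.');

 // Click the toolbar item that opens the mobile wizard
 // (selector assumed for illustration).
 cy.get('#tb_actionbar_item_mobile_wizard')
  .click();

 // Indicator: wait until the mobile wizard content becomes visible.
 cy.get('#mobile-wizard-content')
  .should('be.visible');

 cy.log('Opening mobile wizard - end.');
}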

It is displayed in the cypress log like this:

loolwsd log

We also save the loolwsd log during a test run. In this log, we can find information about the messages sent between the client and the server. It's the same log file that we have for the actual Collabora Online under the /tmp/ folder with the name loolwsd.log. In case of cypress tests, we put this log file under the /cypress_test/workdir folder. This file always contains the log of the last test run. This "last run" can mean all the tests (e.g. make check), only one test suite (make check-mobile spec=writer/apply_font_spec.js) or even only one test; it depends on the command we last ran. This log is very verbose, so in most cases it's best to filter for specific messages to get useful information from it. An example loolwsd.log is shown below:

To add additional debugging information to this log, you can use the LOG_DBG(), LOG_TRC() and LOG_INF() macros in the server-side C++ code.

loolwsd's command line output

Besides these logs, we also record the command line output of the running loolwsd server. It's the same output that we get for the top-level make run command. This output first displays details related to the started loolwsd (e.g. browser links for the hello world files or the admin console). On the other hand, the same output can be used to debug the LO core code behind Collabora Online. We can use SAL_DEBUG() from core code to write anything to this output, and crash backtraces are also printed here. This output file can be found at cypress_test/workdir/loolwsd_output.log. This output looks like the example below:

Thursday, April 23, 2020

Cypress basics: how do cypress tests work?

Introduction

As I wrote in an earlier blog post, we now have the cypress.io test framework integrated into the Collabora Online codebase. Since we are testing an online, JavaScript-based application, we are working in an asynchronous environment. Triggering a user event via this framework does not mean that the event has actually finished by the time the related cypress method returns. On the other hand, we try to cover real user scenarios, which are usually a sequence of events, as the user does things on the UI step by step. So the question is how to write sequential tests in an asynchronous environment. Fortunately, cypress has the answer to that.

The problem of asynchronicity

First, let's see an example of simulating user events in the cypress test framework. The following code triggers a selection of the text by simulating the CTRL+A shortcut:

// Select all text
cy.get('textarea.clipboard')
 .type('{ctrl}a');

By calling this method we trigger a keyboard input, which will be handled by the client code (loleaflet). The client JS code sends a message to the server about the keyboard input. On the server side, we execute the shortcut, which results in a text selection. Then the server sends the new text selection back to the client side, which displays it for the user. However, the cy.get().type() call won't wait for that; it just triggers the event on the client side and moves on.

Why is it a problem? Let's extend this code with another event:

// Select all text
cy.get('textarea.clipboard')
 .type('{ctrl}a');

// Apply bold font
cy.get('#tb_editbar_item_bold')
 .click();

Let's assume that this '#tb_editbar_item_bold' item exists. It's a toolbar button which applies bold font to the selected text. The problem here is that the selection might not be finished yet when we trigger the font change. Triggering these events does not mean that they will be executed in the specified order. Both events are handled in an asynchronous way, so if the first event (e.g. typing) takes more time to execute, it will finish later. This means the application won't apply bold font to anything, because the text selection is not there yet.

Event-indicator method

What is the solution here? What we need is the event-indicator method. It's very simple: after every event we trigger in the test code, we should check an indicator that the event was actually processed. Let's see how we can fix the previous scenario:

// Select all text
cy.get('textarea.clipboard')
 .type('{ctrl}a');

// Wait for the text selection to appear
cy.get('.leaflet-marker-icon')
 .should('exist');

// Apply bold font
cy.get('#tb_editbar_item_bold')
 .click();

The '.leaflet-marker-icon' DOM element is part of the text selection markup in the application, so we can use it as an indicator. In cypress, every cy.get().should() call has an implicit retry feature. It tries to get the related DOM element again and again, until the element appears in the DOM (or until a given timeout). It also retries the assertion described by the should() method, until the DOM properties meet it (or until a given timeout). So this specific cy.get().should() call will wait until the selection appears on the client side. After that, it's safe to apply the bold font, because we have a selection to apply it to. It's a best practice to use an indicator after every event, so you can make sure that the events are processed in the right order.
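If a step is known to be slow, the retry timeout can also be raised for a single command via Cypress's standard timeout option; for example:

// Wait up to 10 seconds (instead of the default timeout)
// for the selection marker to appear:
cy.get('.leaflet-marker-icon', { timeout: 10000 })
 .should('exist');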

Bad practice

Someone might think of using some kind of waiting after a simulated event, so the application has time to finish the processing. Like below:

// Select all text
cy.get('textarea.clipboard')
 .type('{ctrl}a')

// Wait for the text selection to appear (DON'T USE THIS)
cy.wait(500);

// Apply bold font
cy.get('#tb_editbar_item_bold')
 .click();

I think it's easy to see why this is not effective. If this constant time is too small, the test will fail on slower machines, or on the same machine when it's overloaded. On the other hand, if this constant is too big, the test will be slow. So if we use a value big enough to make the tests pass on all machines, these tests will be as slow as they would be on the slowest machine.

Using an indicator makes the tests more effective. Running them will be faster on faster machines, since we wait only until the indicator is found in the DOM, which is fast in this case. It also works on slower machines, since we have a relatively big timeout for cypress commands (6000 ms), so it's not a problem if applying the selection takes time. All in all, we should try to minimize the usage of the cy.wait() method in our test code.
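For reference, this default timeout comes from the test configuration; a minimal sketch of the relevant cypress.json entry, assuming the standard Cypress option name:

{
    "defaultCommandTimeout": 6000
}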

One more thing: then() is not an indicator

It's good to know that not every cypress command can be used as an indicator. For example, we often use the then() method, which yields the selected DOM element, so we can work with that item in the test code. An example of this is the following, where we get a specific <p> item and check whether it has the specified text content:

cy.get('#copy-paste-container p')
 .then(function(item) {
  expect(item).to.have.lengthOf(1);
  expect(item[0].innerText).to.have.string('\u00a0');
 });

This method call will retry getting the <p> element until it appears in the DOM. However, it won't retry the assertions specified with the expect methods. It will check these expectations only once, when it first gets the <p> item. If we would like to wait for the <p> item to match the assertions, we should use the should() method instead.

cy.get('#copy-paste-container p')
 .should('contain.text', '\u00a0');

This method will retry finding a <p> item that meets both the selector ('#copy-paste-container p') and the assertion. In general, it's always better to use a should() method where possible. However, sometimes then() is also needed. We can use it safely if we use an indicator before it, so we can make sure we check the DOM element at the right time.
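It's also worth knowing that should() accepts a callback function, which Cypress retries as a whole, so the expect-style assertions from the then() example above can be made retry-safe like this:

cy.get('#copy-paste-container p')
 .should(function(item) {
  // These assertions are retried together until they pass
  // or the command times out.
  expect(item).to.have.lengthOf(1);
  expect(item[0].innerText).to.have.string('\u00a0');
 });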

In the example below, we have a cy.get('.blinking-cursor') call. If we know that the blinking cursor was not present before the text selection, then this cy.get() call is a good indicator, so we can safely use then() after it.

// Select all text
cy.get('textarea.clipboard')
 .type('{ctrl}a');

// Use blinking cursor position
cy.get('.blinking-cursor')
 .then(function(cursors) {
  var posX = cursors[0].getBoundingClientRect().left;
  // ... do something with this position (posX)
 });

Summary

I hope this short description of the event-indicator method and the cypress retry feature is useful for understanding how these cypress tests work. This simple rule, to always use indicators, can make things much easier when somebody writes new tests or modifies existing ones. Sometimes the application does not provide an obvious indicator, and we need to be creative to find one.

Friday, March 27, 2020

New integration test framework in Collabora Online.

Introduction

At Collabora, we invest a lot of hard work to make LibreOffice's features available in an online environment. Recently we greatly improved the Collabora Online mobile UI, so it's smoother to use from a mobile device. While putting more and more work into the software and trying to support more and more platforms, we also need to spend time improving the test frameworks we use for automatic testing. These test frameworks make sure that the software remains stable during the continuous development process while we enrich it with new features.

End-to-end testing in the browser

One step on this road was the integration of the cypress.io test framework into the Collabora Online code. cypress.io is an end-to-end test framework that runs in the browser, so any online application can be tested with it. It mainly allows us to simulate user interaction with the UI and check the events' results via the website's DOM elements. That allows us to simulate user stories and catch real-life issues in the software, so our quality measurement matches the actual user experience.

When I investigated the different alternatives for browser testing, I also checked the Selenium test framework. I didn't spend more than two days on that, but I had the impression that Selenium is a somewhat "old" piece of software, which tries to support many configurations and many language bindings; this makes it hard to use and also leaves it stuck in its current state. cypress.io, on the other hand, is a newer test framework which seems more focused. It is easier to integrate into our build system and easier to use, which is a big plus, because it's not enough to integrate a test framework: developers need to learn how to use it too. I saw one advantage of Selenium: the better browser support. It supports all the main browsers (Chrome, Mozilla Firefox, Safari, Internet Explorer), while cypress.io mainly supports only Chrome, but it is improving in this area; it now has beta Mozilla Firefox support. So finally I decided to use cypress.io, and I'm happy I did, because it works nicely.

cypress.io in Collabora Online

So cypress.io is now integrated into the Collabora Online code, and we already have 150 tests, mainly for the mobile UI. As we submit most of our work upstream, these tests are also available in the community version of the software. The framework is integrated into the software's GNU make based build system, so a make check will run these tests automatically. This is also part of the continuous integration system, so we can catch any regression instantly, before it actually hits the code. It's recommended that all developers of the online code get familiar with the test framework, so it will be easier to understand whether a test failure indicates an issue in their proposed patch. There is a set of useful notes in the source code, in the readme file: [source_dir]/cypress_test/README. In the following paragraphs, I try to add some advice on how to investigate a cypress test failing on your patch.

How to check a test failure?


Interactive test runner

When you run make check, the cypress tests run in headless mode, so you can't see what happens on the UI while the tests are running. If you see a test failure, the easiest way to understand what happens is to check it in the interactive test runner. To do that, you can call make run-mobile or make run-desktop, depending on what kind of test you are interested in. In interactive mode, you'll get a window where you can select a test suite (*_spec.js file) and run that test suite only.

After you select a test suite, you'll see the tests running in the browser. It's fast, so you probably can't follow all the steps, but after the tests have finished you can select the steps and check a screenshot for every step, so you can follow the state of the application. This way you can see how the application gets into a failure state.

Can't reproduce a failure in the interactive test runner

Sometimes it happens that a test failure is reproducible only in headless mode. There are several things you can do in this case. First, you can check the screenshot taken at the point where the test failed. This screenshot is added automatically to a separate folder:

[source_dir]/cypress_test/cypress/screenshots/

This screenshot shows only the failure state, which might not be enough. You can also use the cypress command log to write out important information to the console during a test run. You can do that using the cy.log() method, called from the JS test code (this is not equivalent to the console.log() method). In case of a test failure, these logs are dumped to the console output. These logs are also available here:

[source_dir]/cypress_test/cypress/logs/

A third option is to enable video recording. With video recording, the cypress test framework will generate a video of the test run, where you can see the same thing that you would see in the interactive test runner. To enable video recording, you need to remove the "video" : false, line from the [source_dir]/cypress_test/cypress.json file. After that, running make check will record videos for all the test suites you run and put them under the videos folder:

[source_dir]/cypress_test/cypress/videos/
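As a reference, the line to remove looks like this inside [source_dir]/cypress_test/cypress.json (other options omitted):

{
    "video" : false
}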

How to run only one test?

To run one test suite you can use the spec option:

make check-mobile spec=writer/apply_font_spec.js

This spec option can be used with the check-mobile and check-desktop rules, depending on what kind of test you intend to run. This is the headless run, but you can do the same in the interactive test runner by selecting one test suite from the list. With these options you can run a test suite, but a test suite contains multiple tests. If you would like to run only one test, you need to combine a test suite run with the only() method. You need to open the test suite file and add only() to the definition of the specific test case:

- it('Apply font name.', function() {
+ it.only('Apply font name.', function() {

With this, both the headless run and the interactive test runner will ignore any other tests in the same test suite. It's useful when somebody investigates why a specific test fails.
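For context, a minimal sketch of how this looks inside a spec file (the test names are illustrative):

describe('Apply font changes.', function() {
 it.only('Apply font name.', function() {
  // Only this test runs while it.only() is present.
 });

 it('Apply font size.', function() {
  // This test is skipped.
 });
});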

Summary

So that's it for now. I hope these pieces of information are useful for getting familiar with the new test framework. Fortunately, the cypress.io test framework provides us with nice tools to write new tests and check test failures. I'm still working on the test framework to customize it for our online environment. Hopefully, using this test framework will improve software quality in the long term.

Monday, March 6, 2017

New chart test suite

It's good to see that the LibreOffice project has more and more tests improving the functional quality of the software. In this post, I write about a new test suite added to the project. At Collabora, we are working on a project about new chart functionality, and as preparation we decided to add a test suite which covers the chart layout code better than the tests LO already had. With this we can map the actual state of the chart code to test cases and make sure its functionality remains intact.

Different kinds of automatic tests in LO

There are different forms of automatic tests in the project. The question was which one is the most effective for testing a bigger part of the code (chart layout), not only a small functionality. One form of testing is when we use CppUnit assertions to compare some properties of the actual test case with the expected values. These tests are used to check a very specific case. For example, one can check whether a test document has 2 pages when it is imported into LibreOffice. This kind of test can lead to code duplication when we test the same thing (e.g. page number) on different test documents. That's why this kind of CppUnit test is not effective when we need to test one bigger part of the software systematically, which may require testing the same thing (e.g. page number) on different test documents, where these documents might be important use cases.

Other tests use an XML dump functionality to dump the test document's layout to an XML file and use this file as a reference with which the test document can be compared later. Adding new test cases to this kind of test suite is easier and needs less code change compared to the first form of tests. However, this kind of test checks the whole layout of the document in general. So when one of these tests fails, you need to check the XML file, understand its structure and find out why the reported difference is there. It's clear that this kind of test failure might be more difficult to understand compared to the first form of tests, where the failure points to a very specific document property, not to a reference XML file.

New chart test suite

An important difference between these two forms is the generality level of what they test. The first form tests some specific properties of one specific test case, while the second form tests the whole layout of the test document. To avoid the issues of these two forms, I chose to write a test suite designed to test the software functionality at a middle level of generality.

I added a test suite which contains tests similar to the first form, comparing some specific document properties to expected values using CppUnit assertions; but the expected values are not hard-coded, they are written into and read from a simply structured reference file. It's implemented in a way which makes it easy to add a new test document for an existing test case and generate the reference file without extending the test code. On the other hand, we get a helpful error message when a test fails, since the test case is more specific than an XML dump test.

Testing chart functionality

After I had the form, I spent some time adding the most common use cases to the test suite. I added test cases for different components of charts (axis, chart wall, legend, grid, data series, etc.) and also for common chart types (column chart, bar chart, pie chart, etc.). All test cases contain several subcases, testing functionally distinguishable use cases. The systematic testing of the chart functionality also pointed out some issues in the software.

Future possibilities

Now, with the new chart test suite, a bigger part of the chart functionality is covered with tests, but there are still use cases which are not tested. For example, some exotic chart types have no tests yet, like bubble charts, net charts or 3D charts. The tested document formats are also limited to LibreOffice native formats (ODS, ODP), but these tests are easy to extend to Microsoft Office file formats too, for compatibility testing. You can find the new test suite at chart2/qa/extras/chart2dump/ in case you need to add new test cases.

Wednesday, December 21, 2016

A short time spent on LibreOffice accessibility

LibreOffice and Orca

In the last months I spent a short period fixing accessibility issues, mainly on Linux. LibreOffice has a bunch of these kinds of issues (fdo#36549). This metabug is about bugs which make it difficult for the Orca screen reader to make LO usable for visually impaired users. As I see, Orca has a few workarounds in its LO-related code to handle these issues (e.g. ignoring false or duplicated events), but sometimes there is no such solution and we need to add improvements on the LO side.

Small fixes

So I did some bug fixing in this area. Most of the bugs were about missing accessibility events, which Orca needs to announce changes that are visible on the screen and that users should notice. For example, when the selection is changed on a Calc sheet (fdo#93825) or when the cursor moves inside a text portion (fdo#99687, fdo#71435). These issues can be frustrating for users who are used to getting feedback about the effect of their key presses.

Fixing these issues needed only small code changes, which shows that the LO accessibility code has a good structure in general; but as the code changes over time, some parts of it simply become broken without maintenance.

Spellcheck dialog

A somewhat bigger change I added was related to the spellcheck dialog (fdo#93430). The spellcheck dialog shows the errors the spellcheck algorithms find and offers some options to handle these errors (e.g. suggestions for correction). The problem with this dialog was the text entry which shows the misspelled word. This text entry contains a small part of the text and highlights the misspelled word with red text color. Orca tried to get this small text part and find out which word is the erroneous one, but LO did not return the right text attributes, so Orca did not have the necessary information to handle this situation.

Now, after fixing this issue, the text color and also the boundaries of the misspelled word are accessible to Orca. It's great to see that Orca's developer, Joanmarie Diggs, has already adapted the code to handle this new information, so reading of the spellcheck dialog will be better in the next versions of both programs.

BeLin

I added these accessibility improvements working for the IT Foundation for the Visually Impaired. One project of the Foundation is an Ubuntu based operating system for visually impaired users called BeLin ("Beszélő Linux", which is "Speaking Linux" in English). Since it's an Ubuntu based distribution, it has LibreOffice as the default office suite and uses Orca as the screen reader.

Hopefully these changes will make it more comfortable to use LibreOffice and Orca, both on BeLin and on other Linux distributions.

Friday, November 25, 2016

New pivot table function in Calc: Median

After doing some work on a pivot table related performance issue, I got a request to implement a new function for pivot tables. It seemed a useful feature to have and, at first sight, also an easy thing to implement.

Pivot table functions

Pivot tables are used to analyse a larger amount of data using different statistical functions. Both LibreOffice Calc and Microsoft Office Excel have the same function palette with 11 functions, like average, sum, count and so on. These aggregate functions can be used for data fields and for row/column fields. Data fields determine which source field is summarized and which function is used for that. Row/column fields don't have a function by default, but by setting one, the user can calculate subtotals too.

Median

Why median? In psychology research, pivot tables can be a useful tool for analyzing participants' data. Which functions can be used for aggregation depends on the type of the data: when we have interval variables we can use average, but for ordinal variables we need median, which was missing from the function list.

This was the starting point, but I also found some user posts about this feature missing from MS Excel (see links below). It seems Excel users have had to face the very same problem again and again over the last ten years. Well, it has some advantages if a software product is open source, I guess.

Good news

In LibreOffice 5.3, median is available for pivot tables.

Thanks

... to my professor, Attila Krajcsi (Department of Cognitive Psychology, Eötvös Loránd University) for supporting the LibreOffice project, with the idea to have this new function and also with some course credits!

Thursday, October 20, 2016

Improve pivot table import performance (tdf#102694)

After a short break, I got back to Collabora and I continue working on LibreOffice. This time I handled a performance issue in the pivot table import code, as part of our work for SUSE. In some cases, importing an XLSX document containing several pivot tables took so long that the user could interpret it as a freeze.

After some testing we found that pivot table grouping is one of the things that makes the import slow. "We" means me and Kohei Yoshida, who is an expert in this area of the code and helped me understand it. So grouping was our focus this time.

Pivot table groups

In case of pivot tables, we can make groups for columns of the source data. These groups can be name groups (general groups), number groups (e.g. number ranges) and date groups (e.g. grouped by year, month, day, etc.). Since these groups are related to the source, they are rather part of the source than of the pivot table layout. This is true for both the MS Excel and the LO Calc implementation. Effectively this means that when we have several pivot tables using the same source, they will have the same groups too. In case of LO Calc, these groups are stored in the pivot cache linked with the corresponding source range.

Performance

The performance issue here was that although these kinds of pivot tables (having the same source) are linked to each other by the pivot cache, the XLSX import ignored that and treated the pivot tables as fully independent. This means the same groups were imported as many times as tables referenced them. What made it even slower is that these tables are linked to each other in a way that when one table's grouping is changed, the other tables are also affected.

This difference between the internal handling of pivot tables and the XLSX import code came from the fact that the XLSX import code was written before groups became part of the pivot cache. So I actually needed to update the import code to follow the changes of the pivot table internal implementation. With that, the import time of pivot table groups became quite good. For example, the test document I uploaded to bugzilla (tdf#102694) took more than 20 minutes before, and now it takes less than half a minute. This document contains a small data table and 20 simple pivot tables. So it's not something which should take that much time to load, and now it doesn't.