Speed up coverage report generation #636

@RevanProdigalKnight

Description

As I continue to expand my own project's suite of unit tests, I have started looking through the bashunit code to identify pain points. I know I'm penalizing myself by running bashunit in Cygwin on Windows 10, on a 15+ year old computer, rather than on proper Linux: on my PC the test suite takes several minutes to run in serial mode, while on the build pipeline agent the same suite runs in under a minute.

Even considering all of that, coverage report generation times on my PC are... rather long. (I haven't tried introducing coverage on the build pipeline yet.)

$ time ./lib/bashunit tests/* --simple --jobs 4 --coverage --coverage-report-html
bashunit - 0.34.1
# Test output omitted for brevity
Tests:      194 passed, 5 skipped, 2 incomplete, 201 total
Assertions: 1100 passed, 5 skipped, 2 incomplete, 1107 total

 Some tests incomplete
Time taken: 9m 14s

Coverage Report
---------------
# Specific file output omitted for brevity
---------------
Total: 1409/1734 (81%)


Coverage report written to: coverage/lcov.info
Coverage HTML report written to: coverage/html/index.html

real    217m36.766s
user    52m52.586s
sys     247m19.326s

Running with --parallel (unlimited test concurrency) instead of a fixed number of jobs does not speed things up significantly, since my PC is old and has only 6 CPU cores.

As an aside, my unit tests ran faster before I split the main file into smaller modules (~5 minutes in serial), but at that point I was unable to get any coverage information out of them. I'm not sure why that is the case, but thought it was worth mentioning as an interesting tidbit.


In my testing so far, bashunit::coverage::is_executable_line can be reduced from running up to 7 subshells of the form $(echo "$line" | "$GREP" [...] || true), one per kind of omitted line, to at most 1 with the following. This includes the changes made in main for handling loop terminators with redirections etc., which haven't made it into a released version yet:

_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS_ARRAY=(
  # Skip function declaration lines but not single-line functions with a body
  '(function[[:space:]]+)?[a-zA-Z_][a-zA-Z0-9_:]*[[:space:]]*\(\)[[:space:]]*\{?'
  # Skip lines with only braces
  '[\{\}]'
  # Skip control flow keywords
  '(then|else|fi|do|done|esac|in|;;|;;&|;&)'
  # Skip loop terminator with trailing redirection/pipe/fd (e.g. "done < file", "done | sort", "done 2>&1", "done &")
  'done[[:space:]]*(<(<<)?|\||&|[[:digit:]]?>).*'
  # Skip case patterns
  '[^\)]+\)'
  # Skip standalone ) for arrays/subshells
  '\)'
)
# Join the array using `|`, then reset IFS afterward
# NOTE: Not sure if this is bash 3 compatible, but not difficult to convert to a loop of string concatenations if it isn't
temp_ifs="$IFS"
IFS='|'
# Pre-compiled regex pattern of patterns (performance optimization)
# Skip any of the above patterns in isolation as well as followed by an optional comment
_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS="^[[:space:]]*((${_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS_ARRAY[*]})[[:space:]]*)?(#.*)?$"
IFS="$temp_ifs"
unset temp_ifs

function bashunit::coverage::is_executable_line() {
  local line="$1"
  local lineno="$2"

  # Unused but kept for API compatibility
  : "$lineno"

  # Skip empty lines and lines containing only spaces (tab-only lines are caught by the pattern below)
  [ -z "${line// /}" ] && return 1

  # Skip lines matching any of the patterns above in _BASHUNIT_NONEXECUTABLE_LINE_PATTERNS
  [ "$(echo "$line" | "$GREP" -cE "$_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS" || true)" -gt 0 ] && return 1

  return 0
}
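
For what it's worth, the one remaining subshell could probably go as well. Below is a minimal sketch, not tested against bashunit itself, that matches with bash's built-in `[[ =~ ]]` instead of piping to grep, and joins the patterns with a plain loop so IFS is never touched. All names (`SKETCH_PATTERNS`, `SKETCH_JOINED`, `SKETCH_REGEX`, `is_executable_line_sketch`) are hypothetical, and the pattern list is a shortened subset of the one above. It assumes a bash version that supports `[[ =~ ]]` with the regex held in an unquoted variable.

```shell
#!/usr/bin/env bash

# Hypothetical, shortened pattern list (a subset of the patterns above).
SKETCH_PATTERNS=(
  '[{}]'                                     # lines with only a brace
  '(then|else|fi|do|done|esac|in|;;|;;&|;&)' # control flow keywords
  '\)'                                       # standalone closing parenthesis
)

# Join with "|" using plain string concatenation, so IFS is never modified.
SKETCH_JOINED=''
for sketch_pattern in "${SKETCH_PATTERNS[@]}"; do
  SKETCH_JOINED="${SKETCH_JOINED:+$SKETCH_JOINED|}$sketch_pattern"
done
unset sketch_pattern

# Same outer shape as above: optional pattern, optional trailing comment.
SKETCH_REGEX="^[[:space:]]*((${SKETCH_JOINED})[[:space:]]*)?(#.*)?\$"

function is_executable_line_sketch() {
  local line="$1"
  # [[ =~ ]] is evaluated by bash itself: no subshell, no pipe, no grep process.
  # The regex variable is left unquoted so it is treated as a pattern.
  [[ "$line" =~ $SKETCH_REGEX ]] && return 1
  return 0
}
```

Since every group in the regex is optional, empty and whitespace-only lines match too, so the separate empty-line check would no longer be needed in this variant. Whether the full pattern list behaves identically under the system regex library as it does under "$GREP" -E would need checking.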

Making the above changes in my local copy of bashunit reduces the overall time to around 65 minutes on average (~3.8x faster coverage generation). In a smaller single-file test (32 executable lines of code), it reduced the overall time for lcov + HTML coverage output from a little over 4m7s to a little under 1m21s.

While I haven't gone looking for much else past this yet, I get the feeling there are probably more cases like this where repeated subshell invocations can be grouped into single, more powerful subshell invocations with a little ingenuity.
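
To find other candidates, a rough micro-benchmark along these lines might help compare a subshell-based check against an in-process one. This is only a sketch: the function names (`matches_via_grep`, `matches_via_builtin`, `count_matches`), the `SKETCH_RE` pattern, and the iteration count are made up for illustration, and it calls plain `grep` rather than bashunit's "$GREP".

```shell
#!/usr/bin/env bash

# Hypothetical pattern, used only for this comparison.
SKETCH_RE='^[[:space:]]*(fi|done)[[:space:]]*$'

# One command substitution plus one pipeline per call (spawns processes).
function matches_via_grep() {
  [ "$(echo "$1" | grep -cE "$SKETCH_RE" || true)" -gt 0 ]
}

# The same check evaluated inside the current shell process.
function matches_via_builtin() {
  [[ "$1" =~ $SKETCH_RE ]]
}

# Run a matcher over the same two inputs repeatedly and count the matches.
function count_matches() {
  local matcher="$1" iterations="$2" count=0 i
  for ((i = 0; i < iterations; i++)); do
    if "$matcher" '  done'; then count=$((count + 1)); fi
    if "$matcher" 'echo hi'; then count=$((count + 1)); fi
  done
  echo "$count"
}

# `time` reports on stderr; both runs should print the same match count.
time count_matches matches_via_grep 200
time count_matches matches_via_builtin 200
```

The absolute numbers will depend heavily on the platform, so the useful part is the ratio between the two runs on the machine in question.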

Labels: enhancement (New feature or request)