As I continue to expand my own project's suite of unit tests, I have started to look through the bashunit code to identify pain points. I know I'm penalizing myself by running bashunit on a 15+ year old computer under Cygwin on Windows 10 (as opposed to proper Linux): on my PC the test suite takes several minutes to run in serial mode, while on the build pipeline agent the same suite takes under a minute.
Even considering all of that, on my PC coverage generation during unit test runs is... rather slow. (I haven't tried enabling it on the build pipeline yet.)
```shell
$ time ./lib/bashunit tests/* --simple --jobs 4 --coverage --coverage-report-html

bashunit - 0.34.1

# Test output omitted for brevity

Tests:      194 passed, 5 skipped, 2 incomplete, 201 total
Assertions: 1100 passed, 5 skipped, 2 incomplete, 1107 total

Some tests incomplete

Time taken: 9m 14s

Coverage Report
---------------
# Specific file output omitted for brevity
---------------
Total: 1409/1734 (81%)

Coverage report written to: coverage/lcov.info
Coverage HTML report written to: coverage/html/index.html

real    217m36.766s
user    52m52.586s
sys     247m19.326s
```
Running with `--parallel` for unlimited testing concurrency instead of a fixed number of jobs does not increase speed significantly, since my PC is so old and only has 6 CPU cores.
As an aside, for some reason my unit tests ran faster before I split the main file into smaller modules, taking ~5 minutes in serial, but I was unable to get any coverage information out of it. Not sure how/why that is the case, but thought it was worth mentioning as an interesting tidbit.
In my testing so far, `bashunit::coverage::is_executable_line` can be optimized from running up to 7 subshells of `$(echo "$line" | "$GREP" [...] || true)` (one per exclusion pattern) down to at most 1, with the following (this includes the changes made in `main` for handling loop terminators with redirections etc., which haven't made it into a released version yet):
```bash
_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS_ARRAY=(
  # Skip function declaration lines, but not single-line functions with a body
  '(function[[:space:]]+)?[a-zA-Z_][a-zA-Z0-9_:]*[[:space:]]*\(\)[[:space:]]*\{?'
  # Skip lines with only braces (no backslashes inside the bracket expression:
  # in ERE, a backslash in brackets is a literal backslash)
  '[{}]'
  # Skip control flow keywords
  '(then|else|fi|do|done|esac|in|;;|;;&|;&)'
  # Skip loop terminator with trailing redirection/pipe/fd
  # (e.g. "done < file", "done | sort", "done 2>&1", "done &")
  # Note: POSIX ERE has no lazy quantifier, so plain ".*" rather than ".*?"
  'done[[:space:]]*(<(<<)?|\||&|[[:digit:]]?>).*'
  # Skip case patterns
  '[^)]+\)'
  # Skip standalone ) for arrays/subshells
  '\)'
)

# Join the array using `|`, then restore IFS afterward
# NOTE: Not sure if this is bash 3 compatible, but it is not difficult to
# convert to a loop of string concatenations if it isn't
temp_ifs="$IFS"
IFS='|'
# Pre-compiled regex of all the patterns above (performance optimization):
# skip any one of them in isolation, optionally followed by a comment
_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS="^[[:space:]]*((${_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS_ARRAY[*]})[[:space:]]*)?(#.*)?$"
IFS="$temp_ifs"
unset temp_ifs
```
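Regarding the NOTE above: if the `IFS` join ever turns out not to be Bash 3-safe, the loop-of-concatenations fallback it mentions would look roughly like this (a sketch with toy stand-in patterns, not bashunit's full list):

```bash
#!/usr/bin/env bash
# Fallback sketch: join an array of ERE alternatives with '|' using plain
# string concatenation instead of an IFS-based "${array[*]}" expansion.
patterns=('\{' '\}' '(then|else|fi|do|done)')

joined=''
for p in "${patterns[@]}"; do
  if [ -z "$joined" ]; then
    joined="$p"                 # first entry: no leading separator
  else
    joined="${joined}|${p}"     # subsequent entries: prepend '|'
  fi
done

combined="^[[:space:]]*((${joined})[[:space:]]*)?(#.*)?$"
echo "$combined"
```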
```bash
function bashunit::coverage::is_executable_line() {
  local line="$1"
  local lineno="$2"

  # Unused, but kept for API compatibility
  : "$lineno"

  # Skip empty lines (lines containing only whitespace, including tabs)
  [ -z "${line//[[:space:]]/}" ] && return 1

  # Skip lines matching any of the patterns in _BASHUNIT_NONEXECUTABLE_LINE_PATTERNS
  [ "$(echo "$line" | "$GREP" -cE "$_BASHUNIT_NONEXECUTABLE_LINE_PATTERNS" || true)" -gt 0 ] && return 1

  return 0
}
```
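For a quick sanity check outside of bashunit, the single-grep approach can be exercised in isolation. This is a minimal sketch with a trimmed-down pattern list; `GREP` falls back to plain `grep` here, and the function name is shortened for the demo:

```bash
#!/usr/bin/env bash
# Standalone sketch of the one-grep classification (illustrative patterns only).
GREP="${GREP:-grep}"

patterns=('[{}]' '(then|else|fi|do|done|esac)' '[^)]+\)')
combined="$(IFS='|'; printf '%s' "^[[:space:]]*((${patterns[*]})[[:space:]]*)?(#.*)?$")"

is_executable_line() {
  local line="$1"
  [ -z "${line//[[:space:]]/}" ] && return 1
  [ "$(echo "$line" | "$GREP" -cE "$combined" || true)" -gt 0 ] && return 1
  return 0
}

for sample in 'echo "hello"' '  fi' 'value)' '   ' 'local x=1  # comment'; do
  if is_executable_line "$sample"; then
    printf 'executable:     %s\n' "$sample"
  else
    printf 'non-executable: %s\n' "$sample"
  fi
done
```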
Making the above changes in my local copy of bashunit reduces the overall run time to around 65 minutes on average (~3.8x faster coverage generation). In a smaller, single-file test (32 executable lines of code), it cut the overall time for lcov + HTML coverage output from a little over 4m7s to a little under 1m21s.
While I haven't gone looking much further yet, I get the feeling there are more cases like this where repeated subshell invocations can be grouped into single, more powerful invocations with a little ingenuity.
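As a concrete illustration of that direction (hypothetical, not actual bashunit code): rather than spawning one `echo | grep` pair per covered line, a single `grep -n` pass could classify every line of a file at once:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: one grep -n invocation classifies all lines of a file,
# instead of one subshell per line. Toy pattern for brevity.
pattern='^[[:space:]]*([{}]|fi|done)[[:space:]]*$'

file="$(mktemp)"
printf '%s\n' 'x=1' 'fi' 'echo hi' '}' > "$file"

# Single subshell total: grep -n emits matching lines as "N:line";
# cut keeps the line numbers and xargs joins them with spaces.
skip_lines="$(grep -nE "$pattern" "$file" | cut -d: -f1 | xargs)"
echo "non-executable lines: $skip_lines"   # → non-executable lines: 2 4

rm -f "$file"
```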