Experiments for ``Certification of Confluence- and Commutation-Proofs via Parallel Critical Pairs''
confluence tool | Hakusan version 0.10 with termination tool TTT2 version 1.20.1 and SMT solver Z3 version 4.12.2. |
---|---|
certifier | CeTA version 2.46 (FORT-h uses FORTify version 2.0) |
problem set | problems.zip, containing 462 left-linear TRSs in COPS (Nos. 1 – 1655). |
environment | a PC with Intel Core i7-8500Y CPU (1.5 GHz) and 16 GiB memory; Ubuntu 22.04.3 LTS |
timeout | 60 seconds timeout for each problem |
R | Theorem 3.11 with Rφ,0 = Rψ,0 = ∅ | hakusan -R 5 file.trs | |
---|---|---|---|
C | Successive application of Theorem 3.15 | hakusan -C 5 file.trs | |
RC | Theorem 3.11 where a subsystem is handled by Theorem 3.15 (with C = ∅) | hakusan -RC 5 file.trs | |
CR | Theorem 3.15 where a subsystem is handled by Theorem 3.11 (with Rφ,0 = Rψ,0 = ∅) | hakusan -CR 5 file.trs | |
ACP | ACP version 0.72 (+ CeTA) | ACP1.sh file.trs 60 | |
CSI | CSI version 1.2.7 (+ CeTA) | CSI1.sh file.trs 60 | |
FORT-h | FORT-h + FORTify version 2.0 | FORT-h1.sh file.trs 60 | |
Hakusan | successive application of R and C | hakusan file.trs | |
k | Knuth and Bendix' criterion (Knuth and Bendix 1970), based on ACP | k1.sh file.trs 60 | |
s | strong closedness (Huet 1980), based on ACP | s1.sh file.trs 60 | |
a | almost development closedness (van Oostrom 1997), based on ACP | a1.sh file.trs 60 | |
o | original rule labeling for linear TRSs (van Oostrom 2008), based on CSI | o1.sh file.trs 60 | |
rt | relative-termination-based rule labeling (Zankl et al. 2011), based on CSI | rt1.sh file.trs 60 | |
cc | critical pair closing systems (Hirokawa et al. 2019), based on CSI | cc1.sh file.trs 60 | |
T81 | Toyama's confluence criterion based on parallel critical pairs (Toyama 1981), based on ACP | T81.sh file.trs 60 | |
X1 | generation of a certificate by method/tool X | ||
X2 | verification of X's certificate by CeTA (if X is not FORT-h) | tail -n +2 output-of-X1 > file.xml; ceta file.xml | |
FORT-h2 | verification of FORT-h's certificate by FORTify | tail -n +2 output-of-FORT-h1 > file.txt; fortify `head -1 output-of-FORT-h1` CR file.txt file.trs |
Y: YES | proved/certified confluence |
---|---|
N: NO | proved/certified non-confluence |
M: MAYBE | failed to prove/certify |
T: TIMEOUT | exceeded time limit 60 seconds |
E: ERROR | tool terminated with some error |
Note
- Hakusan uses TCAP (Zankl et al. 2011) for disproving confluence.
- For 944.trs and 948.trs CSI answers YES for but fails to produce certificates.
- CSI's results for 141.trs, 939.trs, and 1036.trs were obtained by using another PC (Intel Core i7-1165G7 CPU 2.80 GHz and 16 GB memory) due to some problem of the PC mentioned above.
- Concerning YES results, we have:
- ACP2 ⊆ Hakusan2
- |CSI2 ∪ Hakusan2| = 177
- |Hakusan2 - CSI2| = 20
- |CSI2 - Hakusan2| = 33
- |FORT-h2 - Hakusan2| = 2
- |ACP2 ∪ T81| = 82
- |CSI2 ∪ R2| = 173
status | R1 | R2 | C1 | C2 | RC1 | RC2 | CR1 | CR2 | ACP1 | ACP2 | CSI1 | CSI2 | FORT-h1 | FORT-h2 | Hakusan1 | Hakusan2 | k1 | k2 | s1 | s2 | a1 | a2 | o1 | o2 | rt1 | rt2 | cc1 | cc2 | T81 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
YES | 133 | 133 | 107 | 107 | 141 | 141 | 138 | 138 | 73 | 73 | 159 | 157 | 34 | 33 | 144 | 144 | 50 | 50 | 62 | 62 | 52 | 52 | 90 | 90 | 96 | 96 | 77 | 77 | 72 |
(sec) | 140.01 | 7.20 | 177.00 | 5.18 | 152.05 | 8.20 | 173.47 | 8.83 | 4.04 | 4.48 | 319.84 | 43.78 | 2.80 | 134.82 | 196.61 | 7.98 | 28.61 | 2.52 | 1.93 | 4.34 | 1.75 | 3.29 | 16.86 | 4.97 | 25.57 | 5.46 | 25.42 | 4.28 | 2.08 |
NO | 68 | 68 | 68 | 68 | 68 | 68 | 68 | 68 | 133 | 133 | 171 | 171 | 87 | 77 | 68 | 68 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
(sec) | 1.08 | 3.05 | 2.48 | 3.10 | 1.03 | 3.17 | 2.63 | 3.27 | 22.99 | 6.54 | 170.79 | 8.69 | 10.82 | 442.66 | 3.38 | 3.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
MAYBE | 231 | 261 | 221 | 287 | 137 | 253 | 214 | 256 | 256 | 256 | 132 | 132 | 340 | 341 | 134 | 250 | 310 | 412 | 400 | 400 | 407 | 410 | 276 | 372 | 264 | 366 | 283 | 385 | 389 |
(sec) | 380.16 | 3.39 | 752.47 | 4.10 | 721.25 | 3.53 | 899.63 | 3.52 | 259.47 | 3.82 | 6873.81 | 1.60 | 4.13 | 3.63 | 856.69 | 3.21 | 445.03 | 5.85 | 59.02 | 5.25 | 61.48 | 6.89 | 63.31 | 4.84 | 122.81 | 4.84 | 132.22 | 5.39 | 24.40 |
TIMEOUT | 30 | 0 | 66 | 0 | 116 | 0 | 42 | 0 | 0 | 0 | 0 | 0 | 1 | 11 | 116 | 0 | 102 | 0 | 0 | 0 | 3 | 0 | 96 | 0 | 102 | 0 | 102 | 0 | 1 |
(sec) | 1800.00 | 0.00 | 3960.00 | 0.00 | 6960.00 | 0.00 | 2520.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 60.00 | 660.00 | 6960.00 | 0.00 | 6120.00 | 0.00 | 0.00 | 0.00 | 180.00 | 0.00 | 5760.00 | 0.00 | 6120.00 | 0.00 | 6120.00 | 0.00 | 60.00 |
ERROR | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
(sec) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |