TER1 (Statement): 100.00%
TER2 (Branch): 100.00%
TER3 (LCSAJ): 100.0% (21/21)
Approximate LCSAJ segments: 49
● Covered: this LCSAJ path was executed during testing.
● Not covered: this LCSAJ path was never executed. These are the paths to focus on.
Multiple dots on a line indicate that multiple control-flow paths begin at that line. Hovering over any dot shows:
start → end → jump
Uncovered paths show [NOT COVERED] in the tooltip.
1: package Schema::Validator;
2:
3: # ---------------------------------------------------------------------------
4: # Schema::Validator -- ISO 8601 datetime validation and Schema.org vocabulary
5: # loading. Purely functional; all symbols are opt-in via import list.
6: # ---------------------------------------------------------------------------
7:
8: use strict;
9: use warnings;
10: use autodie qw(:all);
11:
12: use Carp qw(carp croak);
13: use DateTime::Format::ISO8601;
14: use Encode qw(decode encode);
15: use File::Spec;
16: use JSON::MaybeXS qw(decode_json);
17: use LWP::UserAgent;
18: use Params::Get qw(get_params);
19: use Params::Validate::Strict qw(validate_strict);
20: use Readonly;
21: use Scalar::Util qw(reftype);
22:
23: use base 'Exporter';
24:
25: # Only these two symbols may be imported by callers via 'use ... qw(...)'.
26: our @EXPORT_OK = qw(is_valid_datetime load_dynamic_vocabulary);
27:
28: our $VERSION = '0.03';
29:
30: # ---------------------------------------------------------------------------
31: # Package globals: both are populated as a side-effect of
32: # load_dynamic_vocabulary(). Callers may read them after that call.
33: # ---------------------------------------------------------------------------
34:
35: # rdfs:Class items from the Schema.org JSON-LD graph, keyed by class label
36: our %dynamic_schema;
37:
38: # rdf:Property items from the Schema.org JSON-LD graph, keyed by property label
39: our %dynamic_properties;
40:
41: # ===========================================================================
42: # CONSTANTS
43: # ===========================================================================
44: # All magic strings and numbers are confined here; nothing below uses bare
45: # literals. Every constant mirrors a key in %config so runtime overrides
46: # are possible without re-opening the Readonly namespace.
47: # ---------------------------------------------------------------------------
48:
49: # Default cache directory: $CACHEDIR env var if set, otherwise the system
50: # temporary directory. Evaluated once at module load time.
51: Readonly::Scalar my $DEFAULT_CACHE_DIR =>
52:     (defined $ENV{CACHEDIR} && length $ENV{CACHEDIR})
53:         ? $ENV{CACHEDIR}
54:         : File::Spec->tmpdir();
55:
56: # Default cache filename -- stored in $DEFAULT_CACHE_DIR, never in CWD.
57: Readonly::Scalar my $DEFAULT_CACHE_FILE =>
58:     File::Spec->catfile($DEFAULT_CACHE_DIR, 'schemaorg_dynamic_vocabulary.jsonld');
59:
60: # 86400 == 60 * 60 * 24: cache is considered fresh for one full day.
61: Readonly::Scalar my $DEFAULT_CACHE_DURATION => 86_400;
62:
63: # Canonical URL for the Schema.org full vocabulary in JSON-LD format.
64: Readonly::Scalar my $DEFAULT_VOCAB_URL => 'https://schema.org/version/latest/schemaorg-current-https.jsonld';
65:
66: # HTTP timeout for the vocabulary download request, in seconds.
67: Readonly::Scalar my $DEFAULT_UA_TIMEOUT => 30;
68:
69: # JSON-LD structural keys and RDF type labels used when traversing @graph.
70: Readonly::Scalar my $AT_GRAPH => '@graph';
71: Readonly::Scalar my $RDF_CLASS => 'rdfs:Class';
72: Readonly::Scalar my $RDF_PROPERTY => 'rdf:Property';
73: Readonly::Scalar my $RDFS_LABEL => 'rdfs:label';
74: Readonly::Scalar my $RDFS_LABEL_FULL => 'http://www.w3.org/2000/01/rdf-schema#label';
75:
76: # ===========================================================================
77: # CONFIGURATION
78: # ===========================================================================
79: # Callers may override any key before calling an exported function, or inject
80: # a full replacement via Object::Configure->configure('Schema::Validator', \%h).
81: # ---------------------------------------------------------------------------
82: our %config = (
83:     cache_file => $DEFAULT_CACHE_FILE,
84:     cache_duration => $DEFAULT_CACHE_DURATION,
85:     vocab_url => $DEFAULT_VOCAB_URL,
86:     ua_timeout => $DEFAULT_UA_TIMEOUT,
87: );
88:
89: # ===========================================================================
90: # PUBLIC INTERFACE (POD + code)
91: # ===========================================================================
92:
93: =head1 NAME
94:
95: Schema::Validator - Tools for validating and loading Schema.org vocabulary definitions
96:
97: =head1 VERSION
98:
99: Version 0.03
100:
101: =head1 SYNOPSIS
102:
103:     use Schema::Validator qw(is_valid_datetime load_dynamic_vocabulary);
104:
105:     # Validate a date or datetime string
106:     if (is_valid_datetime('2024-11-14')) {
107:         print "Valid date\n";
108:     }
109:
110:     # Load and query the Schema.org vocabulary
111:     my $classes = load_dynamic_vocabulary();
112:     if (exists $classes->{'Person'}) {
113:         print "Person class is defined\n";
114:     }
115:
116:     # Override a config value for a single call
117:     my $classes = load_dynamic_vocabulary(ua_timeout => 60);
118:
119: =head1 DESCRIPTION
120:
121: C<Schema::Validator> provides two utilities for working with Schema.org
122: structured data:
123:
124: =over 4
125:
126: =item * L</is_valid_datetime> -- validates a string against the ISO 8601
127: date/datetime subset used by Schema.org.
128:
129: =item * L</load_dynamic_vocabulary> -- downloads (and caches for 24 hours)
130: the full Schema.org JSON-LD vocabulary and exposes all class and property
131: definitions as a hashref and via package globals.
132:
133: =back
134:
135: =head2 Configuration
136:
137: Runtime behaviour is controlled by the package-level C<%Schema::Validator::config>
138: hash. Supported keys and their defaults:
139:
140:     cache_file => "$CACHEDIR/schemaorg_dynamic_vocabulary.jsonld" # or tmpdir
141:     cache_duration => 86400 # seconds
142:     vocab_url => 'https://schema.org/.../schemaorg-current-https.jsonld'
143:     ua_timeout => 30 # seconds
144:
145: Override any key before calling an exported function:
146:
147:     $Schema::Validator::config{ua_timeout} = 60;
148:
149: Or supply a complete replacement via L<Object::Configure>:
150:
151:     Object::Configure->configure('Schema::Validator', \%my_config);
152:
153: =head1 PACKAGE VARIABLES
154:
155: =head2 %dynamic_schema
156:
157: Package hash keyed by Schema.org class label (e.g. C<Person>, C<Event>).
158: Values are the raw item hashrefs from the JSON-LD C<@graph> array.
159: Populated as a side-effect of L</load_dynamic_vocabulary>.
160:
161: =head2 %dynamic_properties
162:
163: Package hash keyed by Schema.org property label (e.g. C<name>, C<startDate>).
164: Values are the raw item hashrefs from the JSON-LD C<@graph> array.
165: Populated as a side-effect of L</load_dynamic_vocabulary>.
166:
167: =head1 FUNCTIONS
168:
169: =head2 is_valid_datetime
170:
171: =head3 PURPOSE
172:
173: Tests whether a scalar string conforms to one of the ISO 8601
174: date or datetime formats accepted by Schema.org:
175:
176:     YYYY-MM-DD           (date only)
177:     YYYY-MM-DDTHH:MM     (T separator, no seconds)
178:     YYYY-MM-DD HH:MM     (space separator, no seconds)
179:     YYYY-MM-DDTHH:MM:SS  (T separator, with seconds)
180:     YYYY-MM-DD HH:MM:SS  (space separator, with seconds)
181:
182: Optional timezone designators (C<Z>, C<+HH:MM>, C<-HH:MM>) are B<accepted>.
183: Calendar sanity B<is> enforced: out-of-range values (e.g. month 99) are B<rejected>.
184:
185: =head3 ARGUMENTS
186:
187: =over 4
188:
189: =item * C<string> (required, scalar) -- the candidate string to test.
190: Both positional (C<is_valid_datetime('2024-11-14')>) and named
191: (C<is_valid_datetime(string =E<gt> '2024-11-14')>) calling conventions
192: are accepted.
193:
194: =back
195:
196: =head3 RETURNS
197:
198: C<1> if the string is in a supported format; C<0> otherwise.
199: Returns C<0> for C<undef> or an empty string without throwing.
200:
201: =head3 SIDE EFFECTS
202:
203: None.
204:
205: =head3 NOTES
206:
207: Delegates to C<DateTime::Format::ISO8601-E<gt>parse_datetime()> for semantic
208: validation, so out-of-range values (e.g. month 99) are rejected.
209: The space-separator variant (C<YYYY-MM-DD HH:MM>) is normalised to a T
210: separator before parsing since the module requires strict ISO 8601.
211: Timezone designators (C<Z>, C<+HH:MM>, C<-HH:MM>) are now accepted.
212:
213: =head3 EXAMPLE
214:
215:     use Schema::Validator qw(is_valid_datetime);
216:
217:     is_valid_datetime('2024-11-14'); # 1
218:     is_valid_datetime('2024-11-14T15:30:00'); # 1
219:     is_valid_datetime('2024-11-14 15:30'); # 1 (space sep normalised)
220:     is_valid_datetime('2024-11-14T15:30:00Z'); # 1 (UTC timezone)
221:     is_valid_datetime('2024-11-14T15:30:00+01:00'); # 1 (offset timezone)
222:     is_valid_datetime('2024-99-01'); # 0 (invalid month)
223:     is_valid_datetime('28/06/2025'); # 0
224:     is_valid_datetime(undef); # 0 (no exception)
225:     is_valid_datetime(''); # 0 (no exception)
226:
227:     # Named calling convention
228:     is_valid_datetime(string => '2024-11-14'); # 1
229:
230: =head3 API SPECIFICATION
231:
232: =head4 Input (Params::Validate::Strict)
233:
234:     {
235:         string => {
236:             type => 'string',
237:             optional => 0,
238:         },
239:     }
240:
241: =head4 Output (Return::Set)
242:
243:     {
244:         type => 'boolean',
245:         description => '1 (valid) or 0 (invalid, undef, or empty input)'
246:     }
247:
248: =cut
249:
250: sub is_valid_datetime {
251:     # Accept both positional (is_valid_datetime($s)) and named
252:     # (is_valid_datetime(string => $s)) calling conventions.
253:     # Validate: value must be a scalar or undef (undef returns 0 cleanly below).
254:     my $p = validate_strict(
255:         input => get_params('string', \@_),
256:         schema => { 'string' => { type => 'string', optional => 0 } },
257:     );
258:
259:     my $string = $p->{string};
260:
261:     # Treat undef or empty string as invalid without throwing.
262:     return 0 unless defined $string && length $string;
263:
264:     # Normalise the space-separator variant to T before handing off to the
265:     # module, which requires strict ISO 8601 (T separator only).
266:     (my $normalised = $string) =~ s/^(\d{4}-\d{2}-\d{2}) (?=\d{2}:)/$1T/;
267:
268:     # Delegate to DateTime::Format::ISO8601 for full semantic validation;
269:     # a truthy (DateTime) object means valid, undef/$@ means invalid.
270:     return eval { DateTime::Format::ISO8601->parse_datetime($normalised) } ? 1 : 0;
271: }
272:
273: # ===========================================================================
274:
275: =head2 load_dynamic_vocabulary
276:
277: =head3 PURPOSE
278:
279: Downloads the complete Schema.org vocabulary from the official JSON-LD
280: endpoint, parses it into class and property lookup tables, caches the raw
281: JSON-LD locally, and returns the class table as a hashref.
282:
283: The cache is considered fresh for C<cache_duration> seconds (default 24 hours).
284: On network failure the function falls back to a stale cache rather than
285: returning an empty result, and emits a C<carp> warning.
286:
287: =head3 ARGUMENTS
288:
289: All arguments are optional; defaults come from C<%Schema::Validator::config>.
290:
291: =over 4
292:
293: =item * C<cache_file> (optional, scalar) -- path to the local cache file.
294: Defaults to C<$config{cache_file}>: C<$CACHEDIR/schemaorg_dynamic_vocabulary.jsonld>
295: if C<$ENV{CACHEDIR}> is set, otherwise C<File::Spec-E<gt>tmpdir()> is used.
296:
297: =item * C<cache_duration> (optional, scalar) -- cache validity window in seconds.
298: Defaults to C<$config{cache_duration}>.
299:
300: =item * C<vocab_url> (optional, scalar) -- URL of the JSON-LD vocabulary endpoint.
301: Defaults to C<$config{vocab_url}>.
302:
303: =item * C<ua_timeout> (optional, scalar) -- LWP::UserAgent timeout in seconds.
304: Defaults to C<$config{ua_timeout}>.
305:
306: =back
307:
308: Both zero-argument and named calling conventions are supported:
309:
310:     load_dynamic_vocabulary();
311:     load_dynamic_vocabulary(ua_timeout => 60);
312:
313: =head3 RETURNS
314:
315: A hashref mapping class labels (e.g. C<'Person'>) to their raw JSON-LD
316: definition hashrefs from the C<@graph> array.
317:
318: Returns an empty hashref C<{}> on all failure paths (network unreachable,
319: no cache, JSON parse error). Never throws.
320:
321: =head3 SIDE EFFECTS
322:
323: =over 4
324:
325: =item * Populates C<%Schema::Validator::dynamic_schema> with class definitions.
326:
327: =item * Populates C<%Schema::Validator::dynamic_properties> with property definitions.
328:
329: =item * Creates or updates the local cache file on a successful download.
330:
331: =item * Emits C<carp> warnings on network failures, I/O errors, or JSON
332: parse errors.
333:
334: =back
335:
336: =head3 NOTES
337:
338: The default cache directory is determined once at module load time: the
339: C<$CACHEDIR> environment variable is used if set; otherwise C<File::Spec-E<gt>tmpdir()>
340: is used (typically C</tmp> on Unix). Override for the session with:
341:
342:     $Schema::Validator::config{cache_file} = '/my/path/vocab.jsonld';
343:
344: The C<bin/validate-schema> CLI tool imports this function from the module and
345: uses C<cache_file =E<gt> $path> to store its cache under C<~/.cache/schema_validator/>.
346:
347: =head3 EXAMPLE
348:
349:     use Schema::Validator qw(load_dynamic_vocabulary);
350:
351:     my $classes = load_dynamic_vocabulary();
352:     printf "%d classes loaded\n", scalar keys %{$classes};
353:
354:     # Check for a specific class in the returned hashref
355:     print "Has Person\n" if exists $classes->{'Person'};
356:
357:     # Or query the package globals directly after the call
358:     Schema::Validator::load_dynamic_vocabulary();
359:     my @names = sort keys %Schema::Validator::dynamic_schema;
360:
361: =head3 API SPECIFICATION
362:
363: =head4 Input (Params::Validate::Strict)
364:
365:     {
366:         cache_file => { type => 'string', optional => 1 },
367:         cache_duration => { type => 'integer', optional => 1 },
368:         vocab_url => { type => 'string', optional => 1 },
369:         ua_timeout => { type => 'integer', optional => 1 },
370:     }
371:
372: =head4 Output (Return::Set)
373:
374:     {
375:         type => 'hashref',
376:         description => 'class-label => JSON-LD item hashref'
377:         # ON_FAILURE => 'empty hashref {}; never throws'
378:         # SIDE_EFFECTS => 'populates %dynamic_schema and %dynamic_properties'
379:     }
380:
381: =cut
382:
383: sub load_dynamic_vocabulary {
● 384 → 387 → 400 ● 384 → 387 → 0
384:     my $params;
385:
386:     # Validate types of any supplied overrides (all are optional scalars).
387:     if(scalar(@_)) {
Mutants (Total: 2, Killed: 0, Survived: 2)
- BOOL_NEGATE_262_2: Negate boolean return expression
MEDIUM: Add tests asserting both true and false outcomes
🧪 Suggested Test
# Boolean branch test suggestion
ok( !func(INPUT), 'Verify boolean branch behaviour' );
- RETURN_UNDEF_262_2: Replace return expression with undef
LOW: Mutation survived, but impact may be minor
🧪 Suggested Test
# Return value assertion
is( func(INPUT), EXPECTED, 'Verify correct return value' );
388:         $params = validate_strict(
389:             input => get_params(undef, \@_),
390:             schema => {
391:                 cache_file => { type => 'string', optional => 1 },
392:                 cache_duration => { type => 'integer', optional => 1 },
393:                 vocab_url => { type => 'string', optional => 1 },
394:                 ua_timeout => { type => 'integer', optional => 1 },
395:             }
396:         );
397:     }
398:
399:     # Merge caller overrides with module-level configuration defaults.
● 400 → 409 → 415 ● 400 → 409 → 0
400:     my $cache_file = $params->{cache_file} // $config{cache_file};
401:     my $cache_duration = $params->{cache_duration} // $config{cache_duration};
402:     my $vocab_url = $params->{vocab_url} // $config{vocab_url};
403:     my $ua_timeout = $params->{ua_timeout} // $config{ua_timeout};
404:
405:     my $content;
406:
407:     # Attempt to read the cache if it exists and is still fresh. The -e/stat
408:     # check and the read are separate steps, so a read failure is caught below.
409:     if (-e $cache_file && (time - (stat($cache_file))[9] < $cache_duration)) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_387_2: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
410:         eval { $content = _slurp_file($cache_file) };
411:         carp "Could not read cache '$cache_file': $@" if $@;
412:     }
413:
414:     # If no usable content yet, try to download the vocabulary.
● 415 → 415 → 436 ● 415 → 415 → 0
415:     unless (defined $content) {
Mutants (Total: 4, Killed: 0, Survived: 4)
- NUM_BOUNDARY_409_55_>: Numeric boundary flip < to >
HIGH: Likely missing edge-case test (boundary value)
🧪 Suggested Test
# Boundary test suggestion
is( func(VALUE_AT_BOUNDARY), EXPECTED, 'Test boundary behaviour' );
- NUM_BOUNDARY_409_55_<=: Numeric boundary flip < to <=
HIGH: Likely missing edge-case test (boundary value)
🧪 Suggested Test
# Boundary test suggestion
is( func(VALUE_AT_BOUNDARY), EXPECTED, 'Test boundary behaviour' );
- NUM_BOUNDARY_409_55_>=: Numeric boundary flip < to >=
HIGH: Likely missing edge-case test (boundary value)
🧪 Suggested Test
# Boundary test suggestion
is( func(VALUE_AT_BOUNDARY), EXPECTED, 'Test boundary behaviour' );
- COND_INV_409_2: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
416:         $content = _fetch_url($vocab_url, $ua_timeout);
417:
418:         if (defined $content) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_415_2: Invert condition unless to if
MEDIUM: Add tests asserting both true and false outcomes
419:             # Persist the download to the cache (best-effort; warn, do not die).
420:             eval { _spit_file($cache_file, $content) };
421:             carp "Could not write cache '$cache_file': $@" if $@;
422:         } else {
423:             # Network failed; fall back to a stale cache if one exists.
424:             if (-e $cache_file) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_418_3: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
425:                 eval { $content = _slurp_file($cache_file) };
426:                 if ($@) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_424_4: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
427:                     carp "Could not read stale cache '$cache_file': $@";
428:                 } else {
429:                     carp "Network unavailable; using stale cache '$cache_file'";
430:                 }
431:             }
432:         }
433:     }
434:
435:     # All content-acquisition strategies failed; return empty result.
● 436 → 436 → 442 ● 436 → 436 → 0
436:     unless (defined $content) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_426_5: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
437:         carp 'load_dynamic_vocabulary: no vocabulary content available';
438:         return {};
439:     }
440:
441:     # Parse the JSON; treat errors as non-fatal warnings.
● 442 → 443 → 451 ● 442 → 443 → 0
442:     my $data = eval { decode_json($content) };
443:     if ($@) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_436_2: Invert condition unless to if
MEDIUM: Add tests asserting both true and false outcomes
444:         carp "Failed to parse vocabulary JSON: $@";
445:         return {};
446:     }
447:
448:     # Guard against decode_json returning a non-object (e.g. a JSON array,
449:     # a bare number, or any other non-hash type). Calling exists on a
450:     # non-hashref dies; catching it here keeps the "never throws" contract.
● 451 → 451 → 457 ● 451 → 451 → 0
451:     unless (ref($data) eq 'HASH') {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_443_2: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
452:         carp "Vocabulary JSON is not a JSON object";
453:         return {};
454:     }
455:
456:     # Confirm the expected JSON-LD graph structure is present.
● 457 → 457 → 463 ● 457 → 457 → 0
457:     unless (exists $data->{$AT_GRAPH} && ref($data->{$AT_GRAPH}) eq 'ARRAY') {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_451_2: Invert condition unless to if
MEDIUM: Add tests asserting both true and false outcomes
458:         carp "Vocabulary JSON is missing the '\@graph' array";
459:         return {};
460:     }
461:
462:     # Delegate parsing to the internal graph processor.
● 463 → 477 → 0
463:     my ($classes, $props) = _parse_graph($data->{$AT_GRAPH});
464:
465:     # Populate package globals as documented side-effects.
466:     %dynamic_schema = %{$classes};
467:     %dynamic_properties = %{$props};
468:
469:     # Report the result count via carp (informational, not an error).
470:     carp sprintf(
471:         'Dynamic vocabulary loaded: %d classes, %d properties',
472:         scalar(keys %dynamic_schema),
473:         scalar(keys %dynamic_properties),
474:     );
475:
476:     # Return the class hashref; callers needing properties use the global.
477:     return $classes;
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_457_2: Invert condition unless to if
MEDIUM: Add tests asserting both true and false outcomes
478: }
479:
480: # ===========================================================================
481: # INTERNAL HELPERS
482: # All routines below begin with _ and are not part of the public API.
483: # ===========================================================================
484:
485: # ---------------------------------------------------------------------------
486: # _slurp_file($path)
487: #
488: # Purpose:  Read the complete contents of a file into a scalar.
489: # Entry:    $path is a path to an existing, readable file.
490: # Returns:  The file contents as a scalar string.
491: # Side fx:  None beyond reading the file.
492: # Notes:    autodie causes open/close to throw on failure; callers should
493: #           wrap in eval { } and handle $@ if a non-fatal path is needed.
494: # ---------------------------------------------------------------------------
495: sub _slurp_file {
496:     my ($path) = @_;
497:
498:     # Open the file; autodie will throw if this fails.
499:     open my $fh, '<', $path;
500:
501:     # Temporarily undefine $/ to read the whole file in one operation.
502:     local $/;
503:     my $content = <$fh>;
504:
505:     close $fh;
506:     return $content;
Mutants (Total: 2, Killed: 0, Survived: 2)
- BOOL_NEGATE_477_2: Negate boolean return expression
MEDIUM: Add tests asserting both true and false outcomes
🧪 Suggested Test
# Boolean branch test suggestion
ok( !func(INPUT), 'Verify boolean branch behaviour' );
- RETURN_UNDEF_477_2: Replace return expression with undef
LOW: Mutation survived, but impact may be minor
🧪 Suggested Test
# Return value assertion
is( func(INPUT), EXPECTED, 'Verify correct return value' );
507: }
508:
509: # ---------------------------------------------------------------------------
510: # _spit_file($path, $content)
511: #
512: # Purpose:  Write a scalar string to a file, creating or truncating it.
513: # Entry:    $path is a writable path; $content is a defined scalar.
514: # Returns:  1 on success.
515: # Side fx:  Creates or overwrites $path.
516: # Notes:    autodie causes open/close to throw on failure; wrap in eval
517: #           when the write is non-critical (e.g. cache population).
518: # ---------------------------------------------------------------------------
519: sub _spit_file {
520:     my ($path, $content) = @_;
521:
522:     # Open for writing; autodie throws on permission or path errors.
523:     open my $fh, '>', $path;
524:     print $fh $content;
525:     close $fh;
526:
527:     return 1;
Mutants (Total: 2, Killed: 0, Survived: 2)
- BOOL_NEGATE_506_2: Negate boolean return expression
MEDIUM: Add tests asserting both true and false outcomes
🧪 Suggested Test
# Boolean branch test suggestion
ok( !func(INPUT), 'Verify boolean branch behaviour' );
- RETURN_UNDEF_506_2: Replace return expression with undef
LOW: Mutation survived, but impact may be minor
🧪 Suggested Test
# Return value assertion
is( func(INPUT), EXPECTED, 'Verify correct return value' );
528: }
529:
530: # ---------------------------------------------------------------------------
531: # _fetch_url($url, $timeout)
532: #
533: # Purpose:  Perform an HTTP GET and return the decoded response body.
534: # Entry:    $url is a valid absolute HTTP/HTTPS URL; $timeout is a positive
535: #           integer (seconds).
536: # Returns:  Decoded response content on success; undef on HTTP error.
537: # Side fx:  Network I/O; emits carp on non-success HTTP status.
538: # Notes:    Transport-level errors (DNS failure, TLS error) may propagate as
539: #           exceptions from LWP::UserAgent; callers should wrap in eval if
540: #           they need a guaranteed non-throwing call.
541: # ---------------------------------------------------------------------------
542: sub _fetch_url {
● 543 → 550 → 555 ● 543 → 550 → 0
543:     my ($url, $timeout) = @_;
544:
545:     # Build a minimal UA; timeout prevents indefinite hangs.
546:     my $ua = LWP::UserAgent->new(timeout => $timeout);
547:     my $res = $ua->get($url);
548:
549:     # Treat any non-2xx status as a soft failure so callers can try fallbacks.
550:     unless ($res->is_success) {
Mutants (Total: 2, Killed: 0, Survived: 2)
- BOOL_NEGATE_527_2: Negate boolean return expression
MEDIUM: Add tests asserting both true and false outcomes
🧪 Suggested Test
# Boolean branch test suggestion
ok( !func(INPUT), 'Verify boolean branch behaviour' );
- RETURN_UNDEF_527_2: Replace return expression with undef
LOW: Mutation survived, but impact may be minor
🧪 Suggested Test
# Return value assertion
is( func(INPUT), EXPECTED, 'Verify correct return value' );
551:         carp "Failed to fetch '$url': ", $res->status_line;
552:         return;
553:     }
554:
● 555 → 555 → 0
555:     return $res->decoded_content;
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_550_2: Invert condition unless to if
MEDIUM: Add tests asserting both true and false outcomes
556: }
557:
558: # ---------------------------------------------------------------------------
559: # _extract_label($item)
560: #
561: # Purpose:  Extract the rdfs:label string from a JSON-LD graph item hashref.
562: # Entry:    $item is a hashref that may contain 'rdfs:label' or the full
563: #           URI equivalent key.
564: # Returns:  The label as a plain string, or undef if no label is found.
565: # Side fx:  None.
566: # Notes:    Schema.org JSON-LD may represent the label as a scalar string or
567: #           as an array (for multi-language entries); this function always
568: #           returns the first (or only) value.
569: # ---------------------------------------------------------------------------
570: sub _extract_label {
571:     my ($item) = @_;
572:
573:     # Try the compact key first; fall back to the full RDF URI form.
574:     my $label = $item->{$RDFS_LABEL} // $item->{$RDFS_LABEL_FULL};
575:     return unless defined $label;
576:
577:     # If the label is multi-valued, take the first entry.
578:     return ref($label) eq 'ARRAY' ? $label->[0] : $label;
579: }
580:
581: # ---------------------------------------------------------------------------
582: # _parse_graph(\@graph)
583: #
584: # Purpose:  Iterate over a JSON-LD @graph array and partition items into
585: #           Schema.org class definitions and property definitions.
586: # Entry:    $graph_ref is an arrayref of item hashrefs as decoded from the
587: #           Schema.org JSON-LD vocabulary.
588: # Returns:  Two hashrefs: (\%classes, \%properties), each keyed by label.
589: #           Items are also indexed by the short name extracted from their
590: #           @id URI so that both 'MusicEvent' and its label resolve correctly.
591: # Side fx:  None.
592: # Notes:    Items with no recognisable label or @type are silently skipped.
593: #           The @id short-name index uses //= so the label always wins if
594: #           it differs.
595: # ---------------------------------------------------------------------------
596: sub _parse_graph {
● 597 → 602 → 637 ● 597 → 602 → 0
597:     my ($graph_ref) = @_;
598:
599:     my (%classes, %props);
600:
601:     # Iterate every item in the JSON-LD graph array.
602:     for my $item (@{$graph_ref}) {
603:
604:         # Skip items that do not declare an RDF type.
605:         next unless exists $item->{'@type'};
606:         my $item_type = $item->{'@type'};
607:
608:         # Normalise @type: the spec allows either a scalar or an array.
609:         my @types = ref($item_type) eq 'ARRAY' ? @{$item_type} : ($item_type);
610:
611:         # Extract the human-readable label; skip items with none.
612:         my $label = _extract_label($item) or next;
613:
614:         # Index rdfs:Class items under their label and their @id short name.
615:         if (grep { $_ eq $RDF_CLASS } @types) {
Mutants (Total: 2, Killed: 0, Survived: 2)
- BOOL_NEGATE_555_2: Negate boolean return expression
MEDIUM: Add tests asserting both true and false outcomes
🧪 Suggested Test
# Boolean branch test suggestion
ok( !func(INPUT), 'Verify boolean branch behaviour' );
- RETURN_UNDEF_555_2: Replace return expression with undef
LOW: Mutation survived, but impact may be minor
🧪 Suggested Test
# Return value assertion
is( func(INPUT), EXPECTED, 'Verify correct return value' );
616:             $classes{$label} = $item;
617:
618:             # Secondary index by short URI fragment (e.g. 'MusicGroup').
619:             if (my $id = $item->{'@id'}) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_615_3: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
620:                 (my $short = $id) =~ s{.*/}{};
621:                 $classes{$short} //= $item;
622:             }
623:         }
624:
625:         # Index rdf:Property items under their label and @id short name.
626:         if (grep { $_ eq $RDF_PROPERTY } @types) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_619_4: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
627:             $props{$label} = $item;
628:
629:             # Secondary index by short URI fragment (e.g. 'startDate').
630:             if (my $id = $item->{'@id'}) {
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_626_3: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
631:                 (my $short = $id) =~ s{.*/}{};
632:                 $props{$short} //= $item;
633:             }
634:         }
635:     }
636:
● 637 → 637 → 0
637:     return (\%classes, \%props);
638: }
639:
640: # ===========================================================================
641: # END OF MODULE POD
642: # ===========================================================================
643:
644: =encoding utf-8
645:
646: =head1 FILES
647:
648: =head2 schemaorg_dynamic_vocabulary.jsonld
649:
650: Cache file written to C<$CACHEDIR> (if set) or the system temporary directory
651: (C<File::Spec-E<gt>tmpdir()>), unless overridden via C<$config{cache_file}>.
652: Contains the downloaded Schema.org vocabulary in JSON-LD format. Refreshed
653: when older than C<$config{cache_duration}> seconds.
654:
655: =head1 ERROR HANDLING
656:
657: The module uses C<carp> rather than C<die> for recoverable failures:
658:
659: =over 4
660:
661: =item * Failed HTTP requests emit C<carp> and trigger the stale-cache fallback.
662:
663: =item * JSON parse errors emit C<carp> and return C<{}>.
664:
665: =item * File I/O errors emit C<carp>; the download path is attempted next.
666:
667: =item * C<croak> is reserved for programmer errors (bad argument types).
668:
669: =back
670:
671: =head1 BUGS
672:
673: =over 4
674:
675: =item * Cache invalidation is time-based only; no checksum or version check.
676:
677: =back
678:
679: =head1 SEE ALSO
680:
681: =over 4
682:
683: =item * L<Test Dashboard|https://nigelhorne.github.io/Schema-Validator/coverage/>
684:
685: =back
686:
687: =head1 REPOSITORY
688:
689: L<https://github.com/nigelhorne/schema-validator>
690:
691: =head2 FORMAL SPECIFICATION
692:
693: =head3 is_valid_datetime
694:
695: Let CHAR denote the set of all Unicode code points and
696: DIGIT = { c : CHAR | c in {'0'..'9'} }.
697: Let seqN(N, S) = { s : seq S | #s = N }.
698:
699:     YEAR ≙ seqN(4, DIGIT)
700:     MONTH ≙ seqN(2, DIGIT)
701:     DAY ≙ seqN(2, DIGIT)
702:     HOUR ≙ seqN(2, DIGIT)
703:     MINUTE ≙ seqN(2, DIGIT)
704:     SECOND ≙ seqN(2, DIGIT)
705:     SEP ≙ { 'T', ' ' }
706:
707:     DATE ≙ { d : seq CHAR | ∃ y ∈ YEAR; mo ∈ MONTH; dy ∈ DAY
708:              • d = y ⌢ ⟨'-'⟩ ⌢ mo ⌢ ⟨'-'⟩ ⌢ dy }
709:
710:     HHMM ≙ { t : seq CHAR | ∃ h ∈ HOUR; m ∈ MINUTE
711:              • t = h ⌢ ⟨':'⟩ ⌢ m }
712:
713:     HHMMSS ≙ { t : seq CHAR | ∃ h ∈ HOUR; m ∈ MINUTE; s ∈ SECOND
714:                • t = h ⌢ ⟨':'⟩ ⌢ m ⌢ ⟨':'⟩ ⌢ s }
715:
716:     TIMEFRAG ≙ { tf : seq CHAR | ∃ sep ∈ SEP; hm ∈ (HHMM ∪ HHMMSS)
717:                  • tf = ⟨sep⟩ ⌢ hm }
718:
719:     DATETIME ≙ DATE ∪ { dt : seq CHAR | ∃ d ∈ DATE; tf ∈ TIMEFRAG
720:                         • dt = d ⌢ tf }
721:
722:     ──────────────────────────────────────────────────────────────
723:     IsValidDatetime
724:     ──────────────────────────────────────────────────────────────
725:     str? : seq CHAR
726:     result! : B
727:     ──────────────────────────────────────────────────────────────
728:     result! ⟺ str? ∈ DATETIME
729:     ──────────────────────────────────────────────────────────────
730:
731: =head3 load_dynamic_vocabulary
732:
733: Let FILE, DUR, URL be the resolved config values.
734: Let now : N be the current UNIX epoch time.
735: Let mtime : PATH -> N map a path to its last-modification time.
736: Let readable, writeable : PATH -> B be filesystem predicates.
737: Let reachable : URL -> B test HTTP reachability.
738: Let slurp : PATH -> seq CHAR and spit : PATH x seq CHAR -> 1.
739: Let fetch : URL x N -> seq CHAR (second arg is timeout).
740: Let decode_json : seq CHAR -> ITEM.
741: Let label : ITEM -> (LABEL | {}) extract rdfs:label.
742: Let types : ITEM -> P TYPE extract @type values.
743:
744:     FRESH ≙ ( -e(FILE) ) ∧ ( (now - mtime(FILE)) < DUR )
745:
746:     ──────────────────────────────────────────────────────────────────────
747:     LoadDynamicVocabulary
748:     ──────────────────────────────────────────────────────────────────────
749:     ΔVocabularyStore
750:     cache_file? : PATH
751:     cache_duration? : N
752:     vocab_url? : URL
753:     ua_timeout? : N
754:     result! : CLASS_LABEL ⇸ ITEM
755:     ──────────────────────────────────────────────────────────────────────
756:     content : seq CHAR
757:
758:     FRESH ∧ readable(cache_file?)
759:       ⇒ content = slurp(cache_file?)
760:
761:     ¬FRESH ∧ reachable(vocab_url?)
762:       ⇒ content = fetch(vocab_url?, ua_timeout?)
763:         ∧ ( writeable(cache_file?) ⇒ spit(cache_file?, content) )
764:
765:     ¬FRESH ∧ ¬reachable(vocab_url?) ∧ -e(cache_file?)
766:       ⇒ content = slurp(cache_file?)
767:
768:     graph ≙ (decode_json content)[AT_GRAPH]
769:
770:     dynamic_schema' =
771:       { item ∈ graph | RDF_CLASS ∈ types(item) ∧ label(item) ≠ ∅
772:         • label(item) ↦ item }
773:
774:     dynamic_properties' =
775:       { item ∈ graph | RDF_PROPERTY ∈ types(item) ∧ label(item) ≠ ∅
776:         • label(item) ↦ item }
777:
778:     result! = dynamic_schema'
779:     ──────────────────────────────────────────────────────────────────────
780:
781: =head1 AUTHOR
782:
783: Nigel Horne, C<< <njh at nigelhorne.com> >>
784:
785: =head1 LICENCE AND COPYRIGHT
786:
787: Copyright 2025-2026 Nigel Horne.
788:
789: Usage is subject to the GPL2 licence terms.
790: If you use it,
791: please let me know.
792:
793: =cut
794:
795: 1;
Mutants (Total: 1, Killed: 0, Survived: 1)
- COND_INV_630_4: Invert condition if to unless
MEDIUM: Add tests asserting both true and false outcomes
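
The three NUM_BOUNDARY mutants reported against the freshness check on line 409 (flips of < to >, <= and >=) all survived, and a test that pins the behaviour at exactly cache_duration seconds is the usual way to kill that kind of mutant. The following is a minimal sketch of such a boundary test, not a test from the module's own suite. It does not load Schema::Validator: cache_is_fresh and fresh_at_age are hypothetical helpers that copy the line-409 predicate and take the current time as an argument, an assumption made here so that file ages can be set deterministically with utime.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Test::More;

# Hypothetical helper (not part of Schema::Validator): the freshness
# predicate from line 409, with the clock passed in instead of calling time.
sub cache_is_fresh {
    my ($path, $max_age, $now) = @_;
    return 0 unless -e $path;
    return ($now - (stat($path))[9] < $max_age) ? 1 : 0;
}

# Hypothetical helper: set the file's mtime so that it is exactly $age
# seconds old relative to $now, then evaluate the predicate.
sub fresh_at_age {
    my ($path, $age, $max_age, $now) = @_;
    utime($now - $age, $now - $age, $path) or die "utime failed: $!";
    return cache_is_fresh($path, $max_age, $now);
}

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

my $now     = time;
my $max_age = 86_400;    # mirrors the default cache_duration

# One second inside the window: fails under the '>' and '>=' flips.
is(fresh_at_age($path, $max_age - 1, $max_age, $now), 1, 'just inside the window is fresh');

# Exactly on the boundary: fails under the '<=' flip.
is(fresh_at_age($path, $max_age, $max_age, $now), 0, 'exactly at the boundary is stale');

# One second past the window.
is(fresh_at_age($path, $max_age + 1, $max_age, $now), 0, 'just past the window is stale');

# A missing file is never fresh, whatever its nominal age.
is(cache_is_fresh("$path.missing", $max_age, $now), 0, 'missing cache file is not fresh');

done_testing();
```

To exercise the real code path instead, the same ages could be applied to a temporary file passed as cache_file, with the network fetch stubbed out; how that stubbing is done depends on the test suite and is not shown here.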