File Coverage

File:blib/lib/App/Test/Generator.pm
Coverage:57.5%

linestmtbrancondsubtimecode
1package App::Test::Generator;
2
3# TODO: Test validator from Params::Validate::Strict 0.16
4# TODO: $seed should be passed to Data::Random::String::Matches
5# TODO: positional args - when config_undef is set, see what happens when not all args are given
6
7
6
6
393231
8
use 5.014;
8
9
6
6
6
10
6
50
use strict;
10
6
6
6
11
6
120
use warnings;
11
6
6
6
1125
34665
13
use autodie qw(:all);
12
13
6
6
6
38682
561
13
use utf8;
14binmode STDOUT, ':utf8';
15binmode STDERR, ':utf8';
16
17
6
6
6
1181
2971
14
use open qw(:std :encoding(UTF-8));
18
19
6
6
6
36968
8
101
use App::Test::Generator::Template;
20
6
6
6
12
4
144
use Carp qw(carp croak);
21
6
6
6
1299
97722
88
use Config::Abstraction 0.36;
22
6
6
6
514
5713
133
use Data::Dumper;
23
6
6
6
11
5
80
use Data::Section::Simple;
24
6
6
6
10
4
133
use File::Basename qw(basename);
25
6
6
6
9
5
52
use File::Spec;
26
6
6
6
1317
53812
171
use Module::Load::Conditional qw(check_install can_load);
27
6
6
6
1315
45437
102
use Template;
28
6
6
6
709
4829
148
use YAML::XS qw(LoadFile);
29
30
6
6
6
12
4
168
use Exporter 'import';
31
32our @EXPORT_OK = qw(generate);
33
34our $VERSION = '0.20';
35
36use constant {
37
6
27468
        DEFAULT_ITERATIONS => 50,
38        DEFAULT_PROPERTY_TRIALS => 1000
39
6
6
12
2
};
40
41 - 1180
=head1 NAME

App::Test::Generator - Generate fuzz and corpus-driven test harnesses

=head1 VERSION

Version 0.20

=head1 SYNOPSIS

From the command line:

  fuzz-harness-generator -r t/conf/add.yml

  extract-schemas bin/extract-schemas lib/Sample/Module.pm; fuzz-harness-generator -r schemas/greet.yaml

From Perl:

  use App::Test::Generator qw(generate);

  # Generate to STDOUT
  App::Test::Generator::generate("t/conf/add.yml");

  # Generate directly to a file
  App::Test::Generator::generate('t/conf/add.yml', 't/add_fuzz.t');

=encoding utf8

=head1 OVERVIEW

This module takes a formal input/output specification for a routine or
method and automatically generates test cases. In effect, it allows you
to easily add comprehensive black-box tests in addition to the more
common white-box tests that are typically written for CPAN modules and other
subroutines.

The generated tests combine:

=over 4

=item * Random fuzzing based on input types

=item * Deterministic edge cases for min/max constraints

=item * Static corpus tests defined in Perl or YAML

=back

This approach strengthens your test suite by probing both expected and
unexpected inputs, helping you to catch boundary errors, invalid data
handling, and regressions without manually writing every case.

=head1 DESCRIPTION

This module implements the logic behind L<fuzz-harness-generator>.
It parses configuration files (fuzz and/or corpus YAML), and
produces a ready-to-run F<.t> test script to run through C<prove>.

It reads configuration files in any format,
and optional YAML corpus files.
All of the examples in this documentation are in C<YAML> format,
other formats may not work as they aren't so heavily tested.
It then generates a L<Test::Most>-based fuzzing harness combining:

=over 4

=item * Randomized fuzzing of inputs (with edge cases)

=item * Optional static corpus tests from Perl C<%cases> or YAML file (C<yaml_cases> key)

=item * Functional or OO mode (via C<$new>)

=item * Reproducible runs via C<$seed> and configurable iterations via C<$iterations>

=back

=head1 CONFIGURATION

The configuration file,
for each set of tests to be produced,
is a file containing a schema that can be read by L<Config::Abstraction>.

=head2 SCHEMA

The schema is split into several sections.

=head3 C<%input> - input params with keys => type/optional specs

When using named parameters

  input:
    name:
      type: string
      optional: false
    age:
      type: integer
      optional: true

Supported basic types used by the fuzzer: C<string>, C<integer>, C<float>, C<number>, C<boolean>, C<arrayref>, C<hashref>.
See also L<Params::Validate::Strict>.
You can add more custom types using properties.

For routines with one unnamed parameter

  input:
    type: string

For routines with more than one named parameter, use the C<position> keyword.

  module: Math::Simple::MinMax
  fuction: max

  input:
    left:
      type: number
      position: 0
    right:
      type: number
      position: 1

  output:
    type: number

The keyword C<undef> is used to indicate that the C<function> takes no arguments.

=head3 C<%output> - output param types for L<Return::Set> checking

  output:
    type: string

If the output hash contains the key _STATUS, and if that key is set to DIES,
the routine should die with the given arguments; otherwise, it should live.
If it's set to WARNS,
the routine should warn with the given arguments.
The output can be set to the string 'undef' if the routine should return the undefined value:

  ---
  module: Scalar::Util
  function: blessed

  input:
    type: string

  output: undef

The keyword C<undef> is used to indicate that the C<function> returns nothing.

=head3 C<%config> - optional hash of configuration.

The current supported variables are

=over 4

=item * C<test_nuls>, inject NUL bytes into strings (default: 1)

=item * C<test_undef>, test with undefined value (default: 1)

=item * C<test_empty>, test with empty strings (default: 1)

=item * C<test_non_ascii>, test with strings that contain non ascii characters (default: 1)

=item * C<dedup>, fuzzing can create duplicate tests, go some way to remove duplicates (default: 1)

=item * C<properties>, enable L<Test::LectroTest> Property tests (default: 0)

=back

=head3 C<%transforms> - list of transformations from input sets to output sets

Transforms allow you to define how input data should be transformed into output data.
This is useful for testing functions that convert between formats, normalize data,
or apply business logic transformations on a set of data to different set of data.
It takes a list of subsets of the input and output definitions,
and verifies that data from each input subset is correctly transformed into data from the matching output subset.

=head4 Transform Validation Rules

For each transform:

=over 4

=item 1. Generate test cases using the transform's input schema

=item 2. Call the function with those inputs

=item 3. Validate the output matches the transform's output schema

=item 4. If output has a specific 'value', check exact match

=item 5. If output has constraints (min/max), validate within bounds

=back

=head4 Example 1

  ---
  module: builtin
  function: abs

  config:
    test_undef: no
    test_empty: no
    test_nuls: no
    test_non_ascii: no

  input:
    number:
      type: number
      position: 0

  output:
    type: number
    min: 0

  transforms:
    positive:
      input:
        number:
          type: number
          position: 0
          min: 0
      output:
        type: number
        min: 0
    negative:
      input:
        number:
          type: number
          position: 0
          max: 0
      output:
        type: number
        min: 0
    error:
      input:
        undef
      output:
        _STATUS: DIES

If the output hash contains the key _STATUS, and if that key is set to DIES,
the routine should die with the given arguments; otherwise, it should live.
If it's set to WARNS, the routine should warn with the given arguments.

The keyword C<undef> is used to indicate that the C<function> returns nothing.

=head4 Example 2

  ---
  module: Math::Utils
  function: normalize_number

  input:
    value:
      type: number
      position: 0

  output:
    type: number

  transforms:
    positive_stays_positive:
      input:
        value:
          type: number
          min: 0
          max: 1000
      output:
        type: number
        min: 0
        max: 1

    negative_becomes_zero:
      input:
        value:
          type: number
          max: 0
      output:
        type: number
        value: 0

    preserves_zero:
      input:
        value:
          type: number
          value: 0
      output:
        type: number
        value: 0

=head3 C<$module>

The name of the module (optional).

Using the reserved word C<builtin> means you're testing a Perl builtin function.

If omitted, the generator will guess from the config filename:
C<My-Widget.conf> -> C<My::Widget>.

=head3 C<$function>

The function/method to test.

This defaults to C<run>.

=head3 C<%new>

An optional hashref of args to pass to the module's constructor.

  new:
    api_key: ABC123
    verbose: true

To ensure C<new()> is called with no arguments, you still need to define new, thus:

  module: MyModule
  function: my_function

  new:

=head3 C<%cases>

An optional Perl static corpus, when the output is a simple string (expected => [ args... ]).

Maps the expected output string to the input and _STATUS

  cases:
    ok:
      input: ping
      _STATUS: OK
    error:
      input: ""
      _STATUS: DIES

=head3 C<$yaml_cases> - optional path to a YAML file with the same shape as C<%cases>.

=head3 C<$seed>

An optional integer.
When provided, the generated C<t/fuzz.t> will call C<srand($seed)> so fuzz runs are reproducible.

=head3 C<$iterations>

An optional integer controlling how many fuzz iterations to perform (default 50).

=head3 C<%edge_cases>

An optional hash mapping of extra values to inject.

        # Two named parameters
        edge_cases:
                name: [ '', 'a' x 1024, \"\x{263A}" ]
                age: [ -1, 0, 99999999 ]

        # Takes a string input
        edge_cases: [ 'foo', 'bar' ]

Values can be strings or numbers; strings will be properly quoted.
Note that this only works with routines that take named parameters.

=head3 C<%type_edge_cases>

An optional hash mapping types to arrayrefs of extra values to try for any field of that type:

        type_edge_cases:
                string: [ '', ' ', "\t", "\n", "\0", 'long' x 1024, chr(0x1F600) ]
                number: [ 0, 1.0, -1.0, 1e308, -1e308, 1e-308, -1e-308, 'NaN', 'Infinity' ]
                integer: [ 0, 1, -1, 2**31-1, -(2**31), 2**63-1, -(2**63) ]

=head3 C<%edge_case_array>

Specify edge case values for routines that accept a single unnamed parameter.
This is specifically designed for simple functions that take one argument without a parameter name.
These edge cases supplement the normal random string generation, ensuring specific problematic values are always tested.
During fuzzing iterations, there's a 40% probability that a test case will use a value from edge_case_array instead of randomly generated data.

  ---
  module: Text::Processor
  function: sanitize

  input:
    type: string
    min: 1
    max: 1000

  edge_case_array:
    - "<script>alert('xss')</script>"
    - "'; DROP TABLE users; --"
    - "\0null\0byte"
    - "emoji😊test"
    - ""
    - " "

  seed: 42
  iterations: 50

=head3 Semantic Data Generators

For property-based testing with L<Test::LectroTest>,
you can use semantic generators to create realistic test data.
Fuzz testing support for C<semantic> entries is being developed.

  input:
    email:
      type: string
      semantic: email

    user_id:
      type: string
      semantic: uuid

    phone:
      type: string
      semantic: phone_us

=head4 Available Semantic Types

=over 4

=item * C<email> - Valid email addresses (user@domain.tld)

=item * C<url> - HTTP/HTTPS URLs

=item * C<uuid> - UUIDv4 identifiers

=item * C<phone_us> - US phone numbers (XXX-XXX-XXXX)

=item * C<phone_e164> - International E.164 format (+XXXXXXXXXXXX)

=item * C<ipv4> - IPv4 addresses (0.0.0.0 - 255.255.255.255)

=item * C<ipv6> - IPv6 addresses

=item * C<username> - Alphanumeric usernames with _ and -

=item * C<slug> - URL slugs (lowercase-with-hyphens)

=item * C<hex_color> - Hex color codes (#RRGGBB)

=item * C<iso_date> - ISO 8601 dates (YYYY-MM-DD)

=item * C<iso_datetime> - ISO 8601 datetimes (YYYY-MM-DDTHH:MM:SSZ)

=item * C<semver> - Semantic version strings (major.minor.patch)

=item * C<jwt> - JWT-like tokens (base64url format)

=item * C<json> - Simple JSON objects

=item * C<base64> - Base64-encoded strings

=item * C<md5> - MD5 hashes (32 hex chars)

=item * C<sha256> - SHA-256 hashes (64 hex chars)

=back

=head2 EDGE CASE GENERATION

In addition to purely random fuzz cases, the harness generates
deterministic edge cases for parameters that declare C<min>, C<max> or C<len> in their schema definitions.

For each constraint, three edge cases are added:

=over 4

=item * Just inside the allowable range

This case should succeed, since it lies strictly within the bounds.

=item * Exactly on the boundary

This case should succeed, since it meets the constraint exactly.

=item * Just outside the boundary

This case is annotated with C<_STATUS = 'DIES'> in the corpus and
should cause the harness to fail validation or croak.

=back

Supported constraint types:

=over 4

=item * C<number>, C<integer>, C<float>

Uses numeric values one below, equal to, and one above the boundary.

=item * C<string>

Uses strings of lengths one below, equal to, and one above the boundary.

=item * C<arrayref>

Uses references to arrays of with the number of elements one below, equal to, and one above the boundary.

=item * C<hashref>

Uses hashes with key counts one below, equal to, and one above the
boundary (C<min> = minimum number of keys, C<max> = maximum number
of keys).

=item * C<memberof> - arrayref of allowed values for a parameter

This example is for a routine called C<input()> that takes two arguments: C<status> and C<level>.
C<status> is a string that must have the value C<ok>, C<error> or C<pending>.
The C<level> argument is an integer that must be one of C<1>, C<5> or C<111>.

  ---
  input:
    status:
      type: string
      memberof:
        - ok
        - error
        - pending
    level:
      type: integer
      memberof:
        - 1
        - 5
        - 111

The generator will automatically create test cases for each allowed value (inside the member list),
and at least one value outside the list (which should die or C<croak>, C<_STATUS = 'DIES'>).
This works for strings, integers, and numbers.

=item * C<boolean> - automatic boundary tests for boolean fields

  input:
    flag:
      type: boolean

The generator will automatically create test cases for 0 and 1; true and false; off and on, and values that should trigger C<_STATUS = 'DIES'>.

=back

These edge cases are inserted automatically, in addition to the random
fuzzing inputs, so each run will reliably probe boundary conditions
without relying solely on randomness.

=head1 EXAMPLES

See the files in C<t/conf> for examples.

=head2 Adding Scheduled fuzz Testing with GitHub Actions to Your Code

To automatically create and run tests on a regular basis on GitHub Actions,
you need to create a configuration file for each method and subroutine that you're testing,
and a GitHub Actions configuration file.

This example takes you through testing the online_render method of L<HTML::Genealogy::Map>.

=head3 t/conf/online_render.yml

  ---

  module: HTML::Genealogy::Map
  function: onload_render

  input:
    gedcom:
      type: object
      can: individuals
    geocoder:
      type: object
      can: geocode
    debug:
      type: boolean
      optional: true
    google_key:
      type: string
      optional: true
      min: 39
      max: 39
      matches: "^AIza[0-9A-Za-z_-]{35}$"

  config:
    test_undef: 0

=head3 .github/actions/fuzz.t

  ---
  name: Fuzz Testing

  permissions:
    contents: read

  on:
    push:
      branches: [main, master]
    pull_request:
      branches: [main, master]
    schedule:
      - cron: '29 5 14 * *'

  jobs:
    generate-fuzz-tests:
      strategy:
        fail-fast: false
        matrix:
          os:
            - macos-latest
            - ubuntu-latest
            - windows-latest
          perl: ['5.42', '5.40', '5.38', '5.36', '5.34', '5.32', '5.30', '5.28', '5.22']

      runs-on: ${{ matrix.os }}
      name: Fuzz testing with perl ${{ matrix.perl }} on ${{ matrix.os }}

      steps:
        - uses: actions/checkout@v5

        - name: Set up Perl
          uses: shogo82148/actions-setup-perl@v1
          with:
            perl-version: ${{ matrix.perl }}

        - name: Install App::Test::Generator this module's dependencies
          run: |
            cpanm App::Test::Generator
            cpanm --installdeps .

        - name: Make Module
          run: |
            perl Makefile.PL
            make
          env:
            AUTOMATED_TESTING: 1
            NONINTERACTIVE_TESTING: 1

        - name: Generate fuzz tests
          run: |
            mkdir t/fuzz
            find t/conf -name '*.yml' | while read config; do
              test_name=$(basename "$config" .conf)
              fuzz-harness-generator "$config" > "t/fuzz/${test_name}_fuzz.t"
            done

        - name: Run generated fuzz tests
          run: |
            prove -lr t/fuzz/
          env:
            AUTOMATED_TESTING: 1
            NONINTERACTIVE_TESTING: 1

=head2 Fuzz Testing your CPAN Module

Running fuzz tests when you run C<make test> in your CPAN module.

Create a directory <t/conf> which contains the schemas.

Then create this file as <t/fuzz.t>:

  #!/usr/bin/env perl

  use strict;
  use warnings;

  use FindBin qw($Bin);
  use IPC::Run3;
  use IPC::System::Simple qw(system);
  use Test::Needs 'App::Test::Generator';
  use Test::Most;

  my $dirname = "$Bin/conf";

  if((-d $dirname) && opendir(my $dh, $dirname)) {
        while (my $filename = readdir($dh)) {
                # Skip '.' and '..' entries and vi temporary files
                next if ($filename eq '.' || $filename eq '..') || ($filename =~ /\.swp$/);

                my $filepath = "$dirname/$filename";

                if(-f $filepath) {      # Check if it's a regular file
                        my ($stdout, $stderr);
                        run3 ['fuzz-harness-generator', '-r', $filepath], undef, \$stdout, \$stderr;

                        ok($? == 0, 'Generated test script exits successfully');

                        if($? == 0) {
                                ok($stdout =~ /^Result: PASS/ms);
                                if($stdout =~ /Files=1, Tests=(\d+)/ms) {
                                        diag("$1 tests run");
                                }
                        } else {
                                diag("$filepath: STDOUT:\n$stdout");
                                diag($stderr) if(length($stderr));
                                diag("$filepath Failed");
                                last;
                        }
                        diag($stderr) if(length($stderr));
                }
        }
        closedir($dh);
  }

  done_testing();

=head2 Property-Based Testing with Transforms

The generator can create property-based tests using L<Test::LectroTest> when the
C<properties> configuration option is enabled.
This provides more comprehensive
testing by automatically generating thousands of test cases and verifying that
mathematical properties hold across all inputs.

=head3 Basic Property-Based Transform Example

Here's a complete example testing the C<abs> builtin function:

B<t/conf/abs.yml>:

  ---
  module: builtin
  function: abs

  config:
    test_undef: no
    test_empty: no
    test_nuls: no
    properties:
      enable: true
      trials: 1000

  input:
    number:
      type: number
      position: 0

  output:
    type: number
    min: 0

  transforms:
    positive:
      input:
        number:
          type: number
          min: 0
      output:
        type: number
        min: 0

    negative:
      input:
        number:
          type: number
          max: 0
      output:
        type: number
        min: 0

This configuration:

=over 4

=item * Enables property-based testing with 1000 trials per property

=item * Defines two transforms: one for positive numbers, one for negative

=item * Automatically generates properties that verify C<abs()> always returns non-negative numbers

=back

Generate the test:

  fuzz-harness-generator t/conf/abs.yml > t/abs_property.t

The generated test will include:

=over 4

=item * Traditional edge-case tests for boundary conditions

=item * Random fuzzing with 50 iterations (or as configured)

=item * Property-based tests that verify the transforms with 1000 trials each

=back

=head3 What Properties Are Tested?

The generator automatically detects and tests these properties based on your transform specifications:

=over 4

=item * B<Range constraints> - If output has C<min> or C<max>, verifies results stay within bounds

=item * B<Type preservation> - Ensures numeric inputs produce numeric outputs

=item * B<Definedness> - Verifies the function doesn't return C<undef> unexpectedly

=item * B<Specific values> - If output specifies a C<value>, checks exact equality

=back

For the C<abs> example above, the generated properties verify:

  # For the "positive" transform:
  - Given a positive number, abs() returns >= 0
  - The result is a valid number
  - The result is defined

  # For the "negative" transform:
  - Given a negative number, abs() returns >= 0
  - The result is a valid number
  - The result is defined

=head3 Advanced Example: String Normalization

Here's a more complex example testing a string normalization function:

B<t/conf/normalize.yml>:

  ---
  module: Text::Processor
  function: normalize_whitespace

  config:
    properties:
      enable: true
      trials: 500

  input:
    text:
      type: string
      min: 0
      max: 1000
      position: 0

  output:
    type: string
    min: 0
    max: 1000

  transforms:
    empty_preserved:
      input:
        text:
          type: string
          value: ""
      output:
        type: string
        value: ""

    single_space:
      input:
        text:
          type: string
          min: 1
          matches: '^\S+(\s+\S+)*$'
      output:
        type: string
        matches: '^\S+( \S+)*$'

    length_bounded:
      input:
        text:
          type: string
          min: 1
          max: 100
      output:
        type: string
        min: 1
        max: 100

This tests that the normalization function:

=over 4

=item * Preserves empty strings (C<empty_preserved> transform)

=item * Collapses multiple spaces into single spaces (C<single_space> transform)

=item * Maintains length constraints (C<length_bounded> transform)

=back

=head3 Interpreting Property Test Results

When property-based tests run, you'll see output like:

  ok 123 - negative property holds (1000 trials)
  ok 124 - positive property holds (1000 trials)

If a property fails, Test::LectroTest will attempt to find the minimal failing
case and display it:

  not ok 123 - positive property holds (47 trials)
  # Property failed
  # Reason: counterexample found

This helps you quickly identify edge cases that your function doesn't handle correctly.

=head3 Configuration Options for Property-Based Testing

In the C<config> section:

  config:
    properties:
      enable: true     # Enable property-based testing (default: false)
      trials: 1000     # Number of test cases per property (default: 1000)

You can also disable traditional fuzzing and only use property-based tests:

  config:
    properties:
      enable: true
      trials: 5000

  iterations: 0  # Disable random fuzzing, use only property tests

=head3 When to Use Property-Based Testing

Property-based testing with transforms is particularly useful for:

=over 4

=item * Mathematical functions (C<abs>, C<sqrt>, C<min>, C<max>, etc.)

=item * Data transformations (encoding, normalization, sanitization)

=item * Parsers and formatters

=item * Functions with clear input-output relationships

=item * Code that should satisfy mathematical properties (commutativity, associativity, idempotence)

=back

=head3 Requirements

Property-based testing requires L<Test::LectroTest> to be installed:

  cpanm Test::LectroTest

If not installed, the generated tests will automatically skip the property-based
portion with a message.

=head3 Testing Email Validation

  ---
  module: Email::Valid
  function: rfc822

  config:
    properties:
      enable: true
      trials: 200
    test_undef: no
    test_empty: no
    test_nuls: no

  input:
    email:
      type: string
      semantic: email
      position: 0

  output:
    type: boolean

  transforms:
    valid_emails:
      input:
        email:
          type: string
          semantic: email
      output:
        type: boolean

This generates 200 realistic email addresses for testing, rather than random strings.

=head3 Combining Semantic with Regex

You can combine semantic generators with regex validation:

  input:
    corporate_email:
      type: string
      semantic: email
      matches: '@company\.com$'

The semantic generator creates realistic emails, and the regex ensures they match your domain.

=head3 Custom Properties for Transforms

You can define additional properties that should hold for your transforms beyond
the automatically detected ones.

=head4 Using Built-in Properties

  transforms:
    positive:
      input:
        number:
          type: number
          min: 0
      output:
        type: number
        min: 0
      properties:
        - idempotent       # f(f(x)) == f(x)
        - non_negative     # result >= 0
        - positive         # result > 0

Available built-in properties:

=over 4

=item * C<idempotent> - Function is idempotent: f(f(x)) == f(x)

=item * C<non_negative> - Result is always >= 0

=item * C<positive> - Result is always > 0

=item * C<non_empty> - String result is never empty

=item * C<length_preserved> - Output length equals input length

=item * C<uppercase> - Result is all uppercase

=item * C<lowercase> - Result is all lowercase

=item * C<trimmed> - No leading/trailing whitespace

=item * C<sorted_ascending> - Array is sorted ascending

=item * C<sorted_descending> - Array is sorted descending

=item * C<unique_elements> - Array has no duplicates

=item * C<preserves_keys> - Hash has same keys as input

=back

=head4 Custom Property Code

Custom properties allows the definition additional invariants and relationships that should hold for their transforms,
beyond what's auto-detected.
For example:

=over 4

=item * Idempotence: f(f(x)) == f(x)

=item * Commutativity: f(x, y) == f(y, x)

=item * Associativity: f(f(x, y), z) == f(x, f(y, z))

=item * Inverse relationships: decode(encode(x)) == x

=item * Domain-specific invariants: Custom business logic

=back

Define your own properties with custom Perl code:

  transforms:
    normalize:
      input:
        text:
          type: string
      output:
        type: string
      properties:
        - name: single_spaces
          description: "No multiple consecutive spaces"
          code: $result !~ /  /

        - name: no_leading_space
          description: "No space at start"
          code: $result !~ /^\s/

        - name: reversible
          description: "Can be reversed back"
          code: length($result) == length($text)

The code has access to:

=over 4

=item * C<$result> - The function's return value

=item * Input variables - All input parameters (e.g., C<$text>, C<$number>)

=item * The function itself - Can call it again for idempotence checks

=back

=head4 Combining Auto-detected and Custom Properties

The generator automatically detects properties from your output spec, and adds
your custom properties:

  transforms:
    sanitize:
      input:
        html:
          type: string
      output:
        type: string
        min: 0              # Auto-detects: defined, min_length >= 0
        max: 10000
      properties:           # Additional custom checks:
        - name: no_scripts
          code: $result !~ /<script/i
        - name: no_iframes
          code: $result !~ /<iframe/i

=head2 GENERATED OUTPUT

The generated test:

=over 4

=item * Seeds RND (if configured) for reproducible fuzz runs

=item * Uses edge cases (per-field and per-type) with configurable probability

=item * Runs C<$iterations> fuzz cases plus appended edge-case runs

=item * Validates inputs with Params::Get / Params::Validate::Strict

=item * Validates outputs with L<Return::Set>

=item * Runs static C<is(... )> corpus tests from Perl and/or YAML corpus

=item * Runs L<Test::LectroTest> tests

=back

=head1 METHODS

  generate($schema_file, $test_file)

Takes a schema file and produces a test file (or STDOUT).

=cut
1181
1182sub generate
1183{
1184
21
337663
        if($_[0] && ($_[0] eq __PACKAGE__)) {
1185
0
0
                shift;
1186        }
1187
1188
21
26
        my ($schema_file, $test_file) = @_;
1189
1190        # Globals loaded from the user's conf (all optional except function maybe)
1191
21
38
        my (%input, %output, %config, $module, $function, $new, %cases, $yaml_cases, %transforms);
1192
21
0
        my ($seed, $iterations);
1193
21
0
        my (%edge_cases, @edge_case_array, %type_edge_cases);
1194
1195
21
17
        @edge_case_array = ();
1196
1197
21
25
        if(defined($schema_file)) {
1198
20
22
                if(my $schema = _load_schema($schema_file)) {
1199                        # Parse the schema file and load into our structures
1200
20
20
149
25
                        %input = %{_load_schema_section($schema, 'input', $schema_file)};
1201
19
19
12
17
                        %output = %{_load_schema_section($schema, 'output', $schema_file)};
1202
18
18
14
14
                        %transforms = %{_load_schema_section($schema, 'transforms', $schema_file)};
1203
1204
17
2
22
4
                        %cases = %{$schema->{cases}} if(exists($schema->{cases}));
1205
17
0
20
0
                        %edge_cases = %{$schema->{edge_cases}} if(exists($schema->{edge_cases}));
1206
17
0
18
0
                        %type_edge_cases = %{$schema->{type_edge_cases}} if(exists($schema->{type_edge_cases}));
1207
1208
17
25
                        $module = $schema->{module} if(exists($schema->{module}));
1209
17
21
                        $function = $schema->{function} if(exists($schema->{function}));
1210
17
20
                        if(exists($schema->{new})) {
1211
2
3
                                $new = defined($schema->{'new'}) ? $schema->{new} : '_UNDEF';
1212                        }
1213
17
19
                        $yaml_cases = $schema->{yaml_cases} if(exists($schema->{yaml_cases}));
1214
17
15
                        $seed = $schema->{seed} if(exists($schema->{seed}));
1215
17
16
                        $iterations = $schema->{iterations} if(exists($schema->{iterations}));
1216
1217
17
2
16
2
                        @edge_case_array = @{$schema->{edge_case_array}} if(exists($schema->{edge_case_array}));
1218
17
23
                        _validate_config($schema);
1219
1220
14
2
18
4
                        %config = %{$schema->{config}} if(exists($schema->{config}));
1221                } else {
1222
0
0
                        croak "Failed to load schema from $schema_file";
1223                }
1224        } else {
1225
1
6
                croak 'Usage: generate(schema_file [, outfile])';
1226        }
1227
1228        # dedup: fuzzing can easily generate repeats, default is to remove duplicates
1229
14
30
        foreach my $field ('test_nuls', 'test_undef', 'test_empty', 'test_non_ascii', 'dedup') {
1230
70
52
                if(exists($config{$field})) {
1231
7
17
                        if(($config{$field} eq 'false') || ($config{$field} eq 'off') || ($config{$field} eq 'no')) {
1232
6
3
                                $config{$field} = 0;
1233                        } elsif(($config{$field} eq 'true') || ($config{$field} eq 'on') || ($config{$field} eq 'yes')) {
1234
1
1
                                $config{$field} = 1;
1235                        }
1236                } else {
1237
63
50
                        $config{$field} = 1;
1238                }
1239        }
1240
1241        # Guess module name from config file if not set
1242
14
18
        if($module eq 'builtin') {
1243
3
4
                undef $module;
1244        } elsif(!$module) {
1245
0
0
                (my $guess = basename($schema_file)) =~ s/\.(conf|pl|pm|yml|yaml)$//;
1246
0
0
                $guess =~ s/-/::/g;
1247
0
0
                $module = $guess || 'builtin';
1248        }
1249
1250
14
23
        if($module && ($module ne 'builtin')) {
1251
11
12
                _validate_module($module, $schema_file)
1252        }
1253
1254        # sensible defaults
1255
14
18
        $function ||= 'run';
1256
14
30
        $iterations ||= DEFAULT_ITERATIONS;              # default fuzz runs if not specified
1257
14
18
        $seed = undef if defined $seed && $seed eq '';  # treat empty as undef
1258
1259        # --- YAML corpus support (yaml_cases is filename string) ---
1260
14
7
        my %yaml_corpus_data;
1261
14
17
        if (defined $yaml_cases) {
1262
4
39
                croak("$yaml_cases: $!") if(!-f $yaml_cases);
1263
1264
3
10
                my $yaml_data = LoadFile(Encode::decode('utf8', $yaml_cases));
1265
3
157
                if ($yaml_data && ref($yaml_data) eq 'HASH') {
1266                        # Validate that the corpus inputs are arrayrefs
1267                        # e.g: "FooBar":      ["foo_bar"]
1268
3
3
                        my $valid_input = 1;
1269
3
3
2
6
                        for my $expected (keys %{$yaml_data}) {
1270
5
5
                                my $outputs = $yaml_data->{$expected};
1271
5
9
                                unless($outputs && (ref $outputs eq 'ARRAY')) {
1272
2
12
                                        carp("$yaml_cases: $expected does not point to an array ref, ignoring");
1273
2
176
                                        $valid_input = 0;
1274                                }
1275                        }
1276
1277
3
8
                        %yaml_corpus_data = %$yaml_data if($valid_input);
1278                }
1279        }
1280
1281        # Merge Perl %cases and YAML corpus safely
1282        # my %all_cases = (%cases, %yaml_corpus_data);
1283
13
16
        my %all_cases = (%yaml_corpus_data, %cases);
1284
13
14
        for my $k (keys %yaml_corpus_data) {
1285
3
11
                if (exists $cases{$k} && ref($cases{$k}) eq 'ARRAY' && ref($yaml_corpus_data{$k}) eq 'ARRAY') {
1286
1
1
1
1
1
2
                        $all_cases{$k} = [ @{$yaml_corpus_data{$k}}, @{$cases{$k}} ];
1287                }
1288        }
1289
1290        # render edge case maps for inclusion in the .t
1291
13
17
        my $edge_cases_code = render_arrayref_map(\%edge_cases);
1292
13
13
        my $type_edge_cases_code = render_arrayref_map(\%type_edge_cases);
1293
1294
13
12
        my $edge_case_array_code = '';
1295
13
12
        if(scalar(@edge_case_array)) {
1296
2
6
3
18
                $edge_case_array_code = join(', ', map { q_wrap($_) } @edge_case_array);
1297        }
1298
1299        # Render configuration - all the values are integers for now, if that changes, wrap the $config{$key} in single quotes
1300
13
9
        my $config_code = '';
1301
13
28
        foreach my $key (sort keys %config) {
1302                # Skip nested structures like 'properties' - they're used during
1303                # generation but don't need to be in the generated test
1304
66
54
                if(ref($config{$key}) eq 'HASH') {
1305
1
1
                        next;
1306                }
1307
65
66
                if((!defined($config{$key})) || !$config{$key}) {
1308                        # YAML will strip the word 'false'
1309                        # e.g. in 'test_undef: false'
1310
6
4
                        $config_code .= "'$key' => 0,\n";
1311                } else {
1312
59
57
                        $config_code .= "'$key' => $config{$key},\n";
1313                }
1314        }
1315
1316        # Render input/output
1317
13
13
        my $input_code = '';
1318
13
41
        if(((scalar keys %input) == 1) && exists($input{'type'}) && !ref($input{'type'})) {
1319                # %input = ( type => 'string' );
1320
4
4
                foreach my $key (sort keys %input) {
1321
4
5
                        $input_code .= "'$key' => '$input{$key}',\n";
1322                }
1323        } else {
1324                # %input = ( str => { type => 'string' } );
1325
9
15
                $input_code = render_hash(\%input);
1326        }
1327
13
18
        if(defined(my $re = $output{'matches'})) {
1328
0
0
                if(ref($re) ne 'Regexp') {
1329
0
0
                        $re = qr/$re/;
1330
0
0
                        $output{'matches'} = $re;
1331                }
1332        }
1333
13
22
        my $output_code = render_args_hash(\%output);
1334
13
21
        my $new_code = ($new && (ref $new eq 'HASH')) ? render_args_hash($new) : '';
1335
1336
13
9
        my $transforms_code;
1337
13
13
        if(keys %transforms) {
1338
1
1
                foreach my $transform(keys %transforms) {
1339
2
2
                        if($transforms_code) {
1340
1
1
                                $transforms_code .= "},\n";
1341                        }
1342                        $transforms_code .= "$transform => {\n" .
1343                                "\t'input' => { " .
1344                                render_args_hash($transforms{$transform}->{'input'}) .
1345                                "\t}, 'output' => { " .
1346
2
4
                        render_args_hash($transforms{$transform}->{'output'}) .
1347                        "\t},\n";
1348                }
1349
1
1
                $transforms_code .= "}\n";
1350        }
1351
1352
13
9
        my $transform_properties_code = '';
1353
13
10
        my $use_properties = 0;
1354
1355
13
19
        if (keys %transforms && ($config{properties}{enable} // 0)) {
1356
1
1
                $use_properties = 1;
1357
1358                # Generate property-based tests for transforms
1359
1
2
                my $properties = _generate_transform_properties(
1360                        \%transforms,
1361                        $function,
1362                        $module,
1363                        \%input,
1364                        \%config,
1365                        $new
1366                );
1367
1368                # Convert to code for template
1369
1
2
                $transform_properties_code = _render_properties($properties);
1370        }
1371
1372        # Setup / call code (always load module)
1373
13
14
        my $setup_code = ($module) ? "BEGIN { use_ok('$module') }" : '';
1374
13
22
        my $call_code;  # Code to call the function being test when used with named arguments
1375        my $position_code;      # Code to call the function being test when used with position arguments
1376
13
15
        if(defined($new)) {
1377                # keep use_ok regardless (user found earlier issue)
1378
2
3
                if($new_code eq '') {
1379
1
1
                        $setup_code .= "\nmy \$obj = new_ok('$module');";
1380                } else {
1381
1
1
                        $setup_code .= "\nmy \$obj = new_ok('$module' => [ { $new_code } ] );";
1382                }
1383
2
2
                $call_code = "\$result = \$obj->$function(\$input);";
1384
2
2
                $position_code = "\$result = \$obj->$function(\@alist);";
1385        } elsif(defined($module)) {
1386
8
13
                $call_code = "\$result = $module\->$function(\$input);";
1387
8
10
                $position_code = "\$result = $module\->$function(\@alist);";
1388        } else {
1389
3
3
                $call_code = "\$result = $function(\$input);";
1390
3
2
                $position_code = "\$result = $function(\@alist);";
1391        }
1392
1393        # Build static corpus code
1394
13
9
        my $corpus_code = '';
1395
13
15
        if (%all_cases) {
1396
2
2
                $corpus_code = "\n# --- Static Corpus Tests ---\n" .
1397                        "diag('Running " . scalar(keys %all_cases) . " corpus tests');\n";
1398
1399
2
4
                for my $expected (sort keys %all_cases) {
1400
5
2
                        my $inputs = $all_cases{$expected};
1401
5
8
                        next unless($inputs);
1402
1403
5
5
                        my $expected_str = perl_quote($expected);
1404
5
9
                        my $status = ((ref($inputs) eq 'HASH') && $inputs->{'_STATUS'}) // 'OK';
1405
5
8
                        if($expected_str eq "'_STATUS:DIES'") {
1406
0
0
                                $status = 'DIES';
1407                        } elsif($expected_str eq "'_STATUS:WARNS'") {
1408
0
0
                                $status = 'WARNS';
1409                        }
1410
1411
5
4
                        if(ref($inputs) eq 'HASH') {
1412
0
0
                                $inputs = $inputs->{'input'};
1413                        }
1414
5
4
                        my $input_str;
1415
5
4
                        if(ref($inputs) eq 'ARRAY') {
1416
5
8
5
5
5
3
                                $input_str = join(', ', map { perl_quote($_) } @{$inputs});
1417                        } elsif(ref($inputs) eq 'HASH') {
1418
0
0
                                $input_str = Dumper($inputs);
1419
0
0
                                $input_str =~ s/\$VAR1 =//;
1420
0
0
                                $input_str =~ s/;//;
1421
0
0
                                $input_str =~ s/=> 'undef'/=> undef/gms;
1422                        } else {
1423
0
0
                                $input_str = $inputs;
1424                        }
1425
5
6
                        if(($input_str eq 'undef') && (!$config{'test_undefs'})) {
1426
0
0
                                carp('corpus case set to undef, yet test_undefs is not set in config');
1427                        }
1428
5
5
                        if ($new) {
1429
0
0
                                if($status eq 'DIES') {
1430                                        $corpus_code .= "dies_ok { \$obj->$function($input_str) } " .
1431
0
0
0
0
                                                        "'$function(" . join(', ', map { $_ // '' } @$inputs ) . ") dies';\n";
1432                                } elsif($status eq 'WARNS') {
1433                                        $corpus_code .= "warnings_exist { \$obj->$function($input_str) } qr/./, " .
1434
0
0
0
0
                                                        "'$function(" . join(', ', map { $_ // '' } @$inputs ) . ") warns';\n";
1435                                } else {
1436                                        my $desc = sprintf("$function(%s) returns %s",
1437
0
0
0
0
                                                perl_quote(join(', ', map { $_ // '' } @$inputs )),
1438                                                $expected_str
1439                                        );
1440
0
0
                                        $corpus_code .= "is(\$obj->$function($input_str), $expected_str, " . q_wrap($desc) . ");\n";
1441                                }
1442                        } else {
1443
5
5
                                if($status eq 'DIES') {
1444
0
0
                                        $corpus_code .= "dies_ok { $module\::$function($input_str) } " .
1445                                                "'Corpus $expected dies';\n";
1446                                } elsif($status eq 'WARNS') {
1447
0
0
                                        $corpus_code .= "warnings_exist { $module\::$function($input_str) } qr/./, " .
1448                                                "'Corpus $expected warns';\n";
1449                                } else {
1450                                        my $desc = sprintf("$function(%s) returns %s",
1451
5
8
5
10
10
4
                                                perl_quote((ref $inputs eq 'ARRAY') ? (join(', ', map { $_ // '' } @{$inputs})) : $inputs),
1452                                                $expected_str
1453                                        );
1454
5
11
                                        $corpus_code .= "is($module\::$function($input_str), $expected_str, 'Corpus $expected works');\n";
1455                                }
1456                        }
1457                }
1458        }
1459
1460        # Prepare seed/iterations code fragment for the generated test
1461
13
12
        my $seed_code = '';
1462
13
11
        if (defined $seed) {
1463                # ensure integer-ish
1464
0
0
                $seed = int($seed);
1465
0
0
                $seed_code = "srand($seed);\n";
1466        }
1467        # Generate the test content
1468
13
46
        my $tt = Template->new({ ENCODING => 'utf8', TRIM => 1 });
1469
1470        # Read template from DATA handle
1471
13
38160
        my $template_package = __PACKAGE__ . '::Template';
1472
13
38
        my $template = $template_package->get_data_section('test.tt');
1473
1474        my $vars = {
1475                setup_code => $setup_code,
1476                edge_cases_code => $edge_cases_code,
1477                edge_case_array_code => $edge_case_array_code,
1478                type_edge_cases_code => $type_edge_cases_code,
1479                config_code => $config_code,
1480                seed_code => $seed_code,
1481                input_code => $input_code,
1482                output_code => $output_code,
1483                transforms_code => $transforms_code,
1484                corpus_code => $corpus_code,
1485                call_code => $call_code,
1486                position_code => $position_code,
1487                function => $function,
1488                iterations_code => int($iterations),
1489                use_properties => $use_properties,
1490                transform_properties_code => $transform_properties_code,
1491
13
8920
                property_trials => $config{properties}{trials} // DEFAULT_PROPERTY_TRIALS,
1492                module => $module
1493        };
1494
1495
13
9
        my $test;
1496
13
24
        $tt->process($template, $vars, \$test) or die $tt->error();
1497
1498
13
167184
        if ($test_file) {
1499
11
21
                open my $fh, '>:encoding(UTF-8)', $test_file or die "Cannot open $test_file: $!";
1500
11
5756
                print $fh "$test\n";
1501
11
22
                close $fh;
1502
11
1335
                if($module) {
1503
8
122
                        print "Generated $test_file for $module\::$function with fuzzing + corpus support\n";
1504                } else {
1505
3
52
                        print "Generated $test_file for $function with fuzzing + corpus support\n";
1506                }
1507        } else {
1508
2
507
                print "$test\n";
1509        }
1510}
1511
1512# --- Helpers for rendering data structures into Perl code for the generated test ---
1513
1514sub _load_schema {
1515
20
16
        my $schema_file = $_[0];
1516
1517
20
107
        if(!-r $schema_file) {
1518
0
0
                croak(__PACKAGE__, ": generate($schema_file): $!");
1519        }
1520
1521        # --- Load configuration safely (require so config can use 'our' variables) ---
1522        # FIXME:  would be better to use Config::Abstraction, since requiring the user's config could execute arbitrary code
1523        # my $abs = $schema_file;
1524        # $abs = "./$abs" unless $abs =~ m{^/};
1525        # require $abs;
1526
1527
20
77
        if(my $config = Config::Abstraction->new(config_dirs => ['.', ''], config_file => $schema_file)) {
1528
20
16165
                $config = $config->all();
1529
20
127
                if(defined($config->{'$module'}) || defined($config->{'our $module'}) || !defined($config->{'module'})) {
1530
0
0
                        croak("$schema_file: Loading perl files as configs is no longer supported");
1531                }
1532
20
70
                return $config;
1533        }
1534}
1535
1536sub _load_schema_section
1537{
1538
57
50
        my($schema, $section, $schema_file) = @_;
1539
1540
57
52
        if(exists($schema->{$section})) {
1541
34
42
                if(ref($schema->{$section}) eq 'HASH') {
1542
30
47
                        return $schema->{$section};
1543                } elsif(defined($schema->{$section}) && ($schema->{$section} ne 'undef')) {
1544                        # carp(Dumper($schema));
1545
3
5
                        if(ref($schema->{$section}) && length($schema->{$section})) {
1546
0
0
                                croak("$schema_file: $section should be a hash, not ", ref($schema->{$section}));
1547                        } else {
1548
3
19
                                croak("$schema_file: $section should be a hash, not ", $schema->{$section});
1549                        }
1550                }
1551        }
1552
24
26
        return {};
1553}
1554
1555# Input validation for configuration
1556sub _validate_config {
1557
17
10
        my $config = $_[0];
1558
1559
17
21
        if((!defined($config->{'module'})) && (!defined($config->{'function'}))) {
1560                # Can't work out what should be tested
1561
0
0
                croak('At least one of function and module must be defined');
1562        }
1563
1564
17
19
        if((!defined($config->{'input'})) && (!defined($config->{'output'}))) {
1565                # Routine takes no input and no output, so there's nothing that would be gained using this software
1566
2
14
                croak('You must specify at least one of input and output');
1567        }
1568
15
35
        if(($config->{'input'}) && (ref($config->{input}) ne 'HASH')) {
1569
0
0
                if($config->{'input'} eq 'undef') {
1570
0
0
                        delete $config->{'input'};
1571                } else {
1572
0
0
                        croak('Invalid input specification')
1573                }
1574        }
1575
1576        # Validate types, constraints, etc.
1577
15
15
9
19
        for my $param (keys %{$config->{input}}) {
1578
14
14
                my $spec = $config->{input}{$param};
1579
14
15
                if(ref($spec)) {
1580
9
12
                        croak "Invalid type '$spec->{type}' for parameter '$param'" unless _valid_type($spec->{type});
1581                } else {
1582
5
6
                        croak "Invalid type '$spec' for parameter '$param'" unless _valid_type($spec);
1583                }
1584        }
1585
1586        # Check if using positional arguments
1587
14
33
        my $has_positions = 0;
1588
14
10
        my %positions;
1589
1590
14
14
9
16
        for my $param (keys %{$config->{input}}) {
1591
13
10
                my $spec = $config->{input}{$param};
1592
13
37
                if (ref($spec) eq 'HASH' && defined($spec->{position})) {
1593
1
1
                        $has_positions = 1;
1594
1
1
                        my $pos = $spec->{position};
1595
1596                        # Validate position is non-negative integer
1597
1
7
                        croak "Position for '$param' must be a non-negative integer" unless $pos =~ /^\d+$/;
1598
1599                        # Check for duplicate positions
1600
1
1
                        croak "Duplicate position $pos for parameters '$positions{$pos}' and '$param'" if exists $positions{$pos};
1601
1602
1
2
                        $positions{$pos} = $param;
1603                }
1604        }
1605
1606        # If using positions, all params must have positions
1607
14
19
        if ($has_positions) {
1608
1
1
1
1
                for my $param (keys %{$config->{input}}) {
1609
1
1
                        my $spec = $config->{input}{$param};
1610
1
2
                        unless (ref($spec) eq 'HASH' && defined($spec->{position})) {
1611
0
0
                                croak "Parameter '$param' missing position (all params must have positions if any do)";
1612                        }
1613                }
1614
1615                # Check for gaps in positions (0, 1, 3 - missing 2)
1616
1
0
2
0
                my @sorted = sort { $a <=> $b } keys %positions;
1617
1
2
                for my $i (0..$#sorted) {
1618
1
2
                        if ($sorted[$i] != $i) {
1619
0
0
                                carp "Warning: Position sequence has gaps (positions: @sorted)";
1620
0
0
                                last;
1621                        }
1622                }
1623        }
1624
1625        # Validate semantic types
1626
14
42
        my $semantic_generators = _get_semantic_generators();
1627
14
14
11
13
        for my $param (keys %{$config->{input}}) {
1628
13
12
                my $spec = $config->{input}{$param};
1629
13
26
                if (ref($spec) eq 'HASH' && defined($spec->{semantic})) {
1630
0
0
                        my $semantic = $spec->{semantic};
1631
0
0
                        unless (exists $semantic_generators->{$semantic}) {
1632
0
0
                                carp "Warning: Unknown semantic type '$semantic' for parameter '$param'. Available types: ",
1633                                        join(', ', sort keys %$semantic_generators);
1634                        }
1635                }
1636        }
1637
1638        # Validate custom properties in transforms
1639
14
56
        if (exists $config->{transforms} && ref($config->{transforms}) eq 'HASH') {
1640
1
3
                my $builtin_props = _get_builtin_properties();
1641
1642
1
1
1
1
                for my $transform_name (keys %{$config->{transforms}}) {
1643
2
2
                        my $transform = $config->{transforms}{$transform_name};
1644
1645
2
19
                        if (exists $transform->{properties}) {
1646
0
0
                                unless (ref($transform->{properties}) eq 'ARRAY') {
1647
0
0
                                        croak "Transform '$transform_name': properties must be an array";
1648                                }
1649
1650
0
0
0
0
                                for my $prop (@{$transform->{properties}}) {
1651
0
0
                                        if (!ref($prop)) {
1652                                                # Check if builtin exists
1653
0
0
                                                unless (exists $builtin_props->{$prop}) {
1654
0
0
                                                        carp "Transform '$transform_name': unknown built-in property '$prop'. Available: ",
1655                                                                join(', ', sort keys %$builtin_props);
1656                                                }
1657                                        }
1658                                        elsif (ref($prop) eq 'HASH') {
1659                                                # Validate custom property structure
1660
0
0
                                                unless ($prop->{name} && $prop->{code}) {
1661
0
0
                                                        croak "Transform '$transform_name': custom properties must have 'name' and 'code' fields";
1662                                                }
1663                                        }
1664                                        else {
1665
0
0
                                                croak "Transform '$transform_name': invalid property definition";
1666                                        }
1667                                }
1668                        }
1669                }
1670        }
1671}
1672
1673sub _valid_type
1674{
1675
14
10
        my $type = $_[0];
1676
1677
14
45
        return(($type eq 'string') ||
1678                ($type eq 'boolean') ||
1679                ($type eq 'integer') ||
1680                ($type eq 'number') ||
1681                ($type eq 'float') ||
1682                ($type eq 'hashref') ||
1683                ($type eq 'arrayref') ||
1684                ($type eq 'object'));
1685}
1686
1687sub _validate_module {
1688
11
10
        my ($module, $schema_file) = @_;
1689
1690
11
11
        return 1 unless $module;        # No module to validate (builtin functions)
1691
1692        # Check if the module can be found
1693
11
15
        my $mod_info = check_install(module => $module);
1694
1695
11
30280
        if (!$mod_info) {
1696                # Module not found - this is just a warning, not an error
1697                # The module might not be installed on the generation machine
1698                # but could be on the test machine
1699
1
18
                carp("Warning: Module '$module' not found in \@INC during generation.");
1700
1
105
                carp("  Config file: $schema_file");
1701
1
86
                carp("  This is OK if the module will be available when tests run.");
1702
1
75
                carp('  If this is unexpected, check your module name and installation.');
1703
1
64
                return 0;       # Not found, but not fatal
1704        }
1705
1706        # Module was found
1707
10
28
        if ($ENV{TEST_VERBOSE} || $ENV{GENERATOR_VERBOSE}) {
1708                print STDERR "Found module '$module' at: $mod_info->{file}\n",
1709
0
0
                        '  Version: ', ($mod_info->{version} || 'unknown'), "\n";
1710        }
1711
1712        # Optionally try to load it (disabled by default since it can have side effects)
1713
10
10
        if ($ENV{GENERATOR_VALIDATE_LOAD}) {
1714
0
0
                my $loaded = can_load(modules => { $module => undef }, verbose => 0);
1715
1716
0
0
                if (!$loaded) {
1717
0
0
                        carp("Warning: Module '$module' found but failed to load: $Module::Load::Conditional::ERROR");
1718
0
0
                        carp('  This might indicate a broken installation or missing dependencies.');
1719
0
0
                        return 0;
1720                }
1721
1722
0
0
                if ($ENV{TEST_VERBOSE} || $ENV{GENERATOR_VERBOSE}) {
1723
0
0
                        print STDERR "Successfully loaded module '$module'\n";
1724                }
1725        }
1726
1727
10
17
        return 1;
1728}
1729
1730sub perl_sq {
1731
67
47
        my $s = $_[0];
1732
67
67
67
67
67
36
43
34
41
32
        $s =~ s/\\/\\\\/g; $s =~ s/'/\\'/g; $s =~ s/\n/\\n/g; $s =~ s/\r/\\r/g; $s =~ s/\t/\\t/g;
1733
67
94
        return $s;
1734}
1735
1736sub perl_quote {
1737
73
45
        my $v = $_[0];
1738
73
49
        return 'undef' unless defined $v;
1739
73
54
        if(ref($v)) {
1740
2
4
                if(ref($v) eq 'ARRAY') {
1741
0
0
0
0
0
0
                        my @quoted_v = map { perl_quote($_) } @{$v};
1742
0
0
                        return '[ ' . join(', ', @quoted_v) . ' ]';
1743                }
1744
2
5
                if(ref($v) eq 'Regexp') {
1745
0
0
                        my $s = "$v";
1746
1747                        # default to qr{...}
1748
0
0
                        return "qr{$s}" unless $s =~ /[{}]/;
1749
1750                        # fallback: quote with slash if no slash inside
1751
0
0
                        return "qr/$s/" unless $s =~ m{/};
1752
1753                        # fallback: quote with # if slash inside
1754
0
0
                        return "qr#$s#";
1755                }
1756                # Generic fallback
1757
2
2
                $v = Dumper($v);
1758
2
66
                $v =~ s/\$VAR1 =//;
1759
2
2
                $v =~ s/;//;
1760
2
4
                return $v;
1761        }
1762
71
39
        $v =~ s/\\/\\\\/g;
1763        # return $v =~ /^-?\d+(\.\d+)?$/ ? $v : "'" . ( $v =~ s/'/\\'/gr ) . "'";
1764
71
91
        return $v =~ /^-?\d+(\.\d+)?$/ ? $v : "'" . perl_sq($v) . "'";
1765}
1766
1767sub render_hash {
1768
9
9
        my $href = $_[0];
1769
9
18
        return '' unless $href && ref($href) eq 'HASH';
1770
9
6
        my @lines;
1771
9
11
        for my $k (sort keys %$href) {
1772
8
13
                my $def = $href->{$k} // {};
1773
8
22
                next unless ref $def eq 'HASH';
1774
7
5
                my @pairs;
1775
7
10
                for my $subk (sort keys %$def) {
1776
8
8
                        next unless defined $def->{$subk};
1777
8
10
                        if(ref($def->{$subk})) {
1778
0
0
                                unless((ref($def->{$subk}) eq 'ARRAY') || (ref($def->{$subk}) eq 'Regexp')) {
1779
0
0
                                        croak(__PACKAGE__, ": schema_file, $subk is a nested element, not yet supported (", ref($def->{$subk}), ')');
1780                                }
1781                        }
1782
8
14
                        if(($subk eq 'matches') || ($subk eq 'nomatch')) {
1783
0
0
                                push @pairs, "$subk => qr/$def->{$subk}/";
1784                        } else {
1785
8
12
                                push @pairs, "$subk => " . perl_quote($def->{$subk});
1786                        }
1787                }
1788
7
11
                push @lines, '      ' . perl_quote($k) . " => { " . join(", ", @pairs) . " }";
1789        }
1790
9
11
        return join(",\n", @lines);
1791}
1792
1793sub render_args_hash {
1794
18
16
        my $href = $_[0];
1795
18
25
        return '' unless $href && ref($href) eq 'HASH';
1796
18
20
24
15
        my @pairs = map { perl_quote($_) . ' => ' . perl_quote($href->{$_}) } sort keys %$href;
1797
18
28
        return join(', ', @pairs);
1798}
1799
1800sub render_arrayref_map {
1801
26
16
        my $href = $_[0];
1802
26
45
        return '()' unless $href && ref($href) eq 'HASH';
1803
26
23
        my @entries;
1804
26
28
        for my $k (sort keys %$href) {
1805
0
0
                my $aref = $href->{$k};
1806
0
0
                next unless ref $aref eq 'ARRAY';
1807
0
0
0
0
                my $vals = join(', ', map { perl_quote($_) } @$aref);
1808
0
0
                push @entries, '    ' . perl_quote($k) . " => [ $vals ]";
1809        }
1810
26
33
        return join(",\n", @entries);
1811}
1812
1813# Robustly quote a string (GitHub#1)
1814sub q_wrap {
1815
6
5
        my $s = $_[0];
1816
6
11
        for my $p ( ['{','}'], ['(',')'], ['[',']'], ['<','>'] ) {
1817
6
5
                my ($l,$r) = @$p;
1818
6
45
                return "q$l$s$r" unless $s =~ /\Q$l\E|\Q$r\E/;
1819        }
1820
0
0
        for my $d ('~', '!', '%', '^', '=', '+', ':', ',', ';', '|', '/', '#') {
1821
0
0
                return "q$d$s$d" unless index($s, $d) >= 0;
1822        }
1823
0
0
        (my $esc = $s) =~ s/'/\\'/g;
1824
0
0
        return "'$esc'";
1825}
1826
1827
1828 - 1832
=head2 _generate_transform_properties

Converts transform specifications into LectroTest property definitions.

=cut
1833
1834sub _generate_transform_properties {
1835
1
2
        my ($transforms, $function, $module, $input, $config, $new) = @_;
1836
1837
1
1
        my @properties;
1838
1839
1
2
        for my $transform_name (sort keys %$transforms) {
1840
2
2
                my $transform = $transforms->{$transform_name};
1841
1842
2
2
                my $input_spec = $transform->{input};
1843
2
0
                my $output_spec = $transform->{output};
1844
1845                # Skip if input is 'undef'
1846
2
4
                if (!ref($input_spec) && $input_spec eq 'undef') {
1847
0
0
                        next;
1848                }
1849
1850                # Detect automatic properties from the transform spec
1851
2
2
                my @detected_props = _detect_transform_properties(
1852                        $transform_name,
1853                        $input_spec,
1854                        $output_spec
1855                );
1856
1857                # Process custom properties from schema
1858
2
2
                my @custom_props = ();
1859
2
3
                if (exists $transform->{properties} && ref($transform->{properties}) eq 'ARRAY') {
1860                        @custom_props = _process_custom_properties(
1861                                $transform->{properties},
1862
0
0
                                $function,
1863                                $module,
1864                                $input_spec,
1865                                $output_spec,
1866                                $new
1867                        );
1868                }
1869
1870                # Combine detected and custom properties
1871
2
1
                my @all_props = (@detected_props, @custom_props);
1872
1873                # Skip if no properties detected or defined
1874
2
2
                next unless @all_props;
1875
1876                # Build LectroTest generator specification
1877
2
2
                my @generators;
1878                my @var_names;
1879
1880
2
2
                for my $field (sort keys %$input_spec) {
1881
2
1
                        my $spec = $input_spec->{$field};
1882
2
3
                        next unless ref($spec) eq 'HASH';
1883
1884
2
4
                        my $gen = _schema_to_lectrotest_generator($field, $spec);
1885
2
2
                        push @generators, $gen;
1886
2
3
                        push @var_names, $field;
1887                }
1888
1889
2
2
                my $gen_spec = join(', ', @generators);
1890
1891                # Build the call code
1892
2
1
                my $call_code;
1893
2
2
                if ($module) {
1894                        # $call_code = "$module\::$function";
1895
0
0
                        $call_code = "$module->$function";
1896                } else {
1897
2
1
                        $call_code = $function;
1898                }
1899
1900                # Build argument list (respect positions if defined)
1901
2
1
                my @args;
1902
2
2
                if (_has_positions($input_spec)) {
1903                        my @sorted = sort {
1904
2
3
                                $input_spec->{$a}{position} <=> $input_spec->{$b}{position}
1905
0
0
                        } keys %$input_spec;
1906
2
2
2
3
                        @args = map { "\$$_" } @sorted;
1907                } else {
1908
0
0
0
0
                        @args = map { "\$$_" } @var_names;
1909                }
1910
2
3
                my $args_str = join(', ', @args);
1911
1912                # Build property checks
1913
2
7
1
5
                my @checks = map { $_->{code} } @all_props;
1914
2
3
                my $property_checks = join(" &&\n\t", @checks);
1915
1916                # Handle _STATUS in output
1917
2
4
                my $should_die = ($output_spec->{_STATUS} // '') eq 'DIES';
1918
1919                push @properties, {
1920                        name => $transform_name,
1921                        generator_spec => $gen_spec,
1922                        call_code => "$call_code($args_str)",
1923                        property_checks => $property_checks,
1924                        should_die => $should_die,
1925
2
10
                        trials => $config->{properties}{trials} // DEFAULT_PROPERTY_TRIALS,
1926                };
1927        }
1928
1929
1
1
        return \@properties;
1930}
1931
1932 - 1936
=head2 _process_custom_properties

Processes custom property definitions from the schema.

=cut
1937
1938sub _process_custom_properties {
1939
0
0
        my ($properties_spec, $function, $module, $input_spec, $output_spec, $schema) = @_;
1940
1941
0
0
        my @properties;
1942
0
0
        my $builtin_properties = _get_builtin_properties();
1943
0
0
        my $new = defined($schema->{'new'}) ? $schema->{new} : '_UNDEF';
1944
1945
0
0
        for my $prop_def (@$properties_spec) {
1946
0
0
                my $prop_name;
1947                my $prop_code;
1948
0
0
                my $prop_desc;
1949
1950
0
0
                if (!ref($prop_def)) {
1951                        # Simple string - lookup builtin property
1952
0
0
                        $prop_name = $prop_def;
1953
1954
0
0
                        if (exists $builtin_properties->{$prop_name}) {
1955
0
0
                                my $builtin = $builtin_properties->{$prop_name};
1956
1957                                # Get input variable names
1958
0
0
                                my @var_names = sort keys %$input_spec;
1959
1960                                # Build call code
1961
0
0
                                my $call_code;
1962                                # Check if this is OO mode
1963
0
0
                                if($module && defined($new)) {
1964
0
0
                                        $call_code = "my \$obj = new_ok('$module');";
1965
0
0
                                        $call_code .= "\$obj->$function";  # Method call
1966                                } elsif($module && $module ne 'builtin') {
1967
0
0
                                        $call_code = "$module\::$function";   # Function call
1968                                } else {
1969
0
0
                                        $call_code = $function; # Builtin
1970                                }
1971
1972                                # Build args
1973
0
0
                                my @args;
1974
0
0
                                if (_has_positions($input_spec)) {
1975                                        my @sorted = sort {
1976
0
0
                                                $input_spec->{$a}{position} <=> $input_spec->{$b}{position}
1977
0
0
                                        } @var_names;
1978
0
0
0
0
                                        @args = map { "\$$_" } @sorted;
1979                                } else {
1980
0
0
0
0
                                        @args = map { "\$$_" } @var_names;
1981                                }
1982
0
0
                                $call_code .= '(' . join(', ', @args) . ')';
1983
1984                                # Generate property code from template
1985
0
0
                                $prop_code = $builtin->{code_template}->($function, $call_code, \@var_names);
1986
0
0
                                $prop_desc = $builtin->{description};
1987                        } else {
1988
0
0
                                carp "Unknown built-in property '$prop_name', skipping";
1989
0
0
                                next;
1990                        }
1991                }
1992                elsif (ref($prop_def) eq 'HASH') {
1993                        # Custom property with code
1994
0
0
                        $prop_name = $prop_def->{name} || 'custom_property';
1995
0
0
                        $prop_code = $prop_def->{code};
1996
0
0
                        $prop_desc = $prop_def->{description} || "Custom property: $prop_name";
1997
1998
0
0
                        unless ($prop_code) {
1999
0
0
                                carp "Custom property '$prop_name' missing 'code' field, skipping";
2000
0
0
                                next;
2001                        }
2002
2003                        # Validate that the code looks reasonable
2004
0
0
                        unless ($prop_code =~ /\$/ || $prop_code =~ /\w+/) {
2005
0
0
                                carp "Custom property '$prop_name' code looks invalid: $prop_code";
2006
0
0
                                next;
2007                        }
2008                }
2009                else {
2010
0
0
                        carp 'Invalid property definition: ', Dumper($prop_def);
2011
0
0
                        next;
2012                }
2013
2014
0
0
                push @properties, {
2015                        name => $prop_name,
2016                        code => $prop_code,
2017                        description => $prop_desc,
2018                };
2019        }
2020
2021
0
0
        return @properties;
2022}
2023
2024 - 2028
=head2 _detect_transform_properties

Automatically detects testable properties from transform input/output specs.

=cut
2029
2030sub _detect_transform_properties {
2031
2
2
        my ($transform_name, $input_spec, $output_spec) = @_;
2032
2033
2
1
        my @properties;
2034
2035        # Skip if input is 'undef'
2036
2
3
        return @properties if (!ref($input_spec) && $input_spec eq 'undef');
2037
2038        # Property 1: Output range constraints (numeric)
2039
2
2
        if (_is_numeric_transform($input_spec, $output_spec)) {
2040
2
4
                if (defined $output_spec->{min}) {
2041
2
2
                        my $min = $output_spec->{min};
2042
2
3
                        push @properties, {
2043                                name => 'min_constraint',
2044                                code => "\$result >= $min"
2045                        };
2046                }
2047
2048
2
3
                if (defined $output_spec->{max}) {
2049
0
0
                        my $max = $output_spec->{max};
2050
0
0
                        push @properties, {
2051                                name => 'max_constraint',
2052                                code => "\$result <= $max"
2053                        };
2054                }
2055
2056                # For transforms, add idempotence check where appropriate
2057                # e.g., abs(abs(x)) == abs(x)
2058
2
4
                if ($transform_name =~ /positive/i) {
2059
1
1
                        push @properties, {
2060                                name => 'non_negative',
2061                                code => "\$result >= 0"
2062                        };
2063                }
2064        }
2065
2066        # Property 2: Specific value output
2067
2
4
        if (defined $output_spec->{value}) {
2068
0
0
                my $expected = $output_spec->{value};
2069
0
0
                push @properties, {
2070                        name => 'exact_value',
2071                        code => "\$result == $expected"
2072                };
2073        }
2074
2075        # Property 3: String length constraints
2076
2
2
        if (_is_string_transform($input_spec, $output_spec)) {
2077
0
0
                if (defined $output_spec->{min}) {
2078
0
0
                        push @properties, {
2079                                name => 'min_length',
2080                                code => "length(\$result) >= $output_spec->{min}"
2081                        };
2082                }
2083
2084
0
0
                if (defined $output_spec->{max}) {
2085
0
0
                        push @properties, {
2086                                name => 'max_length',
2087                                code => "length(\$result) <= $output_spec->{max}"
2088                        };
2089                }
2090
2091
0
0
                if (defined $output_spec->{matches}) {
2092
0
0
                        my $pattern = $output_spec->{matches};
2093
0
0
                        push @properties, {
2094                                name => 'pattern_match',
2095                                code => "\$result =~ qr/$pattern/"
2096                        };
2097                }
2098        }
2099
2100        # Property 4: Type preservation
2101
2
2
        if (_same_type($input_spec, $output_spec)) {
2102
2
2
                my $type = _get_dominant_type($output_spec);
2103
2104
2
3
                if ($type eq 'number' || $type eq 'integer' || $type eq 'float') {
2105
2
2
                        push @properties, {
2106                                name => 'numeric_type',
2107                                code => "looks_like_number(\$result)"
2108                        };
2109                }
2110        }
2111
2112        # Property 5: Definedness (unless output can be undef)
2113
2
3
        unless (($output_spec->{type} // '') eq 'undef') {
2114
2
2
                push @properties, {
2115                        name => 'defined',
2116                        code => "defined(\$result)"
2117                };
2118        }
2119
2120
2
3
        return @properties;
2121}
2122
2123 - 2127
=head2 _get_semantic_generators

Returns a hash of built-in semantic generators for common data types.

=cut
2128
2129sub _get_semantic_generators {
2130        return {
2131
14
189
                email => {
2132                        code => q{
2133                                Gen {
2134                                        my $len = 5 + int(rand(10));
2135                                        my @addr;
2136                                        my @tlds = qw(com org net edu gov io co uk de fr);
2137
2138                                        for(my $i = 0; $i < $len; $i++) {
2139                                                push @addr, pack('c', (int(rand 26))+97);
2140                                        }
2141                                        push @addr, '@';
2142                                        $len = 5 + int(rand(10));
2143                                        for(my $i = 0; $i < $len; $i++) {
2144                                                push @addr, pack('c', (int(rand 26))+97);
2145                                        }
2146                                        push @addr, '.';
2147                                        $len = rand($#tlds+1);
2148                                        push @addr, $tlds[$len];
2149                                        return join('', @addr);
2150                                }
2151                        },
2152                        description => 'Valid email addresses',
2153                },
2154                url => {
2155                        code => q{
2156                                Gen {
2157                                        my @schemes = qw(http https);
2158                                        my @tlds = qw(com org net io);
2159                                        my $scheme = $schemes[int(rand(@schemes))];
2160                                        my $domain = join('', map { ('a'..'z')[int(rand(26))] } 1..(5 + int(rand(10))));
2161                                        my $tld = $tlds[int(rand(@tlds))];
2162                                        my $path = join('', map { ('a'..'z', '0'..'9', '-', '_')[int(rand(38))] } 1..int(rand(20)));
2163
2164                                        return "$scheme://$domain.$tld" . ($path ? "/$path" : '');
2165                                }
2166                        },
2167                        description => 'Valid HTTP/HTTPS URLs',
2168                },
2169
2170                uuid => {
2171                        code => q{
2172                                Gen {
2173                                        sprintf('%08x-%04x-%04x-%04x-%012x',
2174                                                int(rand(0xffffffff)),
2175                                                int(rand(0xffff)),
2176                                                (int(rand(0xffff)) & 0x0fff) | 0x4000,
2177                                                (int(rand(0xffff)) & 0x3fff) | 0x8000,
2178                                                int(rand(0x1000000000000))
2179                                        );
2180                                }
2181                        },
2182                        description => 'Valid UUIDv4 identifiers',
2183                },
2184
2185                phone_us => {
2186                        code => q{
2187                                Gen {
2188                                        my $area = 200 + int(rand(800));
2189                                        my $exchange = 200 + int(rand(800));
2190                                        my $subscriber = int(rand(10000));
2191                                        sprintf('%03d-%03d-%04d', $area, $exchange, $subscriber);
2192                                }
2193                        },
2194                        description => 'US phone numbers (XXX-XXX-XXXX format)',
2195                },
2196
2197                phone_e164 => {
2198                        code => q{
2199                                Gen {
2200                                        my $country = 1 + int(rand(999));
2201                                        my $area = 100 + int(rand(900));
2202                                        my $number = int(rand(10000000));
2203                                        sprintf('+%d%03d%07d', $country, $area, $number);
2204                                }
2205                        },
2206                        description => 'E.164 international phone numbers',
2207                },
2208
2209                ipv4 => {
2210                        code => q{
2211                                Gen {
2212                                        join('.', map { int(rand(256)) } 1..4);
2213                                }
2214                        },
2215                        description => 'IPv4 addresses',
2216                },
2217
2218                ipv6 => {
2219                        code => q{
2220                                Gen {
2221                                        join(':', map { sprintf('%04x', int(rand(0x10000))) } 1..8);
2222                                }
2223                        },
2224                        description => 'IPv6 addresses',
2225                },
2226
2227                username => {
2228                        code => q{
2229                                Gen {
2230                                        my $len = 3 + int(rand(13));
2231                                        my @chars = ('a'..'z', '0'..'9', '_', '-');
2232                                        my $first = ('a'..'z')[int(rand(26))];
2233                                        $first . join('', map { $chars[int(rand(@chars))] } 1..($len-1));
2234                                }
2235                        },
2236                        description => 'Valid usernames (alphanumeric with _ and -)',
2237                },
2238
2239                slug => {
2240                        code => q{
2241                                Gen {
2242                                        my @words = qw(quick brown fox jumps over lazy dog hello world test data);
2243                                        my $count = 1 + int(rand(4));
2244                                        join('-', map { $words[int(rand(@words))] } 1..$count);
2245                                }
2246                        },
2247                        description => 'URL slugs (lowercase words separated by hyphens)',
2248                },
2249
2250                hex_color => {
2251                        code => q{
2252                                Gen {
2253                                        sprintf('#%06x', int(rand(0x1000000)));
2254                                }
2255                        },
2256                        description => 'Hex color codes (#RRGGBB)',
2257                },
2258
2259                iso_date => {
2260                        code => q{
2261                                Gen {
2262                                        my $year = 2000 + int(rand(25));
2263                                        my $month = 1 + int(rand(12));
2264                                        my $day = 1 + int(rand(28));
2265                                        sprintf('%04d-%02d-%02d', $year, $month, $day);
2266                                }
2267                        },
2268                        description => 'ISO 8601 date format (YYYY-MM-DD)',
2269                },
2270
2271                iso_datetime => {
2272                        code => q{
2273                                Gen {
2274                                        my $year = 2000 + int(rand(25));
2275                                        my $month = 1 + int(rand(12));
2276                                        my $day = 1 + int(rand(28));
2277                                        my $hour = int(rand(24));
2278                                        my $minute = int(rand(60));
2279                                        my $second = int(rand(60));
2280                                        sprintf('%04d-%02d-%02dT%02d:%02d:%02dZ',
2281                                                $year, $month, $day, $hour, $minute, $second);
2282                                }
2283                        },
2284                        description => 'ISO 8601 datetime format (YYYY-MM-DDTHH:MM:SSZ)',
2285                },
2286
2287                semver => {
2288                        code => q{
2289                                Gen {
2290                                        my $major = int(rand(10));
2291                                        my $minor = int(rand(20));
2292                                        my $patch = int(rand(50));
2293                                        "$major.$minor.$patch";
2294                                }
2295                        },
2296                        description => 'Semantic version strings (major.minor.patch)',
2297                },
2298
2299                jwt => {
2300                        code => q{
2301                                Gen {
2302                                        my @chars = ('A'..'Z', 'a'..'z', '0'..'9', '-', '_');
2303                                        my $header = join('', map { $chars[int(rand(@chars))] } 1..20);
2304                                        my $payload = join('', map { $chars[int(rand(@chars))] } 1..40);
2305                                        my $signature = join('', map { $chars[int(rand(@chars))] } 1..30);
2306                                        "$header.$payload.$signature";
2307                                }
2308                        },
2309                        description => 'JWT-like tokens (base64url format)',
2310                },
2311
2312                json => {
2313                        code => q{
2314                                Gen {
2315                                        my @keys = qw(id name value status count);
2316                                        my $key = $keys[int(rand(@keys))];
2317                                        my $value = 1 + int(rand(1000));
2318                                        qq({"$key":$value});
2319                                }
2320                        },
2321                        description => 'Simple JSON objects',
2322                },
2323
2324                base64 => {
2325                        code => q{
2326                                Gen {
2327                                        my @chars = ('A'..'Z', 'a'..'z', '0'..'9', '+', '/');
2328                                        my $len = 12 + int(rand(20));
2329                                        my $str = join('', map { $chars[int(rand(@chars))] } 1..$len);
2330                                        $str .= '=' x (4 - ($len % 4)) if $len % 4;
2331                                        $str;
2332                                }
2333                        },
2334                        description => 'Base64-encoded strings',
2335                },
2336
2337                md5 => {
2338                        code => q{
2339                                Gen {
2340                                        join('', map { sprintf('%x', int(rand(16))) } 1..32);
2341                                }
2342                        },
2343                        description => 'MD5 hashes (32 hex characters)',
2344                },
2345
2346                sha256 => {
2347                        code => q{
2348                                Gen {
2349                                        join('', map { sprintf('%x', int(rand(16))) } 1..64);
2350                                }
2351                        },
2352                        description => 'SHA-256 hashes (64 hex characters)',
2353                },
2354        };
2355}
2356
2357 - 2361
=head2 _get_builtin_properties

Returns a hash of built-in property templates that can be applied to transforms.

=cut
2362
2363sub _get_builtin_properties {
2364        return {
2365                idempotent => {
2366                        description => 'Function is idempotent: f(f(x)) == f(x)',
2367                        code_template => sub {
2368
0
0
                                my ($function, $call_code, $input_vars) = @_;
2369                                # Use string comparison - works for all types in Perl
2370
0
0
                                return "do { my \$tmp = $call_code; \$result eq \$tmp }";
2371                        },
2372                        applicable_to => ['all'],
2373                },
2374
2375                non_negative => {
2376                        description => 'Result is always non-negative',
2377                        code_template => sub {
2378
0
0
                                my ($function, $call_code, $input_vars) = @_;
2379
0
0
                                return '$result >= 0';
2380                        },
2381                        applicable_to => ['number', 'integer', 'float'],
2382                },
2383
2384                positive => {
2385                        description => 'Result is always positive (> 0)',
2386                        code_template => sub {
2387
0
0
                                my ($function, $call_code, $input_vars) = @_;
2388
0
0
                                return '$result > 0';
2389                        },
2390                        applicable_to => ['number', 'integer', 'float'],
2391                },
2392
2393                non_empty => {
2394                        description => 'Result is never empty',
2395                        code_template => sub {
2396
0
0
                                my ($function, $call_code, $input_vars) = @_;
2397
0
0
                                return 'length($result) > 0';
2398                        },
2399                        applicable_to => ['string'],
2400                },
2401
2402                length_preserved => {
2403                        description => 'Output length equals input length',
2404                        code_template => sub {
2405
0
0
                                my ($function, $call_code, $input_vars) = @_;
2406
0
0
                                my $first_var = $input_vars->[0];
2407
0
0
                                return "length(\$result) == length(\$$first_var)";
2408                        },
2409                        applicable_to => ['string'],
2410                },
2411
2412                uppercase => {
2413                        description => 'Result is all uppercase',
2414                        code_template => sub {
2415
0
0
                                my ($function, $call_code, $input_vars) = @_;
2416
0
0
                                return '$result eq uc($result)';
2417                        },
2418                        applicable_to => ['string'],
2419                },
2420
2421                lowercase => {
2422                        description => 'Result is all lowercase',
2423                        code_template => sub {
2424
0
0
                                my ($function, $call_code, $input_vars) = @_;
2425
0
0
                                return '$result eq lc($result)';
2426                        },
2427                        applicable_to => ['string'],
2428                },
2429
2430                trimmed => {
2431                        description => 'Result has no leading/trailing whitespace',
2432                        code_template => sub {
2433
0
0
                                my ($function, $call_code, $input_vars) = @_;
2434
0
0
                                return '$result !~ /^\s/ && $result !~ /\s$/';
2435                        },
2436                        applicable_to => ['string'],
2437                },
2438
2439                sorted_ascending => {
2440                        description => 'Array is sorted in ascending order',
2441                        code_template => sub {
2442
0
0
                                my ($function, $call_code, $input_vars) = @_;
2443
0
0
                                return 'do { my @arr = @$result; my $sorted = 1; for my $i (1..$#arr) { $sorted = 0 if $arr[$i] < $arr[$i-1]; } $sorted }';
2444                        },
2445                        applicable_to => ['arrayref'],
2446                },
2447
2448                sorted_descending => {
2449                        description => 'Array is sorted in descending order',
2450                        code_template => sub {
2451
0
0
                                my ($function, $call_code, $input_vars) = @_;
2452
0
0
                                return 'do { my @arr = @$result; my $sorted = 1; for my $i (1..$#arr) { $sorted = 0 if $arr[$i] > $arr[$i-1]; } $sorted }';
2453                        },
2454                        applicable_to => ['arrayref'],
2455                },
2456
2457                unique_elements => {
2458                        description => 'Array has no duplicate elements',
2459                        code_template => sub {
2460
0
0
                                my ($function, $call_code, $input_vars) = @_;
2461
0
0
                                return 'do { my @arr = @$result; my %seen; !grep { $seen{$_}++ } @arr }';
2462                        },
2463                        applicable_to => ['arrayref'],
2464                },
2465
2466                preserves_keys => {
2467                        description => 'Hash has same keys as input',
2468                        code_template => sub {
2469
0
0
                                my ($function, $call_code, $input_vars) = @_;
2470
0
0
                                my $first_var = $input_vars->[0];
2471
0
0
                                return 'do { my @in = sort keys %{$' . $first_var . '}; my @out = sort keys %$result; join(",", @in) eq join(",", @out) }';
2472                        },
2473                        applicable_to => ['hashref'],
2474                },
2475
2476                monotonic_increasing => {
2477                        description => 'For x <= y, f(x) <= f(y)',
2478                        code_template => sub {
2479
0
0
                                my ($function, $call_code, $input_vars) = @_;
2480                                # This would need multiple inputs - complex
2481
0
0
                                return '1';     # Placeholder
2482                        },
2483
1
34
                        applicable_to => ['number', 'integer'],
2484                },
2485        };
2486}
2487
2488 - 2492
=head2 _schema_to_lectrotest_generator

Converts a schema field spec to a LectroTest generator string.

=cut
2493
2494sub _schema_to_lectrotest_generator {
2495
2
1
        my ($field_name, $spec) = @_;
2496
2497
2
3
        my $type = $spec->{type} || 'string';
2498
2499        # Check for semantic generator first
2500
2
2
        if ($type eq 'string' && defined $spec->{semantic}) {
2501
0
0
                my $semantic_type = $spec->{semantic};
2502
0
0
                my $generators = _get_semantic_generators();
2503
2504
0
0
                if (exists $generators->{$semantic_type}) {
2505
0
0
                        my $gen_code = $generators->{$semantic_type}{code};
2506                        # Remove leading/trailing whitespace and compress
2507
0
0
                        $gen_code =~ s/^\s+//;
2508
0
0
                        $gen_code =~ s/\s+$//;
2509
0
0
                        $gen_code =~ s/\n\s+/ /g;
2510
0
0
                        return "$field_name <- $gen_code";
2511                } else {
2512
0
0
                        carp "Unknown semantic type '$semantic_type', falling back to regular string generator";
2513                        # Fall through to regular string generation
2514                }
2515        }
2516
2517
2
5
        if ($type eq 'integer') {
2518
0
0
                my $min = $spec->{min};
2519
0
0
                my $max = $spec->{max};
2520
2521
0
0
                if (!defined($min) && !defined($max)) {
2522
0
0
                        return "$field_name <- Int";
2523                } elsif (!defined($min)) {
2524
0
0
                        return "$field_name <- Int(sized => sub { int(rand($max + 1)) })";
2525                } elsif (!defined($max)) {
2526
0
0
                        return "$field_name <- Int(sized => sub { $min + int(rand(1000)) })";
2527                } else {
2528
0
0
                        my $range = $max - $min;
2529
0
0
                        return "$field_name <- Int(sized => sub { $min + int(rand($range + 1)) })";
2530                }
2531        }
2532        elsif ($type eq 'number' || $type eq 'float') {
2533
2
2
                my $min = $spec->{min};
2534
2
2
                my $max = $spec->{max};
2535
2536
2
3
                if (!defined($min) && !defined($max)) {
2537                        # No constraints - full range
2538
0
0
                        return "$field_name <- Float(sized => sub { rand(1000) - 500 })";
2539                } elsif (!defined($min)) {
2540                        # Only max defined
2541
1
1
                        if ($max == 0) {
2542                                # max=0 means negative numbers only
2543
1
12
                                return "$field_name <- Float(sized => sub { -rand(1000) })";
2544                        } elsif ($max > 0) {
2545                                # Positive max, generate 0 to max
2546
0
0
                                return "$field_name <- Float(sized => sub { rand($max) })";
2547                        } else {
2548                                # Negative max, generate from some negative to max
2549
0
0
                                return "$field_name <- Float(sized => sub { ($max - 1000) + rand(1000 + $max) })";
2550                        }
2551                } elsif (!defined($max)) {
2552                        # Only min defined
2553
1
1
                        if ($min == 0) {
2554                                # min=0 means positive numbers only
2555
1
4
                                return "$field_name <- Float(sized => sub { rand(1000) })";
2556                        } elsif ($min > 0) {
2557                                # Positive min
2558
0
0
                                return "$field_name <- Float(sized => sub { $min + rand(1000) })";
2559                        } else {
2560                                # Negative min
2561
0
0
                                return "$field_name <- Float(sized => sub { $min + rand(-$min + 1000) })";
2562                        }
2563                } else {
2564                        # Both min and max defined
2565
0
0
                        my $range = $max - $min;
2566
0
0
                        if ($range <= 0) {
2567
0
0
                                carp "Invalid range: min=$min, max=$max";
2568
0
0
                                return "$field_name <- Float(sized => sub { $min })";
2569                        }
2570
0
0
                        return "$field_name <- Float(sized => sub { $min + rand($range) })";
2571                }
2572        }
2573        elsif ($type eq 'string') {
2574
0
0
                my $min_len = $spec->{min} // 0;
2575
0
0
                my $max_len = $spec->{max} // 100;
2576
2577                # Handle regex patterns
2578
0
0
                if (defined $spec->{matches}) {
2579
0
0
                        my $pattern = $spec->{matches};
2580
2581                        # Build generator using Data::Random::String::Matches
2582
0
0
                        if (defined $spec->{max}) {
2583
0
0
                                return "$field_name <- Gen { Data::Random::String::Matches->create_random_string({ regex => qr/$pattern/, length => $spec->{max} }) }";
2584                        } elsif (defined $spec->{min}) {
2585
0
0
                                return "$field_name <- Gen { Data::Random::String::Matches->create_random_string({ regex => qr/$pattern/, length => $spec->{min} }) }";
2586                        } else {
2587
0
0
                                return "$field_name <- Gen { Data::Random::String::Matches->create_random_string({ regex => qr/$pattern/ }) }";
2588                        }
2589                }
2590
2591
0
0
                return "$field_name <- String(length => [$min_len, $max_len])";
2592        } elsif ($type eq 'boolean') {
2593
0
0
                return "$field_name <- Bool";
2594        }
2595        elsif ($type eq 'arrayref') {
2596
0
0
                my $min_size = $spec->{min} // 0;
2597
0
0
                my $max_size = $spec->{max} // 10;
2598
0
0
                return "$field_name <- List(Int, length => [$min_size, $max_size])";
2599        }
2600        elsif ($type eq 'hashref') {
2601                # LectroTest doesn't have built-in Hash, use custom generator
2602
0
0
                my $min_keys = $spec->{min} // 0;
2603
0
0
                my $max_keys = $spec->{max} // 10;
2604
0
0
                return "$field_name <- Elements(map { my \%h; for (1..\$_) { \$h{'key'.\$_} = \$_ }; \\\%h } $min_keys..$max_keys)";
2605        }
2606        else {
2607
0
0
                carp "Unknown type '$type' for LectroTest generator, using String";
2608
0
0
                return "$field_name <- String";
2609        }
2610}
2611
2612 - 2614
=head2 Helper functions for type detection

=cut
2615
2616sub _is_numeric_transform {
2617
2
1
        my ($input_spec, $output_spec) = @_;
2618
2619
2
2
        my $out_type = $output_spec->{type} // '';
2620
2
7
        return $out_type eq 'number' || $out_type eq 'integer' || $out_type eq 'float';
2621}
2622
2623sub _is_string_transform {
2624
2
2
        my ($input_spec, $output_spec) = @_;
2625
2626
2
3
        my $out_type = $output_spec->{type} // '';
2627
2
3
        return $out_type eq 'string';
2628}
2629
2630sub _same_type {
2631
2
1
        my ($input_spec, $output_spec) = @_;
2632
2633        # Simplified - would need more sophisticated logic for multiple inputs
2634
2
5
        my $in_type = _get_dominant_type($input_spec);
2635
2
2
        my $out_type = _get_dominant_type($output_spec);
2636
2637
2
3
        return $in_type eq $out_type;
2638}
2639
2640sub _get_dominant_type {
2641
6
2
        my $spec = $_[0];
2642
2643
6
6
        return $spec->{type} if defined $spec->{type};
2644
2645        # For multi-field specs, return the first type found
2646
2
2
        for my $field (keys %$spec) {
2647
2
5
                next unless ref($spec->{$field}) eq 'HASH';
2648
2
3
                return $spec->{$field}{type} if defined $spec->{$field}{type};
2649        }
2650
2651
0
0
        return 'string';        # Default
2652}
2653
2654sub _has_positions {
2655
2
2
        my $spec = $_[0];
2656
2657
2
1
        for my $field (keys %$spec) {
2658
2
2
                next unless ref($spec->{$field}) eq 'HASH';
2659
2
3
                return 1 if defined $spec->{$field}{position};
2660        }
2661
2662
0
0
        return 0;
2663}
2664
2665 - 2669
=head2 _render_properties

Renders property definitions into Perl code for the template.

=cut
2670
2671sub _render_properties {
2672
1
0
        my $properties = $_[0];
2673
2674
1
1
        my $code = "use_ok('Test::LectroTest::Compat');\n\n";
2675
2676
1
2
        for my $prop (@$properties) {
2677
2
4
                $code .= "# Transform property: $prop->{name}\n";
2678
2
2
                $code .= "my \$$prop->{name} = Property {\n";
2679
2
1
                $code .= "    ##[ $prop->{generator_spec} ]##\n";
2680
2
1
                $code .= "    \n";
2681
2
1
                $code .= "    my \$result = eval { $prop->{call_code} };\n";
2682
2683
2
3
                if ($prop->{should_die}) {
2684
0
0
                        $code .= "    my \$died = defined(\$\@) && \$\@;\n";
2685
0
0
                        $code .= "    \$died;\n";
2686                } else {
2687
2
2
                        $code .= "    my \$error = \$\@;\n";
2688                        # $code .= "    diag(\"\$$prop->{name} -> \$error; \") if(\$ENV{'TEST_VERBOSE'});\n";
2689
2
1
                        $code .= "    \n";
2690
2
1
                        $code .= "    !\$error && (\n";
2691
2
1
                        $code .= "        $prop->{property_checks}\n";
2692
2
2
                        $code .= "    );\n";
2693                }
2694
2695
2
1
                $code .= "}, name => '$prop->{name}', trials => $prop->{trials};\n\n";
2696
2697
2
2
                $code .= "holds(\$$prop->{name});\n";
2698        }
2699
2700
1
4
        return $code;
2701}
2702
27031;
2704
2705 - 2740
=head1 NOTES

C<seed> and C<iterations> really should be within C<config>.

=head1 SEE ALSO

=over 4

=item * L<https://nigelhorne.github.io/App-Test-Generator/coverage/>: Test Coverage Report

=item * L<App::Test::Generator::Template> - Template of the file of tests created by C<App::Test::Generator>

=item * L<App::Test::Generator::SchemaExtractor> - Project to create schemas from Perl programs

=item * L<Params::Validate::Strict>: Schema Definition

=item * L<Params::Get>: Input validation

=item * L<Return::Set>: Output validation

=item * L<Test::LectroTest>

=item * L<Test::Most>

=item * L<YAML::XS>

=back

=head1 AUTHOR

Nigel Horne, C<< <njh at nigelhorne.com> >>

Portions of this module's initial design and documentation were created with the
assistance of AI.

=cut
2741