File Coverage

File:blib/lib/CGI/Info.pm
Coverage:84.9%

linestmtbrancondsubtimecode
1package CGI::Info;
2
3# TODO: remove the expect argument
4# TODO: look into params::check or params::validate
5
6
47
47
47
2150294
40
1003
use warnings;
7
47
47
47
63
27
344
use strict;
8
9
47
47
47
6953
19991
65
use boolean;
10
47
47
47
1285
44
919
use Carp;
11
47
47
47
10571
2077795
644
use Object::Configure 0.19;
12
47
47
47
127
30
478
use File::Spec;
13
47
47
47
68
212
372
use Log::Abstraction 0.10;
14
47
47
47
86
187
647
use Params::Get 0.13;
15
47
47
47
70
196
454
use Params::Validate::Strict 0.21;
16
47
47
47
10867
114061
1020
use Net::CIDR;
17
47
47
47
99
25
484
use Return::Set;
18
47
47
47
63
35
465
use Scalar::Util;
19
47
47
47
75
46
7988
use Socket;     # For AF_INET
20
47
47
352
57
use 5.008;
21# use Cwd;
22# use JSON::Parse;
23
47
47
47
108
63
324
use List::Util ();      # Can go when expect goes
24# use Sub::Private;
25
47
47
47
9112
578606
582
use Sys::Path;
26
27
47
47
47
8845
267189
127
use namespace::clean;
28
29sub _sanitise_input($);
30
31 - 39
=head1 NAME

CGI::Info - Information about the CGI environment

=head1 VERSION

Version 1.13

=cut
40
41our $VERSION = '1.13';
42
43 - 144
=head1 SYNOPSIS

The C<CGI::Info> module is a Perl library designed to provide information about the environment in which a CGI script operates.
It aims to eliminate hard-coded script details,
enhancing code readability and portability.
Additionally, it offers a simple web application firewall to add a layer of security.

All too often,
Perl programs have information such as the script's name
hard-coded into their source.
Generally speaking,
hard-coding is a bad style since it can make programs difficult to read and reduces readability and portability.
CGI::Info attempts to remove that.

Furthermore, to aid script debugging, CGI::Info attempts to do sensible
things when you're not running the program in a CGI environment.

Whilst you shouldn't rely on it alone to provide security to your website,
it is another layer and every little helps.

    use CGI::Info;

    my $info = CGI::Info->new(allow => { id => qr/^\d+$/ });
    my $params = $info->params();

    if($info->is_mobile()) {
        print "Mobile view\n";
    } else {
        print "Desktop view\n";
    }

    my $id = $info->param('id');     # Validated against allow schema

=head1 SUBROUTINES/METHODS

=head2 new

Creates a CGI::Info object.

It takes four optional arguments: allow, logger, expect and upload_dir,
which are documented in the params() method.

It takes other optional parameters:

=over 4

=item * C<auto_load>

Enable/disable the AUTOLOAD feature.
The default is to have it enabled.

=item * C<config_dirs>

Where to look for C<config_file>

=item * C<config_file>

Points to a configuration file which contains the parameters to C<new()>.
The file can be in any common format,
including C<YAML>, C<XML>, and C<INI>.
This allows the parameters to be set at run time.

On non-Windows system,
the class can be configured using environment variables starting with "CGI::Info::".
For example:

  export CGI::Info::max_upload_size=65536

It doesn't work on Windows because of the case-insensitive nature of that system.

If the configuration file has a section called C<CGI::Info>,
only that section,
and the C<global> section,
if any exists,
is used.

=item * C<syslog>

Takes an optional parameter syslog, to log messages to
L<Sys::Syslog>.
It can be a boolean to enable/disable logging to syslog, or a reference
to a hash to be given to Sys::Syslog::setlogsock.

=item * C<cache>

An object that is used to cache IP lookups.
This cache object is an object that understands get() and set() messages,
such as a L<CHI> object.

=item * C<max_upload_size>

The maximum file size you can upload (-1 for no limit), the default is 512MB.

=back

The class can be configured at runtime using environments and configuration files,
for example,
setting C<$ENV{'CGI__INFO__carp_on_warn'}> causes warnings to use L<Carp>.
For more information about configuring object constructors at runtime,
see L<Object::Configure>.

=cut
145
146our $stdin_data;        # Class variable storing STDIN in case the class
147                        # is instantiated more than once
148
149sub new
150{
151
676
2083578
        my $class = shift;
152
153        # Handle hash or hashref arguments
154
676
935
        my $params = Params::Get::get_params(undef, @_);
155
156
675
6670
        if(!defined($class)) {
157
3
4
                if(defined($params)) {
158                        # Using CGI::Info:new(), not CGI::Info->new()
159
1
54
                        croak(__PACKAGE__, ' use ->new() not ::new() to instantiate');
160                }
161
162                # FIXME: this only works when no arguments are given
163
2
2
                $class = __PACKAGE__;
164        } elsif(Scalar::Util::blessed($class)) {
165                # If $class is an object, clone it with new arguments
166
8
35
                $params ||= {};
167
8
8
8
7
8
27
                return bless { %{$class}, %{$params} }, ref($class);
168        }
169
170        # Load the configuration from a config file, if provided
171
666
924
        $params = Object::Configure::configure($class, $params);
172
173        # Validate logger object has required methods
174
665
1914274
        if(defined $params->{'logger'}) {
175
665
4193
                unless(Scalar::Util::blessed($params->{'logger'}) && $params->{'logger'}->can('warn') && $params->{'logger'}->can('info') && $params->{'logger'}->can('error')) {
176
0
0
                        Carp::croak("Logger must be an object with info() and error() methods");
177                }
178        }
179
180
665
1421
        if(defined($params->{'expect'})) {
181                # if(ref($params->{expect}) ne 'ARRAY') {
182                        # Carp::croak(__PACKAGE__, ': expect must be a reference to an array');
183                # }
184                # # warn __PACKAGE__, ': expect is deprecated, use allow instead';
185
4
10
                if(my $logger = $params->{'logger'}) {
186
4
11
                        $logger->error("$class: expect has been deprecated, use allow instead");
187                }
188
4
1529
                Carp::croak("$class: expect has been deprecated, use allow instead");
189        }
190
191        # Return the blessed object
192        return bless {
193                max_upload_size => 512 * 1024,
194                allow => undef,
195                upload_dir => undef,
196
661
661
454
1749
                %{$params}      # Overwrite defaults with given arguments
197        }, $class;
198}
199
200 - 227
=head2 script_name

Retrieves the name of the executing CGI script.
This is useful for POSTing,
thus avoiding hard-coded paths into forms.

        use CGI::Info;

        my $info = CGI::Info->new();
        my $script_name = $info->script_name();
        # ...
        print "<form method=\"POST\" action=$script_name name=\"my_form\">\n";

=head3 API SPECIFICATION

=head4 INPUT

None.

=head4 OUTPUT

  {
    type => 'string',
    'min' => 1,
    'nomatch' => qr/^[\/\\]/ # Does not return absolute path
  }

=cut
228
229sub script_name
230{
231
34
948
        my $self = shift;
232
233
34
46
        unless($self->{script_name}) {
234
24
30
                $self->_find_paths();
235        }
236
34
85
        return $self->{script_name};
237}
238
239sub _find_paths {
240
51
42
        my $self = shift;
241
242
51
112
        if(!UNIVERSAL::isa((caller)[0], __PACKAGE__)) {
243
0
0
                Carp::croak('Illegal Operation: This method can only be called by a subclass or ourself');
244        }
245
246
51
322
        $self->_trace(__PACKAGE__ . ': entering _find_paths');
247
248
51
781
        require File::Basename && File::Basename->import() unless File::Basename->can('basename');
249
250        # Determine script name
251
51
643
        my $script_name = $self->_get_env('SCRIPT_NAME') // $0;
252
51
772
        $self->{script_name} = $self->_untaint_filename({
253                filename => File::Basename::basename($script_name)
254        });
255
256        # Determine script path
257
51
67
        if(my $script_path = $self->_get_env('SCRIPT_FILENAME')) {
258
16
21
                $self->{script_path} = $script_path;
259        } elsif($script_name = $self->_get_env('SCRIPT_NAME')) {
260
19
16
                if(my $document_root = $self->_get_env('DOCUMENT_ROOT')) {
261
8
7
                        $script_name = $self->_get_env('SCRIPT_NAME');
262
263                        # It's usually the case, e.g. /cgi-bin/foo.pl
264
8
13
                        $script_name =~ s{^/}{};
265
266
8
36
                        $self->{script_path} = File::Spec->catfile($document_root, $script_name);
267                } else {
268
11
97
                        if(File::Spec->file_name_is_absolute($script_name) && (-r $script_name)) {
269                                # Called from a command line with a full path
270
1
2
                                $self->{script_path} = $script_name;
271                        } else {
272
10
43
                                require Cwd unless Cwd->can('abs_path');
273
274
10
17
                                if($script_name =~ /^\/(.+)/) {
275                                        # It's usually the case, e.g. /cgi-bin/foo.pl
276
7
6
                                        $script_name = $1;
277                                }
278
279
10
92
                                $self->{script_path} = File::Spec->catfile(Cwd::abs_path(), $script_name);
280                        }
281                }
282        } elsif(File::Spec->file_name_is_absolute($0)) {
283                # Called from a command line with a full path
284
3
5
                $self->{script_path} = $0;
285        } else {
286
13
188
                $self->{script_path} = File::Spec->rel2abs($0);
287        }
288
289        # Untaint and finalize script path
290        $self->{script_path} = $self->_untaint_filename({
291                filename => $self->{script_path}
292
51
93
        });
293}
294
295 - 312
=head2 script_path

Finds the full path name of the script.

        use CGI::Info;

        my $info = CGI::Info->new();
        my $fullname = $info->script_path();
        my @statb = stat($fullname);

        if(@statb) {
                my $mtime = localtime $statb[9];
                print "Last-Modified: $mtime\n";
                # TODO: only for HTTP/1.1 connections
                # $etag = Digest::MD5::md5_hex($html);
                printf "ETag: \"%x\"\n", $statb[9];
        }
=cut
313
314sub script_path {
315
42
3917
        my $self = shift;
316
317
42
64
        unless($self->{script_path}) {
318
18
27
                $self->_find_paths();
319        }
320
42
152
        return $self->{script_path};
321}
322
323 - 337
=head2 script_dir

Returns the file system directory containing the script.

        use CGI::Info;
        use File::Spec;

        my $info = CGI::Info->new();

        print 'HTML files are normally stored in ', $info->script_dir(), '/', File::Spec->updir(), "\n";

        # or
        use lib CGI::Info::script_dir() . '../lib';

=cut
338
339sub script_dir
340{
341
26
52
        my $self = shift;
342
343        # Ensure $self is an object
344
26
35
        $self = __PACKAGE__->new() unless ref $self;
345
346        # Set script path if it is not already defined
347
26
50
        $self->_find_paths() unless $self->{script_path};
348
349        # Extract directory from script path based on OS
350        # Don't use File::Spec->splitpath() since that can leave the trailing slash
351
26
61
        my $dir_regex = $^O eq 'MSWin32' ? qr{(.+)\\.+?$} : qr{(.+)/.+?$};
352
353
26
184
        return $self->{script_path} =~ $dir_regex ? $1 : $self->{script_path};
354}
355
356 - 375
=head2 host_name

Return the host-name of the current web server, according to CGI.
If the name can't be determined from the web server, the system's host-name
is used as a fall back.
This may not be the same as the machine that the CGI script is running on,
some ISPs and other sites run scripts on different machines from those
delivering static content.
There is a good chance that this will be domain_name() prepended with either
'www' or 'cgi'.

        use CGI::Info;

        my $info = CGI::Info->new();
        my $host_name = $info->host_name();
        my $protocol = $info->protocol();
        # ...
        print "Thank you for visiting our <A HREF=\"$protocol://$host_name\">Website!</A>";

=cut
376
377sub host_name {
378
129
858
        my $self = shift;
379
380
129
96
        unless($self->{site}) {
381
21
33
                $self->_find_site_details();
382        }
383
384
129
260
        return $self->{site};
385}
386
387sub _find_site_details
388{
389
48
35
        my $self = shift;
390
391        # Log entry to the routine
392
48
63
        $self->_trace('Entering _find_site_details');
393
394
48
519
        return if $self->{site} && $self->{cgi_site};
395
396        # Determine cgi_site using environment variables or hostname
397
42
116
        if (my $host = ($ENV{'HTTP_HOST'} || $ENV{'SERVER_NAME'} || $ENV{'SSL_TLS_SNI'})) {
398                # Import necessary module
399
33
1482
                        require URI::Heuristic unless URI::Heuristic->can('uf_uristr');
400
401
33
5702
                $self->{cgi_site} = URI::Heuristic::uf_uristr($host);
402                # Remove trailing dots from the name.  They are legal in URLs
403                # and some sites link using them to avoid spoofing (nice)
404
33
486
                $self->{cgi_site} =~ s/\.+$//;       # Trim trailing dots
405
406
33
68
                if($ENV{'SERVER_NAME'} && ($host eq $ENV{'SERVER_NAME'}) && (my $protocol = $self->protocol()) && $self->protocol() ne 'http') {
407
2
3
                        $self->{cgi_site} =~ s/^http/$protocol/;
408                }
409        } else {
410                # Import necessary module
411
9
50
                require Sys::Hostname unless Sys::Hostname->can('hostname');
412
413
9
19
                $self->_debug('Falling back to using hostname');
414
9
110
                $self->{cgi_site} = Sys::Hostname::hostname();
415        }
416
417        # Set site details if not already defined
418
42
138
        $self->{site} ||= $self->{cgi_site};
419
42
82
        $self->{site} =~ s/^https?:\/\/(.+)/$1/;
420        $self->{cgi_site} = ($self->protocol() || 'http') . '://' . $self->{cgi_site}
421
42
99
                unless $self->{cgi_site} =~ /^https?:\/\//;
422
423        # Warn if site details could not be determined
424
42
96
        $self->_warn('Could not determine site name') unless($self->{site} && $self->{cgi_site});
425
426        # Log exit
427
42
48
        $self->_trace('Leaving _find_site_details');
428}
429
430 - 437
=head2 domain_name

Domain_name is the name of the controlling domain for this website.
Usually it will be similar to host_name, but will lack the http:// or www prefixes.

Can be called as a class method.

=cut
438
439sub domain_name {
440
21
187
        my $self = shift;
441
442
21
27
        if(!ref($self)) {
443
3
16
                $self = __PACKAGE__->new();
444        }
445
21
34
        return $self->{domain} if $self->{domain};
446
447
16
24
        $self->_find_site_details();
448
449
16
84
        if(my $site = $self->{site}) {
450
16
36
                $self->{domain} = ($site =~ /^www\.(.+)/) ? $1 : $site;
451        }
452
453
16
30
        return $self->{domain};
454}
455
456 - 460
=head2 cgi_host_url

Return the URL of the machine running the CGI script.

=cut
461
462sub cgi_host_url {
463
17
47
        my $self = shift;
464
465
17
23
        unless($self->{cgi_site}) {
466
11
14
                $self->_find_site_details();
467        }
468
469
17
106
        return $self->{cgi_site};
470}
471
472 - 627
=head2 params

Returns a reference to a hash list of the CGI arguments.

CGI::Info helps you to test your script before deployment on a website:
if it is not in a CGI environment (e.g., the script is being tested from the
command line), the program's command line arguments (a list of key=value pairs)
are used, if there are no command line arguments,
then they are read from stdin as a list of key=value lines.
Also,
you can give one of --tablet, --search-engine,
--mobile and --robot to mimic those agents. For example:

        ./script.cgi --mobile name=Nigel

Returns undef if the parameters can't be determined or if none were given.

If an argument is given twice or more, then the values are put in a comma
separated string.

The returned hash value can be passed into L<CGI::Untaint>.

Takes four optional parameters: allow, logger and upload_dir.
The parameters are passed in a hash, or a reference to a hash.
The latter is more efficient since it puts less on the stack.

Allow is a reference to a hash list of CGI parameters that you will allow.
The value for each entry is either a permitted value,
a regular expression of permitted values for
the key,
a code reference,
or a hash of L<Params::Validate::Strict> rules.
Subroutine exceptions propagate normally, allowing custom error handling.
This works alongside existing regex and Params::Validate::Strict patterns.
A undef value means that any value will be allowed.
Arguments not in the list are silently ignored.
This is useful to help to block attacks on your site.

Upload_dir is a string containing a directory where files being uploaded are to
be stored.
It must be a writeable directory in the temporary area.

Takes an optional parameter logger, which is used for warnings and traces.
It can be an object that understands warn() and trace() messages,
such as a L<Log::Log4perl> or L<Log::Any> object,
a reference to code,
a reference to an array,
or a filename.

The allow, logger and upload_dir arguments can also be passed to the
constructor.

        use CGI::Info;
        use CGI::Untaint;
        # ...
        my $info = CGI::Info->new();
        my %params;
        if($info->params()) {
                %params = %{$info->params()};
        }
        # ...
        foreach(keys %params) {
                print "$_ => $params{$_}\n";
        }
        my $u = CGI::Untaint->new(%params);

        use CGI::Info;
        use CGI::IDS;
        # ...
        my $info = CGI::Info->new();
        my $allowed = {
                foo => qr/^\d*$/,    # foo must be a number, or empty
                bar => undef,                # bar can be given and be any value
                xyzzy => qr/^[\w\s-]+$/,     # must be alphanumeric
                                                # to prevent XSS, and non-empty
                                                # as a sanity check
        };
        # or
        $allowed = {
                email => { type => 'string', matches => qr/^[^@]+@[^@]+\.[^@]+$/ }, # String, basic email format check
                age => { type => 'integer', min => 0, max => 150 }, # Integer between 0 and 150
                bio => { type => 'string', optional => 1 }, # String, optional
                ip_address => { type => 'string', matches => qr/^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$/ }, #Basic IPv4 validation
        };
        my $paramsref = $info->params(allow => $allowed);
        if(defined($paramsref)) {
                my $ids = CGI::IDS->new();
                $ids->set_scan_keys(scan_keys => 1);
                if($ids->detect_attacks(request => $paramsref) > 0) {
                        die 'horribly';
                }
        }

If the request is an XML request (i.e. the content type of the POST is text/xml),
CGI::Info will put the request into the params element 'XML', thus:

        use CGI::Info;
        # ...
        my $info = CGI::Info->new();
        my $paramsref = $info->params();     # See BUGS below
        my $xml = $$paramsref{'XML'};
        # ... parse and process the XML request in $xml

Carp if logger is not set and we detect something serious.

Blocks some attacks,
such as SQL and XSS injections,
mustleak and directory traversals,
thus creating a primitive web application firewall (WAF).
Warning - this is an extra layer, not a replacement for your other security layers.

=head3 Validation Subroutine Support

The C<allow> parameter accepts subroutine references for dynamic validation,
enabling complex parameter checks beyond static regex patterns.
These callbacks:

=over 4

=item * Receive three arguments: the parameter key, value and the C<CGI::Info> instance

=item * Must return a true value to allow the parameter, false to reject

=item * Can access other parameters through the instance for contextual validation

=back

Basic usage:

    CGI::Info->new(
        allow => {
            # Simple value check
            even_number => sub { ($_[1] % 2) == 0 },

            # Context-aware validation
            child_age => sub {
                my ($key, $value, $info) = @_;
                $info->param('is_parent') ? $value <= 18 : 0
            }
        }
    );

Advanced features:

    # Combine with regex validation
    mixed_validation => {
        email => qr/@/,  # Regex check
        promo_code => \&validate_promo_code  # Subroutine check
    }

    # Throw custom exceptions
    dangerous_param => sub {
        die 'Hacking attempt!' if $_[1] =~ /DROP TABLE/;
        return 1;
    }
=cut
628
629sub params {
630
1046
6498
        my $self = shift;
631
632
1046
955
        my $params = Params::Get::get_params(undef, @_);
633
634
1046
7781
        if((defined($self->{paramref})) && ((!defined($params->{'allow'})) || defined($self->{allow}) && ($params->{'allow'} eq $self->{allow}))) {
635
181
175
                return $self->{paramref};
636        }
637
638
865
840
        if(defined($params->{allow})) {
639
55
45
                $self->{allow} = $params->{allow};
640        }
641        # if(defined($params->{expect})) {
642                # if(ref($params->{expect}) eq 'ARRAY') {
643                        # $self->{expect} = $params->{expect};
644                        # $self->_warn('expect is deprecated, use allow instead');
645                # } else {
646                        # $self->_warn('expect must be a reference to an array');
647                # }
648        # }
649
865
731
        if(defined($params->{upload_dir})) {
650
4
5
                $self->{upload_dir} = $params->{upload_dir};
651        }
652
865
762
        if(defined($params->{'logger'})) {
653
2
3
                $self->set_logger($params->{'logger'});
654        }
655
865
1061
        $self->_trace('Entering params');
656
657
865
14545
        my @pairs;
658
865
725
        my $content_type = $ENV{'CONTENT_TYPE'};
659
865
491
        my %FORM;
660
661
865
1511
        if((!$ENV{'GATEWAY_INTERFACE'}) || (!$ENV{'REQUEST_METHOD'})) {
662                # require IO::Interactive;
663                # IO::Interactive->import();
664
665
557
552
                if(@ARGV) {
666
19
18
                        @pairs = @ARGV;
667
19
23
                        if(defined($pairs[0])) {
668
19
42
                                if($pairs[0] eq '--robot') {
669
3
3
                                        $self->{is_robot} = 1;
670
3
3
                                        shift @pairs;
671                                } elsif($pairs[0] eq '--mobile') {
672
4
5
                                        $self->{is_mobile} = 1;
673
4
5
                                        shift @pairs;
674                                } elsif($pairs[0] eq '--search-engine') {
675
3
5
                                        $self->{is_search_engine} = 1;
676
3
3
                                        shift @pairs;
677                                } elsif($pairs[0] eq '--tablet') {
678
3
4
                                        $self->{is_tablet} = 1;
679
3
3
                                        shift @pairs;
680                                }
681                        }
682                } elsif($stdin_data) {
683
0
0
                        @pairs = split(/\n/, $stdin_data);
684                # } elsif(IO::Interactive::is_interactive() && !$self->{args_read}) {
685
0
0
                } elsif(0) {
686                        # TODO:  Do I really need this anymore?
687                        my $oldfh = select(STDOUT);
688                        print "Entering debug mode\n",
689                                "Enter key=value pairs - end with quit\n";
690                        select($oldfh);
691
692                        # Avoid prompting for the arguments more than once
693                        # if just 'quit' is entered
694                        $self->{args_read} = 1;
695
696                        while(<STDIN>) {
697                                chop(my $line = $_);
698                                $line =~ s/[\r\n]//g;
699                                last if $line eq 'quit';
700                                push(@pairs, $line);
701                                $stdin_data .= "$line\n";
702                        }
703                }
704        } elsif(($ENV{'REQUEST_METHOD'} eq 'GET') || ($ENV{'REQUEST_METHOD'} eq 'HEAD')) {
705
244
278
                if(my $query = $ENV{'QUERY_STRING'}) {
706
236
238
                        if((defined($content_type)) && ($content_type =~ /multipart\/form-data/i)) {
707
2
2
                                if($ENV{'REMOTE_ADDR'}) {
708
1
6
                                        $self->_warn({ warning => "$ENV{REMOTE_ADDR}: Multipart/form-data not supported for GET (query string = $query)" });
709                                } else {
710
1
2
                                        $self->_warn('Multipart/form-data not supported for GET');
711                                }
712
1
2
                                $self->status(501);  # Not implemented
713
1
2
                                return;
714                        }
715
234
208
                        $query =~ s/\\u0026/\&/g;
716
234
307
                        @pairs = split(/&/, $query);
717                } else {
718
8
20
                        return;
719                }
720        } elsif($ENV{'REQUEST_METHOD'} eq 'POST') {
721
54
75
                my $content_length = $self->_get_env('CONTENT_LENGTH');
722
54
121
                if((!defined($content_length)) || ($content_length =~ /\D/)) {
723
8
10
                        $self->{status} = 411;
724
8
16
                        return;
725                }
726
46
127
                if(($self->{max_upload_size} >= 0) && ($content_length > $self->{max_upload_size})) {       # Set maximum posts
727                        # TODO: Design a way to tell the caller to send HTTP
728                        # status 413
729
7
12
                        $self->{status} = 413;
730
7
11
                        $self->_warn('Large upload prohibited');
731
7
14
                        return;
732                }
733
734
39
140
                if((!defined($content_type)) || ($content_type =~ /application\/x-www-form-urlencoded/)) {
735
12
7
                        my $buffer;
736
12
16
                        if($stdin_data) {
737
8
6
                                $buffer = $stdin_data;
738                        } else {
739
4
13
                                if(read(STDIN, $buffer, $content_length) != $content_length) {
740
1
1
                                        $self->_warn('POST failed: something else may have read STDIN');
741                                }
742
4
5
                                $stdin_data = $buffer;
743                        }
744
12
17
                        @pairs = split(/&/, $buffer);
745
746                        # if($ENV{'QUERY_STRING'}) {
747                                # my @getpairs = split(/&/, $ENV{'QUERY_STRING'});
748                                # push(@pairs, @getpairs);
749                        # }
750                } elsif($content_type =~ /multipart\/form-data/i) {
751
19
31
                        if(!defined($self->{upload_dir})) {
752
2
4
                                if($ENV{'REMOTE_ADDR'}) {
753                                        # This could be an attack
754
1
4
                                        $self->_warn({ warning => "$ENV{REMOTE_ADDR}: Attempt to upload a file of $content_length bytes when upload_dir has not been set" });
755                                } else {
756
1
2
                                        $self->_warn({ warning => 'Attempt to upload a file when upload_dir has not been set' });
757                                }
758
1
3
                                $self->status(501);  # Not implemented
759
1
2
                                return;
760                        }
761
762                        # Validate 'upload_dir'
763                        # Ensure the upload directory is safe and accessible
764                        # - Check permissions
765                        # - Validate path to prevent directory traversal attacks
766                        # TODO: Consider using a temporary directory for uploads and moving them later
767
17
69
                        if(!File::Spec->file_name_is_absolute($self->{upload_dir})) {
768
4
12
                                $self->_warn({
769                                        warning => "upload_dir $self->{upload_dir} isn't a full pathname"
770                                });
771
3
7
                                $self->status(500);
772
3
2
                                delete $self->{upload_dir};
773
3
6
                                return;
774                        }
775
13
84
                        if(!-d $self->{upload_dir}) {
776
4
21
                                $self->_warn({
777                                        warning => "upload_dir $self->{upload_dir} isn't a directory"
778                                });
779
2
5
                                $self->status(500);
780
2
2
                                delete $self->{upload_dir};
781
2
4
                                return;
782                        }
783
9
49
                        if(!-w $self->{upload_dir}) {
784
2
2
                                delete $self->{paramref};
785
2
7
                                $self->_warn({
786                                        warning => "upload_dir $self->{upload_dir} isn't writeable"
787                                });
788
1
4
                                $self->status(500);
789
1
1
                                delete $self->{upload_dir};
790
1
2
                                return;
791                        }
792
7
13
                        my $tmpdir = $self->tmpdir();
793
7
48
                        if($self->{'upload_dir'} !~ /^\Q$tmpdir\E/) {
794                                $self->_warn({
795
1
4
                                        warning => 'upload_dir ' . $self->{'upload_dir'} . " isn't somewhere in the temporary area $tmpdir"
796                                });
797
1
3
                                $self->status(500);
798
1
0
                                delete $self->{upload_dir};
799
1
3
                                return;
800                        }
801
6
13
                        if($content_type =~ /boundary=(\S+)$/) {
802
6
18
                                @pairs = $self->_multipart_data({
803                                        length => $content_length,
804                                        boundary => $1
805                                });
806                        }
807                } elsif($content_type =~ /text\/xml/i) {
808
4
4
                        my $buffer;
809
4
6
                        if($stdin_data) {
810
3
3
                                $buffer = $stdin_data;
811                        } else {
812
1
5
                                if(read(STDIN, $buffer, $content_length) != $content_length) {
813
0
0
                                        $self->_warn({
814                                                warning => 'XML failed: something else may have read STDIN'
815                                        });
816                                }
817
1
1
                                $stdin_data = $buffer;
818                        }
819
820
4
7
                        $FORM{XML} = $buffer;
821
822
4
5
                        $self->{paramref} = \%FORM;
823
824
4
9
                        return \%FORM;
825                } elsif($content_type =~ /application\/json/i) {
826
2
48
                        require JSON::MaybeXS && JSON::MaybeXS->import() unless JSON::MaybeXS->can('parse_json');
827                        # require JSON::MaybeXS;
828                        # JSON::MaybeXS->import();
829
830
2
1
                        my $buffer;
831
832
2
3
                        if($stdin_data) {
833
1
1
                                $buffer = $stdin_data;
834                        } else {
835
1
3
                                if(read(STDIN, $buffer, $content_length) != $content_length) {
836
0
0
                                        $self->_warn({
837                                                warning => 'read failed: something else may have read STDIN'
838                                        });
839                                }
840
1
1
                                $stdin_data = $buffer;
841                        }
842                        # JSON::Parse::assert_valid_json($buffer);
843                        # my $paramref = JSON::Parse::parse_json($buffer);
844
2
13
                        my $paramref = decode_json($buffer);
845
2
2
2
4
                        foreach my $key(keys(%{$paramref})) {
846
4
7
                                push @pairs, "$key=" . $paramref->{$key};
847                        }
848                } else {
849
2
1
                        my $buffer;
850
2
4
                        if($stdin_data) {
851
1
1
                                $buffer = $stdin_data;
852                        } else {
853
1
2
                                if(read(STDIN, $buffer, $content_length) != $content_length) {
854
0
0
                                        $self->_warn({
855                                                warning => 'read failed: something else may have read STDIN'
856                                        });
857                                }
858
1
1
                                $stdin_data = $buffer;
859                        }
860
861
2
6
                        $self->_warn({
862                                warning => "POST: Invalid or unsupported content type: $content_type: $buffer",
863                        });
864                }
865        } elsif($ENV{'REQUEST_METHOD'} eq 'OPTIONS') {
866
3
6
                $self->{status} = 405;
867
3
5
                return;
868        } elsif($ENV{'REQUEST_METHOD'} eq 'DELETE') {
869
4
6
                $self->{status} = 405;
870
4
6
                return;
871        } else {
872                # TODO: Design a way to tell the caller to send HTTP
873                # status 501
874
3
3
                $self->{status} = 501;
875
3
9
                $self->_warn({
876                        warning => 'Use POST, GET or HEAD'
877                });
878        }
879
880
812
724
        unless(scalar @pairs) {
881
548
1422
                return;
882        }
883
884
264
3385
        require String::Clean::XSS;
885
264
72659
        String::Clean::XSS->import();
886        # require String::EscapeCage;
887        # String::EscapeCage->import();
888
889
264
250
        foreach my $arg (@pairs) {
890
930
816
                my($key, $value) = split(/=/, $arg, 2);
891
892
930
680
                next unless($key);
893
894
921
551
                $key =~ s/\0//g;        # Strip encoded NUL byte poison
895
921
479
                $key =~ s/%00//g;       # Strip NUL byte poison
896
921
1
483
3
                $key =~ s/%([a-fA-F\d][a-fA-F\d])/pack("C", hex($1))/eg;
897
921
536
                $key =~ tr/+/ /;
898
921
599
                if(defined($value)) {
899
920
487
                        $value =~ s/%00//g;   # Strip encoded NUL byte poison
900
920
207
532
295
                        $value =~ s/%([a-fA-F\d][a-fA-F\d])/pack("C", hex($1))/eg;   # URL-decode
901
920
450
                        $value =~ tr/+/ /;
902
920
497
                        $value =~ s/\0//g;    # Strip NUL again: %2500 -> %00 -> \0 after second pass
903
920
536
                        $value =~ s/%00//g;   # Strip literal %00 again: catches %2500 -> %00
904                } else {
905
1
1
                        $value = '';
906                }
907
908
921
580
                $key = _sanitise_input($key);
909
910
921
56705
                if($self->{allow}) {
911                        # Is this a permitted argument?
912
160
152
                        if(!exists($self->{allow}->{$key})) {
913
28
58
                                $self->_info("Discard unallowed argument '$key'");
914
28
286
                                $self->status(422);
915
28
23
                                next;   # Skip to the next parameter
916                        }
917
918                        # Do we allow any value, or must it be validated?
919
132
140
                        if(defined(my $schema = $self->{allow}->{$key})) {        # Get the schema for this key
920
116
171
                                if(!ref($schema)) {
921                                        # Can only contain one value
922
8
11
                                        if($value ne $schema) {
923
4
6
                                                $self->_info("Block $key = $value");
924
4
38
                                                $self->status(422);
925
4
4
                                                next;   # Skip to the next parameter
926                                        }
927                                } elsif(ref($schema) eq 'Regexp') {
928
50
120
                                        if($value !~ $schema) {
929                                                # Simple regex
930
25
44
                                                $self->_info("Block $key = $value");
931
25
217
                                                $self->status(422);
932
25
29
                                                next;   # Skip to the next parameter
933                                        }
934                                } elsif(ref($schema) eq 'CODE') {
935
23
28
                                        unless($schema->($key, $value, $self)) {
936
8
126
                                                $self->_info("Block $key = $value");
937
8
59
                                                next;
938                                        }
939                                } else {
940                                        # Set of rules
941
35
23
                                        eval {
942                                                $value = Params::Validate::Strict::validate_strict({
943                                                        schema => { $key => $schema },
944                                                        args => { $key => $value },
945                                                        unknown_parameter_handler => 'die',
946
35
67
                                                        logger => $self->{'logger'}
947                                                });
948                                        };
949
35
9925
                                        if($@) {
950
9
25
                                                $self->_info("Block $key = $value: $@");
951
9
120
                                                $self->status(422);
952
9
11
                                                next;   # Skip to the next parameter
953                                        }
954
26
26
15
22
                                        if(scalar keys %{$value}) {
955
26
20
                                                $value = $value->{$key};
956                                        } else {
957
0
0
                                                $self->_info("Block $key = $value");
958
0
0
                                                $self->status(422);
959
0
0
                                                next;   # Skip to the next parameter
960                                        }
961                                }
962                        }
963                }
964
965                # if($self->{expect} && (List::Util::none { $_ eq $key } @{$self->{expect}})) {
966                        # next;
967                # }
968
845
830
                my $orig_value = $value;
969
845
946
                $value = _sanitise_input($value);
970
845
44397
                if((!defined($ENV{'REQUEST_METHOD'})) || ($ENV{'REQUEST_METHOD'} eq 'GET')) {
971                           # ($value =~ /\/AND\/.++\(SELECT\//) || # United/**/States)/**/AND/**/(SELECT/**/6734/**/FROM/**/(SELECT(SLEEP(5)))lRNi)/**/AND/**/(8984=8984
972                        # From http://www.symantec.com/connect/articles/detection-sql-injection-and-cross-site-scripting-attacks
973                        # Facebook FBCLID can have "--"
974
975                        # Pre-filter: only run quote-based regexes if value contains injection chars.
976
977                        # Compute pre-filter flags from orig_value so quotes stripped by
978                        # convert_XSS don't cause injection patterns to be missed
979
980
814
836
                        my $has_quote  = index($orig_value, "'")    >= 0 || index($orig_value, '%27') >= 0;
981
814
1248
                        my $has_hash   = index($orig_value, '#')    >= 0 || index($orig_value, '%23') >= 0;
982
814
674
                        my $has_equals = index($orig_value, '=')    >= 0 || index($orig_value, '%3D') >= 0;
983
814
732
                        my $has_semi   = index($orig_value, ';')    >= 0 || index($orig_value, '%3B') >= 0;
984
814
412
                        my $has_dash   = index($orig_value, '--')   >= 0;
985
986                        # All WAF patterns run on $orig_value (pre-XSS-sanitisation)
987                        # convert_XSS encodes ', =, < etc. as HTML entities, which would hide
988                        # injection patterns from the WAF if we checked $value instead.
989
814
1061
                        if($has_quote || $has_hash || ($has_equals && $has_dash)) {
990
17
163
                                if(($orig_value =~ /(\%27)|(\')|(\%23)|(\#)/ix) ||
991                                   (($has_equals && ($has_quote || $has_semi || $has_dash)) &&
992                                   $orig_value =~ /((\%3D)|(=))[^-]*+((\%27)|(\')|(\-\-)|(\%3B)|(;))/i) ||
993                                   ($has_quote &&
994                                    $orig_value =~ /\w*((\%27)|(\'))((\%6F)|o|(\%4F))((\%72)|r|(\%52))\s*(OR|AND|UNION|SELECT|--)/ix) ||
995                                    ($has_quote &&
996                                    $orig_value =~ /((\%27)|(\'))union/ix)) {
997
17
27
                                        $self->status(403);
998
17
23
                                        if($ENV{'REMOTE_ADDR'}) {
999
1
2
                                                $self->_warn($ENV{'REMOTE_ADDR'} . ": SQL injection attempt blocked for '$key=$orig_value'");
1000                                        } else {
1001
16
28
                                                $self->_warn("SQL injection attempt blocked for '$key=$orig_value'");
1002                                        }
1003
17
38
                                        return;
1004                                }
1005                        }
1006
1007
797
711
                        my $has_select = index($orig_value, 'SELECT') >= 0 || index($orig_value, 'select') >= 0;
1008
797
408
                        my $has_dump   = index($orig_value, 'var_dump') >= 0;
1009
797
419
                        my $has_exec   = index($orig_value, 'exec') >= 0;
1010
797
393
                        my $has_or  = index($orig_value, ' OR ')  >= 0;
1011
797
395
                        my $has_and = index($orig_value, ' AND ') >= 0;
1012
797
696
                        my $has_slash  = index($orig_value, '/**/') >= 0 || index($orig_value, '/AND/') >= 0;
1013
1014
797
2709
                        if(($has_select && $orig_value =~ /select[[a-z]\s\*]from/ix) ||
1015                           ($has_and    && $orig_value =~ /\sAND\s1=1/ix) ||
1016                           ($has_or && $has_and && $orig_value =~ /\sOR\s.*\sAND\s/) ||
1017                           ($has_slash  && $orig_value =~ /\/\*\*\/ORDER\/\*\*\/BY\/\*\*/ix) ||
1018                           ($has_dump   && $orig_value =~ /var_dump[^m]*+md5/) ||
1019                           ($has_slash  && $has_select && $orig_value =~ /\/AND\/[^(]*+\(SELECT\//) ||
1020                           ($has_exec   && $orig_value =~ /exec(\s|\+)++(s|x)p\w+/ix)) {
1021
8
11
                                $self->status(403);
1022
8
7
                                if($ENV{'REMOTE_ADDR'}) {
1023
0
0
                                        $self->_warn($ENV{'REMOTE_ADDR'} . ": SQL injection attempt blocked for '$key=$orig_value'");
1024                                } else {
1025
8
14
                                        $self->_warn("SQL injection attempt blocked for '$key=$orig_value'");
1026                                }
1027
8
17
                                return;
1028                        }
1029
1030
789
583
                        if(my $agent = $ENV{'HTTP_USER_AGENT'}) {
1031
8
40
                                if(($agent =~ /SELECT.+AND.+/) || ($agent =~ /ORDER BY /) || ($agent =~ / OR NOT /) || ($agent =~ / AND \d+=\d+/) || ($agent =~ /THEN.+ELSE.+END/) || ($agent =~ /.+AND.+SELECT.+/) || ($agent =~ /\sAND\s.+\sAND\s/)) {
1032
2
4
                                        $self->status(403);
1033
2
3
                                        if($ENV{'REMOTE_ADDR'}) {
1034
2
6
                                                $self->_warn($ENV{'REMOTE_ADDR'} . ": SQL injection attempt blocked for '$agent'");
1035                                        } else {
1036
0
0
                                                $self->_warn("SQL injection attempt blocked for '$agent'");
1037                                        }
1038
2
11
                                        return;
1039                                }
1040                        }
1041
1042
787
3716
                        if(($value =~ /((\%3C)|<)((\%2F)|\/)*[a-z0-9\%]+((\%3E)|>)/ix) ||
1043                           ($value =~ /((\%3C)|<)[^\n]+((\%3E)|>)/i) ||
1044                           ($orig_value =~ /((\%3C)|<)((\%2F)|\/)*[a-z0-9\%]+((\%3E)|>)/ix) ||
1045                           ($orig_value =~ /((\%3C)|<)[^\n]+((\%3E)|>)/i)) {
1046
11
19
                                $self->status(403);
1047
11
21
                                $self->_warn("XSS injection attempt blocked for '$value'");
1048
11
24
                                return;
1049                        }
1050
1051
776
518
                        if($value =~ /mustleak\.com\//) {
1052
4
7
                                $self->status(403);
1053
4
8
                                $self->_warn("Blocked mustleak attack for '$key'");
1054
4
9
                                return;
1055                        }
1056
1057
772
711
                        if($value =~ /\.\.\//) {
1058
8
13
                                $self->status(403);
1059
8
14
                                $self->_warn("Blocked directory traversal attack for '$key'");
1060
7
15
                                return;
1061                        }
1062                }
1063
795
493
                if(length($value) > 0) {
1064                        # Don't add if it's already there
1065
786
685
                        if($FORM{$key} && ($FORM{$key} ne $value)) {
1066
5
7
                                $FORM{$key} .= ",$value";
1067                        } else {
1068
781
759
                                $FORM{$key} = $value;
1069                        }
1070                }
1071        }
1072
1073
212
180
        unless(%FORM) {
1074
33
61
                return;
1075        }
1076
1077
179
161
        if($self->{'logger'}) {
1078
179
280
                while(my ($key,$value) = each %FORM) {
1079
771
3849
                        $self->_debug("$key=$value");
1080                }
1081        }
1082
1083
179
1654
        $self->{paramref} = \%FORM;
1084
1085
179
283
        return Return::Set::set_return(\%FORM, { type => 'hashref', min => 1 });
1086}
1087
1088 - 1120
=head2 param($field)

Get a single parameter from the query string.
Takes an optional single string parameter which is the argument to return. If
that parameter is not given param() is a wrapper to params() with no arguments.

        use CGI::Info;
        # ...
        my $info = CGI::Info->new();
        my $bar = $info->param('foo');

If the requested parameter isn't in the allowed list, an error message will
be thrown:

        use CGI::Info;
        my $allowed = {
                foo => qr/\d+/
        };
        my $xyzzy = $info->params(allow => $allowed);
        my $bar = $info->param('bar');  # Gives an error message

Returns undef if the requested parameter was not given

=over 4

=item $field

Optional field to be retrieved.
If omitted, all the parameters are returned.

=back

=cut
1121
1122sub param {
1123
74
9571
        my ($self, $field) = @_;
1124
1125
74
79
        if(!defined($field)) {
1126
4
6
                return $self->params();
1127        }
1128        # Is this a permitted argument?
1129
70
156
        if($self->{allow} && !exists($self->{allow}->{$field})) {
1130
9
19
                $self->_warn({
1131                        warning => "param: $field isn't in the allow list"
1132                });
1133
5
10
                return;
1134        }
1135
1136        # Prevent deep recursion which can happen when a validation routine calls param()
1137
61
41
        my $allow;
1138
61
68
        if($self->{in_param} && $self->{allow}) {
1139
4
5
                $allow = delete $self->{allow};
1140        }
1141
61
61
        $self->{in_param} = 1;
1142
1143
61
88
        my $params = $self->params();
1144
1145
61
1795
        $self->{in_param} = 0;
1146
61
57
        $self->{allow} = $allow if($allow);
1147
1148
61
74
        if($params) {
1149
54
78
                return Return::Set::set_return($params->{$field}, { type => 'string' });
1150        }
1151}
1152
1153sub _sanitise_input($) {
1154
1766
950
        my $arg = shift;
1155
1156
1766
1063
        return if(!defined($arg));
1157
1158        # Remove hacking attempts and spaces
1159
1766
1080
        $arg =~ s/[\r\n]//g;
1160
1766
990
        $arg =~ s/\s+$//;
1161
1766
1067
        $arg =~ s/^\s//;
1162
1163
1766
809
        $arg =~ s/<!--.*-->//g;
1164        # Allow :
1165        # $arg =~ s/[;<>\*|`&\$!?#\(\)\[\]\{\}'"\\\r]//g;
1166
1167        # return $arg;
1168        # return String::EscapeCage->new(convert_XSS($arg))->escapecstring();
1169
1766
1224
        return convert_XSS($arg);
1170}
1171
1172sub _multipart_data {
1173
6
6
        my ($self, $args) = @_;
1174
1175
6
7
        $self->_trace('Entering _multipart_data');
1176
1177
6
65
        my $total_bytes = $$args{length};
1178
1179
6
10
        $self->_debug("_multipart_data: total_bytes = $total_bytes");
1180
1181
6
62
        if($total_bytes == 0) {
1182
0
0
                return;
1183        }
1184
1185
6
8
        unless($stdin_data) {
1186
6
15
                while(<STDIN>) {
1187
54
39
                        chop(my $line = $_);
1188
54
27
                        $line =~ s/[\r\n]//g;
1189
54
50
                        $stdin_data .= "$line\n";
1190                }
1191
6
9
                if(!$stdin_data) {
1192
0
0
                        return;
1193                }
1194        }
1195
1196
6
8
        my $boundary = $$args{boundary};
1197
1198
6
3
        my @pairs;
1199
6
5
        my $writing_file = 0;
1200
6
3
        my $key;
1201        my $value;
1202
6
5
        my $in_header = 0;
1203
6
3
        my $fout;
1204
1205
6
14
        foreach my $line(split(/\n/, $stdin_data)) {
1206
44
55
                if($line =~ /^--\Q$boundary\E--$/) {
1207
2
3
                        last;
1208                }
1209
42
64
                if($line =~ /^--\Q$boundary\E$/) {
1210
10
10
                        if($writing_file) {
1211
0
0
                                close $fout;
1212
0
0
                                $writing_file = 0;
1213                        } elsif(defined($key)) {
1214
4
5
                                push(@pairs, "$key=$value");
1215
4
2
                                $value = undef;
1216                        }
1217
10
12
                        $in_header = 1;
1218                } elsif($in_header) {
1219
20
30
                        if(length($line) == 0) {
1220
8
5
                                $in_header = 0;
1221                        } elsif($line =~ /^Content-Disposition: (.+)/i) {
1222
10
11
                                my $field = $1;
1223
10
17
                                if($field =~ /name="(.+?)"/) {
1224
10
9
                                        $key = $1;
1225                                }
1226
10
43
                                if($field =~ /filename="(.+)?"/) {
1227
6
4
                                        my $filename = $1;
1228
6
27
                                        unless(defined($filename)) {
1229
0
0
                                                $self->_warn('No upload filename given');
1230
0
0
                                        } elsif($filename =~ /[\\\/\|]/) {
1231
2
3
                                                $self->_warn("Disallowing invalid filename: $filename");
1232                                        } else {
1233
4
8
                                                $filename = $self->_create_file_name({
1234                                                        filename => $filename
1235                                                });
1236
1237                                                # Don't do this since it taints the string and I can't work out how to untaint it
1238                                                # my $full_path = Cwd::realpath(File::Spec->catfile($self->{upload_dir}, $filename));
1239                                                # $full_path =~ m/^(\/[\w\.]+)$/;
1240
4
18
                                                my $full_path = File::Spec->catfile($self->{upload_dir}, $filename);
1241
4
183
                                                unless(open($fout, '>', $full_path)) {
1242
0
0
                                                        $self->_warn("Can't open $full_path");
1243                                                }
1244
4
4
                                                $writing_file = 1;
1245
4
9
                                                push(@pairs, "$key=$filename");
1246                                        }
1247                                }
1248                        }
1249                        # TODO: handle Content-Type: text/plain, etc.
1250                } else {
1251
12
10
                        if($writing_file) {
1252
8
22
                                print $fout "$line\n";
1253                        } else {
1254
4
3
                                $value .= $line;
1255                        }
1256                }
1257        }
1258
1259
4
8
        if($writing_file) {
1260
4
106
                close $fout;
1261        }
1262
1263
4
28
        $self->_trace('Leaving _multipart_data');
1264
1265
4
66
        return @pairs;
1266}
1267
1268# Robust filename generation (preventing overwriting)
1269sub _create_file_name {
1270
4
4
        my ($self, $args) = @_;
1271
4
6
        my $filename = $$args{filename} . '_' . time;
1272
1273
4
3
        my $counter = 0;
1274
4
3
        my $rc;
1275
1276
4
3
        do {
1277
4
6
                $rc = $filename . ($counter ? "_$counter" : '');
1278
4
40
                $counter++;
1279        } until(! -e $rc);      # Check if file exists
1280
1281
4
5
        return $rc;
1282}
1283
1284# Untaint a filename. Regex from CGI::Untaint::Filenames
1285sub _untaint_filename {
1286
112
92
        my ($self, $args) = @_;
1287
1288
112
245
        if($$args{filename} =~ /(^[\w\+_\040\#\(\)\{\}\[\]\/\-\^,\.:;&%@\\~]+\$?$)/) {
1289
112
179
                return $1;
1290        }
1291        # return undef;
1292}
1293
1294 - 1302
=head2 is_mobile

Returns a boolean if the website is being viewed on a mobile
device such as a smartphone.
All tablets are mobile, but not all mobile devices are tablets.

Can be overriden by the IS_MOBILE environment setting

=cut
1303
1304sub is_mobile {
1305
192
1763
        my $self = shift;
1306
1307
192
179
        if(defined($self->{is_mobile})) {
1308
122
87
                return $self->{is_mobile};
1309        }
1310
1311
70
96
        if($ENV{'IS_MOBILE'}) {
1312
4
8
                return $ENV{'IS_MOBILE'}
1313        }
1314
1315        # Support Sec-CH-UA-Mobile
1316
66
95
        if(my $ch_ua_mobile = $ENV{'HTTP_SEC_CH_UA_MOBILE'}) {
1317
6
10
                if($ch_ua_mobile eq '?1') {
1318
3
2
                        $self->{is_mobile} = 1;
1319
3
8
                        return 1;
1320                }
1321        }
1322
1323
63
89
        if($ENV{'HTTP_X_WAP_PROFILE'}) {
1324                # E.g. Blackberry
1325                # TODO: Check the sanity of this variable
1326
3
3
                $self->{is_mobile} = 1;
1327
3
8
                return 1;
1328        }
1329
1330
60
84
        if(my $agent = $ENV{'HTTP_USER_AGENT'}) {
1331
46
467953
                if($agent =~ /.+(Android|iPhone).+/) {
1332
14
16
                        $self->{is_mobile} = 1;
1333
14
27
                        return 1;
1334                }
1335
1336                # From http://detectmobilebrowsers.com/
1337
32
2041
                if($agent =~ m/(android|bb\d+|meego).+mobile|avantgo|bada\/|blackberry|blazer|compal|elaine|fennec|hiptop|iemobile|ip(hone|od)|iris|kindle|lge |maemo|midp|mmp|mobile.+firefox|netfront|opera m(ob|in)i|palm( os)?|phone|p(ixi|re)\/|plucker|pocket|psp|series(4|6)0|symbian|treo|up\.(browser|link)|vodafone|wap|windows ce|xda|xiino/i || substr($ENV{'HTTP_USER_AGENT'}, 0, 4) =~ m/1207|6310|6590|3gso|4thp|50[1-6]i|770s|802s|a wa|abac|ac(er|oo|s\-)|ai(ko|rn)|al(av|ca|co)|amoi|an(ex|ny|yw)|aptu|ar(ch|go)|as(te|us)|attw|au(di|\-m|r |s )|avan|be(ck|ll|nq)|bi(lb|rd)|bl(ac|az)|br(e|v)w|bumb|bw\-(n|u)|c55\/|capi|ccwa|cdm\-|cell|chtm|cldc|cmd\-|co(mp|nd)|craw|da(it|ll|ng)|dbte|dc\-s|devi|dica|dmob|do(c|p)o|ds(12|\-d)|el(49|ai)|em(l2|ul)|er(ic|k0)|esl8|ez([4-7]0|os|wa|ze)|fetc|fly(\-|_)|g1 u|g560|gene|gf\-5|g\-mo|go(\.w|od)|gr(ad|un)|haie|hcit|hd\-(m|p|t)|hei\-|hi(pt|ta)|hp( i|ip)|hs\-c|ht(c(\-| |_|a|g|p|s|t)|tp)|hu(aw|tc)|i\-(20|go|ma)|i230|iac( |\-|\/)|ibro|idea|ig01|ikom|im1k|inno|ipaq|iris|ja(t|v)a|jbro|jemu|jigs|kddi|keji|kgt( |\/)|klon|kpt |kwc\-|kyo(c|k)|le(no|xi)|lg( g|\/(k|l|u)|50|54|\-[a-w])|libw|lynx|m1\-w|m3ga|m50\/|ma(te|ui|xo)|mc(01|21|ca)|m\-cr|me(rc|ri)|mi(o8|oa|ts)|mmef|mo(01|02|bi|de|do|t(\-| |o|v)|zz)|mt(50|p1|v )|mwbp|mywa|n10[0-2]|n20[2-3]|n30(0|2)|n50(0|2|5)|n7(0(0|1)|10)|ne((c|m)\-|on|tf|wf|wg|wt)|nok(6|i)|nzph|o2im|op(ti|wv)|oran|owg1|p800|pan(a|d|t)|pdxg|pg(13|\-([1-8]|c))|phil|pire|pl(ay|uc)|pn\-2|po(ck|rt|se)|prox|psio|pt\-g|qa\-a|qc(07|12|21|32|60|\-[2-7]|i\-)|qtek|r380|r600|raks|rim9|ro(ve|zo)|s55\/|sa(ge|ma|mm|ms|ny|va)|sc(01|h\-|oo|p\-)|sdk\/|se(c(\-|0|1)|47|mc|nd|ri)|sgh\-|shar|sie(\-|m)|sk\-0|sl(45|id)|sm(al|ar|b3|it|t5)|so(ft|ny)|sp(01|h\-|v\-|v )|sy(01|mb)|t2(18|50)|t6(00|10|18)|ta(gt|lk)|tcl\-|tdg\-|tel(i|m)|tim\-|t\-mo|to(pl|sh)|ts(70|m\-|m3|m5)|tx\-9|up(\.b|g1|si)|utst|v400|v750|veri|vi(rg|te)|vk(40|5[0-3]|\-v)|vm40|voda|vulc|vx(52|53|60|61|70|80|81|83|85|98)|w3c(\-| )|webc|whit|wi(g |nc|nw)|wmlb|wonu|x700|yas\-|your|zeto|zte\-/i) {
1338
1
2
                        $self->{is_mobile} = 1;
1339
1
5
                        return 1;
1340                }
1341
1342                # Save loading and calling HTTP::BrowserDetect
1343
31
32
                my $remote = $ENV{'REMOTE_ADDR'};
1344
31
93
                if(defined($remote) && $self->{cache}) {
1345
2
5
                        if(my $type = $self->{cache}->get("$remote/$agent")) {
1346
2
9
                                return $self->{is_mobile} = ($type eq 'mobile');
1347                        }
1348                }
1349
1350
29
36
                unless($self->{browser_detect}) {
1351
23
23
18
3685
                        if(eval { require HTTP::BrowserDetect; }) {
1352
23
64250
                                HTTP::BrowserDetect->import();
1353
23
45
                                $self->{browser_detect} = HTTP::BrowserDetect->new($agent);
1354                        }
1355                }
1356
1357
29
2011
                if($self->{browser_detect}) {
1358
29
49
                        my $device = $self->{browser_detect}->device();
1359                        # Without the ?1:0 it will set to the empty string not 0
1360
29
144
                        my $is_mobile = (defined($device) && ($device =~ /blackberry|webos|iphone|ipod|ipad|android/i)) ? 1 : 0;
1361
29
43
                        if($is_mobile && $self->{cache} && defined($remote)) {
1362
0
0
                                $self->{cache}->set("$remote/$agent", 'mobile', '1 day');
1363                        }
1364
29
71
                        return $self->{is_mobile} = $is_mobile;
1365                }
1366        }
1367
1368
14
25
        return 0;
1369}
1370
1371 - 1375
=head2 is_tablet

Returns a boolean if the website is being viewed on a tablet such as an iPad.

=cut
1376
1377sub is_tablet {
1378
18
117
        my $self = shift;
1379
1380
18
21
        if(defined($self->{is_tablet})) {
1381
3
7
                return $self->{is_tablet};
1382        }
1383
1384
15
304
        if($ENV{'HTTP_USER_AGENT'} && ($ENV{'HTTP_USER_AGENT'} =~ /.+(iPad|TabletPC).+/)) {
1385                # TODO: add others when I see some nice user_agents
1386
5
5
                $self->{is_tablet} = 1;
1387        } else {
1388
10
9
                $self->{is_tablet} = 0;
1389        }
1390
1391
15
29
        return $self->{is_tablet};
1392}
1393
1394 - 1420
=head2 as_string

Converts CGI parameters into a formatted string representation with optional raw mode (no escaping of special characters).
Useful for debugging or generating keys for a cache.

    my $string_representation = $info->as_string();
    my $raw_string = $info->as_string({ raw => 1 });

=head3 API SPECIFICATION

=head4 INPUT

  {
    raw => {
      'type' => 'boolean',
      'optional' => 1,
    }
  }

=head4 OUTPUT

  {
    type => 'string',
    optional => 1,
  }

=cut
1421
1422sub as_string
1423{
1424
898
10442
        my $self = shift;
1425
1426
898
1011
        my $args = Params::Validate::Strict::validate_strict({
1427                args => Params::Get::get_params(undef, @_) || {},
1428                schema => {
1429                        raw => {
1430                                'type' => 'boolean',
1431                                'optional' => 1
1432                        }
1433                }
1434        });
1435
1436        # Retrieve object parameters
1437
593
43425
        my $params = $self->params() || return '';
1438
1439
46
245
        my $rc;
1440
1441
46
50
        if($args->{'raw'}) {
1442                # Raw mode: return key=value pairs without escaping
1443                $rc = join '; ', map {
1444
8
17
                        "$_=" . $params->{$_}
1445
6
6
12
7
                } sort keys %{$params};
1446        } else {
1447                # Escaped mode: escape special characters
1448                $rc = join '; ', map {
1449
62
44
                        my $value = $params->{$_};
1450
1451
62
66
                        $value =~ s/\\/\\\\/g;  # Escape backslashes
1452
62
94
                        $value =~ s/(;|=)/\\$1/g;       # Escape semicolons and equals signs
1453
62
91
                        "$_=$value"
1454
40
40
41
64
                } sort keys %{$params};
1455        }
1456
1457
46
56
        $rc ||= '';
1458
1459
46
64
        $self->_trace("as_string: returning '$rc'");
1460
1461
46
494
        return $rc;
1462}
1463
1464 - 1469
=head2 protocol

Returns the connection protocol, presumably 'http' or 'https', or undef if
it can't be determined.

=cut
1470
1471sub protocol {
1472
54
765
        my $self = shift;
1473
1474
54
99
        if($ENV{'SCRIPT_URI'} && ($ENV{'SCRIPT_URI'} =~ /^(.+):\/\/.+/)) {
1475
7
18
                return $1;
1476        }
1477
47
89
        if($ENV{'SERVER_PROTOCOL'} && ($ENV{'SERVER_PROTOCOL'} =~ /^HTTP\//)) {
1478
6
14
                return 'http';
1479        }
1480
1481
41
64
        if(my $port = $ENV{'SERVER_PORT'}) {
1482
23
1294
                if(defined(my $name = getservbyport($port, 'tcp'))) {
1483
19
44
                        if($name =~ /https?/) {
1484
15
42
                                return $name;
1485                        } elsif($name eq 'www') {
1486                                # e.g. NetBSD and OpenBSD
1487
0
0
                                return 'http';
1488                        }
1489                        # Return an error, maybe missing something
1490                } elsif($port == 80) {
1491                        # e.g. Solaris
1492
0
0
                        return 'http';
1493                } elsif($port == 443) {
1494
0
0
                        return 'https';
1495                }
1496        }
1497
1498
26
596
        if($ENV{'REMOTE_ADDR'}) {
1499
1
3
                $self->_warn("Can't determine the calling protocol");
1500        }
1501
26
71
        return;
1502}
1503
1504 - 1529
=head2 tmpdir

Returns the name of a directory that you can use to create temporary files
in.

The routine is preferable to L<File::Spec/tmpdir> since CGI programs are
often running on shared servers.  Having said that, tmpdir will fall back
to File::Spec->tmpdir() if it can't find somewhere better.

If the parameter 'default' is given, then use that directory as a
fall-back rather than the value in File::Spec->tmpdir().
No sanity tests are done, so if you give the default value of
'/non-existant', that will be returned.

Tmpdir allows a reference of the options to be passed.

        use CGI::Info;

        my $info = CGI::Info->new();
        my $dir = $info->tmpdir(default => '/var/tmp');
        $dir = $info->tmpdir({ default => '/var/tmp' });

        # or

        my $dir = CGI::Info->tmpdir();
=cut
1530
1531sub tmpdir {
1532
1636
2705
        my $self = shift;
1533
1534
1636
1333
        my $name = 'tmp';
1535
1636
2374
        if($^O eq 'MSWin32') {
1536
0
0
                $name = 'temp';
1537        }
1538
1539
1636
1052
        my $dir;
1540
1541
1636
1800
        if(!ref($self)) {
1542
5
7
                $self = __PACKAGE__->new();
1543        }
1544
1636
1978
        my $params = Params::Get::get_params(undef, @_);
1545
1546
1636
16943
        if($ENV{'C_DOCUMENT_ROOT'} && (-d $ENV{'C_DOCUMENT_ROOT'})) {
1547
9
27
                $dir = File::Spec->catdir($ENV{'C_DOCUMENT_ROOT'}, $name);
1548
9
67
                if((-d $dir) && (-w $dir)) {
1549
3
6
                        return $self->_untaint_filename({ filename => $dir });
1550                }
1551
6
5
                $dir = $ENV{'C_DOCUMENT_ROOT'};
1552
6
61
                if((-d $dir) && (-w $dir)) {
1553
6
15
                        return $self->_untaint_filename({ filename => $dir });
1554                }
1555        }
1556
1627
2096
        if($ENV{'DOCUMENT_ROOT'} && (-d $ENV{'DOCUMENT_ROOT'})) {
1557
2
14
                $dir = File::Spec->catdir($ENV{'DOCUMENT_ROOT'}, File::Spec->updir(), $name);
1558
2
18
                if((-d $dir) && (-w $dir)) {
1559
1
13
                        return $self->_untaint_filename({ filename => $dir });
1560                }
1561        }
1562
1626
3057
        if($params->{'default'} && ref($params->{'default'})) {
1563
608
5492
                croak(ref($self), ': tmpdir must be given a scalar');
1564        }
1565
1018
5000
        return $params->{default} ? $params->{default} : File::Spec->tmpdir();
1566}
1567
1568 - 1580
=head2 rootdir

Returns the document root.  This is preferable to looking at DOCUMENT_ROOT
in the environment because it will also work when we're not running as a CGI
script, which is useful for script debugging.

This can be run as a class or object method.

        use CGI::Info;

        print CGI::Info->rootdir();

=cut
1581
1582sub rootdir {
1583
30
1843
        if($ENV{'C_DOCUMENT_ROOT'} && (-d $ENV{'C_DOCUMENT_ROOT'})) {
1584
14
25
                return $ENV{'C_DOCUMENT_ROOT'};
1585        } elsif($ENV{'DOCUMENT_ROOT'} && (-d $ENV{'DOCUMENT_ROOT'})) {
1586
4
18
                return $ENV{'DOCUMENT_ROOT'};
1587        }
1588
12
13
        my $script_name = $0;
1589
1590
12
25
        unless(File::Spec->file_name_is_absolute($script_name)) {
1591
12
79
                $script_name = File::Spec->rel2abs($script_name);
1592        }
1593
12
16
        if($script_name =~ /.cgi\-bin.*/) {     # kludge for outside CGI environment
1594
0
0
                $script_name =~ s/.cgi\-bin.*//;
1595        }
1596
12
48
        if(-f $script_name) {   # More kludge
1597
12
11
                if($^O eq 'MSWin32') {
1598
0
0
                        if($script_name =~ /(.+)\\.+?$/) {
1599
0
0
                                return $1;
1600                        }
1601                } else {
1602
12
27
                        if($script_name =~ /(.+)\/.+?$/) {
1603
12
18
                                return $1;
1604                        }
1605                }
1606        }
1607
0
0
        return $script_name;
1608}
1609
1610 - 1614
=head2 root_dir

Synonym of rootdir(), for compatibility with L<CHI>.

=cut
1615
1616sub root_dir
1617{
1618
7
720
        if($_[0] && ref($_[0])) {
1619
3
2
                my $self = shift;
1620
1621
3
4
                return $self->rootdir(@_);
1622        }
1623
4
14
        return __PACKAGE__->rootdir(@_);
1624}
1625
1626 - 1630
=head2 documentroot

Synonym of rootdir(), for compatibility with Apache.

=cut
1631
1632sub documentroot
1633{
1634
6
361
        if($_[0] && ref($_[0])) {
1635
2
1
                my $self = shift;
1636
1637
2
3
                return $self->rootdir(@_);
1638        }
1639
4
7
        return __PACKAGE__->rootdir(@_);
1640}
1641
1642 - 1654
=head2 logdir($dir)

Gets and sets the name of a directory where you can store logs.

=over 4

=item $dir

Path to the directory where logs will be stored.

=back

=cut
1655
1656sub logdir {
1657
20
1253
        my $self = shift;
1658
20
12
        my $dir = shift;
1659
1660
20
25
        if(!ref($self)) {
1661
1
2
                $self = __PACKAGE__->new();
1662        }
1663
1664
20
21
        if($dir) {
1665
10
101
                if(length($dir) && (-d $dir) && (-w $dir)) {
1666
5
17
                        return $self->{'logdir'} = $dir;
1667                }
1668
5
12
                $self->_warn("Invalid logdir: $dir");
1669
5
326
                Carp::croak("Invalid logdir: $dir");
1670        }
1671
1672
10
66
        foreach my $rc($self->{logdir}, $ENV{'LOGDIR'}, Sys::Path->logdir(), $self->tmpdir()) {
1673
26
130
                if(defined($rc) && length($rc) && (-d $rc) && (-w $rc)) {
1674
10
9
                        $dir = $rc;
1675
10
5
                        last;
1676                }
1677        }
1678
10
28
        $self->_warn("Can't determine logdir") if((!defined($dir)) || (length($dir) == 0));
1679
10
18
        $self->{logdir} ||= $dir;
1680
1681
10
17
        return $dir;
1682}
1683
1684 - 1699
=head2 is_robot

Is the visitor a real person or a robot?

        use CGI::Info;

        my $info = CGI::Info->new();
        unless($info->is_robot()) {
                # update site visitor statistics
        }

If the client is seen to be attempting an SQL injection,
set the HTTP status to 403,
and return 1.

=cut
1700
1701sub is_robot {
1702
52
573
        my $self = shift;
1703
1704
52
71
        if(defined($self->{is_robot})) {
1705
9
15
                return $self->{is_robot};
1706        }
1707
1708
43
48
        my $agent = $ENV{'HTTP_USER_AGENT'};
1709
43
48
        my $remote = $ENV{'REMOTE_ADDR'};
1710
1711
43
85
        unless($remote && $agent) {
1712                # Probably not running in CGI - assume real person
1713
11
19
                return 0;
1714        }
1715
1716        # See also params()
1717
32
251
        if(($agent =~ /SELECT.+AND.+/) || ($agent =~ /ORDER BY /) || ($agent =~ / OR NOT /) || ($agent =~ / AND \d+=\d+/) || ($agent =~ /THEN.+ELSE.+END/) || ($agent =~ /.+AND.+SELECT.+/) || ($agent =~ /\sAND\s.+\sAND\s/)) {
1718
4
8
                $self->status(403);
1719
4
5
                $self->{is_robot} = 1;
1720
4
9
                if($ENV{'REMOTE_ADDR'}) {
1721
4
13
                        $self->_warn($ENV{'REMOTE_ADDR'} . ": SQL injection attempt blocked for '$agent'");
1722                } else {
1723
0
0
                        $self->_warn("SQL injection attempt blocked for '$agent'");
1724                }
1725
4
9
                return 1;
1726        }
1727
28
123208
        if($agent =~ /.+bot|axios\/1\.6\.7|bidswitchbot|bytespider|ClaudeBot|Clickagy.Intelligence.Bot|msnptc|CriteoBot|is_archiver|backstreet|fuzz faster|linkfluence\.com|spider|scoutjet|gingersoftware|heritrix|dodnetdotcom|yandex|nutch|ezooms|plukkie|nova\.6scan\.com|Twitterbot|adscanner|Go-http-client|python-requests|Mediatoolkitbot|NetcraftSurveyAgent|Expanse|serpstatbot|DreamHost SiteMonitor|techiaith.cymru|trendictionbot|ias_crawler|WPsec|Yak\/1\.0|ZoominfoBot/i) {
1728
11
14
                $self->{is_robot} = 1;
1729
11
45
                return 1;
1730        }
1731
1732        # TODO:
1733        # Download and use list from
1734        #       https://raw.githubusercontent.com/mitchellkrogza/apache-ultimate-bad-bot-blocker/refs/heads/master/_generator_lists/bad-user-agents.list
1735
1736
17
46
        my $key = "$remote/$agent";
1737
1738
17
31
        if(my $referrer = $ENV{'HTTP_REFERER'}) {
1739                # https://agency.ohow.co/google-analytics-implementation-audit/google-analytics-historical-spam-list/
1740
4
16
                my @crawler_lists = (
1741                        'http://fix-website-errors.com',
1742                        'http://keywords-monitoring-your-success.com',
1743                        'http://free-video-tool.com',
1744                        'http://magnet-to-torrent.com',
1745                        'http://torrent-to-magnet.com',
1746                        'http://dogsrun.net',
1747                        'http://###.responsive-test.net',
1748                        'http://uptime.com',
1749                        'http://uptimechecker.com',
1750                        'http://top1-seo-service.com',
1751                        'http://fast-wordpress-start.com',
1752                        'http://wordpress-crew.net',
1753                        'http://dbutton.net',
1754                        'http://justprofit.xyz',
1755                        'http://video--production.com',
1756                        'http://buttons-for-website.com',
1757                        'http://buttons-for-your-website.com',
1758                        'http://success-seo.com',
1759                        'http://videos-for-your-business.com',
1760                        'http://semaltmedia.com',
1761                        'http://dailyrank.net',
1762                        'http://uptimebot.net',
1763                        'http://sitevaluation.org',
1764                        'http://100dollars-seo.com',
1765                        'http://forum69.info',
1766                        'http://partner.semalt.com',
1767                        'http://best-seo-offer.com',
1768                        'http://best-seo-solution.com',
1769                        'http://semalt.semalt.com',
1770                        'http://semalt.com',
1771                        'http://7makemoneyonline.com',
1772                        'http://anticrawler.org',
1773                        'http://baixar-musicas-gratis.com',
1774                        'http://descargar-musica-gratis.net',
1775
1776                        # Mine
1777                        'http://www.seokicks.de/robot.html',
1778                );
1779
4
5
                $referrer =~ s/\\/_/g;
1780
4
33
15
71
                if(($referrer =~ /\)/) || (List::Util::any { $_ =~ /^$referrer/ } @crawler_lists)) {
1781
4
9
                        $self->_debug("is_robot: blocked trawler $referrer");
1782
1783
4
24
                        if($self->{cache}) {
1784
0
0
                                $self->{cache}->set($key, 'robot', '1 day');
1785                        }
1786
4
3
                        $self->{is_robot} = 1;
1787
4
12
                        return 1;
1788                }
1789        }
1790
1791
13
40
        if(defined($remote) && $self->{cache}) {
1792
1
2
                if(my $type = $self->{cache}->get("$remote/$agent")) {
1793
1
6
                        return $self->{is_robot} = ($type eq 'robot');
1794                }
1795        }
1796
1797        # Don't use HTTP_USER_AGENT to detect more than we really have to since
1798        # that is easily spoofed
1799
12
51
        if($agent =~ /www\.majestic12\.co\.uk|facebookexternal/) {
1800                # Mark Facebook as a search engine, not a robot
1801
1
2
                if($self->{cache}) {
1802
0
0
                        $self->{cache}->set($key, 'search', '1 day');
1803                }
1804
1
15
                return 0;
1805        }
1806
1807
11
20
        unless($self->{browser_detect}) {
1808
4
4
6
459
                if(eval { require HTTP::BrowserDetect; }) {
1809
4
7861
                        HTTP::BrowserDetect->import();
1810
4
6
                        $self->{browser_detect} = HTTP::BrowserDetect->new($agent);
1811                }
1812        }
1813
11
350
        if($self->{browser_detect}) {
1814
11
26
                my $is_robot = $self->{browser_detect}->robot();
1815
11
1230
                if(defined($is_robot)) {
1816
2
7
                        $self->_debug("HTTP::BrowserDetect '$ENV{HTTP_USER_AGENT}' returns $is_robot");
1817                }
1818
11
60
                $is_robot = (defined($is_robot) && ($is_robot)) ? 1 : 0;
1819
11
30
                $self->_debug("is_robot: $is_robot");
1820
1821
11
137
                if($is_robot) {
1822
2
3
                        if($self->{cache}) {
1823
0
0
                                $self->{cache}->set($key, 'robot', '1 day');
1824                        }
1825
2
2
                        $self->{is_robot} = $is_robot;
1826
2
5
                        return $is_robot;
1827                }
1828        }
1829
1830
9
10
        if($self->{cache}) {
1831
0
0
                $self->{cache}->set($key, 'unknown', '1 day');
1832        }
1833
9
12
        $self->{is_robot} = 0;
1834
9
17
        return 0;
1835}
1836
1837 - 1849
=head2 is_search_engine

Is the visitor a search engine?

    if(CGI::Info->new()->is_search_engine()) {
        # display generic information about yourself
    } else {
        # allow the user to pick and choose something to display
    }

Can be overriden by the IS_SEARCH_ENGINE environment setting

=cut
1850
1851sub is_search_engine
1852{
1853
57
631
        my $self = shift;
1854
1855
57
76
        if(defined($self->{is_search_engine})) {
1856
9
16
                return $self->{is_search_engine};
1857        }
1858
1859
48
76
        if($ENV{'IS_SEARCH_ENGINE'}) {
1860
3
8
                return $ENV{'IS_SEARCH_ENGINE'}
1861        }
1862
1863
45
54
        my $remote = $ENV{'REMOTE_ADDR'};
1864
45
43
        my $agent = $ENV{'HTTP_USER_AGENT'};
1865
1866
45
106
        unless($remote && $agent) {
1867                # Probably not running in CGI - assume not a search engine
1868
13
22
                return 0;
1869        }
1870
1871
32
26
        my $key;
1872
1873
32
47
        if($self->{cache}) {
1874
1
2
                $key = "$remote/$agent";
1875
1
5
                if(defined($remote) && $self->{cache}) {
1876
1
5
                        if(my $type = $self->{cache}->get("$remote/$agent")) {
1877
1
554
                                return $self->{is_search} = ($type eq 'search');
1878                        }
1879                }
1880        }
1881
1882        # Don't use HTTP_USER_AGENT to detect more than we really have to since
1883        # that is easily spoofed
1884
31
93
        if($agent =~ /www\.majestic12\.co\.uk|facebookexternal/) {
1885                # Mark Facebook as a search engine, not a robot
1886
3
3
                if($self->{cache}) {
1887
0
0
                        $self->{cache}->set($key, 'search', '1 day');
1888                }
1889
3
4
                return 1;
1890        }
1891
1892
28
38
        unless($self->{browser_detect}) {
1893
14
14
15
516
                if(eval { require HTTP::BrowserDetect; }) {
1894
14
7800
                        HTTP::BrowserDetect->import();
1895
14
23
                        $self->{browser_detect} = HTTP::BrowserDetect->new($agent);
1896                }
1897        }
1898
28
1341
        if(my $browser = $self->{browser_detect}) {
1899
28
48
                my $is_search = ($browser->google() || $browser->msn() || $browser->baidu() || $browser->altavista() || $browser->yahoo() || $browser->bingbot());
1900
28
4366
                if(!$is_search) {
1901
19
69
                        if(($agent =~ /SeznamBot\//) ||
1902                           ($agent =~ /Google-InspectionTool\//) ||
1903                           ($agent =~ /Googlebot\//)) {
1904
3
2
                                $is_search = 1;
1905                        }
1906                }
1907
28
53
                if($is_search && $self->{cache}) {
1908
0
0
                        $self->{cache}->set($key, 'search', '1 day');
1909                }
1910
28
59
                return $self->{is_search_engine} = $is_search;
1911        }
1912
1913        # TODO: DNS lookup, not gethostbyaddr - though that will be slow
1914
0
0
        my $hostname = gethostbyaddr(inet_aton($remote), AF_INET) || $remote;
1915
1916
0
0
        my @cidr_blocks = ('47.235.0.0/12');    # Alibaba
1917
1918
0
0
        if((defined($hostname) && ($hostname =~ /google|msnbot|bingbot|amazonbot|GPTBot/) && ($hostname !~ /^google-proxy/)) ||
1919           (Net::CIDR::cidrlookup($remote, @cidr_blocks))) {
1920
0
0
                if($self->{cache}) {
1921
0
0
                        $self->{cache}->set($key, 'search', '1 day');
1922                }
1923
0
0
                $self->{is_search_engine} = 1;
1924
0
0
                return 1;
1925        }
1926
1927
0
0
        $self->{is_search_engine} = 0;
1928
0
0
        return 0;
1929}
1930
1931 - 1953
=head2 browser_type

Returns one of 'web', 'search', 'robot' and 'mobile'.

    # Code to display a different web page for a browser, search engine and
    # smartphone
    use Template;
    use CGI::Info;

    my $info = CGI::Info->new();
    my $dir = $info->rootdir() . '/templates/' . $info->browser_type();

    my $filename = ref($self);
    $filename =~ s/::/\//g;
    $filename = "$dir/$filename.tmpl";

    if((!-f $filename) || (!-r $filename)) {
        die "Can't open $filename";
    }
    my $template = Template->new();
    $template->process($filename, {}) || die $template->error();

=cut
1954
1955sub browser_type {
1956
38
680
        my $self = shift;
1957
1958
38
51
        if($self->is_mobile()) {
1959
13
49
                return 'mobile';
1960        }
1961
25
39
        if($self->is_search_engine()) {
1962
8
20
                return 'search';
1963        }
1964
17
42
        if($self->is_robot()) {
1965
6
12
                return 'robot';
1966        }
1967
11
27
        return 'web';
1968}
1969
1970 - 1985
=head2 get_cookie

Returns a cookie's value, or undef if no name is given, or the requested
cookie isn't in the jar.

Deprecated - use cookie() instead.

    use CGI::Info;

    my $i = CGI::Info->new();
    my $name = $i->get_cookie(cookie_name => 'name');
    print "Your name is $name\n";
    my $address = $i->get_cookie('address');
    print "Your address is $address\n";

=cut
1986
1987sub get_cookie {
1988
15
363
        my $self = shift;
1989
1990
15
15
        return $self->cookie(\@_);
1991}
1992
1993 - 2037
=head2 cookie

Returns a cookie's value, or undef if no name is given, or the requested
cookie isn't in the jar.
API is the same as "param",
it will replace the "get_cookie" method in the future.

    use CGI::Info;

    my $name = CGI::Info->new()->cookie('name');
    print "Your name is $name\n";


=head3 API SPECIFICATION

=head4 INPUT

  {
    cookie_name => {
      'type' => 'string',
      'min' => 1,
      'matches' => qr/^[!#-'*+\-.\^_`|~0-9A-Za-z]+$/     # RFC6265
    }
  }

=head4 OUTPUT

Cookie not set: C<undef>

Cookie set:

  {
    type => 'string',
    optional => 1,
    matches => qr/   # RFC6265
      ^
      (?:
        "[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*"   # quoted
      | [\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*     # unquoted
      )
      $
    /x
  }

=cut
2038
2039sub cookie
2040{
2041
2128
5130
        my $self = shift;
2042
2128
2795
        my $params = Params::Validate::Strict::validate_strict({
2043                args => Params::Get::get_params('cookie_name', @_),
2044                schema => {
2045                        cookie_name => {
2046                                'type' => 'string',
2047                                'min' => 1,
2048                                'matches' => qr/^[!#-'*+\-.\^_`|~0-9A-Za-z]+$/   # RFC6265
2049                        }
2050                }
2051        });
2052
2053
388
34844
        my $field = $params->{'cookie_name'};
2054
2055        # Validate field argument
2056
388
539
        if(!defined($field)) {
2057
93
213
                $self->_error('what cookie do you want?');
2058
93
977
                Carp::croak('what cookie do you want?');
2059
0
0
                return;
2060        }
2061
295
284
        if(ref($field)) {
2062
0
0
                $self->_error('Cookie name should be a string');
2063
0
0
                Carp::croak('Cookie name should be a string');
2064
0
0
                return;
2065        }
2066
2067        # Load cookies if not already loaded
2068
295
394
        unless($self->{jar}) {
2069
30
34
                if(defined $ENV{'HTTP_COOKIE'}) {
2070                        # grep { /=/ } filters out malformed tokens (empty strings, bare
2071                        # semicolons, entries with no name=value separator) that would
2072                        # otherwise cause split(/=/, $_, 2) to return a single-element list
2073                        # and make the flattened list odd-length, corrupting the hash.
2074                        $self->{jar} = {
2075
48
74
                                map  { split(/=/, $_, 2) }
2076
49
61
                                grep { /=/ }
2077
26
48
                                split(/; /, $ENV{'HTTP_COOKIE'})
2078                        };
2079                }
2080        }
2081
2082        # Return the cookie value if it exists, otherwise return undef
2083
295
583
        return $self->{jar}{$field};
2084}
2085
2086 - 2101
=head2 status($status)

Sets or returns the status of the object,
200 for OK,
otherwise an HTTP error code

=over 4

=item $status

Optional integer value to be set or retrieved.
If omitted, the value is retrieved.

=back

=cut
2102
2103sub status
2104{
2105
252
16813
        my $self = shift;
2106
252
158
        my $status = shift;
2107
2108        # Set status if provided
2109
252
325
        return $self->{status} = $status if(defined($status));
2110
2111        # Determine status based on request method if status is not set
2112
117
149
        unless (defined $self->{status}) {
2113
32
37
                my $method = $ENV{'REQUEST_METHOD'};
2114
2115
32
94
                return 405 if $method && ($method eq 'OPTIONS' || $method eq 'DELETE');
2116
28
55
                return 411 if $method && ($method eq 'POST' && !defined $ENV{'CONTENT_LENGTH'});
2117
2118
25
58
                return 200;
2119        }
2120
2121        # Return current status or 200 by default
2122
85
210
        return $self->{status} || 200;
2123}
2124
2125 - 2137
=head2 messages

Returns the messages that the object has generated as a ref to an array of hashes.

    my @messages;
    if(my $w = $info->messages()) {
        @messages = map { $_->{'message'} } @{$w};
    } else {
        @messages = ();
    }
    print STDERR join(';', @messages), "\n";

=cut
2138
2139sub messages
2140{
2141
21
3055
        my $self = shift;
2142
2143
21
35
        return $self->{'messages'};
2144}
2145
2146 - 2150
=head2  messages_as_string

Returns the messages of that the object has generated as a string.

=cut
2151
2152sub messages_as_string
2153{
2154
8
333
        my $self = shift;
2155
2156
8
17
        if(scalar($self->{'messages'})) {
2157
4
12
4
4
13
5
                my @messages = map { $_->{'message'} } @{$self->{'messages'}};
2158
4
9
                return join('; ', @messages);
2159        }
2160
4
9
        return '';
2161}
2162
2163 - 2182
=head2 cache($cache)

Get/set the internal cache system.

Use this rather than pass the cache argument to C<new()> if you see these error messages,
"(in cleanup) Failed to get MD5_CTX pointer".
It's some obscure problem that I can't work out,
but calling this after C<new()> works.

=over 4

=item $cache

Optional cache object.
When not given,
returns the current cache object.

=back

=cut
2183
2184sub cache
2185{
2186
11
47
        my $self = shift;
2187
11
10
        my $cache = shift;
2188
2189
11
17
        if($cache) {
2190
4
201
                croak(ref($self), ':cache($cache) is not an object') if(!Scalar::Util::blessed($cache));
2191
2
9
                croak(ref($self), ':cache($cache) does not support the get() method') if(!$cache->can('get'));
2192
2
4
                croak(ref($self), ':cache($cache) does not support the set() method') if(!$cache->can('set'));
2193
2
2
                $self->{'cache'} = $cache;
2194        }
2195
9
14
        return $self->{'cache'};
2196}
2197
2198 - 2205
=head2 set_logger

Sets the class, array, code reference, or file that will be used for logging.

Sometimes you don't know what the logger is until you've instantiated the class.
This function fixes the catch-22 situation.

=cut
2206
2207sub set_logger
2208{
2209
11
43
        my $self = shift;
2210
11
20
        my $params = Params::Get::get_params('logger', @_);
2211
2212
11
170
        if(my $logger = $params->{'logger'}) {
2213
11
22
                if(Scalar::Util::blessed($logger)) {
2214
7
10
                        $self->{'logger'} = $logger;
2215                } else {
2216
4
9
                        $self->{'logger'} = Log::Abstraction->new($logger);
2217                }
2218        } else {
2219
0
0
                $self->{'logger'} = Log::Abstraction->new();
2220        }
2221
11
82
        return $self;
2222}
2223
2224# Log and remember a message
2225sub _log
2226{
2227
2135
1858
        my ($self, $level, @messages) = @_;
2228
2229
2135
1748
        if(scalar(@messages)) {
2230                # FIXME: add caller's function
2231                # if(($level eq 'warn') || ($level eq 'info')) {
2232
2135
2135
1197
4100
                        push @{$self->{'messages'}}, { level => $level, message => join(' ', grep defined, @messages) };
2233                # }
2234
2235
2135
3221
                if(scalar(@messages) && (my $logger = $self->{'logger'})) {
2236
2135
3548
                        $self->{'logger'}->$level(join('', grep defined, @messages));
2237                }
2238        }
2239}
2240
2241sub _debug {
2242
803
431
        my $self = shift;
2243
803
547
        $self->_log('debug', @_);
2244}
2245
2246sub _info {
2247
74
40
        my $self = shift;
2248
74
72
        $self->_log('info', @_);
2249}
2250
2251sub _notice {
2252
0
0
        my $self = shift;
2253
0
0
        $self->_log('notice', @_);
2254}
2255
2256sub _trace {
2257
1062
609
        my $self = shift;
2258
1062
1116
        $self->_log('trace', @_);
2259}
2260
2261# Emit a warning message somewhere
2262sub _warn {
2263
103
68
        my $self = shift;
2264
103
121
        my $params = Params::Get::get_params('warning', @_);
2265
2266
103
1105
        $self->_log('warn', $params->{'warning'});
2267
88
4332
        if(!defined($self->{'logger'})) {
2268
0
0
                Carp::carp($params->{'warning'});
2269        }
2270}
2271
2272# Emit an error message somewhere
2273sub _error {
2274
93
100
        my $self = shift;
2275
93
143
        my $params = Params::Get::get_params('warning', @_);
2276
2277
93
938
        $self->_log('error', $params->{'warning'});
2278
93
28581
        if(!defined($self->{'logger'})) {
2279
0
0
                Carp::croak($params->{'warning'});
2280        }
2281}
2282
2283# Ensure all environment variables are sanitized and validated before use.
2284# Use regular expressions to enforce strict input formats.
2285sub _get_env
2286{
2287
218
160
        my ($self, $var) = @_;
2288
2289
218
405
        return unless defined $ENV{$var};
2290
2291        # Strict sanitization: allow alphanumeric and limited special characters
2292
124
239
        if($ENV{$var} =~ /^[\w\.\-\/:\\]+$/) {
2293
120
159
                return $ENV{$var};
2294        }
2295
4
7
        $self->_warn("Invalid value in environment variable: $var");
2296
2297
4
5
        return undef;
2298}
2299
2300 - 2306
=head2 reset

Class method to reset the class.
You should do this in an FCGI environment before instantiating,
but nowhere else.

=cut
2307
2308sub reset {
2309
425
875467
        my $class = shift;
2310
2311
425
505
        unless($class eq __PACKAGE__) {
2312
2
75
                carp('Reset is a class method');
2313
1
3
                return;
2314        }
2315
2316
423
410
        $stdin_data = undef;
2317}
2318
2319sub AUTOLOAD
2320{
2321
668
166487
        our $AUTOLOAD;
2322
2323
668
797
        my $self = shift or return;
2324
2325
668
652
        return if(!defined($AUTOLOAD));
2326
2327        # Extract the method name from the AUTOLOAD variable
2328
667
1975
        my ($method) = $AUTOLOAD =~ /::(\w+)$/;
2329
2330        # Skip if called on destruction
2331
667
1823
        return if($method eq 'DESTROY');
2332
2333
20
22
        Carp::croak(__PACKAGE__, ": Unknown method $method") if(!ref($self));
2334
2335        # Allow the AUTOLOAD feature to be disabled
2336
20
45
        Carp::croak(__PACKAGE__, ": Unknown method $method") if(exists($self->{'auto_load'}) && boolean($self->{'auto_load'})->isFalse());
2337
2338        # Ensure the method is called on the correct package object or a subclass
2339
17
30
        return unless((ref($self) eq __PACKAGE__) || (UNIVERSAL::isa((caller)[0], __PACKAGE__)));
2340
2341        # Validate method name - only allow safe parameter names
2342
17
35
        Carp::croak(__PACKAGE__, ": Invalid method name: $method") unless $method =~ /^[a-zA-Z_][a-zA-Z0-9_]*$/;
2343
2344        # Delegate to the param method
2345
17
28
        return $self->param($method);
2346}
2347
2348 - 2424
=head1 AUTHOR

Nigel Horne, C<< <njh at nigelhorne.com> >>

=head1 BUGS

is_tablet() only currently detects the iPad and Windows PCs. Android strings
don't differ between tablets and smartphones.

params() returns a ref which means that calling routines can change the hash
for other routines.
Take a local copy before making amendments to the table if you don't want unexpected
things to happen.

=head1 SEE ALSO

=over 4

=item * L<Configure an Object at Runtime|Object::Configure>

=item * L<Test Dashboard|https://nigelhorne.github.io/CGI-Info/coverage/>

=item * L<HTTP::BrowserDetect>

=item * L<https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker>

=back

=head1 REPOSITORY

L<https://github.com/nigelhorne/CGI-Info>

=head1 SUPPORT

This module is provided as-is without any warranty.

Please report any bugs or feature requests to C<bug-cgi-info at rt.cpan.org>,
or through the web interface at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=CGI-Info>.
I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.

You can find documentation for this module with the perldoc command.

    perldoc CGI::Info

You can also look for information at:

=over 4

=item * MetaCPAN

L<https://metacpan.org/dist/CGI-Info>

=item * RT: CPAN's request tracker

L<https://rt.cpan.org/NoAuth/Bugs.html?Dist=CGI-Info>

=item * CPAN Testers' Matrix

L<http://matrix.cpantesters.org/?dist=CGI-Info>

=item * CPAN Testers Dependencies

L<http://deps.cpantesters.org/?module=CGI::Info>

=back

=head1 LICENCE AND COPYRIGHT

Copyright 2010-2026 Nigel Horne.

Usage is subject to the GPL2 licence terms.
If you use it,
please let me know.

=cut
2425
24261;